ProDMPs: A Unified Perspective on Dynamic and Probabilistic
Movement Primitives
Ge Li1, Zeqi Jin1, Michael Volpp1, Fabian Otto2,3, Rudolf Lioutikov1, and Gerhard Neumann1

1Karlsruhe Institute of Technology, Germany. ge.li@kit.edu
2Bosch Center for Artificial Intelligence, Germany.
3University of Tübingen, Germany.
Abstract— Movement Primitives (MPs) are a well-known concept to represent and generate modular trajectories. MPs can be broadly categorized into two types: (a) dynamics-based approaches that generate smooth trajectories from any initial state, e.g., Dynamic Movement Primitives (DMPs), and (b) probabilistic approaches that capture higher-order statistics of the motion, e.g., Probabilistic Movement Primitives (ProMPs). To date, however, there is no method that unifies both, i.e., that can generate smooth trajectories from an arbitrary initial state while capturing higher-order statistics. In this paper, we introduce a unified perspective of both approaches by solving the ODE underlying the DMPs. We convert the expensive online numerical integration of DMPs into basis functions that can be computed offline. These basis functions can be used to represent trajectories or trajectory distributions similar to ProMPs while maintaining all the properties of dynamical systems. Since we inherit the properties of both methodologies, we call our proposed model Probabilistic Dynamic Movement Primitives (ProDMPs). Additionally, we embed ProDMPs in a deep neural network architecture and propose a new cost function for efficient end-to-end learning of higher-order trajectory statistics. To this end, we leverage Bayesian Aggregation for non-linear iterative conditioning on sensory inputs. Our proposed model achieves smooth trajectory generation, goal-attractor convergence, correlation analysis, non-linear conditioning, and online re-planning in one framework.
I. INTRODUCTION
Movement Primitives (MPs) are a prominent tool for
motion representation and synthesis in robotics. They serve
as basic movement elements, modulate the motion behavior,
and form more complex movements through combination
or concatenation. This work focuses on trajectory-based
movement representations [1, 2]. Given a parameter vector,
such representations generate desired trajectories for the
robot to follow. These methods have gained much popularity
in imitation and reinforcement learning (IL, RL) [3–7] due
to their concise parametrization and flexibility to modulate
movement. Current methods can be roughly classified into
approaches based on dynamical systems [1, 8–11] and prob-
abilistic approaches [2, 12, 13], with both types offering their
own advantages. The dynamical systems-based approaches, such as Dynamic Movement Primitives (DMPs), guarantee that the generated trajectories start precisely at the current position and velocity of the robot, which allows for smooth trajectory replanning, i.e., changing the parameters of the MPs during motion execution [11, 14]. However, since DMPs represent the trajectory via the forcing term instead of a direct representation of the trajectory position, numerical
integration from acceleration to position has to be applied to formulate the trajectory, which constitutes an additional workload and makes the estimation of the trajectory statistics difficult [15]. Probabilistic methods, such as Probabilistic Movement Primitives (ProMPs), are able to acquire such statistics, making them key enablers for variable-stiffness controllers and for capturing the trajectory's temporal and cross-DoF correlations. These methods further act as generative models, facilitating the sampling of new trajectories. However, due to their lack of internal dynamics, these approaches suffer from discontinuities in position and velocity between old and new trajectories in the case of replanning.
In this work, we propose Probabilistic Dynamic Movement
Primitives (ProDMPs), which unify both methodologies. We
show that the trajectory of a DMP, obtained by integrating
its second-order dynamical system, can be expressed by a
linear basis function model that depends on the parameters
of the DMP, i. e., the weights of the forcing function and the
goal attractor. The linear basis functions can be obtained by integrating the original basis functions used in the DMP, an operation that only needs to be performed once offline for ProDMPs. Recently, MP research has been extended to deep
neural network (NN) architectures [10, 11, 13] that enable
conditioning the trajectory generation on high-dimensional
context variables, such as images. Following these ideas, we
integrate our representation into a deep neural architecture
that allows non-linear conditioning on a varying number
of conditioning events. These events are aggregated using
Bayesian Aggregation (BA) into a latent probabilistic repre-
sentation [16] which is mapped to a Gaussian distribution
in the parameter space of the ProDMPs. We summarize
the contributions of this paper as: (a) We unify ProMPs
and DMPs into one consistent framework that inherits the
benefits of both formulations. (b) We enable computing distributions and capturing correlations of DMP trajectories,
while (c) the robot’s current state can be inscribed into the
trajectory distribution through boundary conditions, allowing
for smooth replanning. (d) Moreover, the offline integration
of the basis functions significantly facilitates the integration
into neural network architectures, reducing the computation
time by a factor of 10. (e) Hence, we embed ProDMPs in
a deep encoder-decoder architecture that allows non-linear
conditioning on a set of high-dimensional observations with
varying information levels. We evaluate our method on three
digit-writing tasks using images as inputs, a simulated robot
pushing task with a complex physical interaction, and a real
robot picking task with shifting object positions. We compare
our model with state-of-the-art NN-based DMPs [9–11] and
the NN-based probabilistic method [13].
II. RELATED WORK
Paraschos et al. [2] established ProMPs to model MPs as
a trajectory distribution that captures temporal correlation
and correlations between the DoFs. ProMPs maintain a
Gaussian distribution over the parameters and can map it to
the corresponding trajectory distribution using a linear basis
function model. In contrast, such a distribution mapping is not available for DMP-based approaches, as the trajectory is integrated numerically from the forcing function. Previous methods, like GMM/GMR-DMPs [17, 18], used Gaussian Mixture Models to cover the trajectories' domain. Yet, this neither captures temporal correlation nor provides a generative approach to trajectories. Other methods that learn distributions over DMP weights [19, 20] do not connect the weight distribution to the trajectory distribution, as trajectories can only be obtained by integration. Hence, it is also hard to learn the weight distribution from trajectories in an end-to-end manner.
To learn DMP parameters from high-dimensional sensory inputs, Gams et al. [9] and Ridge et al. [10] designed an encoder-decoder architecture to learn the weights of a single DMP from digit images and derived the gradient of the trajectory with respect to the learnable parameters. Bahl et al. [11] proposed Neural Dynamic Policies (NDPs), which allow replanning the DMP parameters throughout the execution of the trajectory and have also been extended to the RL setting. The learning objective of these two methods for IL
is to optimize the mean squared error (MSE) between the
predicted and the ground-truth trajectories using backpropa-
gation. However, to formulate a trajectory, DMPs must apply
numerical integration during the NN training procedure,
which significantly increases the computational workload
in both forward and backward propagation, rendering these
approaches cumbersome to use. Additionally, the integration-
based trajectory representation limits the use of probabilistic
methods and hence these NN-DMPs approaches cannot be
trained using a probabilistic log-likelihood (LL) loss.
Probabilistic MP approaches have also been extended with deep NN architectures. Seker et al. [13] directly use a Conditional Neural Processes model [21] as a trajectory generator, i.e., Conditional Neural Movement Primitives (CNMPs), to predict the trajectory distribution with an encoder-decoder NN model. While such an architecture enables non-linear conditioning on high-dimensional inputs, it can only predict an isotropic trajectory variance at each time step. The temporal and cross-DoF correlations are missing, which makes consistent sampling across time and DoFs infeasible. Besides, both ProMPs and CNMPs neglect dynamics, i.e., when changing trajectory parameters during trajectory execution, the newly generated trajectory will contain discontinuities at the replanning time point. To execute such trajectories, a heuristic controller is used to freeze time and catch up with the jump [13]. However, such a waiting mechanism does not scale to time-sensitive motions and tasks.
III. A UNIFIED PERSPECTIVE ON DYNAMIC AND
PROBABILISTIC MOVEMENT PRIMITIVES
We first briefly cover the fundamental aspects of DMPs.
Then, we derive the analytical solution of the DMPs’ ODE to
develop our new ProDMPs representation. For convenience, we introduce our approach through a 1-DoF dynamical system and later extend it to high-DoF systems.
A. Solving DMPs’ Ordinary Differential Equation
For a single movement execution as a trajectory $\lambda = [y_t]_{t=0:T}$, Schaal [1] and Ijspeert et al. [8] model it as a second-order linear dynamical system with a non-linear forcing function $f$,

$$\tau^2 \ddot{y} = \alpha\left(\beta(g - y) - \tau\dot{y}\right) + f(x), \quad f(x) = x\,\frac{\sum_i \varphi_i(x)\, w_i}{\sum_i \varphi_i(x)} = x\,\boldsymbol{\varphi}_x^\top \boldsymbol{w}, \tag{1}$$
where $y = y(t)$, $\dot{y} = \mathrm{d}y/\mathrm{d}t$, and $\ddot{y} = \mathrm{d}^2 y/\mathrm{d}t^2$ represent the position, velocity, and acceleration of the system at time step $t$, respectively. Here, we use the original formulation of DMPs in [1] without any extensions. $\alpha$ and $\beta$ are spring-damper constants, $g$ is a goal attractor, and $\tau$ is a time constant which can be used to adapt the execution speed of the resulting trajectory. To this end, DMPs define the forcing function over an exponentially decaying phase variable $x(t) = \exp(-\alpha_x t/\tau)$, where $\varphi_i(x)$ represents the (unnormalized) basis functions and $w_i \in \boldsymbol{w}$, $i = 1 \ldots N$, are the corresponding weights. The trajectory of the motion $\lambda$ is obtained by integrating the dynamical system, or more specifically, by numerical integration from the starting time to the target time point.
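To make this online workload concrete, the following minimal sketch rolls out Eq. (1) by step-wise numerical integration. It is an illustration under our own assumptions (parameter values, Gaussian basis placement, and the simple semi-implicit Euler scheme), not the authors' implementation:

```python
import numpy as np

# Illustrative rollout of the DMP dynamics in Eq. (1) by step-wise
# integration; all constants here are arbitrary choices for the sketch.
alpha, alpha_x, tau = 25.0, 3.0, 1.0
beta = alpha / 4.0                       # critical damping (see below)
N, T, dt = 10, 1.0, 1e-3
centers = np.exp(-alpha_x / tau * np.linspace(0.0, T, N))  # in phase space
w = np.random.randn(N)                   # forcing-function weights
g, y, y_dot = 1.0, 0.0, 0.0              # goal attractor and initial state

positions = []
for step in range(int(T / dt)):
    x = np.exp(-alpha_x / tau * step * dt)        # decaying phase variable
    phi = np.exp(-50.0 * (x - centers) ** 2)      # unnormalized Gaussian basis
    f = x * (phi @ w) / phi.sum()                 # forcing function f(x)
    y_ddot = (alpha * (beta * (g - y) - tau * y_dot) + f) / tau**2
    y_dot += y_ddot * dt                          # semi-implicit Euler step
    y += y_dot * dt
    positions.append(y)
```

Every generated trajectory, and every gradient that flows through it during NN training, has to repeat this loop; the closed-form solution derived next removes it.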
The dynamical system defined in Eq. (1) is a second-order linear non-homogeneous ordinary differential equation (ODE) with constant coefficients, whose closed-form solution can be derived analytically. We rewrite the ODE and its homogeneous counterpart in standard form as
$$\text{DMPs' ODE:}\quad \ddot{y} + \frac{\alpha}{\tau}\dot{y} + \frac{\alpha\beta}{\tau^2}\,y = \frac{f(x)}{\tau^2} + \frac{\alpha\beta}{\tau^2}\,g \equiv F(x, g), \tag{2}$$

$$\text{Homo. ODE:}\quad \ddot{y} + \frac{\alpha}{\tau}\dot{y} + \frac{\alpha\beta}{\tau^2}\,y = 0, \tag{3}$$
where $F$ denotes some function of $x$ and $g$. Using the method of variation of constants [22], the closed-form solution of the second-order ODE in Eq. (2), i.e., the trajectory position, is

$$y = c_1 y_1 + c_2 y_2 - y_1 \int \frac{y_2 F}{Y}\,\mathrm{d}t + y_2 \int \frac{y_1 F}{Y}\,\mathrm{d}t, \tag{4}$$

where $y_1$, $y_2$ are two linearly independent complementary functions of the homogeneous ODE given in Eq. (3), $c_1$, $c_2$ are two constants determined by the boundary conditions (BCs) of the ODE, and $Y = y_1 \dot{y}_2 - \dot{y}_1 y_2$ is the Wronskian. Both integrals in Eq. (4) are indefinite. With the appropriate value $\beta = \alpha/4$ [1, 8], the system is critically damped, and the discriminant of the characteristic equation of the homogeneous ODE, i.e., $\Delta = (\alpha^2 - 4\alpha\beta)/\tau^2$, is zero. Consequently, $y_1$, $y_2$ take the form

$$y_1 = y_1(t) = \exp\left(-\frac{\alpha}{2\tau}\,t\right), \qquad y_2 = y_2(t) = t \exp\left(-\frac{\alpha}{2\tau}\,t\right). \tag{5}$$
Using this result, the term $Y$ can also be simplified to $Y = \exp(-\alpha t/\tau) \neq 0$. To obtain $y$, we need to solve the two indefinite integrals in Eq. (4) as

$$I_1(t) = \int \frac{y_2 F}{Y}\,\mathrm{d}t = \int t \exp\left(\frac{\alpha}{2\tau}\,t\right) F(x, g)\,\mathrm{d}t, \qquad I_2(t) = \int \frac{y_1 F}{Y}\,\mathrm{d}t = \int \exp\left(\frac{\alpha}{2\tau}\,t\right) F(x, g)\,\mathrm{d}t. \tag{6}$$
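The complementary functions in Eq. (5) and the simplified Wronskian can be checked symbolically. A small verification sketch of ours using sympy (not part of the paper):

```python
import sympy as sp

t, tau, alpha = sp.symbols("t tau alpha", positive=True)
k = alpha / (2 * tau)                    # exponent rate, with beta = alpha / 4
y1 = sp.exp(-k * t)
y2 = t * sp.exp(-k * t)

def residual(y):
    """Left-hand side of the homogeneous ODE in Eq. (3) with beta = alpha/4."""
    return sp.simplify(sp.diff(y, t, 2) + (alpha / tau) * sp.diff(y, t)
                       + (alpha**2 / (4 * tau**2)) * y)

assert residual(y1) == 0 and residual(y2) == 0   # both solve Eq. (3)

# Wronskian Y = y1*y2' - y1'*y2 reduces to exp(-alpha*t/tau) != 0.
Y = sp.simplify(y1 * sp.diff(y2, t) - sp.diff(y1, t) * y2)
assert sp.simplify(Y - sp.exp(-alpha * t / tau)) == 0
```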
Applying the Fundamental Theorem of Calculus, i.e., $\int h(t)\,\mathrm{d}t = \int_0^t h(t')\,\mathrm{d}t' + c$, $c \in \mathbb{R}$, together with the definition of the forcing function $f$ in Eq. (1) and of $F(x, g)$ in Eq. (2), $I_1(t)$ can be expressed as

$$I_1(t) = \frac{1}{\tau^2}\left[\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top \boldsymbol{w}\,\mathrm{d}t' + \int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) \frac{\alpha^2}{4}\,g\,\mathrm{d}t'\right] + c_3 \tag{7}$$

$$= \frac{1}{\tau^2}\left[\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t'\right]\boldsymbol{w} + \left[\left(\frac{\alpha}{2\tau}\,t - 1\right)\exp\left(\frac{\alpha}{2\tau}\,t\right) + 1\right] g + c_3, \tag{8}$$
where $c_3$ is a constant fixed by the BCs. From Eq. (7) to Eq. (8), we move the time-independent parameters $\boldsymbol{w}$ and $g$ out of their corresponding integrals. Notice that the remaining part of the second integral has an analytical solution. The remaining part of the first integral, however, has no closed-form solution because the basis functions $\boldsymbol{\varphi}_x$ may be arbitrarily complex. Denoting these integrals as

$$p_1(t) \equiv \frac{1}{\tau^2}\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t', \qquad q_1(t) \equiv \left(\frac{\alpha}{2\tau}\,t - 1\right)\exp\left(\frac{\alpha}{2\tau}\,t\right) + 1, \tag{9}$$
where $p_1$ is an $N$-dim vector and $q_1$ a scalar, we can express $I_1(t) = p_1(t)^\top \boldsymbol{w} + q_1(t)\,g + c_3$. Following the same steps, we can obtain a similar solution for $I_2$, i.e., $I_2(t) = p_2(t)^\top \boldsymbol{w} + q_2(t)\,g + c_4$, where we present $p_2(t)$ and $q_2(t)$ in Eq. (10).

$$p_2(t) = \frac{1}{\tau^2}\int_0^t \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t', \qquad q_2(t) = \frac{\alpha}{2\tau}\left[\exp\left(\frac{\alpha}{2\tau}\,t\right) - 1\right] \tag{10}$$
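Since the first integral has no closed form, $p_1$ and $p_2$ are evaluated numerically once on a time grid, while $q_1$ and $q_2$ use their analytic expressions. A sketch of this offline step (grid resolution, basis definition, and all constants are illustrative assumptions of ours):

```python
import numpy as np

alpha, alpha_x, tau, N = 25.0, 3.0, 1.0, 10
ts = np.linspace(0.0, 1.0, 1001)                   # time grid for offline eval
x = np.exp(-alpha_x / tau * ts)                    # phase variable x(t)
centers = np.exp(-alpha_x / tau * np.linspace(0.0, 1.0, N))
phi = np.exp(-50.0 * (x[:, None] - centers) ** 2)  # (T, N) Gaussian basis
phi /= phi.sum(axis=1, keepdims=True)              # normalize as in Eq. (1)

# Integrands of p2 and p1 from Eqs. (10) and (9); p1 has an extra factor t'.
integrand2 = np.exp(alpha / (2 * tau) * ts)[:, None] * x[:, None] * phi / tau**2
integrand1 = ts[:, None] * integrand2

def cumtrapz(f, t):
    """Cumulative trapezoidal integral of f over t along axis 0, zero at t[0]."""
    out = np.zeros_like(f)
    out[1:] = np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t)[:, None], axis=0)
    return out

p1, p2 = cumtrapz(integrand1, ts), cumtrapz(integrand2, ts)  # (T, N) each
q1 = (alpha * ts / (2 * tau) - 1.0) * np.exp(alpha * ts / (2 * tau)) + 1.0
q2 = alpha / (2 * tau) * (np.exp(alpha * ts / (2 * tau)) - 1.0)
```

The grid only needs to cover the longest trajectory duration used at run time, since the same precomputed values serve every trajectory afterwards.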
B. DMPs' Linear Basis Functions Representation
Substituting the two integrals $I_1$ and $I_2$ in Eq. (4) by their derived forms, the constants $c_3$ and $c_4$ can then be merged into $c_1$ and $c_2$, respectively. We can now express the position of DMPs in Eq. (4) as a summation of the complementary functions $y_1$ and $y_2$, plus a linear basis function representation of the weights $\boldsymbol{w}$ and the goal attractor $g$:

$$y = c_1 y_1 + c_2 y_2 + \begin{bmatrix} y_2 p_2 - y_1 p_1 & \; y_2 q_2 - y_1 q_1 \end{bmatrix} \begin{bmatrix} \boldsymbol{w} \\ g \end{bmatrix} \equiv c_1 y_1 + c_2 y_2 + \boldsymbol{\Phi}^\top \boldsymbol{w}_g, \tag{11}$$
where $\boldsymbol{w}_g$ is an $(N{+}1)$-dim vector containing the weights $\boldsymbol{w}$ and the goal $g$. The resulting basis functions for $\boldsymbol{w}$ and $g$ are represented by $\boldsymbol{\Phi}(t)$, which can be computed numerically, cf. Fig. 1. The constants $c_1$ and $c_2$ are determined by solving a boundary condition problem, where we use the current position and velocity of the robot to inscribe where the trajectory should start.
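Continuing the previous sketch (it reuses ts, p1, p2, q1, q2, alpha, tau, and N from there), the basis $\boldsymbol{\Phi}$ of Eq. (11) can be assembled and $c_1$, $c_2$ fixed by a 2x2 linear solve that enforces the boundary position and velocity. We take numerical derivatives here purely for brevity; the quantities also have analytic derivatives:

```python
import numpy as np

y1 = np.exp(-alpha / (2 * tau) * ts)               # complementary functions
y2 = ts * y1
Phi = np.concatenate([y2[:, None] * p2 - y1[:, None] * p1,
                      (y2 * q2 - y1 * q1)[:, None]], axis=1)   # (T, N+1)

# Derivatives of the basis and complementary functions (numerical here).
dPhi = np.gradient(Phi, ts, axis=0)
dy1, dy2 = np.gradient(y1, ts), np.gradient(y2, ts)

w_g = np.random.randn(N + 1)        # stacked weights w and goal g
b, y_b, yd_b = 0, 0.2, 0.0          # boundary index, position, velocity

# Enforce y(t_b) = y_b and dy/dt(t_b) = yd_b to fix c1 and c2.
A = np.array([[y1[b], y2[b]], [dy1[b], dy2[b]]])
rhs = np.array([y_b - Phi[b] @ w_g, yd_b - dPhi[b] @ w_g])
c1, c2 = np.linalg.solve(A, rhs)

traj = c1 * y1 + c2 * y2 + Phi @ w_g   # trajectory obeying the BCs, Eq. (11)
```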
[Figure: (a) Weights' basis; (b) Goal's basis]
Fig. 1: The basis functions Φ of the ProDMPs can be computed offline and later used in online trajectory generation.
[Figure]
(a) NN + DMPs couple the numerical integration and the learning.
(b) NN + ProDMPs decouple the numerical integration from the learning pipeline and can also model trajectory distributions.
Fig. 2: Comparison of trajectory generation pipelines between (a) NN-based DMPs [9–11] and (b) ProDMPs. The node DNN represents an arbitrary deep neural network architecture. The blue arrows denote the learning pipeline, while the red arrows denote the numerical integration. Our method transforms the expensive numerical integration into basis functions computed offline, which speeds up the trajectory computation and allows trajectory distribution prediction.
TABLE I: Computation time of both pipelines. Here, a 2-DoF, 6-second-long, 1000 Hz trajectory is generated from a 22-dim wg parameter vector. We tested both the forward pass (FP) and the backward pass (BP). A 3-layer fully connected (FC) network with [10, 128, 22] neurons on the input, hidden, and output layers, respectively, is used to simulate the learning procedure. The keyword +BC denotes settings where the boundary conditions are renewed, so that the coefficients c1 and c2 need to be recomputed; otherwise, they remain unchanged. The result shows that our model is 227 to 4659 times faster than the NN-DMPs, depending on the setting. We use an Nvidia® RTX-3080Ti GPU for our test. In a full learning experiment with NN architectures, this speed difference translates into a speed-up of around 10 times (see experiments).

Pipelines   FP          FP + BC     BP          BP + BC
NN-DMPs     0.6057 s    0.6145 s    1.5261 s    1.5737 s
ProDMPs     0.00013 s   0.0027 s    0.00105 s   0.0039 s
Speed-up    ×4659       ×227        ×1453       ×403
In contrast to previous NN-DMPs methods [9–11], our model separates the learnable parameters from the numerical integrals, which are transformed into basis functions. These basis functions are shared by all trajectories generated during the learning procedure. Hence, we can pre-compute them once offline and then use them as constants in online trajectory generation. Consequently, we exclude numerical integration from the forward and backward propagation.
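The following schematic PyTorch training step illustrates the decoupled pipeline of Fig. 2b. The network sizes, the random stand-ins for Φ and the data, and the omission of the boundary terms $c_1 y_1 + c_2 y_2$ are all simplifications of ours, not the paper's architecture:

```python
import torch

T, N, ctx = 6000, 21, 10
Phi = torch.randn(T, N + 1)            # stand-in for the precomputed basis
net = torch.nn.Sequential(torch.nn.Linear(ctx, 128), torch.nn.ReLU(),
                          torch.nn.Linear(128, N + 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

context = torch.randn(32, ctx)         # batch of conditioning inputs
target = torch.randn(32, T)            # ground-truth trajectories

# One training step: no ODE integration appears in the graph; Phi is a
# constant, so forward and backward passes are plain matrix products.
w_g = net(context)                     # predicted [w; g], shape (32, N+1)
traj = w_g @ Phi.T                     # linear basis model, shape (32, T)
loss = torch.nn.functional.mse_loss(traj, target)
opt.zero_grad(); loss.backward(); opt.step()
```

Because the trajectory is now a linear function of the predicted parameters, a Gaussian over w_g maps in closed form to a Gaussian over the trajectory, which is what enables the log-likelihood loss mentioned in Fig. 2b.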