ProDMPs: A Unified Perspective on Dynamic and Probabilistic
Movement Primitives
Ge Li1, Zeqi Jin1, Michael Volpp1, Fabian Otto2,3, Rudolf Lioutikov1, and Gerhard Neumann1

1Karlsruhe Institute of Technology, Germany. ge.li@kit.edu
2Bosch Center for Artificial Intelligence, Germany.
3University of Tübingen, Germany.
Abstract— Movement Primitives (MPs) are a well-known concept to represent and generate modular trajectories. MPs can be broadly categorized into two types: (a) dynamics-based approaches that generate smooth trajectories from any initial state, e.g., Dynamic Movement Primitives (DMPs), and (b) probabilistic approaches that capture higher-order statistics of the motion, e.g., Probabilistic Movement Primitives (ProMPs). To date, however, there is no method that unifies both, i.e., that can generate smooth trajectories from an arbitrary initial state while capturing higher-order statistics. In this paper, we introduce a unified perspective of both approaches by solving the ODE underlying the DMPs. We convert the expensive online numerical integration of DMPs into basis functions that can be computed offline. These basis functions can be used to represent trajectories or trajectory distributions similar to ProMPs while maintaining all the properties of dynamical systems. Since we inherit the properties of both methodologies, we call our proposed model Probabilistic Dynamic Movement Primitives (ProDMPs). Additionally, we embed ProDMPs in a deep neural network architecture and propose a new cost function for efficient end-to-end learning of higher-order trajectory statistics. To this end, we leverage Bayesian Aggregation for non-linear iterative conditioning on sensory inputs. Our proposed model achieves smooth trajectory generation, goal-attractor convergence, correlation analysis, non-linear conditioning, and online re-planning in one framework.
I. INTRODUCTION
Movement Primitives (MPs) are a prominent tool for
motion representation and synthesis in robotics. They serve
as basic movement elements, modulate the motion behavior,
and form more complex movements through combination
or concatenation. This work focuses on trajectory-based
movement representations [1, 2]. Given a parameter vector,
such representations generate desired trajectories for the
robot to follow. These methods have gained much popularity
in imitation and reinforcement learning (IL, RL) [3–7] due
to their concise parametrization and flexibility to modulate
movement. Current methods can be roughly classified into
approaches based on dynamical systems [1, 8–11] and prob-
abilistic approaches [2, 12, 13], with both types offering their
own advantages. The dynamical systems-based approaches, such as Dynamic Movement Primitives (DMPs), guarantee that the generated trajectories start precisely at the current position and velocity of the robot, which allows for smooth trajectory replanning, i.e., changing the parameters of the MPs during motion execution [11, 14]. However, since DMPs represent the trajectory via the forcing term instead of a direct representation of the trajectory position, numerical
integration from acceleration to position has to be applied to formulate the trajectory, which constitutes an additional workload and makes the estimation of the trajectory statistics difficult [15]. Probabilistic methods, such as Probabilistic Movement Primitives (ProMPs), are able to acquire such statistics, making them key enablers for variable-stiffness controllers and for capturing the trajectory's temporal and cross-DoF correlations. These methods further act as generative models, facilitating the sampling of new trajectories. However, due to their lack of internal dynamics, these approaches suffer from discontinuities in position and velocity between old and new trajectories in the case of replanning.
In this work, we propose Probabilistic Dynamic Movement
Primitives (ProDMPs), which unify both methodologies. We
show that the trajectory of a DMP, obtained by integrating
its second-order dynamical system, can be expressed by a
linear basis function model that depends on the parameters
of the DMP, i. e., the weights of the forcing function and the
goal attractor. The linear basis functions can be obtained by integrating the original basis functions used in the DMP, an operation that only needs to be performed once offline for ProDMPs. Recently, MP research has been extended to deep
neural network (NN) architectures [10, 11, 13] that enable
conditioning the trajectory generation on high-dimensional
context variables, such as images. Following these ideas, we
integrate our representation into a deep neural architecture
that allows non-linear conditioning on a varying number
of conditioning events. These events are aggregated using
Bayesian Aggregation (BA) into a latent probabilistic repre-
sentation [16] which is mapped to a Gaussian distribution
in the parameter space of the ProDMPs. We summarize
the contributions of this paper as: (a) We unify ProMPs
and DMPs into one consistent framework that inherits the
benefits of both formulations. (b) We enable computing distributions and capturing correlations of DMP trajectories,
while (c) the robot’s current state can be inscribed into the
trajectory distribution through boundary conditions, allowing
for smooth replanning. (d) Moreover, the offline integration
of the basis functions significantly facilitates the integration
into neural network architectures, reducing the computation
time by a factor of 10. (e) Hence, we embed ProDMPs in
a deep encoder-decoder architecture that allows non-linear
conditioning on a set of high-dimensional observations with
varying information levels. We evaluate our method on three
digit-writing tasks using images as inputs, a simulated robot
pushing task with a complex physical interaction, and a real
robot picking task with shifting object positions. We compare
our model with state-of-the-art NN-based DMPs [9–11] and
the NN-based probabilistic method [13].
II. RELATED WORK
Paraschos et al. [2] established ProMPs to model MPs as
a trajectory distribution that captures temporal correlation
and correlations between the DoFs. ProMPs maintain a
Gaussian distribution over the parameters and can map it to
the corresponding trajectory distribution using a linear basis
function model. In contrast, such a distribution mapping is not available for DMP-based approaches, as the trajectory is integrated numerically from the forcing function. Previous methods, like GMM/GMR-DMPs [17, 18], used Gaussian Mixture Models to cover the trajectories' domain. Yet, this neither captures temporal correlation nor provides a generative approach to trajectories. Other methods that learn distributions over DMP weights [19, 20] do not connect the weight distribution to the trajectory distribution, as trajectories can only be obtained by integration. Hence, it is also hard to learn the weight distribution from trajectories in an end-to-end manner.
To learn DMP parameters from high-dimensional sensory inputs, Gams et al. [9] and Ridge et al. [10] designed an encoder-decoder architecture to learn the weights of a single DMP from digit images and derived the gradient of the trajectory with respect to the learnable parameters. Bahl et al. [11] proposed Neural Dynamic Policies (NDPs), which allow replanning the DMP parameters throughout the execution of the trajectory and have also been extended to the RL setting. The learning objective of these two methods for IL
is to optimize the mean squared error (MSE) between the
predicted and the ground-truth trajectories using backpropa-
gation. However, to formulate a trajectory, DMPs must apply
numerical integration during the NN training procedure,
which significantly increases the computational workload
in both forward and backward propagation, rendering these
approaches cumbersome to use. Additionally, the integration-
based trajectory representation limits the use of probabilistic
methods and hence these NN-DMPs approaches cannot be
trained using a probabilistic log-likelihood (LL) loss.
Probabilistic MP approaches have also been extended with deep NN architectures. Seker et al. [13] directly use a Conditional Neural Processes model [21] as a trajectory generator, i.e., Conditional Neural Movement Primitives (CNMPs), to predict the trajectory distribution with an encoder-decoder NN model. While such an architecture enables non-linear conditioning on high-dimensional inputs, it can only predict an isotropic trajectory variance at each time step. The temporal and cross-DoF correlations are missing, which makes consistent sampling across time and DoFs infeasible. Besides, both ProMPs and CNMPs neglect dynamics, i.e., when changing trajectory parameters during trajectory execution, the newly generated trajectory will contain discontinuities at the replanning time point. To execute such trajectories, a heuristic controller is used to freeze time and catch up with the jump [13]. However, such a waiting mechanism does not scale to time-sensitive motions and tasks.
III. A UNIFIED PERSPECTIVE ON DYNAMIC AND
PROBABILISTIC MOVEMENT PRIMITIVES
We first briefly cover the fundamental aspects of DMPs.
Then, we derive the analytical solution of the DMPs’ ODE to
develop our new ProDMPs representation. For convenience, we introduce our approach through a 1-DoF dynamical system and later extend it to high-DoF systems.
A. Solving DMPs’ Ordinary Differential Equation
For a single movement execution as a trajectory $\lambda = [y_t]_{t=0:T}$, Schaal [1] and Ijspeert et al. [8] model it as a second-order linear dynamical system with a non-linear forcing function $f$,

$$\tau^2 \ddot{y} = \alpha\left(\beta(g - y) - \tau\dot{y}\right) + f(x), \quad f(x) = x\,\frac{\sum_i \varphi_i(x)\, w_i}{\sum_i \varphi_i(x)} = x\,\boldsymbol{\varphi}_x^\top \boldsymbol{w}, \tag{1}$$
where $y = y(t)$, $\dot{y} = \mathrm{d}y/\mathrm{d}t$, and $\ddot{y} = \mathrm{d}^2 y/\mathrm{d}t^2$ represent the position, velocity, and acceleration of the system at time step $t$, respectively. Here, we use the original formulation of DMPs in [1] without any extensions. $\alpha$ and $\beta$ are spring-damper constants, $g$ is a goal attractor, and $\tau$ is a time constant which can be used to adapt the execution speed of the resulting trajectory. To this end, DMPs define the forcing function over an exponentially decaying phase variable $x(t) = \exp(-\alpha_x t/\tau)$, where $\varphi_i(x)$ represents the (unnormalized) basis functions and $w_i \in \boldsymbol{w}$, $i = 1 \ldots N$, are the corresponding weights. The trajectory of the motion $\lambda$ is obtained by integrating the dynamical system, or more specifically, by numerical integration from the starting time to the target time point.
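To make this online workload concrete, the following minimal sketch rolls out Eq. (1) by step-wise numerical integration. It is an illustration under our own assumptions (parameter values, Gaussian basis placement, and the simple semi-implicit Euler scheme), not the authors' implementation:

```python
import numpy as np

# Illustrative rollout of the DMP dynamics in Eq. (1) by step-wise
# integration; all constants here are arbitrary choices for the sketch.
alpha, alpha_x, tau = 25.0, 3.0, 1.0
beta = alpha / 4.0                       # critical damping (see below)
N, T, dt = 10, 1.0, 1e-3
centers = np.exp(-alpha_x / tau * np.linspace(0.0, T, N))  # in phase space
w = np.random.randn(N)                   # forcing-function weights
g, y, y_dot = 1.0, 0.0, 0.0              # goal attractor and initial state

positions = []
for step in range(int(T / dt)):
    x = np.exp(-alpha_x / tau * step * dt)        # decaying phase variable
    phi = np.exp(-50.0 * (x - centers) ** 2)      # unnormalized Gaussian basis
    f = x * (phi @ w) / phi.sum()                 # forcing function f(x)
    y_ddot = (alpha * (beta * (g - y) - tau * y_dot) + f) / tau**2
    y_dot += y_ddot * dt                          # semi-implicit Euler step
    y += y_dot * dt
    positions.append(y)
```

Every generated trajectory, and every gradient that flows through it during NN training, has to repeat this loop; the closed-form solution derived next removes it.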
The dynamical system defined in Eq. (1) is a second-order linear non-homogeneous ordinary differential equation (ODE) with constant coefficients, whose closed-form solution can be derived analytically. We rewrite the ODE and its homogeneous counterpart in standard form as
$$\text{DMPs' ODE:}\quad \ddot{y} + \frac{\alpha}{\tau}\dot{y} + \frac{\alpha\beta}{\tau^2}\,y = \frac{f(x)}{\tau^2} + \frac{\alpha\beta}{\tau^2}\,g \equiv F(x, g), \tag{2}$$

$$\text{Homo. ODE:}\quad \ddot{y} + \frac{\alpha}{\tau}\dot{y} + \frac{\alpha\beta}{\tau^2}\,y = 0, \tag{3}$$
where $F$ denotes some function of $x$ and $g$. Using the method of variation of constants [22], the closed-form solution of the second-order ODE in Eq. (2), i.e., the trajectory position, is

$$y = c_1 y_1 + c_2 y_2 - y_1 \int \frac{y_2 F}{Y}\,\mathrm{d}t + y_2 \int \frac{y_1 F}{Y}\,\mathrm{d}t, \tag{4}$$

where $y_1$, $y_2$ are two linearly independent complementary functions of the homogeneous ODE given in Eq. (3), $c_1$, $c_2$ are two constants determined by the boundary conditions (BCs) of the ODE, and $Y = y_1 \dot{y}_2 - \dot{y}_1 y_2$ is the Wronskian. Both integrals in Eq. (4) are indefinite. With the appropriate value $\beta = \alpha/4$ [1, 8], the system is critically damped, and the discriminant of the characteristic equation of the homogeneous ODE, i.e., $\Delta = (\alpha^2 - 4\alpha\beta)/\tau^2$, is zero. Consequently, $y_1$, $y_2$ take the form

$$y_1 = y_1(t) = \exp\left(-\frac{\alpha}{2\tau}\,t\right), \qquad y_2 = y_2(t) = t \exp\left(-\frac{\alpha}{2\tau}\,t\right). \tag{5}$$
Using this result, the term $Y$ can also be simplified to $Y = \exp(-\alpha t/\tau) \neq 0$. To obtain $y$, we need to solve the two indefinite integrals in Eq. (4) as

$$I_1(t) = \int \frac{y_2 F}{Y}\,\mathrm{d}t = \int t \exp\left(\frac{\alpha}{2\tau}\,t\right) F(x, g)\,\mathrm{d}t, \qquad I_2(t) = \int \frac{y_1 F}{Y}\,\mathrm{d}t = \int \exp\left(\frac{\alpha}{2\tau}\,t\right) F(x, g)\,\mathrm{d}t. \tag{6}$$
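The complementary functions in Eq. (5) and the simplified Wronskian can be checked symbolically. A small verification sketch of ours using sympy (not part of the paper):

```python
import sympy as sp

t, tau, alpha = sp.symbols("t tau alpha", positive=True)
k = alpha / (2 * tau)                    # exponent rate, with beta = alpha / 4
y1 = sp.exp(-k * t)
y2 = t * sp.exp(-k * t)

def residual(y):
    """Left-hand side of the homogeneous ODE in Eq. (3) with beta = alpha/4."""
    return sp.simplify(sp.diff(y, t, 2) + (alpha / tau) * sp.diff(y, t)
                       + (alpha**2 / (4 * tau**2)) * y)

assert residual(y1) == 0 and residual(y2) == 0   # both solve Eq. (3)

# Wronskian Y = y1*y2' - y1'*y2 reduces to exp(-alpha*t/tau) != 0.
Y = sp.simplify(y1 * sp.diff(y2, t) - sp.diff(y1, t) * y2)
assert sp.simplify(Y - sp.exp(-alpha * t / tau)) == 0
```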
Applying the Fundamental Theorem of Calculus, i.e., $\int h(t)\,\mathrm{d}t = \int_0^t h(t')\,\mathrm{d}t' + c$, $c \in \mathbb{R}$, together with the definition of the forcing function $f$ in Eq. (1) and of $F(x, g)$ in Eq. (2), $I_1(t)$ can be expressed as

$$I_1(t) = \frac{1}{\tau^2}\left[\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top \boldsymbol{w}\,\mathrm{d}t' + \int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) \frac{\alpha^2}{4}\,g\,\mathrm{d}t'\right] + c_3 \tag{7}$$

$$= \frac{1}{\tau^2}\left[\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t'\right]\boldsymbol{w} + \left[\left(\frac{\alpha}{2\tau}\,t - 1\right)\exp\left(\frac{\alpha}{2\tau}\,t\right) + 1\right] g + c_3, \tag{8}$$
where $c_3$ is a constant fixed by the BCs. From Eq. (7) to Eq. (8), we move the time-independent parameters $\boldsymbol{w}$ and $g$ out of their corresponding integrals. Notice that the remaining part of the second integral has an analytical solution. The remaining part of the first integral, however, has no closed-form solution because the basis functions $\boldsymbol{\varphi}_x$ may be arbitrarily complex. Denoting these integrals as

$$p_1(t) \equiv \frac{1}{\tau^2}\int_0^t t' \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t', \qquad q_1(t) \equiv \left(\frac{\alpha}{2\tau}\,t - 1\right)\exp\left(\frac{\alpha}{2\tau}\,t\right) + 1, \tag{9}$$
where $p_1$ is an $N$-dim vector and $q_1$ a scalar, we can express $I_1(t) = p_1(t)^\top \boldsymbol{w} + q_1(t)\,g + c_3$. Following the same steps, we can obtain a similar solution for $I_2$, i.e., $I_2(t) = p_2(t)^\top \boldsymbol{w} + q_2(t)\,g + c_4$, where we present $p_2(t)$ and $q_2(t)$ in Eq. (10).

$$p_2(t) = \frac{1}{\tau^2}\int_0^t \exp\left(\frac{\alpha}{2\tau}\,t'\right) x(t')\,\boldsymbol{\varphi}_x^\top\,\mathrm{d}t', \qquad q_2(t) = \frac{\alpha}{2\tau}\left[\exp\left(\frac{\alpha}{2\tau}\,t\right) - 1\right] \tag{10}$$
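Since the first integral has no closed form, $p_1$ and $p_2$ are evaluated numerically once on a time grid, while $q_1$ and $q_2$ use their analytic expressions. A sketch of this offline step (grid resolution, basis definition, and all constants are illustrative assumptions of ours):

```python
import numpy as np

alpha, alpha_x, tau, N = 25.0, 3.0, 1.0, 10
ts = np.linspace(0.0, 1.0, 1001)                   # time grid for offline eval
x = np.exp(-alpha_x / tau * ts)                    # phase variable x(t)
centers = np.exp(-alpha_x / tau * np.linspace(0.0, 1.0, N))
phi = np.exp(-50.0 * (x[:, None] - centers) ** 2)  # (T, N) Gaussian basis
phi /= phi.sum(axis=1, keepdims=True)              # normalize as in Eq. (1)

# Integrands of p2 and p1 from Eqs. (10) and (9); p1 has an extra factor t'.
integrand2 = np.exp(alpha / (2 * tau) * ts)[:, None] * x[:, None] * phi / tau**2
integrand1 = ts[:, None] * integrand2

def cumtrapz(f, t):
    """Cumulative trapezoidal integral of f over t along axis 0, zero at t[0]."""
    out = np.zeros_like(f)
    out[1:] = np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t)[:, None], axis=0)
    return out

p1, p2 = cumtrapz(integrand1, ts), cumtrapz(integrand2, ts)  # (T, N) each
q1 = (alpha * ts / (2 * tau) - 1.0) * np.exp(alpha * ts / (2 * tau)) + 1.0
q2 = alpha / (2 * tau) * (np.exp(alpha * ts / (2 * tau)) - 1.0)
```

The grid only needs to cover the longest trajectory duration used at run time, since the same precomputed values serve every trajectory afterwards.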
B. DMPs' Linear Basis Functions Representation
Substituting the two integrals $I_1$ and $I_2$ in Eq. (4) by their derived forms, the constants $c_3$ and $c_4$ can then be merged into $c_1$ and $c_2$, respectively. We can now express the position of DMPs in Eq. (4) as a summation of the complementary functions $y_1$ and $y_2$, plus a linear basis function representation of the weights $\boldsymbol{w}$ and the goal attractor $g$:

$$y = c_1 y_1 + c_2 y_2 + \begin{bmatrix} y_2 p_2 - y_1 p_1 & \; y_2 q_2 - y_1 q_1 \end{bmatrix} \begin{bmatrix} \boldsymbol{w} \\ g \end{bmatrix} \equiv c_1 y_1 + c_2 y_2 + \boldsymbol{\Phi}^\top \boldsymbol{w}_g, \tag{11}$$
where $\boldsymbol{w}_g$ is an $(N{+}1)$-dim vector containing the weights $\boldsymbol{w}$ and the goal $g$. The resulting basis functions for $\boldsymbol{w}$ and $g$ are represented by $\boldsymbol{\Phi}(t)$, which can be computed numerically, cf. Fig. 1. The constants $c_1$ and $c_2$ are determined by solving a boundary condition problem, where we use the current position and velocity of the robot to inscribe where the trajectory should start.
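Continuing the previous sketch (it reuses ts, p1, p2, q1, q2, alpha, tau, and N from there), the basis $\boldsymbol{\Phi}$ of Eq. (11) can be assembled and $c_1$, $c_2$ fixed by a 2x2 linear solve that enforces the boundary position and velocity. We take numerical derivatives here purely for brevity; the quantities also have analytic derivatives:

```python
import numpy as np

y1 = np.exp(-alpha / (2 * tau) * ts)               # complementary functions
y2 = ts * y1
Phi = np.concatenate([y2[:, None] * p2 - y1[:, None] * p1,
                      (y2 * q2 - y1 * q1)[:, None]], axis=1)   # (T, N+1)

# Derivatives of the basis and complementary functions (numerical here).
dPhi = np.gradient(Phi, ts, axis=0)
dy1, dy2 = np.gradient(y1, ts), np.gradient(y2, ts)

w_g = np.random.randn(N + 1)        # stacked weights w and goal g
b, y_b, yd_b = 0, 0.2, 0.0          # boundary index, position, velocity

# Enforce y(t_b) = y_b and dy/dt(t_b) = yd_b to fix c1 and c2.
A = np.array([[y1[b], y2[b]], [dy1[b], dy2[b]]])
rhs = np.array([y_b - Phi[b] @ w_g, yd_b - dPhi[b] @ w_g])
c1, c2 = np.linalg.solve(A, rhs)

traj = c1 * y1 + c2 * y2 + Phi @ w_g   # trajectory obeying the BCs, Eq. (11)
```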
[Figure: (a) Weights' basis; (b) Goal's basis]
Fig. 1: The basis functions Φ of the ProDMPs can be computed offline and later used in online trajectory generation.
[Figure]
(a) NN + DMPs couple the numerical integration and the learning.
(b) NN + ProDMPs decouple the numerical integration from the learning pipeline and can also model trajectory distributions.
Fig. 2: Comparison of trajectory generation pipelines between (a) NN-based DMPs [9–11] and (b) ProDMPs. The node DNN represents an arbitrary deep neural network architecture. The blue arrows denote the learning pipeline, while the red arrows denote the numerical integration. Our method transforms the expensive numerical integration into basis functions computed offline, which speeds up the trajectory computation and allows trajectory distribution prediction.
TABLE I: Computation time of both pipelines. Here, a 2-DoF, 6-second-long, 1000 Hz trajectory is generated from a 22-dim wg parameter vector. We tested both the forward pass (FP) and the backward pass (BP). A 3-layer fully connected (FC) network with [10, 128, 22] neurons on the input, hidden, and output layers, respectively, is used to simulate the learning procedure. The keyword +BC denotes settings where the boundary conditions are renewed, so that the coefficients c1 and c2 need to be recomputed; otherwise, they remain unchanged. The result shows that our model is 227 to 4659 times faster than the NN-DMPs, depending on the setting. We use an Nvidia® RTX-3080Ti GPU for our test. In a full learning experiment with NN architectures, this speed difference translates into a speed-up of around 10 times (see experiments).

Pipelines   FP          FP + BC     BP          BP + BC
NN-DMPs     0.6057 s    0.6145 s    1.5261 s    1.5737 s
ProDMPs     0.00013 s   0.0027 s    0.00105 s   0.0039 s
Speed-up    ×4659       ×227        ×1453       ×403
In contrast to previous NN-DMPs methods [9–11], our model separates the learnable parameters from the numerical integrals, which are transformed into basis functions. These basis functions are shared by all trajectories generated during the learning procedure. Hence, we can pre-compute them once offline and then use them as constants in online trajectory generation. Consequently, we exclude numerical integration from the forward and backward propagation.
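The following schematic PyTorch training step illustrates the decoupled pipeline of Fig. 2b. The network sizes, the random stand-ins for Φ and the data, and the omission of the boundary terms $c_1 y_1 + c_2 y_2$ are all simplifications of ours, not the paper's architecture:

```python
import torch

T, N, ctx = 6000, 21, 10
Phi = torch.randn(T, N + 1)            # stand-in for the precomputed basis
net = torch.nn.Sequential(torch.nn.Linear(ctx, 128), torch.nn.ReLU(),
                          torch.nn.Linear(128, N + 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

context = torch.randn(32, ctx)         # batch of conditioning inputs
target = torch.randn(32, T)            # ground-truth trajectories

# One training step: no ODE integration appears in the graph; Phi is a
# constant, so forward and backward passes are plain matrix products.
w_g = net(context)                     # predicted [w; g], shape (32, N+1)
traj = w_g @ Phi.T                     # linear basis model, shape (32, T)
loss = torch.nn.functional.mse_loss(traj, target)
opt.zero_grad(); loss.backward(); opt.step()
```

Because the trajectory is now a linear function of the predicted parameters, a Gaussian over w_g maps in closed form to a Gaussian over the trajectory, which is what enables the log-likelihood loss mentioned in Fig. 2b.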