VP-STO: Via-point-based Stochastic Trajectory Optimization for
Reactive Robot Behavior
Julius Jankowski1,2, Lara Brudermüller3, Nick Hawes3 and Sylvain Calinon1,2
Abstract: Achieving reactive robot behavior in complex
dynamic environments is still challenging as it relies on being
able to solve trajectory optimization problems quickly enough,
such that we can replan the future motion at frequencies
which are sufficiently high for the task at hand. We argue that
current limitations in Model Predictive Control (MPC) for robot
manipulators arise from inefficient, high-dimensional trajectory
representations and the neglect of time-optimality in the
trajectory optimization process. Therefore, we propose a motion
optimization framework that optimizes jointly over space and
time, generating smooth and timing-optimal robot trajectories
in joint-space. While being task-agnostic, our formulation
can incorporate additional task-specific requirements, such as
collision avoidance, and yet maintain real-time control rates,
demonstrated in simulation and real-world robot experiments
on closed-loop manipulation. For additional material, please
visit https://sites.google.com/oxfordrobotics.institute/vp-sto.
*Authors contributed equally.
JJ and SC were supported by the Swiss National Science Foundation (SNSF) through the CODIMAN project. LB was supported by an Amazon Web Services Lighthouse scholarship. NH received EPSRC funding via the “From Sensing to Collaboration” programme grant [EP/V000748/1].
1 Idiap Research Institute, Martigny, CH; name.surname@idiap.ch
2 École Polytechnique Fédérale de Lausanne (EPFL), CH
3 Oxford Robotics Institute, University of Oxford, UK; {larab, nickh}@robots.ox.ac.uk

I. INTRODUCTION
In this paper we consider the problem of generating continuous, timing-optimal and smooth trajectories for robots operating in dynamic environments. Such task settings require the robot to be reactive to unforeseen changes in the environment, e.g., due to dynamic obstacles, as well as to be robust and compliant when operating alongside or together with humans. However, generating this kind of reactive and yet efficient robot behavior within a high-dimensional configuration space is significantly challenging. This is especially the case in robot manipulation scenarios with many degrees of freedom (DoFs) as the resulting high-dimensional and multi-objective optimization problems are difficult to solve on-the-fly.

Fig. 1. Experiment settings. Left: Pick-and-place scenario, where the task is to grasp a bowling pin that is arbitrarily handed over to the robot and to place it upright in the middle of the table. Right: Pushing scenario, where the robot has to push the center of the green coffee packet to a moving target location indicated by the tip of the metal stick.

A widespread approach in robotics is to formulate the task of motion generation as an optimization problem. Such trajectory-optimization based methods aim at finding a trajectory that minimizes a cost function, e.g., motion smoothness, subject to constraints, e.g., collision avoidance. Solution strategies can either be gradient-based or sampling-based. Approaches falling in the former category, e.g., CHOMP [1] and TrajOpt [2], typically employ second-order iterative methods to find locally optimal solutions. However, they require the cost function to be once or even twice-differentiable, which constitutes a major limitation for manipulation tasks as they usually involve many complex, discontinuous cost terms and constraints. In contrast, sampling-based methods [3], [4] can operate on discontinuous costs by sampling candidate trajectories from a proposal distribution, evaluating them on the objective, and updating the proposal distribution according to their relative performance. Compared to gradient-based optimization, stochastic approaches typically also achieve higher robustness to difficult reward landscapes due to their exploratory properties [5].

Yet, achieving reactive robot behavior is challenging as it requires solving trajectory optimization problems at frequencies which are sufficiently high for the task at hand. This issue can be alleviated in Model Predictive Control (MPC) settings by optimizing over a shorter receding time-horizon. Stochastic, gradient-free trajectory optimization, such as Model-Predictive Path Integral (MPPI) control [6] and the Cross-Entropy-Method (CEM) [4], combined with MPC, also known as sampling-based MPC, has proven state-of-the-art real-time performance on real robotic systems in challenging and dynamic environments [7], [8], [9]. However, these works still suffer from limited long-term anticipation, e.g., getting stuck in front of obstacles, due to the optimization over a short receding horizon.
Motivated by the above, we propose Via-Point-based Stochastic Trajectory Optimization (VP-STO), a framework that introduces the following contributions:
1) A low-dimensional, time-continuous representation of trajectories in joint-space based on via-points that by design respect kinodynamic constraints of the robot.
2) Stochastic via-point optimization, based on an evolutionary strategy, aiming at minimizing movement duration and task-related cost terms.
3) An MPC algorithm optimizing over the full horizon for real-time application in complex high-dimensional task settings, such as closed-loop object manipulation.
arXiv:2210.04067v2 [cs.RO] 14 Mar 2023
II. RELATED WORK
In the context of closed-loop object manipulation with MPC, successful approaches to producing reactive robot behavior typically optimize in joint-space subject to kinodynamic constraints. While Fishman et al. use gradient-based MPC in order to find trajectories for human-robot handovers [10], a very recent approach named STORM [9] employed sampling-based MPC on robotic manipulation tasks. It is able to generate particularly smooth trajectories via low-discrepancy action sampling, smooth interpolation and careful cost function design. Moreover, the parallelizability of sampling-based MPC is exploited by deploying the stochastic tensor optimization framework on a GPU. However, in contrast to our work, the approach relies on optimizing over a short receding horizon.

In the realm of time-parametrization of trajectories, most existing approaches fix the overall motion duration or do not specify it at all. For instance, the majority of MPC-based approaches only handle time implicitly via kinodynamic constraints. While the works of [11], [12] progress the state of the art in time-optimal MPC, their applicability to high-dimensional robotic systems is still limited. In the context of motion planning, T-CHOMP [13] jointly optimizes a trajectory and the corresponding via-point timings. Yet, the total execution time is still fixed in advance. The way we approach the minimization of the movement duration is most similar to the work of [14]. However, in contrast to our work, their approach optimizes via-points and their timing separately.
III. PRELIMINARIES: TRAJECTORY REPRESENTATION
The way we represent trajectories is based on previous work showing that the closed-form solution to the following optimization problem

$$
\begin{aligned}
\min \quad & \int_0^1 \mathbf{q}''(s)^\top \mathbf{q}''(s)\, ds \\
\text{s.t.} \quad & \mathbf{q}(s_n) = \mathbf{q}_n, \quad n = 1, \dots, N, \\
& \mathbf{q}(0) = \mathbf{q}_0, \quad \mathbf{q}'(0) = \mathbf{q}'_0, \quad \mathbf{q}(1) = \mathbf{q}_T, \quad \mathbf{q}'(1) = \mathbf{q}'_T
\end{aligned}
\tag{1}
$$

is given by cubic splines [15] and that it can be formulated as a weighted superposition of basis functions [16]. Hence, the robot’s configuration is defined as $\mathbf{q}(s) = \boldsymbol{\Phi}(s)\mathbf{w} \in \mathbb{R}^D$, with $D$ being the number of degrees of freedom. The matrix $\boldsymbol{\Phi}(s)$ contains the basis functions which are weighted by the vector $\mathbf{w}$.¹ The trajectory is defined on the interval $\mathcal{S} = [0, 1]$, while the time $t$ maps to the phase variable $s = \frac{t}{T} \in \mathcal{S}$ with $T$ being the total duration of the trajectory. Consequently, joint velocities and accelerations along the trajectory are given by $\dot{\mathbf{q}}(s) = \frac{1}{T}\boldsymbol{\Phi}'(s)\mathbf{w}$ and $\ddot{\mathbf{q}}(s) = \frac{1}{T^2}\boldsymbol{\Phi}''(s)\mathbf{w}$, respectively.² The basis function weights $\mathbf{w}$ include the trajectory constraints consisting of the boundary condition parameters $\mathbf{w}_{bc} = [\mathbf{q}_0^\top, \mathbf{q}_0'^\top, \mathbf{q}_T^\top, \mathbf{q}_T'^\top]^\top$ and $N$ via-points the trajectory has to pass through $\mathbf{q}_{via} = [\mathbf{q}_1^\top, \dots, \mathbf{q}_N^\top]^\top \in \mathbb{R}^{DN}$, such that $\mathbf{w} = [\mathbf{q}_{via}^\top, \mathbf{w}_{bc}^\top]^\top$. Throughout this paper, the via-point timings $s_n$ are assumed to be uniformly distributed in $\mathcal{S}$. Note that boundary velocities map to boundary derivatives w.r.t. $s$ by multiplying them with the total duration $T$, i.e., $\mathbf{q}'_0 = T\dot{\mathbf{q}}_0$ and $\mathbf{q}'_T = T\dot{\mathbf{q}}_T$. Furthermore, the optimization problem in Eq. (1) minimizes not only the objective $\mathbf{q}''(s)$, but also the integral over accelerations, since $\mathbf{q}''(s) = T^2\ddot{\mathbf{q}}(s)$ and thus the objective $\int_0^1 \ddot{\mathbf{q}}(s)^\top \ddot{\mathbf{q}}(s)\, ds$ directly maps to $\frac{1}{T^4}\int_0^1 \mathbf{q}''(s)^\top \mathbf{q}''(s)\, ds$, corresponding to the control effort. It is minimal iff the objective in Eq. (1) is minimal. As a result, this trajectory representation provides a linear mapping from via-points, boundary conditions and the movement duration to a time-continuous and smooth trajectory.

¹ A more detailed explanation of the basis functions and their derivation can be found in the appendix of [16].
² We use the notation $f'(s)$ for derivatives w.r.t. $s$ and the notation $\dot{f}(s)$ for derivatives w.r.t. $t$.
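
The scaling of the derivatives with the duration follows from the chain rule applied to the phase variable $s = t/T$. The short derivation below only restates the relations of this section in the same notation; it introduces no new symbols and assumes a fixed duration $T > 0$.

```latex
\begin{align}
  \frac{d}{dt} &= \frac{ds}{dt}\,\frac{d}{ds} = \frac{1}{T}\,\frac{d}{ds}
  \quad\Rightarrow\quad
  \dot{\mathbf{q}}(s) = \frac{1}{T}\,\mathbf{q}'(s), \qquad
  \ddot{\mathbf{q}}(s) = \frac{1}{T^{2}}\,\mathbf{q}''(s), \\
  \int_0^1 \ddot{\mathbf{q}}(s)^\top \ddot{\mathbf{q}}(s)\, ds
  &= \frac{1}{T^{4}} \int_0^1 \mathbf{q}''(s)^\top \mathbf{q}''(s)\, ds .
\end{align}
```

Because the factor $1/T^4$ is a positive constant for fixed $T$, both integrals are minimized by the same trajectory, which is why the solution of Eq. (1) also minimizes the control effort.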
In the remainder of the paper, we exploit this explicit parameterization with via-points and boundary conditions by optimizing only the via-points while keeping the predefined boundary condition parameters fixed. Thus, we write the computation of the trajectory as a superposition of a via-point term and a boundary constraints term, i.e., $\mathbf{q}(s) = \boldsymbol{\Phi}_{via}(s)\mathbf{q}_{via} + \boldsymbol{\Phi}_{bc}(s)\mathbf{w}_{bc}$. The matrices $\boldsymbol{\Phi}_{via}(s)$ and $\boldsymbol{\Phi}_{bc}(s)$ are extracted from the basis function matrix $\boldsymbol{\Phi}(s)$.
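
The superposition can be checked numerically with a minimal sketch for a single degree of freedom. It is not the authors' implementation: it assumes that the via-point timings are s_n = n/(N+1), evaluates the clamped cubic spline directly instead of using the basis functions of [16], and all function names (`clamped_spline`, `basis_row`) are hypothetical. The basis row is recovered by feeding unit weight vectors into the spline, which works because the spline is linear in its weights.

```python
def clamped_spline(q_via, w_bc, s):
    """Evaluate a 1-DoF C2 cubic spline at phase s in [0, 1].

    q_via: list of N via-point positions (assumed at s_n = n / (N + 1)).
    w_bc:  (q0, dq0, qT, dqT), boundary positions and derivatives w.r.t. s.
    """
    q0, dq0, qT, dqT = w_bc
    y = [q0] + list(q_via) + [qT]            # knot values
    n = len(y) - 1                           # number of segments
    h = 1.0 / n                              # uniform knot spacing
    # Knot slopes m_i: m_0 and m_n are given; interior slopes follow from
    # C2 continuity: m_{i-1} + 4 m_i + m_{i+1} = 3 (y_{i+1} - y_{i-1}) / h.
    m = [0.0] * (n + 1)
    m[0], m[n] = dq0, dqT
    if n > 1:
        rhs = [3.0 * (y[i + 1] - y[i - 1]) / h for i in range(1, n)]
        rhs[0] -= m[0]
        rhs[-1] -= m[n]
        diag = [4.0] * (n - 1)
        # Thomas algorithm for the tridiagonal system with bands (1, 4, 1).
        for i in range(1, n - 1):
            f = 1.0 / diag[i - 1]
            diag[i] -= f
            rhs[i] -= f * rhs[i - 1]
        sol = [0.0] * (n - 1)
        sol[-1] = rhs[-1] / diag[-1]
        for i in range(n - 3, -1, -1):
            sol[i] = (rhs[i] - sol[i + 1]) / diag[i]
        m[1:n] = sol
    # Cubic Hermite evaluation on the segment that contains s.
    i = min(int(s * n), n - 1)
    u = s * n - i                            # local coordinate in [0, 1]
    h00 = 2 * u**3 - 3 * u**2 + 1
    h10 = u**3 - 2 * u**2 + u
    h01 = -2 * u**3 + 3 * u**2
    h11 = u**3 - u**2
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]


def basis_row(N, s):
    """Row [Phi_via(s), Phi_bc(s)], obtained by evaluating unit weight vectors."""
    row = []
    for k in range(N + 4):
        w = [0.0] * (N + 4)
        w[k] = 1.0
        row.append(clamped_spline(w[:N], tuple(w[N:]), s))
    return row


q_via = [0.3, -0.2]                # two via-points (illustrative values)
w_bc = (0.0, 0.5, 1.0, 0.0)        # q0, q0', qT, qT' (illustrative values)
s = 0.45
row = basis_row(len(q_via), s)
q_linear = sum(p * w for p, w in zip(row, q_via + list(w_bc)))
q_direct = clamped_spline(q_via, w_bc, s)
```

Here `q_linear` and `q_direct` agree up to rounding, and the first two entries of `row` play the role of $\boldsymbol{\Phi}_{via}(s)$ while the last four play the role of $\boldsymbol{\Phi}_{bc}(s)$.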
IV. VP-STO: VIA-POINT-BASED STOCHASTIC
TRAJECTORY OPTIMIZATION
In the following, we introduce our stochastic trajectory optimization framework. The core idea is to find via-points $\mathbf{q}_{via}$ such that the synthesized trajectory minimizes a task-related objective, i.e.,

$$
\min_{\mathbf{q}_{via}} \; c\left[\mathbf{q}(s), \dot{\mathbf{q}}(s), \ddot{\mathbf{q}}(s), T\right]. \tag{2}
$$

Based on these via-points, we efficiently synthesize high-quality trajectories, i.e., $\mathbf{q}_{via} \rightarrow \xi$ with $\xi = \{\mathbf{q}(s), \dot{\mathbf{q}}(s), \ddot{\mathbf{q}}(s), T\}$. We aim at synthesizing trajectories that by design minimize task-agnostic objectives, i.e., minimum time and smoothness, and satisfy task-agnostic constraints, i.e., equality constraints on the initial and final state and inequality constraints on joint-space velocities and accelerations. We employ stochastic black-box optimization, namely Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [5], to optimize for the via-points. As each trajectory constructed from the sampled via-points already provides the optimal solution to the optimization problem given in Eq. (1), the CMA-ES optimization in the low-dimensional via-point space is particularly fast, evaluating only high-quality trajectories.
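
The sample, evaluate and update cycle of such a stochastic optimizer over stacked via-point vectors can be illustrated with a deliberately simplified sketch. It is not CMA-ES and not the authors' implementation: it refits a Gaussian with diagonal covariance to the best samples (a cross-entropy-style update), and the cost function, the forbidden band and all names and parameter values are hypothetical choices made only for illustration. The toy cost is evaluated on the via-points directly rather than on a synthesized trajectory.

```python
import random
import statistics


def cost(q_via):
    """Hypothetical black-box cost over a stacked via-point vector (1 DoF)."""
    # Smooth term: squared steps along the path 0 -> via-points -> 1.
    pts = [0.0] + list(q_via) + [1.0]
    smooth = sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))
    # Discontinuous term: fixed penalty for every via-point in a forbidden band.
    penalty = sum(10.0 for q in q_via if 0.4 < q < 0.6)
    return smooth + penalty


def optimize_via_points(dim, pop_size=32, n_elite=8, iters=40, seed=0):
    """Simplified evolutionary strategy (diagonal Gaussian, elite refit)."""
    rng = random.Random(seed)
    mu = [0.5] * dim          # mean of the sampling distribution
    sigma = [0.3] * dim       # per-dimension standard deviation
    best, best_cost = None, float("inf")
    for _ in range(iters):
        # Sample M = pop_size stacked via-point vectors.
        pop = [[rng.gauss(m, s) for m, s in zip(mu, sigma)]
               for _ in range(pop_size)]
        # Rank candidates by cost; no gradients are needed.
        pop.sort(key=cost)
        if cost(pop[0]) < best_cost:
            best, best_cost = pop[0], cost(pop[0])
        # Refit the distribution to the best candidates.
        elite = pop[:n_elite]
        mu = [statistics.fmean([x[d] for x in elite]) for d in range(dim)]
        sigma = [statistics.pstdev([x[d] for x in elite]) + 1e-6
                 for d in range(dim)]
    return best, best_cost


best, best_cost = optimize_via_points(dim=3)
```

The penalty term makes the cost discontinuous, which would rule out a gradient-based solver but poses no difficulty here, since candidates are only ranked by their cost values.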
Moreover, with CMA-ES we are not only able to quickly converge to a local minimum, but also to leverage the exploration aspect of the evolutionary strategy (ES). In more detail, this nested optimization process, which is also illustrated in Fig. 2, comprises the following steps. First, a new population of $M$ via-points $\mathbf{q}_{via}$ is sampled from a Gaussian distribution $\mathcal{N}(\boldsymbol{\mu}_{via}, \boldsymbol{\Sigma}_{via})$. As $\mathbf{q}_{via}$ is a vector of the stacked via-points, note that $\boldsymbol{\mu}_{via} \in \mathbb{R}^{DN}$ and $\boldsymbol{\Sigma}_{via} \in \mathbb{R}^{DN \times DN}$. By taking $M$ samples in this higher-dimensional space, instead of $M \cdot N$ samples for all via points separately