
VP-STO: Via-point-based Stochastic Trajectory Optimization for Reactive Robot Behavior

Julius Jankowski*1,2, Lara Brudermüller*3, Nick Hawes3 and Sylvain Calinon1,2

Abstract: Achieving reactive robot behavior in complex dynamic environments is still challenging as it relies on being able to solve trajectory optimization problems quickly enough, such that we can replan the future motion at frequencies which are sufficiently high for the task at hand. We argue that current limitations in Model Predictive Control (MPC) for robot manipulators arise from inefficient, high-dimensional trajectory representations and the neglect of time-optimality in the trajectory optimization process. Therefore, we propose a motion optimization framework that optimizes jointly over space and time, generating smooth and timing-optimal robot trajectories in joint-space. While being task-agnostic, our formulation can incorporate additional task-specific requirements, such as collision avoidance, and yet maintain real-time control rates, as demonstrated in simulation and real-world robot experiments on closed-loop manipulation. For additional material, please visit https://sites.google.com/oxfordrobotics.institute/vp-sto.

I. INTRODUCTION

In this paper we consider the problem of generating continuous, timing-optimal and smooth trajectories for robots operating in dynamic environments. Such task settings require the robot to be reactive to unforeseen changes in the environment, e.g., due to dynamic obstacles, as well as to be robust and compliant when operating alongside or together with humans. However, generating this kind of reactive and yet efficient robot behavior within a high-dimensional configuration space is significantly challenging. This is especially the case in robot manipulation scenarios with many degrees of freedom (DoFs), as the resulting high-dimensional and multi-objective optimization problems are difficult to solve on-the-fly. A widespread approach in robotics is to formulate the task of motion generation as an optimization problem. Such trajectory-optimization-based methods aim at finding a trajectory that minimizes a cost function, e.g., motion smoothness, subject to constraints, e.g., collision avoidance. Solution strategies can either be gradient-based or sampling-based. Approaches falling in the former category, e.g., CHOMP [1] and TrajOpt [2], typically employ second-order iterative methods to find locally optimal solutions. However, they require the cost function to be once or even twice differentiable, which constitutes a major limitation for manipulation tasks as they usually involve many complex, discontinuous cost terms and constraints. In contrast, sampling-based methods [3], [4] can operate on discontinuous costs by sampling candidate trajectories from a proposal distribution, evaluating them on the objective, and updating the proposal distribution according to their relative performance. Compared to gradient-based optimization, stochastic approaches typically also achieve higher robustness to difficult reward landscapes due to their exploratory properties [5]. Yet, achieving reactive robot behavior is challenging as it requires solving trajectory optimization problems at frequencies which are sufficiently high for the task at hand. This issue can be alleviated in Model Predictive Control (MPC) settings by optimizing over a shorter receding time-horizon. Stochastic, gradient-free trajectory optimization, such as Model-Predictive Path Integral (MPPI) control [6] and the Cross-Entropy-Method (CEM) [4], combined with MPC, also known as sampling-based MPC, has demonstrated state-of-the-art real-time performance on real robotic systems in challenging and dynamic environments [7], [8], [9]. However, these works still suffer from limited long-term anticipation, e.g., getting stuck in front of obstacles, due to the optimization over a short receding horizon.

Fig. 1. Experiment settings. Left: Pick-and-place scenario, where the task is to grasp a bowling pin that is arbitrarily handed over to the robot and to place it upright in the middle of the table. Right: Pushing scenario, where the robot has to push the center of the green coffee packet to a moving target location indicated by the tip of the metal stick.

*Authors contributed equally.

JJ and SC were supported by the Swiss National Science Foundation (SNSF) through the CODIMAN project. LB was supported by an Amazon Web Services Lighthouse scholarship. NH received EPSRC funding via the “From Sensing to Collaboration” programme grant [EP/V000748/1].

1 Idiap Research Institute, Martigny, CH; name.surname@idiap.ch

2 Ecole Polytechnique Fédérale de Lausanne (EPFL), CH

3 Oxford Robotics Institute, University of Oxford, UK; {larab, nickh}@robots.ox.ac.uk.

Motivated by the above, we propose Via-Point-based Stochastic Trajectory Optimization (VP-STO), a framework that introduces the following contributions:

1) A low-dimensional, time-continuous representation of trajectories in joint-space based on via-points that by design respect kinodynamic constraints of the robot.

2) Stochastic via-point optimization, based on an evolutionary strategy, aiming at minimizing movement duration and task-related cost terms.

3) An MPC algorithm optimizing over the full horizon for real-time application in complex high-dimensional task settings, such as closed-loop object manipulation.

arXiv:2210.04067v2 [cs.RO] 14 Mar 2023

II. RELATED WORK

In the context of closed-loop object manipulation with MPC, successful approaches to producing reactive robot behavior typically optimize in joint-space subject to kinodynamic constraints. While Fishman et al. use gradient-based MPC in order to find trajectories for human-robot handovers [10], a very recent approach named STORM [9] employed sampling-based MPC on robotic manipulation tasks. It is able to generate particularly smooth trajectories via low-discrepancy action sampling, smooth interpolation and careful cost function design. Moreover, the parallelizability of sampling-based MPC is exploited by deploying the stochastic tensor optimization framework on a GPU. However, in contrast to our work, the approach relies on optimizing over a short receding horizon.

In the realm of time-parametrization of trajectories, most existing approaches fix the overall motion duration or do not specify it at all. For instance, the majority of MPC-based approaches only handle time implicitly via kinodynamic constraints. While the works of [11], [12] advance the state of the art in time-optimal MPC, their applicability to high-dimensional robotic systems is still limited. In the context of motion planning, T-CHOMP [13] jointly optimizes a trajectory and the corresponding via-point timings. Yet, the total execution time is still fixed in advance. The way we approach the minimization of the movement duration is most similar to the work of [14]. However, in contrast to our work, their approach optimizes via-points and their timing separately.

III. PRELIMINARIES: TRAJECTORY REPRESENTATION

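The way we represent trajectories is based on previous work showing that the closed-form solution to the following optimization problem

$$
\begin{aligned}
\min \quad & \int_0^1 \mathbf{q}''(s)^\top \mathbf{q}''(s)\, ds \\
\text{s.t.} \quad & \mathbf{q}(s_n) = \mathbf{q}_n, \quad n = 1, \dots, N, \\
& \mathbf{q}(0) = \mathbf{q}_0, \quad \mathbf{q}'(0) = \mathbf{q}'_0, \quad \mathbf{q}(1) = \mathbf{q}_T, \quad \mathbf{q}'(1) = \mathbf{q}'_T
\end{aligned}
\tag{1}
$$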

is given by cubic splines [15] and that it can be formulated as a weighted superposition of basis functions [16]. Hence, the robot's configuration is defined as $\mathbf{q}(s) = \boldsymbol{\Phi}(s)\mathbf{w} \in \mathbb{R}^D$, with $D$ being the number of degrees of freedom. The matrix $\boldsymbol{\Phi}(s)$ contains the basis functions, which are weighted by the vector $\mathbf{w}$¹. The trajectory is defined on the interval $\mathcal{S} = [0, 1]$, while the time $t$ maps to the phase variable $s = \frac{t}{T} \in \mathcal{S}$, with $T$ being the total duration of the trajectory. Consequently, joint velocities and accelerations along the trajectory are given by $\dot{\mathbf{q}}(s) = \frac{1}{T}\boldsymbol{\Phi}'(s)\mathbf{w}$ and $\ddot{\mathbf{q}}(s) = \frac{1}{T^2}\boldsymbol{\Phi}''(s)\mathbf{w}$, respectively². The basis function weights $\mathbf{w}$ include the trajectory constraints, consisting of the boundary condition parameters $\mathbf{w}_{\mathrm{bc}} = [\mathbf{q}_0^\top, \mathbf{q}_0'^\top, \mathbf{q}_T^\top, \mathbf{q}_T'^\top]^\top$ and the $N$ via-points the trajectory has to pass through, $\mathbf{q}_{\mathrm{via}} = [\mathbf{q}_1^\top, \dots, \mathbf{q}_N^\top]^\top \in \mathbb{R}^{DN}$, such that $\mathbf{w} = [\mathbf{q}_{\mathrm{via}}^\top, \mathbf{w}_{\mathrm{bc}}^\top]^\top$. Throughout this paper, the via-point timings $s_n$ are assumed to be uniformly distributed in $\mathcal{S}$. Note that boundary velocities map to boundary derivatives w.r.t. $s$ by multiplying them with the total duration $T$, i.e., $\mathbf{q}'_0 = T\dot{\mathbf{q}}_0$ and $\mathbf{q}'_T = T\dot{\mathbf{q}}_T$. Furthermore, the optimization problem in Eq. (1) minimizes not only the integral over the squared second phase derivative $\mathbf{q}''(s)$, but also the integral over the squared accelerations, since $\mathbf{q}''(s) = T^2\ddot{\mathbf{q}}(s)$ and thus the objective $\int_0^1 \ddot{\mathbf{q}}(s)^\top \ddot{\mathbf{q}}(s)\, ds$ directly maps to $\frac{1}{T^4}\int_0^1 \mathbf{q}''(s)^\top \mathbf{q}''(s)\, ds$, corresponding to the control effort. For a given duration $T$, it is minimal iff the objective in Eq. (1) is minimal. As a result, this trajectory representation provides a linear mapping from via-points, boundary conditions and the movement duration to a time-continuous and smooth trajectory.

¹ A more detailed explanation of the basis functions and their derivation can be found in the appendix of [16].

² We use the notation $f'(s)$ for derivatives w.r.t. $s$ and the notation $\dot{f}(s)$ for derivatives w.r.t. $t$.
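The phase-to-time scaling described above can be illustrated for the simplest case. The following is a minimal sketch, assuming a single joint ($D = 1$) and no via-points ($N = 0$), in which case the minimizer of Eq. (1) is a single cubic segment that can be written with the standard cubic Hermite basis. The function names `hermite_basis` and `evaluate` are hypothetical and not taken from the paper.

```python
def hermite_basis(s):
    """Cubic Hermite basis on s in [0, 1] and its first and second derivatives
    w.r.t. s, ordered to match the weights (q0, q0', qT, qT')."""
    phi = (2*s**3 - 3*s**2 + 1, s**3 - 2*s**2 + s, -2*s**3 + 3*s**2, s**3 - s**2)
    dphi = (6*s**2 - 6*s, 3*s**2 - 4*s + 1, -6*s**2 + 6*s, 3*s**2 - 2*s)
    ddphi = (12*s - 6, 6*s - 4, -12*s + 6, 6*s - 2)
    return phi, dphi, ddphi


def evaluate(s, T, q0, dq0, qT, dqT):
    """Return (position, velocity, acceleration) at phase s for duration T.

    dq0 and dqT are boundary velocities w.r.t. time; they are converted to
    derivatives w.r.t. the phase s by multiplying with T.
    """
    w = (q0, T * dq0, qT, T * dqT)
    phi, dphi, ddphi = hermite_basis(s)
    q = sum(b * wi for b, wi in zip(phi, w))
    q_dot = sum(b * wi for b, wi in zip(dphi, w)) / T        # (1/T) * Phi'(s) w
    q_ddot = sum(b * wi for b, wi in zip(ddphi, w)) / T**2   # (1/T^2) * Phi''(s) w
    return q, q_dot, q_ddot


if __name__ == "__main__":
    # Rest-to-rest motion from 0 to 1: compare two durations at mid-phase.
    for T in (2.0, 4.0):
        print(T, evaluate(0.5, T, 0.0, 0.0, 1.0, 0.0))
```

For a rest-to-rest motion, doubling $T$ in this sketch halves the velocity and quarters the acceleration at every phase $s$, which is the $1/T$ and $1/T^2$ scaling stated in the text.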

In the remainder of the paper, we exploit this explicit parameterization with via-points and boundary conditions by optimizing only the via-points while keeping the predefined boundary condition parameters fixed. Thus, we write the computation of the trajectory as a superposition of a via-point term and a boundary constraints term, i.e., $\mathbf{q}(s) = \boldsymbol{\Phi}_{\mathrm{via}}(s)\mathbf{q}_{\mathrm{via}} + \boldsymbol{\Phi}_{\mathrm{bc}}(s)\mathbf{w}_{\mathrm{bc}}$. The matrices $\boldsymbol{\Phi}_{\mathrm{via}}(s)$ and $\boldsymbol{\Phi}_{\mathrm{bc}}(s)$ are extracted from the basis function matrix $\boldsymbol{\Phi}(s)$.
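The superposition can be written out as a block partition. The following is a sketch that assumes the columns of $\boldsymbol{\Phi}(s)$ are ordered like the entries of $\mathbf{w} = [\mathbf{q}_{\mathrm{via}}^\top, \mathbf{w}_{\mathrm{bc}}^\top]^\top$, so that $\boldsymbol{\Phi}_{\mathrm{via}}(s) \in \mathbb{R}^{D \times DN}$ and $\boldsymbol{\Phi}_{\mathrm{bc}}(s) \in \mathbb{R}^{D \times 4D}$; the velocity and acceleration lines follow from applying the same partition to the expressions for $\dot{\mathbf{q}}(s)$ and $\ddot{\mathbf{q}}(s)$ in Sec. III.

```latex
\begin{align*}
\boldsymbol{\Phi}(s) &=
  \begin{bmatrix} \boldsymbol{\Phi}_{\mathrm{via}}(s) & \boldsymbol{\Phi}_{\mathrm{bc}}(s) \end{bmatrix},
  \qquad
  \mathbf{w} = \begin{bmatrix} \mathbf{q}_{\mathrm{via}} \\ \mathbf{w}_{\mathrm{bc}} \end{bmatrix}, \\
\mathbf{q}(s) &= \boldsymbol{\Phi}(s)\mathbf{w}
  = \boldsymbol{\Phi}_{\mathrm{via}}(s)\mathbf{q}_{\mathrm{via}}
  + \boldsymbol{\Phi}_{\mathrm{bc}}(s)\mathbf{w}_{\mathrm{bc}}, \\
\dot{\mathbf{q}}(s) &= \frac{1}{T}\left(
  \boldsymbol{\Phi}'_{\mathrm{via}}(s)\mathbf{q}_{\mathrm{via}}
  + \boldsymbol{\Phi}'_{\mathrm{bc}}(s)\mathbf{w}_{\mathrm{bc}}\right), \\
\ddot{\mathbf{q}}(s) &= \frac{1}{T^2}\left(
  \boldsymbol{\Phi}''_{\mathrm{via}}(s)\mathbf{q}_{\mathrm{via}}
  + \boldsymbol{\Phi}''_{\mathrm{bc}}(s)\mathbf{w}_{\mathrm{bc}}\right).
\end{align*}
```

Note that $\mathbf{w}_{\mathrm{bc}}$ itself depends on $T$ through $\mathbf{q}'_0 = T\dot{\mathbf{q}}_0$ and $\mathbf{q}'_T = T\dot{\mathbf{q}}_T$, so only the via-point term is independent of the duration before the $1/T$ and $1/T^2$ factors are applied.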

IV. VP-STO: VIA-POINT-BASED STOCHASTIC TRAJECTORY OPTIMIZATION

In the following, we introduce our stochastic trajectory optimization framework. The core idea is to find via-points $\mathbf{q}_{\mathrm{via}}$ such that the synthesized trajectory minimizes a task-related objective, i.e.,

$$
\min_{\mathbf{q}_{\mathrm{via}}} \; c\left[\mathbf{q}(s), \dot{\mathbf{q}}(s), \ddot{\mathbf{q}}(s), T\right]. \tag{2}
$$
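Because Eq. (2) only requires evaluating the cost for a given set of via-points, it can be minimized without gradients, even when the cost is discontinuous. The following is a minimal sketch of such a sampling-based loop under several loudly stated assumptions: it uses a simplified elite-averaging update of a Gaussian (in the spirit of the cross-entropy method) rather than CMA-ES, it replaces the spline synthesis by a straight polyline through the via-points, and the 2-D start, goal, disc obstacle, cost terms and all names (`cost`, `cholesky`, `optimize`) are hypothetical.

```python
import math
import random

START, GOAL = (0.0, 0.0), (1.0, 0.0)   # hypothetical 2-D start and goal (D = 2)
OBSTACLE, RADIUS = (0.5, 0.0), 0.15    # hypothetical disc obstacle
N_VIA = 3                              # number of via-points N


def cost(q_via):
    """Toy discontinuous cost: polyline length plus a fixed collision penalty."""
    pts = [START] + [(q_via[2*i], q_via[2*i + 1]) for i in range(N_VIA)] + [GOAL]
    length, hit = 0.0, False
    for a, b in zip(pts, pts[1:]):
        length += math.dist(a, b)
        for k in range(11):            # collision check at 11 points per segment
            p = (a[0] + (b[0] - a[0]) * k / 10, a[1] + (b[1] - a[1]) * k / 10)
            hit = hit or math.dist(p, OBSTACLE) < RADIUS
    return length + (10.0 if hit else 0.0)


def cholesky(A):
    """Lower-triangular L with L L^T = A for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def optimize(seed=0, M=40, n_elite=10, generations=30, sigma0=0.3):
    """Sample M stacked via-point vectors per generation from N(mean, cov)."""
    rng = random.Random(seed)
    dim = 2 * N_VIA
    # Initial mean: via-points evenly spaced on the straight line (in collision).
    mean = [x for i in range(N_VIA) for x in ((i + 1) / (N_VIA + 1), 0.0)]
    cov = [[sigma0**2 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    best, best_cost = list(mean), cost(mean)
    for _ in range(generations):
        L = cholesky(cov)
        population = []
        for _ in range(M):
            z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            x = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
            population.append((cost(x), x))
        population.sort(key=lambda cx: cx[0])
        if population[0][0] < best_cost:
            best_cost, best = population[0]
        # Refit the Gaussian to the elite samples (not the CMA-ES update rule).
        elite = [x for _, x in population[:n_elite]]
        mean = [sum(x[i] for x in elite) / n_elite for i in range(dim)]
        cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in elite) / n_elite
                + (1e-6 if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    return best, best_cost


if __name__ == "__main__":
    print(optimize())
```

Each sample is one stacked vector of all via-points drawn from a single Gaussian with a full covariance matrix, so correlations between via-points can be represented. A full CMA-ES implementation additionally adapts the step size and uses evolution paths, which this sketch omits.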

Based on these via-points, we efficiently synthesize high-quality trajectories, i.e., $\mathbf{q}_{\mathrm{via}} \rightarrow \xi$ with $\xi = \{\mathbf{q}(s), \dot{\mathbf{q}}(s), \ddot{\mathbf{q}}(s), T\}$. We aim at synthesizing trajectories that by design minimize task-agnostic objectives, i.e., minimum time and smoothness, and satisfy task-agnostic constraints, i.e., equality constraints on the initial and final state and inequality constraints on joint-space velocities and accelerations. We employ stochastic black-box optimization, namely Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [5], to optimize for the via-points. As each trajectory constructed from the sampled via-points already provides the optimal solution to the optimization problem given in Eq. (1), the CMA-ES optimization in the low-dimensional via-point space is particularly fast, evaluating only high-quality trajectories. Moreover, with CMA-ES we are not only able to quickly converge to a local minimum, but also to leverage the exploration aspect of the evolutionary strategy (ES). In more detail, this nested optimization process, which is also illustrated in Fig. 2, comprises the following steps. First, a new population of $M$ via-point vectors $\mathbf{q}_{\mathrm{via}}$ is sampled from a Gaussian distribution $\mathcal{N}(\boldsymbol{\mu}_{\mathrm{via}}, \boldsymbol{\Sigma}_{\mathrm{via}})$. As $\mathbf{q}_{\mathrm{via}}$ is a vector of the stacked via-points, note that $\boldsymbol{\mu}_{\mathrm{via}} \in \mathbb{R}^{DN}$ and $\boldsymbol{\Sigma}_{\mathrm{via}} \in \mathbb{R}^{DN \times DN}$. By taking $M$ samples in this higher-dimensional space, instead of $M \cdot N$ samples for all via-points separately
