Constrained Differential Dynamic Programming:
A primal-dual augmented Lagrangian approach
Wilson Jallet (a,b,*), Antoine Bambade (b,c), Nicolas Mansard (a), and Justin Carpentier (b)
Abstract— Trajectory optimization is an efficient approach
for solving optimal control problems for complex robotic
systems. It relies on two key components: first the tran-
scription into a sparse nonlinear program, and second the
corresponding solver to iteratively compute its solution. On
one hand, differential dynamic programming (DDP) provides
an efficient approach to transcribe the optimal control problem
into a finite-dimensional problem while optimally exploiting
the sparsity induced by time. On the other hand, augmented
Lagrangian methods make it possible to formulate efficient
algorithms with advanced constraint-satisfaction strategies. In
this paper, we propose to combine these two approaches into
an efficient optimal control algorithm accepting both equality
and inequality constraints. Based on the augmented Lagrangian
literature, we first derive a generic primal-dual augmented
Lagrangian strategy for nonlinear problems with equality
and inequality constraints. We then apply it to the dynamic
programming principle to solve the value-greedy optimization
problems inherent to the backward pass of DDP, which we
combine with a dedicated globalization strategy, resulting in
a Newton-like algorithm for solving constrained trajectory
optimization problems. Contrary to previous attempts at formulating
an augmented Lagrangian version of DDP, our approach
exhibits adequate convergence properties without any switch in
strategies. We empirically demonstrate its merits on several
case studies from the robotics literature.
I. INTRODUCTION
In this paper, we are interested in solving constrained
continuous-time optimal control problems (OCP) of the
form:
  \min_{x,u}  \int_0^T \ell(t, x(t), u(t)) \, dt + \ell_T(x(T))      (1a)
  s.t.  f(t, x(t), u(t), \dot{x}(t)) = 0,   t \in [0, T)             (1b)
        x(0) = \bar{x}_0                                             (1c)
        h(t, x(t), u(t)) \le 0                                       (1d)
        h_T(x(T)) \le 0,                                             (1e)
where \ell and \ell_T are the running and terminal costs respectively,
(1b) accounts for the system dynamics written as a
differential-algebraic equation (DAE) (and includes the ODE
case \dot{x} = f(t, x(t), u(t))). We denote by \mathcal{X} and \mathcal{U} the state and
control spaces, T > 0 the time horizon, \bar{x}_0 \in \mathcal{X} the initial
condition, and h(\cdot) and h_T(\cdot) the path and terminal constraints.

(a) LAAS-CNRS, 7 Avenue du Colonel Roche, F-31400 Toulouse, France
(b) Inria, Département d'informatique de l'ENS, École normale supérieure, CNRS, PSL Research University, Paris, France
(c) ENPC, France. *Corresponding author: wjallet@laas.fr

This work was supported in part by HPC resources from GENCI-IDRIS
(Grant AD011011342), the French government under the management of
Agence Nationale de la Recherche as part of the "Investissements d'avenir"
program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute) and ANR-19-P3IA-000
(ANITI 3IA Institute), the Louis Vuitton ENS Chair on Artificial
Intelligence, and the European project MEMMO (Grant 780684).
For numerical resolution, the continuous-time OCP (1)
must be transcribed into a finite-dimensional optimization
problem (i.e., with a finite number of variables, which
the continuous-time trajectories are not) [1]. Several tran-
scriptions are possible [2], [3], [4]. Differential Dynamic
Programming (DDP) is a particular optimal control algorithm which
entails a direct transcription known as single shooting [5].
Popularized in robotics in the late 2000s [6], it has the
advantage over other transcriptions of providing a simple
formulation, optimally exploiting the sparsity of the resulting
nonlinear programs while providing feedback gains at no
extra cost. The corresponding transcription, extended to any
constraints, reads:
  \min_{x,u}  \sum_{k=0}^{N-1} \ell_k(x_k, u_k) + \ell_N(x_N)        (2a)
  s.t.  f_k(x_k, u_k, x_{k+1}) = 0,   k \in \llbracket 0, N-1 \rrbracket   (2b)
        x_0 = \bar{x}_0                                              (2c)
        h_k(x_k, u_k) \le 0                                          (2d)
        h_N(x_N) \le 0,                                              (2e)
where h_k, h_N, f_k are appropriate functions discretizing the
dynamics and path constraints, depending on the numerical
discretization scheme employed. The \ell_k are approximations
of the cost integrals \int_{t_k}^{t_{k+1}} \ell(t, x(t), u(t)) \, dt. We use
the shorthands x \overset{def}{=} (x_0, \ldots, x_N) and u \overset{def}{=} (u_0, \ldots, u_{N-1})
for the discretized state and control trajectories.
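To make the single-shooting structure of transcription (2) concrete, the following minimal sketch rolls out a discrete dynamics model and evaluates the discretized cost and inequality-constraint values. The double-integrator dynamics, quadratic cost, and control bounds are hypothetical choices for illustration only, not taken from the paper; explicit Euler integration is used as a special case of the implicit constraint f_k(x_k, u_k, x_{k+1}) = 0.

```python
import numpy as np

def rollout(x0, us, dt):
    """Roll out double-integrator dynamics p' = v, v' = u with explicit Euler.

    This realizes the single-shooting idea: the state trajectory x is
    entirely determined by the initial state x0 and the controls u.
    """
    xs = [np.asarray(x0, dtype=float)]
    for u in us:
        p, v = xs[-1]
        xs.append(np.array([p + dt * v, v + dt * u]))
    return xs

def evaluate(x0, us, dt, u_max=1.0):
    """Total discretized cost (2a) and constraint values h_k(x_k, u_k) <= 0 (2d)."""
    xs = rollout(x0, us, dt)
    # Running costs l_k approximate the integral of a quadratic running cost.
    cost = sum(dt * (x @ x + u * u) for x, u in zip(xs[:-1], us))
    cost += xs[-1] @ xs[-1]               # terminal cost l_N(x_N)
    # Control bounds |u_k| <= u_max written in the form h_k <= 0.
    cons = [abs(u) - u_max for u in us]
    return cost, cons

cost, cons = evaluate([1.0, 0.0], [0.0, -0.5, 0.2], dt=0.1)
```

A constrained solver such as the one proposed in the paper would then drive the cost down while keeping the entries of `cons` nonpositive; here the chosen controls already satisfy the bounds.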
While the nominal DDP algorithm cannot handle path
constraints, implicit integrators, or multiple-shooting
stabilization, several improvements have been proposed over
the years to equip it with these capabilities. In this paper, we
focus on the handling of equality and inequality constraints,
and we first review previous work on this topic.
A first subcase of interest only considers OCP with control
bounds, which can be handled by a projected quasi-Newton
approach [7]. Several other projection-based formulations
have then been proposed to extend DDP [8], [9], none of
which have been shown to be robust enough to be widely
adopted in robotics. To account for inequality constraints,
interior-point methods [10], [11] have also been recently
investigated; however, these do not allow for easy warm-starting [3],
making them unsuitable for online optimization and
application to model-predictive control (MPC) [12], [13].
In the past few years, augmented Lagrangian approaches
have emerged as a suitable solution for solving constrained
arXiv:2210.15409v2 [cs.RO] 28 Oct 2022