Constrained Differential Dynamic Programming:
A primal-dual augmented Lagrangian approach
Wilson Jallet^{a,b,*}, Antoine Bambade^{b,c}, Nicolas Mansard^a and Justin Carpentier^b

Abstract: Trajectory optimization is an efficient approach
for solving optimal control problems for complex robotic
systems. It relies on two key components: first the tran-
scription into a sparse nonlinear program, and second the
corresponding solver to iteratively compute its solution. On
one hand, differential dynamic programming (DDP) provides
an efficient approach to transcribe the optimal control problem
into a finite-dimensional problem while optimally exploiting
the sparsity induced by time. On the other hand, augmented
Lagrangian methods make it possible to formulate efficient
algorithms with advanced constraint-satisfaction strategies. In
this paper, we propose to combine these two approaches into
an efficient optimal control algorithm accepting both equality
and inequality constraints. Based on the augmented Lagrangian
literature, we first derive a generic primal-dual augmented
Lagrangian strategy for nonlinear problems with equality
and inequality constraints. We then apply it to the dynamic
programming principle to solve the value-greedy optimization
problems inherent to the backward pass of DDP, which we
combine with a dedicated globalization strategy, resulting in
a Newton-like algorithm for solving constrained trajectory
optimization problems. Contrary to previous attempts at formulating
an augmented Lagrangian version of DDP, our approach
exhibits adequate convergence properties without any switch in
strategies. We empirically demonstrate its interest with several
case-studies from the robotics literature.
I. INTRODUCTION
In this paper, we are interested in solving constrained
continuous-time optimal control problems (OCP) of the
form:
$$
\begin{aligned}
\min_{x,u}\ & \int_0^T \ell(t, x(t), u(t))\,\mathrm{d}t + \ell_T(x(T)) && \text{(1a)}\\
\text{s.t.}\ & f(t, x(t), u(t), \dot x(t)) = 0, \quad t \in [0, T) && \text{(1b)}\\
& x(0) = \bar x_0 && \text{(1c)}\\
& h(t, x(t), u(t)) \leq 0 && \text{(1d)}\\
& h_T(x(T)) \leq 0, && \text{(1e)}
\end{aligned}
$$
where $\ell$ and $\ell_T$ are the running and terminal costs respectively, (1b) accounts for the system dynamics written as a differential-algebraic equation (DAE) (and includes the ODE case $\dot x = f(t, x(t), u(t))$). We denote $\mathcal{X}$ and $\mathcal{U}$ the state and control spaces, $T > 0$ the time horizon, $\bar x_0 \in \mathcal{X}$ the initial condition, and $h(\cdot)$ and $h_T(\cdot)$ the path and terminal constraints.

^a LAAS-CNRS, 7 Avenue du Colonel Roche, F-31400 Toulouse, France
^b Inria, Département d'informatique de l'ENS, École normale supérieure, CNRS, PSL Research University, Paris, France
^c ENPC, France. *Corresponding author: wjallet@laas.fr
This work was supported in part by HPC resources from GENCI-IDRIS (Grant AD011011342), the French government under the management of Agence Nationale de la Recherche as part of the "Investissements d'avenir" program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute) and ANR-19-P3IA-000 (ANITI 3IA Institute), the Louis Vuitton ENS Chair on Artificial Intelligence, and the European project MEMMO (Grant 780684).
For numerical resolution, the continuous-time OCP (1)
must be transcribed into a finite-dimensional optimization
problem (i.e., with a finite number of variables, which
the continuous-time trajectories are not) [1]. Several tran-
scriptions are possible [2], [3], [4]. Differential Dynamic
Programming (DDP) is a particular optimal control algorithm
relying on a direct transcription known as single shooting [5].
Popularized in robotics in the late 2000s [6], it has the
advantage over other transcriptions of providing a simple
formulation, optimally exploiting the sparsity of the resulting
nonlinear programs while providing feedback gains at no
extra cost. The corresponding transcription, extended to any
constraints, reads:
$$
\begin{aligned}
\min_{\mathbf{x},\mathbf{u}}\ & \sum_{k=0}^{N-1} \ell_k(x_k, u_k) + \ell_N(x_N) && \text{(2a)}\\
\text{s.t.}\ & f_k(x_k, u_k, x_{k+1}) = 0, \quad k \in \llbracket 0, N-1 \rrbracket && \text{(2b)}\\
& x_0 = \bar x_0 && \text{(2c)}\\
& h_k(x_k, u_k) \leq 0 && \text{(2d)}\\
& h_N(x_N) \leq 0, && \text{(2e)}
\end{aligned}
$$
where $h_k$, $h_N$, $f_k$ are appropriate functions discretizing the dynamics and path constraints, depending on the numerical discretization scheme employed. The $\ell_k$ are approximations of the cost integrals $\int_{t_k}^{t_{k+1}} \ell(t, x(t), u(t))\,\mathrm{d}t$. We use the shorthands $\mathbf{x} \stackrel{\text{def}}{=} (x_0, \ldots, x_N)$ and $\mathbf{u} \stackrel{\text{def}}{=} (u_0, \ldots, u_{N-1})$ for the discretized state and control trajectories.
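To make the transcription (2) concrete, here is a minimal sketch (our own illustration, not the paper's solver) that evaluates the cost (2a) and the dynamics gaps (2b) on a toy explicit-Euler double integrator. All model choices, names, and dimensions are illustrative assumptions.

```python
import numpy as np

# Toy instance of (2): explicit Euler on a double integrator,
# quadratic running and terminal costs. Illustrative only.
N, dt = 20, 0.05
x0_bar = np.array([1.0, 0.0])
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])

def f_k(x, u, x_next):
    # Dynamics in implicit form f_k(x_k, u_k, x_{k+1}) = 0, as in (2b).
    return x_next - (A @ x + B @ u)

def ell_k(x, u):
    # Running cost ell_k(x_k, u_k).
    return 0.5 * (x @ x) + 0.05 * float(u @ u)

def ell_N(x):
    # Terminal cost ell_N(x_N).
    return 5.0 * (x @ x)

def objective_and_gaps(xs, us):
    """Cost (2a) and worst dynamics gap (2b) along a trajectory."""
    cost = sum(ell_k(xs[k], us[k]) for k in range(N)) + ell_N(xs[N])
    gaps = [f_k(xs[k], us[k], xs[k + 1]) for k in range(N)]
    return cost, float(np.max(np.abs(gaps)))

# A single-shooting rollout satisfies (2b) and (2c) by construction.
us = [np.zeros(1) for _ in range(N)]
xs = [x0_bar]
for k in range(N):
    xs.append(A @ xs[k] + B @ us[k])

cost, gap = objective_and_gaps(xs, us)
print(cost, gap)  # the gap is 0.0 by construction for a rollout
```

This also illustrates why single shooting is sparse in time: each gap couples only $(x_k, u_k, x_{k+1})$, the structure DDP exploits in its backward pass.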
While the nominal DDP algorithm is not able to handle
path constraints, implicit integrators or multiple-shooting
stabilization, several improvements have been proposed over
the years to equip it with these properties. In this paper, we
focus on the handling of equality and inequality constraints,
and we first review previous work focusing on it.
A first subcase of interest only considers OCP with control
bounds, which can be handled by a projected quasi-Newton
approach [7]. Several other projection-based formulations
have then been proposed to extend DDP [8], [9], none of
which have been shown to be robust enough to be widely
adopted in robotics. To account for inequality constraints,
interior-point methods [10], [11] have also been recently
investigated; however, these do not allow for easy warm-starting [3],
which makes them ill-suited to online optimization and
application to model-predictive control (MPC) [12], [13].
arXiv:2210.15409v2 [cs.RO] 28 Oct 2022

In the past few years, augmented Lagrangian approaches have emerged as a suitable solution for solving constrained trajectory optimization problems [14]. As argued later in this paper, they offer many of the properties that we need for trajectory optimization: super-linear or even quadratic convergence, stability, the ability to warm-start, and so on. Yet the first attempts to write dedicated OCP solvers based on augmented Lagrangians exhibited poor convergence properties; as a result, a further refinement using a projection in a two-stage approach had to be introduced in the solver ALTRO [15]. The penalty function used in ALTRO was later recognized to be irregular and was discarded in [16], which introduces a switch to an SQP formulation to converge to higher precision.
A key idea that we exploit in this paper is to introduce the
augmented Lagrangian formulation directly in the backward
pass, to solve the value-greedy problems while directly
considering the constraints, as initially proposed for multi-
phase constrained problems [17]. This enables us to obtain
better numerical accuracy for equality-constrained problems,
by stabilizing the backward pass using a primal-dual system
of equations to compute the control and multipliers to-
gether [18], and a monotonic update of the penalty parameter
derived from the bound-constrained Lagrangian (BCL) [19]
strategy. The resulting method converges reliably to good numerical
accuracy. We have recently extended this formulation to also
account for the dynamics and other equality constraints using
a primal-dual augmented Lagrangian, allowing for the inclu-
sion of infeasible initialization and implicit integrators [20].
In this paper, we introduce a complete augmented La-
grangian DDP algorithm for handling both equality and
inequality constraints, and validate it on several real-size
robotic scenarios. We first introduce in Sec. II a primal-
dual algorithm, rooted in the nonlinear programming litera-
ture [21], to handle generic nonlinear optimization problems
(NLPs). We then adapt it to the specific case of OCPs of
the form (2) in Sec. III. It results in an overall second-
order quasi-Newton-like algorithm with good convergence
properties for solving constrained trajectory optimization
problems. We finally benchmark our method in Sec. IV on
various standard case studies from the robotics literature. A
companion video is available¹.
II. THE PRIMAL-DUAL AUGMENTED LAGRANGIAN
METHOD FOR CONSTRAINED OPTIMIZATION
This section introduces our augmented Lagrangian ap-
proach to solve constrained nonlinear optimization problems
(NLP) of the form:
$$
\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.}\ c(x) = 0,\ h(x) \leq 0, \qquad \text{(3)}
$$
where $c$ and $h$ stand for equality and inequality constraints
respectively. We then adapt this approach in Sec. III to
the case of trajectory optimization. While many augmented
Lagrangian approaches have been introduced in the optimiza-
tion literature [22], most of them rely on alternating between
primal solving and dual updates. In this work, we propose
¹https://peertube.laas.fr/videos/watch/dfeca51c-c2cf-468a-b46a-86f808e9a561
instead to compute combined primal-dual steps by taking
inspiration from the work of Gill and Robinson in [21],
which we extend by also considering inequality constraints
and by connecting it to the proximal method of multipliers
(PMM) [23] that we use for numerical robustness. We
discuss these contributions in more detail at the end of this
section.
A. Optimality conditions
The Lagrangian $\mathcal{L}$ associated with (3) is defined by:

$$
\mathcal{L}(x, \lambda, \nu) = f(x) + \lambda^\top c(x) + \nu^\top h(x), \quad \lambda \in \mathbb{R}^{n_e},\ \nu \in \mathbb{R}^{n_i}_+ \qquad \text{(4)}
$$
A saddle point of $\mathcal{L}$ is a solution of (3). This leads to the Karush-Kuhn-Tucker (KKT) necessary conditions [24] for a primal-dual point $(x, \lambda, \nu)$ to be optimal:

$$
\begin{aligned}
& \nabla_x \mathcal{L}(x, \lambda, \nu) = 0,\\
& c(x) = 0 \ \text{and}\ h(x) \leq 0,\\
& \nu \geq 0 \ \text{and}\ h(x)^\top \nu = 0.
\end{aligned} \qquad \text{(KKT)}
$$
In practice, we search for a triplet $(x, \lambda, \nu)$ satisfying these optimality conditions (KKT) up to a predefined accuracy $\epsilon_{\text{abs}} > 0$, leading us to the following natural absolute stopping criterion:

$$
\begin{aligned}
& \|\nabla_x \mathcal{L}(x, \lambda, \nu)\| \leq \epsilon_{\text{abs}},\\
& \|(c(x), [h(x)]_+)\| \leq \epsilon_{\text{abs}},
\end{aligned} \qquad \text{(5)}
$$

where $[z]_+$ denotes the projection of $z \in \mathbb{R}^{n_z}$ onto $\mathbb{R}^{n_z}_+$.
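The stopping test (5) is straightforward to implement once the Lagrangian gradient and constraint values are available. The sketch below is our own illustration, with hypothetical argument names; the complementarity condition in (KKT) is deliberately left to the multiplier update and not re-checked.

```python
import numpy as np

def kkt_satisfied(grad_L, c_val, h_val, eps_abs):
    """Absolute stopping criterion (5) at a primal-dual point.

    grad_L : gradient of the Lagrangian w.r.t. x at (x, lam, nu)
    c_val  : equality constraint values c(x)
    h_val  : inequality constraint values h(x)
    """
    # [h(x)]_+ : projection onto the nonnegative orthant, so that
    # strictly satisfied inequalities do not count as violations.
    h_plus = np.maximum(h_val, 0.0)
    stationarity = np.linalg.norm(grad_L)
    feasibility = np.linalg.norm(np.concatenate([c_val, h_plus]))
    return bool(stationarity <= eps_abs and feasibility <= eps_abs)

# An inactive inequality (h(x) < 0) contributes nothing to the residual.
print(kkt_satisfied(np.array([1e-9]), np.array([0.0]), np.array([-0.3]), 1e-8))
```

The same residual, split per shooting node, is what a trajectory-optimization solver would report as its convergence measure.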
B. Equality constrained nonlinear programming problems
In this section, we provide a high-level overview of the
primal-dual augmented Lagrangian (PDAL) method. It is
closely related to the better-known Method of Multipliers
(MM) [25, Chapter 2], which we review first in the context
of purely equality-constrained NLPs.
Primal-dual augmented Lagrangian. The PDAL function $\mathcal{M}^A_\mu$ [21, Section 3] is defined by augmenting the standard Lagrangian $\mathcal{L}$ (4) with two squared $\ell_2$ penalties:

$$
\mathcal{M}^A_\mu(x, \lambda; \lambda_e) \stackrel{\text{def}}{=} \mathcal{L}(x, \lambda_e, 0) + \frac{1}{2\mu_e}\|c(x)\|_2^2 + \frac{1}{2\mu_e}\|c(x) + \mu_e(\lambda_e - \lambda)\|_2^2, \qquad \text{(6)}
$$
where $\mu_e > 0$ is a scalar². The PDAL method then searches for a sequence of iterates approximately minimizing (6) [28]:

$$
(x^{l+1}, \lambda^{l+1}) \leftarrow_{\omega_l} \min_{x, \lambda}\ \mathcal{M}^A_\mu(x, \lambda; \lambda^l), \qquad \text{(7)}
$$

where $\leftarrow_{\omega_l}$ stands for requiring $(x^{l+1}, \lambda^{l+1})$ to be an $\omega_l$-approximate solution to subproblem (7). The approximation is controlled via the following condition:

$$
\|r_l(x^{l+1}, \lambda^{l+1})\| \leq \omega_l, \qquad \text{(8)}
$$
where $r_l$ accounts for the optimality conditions at iteration $l$: if $\|r_l(x, \lambda)\| \leq \epsilon_{\text{abs}}$ then $(x, \lambda)$ is a solution to (7) at precision $\epsilon_{\text{abs}}$. The formula for $r_l$ will be specified in the
²Some authors associate a penalty parameter with each constraint. In this case, the scalar $\mu_e$ is replaced by a matrix $\Sigma_\lambda$ [26], [27].
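As a concrete illustration of the outer iteration (7), the sketch below runs a plain method-of-multipliers loop on a toy equality-constrained QP. For that problem, exact minimization of (6) in $\lambda$ yields the classical update $\lambda \leftarrow \lambda_e + c(x)/\mu_e$, which the code applies directly; the problem data, penalty value, and tolerances are our own illustrative choices, not the paper's.

```python
import numpy as np

# Toy equality-constrained QP:  min 0.5 x'Qx + q'x  s.t.  Ax = b.
# Illustrative data; the unique solution is x* = (1/3, 2/3), lam* = 1/3.
Q = np.array([[2.0, 0.0], [0.0, 1.0]])
q = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

mu = 0.1           # penalty parameter mu_e
lam = np.zeros(1)  # multiplier estimate lam_e
for l in range(20):
    # Inner solve: min_x L(x, lam, 0) + (1/2 mu)||c(x)||^2.
    # The subproblem is quadratic, so its stationarity condition
    # (Q + A'A/mu) x = -q - A'lam + A'b/mu is a linear system.
    H = Q + A.T @ A / mu
    g = -q - A.T @ lam + A.T @ b / mu
    x = np.linalg.solve(H, g)
    c = A @ x - b          # equality residual c(x)
    lam = lam + c / mu     # multiplier update (stationarity of (6) in lam)
    if np.linalg.norm(c) <= 1e-10:
        break

print(x, lam, np.linalg.norm(A @ x - b))
```

Each outer iteration contracts both the constraint residual and the multiplier error linearly, which is the behavior the BCL-style penalty updates discussed above are designed to monitor and accelerate.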