Tremendous numerical efforts have been made to design efficient algorithms for solving problems related to (1), or the corresponding HJB equation (3). Precise numerical methods have been developed using approximations on grids, such as Markov chain approximations [33], finite difference schemes (monotone schemes [17]), semi-Lagrangian schemes (see e.g. [18, 20]), ENO or WENO higher-order schemes [37, 38], finite element methods (see [31]), discontinuous Galerkin methods [28, 35] (in particular [12, 13] for (3), see also [40]), or max-plus approaches [2]. However, grid-based methods are limited to low dimensions because of the well-known curse of dimensionality. In order to tackle this difficulty, various approaches have been studied, such as sparse grid methods [16, 21], tree-structure approximation algorithms (see e.g. [3]), tensor decomposition methods [19], and max-plus approaches [36].
In the deterministic context, problem (1) is motivated by deterministic optimal control with state constraints (see e.g. [14] and [4]). In [41], the HJB equation (3) is approximated by deep neural networks (DNN) in order to solve state-constrained reachability control problems of dimension up to d = 10. In [15] and [6], formulation (1) is used to solve an abort landing problem (with different numerical approaches); in [11], equations such as (3) are used to solve an aircraft payload optimization problem; a multi-vehicle safe trajectory planning problem is considered in [8].
On the other hand, for stochastic control, DNN approximations were already used for gas storage optimization in [9], where the control, approximated by a neural network, was the amount of gas injected into or withdrawn from the storage. This approach has recently been adapted and popularized for the resolution of backward stochastic differential equations (BSDEs) in [26] (the deep BSDE algorithm). For a convergence study of such algorithms in a more general context, see [27].
In this work, we study neural network approximations for (1). We are particularly interested in obtaining a rigorous error analysis of such approximations. We follow the approach of [29] (and its companion paper [7]), combining neural network approximations with Bellman’s dynamic programming principle. We obtain precise error estimates in an average norm.
Note that the work of [29] is developed in the stochastic context, where an error analysis is given. However, this error analysis relies strongly on a diffusion assumption on the model (transition probabilities are assumed to admit densities). In our case, we would need to assume that the deterministic process admits a density, which is not the case (see remark 6.5). Therefore the proof of [29] does not apply to the deterministic context. Here we propose a new approach for the convergence analysis, leading to new error estimates. We choose to present the algorithm on an optimal control problem with a running cost, but the approach can be generalized to Bolza or Mayer problems (see e.g. [4, 6]).
For the sake of completeness, let us note that the ideas of [29] are related to methods already proposed in [23] and [10] for the resolution of BSDEs, where the control function is computed by regression on a space of basis functions (the Hybrid-Now algorithm is related to [23], and the performance iteration algorithm is related to an improved algorithm in [10]). For recent developments, see [24], which uses classical linear regressions, as well as [30] and [22] for BSDE approximations using neural networks.
From the numerical point of view, we illustrate our algorithms on some academic front-propagation problems with or without obstacles. We focus on a “Lagrangian scheme” (a deterministic equivalent of the performance iteration scheme of [29]), and also compare it with other algorithms: a “semi-Lagrangian algorithm” (similar to the Hybrid-Now algorithm of [29]) and a hybrid algorithm combining the two previous ones, which involves successive projections of the value function onto neural network spaces.
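To fix ideas, the following is a minimal, generic sketch (not the authors’ exact scheme) of the basic ingredient shared by these algorithms: a feedback control parametrized by a neural network and trained by minimizing a time-discretized running cost along trajectories of the controlled deterministic dynamics. All names here (the dynamics f, the running cost, the network architecture, the step sizes) are illustrative placeholders, and a simple forward Euler step is used in place of a general Runge-Kutta scheme.

```python
import torch
import torch.nn as nn

dim, n_steps, dt = 2, 20, 0.05               # illustrative state dimension, time steps, step size

def f(x, a):                                  # placeholder controlled dynamics  x' = f(x, a)
    return a                                  # e.g. front propagation with bounded velocity a

def running_cost(x, a):                       # placeholder running cost  ell(x, a)
    return (x ** 2).sum(dim=1)

control = nn.Sequential(                      # feedback control a_theta(x) with values in (-1, 1)^d
    nn.Linear(dim, 32), nn.Tanh(),
    nn.Linear(32, dim), nn.Tanh(),
)
optimizer = torch.optim.Adam(control.parameters(), lr=1e-3)

for iteration in range(1000):
    x = 2.0 * torch.rand(256, dim) - 1.0      # batch of initial states sampled in [-1, 1]^d
    cost = torch.zeros(x.shape[0])
    for k in range(n_steps):                  # forward Euler discretization of the trajectory
        a = control(x)
        cost = cost + dt * running_cost(x, a)
        x = x + dt * f(x, a)
    loss = cost.mean()                        # empirical average of the discretized cost
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The schemes studied below are organized around the dynamic programming principle rather than around a single global minimization as in this sketch; the sketch only illustrates the common building block of a neural feedback control evaluated along a discretized controlled trajectory.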
The plan of the paper is the following. In section 2 we define a semi-discrete approximation of the value of (1), with a controlled error with respect to the continuous value, using piecewise constant controls. In section 3, equivalent reformulations of the problem are given using feedback controls and dynamic programming. In section 4, an approximation result for the discrete value function using Lipschitz continuous feedback controls is given. In section 5 we present three numerical schemes using neural network approximations (for the approximation of the feedback controls and/or of the value), with general Runge-Kutta schemes for the approximation of the controlled dynamics. Section 6 contains our main convergence result for one of the proposed schemes (the Lagrangian scheme), which involves only approximations of the feedback controls, and section 7 focuses on the proof of our main result. Section 8 is devoted to some numerical academic examples