Optimization-Informed Neural Networks
Dawen Wu*, Abdel Lisser
Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des signaux et systèmes, 91190 Gif-sur-Yvette, France
*Corresponding author. Email addresses: dawen.wu@centralesupelec.fr, abdel.lisser@l2s.centralesupelec.fr
Abstract
Solving constrained nonlinear optimization problems (CNLPs) is a longstanding computational problem that arises in various fields, e.g., economics, computer science, and engineering. We propose optimization-informed neural networks (OINN), a deep learning approach to solving CNLPs. Using neurodynamic optimization methods, a CNLP is first reformulated as an initial value problem (IVP) involving an ordinary differential equation (ODE) system. A neural network model is then used as an approximate state solution for this IVP, and the endpoint of the approximate state solution is a prediction of the CNLP solution. We propose a novel training algorithm that directs the model to retain the best prediction found during training. In a nutshell, OINN transforms a CNLP into a neural network training problem. By doing so, we can solve CNLPs using deep learning infrastructure alone, without standard optimization solvers or numerical integration solvers. The effectiveness of the proposed approach is demonstrated on a collection of classical problems, e.g., variational inequalities, nonlinear complementarity problems, and standard CNLPs.
Keywords: Constrained nonlinear optimization problems, Neural networks, Neurodynamic optimization,
ODE system
1. Introduction
Constrained nonlinear optimization problems (CNLPs) play a central role in operations research and have a wide range of real-world applications, such as production planning, resource allocation, portfolio selection and optimization, feature selection, and equilibrium problems (Xiao & Boyd, 2006; Leung & Wang, 2020; Wang et al., 2021; Wu & Lisser, 2022). CNLPs have been studied at both the theoretical and practical levels over the last few decades (Bertsekas, 1997; Boyd et al., 2004).
Neurodynamic optimization methods model a CNLP by means of an ordinary differential equation (ODE) system. Hopfield & Tank (1985) pioneered this line of research, solving the well-known traveling salesman problem with the Hopfield network. Kennedy & Chua (1988) extended the method to nonlinear convex programming problems by introducing a penalty parameter. However, the disadvantage of this penalty method is that the true minimizer is obtained only as the penalty parameter goes to infinity, and when the penalty parameter is too large, the method hardly converges to the optimal solution. Since then, researchers
have gradually refined the method to dispense with the penalty parameter. Rodriguez-Vazquez et al. (1990); Xia et al. (2002); Gao et al. (2004); Xia & Feng (2007); Xia & Wang (2015) proposed neurodynamic methods based on a projection function. Beyond convex and smooth optimization problems, Forti et al. (2004); Xue & Bian (2008); Qin & Xue (2014) solved non-smooth CNLPs using differential inclusion theory and subgradients. Additionally, pseudoconvex optimization problems have been studied under various assumptions (Guo et al., 2011; Qin et al., 2013; Xu et al., 2020).
With the rapid growth of available data and computing resources, deep learning now has a wide range of applications, e.g., image processing (Krizhevsky et al., 2012; Goodfellow et al., 2016), natural language processing (Devlin et al., 2018), and bioinformatics (Min et al., 2017; Jumper et al., 2021). In operations research, neural networks have been used as solver components for mixed integer programming problems (Nair et al., 2020). Graph neural networks can be applied to combinatorial optimization problems, either directly as solvers or to enhance standard solvers (Cappart et al., 2021).
Dissanayake & Phan-Thien (1994) first used a neural network as an approximate solution to differential equations, where the training objective is to satisfy the given differential equation and boundary conditions. Lagaris et al. (1998) constructed a neural network that satisfies an initial/boundary condition by construction, and discussed ODE and PDE problems, respectively. Lagaris et al. (2000); McFall & Mahan (2009) extended Lagaris' method to irregular boundaries. Raissi et al. (2019) introduced physics-informed neural networks to solve forward and inverse problems involving PDEs. Sirignano & Spiliopoulos (2018) presented a theoretical analysis showing that the neural network approximator converges to the PDE solution as the number of hidden units goes to infinity. Deep learning approaches also attempt to overcome the challenge of solving high-dimensional nonlinear PDEs (Han et al., 2017; Yu et al., 2018; Han et al., 2018; Beck et al., 2019). This line of research has been extended to various fields, e.g., computational mechanics (Anitescu et al., 2019; Samaniego et al., 2020; Guo et al., 2021). All the above methods use one neural network to solve one ODE/PDE problem. Flamant et al. (2020) parameterize ODE systems and use the parameters as an input to the neural network, so that one network can solve multiple ODE systems. The universal approximation theorem states that a neural network can approximate any continuous function to arbitrary accuracy (Cybenko, 1989; Hornik et al., 1989; Sonoda & Murata, 2017). Automatic differentiation tools facilitate the computation of derivatives, gradients, and Jacobian matrices (Baydin et al., 2018; Paszke et al., 2019). Software packages have been developed to implement these deep learning methods for solving differential equations (Lu et al., 2021; Chen et al., 2020). A minimal sketch of this trial-solution idea is given below.
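To make the trial-solution idea concrete, here is a minimal PyTorch sketch in the spirit of Lagaris et al. (1998); it is our own illustration, not code from any cited work. The trial solution y(t; w) = y0 + t · N(t; w) satisfies the initial condition by construction, and the network N is trained to drive the ODE residual to zero at random collocation points. The toy ODE dy/dt = −y is a hypothetical example.

```python
import torch

def phi(y):
    # Toy right-hand side dy/dt = -y; the exact solution is y0 * exp(-t).
    return -y

y0 = torch.tensor([[1.0]])                      # initial condition y(0) = y0
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    t = torch.rand(64, 1, requires_grad=True)   # collocation points in [0, 1]
    y_trial = y0 + t * net(t)                   # satisfies y(0) = y0 exactly
    dy_dt = torch.autograd.grad(y_trial, t,
                                grad_outputs=torch.ones_like(y_trial),
                                create_graph=True)[0]
    loss = ((dy_dt - phi(y_trial)) ** 2).mean() # ODE residual at collocation points
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, evaluating `y0 + t * net(t)` at any t in [0, 1] gives the approximate solution; no numerical integrator is involved, which is the feature OINN builds on.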
1.1. Contributions
The contributions of this paper can be summarized as follows.
• We propose a deep learning approach to solving CNLPs, namely OINN. To the best of our knowledge, this is the first time deep learning has been used to solve CNLPs. OINN reformulates a CNLP as a neural network training problem via neurodynamic optimization. Thus, we can solve the CNLP with deep learning infrastructure alone, without using any standard CNLP solvers or numerical integration solvers.
• We propose a dedicated algorithm to train the OINN model toward solving the CNLP. This algorithm is based on the epsilon metric, which evaluates approximate solutions to the CNLP.
• We contrast OINN with numerical integration methods for solving a CNLP: OINN can provide an approximate solution at any training iteration, whereas numerical integration methods deliver a solution only once the integration completes. We show the computational advantages OINN gains from this feature.
The remaining sections are organized as follows. Section 2 provides the background necessary to understand this paper, including an introduction to CNLPs, neurodynamic optimization methods, and numerical integration methods. Section 3 describes the OINN model and how it solves a CNLP. Training of the OINN model is presented in Section 4. Experimental results are given in Section 5, where six different CNLP instances are solved with OINN and a comparison between OINN and numerical integration methods is presented. Section 6 summarizes the paper and outlines future directions.
1.2. Notations
The list of notations used in this paper is shown in Table 1.
Notation                      Definition
y ∈ R^n                       Variables of a CNLP; n is the number of variables.
y* ∈ R^n                      Optimal solution of a CNLP.
x ∈ R^j, u ∈ R^k              Primal and dual variables of a standard CNLP; j is the number of primal variables and k the number of dual variables.
P(·)                          A projection function that maps variables onto a feasible set.
Φ(y) = dy/dt : R^n → R^n      An ODE system.
ȳ(t) : R → R^n                True state solution of an ODE system.
ŷ(t) : R → R^n                Approximate state solution obtained by numerical integration methods.
y(t; w)                       An OINN model, where w are the model parameters.
y(T; w) ∈ R^n                 The endpoint of an OINN model.
y0 ∈ R^n                      An initial point of an ODE system.
[0, T] ⊂ R                    A time range of an ODE system.
ε(·)                          Epsilon metric for evaluating a solution of a CNLP.
L(t; w)                       Loss function of OINN.
E(w)                          Objective function of OINN.
OINN                          Optimization-informed neural networks
CNLP                          Constrained nonlinear optimization problem
ODE                           Ordinary differential equation
IVP                           Initial value problem
NPE                           Nonlinear projection equation

Table 1: Notations
2. Preliminaries
Several types of CNLPs are introduced in Section 2.1. Neurodynamic optimization methods, which model a CNLP as an ODE system, are introduced in Section 2.2. The initial value problem and numerical integration methods are described in Section 2.3.
2.1. Constrained nonlinear optimization problems
This subsection introduces four types of CNLPs: the standard CNLP, the variational inequality, the nonlinear complementarity problem, and the nonlinear projection equation.
Standard CNLP. The standard CNLP has the following form:

    min_x  f(x)
    s.t.   g(x) ≤ 0,
           Ax = b,                                              (1)

where x ∈ R^j is the primal variable and u ∈ R^k is the dual variable associated with the constraint g(x). The objective function f(x) : R^j → R is not necessarily convex or smooth. The constraint function g(x) : R^j → R^k is convex but not necessarily smooth, A ∈ R^{e×j}, and b ∈ R^e. The following projection function projects the variable x onto the feasible set of the equality constraints, {x ∈ R^j | Ax = b}:

    P_eq(x) = x − A^T (A A^T)^{−1} (Ax − b).                    (2)
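As an illustration (our own sketch, not the paper's implementation, and assuming A has full row rank), the projection in Eq. (2) takes a few lines of NumPy; the constraint x1 + x2 + x3 = 1 below is a hypothetical example.

```python
import numpy as np

def project_equality(x, A, b):
    # P_eq(x) = x - A^T (A A^T)^{-1} (A x - b); solve the linear system
    # (A A^T) z = A x - b rather than forming the inverse explicitly.
    z = np.linalg.solve(A @ A.T, A @ x - b)
    return x - A.T @ z

A = np.array([[1.0, 1.0, 1.0]])     # constraint: x1 + x2 + x3 = 1
b = np.array([1.0])
x = np.array([0.8, 0.5, 0.2])
x_proj = project_equality(x, A, b)
print(x_proj, A @ x_proj)            # A @ x_proj equals b up to rounding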
The standard CNLP (1) is the most common form of CNLP, and it can be classified according to the properties of the objective and constraint functions. For example, it is called quadratic programming if the objective function is quadratic and the constraints are linear; non-smooth optimization problems are those involving non-smooth functions.
Nonlinear projection equation (NPE). An NPE aims at finding a vector y* ∈ R^n that satisfies

    P_Ω(y − G(y)) = y,                                          (3)

where y ∈ R^n is a real vector, G(·) : R^n → R^n is a locally Lipschitz continuous function, and Ω = {y ∈ R^n | l_i^− ≤ y_i ≤ l_i^+, i = 1, . . . , n} is a box-constrained feasible set, with l_i^− and l_i^+ the lower and upper bounds of y_i, respectively. P_Ω(·) : R^n → Ω is a projection function that projects the variable onto the feasible set Ω, defined componentwise by

    P_Ω(s) = (P_Ω^1(s_1), . . . , P_Ω^n(s_n))^T,  where
    P_Ω^i(s_i) = l_i^−   if s_i < l_i^−,
                 s_i     if l_i^− ≤ s_i ≤ l_i^+,
                 l_i^+   otherwise.                             (4)
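Since Eq. (4) acts componentwise, the box projection reduces to a clip. A minimal NumPy sketch (our illustration, with hypothetical bounds):

```python
import numpy as np

def project_box(s, l_minus, l_plus):
    # Componentwise projection of Eq. (4): clamp each s_i to [l_i^-, l_i^+].
    return np.clip(s, l_minus, l_plus)

l_minus = np.array([0.0, -1.0])
l_plus = np.array([2.0, 1.0])
print(project_box(np.array([-0.5, 3.0]), l_minus, l_plus))  # [0. 1.]
```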
Variational inequality (VI). A VI aims at finding a vector y* ∈ Ω such that

    (y − y*)^T G(y*) ≥ 0,  ∀y ∈ Ω,                              (5)

where G(·) and Ω are the same as in (3). Variational inequalities provide a reformulation of the Nash equilibrium in game theory, used to study equilibrium properties such as existence, uniqueness, and convergence (Patriksson, 2013; Parise & Ozdaglar, 2019; Singh & Lisser, 2018).
Nonlinear complementarity problem (NCP). An NCP aims at finding a vector y* that satisfies

    G(y) ≥ 0,  y ≥ 0,  G(y)^T y = 0,                            (6)

where G(·) is the same as in (3). Nonlinear complementarity problems arise in many practical applications. For example, finding a Nash equilibrium is a special case of the linear complementarity problem, and KKT systems of mathematical programming problems can be formulated as NCPs.
Both the variational inequality (5) and the nonlinear complementarity problem (6) can be reformulated as NPE problems (Harker & Pang, 1990; Robinson, 1992). In addition, when the objective and constraint functions are convex and smooth, the CNLP (1) can also be reformulated as an NPE problem, where the variable vector is composed of the primal and dual variables, i.e., y = (x^T, u^T)^T. The sketch below illustrates the NCP-to-NPE reformulation.
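To illustrate the reformulation (a standard fact from the references above, not a construction specific to this paper): taking Ω to be the nonnegative orthant, i.e., a box with l_i^− = 0 and l_i^+ = +∞, the NPE (3) becomes max(y − G(y), 0) = y, whose solutions are exactly the solutions of the NCP (6). A small sketch with a hypothetical affine G:

```python
import numpy as np

def G(y):
    # Hypothetical monotone example: G(y) = M y + q.
    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    q = np.array([-1.0, -1.0])
    return M @ y + q

def npe_residual(y):
    # Fixed-point residual of Eq. (3) with Omega = {y : y >= 0}.
    return np.linalg.norm(np.maximum(y - G(y), 0.0) - y)

y_star = np.array([1.0 / 3.0, 1.0 / 3.0])   # here G(y_star) = 0 and y_star >= 0
print(npe_residual(y_star))                  # ~0, so y_star solves the NCP
```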
2.2. Neurodynamic optimization
This subsection introduces neurodynamic optimization methods, which model a CNLP as an ODE system.
Consider a CNLP with an optimal solution y*. A neurodynamic approach establishes a dynamical system in the form of a first-order ODE system, dy/dt = Φ(y), whose state solution y(t) is expected to converge to the optimal solution of the CNLP, i.e., lim_{t→∞} y(t) = y*. Here, we present three different neurodynamic approaches (Xia & Feng, 2007; Qin & Xue, 2014; Xu et al., 2020), each of which solves one type of CNLP.
Definition 1. Consider an ODE system dy/dt = Φ(y), where Φ(y) : R^n → R^n. Given a point (t0, y0) ∈ R^{n+1}, a vector-valued function y(t) : R → R^n is called a state solution if it satisfies the ODE system dy/dt = Φ(y) and the initial condition y(t0) = y0.
Xia & Feng (2007) proposed a neurodynamic approach to model the nonlinear projection equation (3). The ODE system is as follows:

    dy/dt = λ(−G(P_Ω(y)) + P_Ω(y) − y),                         (7)

where λ > 0 is a parameter controlling the convergence rate. A minimal numerical illustration is given below.
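For intuition (our own sketch, not the paper's solver), the neurodynamic system (7) can be integrated with a crude forward-Euler loop; with the hypothetical monotone G from the earlier sketch, the state settles at the NPE solution.

```python
import numpy as np

def project_box(s, l_minus, l_plus):
    return np.clip(s, l_minus, l_plus)

def G(y):
    # Same hypothetical example as above: G(y) = M y + q.
    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    q = np.array([-1.0, -1.0])
    return M @ y + q

lam, dt = 1.0, 0.01
l_minus, l_plus = np.zeros(2), np.full(2, 10.0)
y = np.array([5.0, 5.0])                     # arbitrary initial point y0
for _ in range(5000):
    p = project_box(y, l_minus, l_plus)
    y = y + dt * lam * (-G(p) + p - y)       # forward-Euler step of Eq. (7)
print(y)                                      # ~[1/3, 1/3], the NPE solution
```

OINN replaces this step-by-step integration with a neural network approximation of the state solution, trained as described in the following sections.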