Optimization-Informed Neural Networks
Dawen Wu*, Abdel Lisser
Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des signaux et systèmes, 91190 Gif-sur-Yvette, France
*Corresponding author. Email addresses: dawen.wu@centralesupelec.fr, abdel.lisser@l2s.centralesupelec.fr
Abstract
Solving constrained nonlinear optimization problems (CNLPs) is a longstanding computational problem that arises in various fields, e.g., economics, computer science, and engineering. We propose optimization-informed neural networks (OINN), a deep learning approach to solving CNLPs. Using neurodynamic optimization methods, a CNLP is first reformulated as an initial value problem (IVP) involving an ordinary differential equation (ODE) system. A neural network model is then used as an approximate state solution for this IVP, and the endpoint of the approximate state solution is a prediction of the CNLP solution. We propose a novel training algorithm that directs the model to retain the best prediction found during training. In a nutshell, OINN transforms a CNLP into a neural network training problem. By doing so, we can solve CNLPs using deep learning infrastructure alone, without standard optimization solvers or numerical integration solvers. The effectiveness of the proposed approach is demonstrated on a collection of classical problems, e.g., variational inequalities, nonlinear complementarity problems, and standard CNLPs.
Keywords: Constrained nonlinear optimization problems, Neural networks, Neurodynamic optimization,
ODE system
1. Introduction
Constrained nonlinear optimization problems (CNLPs) play a central role in operations research and have a wide range of real-world applications, such as production planning, resource allocation, portfolio selection and optimization, feature selection, and equilibrium problems (Xiao & Boyd, 2006; Leung & Wang, 2020; Wang et al., 2021; Wu & Lisser, 2022). CNLPs have been studied at both the theoretical and practical levels over the last few decades (Bertsekas, 1997; Boyd et al., 2004).
Neurodynamic optimization methods model a CNLP by means of an ordinary differential equation (ODE) system. Hopfield & Tank (1985) pioneered this line of research, solving the well-known traveling salesman problem with the Hopfield network. Kennedy & Chua (1988) extended the method to nonlinear convex programming problems by introducing a penalty parameter. However, the disadvantage of this penalty method is that the true minimizer is obtained only as the penalty parameter goes to infinity, and when the penalty parameter is too large, the method hardly converges to the optimal solution. Since then, researchers
have gradually refined the method to dispense with the penalty parameter. Rodriguez-Vazquez et al. (1990); Xia et al. (2002); Gao et al. (2004); Xia & Feng (2007); Xia & Wang (2015) proposed neurodynamic methods based on a projection function. Beyond convex and smooth optimization problems, Forti et al. (2004); Xue & Bian (2008); Qin & Xue (2014) solved non-smooth CNLPs using differential inclusion theory and subgradients. Additionally, pseudoconvex optimization problems have been studied under various assumptions (Guo et al., 2011; Qin et al., 2013; Xu et al., 2020).
With the rapid growth of available data and computing resources, deep learning now has a wide range of applications, e.g., image processing (Krizhevsky et al., 2012; Goodfellow et al., 2016), natural language processing (Devlin et al., 2018), and bioinformatics (Min et al., 2017; Jumper et al., 2021). In operations research, neural networks have been used as solver components for mixed integer programming problems (Nair et al., 2020). Graph neural networks can be applied to combinatorial optimization problems, either directly as solvers or to enhance standard solvers (Cappart et al., 2021).
Dissanayake & Phan-Thien (1994) first used a neural network as an approximate solution to differential equations, where the training objective is to satisfy the given differential equation and boundary conditions. Lagaris et al. (1998) constructed a neural network that satisfies an initial/boundary condition by construction, and discussed ODE and PDE problems, respectively. Lagaris et al. (2000); McFall & Mahan (2009) extended Lagaris' method to irregular boundaries. Raissi et al. (2019) introduced physics-informed neural networks to solve forward and inverse problems involving PDEs. Sirignano & Spiliopoulos (2018) presented a theoretical analysis showing that the neural network approximator converges to the PDE solution as the number of hidden units goes to infinity. Deep learning approaches also attempt to overcome the challenge of solving high-dimensional nonlinear PDEs (Han et al., 2017; Yu et al., 2018; Han et al., 2018; Beck et al., 2019). This line of research has been extended to various fields, e.g., computational mechanics (Anitescu et al., 2019; Samaniego et al., 2020; Guo et al., 2021). All the above methods use one neural network to solve one ODE/PDE problem. Flamant et al. (2020) parameterize ODE systems and use the parameters as an input to the neural network, so that one network can solve multiple ODE systems. The universal approximation theorem states that a neural network can approximate any continuous function to arbitrary accuracy (Cybenko, 1989; Hornik et al., 1989; Sonoda & Murata, 2017). Automatic differentiation tools facilitate the computation of derivatives, gradients, and Jacobian matrices (Baydin et al., 2018; Paszke et al., 2019). Software packages have been developed to implement these deep learning methods for solving differential equations (Lu et al., 2021; Chen et al., 2020). A minimal sketch of this trial-solution idea is given below.
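To make the trial-solution idea concrete, here is a minimal PyTorch sketch in the spirit of Lagaris et al. (1998); it is our own illustration, not code from any cited work. The trial solution y(t; w) = y0 + t · N(t; w) satisfies the initial condition by construction, and the network N is trained to drive the ODE residual to zero at random collocation points. The toy ODE dy/dt = −y is a hypothetical example.

```python
import torch

def phi(y):
    # Toy right-hand side dy/dt = -y; the exact solution is y0 * exp(-t).
    return -y

y0 = torch.tensor([[1.0]])                      # initial condition y(0) = y0
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    t = torch.rand(64, 1, requires_grad=True)   # collocation points in [0, 1]
    y_trial = y0 + t * net(t)                   # satisfies y(0) = y0 exactly
    dy_dt = torch.autograd.grad(y_trial, t,
                                grad_outputs=torch.ones_like(y_trial),
                                create_graph=True)[0]
    loss = ((dy_dt - phi(y_trial)) ** 2).mean() # ODE residual at collocation points
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, evaluating `y0 + t * net(t)` at any t in [0, 1] gives the approximate solution; no numerical integrator is involved, which is the feature OINN builds on.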
1.1. Contributions
The contributions of this paper can be summarized as follows.
• We propose a deep learning approach to solving CNLPs, namely OINN. To the best of our knowledge, this is the first time deep learning has been used to solve CNLPs. OINN reformulates a CNLP as a neural network training problem via neurodynamic optimization. Thus, we can solve the CNLP with deep learning infrastructure alone, without using any standard CNLP solvers or numerical integration solvers.
• We propose a dedicated algorithm to train the OINN model toward solving the CNLP. This algorithm is based on the epsilon metric, which evaluates approximate solutions to the CNLP.
• We contrast OINN with numerical integration methods for solving a CNLP: OINN can provide an approximate solution at any training iteration, whereas numerical integration methods deliver a solution only once the integration completes. We show the computational advantages OINN gains from this feature.
The remaining sections are organized as follows. Section 2 provides the background necessary to understand this paper, including an introduction to CNLPs, neurodynamic optimization methods, and numerical integration methods. Section 3 describes the OINN model and how it solves a CNLP. Training of the OINN model is presented in Section 4. Experimental results are given in Section 5, where six different CNLP instances are solved with OINN and a comparison between OINN and numerical integration methods is presented. Section 6 summarizes the paper and outlines future directions.
1.2. Notations
The list of notations used in this paper is shown in Table 1.
Notation                      Definition
y ∈ R^n                       Variables of a CNLP; n is the number of variables.
y* ∈ R^n                      Optimal solution of a CNLP.
x ∈ R^j, u ∈ R^k              Primal and dual variables of a standard CNLP; j is the number of primal variables and k the number of dual variables.
P(·)                          A projection function that maps variables onto a feasible set.
Φ(y) = dy/dt : R^n → R^n      An ODE system.
ȳ(t) : R → R^n                True state solution of an ODE system.
ŷ(t) : R → R^n                Approximate state solution obtained by numerical integration methods.
y(t; w)                       An OINN model, where w are the model parameters.
y(T; w) ∈ R^n                 The endpoint of an OINN model.
y0 ∈ R^n                      An initial point of an ODE system.
[0, T] ⊂ R                    A time range of an ODE system.
ε(·)                          Epsilon metric for evaluating a solution of a CNLP.
L(t; w)                       Loss function of OINN.
E(w)                          Objective function of OINN.
OINN                          Optimization-informed neural networks
CNLP                          Constrained nonlinear optimization problem
ODE                           Ordinary differential equation
IVP                           Initial value problem
NPE                           Nonlinear projection equation

Table 1: Notations
2. Preliminaries
Several types of CNLPs are introduced in Section 2.1. Neurodynamic optimization methods, which model a CNLP as an ODE system, are introduced in Section 2.2. The initial value problem and numerical integration methods are described in Section 2.3.
2.1. Constrained nonlinear optimization problems
This subsection introduces four types of CNLPs: the standard CNLP, the variational inequality, the nonlinear complementarity problem, and the nonlinear projection equation.
Standard CNLP. The standard CNLP has the following form:

    min_x  f(x)
    s.t.   g(x) ≤ 0,
           Ax = b,                                              (1)

where x ∈ R^j is the primal variable and u ∈ R^k is the dual variable associated with the constraint g(x). The objective function f(x) : R^j → R is not necessarily convex or smooth. The constraint function g(x) : R^j → R^k is convex but not necessarily smooth, A ∈ R^{e×j}, and b ∈ R^e. The following projection function projects the variable x onto the feasible set of the equality constraints, {x ∈ R^j | Ax = b}:

    P_eq(x) = x − A^T (A A^T)^{−1} (Ax − b).                    (2)
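As an illustration (our own sketch, not the paper's implementation, and assuming A has full row rank), the projection in Eq. (2) takes a few lines of NumPy; the constraint x1 + x2 + x3 = 1 below is a hypothetical example.

```python
import numpy as np

def project_equality(x, A, b):
    # P_eq(x) = x - A^T (A A^T)^{-1} (A x - b); solve the linear system
    # (A A^T) z = A x - b rather than forming the inverse explicitly.
    z = np.linalg.solve(A @ A.T, A @ x - b)
    return x - A.T @ z

A = np.array([[1.0, 1.0, 1.0]])     # constraint: x1 + x2 + x3 = 1
b = np.array([1.0])
x = np.array([0.8, 0.5, 0.2])
x_proj = project_equality(x, A, b)
print(x_proj, A @ x_proj)            # A @ x_proj equals b up to rounding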
The standard CNLP (1) is the most common form of CNLP, and it can be classified according to the properties of the objective and constraint functions. For example, it is called quadratic programming if the objective function is quadratic and the constraints are linear; non-smooth optimization problems are those involving non-smooth functions.
Nonlinear projection equation (NPE). An NPE aims at finding a vector y* ∈ R^n that satisfies

    P_Ω(y − G(y)) = y,                                          (3)

where y ∈ R^n is a real vector, G(·) : R^n → R^n is a locally Lipschitz continuous function, and Ω = {y ∈ R^n | l_i^− ≤ y_i ≤ l_i^+, i = 1, . . . , n} is a box-constrained feasible set, with l_i^− and l_i^+ the lower and upper bounds of y_i, respectively. P_Ω(·) : R^n → Ω is a projection function that projects the variable onto the feasible set Ω, defined componentwise by

    P_Ω(s) = (P_Ω^1(s_1), . . . , P_Ω^n(s_n))^T,  where
    P_Ω^i(s_i) = l_i^−   if s_i < l_i^−,
                 s_i     if l_i^− ≤ s_i ≤ l_i^+,
                 l_i^+   otherwise.                             (4)
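Since Eq. (4) acts componentwise, the box projection reduces to a clip. A minimal NumPy sketch (our illustration, with hypothetical bounds):

```python
import numpy as np

def project_box(s, l_minus, l_plus):
    # Componentwise projection of Eq. (4): clamp each s_i to [l_i^-, l_i^+].
    return np.clip(s, l_minus, l_plus)

l_minus = np.array([0.0, -1.0])
l_plus = np.array([2.0, 1.0])
print(project_box(np.array([-0.5, 3.0]), l_minus, l_plus))  # [0. 1.]
```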
Variational inequality (VI). A VI aims at finding a vector y* ∈ Ω such that

    (y − y*)^T G(y*) ≥ 0,  ∀y ∈ Ω,                              (5)

where G(·) and Ω are the same as in (3). Variational inequalities provide a reformulation of the Nash equilibrium in game theory, used to study equilibrium properties such as existence, uniqueness, and convergence (Patriksson, 2013; Parise & Ozdaglar, 2019; Singh & Lisser, 2018).
Nonlinear complementarity problem (NCP). An NCP aims at finding a vector y* that satisfies

    G(y) ≥ 0,  y ≥ 0,  G(y)^T y = 0,                            (6)

where G(·) is the same as in (3). Nonlinear complementarity problems arise in many practical applications. For example, finding a Nash equilibrium is a special case of the linear complementarity problem, and KKT systems of mathematical programming problems can be formulated as NCPs.
Both the variational inequality (5) and the nonlinear complementarity problem (6) can be reformulated as NPE problems (Harker & Pang, 1990; Robinson, 1992). In addition, when the objective and constraint functions are convex and smooth, the CNLP (1) can also be reformulated as an NPE problem, where the variable vector is composed of the primal and dual variables, i.e., y = (x^T, u^T)^T. The sketch below illustrates the NCP-to-NPE reformulation.
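To illustrate the reformulation (a standard fact from the references above, not a construction specific to this paper): taking Ω to be the nonnegative orthant, i.e., a box with l_i^− = 0 and l_i^+ = +∞, the NPE (3) becomes max(y − G(y), 0) = y, whose solutions are exactly the solutions of the NCP (6). A small sketch with a hypothetical affine G:

```python
import numpy as np

def G(y):
    # Hypothetical monotone example: G(y) = M y + q.
    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    q = np.array([-1.0, -1.0])
    return M @ y + q

def npe_residual(y):
    # Fixed-point residual of Eq. (3) with Omega = {y : y >= 0}.
    return np.linalg.norm(np.maximum(y - G(y), 0.0) - y)

y_star = np.array([1.0 / 3.0, 1.0 / 3.0])   # here G(y_star) = 0 and y_star >= 0
print(npe_residual(y_star))                  # ~0, so y_star solves the NCP
```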
2.2. Neurodynamic optimization
This subsection introduces neurodynamic optimization methods, which model a CNLP as an ODE system.
Consider a CNLP with an optimal solution y*. A neurodynamic approach establishes a dynamical system in the form of a first-order ODE system, dy/dt = Φ(y), whose state solution y(t) is expected to converge to the optimal solution of the CNLP, i.e., lim_{t→∞} y(t) = y*. Here, we present three different neurodynamic approaches (Xia & Feng, 2007; Qin & Xue, 2014; Xu et al., 2020), each of which solves one type of CNLP.
Definition 1. Consider an ODE system dy/dt = Φ(y), where Φ(y) : R^n → R^n. Given a point (t0, y0) ∈ R^{n+1}, a vector-valued function y(t) : R → R^n is called a state solution if it satisfies the ODE system dy/dt = Φ(y) and the initial condition y(t0) = y0.
Xia & Feng (2007) proposed a neurodynamic approach to model the nonlinear projection equation (3). The ODE system is as follows:

    dy/dt = λ(−G(P_Ω(y)) + P_Ω(y) − y),                         (7)

where λ > 0 is a parameter controlling the convergence rate. A minimal numerical illustration is given below.
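For intuition (our own sketch, not the paper's solver), the neurodynamic system (7) can be integrated with a crude forward-Euler loop; with the hypothetical monotone G from the earlier sketch, the state settles at the NPE solution.

```python
import numpy as np

def project_box(s, l_minus, l_plus):
    return np.clip(s, l_minus, l_plus)

def G(y):
    # Same hypothetical example as above: G(y) = M y + q.
    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    q = np.array([-1.0, -1.0])
    return M @ y + q

lam, dt = 1.0, 0.01
l_minus, l_plus = np.zeros(2), np.full(2, 10.0)
y = np.array([5.0, 5.0])                     # arbitrary initial point y0
for _ in range(5000):
    p = project_box(y, l_minus, l_plus)
    y = y + dt * lam * (-G(p) + p - y)       # forward-Euler step of Eq. (7)
print(y)                                      # ~[1/3, 1/3], the NPE solution
```

OINN replaces this step-by-step integration with a neural network approximation of the state solution, trained as described in the following sections.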