Boundary-safe PINNs extension: Application to non-linear parabolic PDEs in counterparty credit risk

Joel P. Villarino1,2, Álvaro Leitao1,2 and José A. García-Rodríguez1,2

October 6, 2022

1 M2NICA research group, University of Coruña, Spain
2 CITIC research center, Spain

E-mails: joel.perez.villarino@udc.es / alvaro.leitao@udc.es / jose.garcia.rodriguez@udc.es

The authors have no conflict of interest to disclose.
ABSTRACT.
The goal of this work is to develop deep learning numerical methods for solving option XVA pricing problems given by non-linear PDE models. A novel strategy for the treatment of the boundary conditions is proposed, which removes the need for the heuristic choice of the weights of the different terms that appear in the loss function used during training. It is based on defining the losses associated with the boundaries by means of the PDEs that arise from substituting the related conditions into the model equation itself. Further, automatic differentiation is employed to obtain accurate approximations of the partial derivatives.

Keywords: deep learning, PDEs, PINNs, boundary conditions, nonlinear, automatic differentiation, option pricing, XVA.
1 Introduction
Deep learning techniques are machine learning algorithms based on neural networks, also known as artificial neural networks (ANNs), and representation learning, see [36] and the references therein. From a mathematical point of view, ANNs can be interpreted as multiple chained compositions of multivariate functions, and deep neural network is the term used for an ANN with several interconnected layers. Such networks are known for being universal approximators, a property established by the Universal Approximation Theorem, which essentially states that any continuous function in any dimension can be represented to arbitrary accuracy by means of an ANN. For this reason, ANNs have a wide range of applications, and their use has become ubiquitous in many fields: computer vision, natural language processing, autonomous vehicles, etc. Deep learning algorithms are usually classified according to the amount and type of supervision they receive during training and, among all the categories that can be identified, we highlight supervised and unsupervised algorithms, which differ in whether or not the desired solutions are provided in the training set.
The aforementioned universal approximation property was exploited in the seminal papers [52], [29] and [48] to introduce a technique to solve partial differential equations (PDEs) by means of ANNs. In recent years there has been growing interest in approximating the solution of PDEs by means of deep neural networks, which promise to be an alternative to classical methods such as Finite Differences (FD), Finite Volumes (FV) or Finite Elements (FE). For example, the FE technique consists in projecting the solution onto some functional space, the Galerkin spaces. Then, by passing to the weak variational formulation and taking the discrete basis, we obtain a linear system of equations whose unknowns are the approximated values of the solution at each point of the mesh. In a similar
manner, an ANN can be trained to learn the physical law that is given by a PDE or a system of PDEs. The idea is quite similar to the classical Galerkin methods but, instead of representing the solution as a projection onto some flavour of Galerkin space, the solution is written in terms of ANNs as the composition of nonlinear functions depending on some network weights. As a result, instead of a high-dimensional linear system, a high-dimensional nonlinear optimization problem is obtained for the ANN weights. This problem must be solved using nonlinear optimization algorithms such as stochastic gradient descent-based methods, e.g., [45], and/or quasi-Newton methods, e.g., L-BFGS, [7]. More recently, with the advances in automatic differentiation (AD) algorithms and hardware (GPUs), this kind of technique has gained momentum in the literature and, currently, the most promising approach is known as physics-informed neural networks (PINNs), see [53], [59], [51], [54], [23].
In the last few years, PINNs have shown remarkable performance. However, there is still room for improvement within the methodology. One of the disadvantages of PINNs is the lack of theoretical results for the control of the approximation error. Obtaining error estimates or results for the order of approximation in PINNs is a non-trivial task, much more challenging than in classical methods. Even so, the authors in [25], [4], [54], [28], [26], [24] and [27] (among others) have derived estimates and bounds for the so-called generalization error for particular models. Another drawback is the difficulty of imposing the boundary conditions (a fact discussed further later in this section). Nevertheless, the use of ANNs has several advantages for solving PDEs: they can be applied to nonlinear PDEs without any extra effort; they can be extended to (moderately) high dimensions; and they yield accurate approximations of the partial derivatives of the solution thanks to the AD modules provided by modern deep learning frameworks.
PINNs are not the only ANN-based approach to PDE problems. ANNs can also be used as a complement to classical numerical methods, for example training the neural network to obtain smoothness indicators or WENO reconstructions to be used inside a classical FV method, see [46], [47]. ANNs are also being used to solve PDE models by means of their backward stochastic differential equation (BSDE) representation whenever the Feynman-Kac theorem can be applied, which is the usual situation in computational finance, for example. In [37], the authors present the so-called DeepBSDE numerical methods and their application to the solution of the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation in very high (hundreds of) dimensions. The connection of this method with the recursive multilevel Picard approximations allows the authors to prove that DeepBSDEs are capable of overcoming the so-called "curse of dimensionality" for a certain class of PDEs, see [68], [42].
The main goal of the present work is to develop robust and stable deep learning numerical methods for solving nonlinear parabolic PDE models by means of PINNs. The motivation arises from the difficulty of finding and numerically imposing the boundary conditions, which are always delicate and critical tasks both in the classical FD/FV/FE setting and also in the ANN setting. The common approach consists in assigning weights to the different terms involved in the loss function, where the selection of these weights must be done heuristically. We introduce a new idea that consists in defining the loss terms due to the boundary conditions by means of evaluating the PDE operator restricted to the boundaries. In this way, the value of such addends is of the same magnitude as the interior losses. Although this is not feasible in classical PDE solving algorithms, it is very natural within the PINNs framework since, by means of AD, we can evaluate this operator on the boundary even when it contains derivatives normal to that boundary. This novel treatment of the boundary conditions in PINNs is the main contribution of this work, as it removes the heuristic choice of the weights for the boundary contributions to the loss function. Further, AD can be naturally exploited to obtain accurate approximations of the partial derivatives of the solution with respect to the input parameters (quantities of much interest in several fields).
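As a preview of the mechanism (developed in Section 3), the following minimal sketch illustrates the point in PyTorch. The toy operator $\partial_t u - \partial_{xx} u$, the small network and all names are our own illustrative assumptions, not the XVA models treated later: the same residual function is evaluated, unchanged, at points lying on the boundary, even though it involves a second derivative normal to that boundary.

```python
import torch

# Toy surrogate network u_theta(t, x); architecture chosen arbitrarily.
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def pde_residual(t, x):
    """Residual of the illustrative operator u_t - u_xx.
    AD makes it computable at interior and boundary points alike."""
    t = t.requires_grad_(True)
    x = x.requires_grad_(True)
    u = net(torch.cat([t, x], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

# Points on the boundary x = 1: u_xx is a derivative normal to the
# boundary there, yet the residual is evaluated exactly as in the interior.
t_b, x_b = torch.rand(8, 1), torch.ones(8, 1)
print(pde_residual(t_b, x_b))
```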
Although the proposed methodology could be presented for a wide range of applications, here we focus on the solution of PDE models for challenging problems appearing in the computational finance field. In particular, we consider the derivative valuation problem in the presence of counterparty
credit risk (CCR), which includes in its formulation the so-called x-value adjustments (XVA). This term refers to the different valuation adjustments that arise in the models when CCR is considered, i.e., when the possibility of default of the parties involved in the transaction is taken into account. These adjustments can come from different sources within a derivative portfolio: credit (CVA); debit (DVA); funding costs (FVA); collateral requirements (ColVA); and capital requirements (KVA), among others. After the 2007-2009 financial crisis, CCR management became of key importance in the financial industry. Several models were developed in order to enrich the classical pricing models with the introduction of risk terms. In this sense, the value adjustments are terms to be added to, or subtracted from, an idealised reference portfolio value, computed in the absence of frictions, in order to obtain the final value of the transaction.
The first works on this topic appeared before the above-mentioned crisis, focusing on analyzing the CVA concept. Some seminal works from this period are [30], [11] and [18]. After the crisis, the XVA adjustments gained huge attention. The models dealing with the possibility of default of the parties involved in a transaction were revised by introducing the DVA factor, [13], [9]. Additionally, the increasingly important role of collateral agreements demanded a portfolio-wide view of valuation, addressed by introducing the ColVA factor. In a Black-Scholes economy, [57] gives valuation formulas in both the collateralized and uncollateralized cases. In addition, generalizations to the case of a multi-currency economy can be found in [58], [31], [32], and [35]. Another important aspect for the industry, apart from default risk, is represented by funding costs. Currently, trading activity depends on different sources of liquidity, such as the interest rate multi-curve, [22], and the old assumption of a unique risk-free interest rate is no longer realistic. In [55], the FVA is included in a risk-neutral pricing framework for CCR considering realistic settings. Such work is extended in [12], where the effect of Central Clearing Counterparties (CCPs) on funding costs is studied. In this regard, there are many more contributions aiming at a single risk management framework which includes funding and default risk. A unified valuation theory that incorporates credit risk, collateralization and funding costs by means of the so-called discounting approach is developed in [8]. The authors in [15], [14] generalize the classical Black-Scholes replication approach to include some of the aforementioned effects. A more general BSDE approach is provided by [20], [21], [5], and [6]. In addition, the equivalence between the discounting and BSDE-based approaches is demonstrated in [8].
Of course, the world of quantitative finance in general, and CCR management in particular, has not been exempt from the advances in deep learning and, nowadays, ANNs are employed for a wide variety of tasks in the industry. Unsupervised ANNs, in both flavours, PINNs and DeepBSDEs, have recently been used for solving challenging financial problems. For example, in [62] the authors apply PINNs to solve the linear one- and two-dimensional Black-Scholes equations, and [67] introduces the solution of high-dimensional Black-Scholes problems using BSDEs. In [34] the authors present a novel computational framework for portfolio-wide risk management problems with a potentially large number of risk factors that makes traditional numerical techniques ineffective. They use a coupled system of BSDEs for XVA, which is addressed by a recursive application of an ANN-based BSDE solver. Other relevant works that make use of ANNs for computational finance problems, although not formulated as PDEs, include [40], [41], or [50], for example.
The outline of this paper is as follows. In Section 2 we start by revisiting the PINNs framework for solving PDEs. Section 3 introduces the new methodology for the treatment of the boundary conditions in the PINNs setting. In Section 4, the XVA PDE models that we solve in this paper and their adaptation to our PINNs extension are described; more precisely, XVA problems in one and two dimensions, under the Black-Scholes and Heston models. Finally, in Section 5, the numerical experiments that assess the accuracy of the approximation for option prices and their partial derivatives (the so-called Greeks) are presented.
2 PINNs
In this section we introduce the so-called PINNs methodology for solving PDEs. The illustration is carried out according to the kind of PDEs that arise in the selected financial problems, i.e., semilinear parabolic PDEs with source terms. Thus, let $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, be a bounded, closed and connected domain and let $T > 0$. Consider the following boundary value problem. Given a function $f \in \mathcal{C}(\mathbb{R})$ and setting $\hat{d} = d + 1$, find $u \colon (t, x) \in [0, T] \times \Omega \subset \mathbb{R}^{\hat{d}} \to \mathbb{R}$ such that

$$
\begin{cases}
\dfrac{\partial u}{\partial t}(t, x) + \mathcal{L}[u](t, x) - f(u(t, x)) = 0, & (t, x) \in (0, T) \times \Omega,\\[4pt]
\mathcal{B}[u](t, x) - g(t, x) = 0, & (t, x) \in (0, T) \times \partial\Omega,\\[4pt]
u(0, x) - u_0(x) = 0, & x \in \Omega,
\end{cases}
\tag{1}
$$

where $\mathcal{L}[\cdot]$ is a strongly elliptic differential operator of second order in the space variables $x$, and $\mathcal{B}[\cdot]$ is a boundary operator defined, for example, by Dirichlet and/or Neumann boundary conditions. The goal is to approximate the unknown function $u$ by means of a feed-forward neural network, $u_\theta(t, x) := u(t, x; \theta)$, where $\theta \in \mathbb{R}^P$ are the network parameters.
2.1 Feed-forward neural networks
A feed-forward network is a map that transforms an input $y \in \mathbb{R}^{\hat{d}}$ into an output $z \in \mathbb{R}^m$ by means of the composition of a variable number, $L$, of vector-valued functions called layers. These consist of units (neurons), which are the composition of affine-linear maps with scalar non-linear activation functions, [36]. Thus, assuming an $L$-layer network with $\hat{d}_l$ neurons per layer, it admits the representation

$$
h(y; \theta) := \big( h_L(\cdot; \theta_L) \circ h_{L-1}(\cdot; \theta_{L-1}) \circ \cdots \circ h_1(\cdot; \theta_1) \big)(y), \tag{2}
$$

where, for any $1 \leq l \leq L$,

$$
h_l(z_l; \theta_l) = \sigma_l(W_l z_l + b_l), \qquad W_l \in \mathbb{R}^{\hat{d}_{l+1} \times \hat{d}_l}, \quad z_l \in \mathbb{R}^{\hat{d}_l}, \quad b_l \in \mathbb{R}^{\hat{d}_{l+1}}, \tag{3}
$$

with $z_1 = y$, $\hat{d}_1 = \hat{d}$ and $\hat{d}_L = m$.
Usually (and this is taken as a guideline in this paper) the activation functions are assumed to be the same in all layers except the last one, where we consider the identity map, $\sigma_L(\cdot) = \mathrm{Id}(\cdot)$. In addition, taking into account the nature of the problem, the neural network must fulfil the differentiability conditions imposed by (1), which requires sufficiently smooth activation functions such as the sigmoid or the hyperbolic tangent, [63].
Lastly, it should be noted that a network as the one described above has $\hat{d} + m + \sum_{l=2}^{L-1} \hat{d}_l$ neurons, with parameters $\theta_l = \{W_l, b_l\}$ per layer, yielding a total of

$$
P = \sum_{l=1}^{L-1} (\hat{d}_l + 1)\, \hat{d}_{l+1} \tag{4}
$$

parameters, which determine the network's capacity.
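For instance, the following short check (a sketch in PyTorch; the widths are arbitrary illustrative choices) builds a network of the form (2)-(3) with hyperbolic tangent activations in the hidden layers and the identity in the last one, and verifies that the number of trainable parameters agrees with formula (4).

```python
import torch

d_hat = [3, 64, 64, 1]  # widths: input dim 3 (e.g. t plus two space variables), output m = 1

layers = []
for l in range(len(d_hat) - 1):
    layers.append(torch.nn.Linear(d_hat[l], d_hat[l + 1]))
    if l < len(d_hat) - 2:              # tanh on hidden layers,
        layers.append(torch.nn.Tanh())  # identity on the last one
net = torch.nn.Sequential(*layers)

# Formula (4): P = sum_l (d_l + 1) * d_{l+1}
P = sum((d_hat[l] + 1) * d_hat[l + 1] for l in range(len(d_hat) - 1))
assert P == sum(p.numel() for p in net.parameters())  # 4481 parameters here
```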
2.2 Loss function and training algorithm
In order to obtain an approximation of the function $u$ by means of a neural network, $u_\theta$, we need to find the network parameters, $\theta \in \mathbb{R}^P$, that yield the best approximation of (1). This leads to a global optimization problem that can be written in terms of the minimization of a loss function, which measures how good the approximation is. The most popular choice in PINN methods is to reduce the problem (1) to an unconstrained optimization problem, [29], leading to the family of loss functions
involving the $L^2$-error minimization of the interior, initial and boundary residuals. Thus, the loss function, $J(\theta)$, is defined as

$$
J(\theta) := \lambda_{\mathcal{I}} \big\| R^{\mathcal{I}}_\theta \big\|^2_{L^2((0,T)\times\Omega)} + \lambda_{\mathcal{B}} \big\| R^{\mathcal{B}}_\theta \big\|^2_{L^2((0,T)\times\partial\Omega)} + \lambda_{\mathcal{O}} \big\| R^{\mathcal{O}}_\theta \big\|^2_{L^2(\Omega)},
$$

or, equivalently,

$$
J(\theta) = \lambda_{\mathcal{I}} \int_0^T \!\!\int_\Omega \big| R^{\mathcal{I}}_\theta(t,x) \big|^2 \,\mathrm{d}x\,\mathrm{d}t + \lambda_{\mathcal{B}} \int_0^T \!\!\int_{\partial\Omega} \big| R^{\mathcal{B}}_\theta(t,x) \big|^2 \,\mathrm{d}\sigma_x\,\mathrm{d}t + \lambda_{\mathcal{O}} \int_\Omega \big| R^{\mathcal{O}}_\theta(x) \big|^2 \,\mathrm{d}x, \tag{5}
$$

where

$$
R^{\mathcal{I}}_\theta(t,x) := \frac{\partial u_\theta}{\partial t}(t,x) + \mathcal{L}[u_\theta](t,x) - f(u_\theta(t,x)), \quad (t,x) \in (0,T)\times\Omega, \tag{6}
$$
$$
R^{\mathcal{B}}_\theta(t,x) := \mathcal{B}[u_\theta](t,x) - g(t,x), \quad (t,x) \in (0,T)\times\partial\Omega, \tag{7}
$$
$$
R^{\mathcal{O}}_\theta(x) := u_\theta(0,x) - u_0(x), \quad x \in \Omega, \tag{8}
$$
account for the residuals of the equation, the boundary condition and the initial condition, respectively. The $\lambda_j \in \mathbb{R}_+$, $j \in \{\mathcal{I}, \mathcal{B}, \mathcal{O}\}$, are hyperparameters, preset or updated during the optimization, that allow a weight to be imposed on each addend of the loss, as can be seen in, e.g., [66], [44]. Note that, for the computation of the residuals (6), (7), it is necessary to obtain the derivatives of the neural network with respect to the input space and time variables, which are well defined under the premise of using sufficiently smooth activation functions. Numerically, they are calculated with the help of AD modules, such as those included in TensorFlow, [1], and PyTorch, [56]. Finally, the strategy followed in PINNs consists of minimizing the loss function (5), i.e., finding $\theta^* \in \Theta$ such that

$$
\theta^* = \arg\min_{\theta \in \Theta} J(\theta), \tag{9}
$$

where $\Theta \subseteq \mathbb{R}^P$ is the set of admissible parameters.
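As an illustration of how the residuals (6)-(8) can be evaluated with AD, consider the following sketch in PyTorch. The concrete operator (a one-dimensional heat operator with $f \equiv 0$), the Dirichlet data and all names are placeholder assumptions chosen for illustration; the actual XVA operators are introduced in Section 4.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def d(u, var):
    """First derivative of the (summed) network output w.r.t. an input tensor."""
    return torch.autograd.grad(u.sum(), var, create_graph=True)[0]

def interior_residual(t, x):      # R^I_theta, eq. (6), here for u_t - u_xx = 0
    t, x = t.requires_grad_(True), x.requires_grad_(True)
    u = net(torch.cat([t, x], dim=1))
    return d(u, t) - d(d(u, x), x)

def boundary_residual(t, x, g):   # R^B_theta, eq. (7), Dirichlet case B[u] = u
    return net(torch.cat([t, x], dim=1)) - g

def initial_residual(x, u0):      # R^O_theta, eq. (8)
    return net(torch.cat([torch.zeros_like(x), x], dim=1)) - u0
```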
Except in simple cases, the integrals appearing in (5) must be computed numerically by means of quadrature rules, [54]. For this reason, we need to select a set of training points, $\mathcal{P} = \mathcal{P}_{\mathcal{I}} \cup \mathcal{P}_{\mathcal{B}} \cup \mathcal{P}_{\mathcal{O}}$, where

$$
\begin{aligned}
\mathcal{P}_{\mathcal{I}} &= \{(t^{\mathcal{I}}_i, x^{\mathcal{I}}_i)\}_{i=1}^{N_{\mathcal{I}}}, & (t^{\mathcal{I}}_i, x^{\mathcal{I}}_i) &\in (0,T)\times\Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{I}}\},\\
\mathcal{P}_{\mathcal{B}} &= \{(t^{\mathcal{B}}_i, x^{\mathcal{B}}_i)\}_{i=1}^{N_{\mathcal{B}}}, & (t^{\mathcal{B}}_i, x^{\mathcal{B}}_i) &\in (0,T)\times\partial\Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{B}}\},\\
\mathcal{P}_{\mathcal{O}} &= \{(0, x^{\mathcal{O}}_i)\}_{i=1}^{N_{\mathcal{O}}}, & x^{\mathcal{O}}_i &\in \Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{O}}\},
\end{aligned}
$$

acting as nodes in the quadrature formulas.
Clearly, the choice of the quadrature technique has a direct influence on how these points are selected, and may correspond to, for example, a suitable mesh for a trapezoidal quadrature rule, Sobol low-discrepancy sequences, Latin hypercube sampling, etc. Moreover, such a choice is highly influenced by the time-space dimension of the problem, making it necessary to use random sampling in high-dimensional domains.
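For example, interior collocation points on $(0,T)\times(0,1)$ can be generated with the quasi-Monte Carlo utilities of SciPy; this is only a sketch, and the library choice is ours rather than prescribed by the methodology.

```python
from scipy.stats import qmc

T, N_I = 1.0, 1024  # final time and number of interior points (power of 2 for Sobol)

# Sobol low-discrepancy sequence scaled to (0, T) x (0, 1)
sobol = qmc.Sobol(d=2, scramble=True, seed=0)
pts = qmc.scale(sobol.random(N_I), l_bounds=[0.0, 0.0], u_bounds=[T, 1.0])
t_I, x_I = pts[:, :1], pts[:, 1:]

# A Latin hypercube sample is obtained analogously
lhs = qmc.LatinHypercube(d=2, seed=0)
pts_lhs = qmc.scale(lhs.random(N_I), [0.0, 0.0], [T, 1.0])
```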
In general terms, we can define the quadrature rule to calculate the integral of a function $\varphi \colon A \subset \mathbb{R}^{\hat{d}} \to \mathbb{R}$ as

$$
\Phi_M := \sum_{i=1}^{M} w_i \,\varphi(y_i), \tag{10}
$$

with $\{w_i\}_{i=1}^M \subset \mathbb{R}_+$ the weights and $\{y_i\}_{i=1}^M \subset A$ the nodes of the quadrature rule. This allows us to rewrite the loss function (5), taking into account the chosen discretization and quadrature, as follows:

$$
\hat{J}(\theta) = \lambda_{\mathcal{I}} \sum_{i=1}^{N_{\mathcal{I}}} w^{\mathcal{I}}_i \big| R^{\mathcal{I}}_\theta(t^{\mathcal{I}}_i, x^{\mathcal{I}}_i) \big|^2 + \lambda_{\mathcal{B}} \sum_{i=1}^{N_{\mathcal{B}}} w^{\mathcal{B}}_i \big| R^{\mathcal{B}}_\theta(t^{\mathcal{B}}_i, x^{\mathcal{B}}_i) \big|^2 + \lambda_{\mathcal{O}} \sum_{i=1}^{N_{\mathcal{O}}} w^{\mathcal{O}}_i \big| R^{\mathcal{O}}_\theta(x^{\mathcal{O}}_i) \big|^2. \tag{11}
$$
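Continuing the PyTorch sketch given after (9) (and reusing `net` and the residual functions defined there), the discretized loss (11) with uniform Monte Carlo weights and a single optimizer step might read as follows; the unit values of the $\lambda_j$ and the learning rate are illustrative assumptions only.

```python
import torch

lam_I = lam_B = lam_O = 1.0  # illustrative; Section 3 removes this heuristic choice

def loss_hat(t_I, x_I, t_B, x_B, g_B, x_O, u0_O):
    # Monte Carlo quadrature with uniform weights w_i = |A| / N; the domain
    # measures are constants that can be absorbed into the lambdas.
    J_I = interior_residual(t_I, x_I).pow(2).mean()
    J_B = boundary_residual(t_B, x_B, g_B).pow(2).mean()
    J_O = initial_residual(x_O, u0_O).pow(2).mean()
    return lam_I * J_I + lam_B * J_B + lam_O * J_O

opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(batch):
    """One stochastic-gradient step on the discretized loss (11)."""
    opt.zero_grad()
    J = loss_hat(*batch)
    J.backward()
    opt.step()
    return J.item()
```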