Boundary-safe PINNs extension: Application to non-linear parabolic PDEs in counterparty credit risk

Joel P. Villarino1,2, Álvaro Leitao1,2 and José A. García-Rodríguez1,2

October 6, 2022

1 M2NICA research group, University of Coruña, Spain
2 CITIC research center, Spain

E-mails: joel.perez.villarino@udc.es / alvaro.leitao@udc.es / jose.garcia.rodriguez@udc.es

The authors have no conflict of interest to disclose.
ABSTRACT.
The goal of this work is to develop deep learning numerical methods for solving option XVA pricing problems given by non-linear PDE models. A novel strategy for the treatment of the boundary conditions is proposed, which removes the need for the heuristic choice of the weights of the different terms that appear in the loss function used during training. It is based on defining the losses associated with the boundaries by means of the PDEs that arise from substituting the related conditions into the model equation itself. Further, automatic differentiation is employed to obtain accurate approximations of the partial derivatives.

Keywords: deep learning, PDEs, PINNs, boundary conditions, nonlinear, automatic differentiation, option pricing, XVA.
1 Introduction
Deep learning techniques are machine learning algorithms based on neural networks, also known as artificial neural networks (ANNs), and representation learning, see [36] and the references therein. From a mathematical point of view, ANNs can be interpreted as multiple chained compositions of multivariate functions, and deep neural network is the term used for an ANN with several interconnected layers. Such networks are known for being universal approximators, a property established by the Universal Approximation Theorem, which essentially states that any continuous function in any dimension can be represented to arbitrary accuracy by means of an ANN. For this reason, ANNs have a wide range of applications, and their use has become ubiquitous in many fields: computer vision, natural language processing, autonomous vehicles, etc. Deep learning algorithms are usually classified according to the amount and type of supervision they receive during training and, among all the categories that can be identified, we highlight supervised and unsupervised algorithms, which differ in whether or not the desired solutions are provided in the training set.
The aforementioned universal approximation property was exploited in the seminal papers [52], [29] and [48] to introduce a technique to solve partial differential equations (PDEs) by means of ANNs. In recent years there has been growing interest in approximating the solution of PDEs by means of deep neural networks, which promise to be an alternative to classical methods such as Finite Differences (FD), Finite Volumes (FV) or Finite Elements (FE). For example, the FE technique consists in projecting the solution onto some functional space, the Galerkin spaces. Then, by passing to the weak variational formulation and taking the discrete basis, we obtain a linear system of equations whose unknowns are the approximated values of the solution at each point of the mesh. In a similar
manner, an ANN can be trained to learn the physical law that is given by a PDE or a system of PDEs. The idea is quite similar to the classical Galerkin methods but, instead of representing the solution as a projection onto some flavour of Galerkin space, the solution is written in terms of ANNs as the composition of nonlinear functions depending on some network weights. As a result, instead of a high-dimensional linear system, a high-dimensional nonlinear optimization problem is obtained for the ANN weights. This problem must be solved using nonlinear optimization algorithms such as stochastic gradient descent-based methods, e.g., [45], and/or quasi-Newton methods, e.g., L-BFGS, [7]. More recently, with the advances in automatic differentiation (AD) algorithms and hardware (GPUs), this kind of technique has gained momentum in the literature and, currently, the most promising approach is known as physics-informed neural networks (PINNs), see [53], [59], [51], [54], [23].
In the last few years, PINNs have shown remarkable performance. However, there is still room for improvement within the methodology. One of the disadvantages of PINNs is the lack of theoretical results for the control of the approximation error. Obtaining error estimates or results for the order of approximation in PINNs is a non-trivial task, much more challenging than in classical methods. Even so, the authors in [25], [4], [54], [28], [26], [24] and [27] (among others) have derived estimates and bounds for the so-called generalization error for particular models. Another drawback is the difficulty of imposing the boundary conditions (a fact discussed further later in this section). Nevertheless, the use of ANNs has several advantages for solving PDEs: they can be applied to nonlinear PDEs without any extra effort; they can be extended to (moderately) high dimensions; and they yield accurate approximations of the partial derivatives of the solution thanks to the AD modules provided by modern deep learning frameworks.
PINNs are not the only ANN-based approach to PDE problems. ANNs can also be used as a complement to classical numerical methods, for example training the neural network to obtain smoothness indicators or WENO reconstructions to be used inside a classical FV method, see [46], [47]. ANNs are also being used to solve PDE models by means of their backward stochastic differential equation (BSDE) representation whenever the Feynman-Kac theorem can be applied, which is the usual situation in computational finance, for example. In [37], the authors present the so-called DeepBSDE numerical methods and their application to the solution of the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation in very high (hundreds of) dimensions. The connection of this method with the recursive multilevel Picard approximations allows the authors to prove that DeepBSDEs are capable of overcoming the so-called "curse of dimensionality" for a certain class of PDEs, see [68], [42].
The main goal of the present work is to develop robust and stable deep learning numerical methods for solving nonlinear parabolic PDE models by means of PINNs. The motivation arises from the difficulty of finding and numerically imposing the boundary conditions, which are always delicate and critical tasks both in the classical FD/FV/FE setting and also in the ANN setting. The common approach consists in assigning weights to the different terms involved in the loss function, where the selection of these weights must be done heuristically. We introduce a new idea that consists in defining the loss terms due to the boundary conditions by means of evaluating the PDE operator restricted to the boundaries. In this way, the value of such addends is of the same magnitude as the interior losses. Although this is not feasible in classical PDE solving algorithms, it is very natural within the PINNs framework since, by means of AD, we can evaluate this operator on the boundary even when it contains derivatives normal to that boundary. This novel treatment of the boundary conditions in PINNs is the main contribution of this work, as it removes the heuristic choice of the weights for the boundary contributions to the loss function. Further, AD can be naturally exploited to obtain accurate approximations of the partial derivatives of the solution with respect to the input parameters (quantities of much interest in several fields).
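As a preview of the mechanism (developed in Section 3), the following minimal sketch illustrates the point in PyTorch. The toy operator $\partial_t u - \partial_{xx} u$, the small network and all names are our own illustrative assumptions, not the XVA models treated later: the same residual function is evaluated, unchanged, at points lying on the boundary, even though it involves a second derivative normal to that boundary.

```python
import torch

# Toy surrogate network u_theta(t, x); architecture chosen arbitrarily.
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def pde_residual(t, x):
    """Residual of the illustrative operator u_t - u_xx.
    AD makes it computable at interior and boundary points alike."""
    t = t.requires_grad_(True)
    x = x.requires_grad_(True)
    u = net(torch.cat([t, x], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

# Points on the boundary x = 1: u_xx is a derivative normal to the
# boundary there, yet the residual is evaluated exactly as in the interior.
t_b, x_b = torch.rand(8, 1), torch.ones(8, 1)
print(pde_residual(t_b, x_b))
```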
Although the proposed methodology could be presented for a wide range of applications, here we focus on the solution of PDE models for challenging problems appearing in the computational finance field. In particular, we consider the derivative valuation problem in the presence of counterparty
credit risk (CCR), which includes in its formulation the so-called x-value adjustments (XVA). This term refers to the different valuation adjustments that arise in the models when CCR is considered, i.e., when the possibility of default of the parties involved in the transaction is taken into account. These adjustments can come from different sources within a derivative portfolio: credit (CVA); debit (DVA); funding costs (FVA); collateral requirements (ColVA); and capital requirements (KVA), among others. After the 2007-2009 financial crisis, CCR management became of key importance in the financial industry. Several models were developed in order to enrich the classical pricing models with the introduction of risk terms. In this sense, the value adjustments are terms to be added to, or subtracted from, an idealised reference portfolio value, computed in the absence of frictions, in order to obtain the final value of the transaction.
The first works on this topic appeared before the above-mentioned crisis, focusing on analyzing the CVA concept. Some seminal works from this period are [30], [11] and [18]. After the crisis, the XVA adjustments gained huge attention. The models dealing with the possibility of default of the parties involved in a transaction were revised by introducing the DVA factor, [13], [9]. Additionally, the increasingly important role of collateral agreements demanded a portfolio-wide view of valuation, addressed by introducing the ColVA factor. In a Black-Scholes economy, [57] gives valuation formulas in both the collateralized and uncollateralized cases. In addition, generalizations to the case of a multi-currency economy can be found in [58], [31], [32], and [35]. Another important aspect for the industry, apart from default risk, is represented by funding costs. Currently, trading activity depends on different sources of liquidity, such as the interest rate multi-curve, [22], and the old assumption of a unique risk-free interest rate is no longer realistic. In [55], the FVA is included in a risk-neutral pricing framework for CCR considering realistic settings. Such work is extended in [12], where the effect of Central Clearing Counterparties (CCPs) on funding costs is studied. In this regard, there are many more contributions aiming at a single risk management framework which includes funding and default risk. A unified valuation theory that incorporates credit risk, collateralization and funding costs by means of the so-called discounting approach is developed in [8]. The authors in [15], [14] generalize the classical Black-Scholes replication approach to include some of the aforementioned effects. A more general BSDE approach is provided by [20], [21], [5], and [6]. In addition, the equivalence between the discounting and BSDE-based approaches is demonstrated in [8].
Of course, the world of quantitative finance in general, and CCR management in particular, has not been exempt from the advances in deep learning and, nowadays, ANNs are employed for a wide variety of tasks in the industry. Unsupervised ANNs, in both flavours, PINNs and DeepBSDEs, have recently been used for solving challenging financial problems. For example, in [62] the authors apply PINNs to solve the linear one- and two-dimensional Black-Scholes equations, and [67] introduces the solution of high-dimensional Black-Scholes problems using BSDEs. In [34] the authors present a novel computational framework for portfolio-wide risk management problems with a potentially large number of risk factors that makes traditional numerical techniques ineffective. They use a coupled system of BSDEs for XVA, which is addressed by a recursive application of an ANN-based BSDE solver. Other relevant works that make use of ANNs for computational finance problems, although not formulated as PDEs, include [40], [41], or [50], for example.
The outline of this paper is as follows. In Section 2 we start by revisiting the PINNs framework for solving PDEs. Section 3 introduces the new methodology for the treatment of the boundary conditions in the PINNs setting. In Section 4, the XVA PDE models that we solve in this paper and their adaptation to our PINNs extension are described; more precisely, XVA problems in one and two dimensions, under the Black-Scholes and Heston models. Finally, in Section 5, the numerical experiments that assess the accuracy of the approximation for option prices and their partial derivatives (the so-called Greeks) are presented.
2 PINNs
In this section we introduce the so-called PINNs methodology for solving PDEs. The illustration is carried out according to the kind of PDEs that arise in the selected financial problems, i.e., semilinear parabolic PDEs with source terms. Thus, let $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, be a bounded, closed and connected domain and let $T > 0$. Consider the following boundary value problem. Given a function $f \in \mathcal{C}(\mathbb{R})$ and setting $\hat{d} = d + 1$, find $u \colon (t, x) \in [0, T] \times \Omega \subset \mathbb{R}^{\hat{d}} \to \mathbb{R}$ such that

$$
\begin{cases}
\dfrac{\partial u}{\partial t}(t, x) + \mathcal{L}[u](t, x) - f(u(t, x)) = 0, & (t, x) \in (0, T) \times \Omega,\\[4pt]
\mathcal{B}[u](t, x) - g(t, x) = 0, & (t, x) \in (0, T) \times \partial\Omega,\\[4pt]
u(0, x) - u_0(x) = 0, & x \in \Omega,
\end{cases}
\tag{1}
$$

where $\mathcal{L}[\cdot]$ is a strongly elliptic differential operator of second order in the space variables $x$, and $\mathcal{B}[\cdot]$ is a boundary operator defined, for example, by Dirichlet and/or Neumann boundary conditions. The goal is to approximate the unknown function $u$ by means of a feed-forward neural network, $u_\theta(t, x) := u(t, x; \theta)$, where $\theta \in \mathbb{R}^P$ are the network parameters.
2.1 Feed-forward neural networks
A feed-forward network is a map that transforms an input $y \in \mathbb{R}^{\hat{d}}$ into an output $z \in \mathbb{R}^m$ by means of the composition of a variable number, $L$, of vector-valued functions called layers. These consist of units (neurons), which are the composition of affine-linear maps with scalar non-linear activation functions, [36]. Thus, assuming an $L$-layer network with $\hat{d}_l$ neurons per layer, it admits the representation

$$
h(y; \theta) := \big( h_L(\cdot; \theta_L) \circ h_{L-1}(\cdot; \theta_{L-1}) \circ \cdots \circ h_1(\cdot; \theta_1) \big)(y), \tag{2}
$$

where, for any $1 \leq l \leq L$,

$$
h_l(z_l; \theta_l) = \sigma_l(W_l z_l + b_l), \qquad W_l \in \mathbb{R}^{\hat{d}_{l+1} \times \hat{d}_l}, \quad z_l \in \mathbb{R}^{\hat{d}_l}, \quad b_l \in \mathbb{R}^{\hat{d}_{l+1}}, \tag{3}
$$

with $z_1 = y$, $\hat{d}_1 = \hat{d}$ and $\hat{d}_L = m$.
Usually (and this is taken as a guideline in this paper) the activation functions are assumed to be the same in all layers except the last one, where we consider the identity map, $\sigma_L(\cdot) = \mathrm{Id}(\cdot)$. In addition, taking into account the nature of the problem, the neural network must fulfil the differentiability conditions imposed by (1), which requires sufficiently smooth activation functions such as the sigmoid or the hyperbolic tangent, [63].
Lastly, it should be noted that a network as the one described above has $\hat{d} + m + \sum_{l=2}^{L-1} \hat{d}_l$ neurons, with parameters $\theta_l = \{W_l, b_l\}$ per layer, yielding a total of

$$
P = \sum_{l=1}^{L-1} (\hat{d}_l + 1)\, \hat{d}_{l+1} \tag{4}
$$

parameters, which determine the network's capacity.
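For instance, the following short check (a sketch in PyTorch; the widths are arbitrary illustrative choices) builds a network of the form (2)-(3) with hyperbolic tangent activations in the hidden layers and the identity in the last one, and verifies that the number of trainable parameters agrees with formula (4).

```python
import torch

d_hat = [3, 64, 64, 1]  # widths: input dim 3 (e.g. t plus two space variables), output m = 1

layers = []
for l in range(len(d_hat) - 1):
    layers.append(torch.nn.Linear(d_hat[l], d_hat[l + 1]))
    if l < len(d_hat) - 2:              # tanh on hidden layers,
        layers.append(torch.nn.Tanh())  # identity on the last one
net = torch.nn.Sequential(*layers)

# Formula (4): P = sum_l (d_l + 1) * d_{l+1}
P = sum((d_hat[l] + 1) * d_hat[l + 1] for l in range(len(d_hat) - 1))
assert P == sum(p.numel() for p in net.parameters())  # 4481 parameters here
```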
2.2 Loss function and training algorithm
In order to obtain an approximation of the function $u$ by means of a neural network, $u_\theta$, we need to find the network parameters, $\theta \in \mathbb{R}^P$, that yield the best approximation of (1). This leads to a global optimization problem that can be written in terms of the minimization of a loss function, which measures how good the approximation is. The most popular choice in PINN methods is to reduce the problem (1) to an unconstrained optimization problem, [29], leading to the family of loss functions
involving the $L^2$-error minimization of the interior, initial and boundary residuals. Thus, the loss function, $J(\theta)$, is defined as

$$
J(\theta) := \lambda_{\mathcal{I}} \big\| R^{\mathcal{I}}_\theta \big\|^2_{L^2((0,T)\times\Omega)} + \lambda_{\mathcal{B}} \big\| R^{\mathcal{B}}_\theta \big\|^2_{L^2((0,T)\times\partial\Omega)} + \lambda_{\mathcal{O}} \big\| R^{\mathcal{O}}_\theta \big\|^2_{L^2(\Omega)},
$$

or, equivalently,

$$
J(\theta) = \lambda_{\mathcal{I}} \int_0^T \!\!\int_\Omega \big| R^{\mathcal{I}}_\theta(t,x) \big|^2 \,\mathrm{d}x\,\mathrm{d}t + \lambda_{\mathcal{B}} \int_0^T \!\!\int_{\partial\Omega} \big| R^{\mathcal{B}}_\theta(t,x) \big|^2 \,\mathrm{d}\sigma_x\,\mathrm{d}t + \lambda_{\mathcal{O}} \int_\Omega \big| R^{\mathcal{O}}_\theta(x) \big|^2 \,\mathrm{d}x, \tag{5}
$$

where

$$
R^{\mathcal{I}}_\theta(t,x) := \frac{\partial u_\theta}{\partial t}(t,x) + \mathcal{L}[u_\theta](t,x) - f(u_\theta(t,x)), \quad (t,x) \in (0,T)\times\Omega, \tag{6}
$$
$$
R^{\mathcal{B}}_\theta(t,x) := \mathcal{B}[u_\theta](t,x) - g(t,x), \quad (t,x) \in (0,T)\times\partial\Omega, \tag{7}
$$
$$
R^{\mathcal{O}}_\theta(x) := u_\theta(0,x) - u_0(x), \quad x \in \Omega, \tag{8}
$$
account for the residuals of the equation, the boundary condition and the initial condition, respectively. The $\lambda_j \in \mathbb{R}_+$, $j \in \{\mathcal{I}, \mathcal{B}, \mathcal{O}\}$, are hyperparameters, preset or updated during the optimization, that allow a weight to be imposed on each addend of the loss, as can be seen in, e.g., [66], [44]. Note that, for the computation of the residuals (6), (7), it is necessary to obtain the derivatives of the neural network with respect to the input space and time variables, which are well defined under the premise of using sufficiently smooth activation functions. Numerically, they are calculated with the help of AD modules, such as those included in TensorFlow, [1], and PyTorch, [56]. Finally, the strategy followed in PINNs consists of minimizing the loss function (5), i.e., finding $\theta^* \in \Theta$ such that

$$
\theta^* = \arg\min_{\theta \in \Theta} J(\theta), \tag{9}
$$

where $\Theta \subseteq \mathbb{R}^P$ is the set of admissible parameters.
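As an illustration of how the residuals (6)-(8) can be evaluated with AD, consider the following sketch in PyTorch. The concrete operator (a one-dimensional heat operator with $f \equiv 0$), the Dirichlet data and all names are placeholder assumptions chosen for illustration; the actual XVA operators are introduced in Section 4.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def d(u, var):
    """First derivative of the (summed) network output w.r.t. an input tensor."""
    return torch.autograd.grad(u.sum(), var, create_graph=True)[0]

def interior_residual(t, x):      # R^I_theta, eq. (6), here for u_t - u_xx = 0
    t, x = t.requires_grad_(True), x.requires_grad_(True)
    u = net(torch.cat([t, x], dim=1))
    return d(u, t) - d(d(u, x), x)

def boundary_residual(t, x, g):   # R^B_theta, eq. (7), Dirichlet case B[u] = u
    return net(torch.cat([t, x], dim=1)) - g

def initial_residual(x, u0):      # R^O_theta, eq. (8)
    return net(torch.cat([torch.zeros_like(x), x], dim=1)) - u0
```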
Except in simple cases, the integrals appearing in (5) must be computed numerically by means of quadrature rules, [54]. For this reason, we need to select a set of training points, $\mathcal{P} = \mathcal{P}_{\mathcal{I}} \cup \mathcal{P}_{\mathcal{B}} \cup \mathcal{P}_{\mathcal{O}}$, where

$$
\begin{aligned}
\mathcal{P}_{\mathcal{I}} &= \{(t^{\mathcal{I}}_i, x^{\mathcal{I}}_i)\}_{i=1}^{N_{\mathcal{I}}}, & (t^{\mathcal{I}}_i, x^{\mathcal{I}}_i) &\in (0,T)\times\Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{I}}\},\\
\mathcal{P}_{\mathcal{B}} &= \{(t^{\mathcal{B}}_i, x^{\mathcal{B}}_i)\}_{i=1}^{N_{\mathcal{B}}}, & (t^{\mathcal{B}}_i, x^{\mathcal{B}}_i) &\in (0,T)\times\partial\Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{B}}\},\\
\mathcal{P}_{\mathcal{O}} &= \{(0, x^{\mathcal{O}}_i)\}_{i=1}^{N_{\mathcal{O}}}, & x^{\mathcal{O}}_i &\in \Omega, & \forall i \in \{1, 2, \dots, N_{\mathcal{O}}\},
\end{aligned}
$$

acting as nodes in the quadrature formulas.
Clearly, the choice of the quadrature technique has a direct influence on how these points are selected, and may correspond to, for example, a suitable mesh for a trapezoidal quadrature rule, Sobol low-discrepancy sequences, Latin hypercube sampling, etc. Moreover, such a choice is highly influenced by the time-space dimension of the problem, making it necessary to use random sampling in high-dimensional domains.
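For example, interior collocation points on $(0,T)\times(0,1)$ can be generated with the quasi-Monte Carlo utilities of SciPy; this is only a sketch, and the library choice is ours rather than prescribed by the methodology.

```python
from scipy.stats import qmc

T, N_I = 1.0, 1024  # final time and number of interior points (power of 2 for Sobol)

# Sobol low-discrepancy sequence scaled to (0, T) x (0, 1)
sobol = qmc.Sobol(d=2, scramble=True, seed=0)
pts = qmc.scale(sobol.random(N_I), l_bounds=[0.0, 0.0], u_bounds=[T, 1.0])
t_I, x_I = pts[:, :1], pts[:, 1:]

# A Latin hypercube sample is obtained analogously
lhs = qmc.LatinHypercube(d=2, seed=0)
pts_lhs = qmc.scale(lhs.random(N_I), [0.0, 0.0], [T, 1.0])
```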
In general terms, we can define the quadrature rule to calculate the integral of a function $\varphi \colon A \subset \mathbb{R}^{\hat{d}} \to \mathbb{R}$ as

$$
\Phi_M := \sum_{i=1}^{M} w_i \,\varphi(y_i), \tag{10}
$$

with $\{w_i\}_{i=1}^M \subset \mathbb{R}_+$ the weights and $\{y_i\}_{i=1}^M \subset A$ the nodes of the quadrature rule. This allows us to rewrite the loss function (5), taking into account the chosen discretization and quadrature, as follows:

$$
\hat{J}(\theta) = \lambda_{\mathcal{I}} \sum_{i=1}^{N_{\mathcal{I}}} w^{\mathcal{I}}_i \big| R^{\mathcal{I}}_\theta(t^{\mathcal{I}}_i, x^{\mathcal{I}}_i) \big|^2 + \lambda_{\mathcal{B}} \sum_{i=1}^{N_{\mathcal{B}}} w^{\mathcal{B}}_i \big| R^{\mathcal{B}}_\theta(t^{\mathcal{B}}_i, x^{\mathcal{B}}_i) \big|^2 + \lambda_{\mathcal{O}} \sum_{i=1}^{N_{\mathcal{O}}} w^{\mathcal{O}}_i \big| R^{\mathcal{O}}_\theta(x^{\mathcal{O}}_i) \big|^2. \tag{11}
$$
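Continuing the PyTorch sketch given after (9) (and reusing `net` and the residual functions defined there), the discretized loss (11) with uniform Monte Carlo weights and a single optimizer step might read as follows; the unit values of the $\lambda_j$ and the learning rate are illustrative assumptions only.

```python
import torch

lam_I = lam_B = lam_O = 1.0  # illustrative; Section 3 removes this heuristic choice

def loss_hat(t_I, x_I, t_B, x_B, g_B, x_O, u0_O):
    # Monte Carlo quadrature with uniform weights w_i = |A| / N; the domain
    # measures are constants that can be absorbed into the lambdas.
    J_I = interior_residual(t_I, x_I).pow(2).mean()
    J_B = boundary_residual(t_B, x_B, g_B).pow(2).mean()
    J_O = initial_residual(x_O, u0_O).pow(2).mean()
    return lam_I * J_I + lam_B * J_B + lam_O * J_O

opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(batch):
    """One stochastic-gradient step on the discretized loss (11)."""
    opt.zero_grad()
    J = loss_hat(*batch)
    J.backward()
    opt.step()
    return J.item()
```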