Neural Augmented Kalman Filtering with Bollinger Bands for Pairs Trading

Amit Milstein, Haoran Deng, Guy Revach, Hai Morgenstern, and Nir Shlezinger

Abstract—Pairs trading is a family of trading techniques that determine their policies based on monitoring the relationships between pairs of assets. A common pairs trading approach relies on describing the pair-wise relationship as a linear State Space (SS) model with Gaussian noise. This representation facilitates extracting financial indicators with low complexity and latency using a Kalman Filter (KF), which are then processed using classic policies such as Bollinger Bands (BB). However, such SS models are inherently approximated and mismatched, often degrading the revenue. In this work, we propose KalmanNet-aided Bollinger bands Pairs Trading (KBPT), a deep learning aided policy that augments the operation of KF-aided BB trading. KBPT is designed by formulating an extended SS model for pairs trading that approximates their relationship as holding partial co-integration. This SS model is utilized by a trading policy that augments KF-BB trading with a dedicated neural network based on the KalmanNet architecture. The resulting KBPT is trained in a two-stage manner, which first tunes the tracking algorithm in an unsupervised manner independently of the trading task, followed by its adaptation to track the financial indicators to maximize revenue while approximating BB with a differentiable mapping. KBPT thus leverages data to overcome the approximated nature of the SS model, converting the KF-BB policy into a trainable model. We empirically demonstrate that our proposed KBPT systematically yields improved revenue compared with model-based and data-driven benchmarks over various assets.

I. INTRODUCTION

Quantitative methods constitute the fundamental mathematical framework for analysis and prediction in financial markets [2], [3]. A common type of quantitative method is algorithmic trading [4], which deals with decision-making carried out by an agent (i.e., a trader) for the purpose of maximizing a cumulative reward, most commonly achieving a high Profit and Loss (PNL) balance in the market. Quantitative trading schemes typically comprise two main stages: the agent first tracks a stochastic process that describes the prices of the assets of interest in order to extract useful trading indicators. Then, these financial indicators are used as a basis for decision making by setting a trading policy [5]–[7].

Quantitative trading requires a decision making mechanism given application time constraints, i.e., a trading policy that outputs a position based on the trading indicators. Such policies are typically based on indicators obtained as statistical predictions of an asset price [8]. A popular classical policy is the Bollinger Bands (BB) [9], which is based on the intuition that if the price is much lower than its mean, it will rise back to its normal level, and thus one should long this asset. Because this method is nonlinear, it hedges the risk by constraining the investment.

Parts of this work were accepted for presentation in the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) as the paper [1]. A. Milstein and N. Shlezinger are with the School of ECE, Ben-Gurion University of the Negev, Israel (e-mail: amitmils@post.bgu.ac.il; nirshl@bgu.ac.il). H. Deng and G. Revach are with the Institute for Signal and Information Processing, D-ITET, ETH Zürich, Switzerland (e-mail: haodeng@student.ethz.ch; grevach@ethz.ch). H. Morgenstern is unaffiliated (e-mail: hai.morgenstern@gmail.com).

Classical trading schemes such as BB work well for single stationary (and specifically, mean-reverting) processes [10]. It is therefore desirable to look for stationary assets, though some schemes only look for the weaker condition of mean reversion, e.g., using the Ornstein–Uhlenbeck formula [11]. Accordingly, algorithmic tracking of financial processes is typically based on imposing a model on their temporal evolution [12]. A common approach imposes a simple linear stochastic stationary model [13], often based on autoregressive and moving average models [14]. While assets are rarely stationary in real markets, their differences and spread (i.e., linear combination) are in some cases faithfully captured as being stationary, and thus such techniques are commonly adopted in pairs trading [4], [15]. The spread evolution and its relationship with the asset pair is often described using a State Space (SS) model [16]–[18], enabling tracking with a Kalman Filter (KF) [14, Ch. 10]. A core challenge with combining financial policies with algorithmic tracking based on such statistical models is that they typically require strong assumptions and prior financial knowledge. For instance, to utilize the KF for spread tracking, one has to faithfully capture the pairs trading as a linear Gaussian SS model. Such models often fail to capture complicated patterns of real-world financial assets, which in turn leads to poor trading policies.

To overcome the drawbacks of classic model-based methods, recent years have witnessed a growing interest in the use of model-agnostic deep learning. Deep learning systems are used to capture the time evolution of financial assets [19], extract features for trading [20], and determine trading policies [21]; see the survey in [22]. Common deep learning architectures for financial modelling and prediction include recurrent neural networks (RNNs) [23], auto-encoders [24], anomaly detection [25], and attention models [26], [27]. Reinforcement Learning (RL) is considered for training deep trading policies [21], [28]–[32] to maximize the reward in an end-to-end fashion. In order to generate various inputs, it was proposed to use deep learning based natural language processing to analyze social media and news for trading [20], [23]. Despite their growing popularity, deep learning based quantitative methods are subject to several drawbacks. They are based on highly parameterized black boxes, giving rise to latency considerations. Moreover, deep learning based policies lack the interpretability and reliability of model-based methods, and do not incorporate established models, which are core in pairs trading. In addition, these methods tend to have a long training time and require large volumes of data for training, which can constitute a limiting factor in high-frequency trading. This motivates designing trading techniques that simultaneously benefit from the approximated modelling adopted by classical trading schemes alongside the abstractness and capabilities of data-driven deep learning methods.

arXiv:2210.15448v2 [q-fin.TR] 1 Sep 2023

In this work, we propose KalmanNet-aided Bollinger bands Pairs Trading (KBPT), a pairs trading algorithm that combines SS model-based trading policies with deep learning tools, based on model-based deep learning methodology [33]–[35]. KBPT is derived by proposing a novel SS model representation for pairs trading obtained from assuming partial co-integration [17] combined with an autoregressive prior imposed on the spread. As opposed to previous SS model-based trading policies that utilize, e.g., KF with BB for setting the position, thus implicitly assuming that the SS model is Gaussian and accurate, we design our policy specifically to cope with the approximated nature of the SS model and its expected non-Gaussianity. This is achieved by having KBPT preserve the flow of KF-BB trading, retaining its structured modeling and interpretability, while augmenting the KF with a trainable RNN following the recently proposed KalmanNet [36]. The resulting neural augmentation, in which the specific computation of the KF that depends on the underlying stochasticity is learned, leverages data to track the spread in partially known and non-Gaussian SS models.

We propose a dedicated training scheme for KBPT that learns the pairs trading policy from sequences of past asset pairs. The learning method is based on a two-stage procedure, where we first train KalmanNet separately from the trading task as a form of pretraining. There, we overcome the fact that there is no ground-truth spread value by leveraging the interpretable architecture of KalmanNet, and particularly its internal prediction of the next observation which follows from the KF flow, for unsupervised learning [37]. Then, we train the overall trading policy, combining the neural augmented KalmanNet with a customized BB mapping that is differentiable, such that the tracking algorithm learns to produce features that are most useful in the sense of maximizing the PNL rather than accurately tracking the prices. In doing so, we gain the ability to cope with modeling mismatch, as the resulting architecture converts the model-based trading algorithm into a trainable discriminative model [38] that is trained end-to-end to maximize the PNL as a cumulative reward.

Our empirical study compares KBPT with both model-based trading and with deep RL-based policies for various asset pairs. There, we demonstrate the individual gains of each of the ingredients of KBPT, including the usefulness of the extended SS model underlying KBPT, as well as the superiority of the proposed hybrid algorithm in systematically achieving higher PNL compared with all considered benchmarks. Our work extends the preliminary findings reported in [1] through the proposal of the new partially co-integrated SS model, the incorporation of a dedicated accumulated reward loss and the two-stage training method, as well as the extensive discussion, derivation, and experimental evaluations.

The rest of this paper is organized as follows: Section II covers preliminaries in model-based trading and formulates the problem; Section III describes the different SS models in pairs trading and presents our proposed model; Section IV details our proposed hybrid KBPT policy along with its learning procedure; Section V presents the empirical study of KBPT, contrasting it with both model-based and data-driven policies; while Section VI provides concluding remarks.

Throughout this paper we use boldface lowercase letters for vectors, e.g., $\boldsymbol{x}$, and boldface uppercase letters for matrices, e.g., $\boldsymbol{X}$. We denote the step function as $U(\cdot)$, with $U(t) = 1$ for $t > 0$ and $U(t) = 0$ for $t \le 0$, while $\mathbb{E}\{\cdot\}$ is the notation for stochastic expectation. We use the term stationary process to refer to a stochastic process that is stationary in the wide sense. For consistency, the prices of all assets are given in USD.

II. PRELIMINARIES AND PROBLEM FORMULATION

In this section we formulate the considered model for pairs trading. To that aim, we first review necessary preliminaries in quantitative trading in Subsection II-A, and recall the BB policy in Subsection II-B. These preliminaries are then used to formulate the problem in Subsection II-C.

A. Trading Formulation

Trading strategies determine investment policies based on monitoring financial assets. Accordingly, trading strategies can be generally divided into two stages: (i) tracking of the assets into financial indicators; and (ii) the trading policy that is based on these indicators [7].

1) Tracking: A crucial part of any trading scheme is constantly evaluating and analyzing the financial markets, individual securities, or sectors. Information such as price movements, volatility, liquidity, volume, momentum, and market breadth is valuable for making informed decisions in the trading market. Using this financial data, one can derive financial indicators which enable the trader to gain insight into potential entry and exit points, assess risks, and ultimately optimize the investment strategy. Quantitative financial indicators can include technical indicators (e.g., moving averages, relative strength index) [39], fundamental indicators (e.g., earnings per share, price-to-earnings ratio), or macroeconomic indicators (e.g., GDP growth rate, inflation rate) [40].

To formulate this mathematically, we use $d_t$ to denote the financial information (e.g., asset price) at time $t > 0$. A financial tracker, denoted $\varphi$, is a mapping of all the financial data accumulated until time $t$ into financial indicators $z_t$, i.e.,

$$\varphi : \{d_\tau\}_{\tau \le t} \mapsto z_t. \tag{1}$$

The financial indicator should provide sufficient information for the policy to dictate the current decision, as detailed next.

2) Policy: The policy component of a trading scheme, denoted by $\pi$, refers to the rules, guidelines, and principles that govern the decision-making process and the execution of trades. The policy component in general may encompass both quantitative and qualitative aspects: Quantitative aspects can involve specific parameters, thresholds, or algorithms based on financial indicators or other mathematical models. Qualitative aspects consider factors such as market conditions, investor sentiment, news events, or expert judgment.

The policy is the last step of the trading scheme, and it outputs the recommended actions for the trader to take in order to optimize profits. We refer to the return of each trade transaction as the reward. In quantitative trading, the action at time $t$, denoted $p_t$, is determined using a trading policy $\pi$ based on the current indicator $z_t$ as well as past actions and indicators, namely,

$$\pi : \{z_\tau\}_{\tau \le t}, \{p_\tau\}_{\tau < t} \mapsto p_t. \tag{2}$$

We henceforth focus on settings where

A1 The information $d_t$ represents the price of an asset.

A2 The actions correspond to long/short decisions on $d_t$, i.e., holding positive or negative quantities, respectively.

The action space in A2 indicates that $p_t$ encapsulates open and close decisions. We formulate this by writing $p_t = [\mathrm{op}_t, \mathrm{cp}_t]$, where $\mathrm{op}_t \in \{-1, 0, 1\}$ is the open position policy that signals whether to short, hold, or long the asset, respectively; and $\mathrm{cp}_t \in \{0, 1\}$ is the close position policy, which takes the value 1 when an existing open position (e.g., from time $t-1$) needs to be closed. Otherwise, if a position needs to remain open or there is no open position, it takes the value 0. Positions are determined by first checking whether the closing criterion is met, and then checking whether to open a position. We say that $p_t$ is an active position if $\mathrm{op}_t = \pm 1$.

3) Reward: Under A1-A2, one can mathematically formulate the reward accumulated for an active position. To that aim, let $t_i^o$ be the time the $i$th active position is taken and $t_i^c$ the time it is closed. Accordingly, the reward obtained for the $i$th activity of policy $\pi$ with financial tracker $\varphi$, denoted by $r_i^{\varphi,\pi}$, is computed based on the difference in the asset price over the activity period and whether it was long or short via

$$r_i^{\varphi,\pi} = \mathrm{op}_{t_i^o} \cdot \left(d_{t_i^c} - d_{t_i^o}\right). \tag{3}$$

The reward in (3) can be positive or negative, i.e., profit or loss, respectively.
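As a minimal sketch of (3), assume the prices $d_t$ are stored in a Python list indexed by time step and that each active position is described by its open time, close time, and direction. The function name `position_reward` and the toy prices are illustrative assumptions and do not come from the paper.

```python
def position_reward(prices, t_open, t_close, op):
    """Reward of one active position, following (3).

    prices  : sequence of asset prices d_t, indexed by time step (assumed layout)
    t_open  : time at which the position is opened  (t_i^o)
    t_close : time at which the position is closed  (t_i^c)
    op      : +1 for a long position, -1 for a short position
    """
    if op not in (-1, 1):
        raise ValueError("an active position must be long (+1) or short (-1)")
    return op * (prices[t_close] - prices[t_open])


# Toy price series (made-up numbers, for illustration only).
prices = [100.0, 102.0, 101.0, 97.0, 99.0]

# Long from t=0 to t=1: the price rises, so the reward is a profit.
long_reward = position_reward(prices, t_open=0, t_close=1, op=+1)
# Short from t=1 to t=3: the price falls, so the short also yields a profit.
short_reward = position_reward(prices, t_open=1, t_close=3, op=-1)

# Summing the per-position rewards gives the accumulated PNL of the two trades.
total_pnl = long_reward + short_reward
```

The sign of `op` is what turns a falling price into a profit for a short position; transaction costs are not modeled in this sketch.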

B. Bollinger Bands Trading Policy

A popular trading policy is based on BB, which is a simple and fundamental technique employed in a variety of trading schemes [9]. BB consists of three bands plotted around the asset's price (upper, middle, and lower), as illustrated in Fig. 1. The middle band is a simple Moving Average (MA), whose window size varies per application (in Fig. 1 we used a window of 20 samples). The top and bottom bands are plotted around the middle band, where the distance can be based on the Standard Deviation (STD) of the MA. These are typically set at ±1 STD around the MA, though the setting may vary depending on the application. Alternatively, one may use confidence intervals for forming such bands.
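The band construction described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the helper `bollinger_bands` is hypothetical, the STD is taken as the population standard deviation of the prices inside the MA window, and a short window of 3 is used on made-up prices only to keep the example small.

```python
import statistics


def bollinger_bands(prices, window=20, num_std=1.0):
    """Return one (lower, middle, upper) tuple per time step, starting at
    the first step for which a full window of prices is available."""
    bands = []
    for t in range(window - 1, len(prices)):
        recent = prices[t - window + 1 : t + 1]       # the last `window` prices
        middle = statistics.fmean(recent)             # simple moving average
        width = num_std * statistics.pstdev(recent)   # distance to outer bands
        bands.append((middle - width, middle, middle + width))
    return bands


# Toy price series (made-up numbers, for illustration only).
prices = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
bands = bollinger_bands(prices, window=3, num_std=1.0)
lower, middle, upper = bands[0]  # bands for the window [10, 12, 11]
```

Changing `num_std` widens or narrows the bands, which corresponds to the application-dependent setting mentioned in the text.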

Using these bands, one can build a trading strategy. A natural approach to do so is applicable when $d_\tau$ is a stationary price series. In this case, one can construct a financial tracker using the empirical z-score, i.e.,

$$z_t = \varphi\left(\{d_\tau\}_{\tau \le t}\right) = \frac{d_t - \mu_t}{\sigma_t}, \tag{4}$$

where $\mu_t$ and $\sigma_t$ are the empirical first- and second-order statistics of $d_t$ (mean and standard deviation), respectively, estimated from $\{d_\tau\}_{\tau \le t}$.

Fig. 1. Asset price with Bollinger Bands illustration.
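To make the tracker in (4) concrete, the following sketch computes the z-score of the latest price from the full price history. The function name `z_score_tracker`, the use of the population standard deviation, and the handling of a flat history (returning 0) are assumptions made for illustration; the paper does not specify them in this section.

```python
import statistics


def z_score_tracker(prices):
    """Map the prices observed up to time t to the indicator z_t in (4)."""
    mu = statistics.fmean(prices)      # empirical mean mu_t
    sigma = statistics.pstdev(prices)  # empirical standard deviation sigma_t
    if sigma == 0.0:
        # Assumption: a perfectly flat history carries no trading signal.
        return 0.0
    return (prices[-1] - mu) / sigma


# Toy history (made-up numbers): the last price jumps above the rest.
history = [10.0, 10.0, 10.0, 14.0]
z_t = z_score_tracker(history)

# A value above 1 falls in the "overbought" region of the BB policy.
overbought = z_t > 1
```

In practice the statistics may be estimated over a sliding window instead of the entire history; that choice is left open here.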

The BB policy is obtained by examining in which band $z_t$ lies. In particular, if an open position is not currently being held, a short position is taken if the asset is being overbought, i.e., $z_t > 1$, and a long position if it is being oversold, i.e., $z_t < -1$. To formulate this mathematically, we say that an open position is held at time $t$ if the last open position time, denoted

$$\tau_{\mathrm{op},t} \triangleq \max_{\tau < t \,:\, \mathrm{op}_\tau = \pm 1} \tau, \tag{5}$$

is not smaller than the last close position time

$$\tau_{\mathrm{cp},t} \triangleq \max_{\tau \le t \,:\, \mathrm{cp}_\tau = 1} \tau. \tag{6}$$

The open position policy is thus determined as

$$\mathrm{op}_t = \left(U(-1 - z_t) - U(z_t - 1)\right) \cdot U\left(\tau_{\mathrm{cp},t} - \tau_{\mathrm{op},t}\right). \tag{7}$$
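The open position rule in (7) maps directly to code. The sketch below is illustrative: the names `step` and `open_position` are hypothetical, and the caller is assumed to supply the last open and close times from (5) and (6), since their initialization before any trade has taken place is not specified in this section.

```python
def step(x):
    """Step function U(x): 1 for x > 0 and 0 for x <= 0."""
    return 1 if x > 0 else 0


def open_position(z_t, tau_cp, tau_op):
    """Open-position decision op_t following (7).

    z_t    : current indicator (z-score)
    tau_cp : last close position time, as in (6)
    tau_op : last open position time, as in (5)
    """
    # +1 (long) when z_t < -1, -1 (short) when z_t > 1, and 0 otherwise.
    direction = step(-1 - z_t) - step(z_t - 1)
    # A new position may be opened only if the last one was already closed.
    may_open = step(tau_cp - tau_op)
    return direction * may_open


short_signal = open_position(1.5, tau_cp=5, tau_op=3)    # overbought, no open position
long_signal = open_position(-1.2, tau_cp=5, tau_op=3)    # oversold, no open position
hold_signal = open_position(0.3, tau_cp=5, tau_op=3)     # inside the bands
blocked_signal = open_position(1.5, tau_cp=3, tau_op=5)  # a position is still open
```

The second factor acts as a gate: whenever the last open time is not smaller than the last close time, the rule outputs 0 regardless of the indicator.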

The reward in (3) is formulated for each active position, and not for each time instance. In some settings, e.g., when designing trading strategies using RL [31], [41], [42], one is often interested in obtaining instantaneous rewards. This is achieved by closing a position after a single time step (though it can then be re-opened and treated as a new active position, yielding an additional transaction cost, i.e., friction [18], which we omit for simplicity). Such an operation results in

$$\mathrm{cp}_t = U\left(|\mathrm{op}_{t-1}|\right). \tag{8}$$

Alternatively, one can determine the close position based on the indicator, allowing a cumulative reward where a position can be held over multiple time steps. In this case, the closing of a position is a function of the indicator $z_t$. For instance, one can decide to close a currently open position if $z_t$ has crossed
