Algorithmic Trading Using Continuous Action Space
Deep Reinforcement Learning
Naseh Majidi^a (naseh.majidi@sharif.edu), Mahdi Shamsi^a (shamsi.mahdi@ee.sharif.edu), Farokh Marvasti^a (fmarvasti@gmail.com)
^a Faculty of Electrical Engineering, Sharif University of Technology, Azadi Ave, 1458889694 Tehran, Iran.
Corresponding Author:
Farokh Marvasti
Faculty of Electrical Engineering, Sharif University of Technology, Azadi Ave,
1458889694 Tehran, Iran.
Tel: (98) 9123729799
Email: fmarvasti@gmail.com
arXiv:2210.03469v1 [cs.LG] 7 Oct 2022
Abstract
Price movement prediction has always been one of traders' concerns in financial market trading. To increase their profit, traders can analyze historical data and predict the price movement. The large size of the data and the complex relations within it lead us to use algorithmic trading and artificial intelligence. This paper aims to offer an approach using Twin-Delayed DDPG (TD3) and the daily close price in order to achieve a trading strategy in the stock and cryptocurrency markets. Unlike previous studies that used a discrete action space reinforcement learning algorithm, the TD3 operates in a continuous action space, offering both the position and the number of shares to trade. Both the stock (Amazon) and cryptocurrency (Bitcoin) markets are addressed in this research to evaluate the performance of the proposed algorithm. The strategy achieved with the TD3 is compared with several algorithms based on technical analysis, reinforcement learning, and stochastic and deterministic strategies through two standard metrics, Return and Sharpe ratio. The results indicate that employing both the position and the number of trading shares can improve the performance of a trading system based on the mentioned metrics.
Keywords: Bitcoin, Algorithmic Trading, Stock Market Prediction, Deep
Reinforcement Learning, Artificial Intelligence, Financial AI.
Corresponding author.
Email addresses: naseh.majidi@sharif.edu (Naseh Majidi), shamsi.mahdi@ee.sharif.edu (Mahdi Shamsi), fmarvasti@gmail.com (Farokh Marvasti)
Preprint submitted to Expert Systems with Applications October 10, 2022
1. Introduction
Forecasting price movements in the financial market is a difficult task. According to the Efficient-Market Hypothesis (Kirkpatrick & Dahlquist, 2008), stock market prices follow a random walk process with unpredictable future fluctuations. When it comes to Bitcoin, its price is highly volatile, which makes forecasting it challenging (Phaladisailoed & Numnonda, 2018). Technical and fundamental analysis are two typical tools that traders use to build their trading strategies in the financial markets. Technical analysis derives trading signals from price movement and trading volume (Murphy, 1999). Fundamental analysis, unlike the former, examines related economic and financial factors to determine a security's underlying worth (Drakopoulou, 2016).
Humans and computers both perform data analysis. Although humans are able to keep an eye on financial charts (such as prices) and make decisions based on their past experience, managing a vast volume of data is complicated because various factors influence the price movement. As a result, algorithmic trading has emerged to tackle this issue. Algorithmic trading is a type of trading in which a computer pre-programmed with a specific set of mathematical rules is employed (Théate & Ernst, 2021). There are two sorts of approaches in financial markets: price prediction and algorithmic trading. Price prediction aims to build a model that can precisely predict future prices, whereas algorithmic trading is not limited to price prediction and attempts to participate in the financial market (e.g., choosing a position and the number of shares to trade) to maximize profit (Hirchoua et al., 2021). It is claimed that a more precise prediction does not necessarily result in a higher profit. In other words, a trader's overall loss due to incorrect actions may be greater than the gain due to correct ones (Li et al., 2019). Therefore, algorithmic trading is the focus of this study.
Classical Machine Learning (ML) and Deep Learning (DL), which are powerful tools for recognizing patterns, have been employed in various research fields. In recent years, using ML as an intelligent agent has risen in popularity over the traditional approach in which a human being makes the decisions. ML and DL have enhanced the performance of algorithmic trading for two reasons. Firstly, they can extract complex patterns from data, which is difficult for humans to accomplish. Secondly, emotion does not affect their performance, which is a disadvantage for humans (Chakole et al., 2021). However, there are two compelling reasons why ML and DL in a supervised learning approach are unsuitable for algorithmic trading. Firstly, supervised learning is improper for learning problems with long-term and delayed rewards (Dang, 2019), such as trading in financial markets; this is why Reinforcement Learning (RL), a subfield of ML, is required to solve a decision-making problem (trading) in an uncertain environment (financial market) using the Markov Decision Process (MDP). Secondly, in supervised learning, labeling is a critical issue affecting the performance of the final model. To illustrate, classification and regression approaches with defined labels may not be appropriate, which leads to the selection of RL, which does not require labels and instead uses a goal (reward function) to determine its policy.
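To make the contrast concrete, an RL formulation replaces per-step labels with a scalar reward. The following minimal Python sketch assumes a simple log-return on portfolio value as that reward; the function and its exact shape are illustrative assumptions, since the reward actually used in this work is defined later in the paper.

    import numpy as np

    def step_reward(value_t: float, value_prev: float) -> float:
        """Immediate reward as the log-return of the portfolio value.

        Illustrative reward shape only: no per-step label is needed,
        which is what makes the RL formulation fit trading.
        """
        return float(np.log(value_t / value_prev))

    # Example: a 1% gain in portfolio value yields a reward of about 0.00995.
    print(step_reward(10100.0, 10000.0))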
Recent studies have usually employed discrete action space RL to address algorithmic trading problems (Chakole et al., 2021; Jeong & Kim, 2019; Shi et al., 2021; Théate & Ernst, 2021), which compels traders to buy/sell a specific number of shares; this is not a useful approach in financial markets. On the contrary, continuous action space RL is used in this study to let the trader buy/sell a dynamic number of shares. Furthermore, the results are compared to the TDQN algorithm (Théate & Ernst, 2021), two technical strategies, the Buy/Sell and Hold algorithms, and some random and deterministic strategies in the presence of transaction costs.
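To illustrate the distinction, the short Python sketch below shows one plausible way a single continuous action could encode both the position and the trade size; the function name and the decoding rule are hypothetical and are not the exact mapping used by the TD3 agent in this paper.

    def decode_action(a: float, cash: float, price: float, holdings: float) -> float:
        """Decode a continuous action a in [-1, 1] into a signed share count.

        Hypothetical mapping: the sign of `a` selects the position (buy vs.
        sell), and its magnitude selects the fraction of resources to commit.
        """
        if a >= 0:
            return a * cash / price  # buy: spend a fraction `a` of cash
        return a * holdings          # sell: liquidate a fraction |a| of holdings

    # Example: with 1000 in cash and a price of 50, a = 0.5 buys 10 shares.
    print(decode_action(0.5, cash=1000.0, price=50.0, holdings=0.0))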
The main contributions of this work can be summarized as follows:
• We develop a trading system based on a continuous action space DRL algorithm (TD3); this helps traders manage their money while opening a position.
• This research addresses both the cryptocurrency and stock markets.
The remainder of this paper is organized as follows: Section 2 provides a
glossary of terms that readers will need to comprehend the rest of the paper.
Section 3 discusses financial market research that has been conducted using
statistical methods, classical machine learning, deep learning, and reinforcement
learning. The model elements are defined in Section 4. Section 5 offers some
baseline models and standard metrics, as well as the evaluation of the models.
The final section discusses the findings and some recommendations for further
study.
2. Background Materials
In this section, some terms are introduced to assist readers in understanding the rest of the article. A general definition of reinforcement learning and its elements is presented in the first part. The second part introduces model-free RL algorithms (Q-learning and Deep Q-Network), while the last one presents a statistical test (the T-test), which is required to compare results from two samples.
2.1. Reinforcement Learning
Reinforcement learning is one of the widely used machine learning approaches and is composed of an agent, an environment, a reward, and a policy. The RL agent interacts with its environment to learn a policy that maximizes the received reward. This procedure is quite similar to a situation in which someone is learning to trade in financial markets. RL tries to solve a problem through an MDP, which has four components (a minimal interaction loop is sketched after this list):
• State space: S,
• Action space: A,
• Transition probability between the states: P,
• Immediate reward: R(s, a).
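A minimal agent-environment interaction loop over these four components is sketched below in Python; the toy random-walk price series and the random placeholder policy are assumptions for illustration, not the environment or agent used in this paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy price series standing in for the market environment (illustrative only).
    prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=252))

    total_reward = 0.0
    for t in range(len(prices) - 1):
        state = prices[max(0, t - 9):t + 1]   # s in S: window of recent prices
        action = int(rng.choice([-1, 0, 1]))  # a in A: placeholder random policy
        # R(s, a): one-step profit of holding the chosen position; the next
        # state arises from the (here, simulated) transition dynamics P.
        total_reward += action * (prices[t + 1] - prices[t])

    print(f"total reward of the random policy: {total_reward:.2f}")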