Algorithmic Trading Using Continuous Action Space
Deep Reinforcement Learning
Naseh Majidi^a (naseh.majidi@sharif.edu), Mahdi Shamsi^a (shamsi.mahdi@ee.sharif.edu), Farokh Marvasti^a (fmarvasti@gmail.com)
^a Faculty of Electrical Engineering, Sharif University of Technology, Azadi Ave, 1458889694 Tehran, Iran.
Corresponding Author:
Farokh Marvasti
Faculty of Electrical Engineering, Sharif University of Technology, Azadi Ave,
1458889694 Tehran, Iran.
Tel: (98) 9123729799
Email: fmarvasti@gmail.com
arXiv:2210.03469v1 [cs.LG] 7 Oct 2022
Abstract
Price movement prediction has always been one of traders' concerns in financial market trading. To increase their profit, traders can analyze historical data and predict the price movement. The large size of the data and the complex relations within it lead us to use algorithmic trading and artificial intelligence. This paper aims to offer an approach using Twin-Delayed DDPG (TD3) and the daily close price in order to achieve a trading strategy in the stock and cryptocurrency markets. Unlike previous studies that used a discrete action space reinforcement learning algorithm, the TD3 operates in a continuous action space, offering both the position and the number of shares to trade. Both the stock (Amazon) and cryptocurrency (Bitcoin) markets are addressed in this research to evaluate the performance of the proposed algorithm. The strategy achieved with the TD3 is compared with several algorithms based on technical analysis, reinforcement learning, and stochastic and deterministic strategies through two standard metrics, Return and Sharpe ratio. The results indicate that employing both the position and the number of trading shares can improve the performance of a trading system based on the mentioned metrics.
Keywords: Bitcoin, Algorithmic Trading, Stock Market Prediction, Deep
Reinforcement Learning, Artificial Intelligence, Financial AI.
Corresponding author.
Email addresses: naseh.majidi@sharif.edu (Naseh Majidi), shamsi.mahdi@ee.sharif.edu (Mahdi Shamsi), fmarvasti@gmail.com (Farokh Marvasti)
Preprint submitted to Expert Systems with Applications October 10, 2022
1. Introduction
Forecasting price movements in the financial market is a difficult task. According to the Efficient-Market Hypothesis (Kirkpatrick & Dahlquist, 2008), stock market prices follow a random walk process with unpredictable future fluctuations. When it comes to Bitcoin, its price is highly volatile, which makes forecasting it challenging (Phaladisailoed & Numnonda, 2018). Technical and fundamental analysis are two typical tools that traders use to build their trading strategies in the financial markets. Technical analysis derives trading signals from price movement and trading volume (Murphy, 1999). Fundamental analysis, unlike the former, examines related economic and financial factors to determine a security's underlying worth (Drakopoulou, 2016).
Humans and computers both perform data analysis. Although humans are able to keep an eye on financial charts (such as prices) and make decisions based on their past experience, managing a vast volume of data is complicated because various factors influence the price movement. As a result, algorithmic trading has emerged to tackle this issue. Algorithmic trading is a type of trading in which a computer pre-programmed with a specific set of mathematical rules is employed (Théate & Ernst, 2021). There are two sorts of approaches in financial markets: price prediction and algorithmic trading. Price prediction aims to build a model that can precisely predict future prices, whereas algorithmic trading is not limited to price prediction and attempts to participate in the financial market (e.g., choosing a position and the number of shares to trade) to maximize profit (Hirchoua et al., 2021). It is claimed that a more precise prediction does not necessarily result in a higher profit. In other words, a trader's overall loss due to incorrect actions may be greater than the gain due to correct ones (Li et al., 2019). Therefore, algorithmic trading is the focus of this study.
Classical Machine Learning (ML) and Deep Learning (DL), which are powerful tools for recognizing patterns, have been employed in various research fields. In recent years, using ML as an intelligent agent has risen in popularity over the traditional approach in which a human being makes the decisions. ML and DL have enhanced the performance of algorithmic trading for two reasons. Firstly, they can extract complex patterns from data, which is difficult for humans to accomplish. Secondly, emotion does not affect their performance, which is a disadvantage for humans (Chakole et al., 2021). However, there are two compelling reasons why ML and DL in a supervised learning approach are unsuitable for algorithmic trading. Firstly, supervised learning is improper for learning problems with long-term and delayed rewards (Dang, 2019), such as trading in financial markets; this is why Reinforcement Learning (RL), a subfield of ML, is required to solve a decision-making problem (trading) in an uncertain environment (financial market) using the Markov Decision Process (MDP). Secondly, in supervised learning, labeling is a critical issue affecting the performance of the final model. To illustrate, classification and regression approaches with defined labels may not be appropriate, which leads to the selection of RL, which does not require labels and instead uses a goal (reward function) to determine its policy.
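To make the contrast concrete, an RL formulation replaces per-step labels with a scalar reward. The following minimal Python sketch assumes a simple log-return on portfolio value as that reward; the function and its exact shape are illustrative assumptions, since the reward actually used in this work is defined later in the paper.

    import numpy as np

    def step_reward(value_t: float, value_prev: float) -> float:
        """Immediate reward as the log-return of the portfolio value.

        Illustrative reward shape only: no per-step label is needed,
        which is what makes the RL formulation fit trading.
        """
        return float(np.log(value_t / value_prev))

    # Example: a 1% gain in portfolio value yields a reward of about 0.00995.
    print(step_reward(10100.0, 10000.0))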
Recent studies have usually employed discrete action space RL to address algorithmic trading problems (Chakole et al., 2021; Jeong & Kim, 2019; Shi et al., 2021; Théate & Ernst, 2021), which compels traders to buy/sell a specific number of shares; this is not a useful approach in financial markets. On the contrary, continuous action space RL is used in this study to let the trader buy/sell a dynamic number of shares. Furthermore, the results are compared to the TDQN algorithm (Théate & Ernst, 2021), two technical strategies, the Buy/Sell and Hold algorithms, and some random and deterministic strategies in the presence of transaction costs.
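To illustrate the distinction, the short Python sketch below shows one plausible way a single continuous action could encode both the position and the trade size; the function name and the decoding rule are hypothetical and are not the exact mapping used by the TD3 agent in this paper.

    def decode_action(a: float, cash: float, price: float, holdings: float) -> float:
        """Decode a continuous action a in [-1, 1] into a signed share count.

        Hypothetical mapping: the sign of `a` selects the position (buy vs.
        sell), and its magnitude selects the fraction of resources to commit.
        """
        if a >= 0:
            return a * cash / price  # buy: spend a fraction `a` of cash
        return a * holdings          # sell: liquidate a fraction |a| of holdings

    # Example: with 1000 in cash and a price of 50, a = 0.5 buys 10 shares.
    print(decode_action(0.5, cash=1000.0, price=50.0, holdings=0.0))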
The main contributions of this work can be summarized as follows:
• We develop a trading system based on a continuous action space DRL algorithm (TD3); this helps traders manage their money while opening a position.
• This research addresses both the cryptocurrency and stock markets.
The remainder of this paper is organized as follows: Section 2 provides a
glossary of terms that readers will need to comprehend the rest of the paper.
Section 3 discusses financial market research that has been conducted using
statistical methods, classical machine learning, deep learning, and reinforcement
learning. The model elements are defined in Section 4. Section 5 offers some
baseline models and standard metrics, as well as the evaluation of the models.
The final section discusses the findings and some recommendations for further
study.
2. Background Materials
In this section, some terms are introduced to assist readers in understanding the rest of the article. A general definition of reinforcement learning and its elements is presented in the first part. The second part introduces model-free RL algorithms (Q-learning and Deep Q-Network), while the last one presents a statistical test (the T-test), which is required to compare results from two samples.
2.1. Reinforcement Learning
Reinforcement learning is one of the widely used machine learning approaches and is composed of an agent, an environment, a reward, and a policy. The RL agent interacts with its environment to learn a policy that maximizes the received reward. This procedure is quite similar to a situation in which someone is learning to trade in financial markets. RL tries to solve a problem through an MDP, which has four components (a minimal interaction loop is sketched after this list):
• State space: S,
• Action space: A,
• Transition probability between the states: P,
• Immediate reward: R(s, a).
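A minimal agent-environment interaction loop over these four components is sketched below in Python; the toy random-walk price series and the random placeholder policy are assumptions for illustration, not the environment or agent used in this paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy price series standing in for the market environment (illustrative only).
    prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=252))

    total_reward = 0.0
    for t in range(len(prices) - 1):
        state = prices[max(0, t - 9):t + 1]   # s in S: window of recent prices
        action = int(rng.choice([-1, 0, 1]))  # a in A: placeholder random policy
        # R(s, a): one-step profit of holding the chosen position; the next
        # state arises from the (here, simulated) transition dynamics P.
        total_reward += action * (prices[t + 1] - prices[t])

    print(f"total reward of the random policy: {total_reward:.2f}")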