miroblog / Deep_rl_trader

Trading Environment(OpenAI Gym) + DDQN (Keras-RL)

Programming Languages

python

Projects that are alternatives of or similar to Deep rl trader

Deep Trading Agent
Deep Reinforcement Learning based Trading Agent for Bitcoin
Stars: ✭ 573 (+158.11%)
Mutual labels:  trading, deep-reinforcement-learning
Crypto Rl
Deep Reinforcement Learning toolkit: record and replay cryptocurrency limit order book data & train a DDQN agent
Stars: ✭ 328 (+47.75%)
Mutual labels:  trading, deep-reinforcement-learning
Deep Rl Trading
playing idealized trading games with deep reinforcement learning
Stars: ✭ 228 (+2.7%)
Mutual labels:  trading, deep-reinforcement-learning
Chanlun
The script 笔和线段的一种划分.py takes K-line high/low data as input and automatically performs the Chan-theory division into strokes, segments, pivots, buy/sell points, and trend types; sh.csv can be used as the input file, and a personal resume is included as a .pdf. The power of time: some say market timing is hard, some say stock picking is easy, some say the IT infrastructure behind statistical arbitrage is what matters, and others say the system is inherently unpredictable. Opinions differ. In a distributed system, only when your own influence is negligible can you achieve what Chairman Jiang called "getting rich quietly".
Stars: ✭ 206 (-7.21%)
Mutual labels:  trading, deep-reinforcement-learning
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (+1085.59%)
Mutual labels:  deep-reinforcement-learning
Deep Reinforcement Learning Gym
Deep reinforcement learning model implementation in Tensorflow + OpenAI gym
Stars: ✭ 200 (-9.91%)
Mutual labels:  deep-reinforcement-learning
Bbgo
The modern cryptocurrency trading bot written in Go.
Stars: ✭ 192 (-13.51%)
Mutual labels:  trading
Drl4recsys
Courses on Deep Reinforcement Learning (DRL) and DRL papers for recommender systems
Stars: ✭ 196 (-11.71%)
Mutual labels:  deep-reinforcement-learning
Philadelphia
Low-latency Financial Information Exchange (FIX) engine for the JVM
Stars: ✭ 219 (-1.35%)
Mutual labels:  trading
Acer
Actor-critic with experience replay
Stars: ✭ 215 (-3.15%)
Mutual labels:  deep-reinforcement-learning
Rl trading
An environment for training high-frequency trading agents with reinforcement learning
Stars: ✭ 205 (-7.66%)
Mutual labels:  trading
Awesome Quant
An index of quant-related resources in China
Stars: ✭ 2,529 (+1039.19%)
Mutual labels:  trading
Trixi
Manage your machine learning experiments with trixi - modular, reproducible, high fashion. An experiment infrastructure optimized for PyTorch, but flexible enough to work for your framework and your tastes.
Stars: ✭ 211 (-4.95%)
Mutual labels:  deep-reinforcement-learning
Mlfinlab
MlFinLab helps portfolio managers and traders who want to leverage the power of machine learning by providing reproducible, interpretable, and easy to use tools.
Stars: ✭ 2,676 (+1105.41%)
Mutual labels:  trading
Devalpha Node
A stream-based approach to algorithmic trading and backtesting in Node.js
Stars: ✭ 217 (-2.25%)
Mutual labels:  trading
Atari Model Zoo
A binary release of trained deep reinforcement learning models trained in the Atari machine learning benchmark, and a software release that enables easy visualization and analysis of models, and comparison across training algorithms.
Stars: ✭ 198 (-10.81%)
Mutual labels:  deep-reinforcement-learning
Tensorflow2 Deep Reinforcement Learning
Code accompanying the blog post "Deep Reinforcement Learning with TensorFlow 2.1"
Stars: ✭ 204 (-8.11%)
Mutual labels:  deep-reinforcement-learning
Jiji2
Forex algorithmic trading framework using OANDA REST API.
Stars: ✭ 211 (-4.95%)
Mutual labels:  trading
Papers
Summaries of machine learning papers
Stars: ✭ 2,362 (+963.96%)
Mutual labels:  deep-reinforcement-learning
Gekko Backtesttool
Batch backtesting, data import, and strategy parameter optimization for the Gekko Trading Bot. With one command you can run any number of backtests.
Stars: ✭ 203 (-8.56%)
Mutual labels:  trading

Deep RL Trader (Duel DQN) Implemented using Keras-RL

This repo contains:

  1. A trading environment (OpenAI Gym) for trading cryptocurrency
  2. A Dueling Deep Q Network agent implemented with keras-rl (https://github.com/keras-rl/keras-rl)

The agent is expected to learn action sequences that maximize profit in the given environment.
At each step the environment limits the agent to one of three actions: buy, sell, or hold the asset (coin).
If the agent decides to take a

  • LONG position, it initiates an action sequence such as buy - hold - hold - sell;
  • SHORT position, the sequence is reversed, e.g. sell - hold - hold - buy.

Only a single position can be open per trade.

  • An invalid action sequence such as buy - buy is therefore treated as buy - hold (see the sketch below).
  • The default transaction fee is 0.0005.
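
A minimal sketch of these position rules, assuming a hypothetical next_position helper and action encoding (the repo's actual OhlcvEnv implements this internally):

# hedged sketch, not the repo's actual code: how invalid sequences degrade to hold
BUY, SELL, HOLD = 0, 1, 2                    # assumed action encoding
LONG, SHORT, FLAT = 'long', 'short', 'flat'  # assumed position labels

def next_position(position, action):
    """Map an action onto the current position; invalid sequences act as hold."""
    if position == FLAT:
        if action == BUY:
            return LONG      # buy opens a long position
        if action == SELL:
            return SHORT     # sell opens a short position
        return FLAT          # hold keeps us flat
    if (position == LONG and action == SELL) or (position == SHORT and action == BUY):
        return FLAT          # the opposite action closes the trade
    return position          # buy - buy, sell - sell, or hold all act as hold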

A reward is given

  • when a position is closed, or
  • when an episode finishes.

This sparse reward scheme takes longer to train but is better at capturing long-term dependencies; a rough sketch of the per-trade reward follows.
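
A hedged sketch of what such a per-trade reward could look like (the function name, sign convention, and fee handling are assumptions, not the repo's exact formula):

def trade_reward(entry_price, exit_price, position, fee=0.0005):
    """Percentage earning for one closed trade, with the default fee charged on entry and exit."""
    pct = (exit_price - entry_price) / entry_price
    if position == 'short':
        pct = -pct            # a short position profits when the price falls
    return pct - 2 * fee      # emitted only when the position is closed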

The agent chooses its action by observing the environment.

  • The trading environment emits features derived from OHLCV candles (the window size is configurable).
  • The input given to the agent therefore has shape (window_size, n_features), as illustrated below.

With some modification it can easily be applied to stocks, futures, or foreign exchange as well.
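
For example, such a windowed observation can be built by stacking the last window_size rows of a feature matrix (a sketch assuming a NumPy array; the repo derives its own feature set from the candles):

import numpy as np

def observe(features, t, window_size=30):
    """Last `window_size` feature rows up to tick t, shaped (window_size, n_features)."""
    window = features[t - window_size + 1 : t + 1]
    return np.asarray(window, dtype=np.float32)

# e.g. with 7 features per candle, observe(features, t).shape == (30, 7) for t >= 29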

Visualization / Main / Environment

The sample data provided consists of 5-minute OHLCV candles fetched from BitMEX (see the loading sketch below).

  • train: './data/train/', ~70,000 candles
  • test: './data/test/', ~16,000 candles
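
Assuming the CSVs follow the usual OHLCV column layout, they can be inspected with pandas (the file name below is hypothetical; list ./data/train/ for the actual names):

import pandas as pd

df = pd.read_csv('./data/train/bitmex_5min.csv')  # hypothetical file name
print(df.shape)     # roughly 70,000 rows of 5-minute candles
print(df.columns)   # expected OHLCV columns: open, high, low, close, volume (plus a timestamp)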

Prerequisites

keras-rl, numpy, tensorflow, etc.

pip install -r requirements.txt

# replace "keras-rl/core.py" with the repo's patched "./modified/core.py"
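
One way to apply that patch is to overwrite the installed keras-rl core.py with the repo's modified copy (a sketch, assuming keras-rl is installed as the rl package; back up the original first):

import os
import shutil

import rl  # keras-rl installs as the `rl` package

core_path = os.path.join(os.path.dirname(rl.__file__), 'core.py')
shutil.copy(core_path, core_path + '.bak')    # keep a backup of the stock core.py
shutil.copy('./modified/core.py', core_path)  # drop in the repo's patched version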

Getting Started

Create Environment & Agent

import numpy as np
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy
# OhlcvEnv, NormalizerProcessor and create_model are defined in this repo

# create environment
# OPTIONS
ENV_NAME = 'OHLCV-v0'
TIME_STEP = 30
PATH_TRAIN = "./data/train/"
PATH_TEST = "./data/test/"
env = OhlcvEnv(TIME_STEP, path=PATH_TRAIN)
env_test = OhlcvEnv(TIME_STEP, path=PATH_TEST)

# random seed
np.random.seed(123)
env.seed(123)

# create_model
nb_actions = env.action_space.n
model = create_model(shape=env.shape, nb_actions=nb_actions)
print(model.summary())


# create memory
memory = SequentialMemory(limit=50000, window_length=TIME_STEP)

# create policy
policy = EpsGreedyQPolicy()  # alternatively: policy = BoltzmannQPolicy()

# create agent
# you can specify the dueling_type to one of {'avg','max','naive'}
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=200,
               enable_dueling_network=True, dueling_type='avg', target_model_update=1e-2, policy=policy,
               processor=NormalizerProcessor())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])

Train and Validate

# now train and test agent
while True:
    # train
    dqn.fit(env, nb_steps=5500, nb_max_episode_steps=10000, visualize=False, verbose=2)
    try:
        # validate
        info = dqn.test(env_test, nb_episodes=1, visualize=False)
        n_long = info['n_trades']['long']
        n_short = info['n_trades']['short']
        total_reward = info['total_reward']
        portfolio = int(info['portfolio'])
        # dump the test metrics and save the weights, tagged with portfolio value, trade counts and reward
        np.array([info]).dump(
            './info/duel_dqn_{0}_weights_{1}LS_{2}_{3}_{4}.info'.format(ENV_NAME, portfolio, n_long, n_short,
                                                                        total_reward))
        dqn.save_weights(
            './model/duel_dqn_{0}_weights_{1}LS_{2}_{3}_{4}.h5f'.format(ENV_NAME, portfolio, n_long, n_short,
                                                                        total_reward),
            overwrite=True)
    except KeyboardInterrupt:
        continue  # Ctrl-C during validation skips saving and resumes training

Configuring Agent

## simply plug in any keras model :)
from keras.models import Sequential
from keras.layers import Dense, Activation, CuDNNLSTM

def create_model(shape, nb_actions):
    model = Sequential()
    model.add(CuDNNLSTM(64, input_shape=shape, return_sequences=True))
    model.add(CuDNNLSTM(64))
    model.add(Dense(32))
    model.add(Activation('relu'))
    model.add(Dense(nb_actions, activation='linear'))
    return model
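
CuDNNLSTM requires a CUDA-capable GPU. On a CPU-only machine the same layout can be sketched with plain LSTM layers (create_model_cpu is a hypothetical drop-in, not part of the repo):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

def create_model_cpu(shape, nb_actions):
    """Same layer layout as create_model, using plain LSTM so it also runs without a GPU."""
    model = Sequential()
    model.add(LSTM(64, input_shape=shape, return_sequences=True))
    model.add(LSTM(64))
    model.add(Dense(32))
    model.add(Activation('relu'))
    model.add(Dense(nb_actions, activation='linear'))
    return model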

Running

[Verbose] While training or testing,

  • the environment prints (current_tick, # long, # short, portfolio) at each step.

[Portfolio]

  • the initial portfolio starts at 100 * 10,000 (KRW);
  • it reflects the change in portfolio value as if the agent had invested 100% of its balance every time it opened a position.

[Reward]

  • simply the percentage earning per trade (see the sketch below).
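
In other words, each closed trade compounds the portfolio by its percentage earning. A sketch of that bookkeeping, assuming full reinvestment as described above:

def update_portfolio(portfolio, trade_returns):
    """Compound the balance by each trade's percentage earning (full reinvestment per trade)."""
    for pct in trade_returns:
        portfolio *= 1.0 + pct
    return portfolio

# e.g. starting from the initial 100 * 10000 KRW
print(update_portfolio(100 * 10000, [0.02, -0.01, 0.03]))   # -> 1040094.0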

Initial Result

Trade History : Buy (green) Sell (red)

[figure: trade]
[figure: partial_trade]

Cumulative Return, Max Drawdown Period (red)

[figure: cum_return]

  • total cumulative return: [0] -> [3.670099054203348]
  • portfolio value: [1000000] -> [29415305.46593453]

Wow! A roughly 29-fold return and a 3.67 total reward!
Disclaimer: it may have overfitted :(

Authors

License

This project is licensed under the MIT License - see the LICENSE.md file for details
