
AI4Finance-Foundation / ElegantRL

Licence: other
Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

Programming Languages

  • Python
  • Jupyter Notebook

Projects that are alternatives of or similar to ElegantRL

Elegantrl
Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch.
Stars: ✭ 575 (-72.28%)
Mutual labels:  lightweight, efficient, stable, dqn, ddpg, ppo
Tianshou
An elegant PyTorch deep reinforcement learning library.
Stars: ✭ 4,109 (+98.12%)
Mutual labels:  dqn, ddpg, sac, ppo, a2c, td3
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-89.3%)
Mutual labels:  dqn, ddpg, sac, ppo, a2c, td3
ReinforcementLearningZoo.jl
juliareinforcementlearning.org/
Stars: ✭ 46 (-97.78%)
Mutual labels:  dqn, ddpg, sac, ppo, a2c, td3
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (-98.12%)
Mutual labels:  dqn, ddpg, sac, ppo, a2c, td3
model-free-algorithms
TD3, SAC, IQN, Rainbow, PPO, Ape-X and etc. in TF1.x
Stars: ✭ 56 (-97.3%)
Mutual labels:  ddpg, sac, ppo, td3, model-free-rl
Minimalrl
Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)
Stars: ✭ 2,051 (-1.11%)
Mutual labels:  dqn, ddpg, sac, ppo, a2c
Deeprl
Modularized Implementation of Deep RL Algorithms in PyTorch
Stars: ✭ 2,640 (+27.29%)
Mutual labels:  dqn, ddpg, ppo, a2c, td3
Paddle-RLBooks
Paddle-RLBooks is a reinforcement learning code study guide based on pure PaddlePaddle.
Stars: ✭ 113 (-94.55%)
Mutual labels:  dqn, ddpg, sac, td3
LWDRLC
Lightweight deep RL Libraray for continuous control.
Stars: ✭ 14 (-99.32%)
Mutual labels:  ddpg, sac, ppo, td3
Deep-rl-mxnet
Mxnet implementation of Deep Reinforcement Learning papers, such as DQN, PG, DDPG, PPO
Stars: ✭ 26 (-98.75%)
Mutual labels:  dqn, ddpg, a2c, td3
TF2-RL
Reinforcement learning algorithms implemented for Tensorflow 2.0+ [DQN, DDPG, AE-DDPG, SAC, PPO, Primal-Dual DDPG]
Stars: ✭ 160 (-92.29%)
Mutual labels:  dqn, ddpg, sac, ppo
Reinforcement Learning With Tensorflow
Simple reinforcement learning tutorials; AI teaching in Chinese by 莫烦Python (MorvanZhou)
Stars: ✭ 6,948 (+235%)
Mutual labels:  dqn, ddpg, ppo
Autonomous Learning Library
A PyTorch library for building deep reinforcement learning agents.
Stars: ✭ 425 (-79.51%)
Mutual labels:  dqn, ddpg, ppo
mujoco-benchmark
Provide full reinforcement learning benchmark on mujoco environments, including ddpg, sac, td3, pg, a2c, ppo, library
Stars: ✭ 101 (-95.13%)
Mutual labels:  ddpg, sac, ppo
Reinforcement Learning Algorithms
This repository contains PyTorch implementations of classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are still in progress.)
Stars: ✭ 426 (-79.46%)
Mutual labels:  dqn, ddpg, ppo
Torchrl
Pytorch Implementation of Reinforcement Learning Algorithms ( Soft Actor Critic(SAC)/ DDPG / TD3 /DQN / A2C/ PPO / TRPO)
Stars: ✭ 90 (-95.66%)
Mutual labels:  dqn, ddpg, ppo
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+93.44%)
Mutual labels:  dqn, ddpg, ppo
Easy Rl
A reinforcement learning tutorial in Chinese; read online at https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+44.84%)
Mutual labels:  dqn, ddpg, ppo
Machin
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Stars: ✭ 145 (-93.01%)
Mutual labels:  dqn, ddpg, ppo

ElegantRL “小雅”: Massively Parallel Library for Cloud-native Deep Reinforcement Learning


ElegantRL (website) is developed for practitioners with the following advantages:

  • Cloud-native: follows a cloud-native paradigm through microservice architecture and containerization, supporting ElegantRL-Podracer and FinRL-Podracer.

  • Scalable: fully exploits the parallelism of DRL algorithms at multiple levels, so it easily scales out to hundreds or thousands of computing nodes on a cloud platform, e.g., a DGX SuperPOD with thousands of GPUs.

  • Elastic: allows elastic and automatic allocation of computing resources on the cloud.

  • Lightweight: the core code is fewer than 1,000 lines (see ElegantRL-Helloworld).

  • Efficient: in many testing cases (single GPU/multi-GPU/GPU cloud), we find it more efficient than Ray RLlib.

  • Stable: considerably more stable than Stable-Baselines3, thanks to various ensemble methods.

ElegantRL implements the following model-free deep reinforcement learning (DRL) algorithms:

  • DDPG, TD3, SAC, PPO, REDQ for continuous actions in single-agent environments,
  • DQN, Double DQN, D3QN, SAC for discrete actions in single-agent environments,
  • QMIX, VDN, MADDPG, MAPPO, MATD3 in multi-agent environments.

For the details of DRL algorithms, please check out the educational webpage OpenAI Spinning Up.
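
To make the actor-critic terminology concrete, the sketch below shows a minimal DDPG-style update step written in plain PyTorch on a fake mini-batch. It only illustrates the algorithm family; it is not ElegantRL's implementation, whose versions live under elegantrl/agents.

# Minimal DDPG-style update on random data -- an illustration only, not ElegantRL code.
import torch
import torch.nn as nn

state_dim, action_dim, batch_size = 8, 2, 64
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# A fake mini-batch standing in for samples drawn from a replay buffer.
state = torch.randn(batch_size, state_dim)
action = torch.randn(batch_size, action_dim).clamp(-1, 1)
reward = torch.randn(batch_size, 1)
next_state = torch.randn(batch_size, state_dim)
gamma = 0.99

# Critic step: regress Q(s, a) toward the one-step TD target.
with torch.no_grad():
    next_q = critic(torch.cat((next_state, actor(next_state)), dim=1))
    target_q = reward + gamma * next_q
q = critic(torch.cat((state, action), dim=1))
critic_loss = nn.functional.mse_loss(q, target_q)
critic_opt.zero_grad()
critic_loss.backward()
critic_opt.step()

# Actor step: maximize the critic's value of the actor's own action.
actor_loss = -critic(torch.cat((state, actor(state)), dim=1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()
print(f"critic loss {critic_loss.item():.3f}, actor loss {actor_loss.item():.3f}")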

ElegantRL supports the following simulators:

  • Isaac Gym for massively parallel simulation,
  • OpenAI Gym, MuJoCo, PyBullet, FinRL for benchmarking.

The name “小雅” (Xiaoya) comes from the poem “He Ming” in the Xiaoya section of the Classic of Poetry (《诗经·小雅·鹤鸣》); the intended spirit is the line “Stones from other hills may serve to polish the jade of this one” (「他山之石,可以攻玉」), i.e., learning from others' strengths.

Contents

Star History

[Star history chart]

News

ElegantRL-Helloworld

For beginners, we maintain ElegantRL-Helloworld as a tutorial. Its goal is to help you get hands-on experience with ElegantRL.

One sentence summary: an agent (agent.py) with Actor-Critic networks (net.py) is trained (run.py) by interacting with an environment (env.py).
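
In loop form, that sentence corresponds to the familiar explore-then-update skeleton sketched below. This is an orientation sketch, not the tutorial's actual code: a random policy stands in for the agent and a plain list stands in for the replay buffer; it assumes gym 0.17 as pinned in the requirements below.

# Generic explore-then-update skeleton -- orientation only, not the tutorial's code.
import gym

env = gym.make("CartPole-v1")   # plays the role of env.py
buffer = []                     # plays the role of the replay buffer

for episode in range(4):
    state = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()                  # agent.py/net.py would pick this
        next_state, reward, done, info = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state
    # run.py would now let the agent update its networks from the collected transitions
    print(f"episode {episode}: {len(buffer)} transitions collected so far")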

File Structure

  • elegantrl # main folder

    • agents # a collection of DRL algorithms
      • AgentXXX.py # each file collects the variants of one kind of DRL algorithm
      • net.py # a collection of network architectures
    • envs # a collection of environments
      • XxxEnv.py # a training environment for RL
    • train # a collection of training programs
      • demo.py # a collection of demos
      • config.py # configurations (hyperparameters)
      • run.py # training loop
      • worker.py # the worker class (explores the env and saves the data to the replay buffer)
      • learner.py # the learner class (updates the networks using the data in the replay buffer; see the sketch after this file tree)
      • evaluator.py # the evaluator class (evaluates the cumulative return of the policy network)
      • replay_buffer.py # the buffer class (saves sequences of transitions for training)
  • elegantrl_helloworld # tutorial version

    • config.py # configurations (hyperparameters)
    • agent.py # DRL algorithms
    • net.py # network architectures
    • run.py # training loop
    • env.py # environments for RL training
  • examples # a collection of example code and ready-to-run Google Colab notebooks

    • quickstart_Pendulum_v1.ipynb
    • tutorial_BipedalWalker_v3.ipynb
    • tutorial_Creating_ChasingVecEnv.ipynb
    • tutorial_LunarLanderContinuous_v2.ipynb
  • unit_tests # a collection of tests
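
To illustrate how worker.py, learner.py, and replay_buffer.py under train/ relate, the sketch below mimics the worker/learner split on fake data: a worker fills a buffer with transitions, and a learner samples mini-batches from it to update a network. The class and function names are illustrative only, not the library's actual classes.

# Conceptual worker/learner/replay-buffer interplay on fake data -- names are
# illustrative, not the library's actual classes.
import random
from collections import deque

import torch
import torch.nn as nn


class ReplayBuffer:                                   # stands in for replay_buffer.py
    def __init__(self, capacity=10_000):
        self.data = deque(maxlen=capacity)

    def append(self, transition):
        self.data.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.data), batch_size)


def worker(buffer, num_steps=256, state_dim=4):       # stands in for worker.py
    """Explore a (fake) environment and store transitions in the buffer."""
    for _ in range(num_steps):
        state, reward = torch.randn(state_dim), torch.randn(1)
        buffer.append((state, reward))


def learner(buffer, net, optimizer, batch_size=32):   # stands in for learner.py
    """Sample a mini-batch from the buffer and take one gradient step."""
    batch = buffer.sample(batch_size)
    states = torch.stack([s for s, _ in batch])
    rewards = torch.stack([r for _, r in batch])
    loss = nn.functional.mse_loss(net(states), rewards)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


buffer = ReplayBuffer()
net = nn.Linear(4, 1)                                 # a toy "value network"
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
worker(buffer)                                        # exploration fills the buffer
print("learner loss:", learner(buffer, net, optimizer))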

Experimental Demos

More efficient than Ray RLlib

Experiments on Ant (MuJoCo), Humanoid (MuJoCo), Ant (Isaac Gym), Humanoid (Isaac Gym) # from left to right

ElegantRL fully supports Isaac Gym, which runs massively parallel simulations (e.g., 4096 sub-environments) on a single GPU.
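
The practical consequence of a massively parallel simulator is that observations, rewards, and done flags arrive as batched GPU tensors rather than per-env Python objects. The toy vectorized environment below illustrates that shape convention; it is a plain-PyTorch stand-in, not Isaac Gym's API.

# Toy vectorized environment with batched tensors -- a stand-in, NOT Isaac Gym's API.
import torch


class ToyVecEnv:
    """Steps num_envs independent copies of a trivial dynamics model in one tensor op."""

    def __init__(self, num_envs=4096, state_dim=8):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.num_envs, self.state_dim = num_envs, state_dim
        self.state = torch.zeros(num_envs, state_dim, device=self.device)

    def reset(self):
        self.state = torch.randn(self.num_envs, self.state_dim, device=self.device)
        return self.state                                   # shape: (num_envs, state_dim)

    def step(self, action):                                 # action: (num_envs, action_dim)
        self.state = self.state + 0.1 * action.mean(dim=1, keepdim=True)
        reward = -self.state.pow(2).mean(dim=1)             # shape: (num_envs,)
        done = torch.zeros(self.num_envs, dtype=torch.bool, device=self.device)
        return self.state, reward, done, {}


env = ToyVecEnv()
obs = env.reset()
action = torch.randn(env.num_envs, 2, device=env.device)
obs, reward, done, info = env.step(action)
print(obs.shape, reward.shape)    # torch.Size([4096, 8]) torch.Size([4096])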

More stable than Stable-Baselines3

Experiment on Hopper-v2 # ElegantRL achieves much smaller variance (averaged over 8 runs).

Also, PPO+H in ElegantRL completes training on 5M samples about 6x faster than Stable-Baselines3.

Testing and Contributing

Our tests are written with Python's built-in unittest module for easy access. To run a specific test file (for example, test_training_agents.py), use the following command from the root directory:

python -m unittest unit_tests/test_training_agents.py

To run all the tests sequentially, use the following command:

python -m unittest discover

Please note that some of the tests require Isaac Gym to be installed on your system. If it is not, any tests related to Isaac Gym will fail.

We welcome any contributions to the codebase, but we ask that you please do not submit/push code that breaks the tests. Also, please shy away from modifying the tests just to get your proposed changes to pass them. As it stands, the tests on their own are quite minimal (instantiating environments, training agents for one step, etc.), so if they're breaking, it's almost certainly a problem with your code and not with the tests.
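
For orientation, a test in that minimal style is only a few lines long. The hypothetical example below shows the shape such a smoke test might take (instantiate an environment and step it once); it is not one of the repository's actual tests and assumes gym 0.17 is installed.

# Hypothetical example of a minimal smoke test in the unittest style described above;
# not one of the repository's actual tests.
import unittest

import gym


class TestEnvironmentSmoke(unittest.TestCase):
    def test_env_resets_and_steps(self):
        env = gym.make("CartPole-v1")
        state = env.reset()
        self.assertIsNotNone(state)
        next_state, reward, done, info = env.step(env.action_space.sample())
        self.assertIsNotNone(next_state)


if __name__ == "__main__":
    unittest.main()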

We're actively working on refactoring and trying to make the codebase cleaner and more performant as a whole. If you'd like to help us clean up some code, we'd strongly encourage you to also watch Uncle Bob's clean coding lessons if you haven't already.

Requirements

Necessary:
| Python 3.6+     |                                                              |
| PyTorch 1.6+    |                                                              |

Not necessary:
| Numpy 1.18+     | For the replay buffer. NumPy is installed along with PyTorch. |
| gym 0.17.0      | For the envs. Gym provides tutorial envs for DRL training. (env.render() has a bug with gym==0.18 and pyglet==1.6; use gym==0.17.0 with pyglet==1.5.) |
| pybullet 2.7+   | For the envs. We use PyBullet (free) as an alternative to MuJoCo (not free). |
| box2d-py 2.3.8  | For gym. Use pip install Box2D (instead of box2d-py). |
| matplotlib 3.2  | For plots. |

pip3 install gym==0.17.0 pybullet Box2D matplotlib # or pip install -r requirements.txt

To install the StarCraft II env:
bash ./elegantrl/envs/installsc2.sh
pip install -r sc2_requirements.txt

Citation:

To cite this repository:

@misc{erl,
  author = {Liu, Xiao-Yang and Li, Zechu and Wang, Zhaoran and Zheng, Jiahao},
  title = {{ElegantRL}: Massively Parallel Framework for Cloud-native Deep Reinforcement Learning},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/AI4Finance-Foundation/ElegantRL}},
}