qlan3 / Explorer

Licence: MIT license

Explorer is a PyTorch reinforcement learning framework for exploring new ideas.

Programming Languages

python

shell

Projects that are alternatives to or similar to Explorer

Deep-Reinforcement-Learning-With-Python

Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math

Stars: ✭ 222 (+311.11%)

Mutual labels: deep-reinforcement-learning, q-learning, dqn, policy-gradient, actor-critic, ppo

Easy Rl

A Chinese-language reinforcement learning tutorial; read it online at https://datawhalechina.github.io/easy-rl/

Stars: ✭ 3,004 (+5462.96%)

Mutual labels: deep-reinforcement-learning, q-learning, dqn, policy-gradient, ppo

Reinforcement Learning With Tensorflow

Simple reinforcement learning tutorials, from 莫烦Python (Morvan Python): AI teaching in Chinese

Stars: ✭ 6,948 (+12766.67%)

Mutual labels: q-learning, dqn, policy-gradient, actor-critic, ppo

Deep Reinforcement Learning With Pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Stars: ✭ 1,345 (+2390.74%)

Mutual labels: deep-reinforcement-learning, dqn, policy-gradient, actor-critic, ppo

rl implementations

No description or website provided.

Stars: ✭ 40 (-25.93%)

Mutual labels: deep-reinforcement-learning, dqn, policy-gradient, actor-critic

Machine Learning Is All You Need

🔥🌟 "Machine Learning 格物志" (Machine Learning study notes): ML + DL + RL basic code and notes using sklearn, PyTorch, TensorFlow, Keras and, most importantly, from scratch! 💪 This repository is ALL You Need!

Stars: ✭ 173 (+220.37%)

Mutual labels: deep-reinforcement-learning, dqn, actor-critic, ppo

Paddle-RLBooks

Paddle-RLBooks is a reinforcement learning code study guide based on pure PaddlePaddle.

Stars: ✭ 113 (+109.26%)

Mutual labels: q-learning, dqn, policy-gradient, actor-critic

Pytorch Rl

This repository contains model-free deep reinforcement learning algorithms implemented in Pytorch

Stars: ✭ 394 (+629.63%)

Mutual labels: deep-reinforcement-learning, dqn, gym, policy-gradient

Reinforcement Learning Algorithms

This repository contains most of pytorch implementation based classic deep reinforcement learning algorithms, including - DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, TRPO. (More algorithms are still in progress)

Stars: ✭ 426 (+688.89%)

Mutual labels: deep-reinforcement-learning, dqn, actor-critic, ppo

Pytorch Drl

PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.

Stars: ✭ 233 (+331.48%)

Mutual labels: deep-reinforcement-learning, dqn, actor-critic, ppo

Hands On Reinforcement Learning With Python

Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow

Stars: ✭ 640 (+1085.19%)

Mutual labels: deep-reinforcement-learning, q-learning, policy-gradient, ppo

Reinforcement Learning

Minimal and Clean Reinforcement Learning Examples

Stars: ✭ 2,863 (+5201.85%)

Mutual labels: deep-reinforcement-learning, dqn, policy-gradient, actor-critic

Reinforcement learning tutorial with demo

Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc..

Stars: ✭ 442 (+718.52%)

Mutual labels: deep-reinforcement-learning, q-learning, policy-gradient, actor-critic

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+1574.07%)

Mutual labels: deep-reinforcement-learning, dqn, policy-gradient, ppo

Deeprl algorithms

DeepRL algorithms implementation easy for understanding and reading with Pytorch and Tensorflow 2(DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)

Stars: ✭ 97 (+79.63%)

Mutual labels: deep-reinforcement-learning, dqn, policy-gradient, ppo

Deep Reinforcement Learning Algorithms

31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.

Stars: ✭ 167 (+209.26%)

Mutual labels: deep-reinforcement-learning, dqn, ppo

Hands On Intelligent Agents With Openai Gym

Code for Hands On Intelligent Agents with OpenAI Gym book to get started and learn to build deep reinforcement learning agents using PyTorch

Stars: ✭ 189 (+250%)

Mutual labels: deep-reinforcement-learning, dqn, actor-critic

Minimalrl

Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)

Stars: ✭ 2,051 (+3698.15%)

Mutual labels: deep-reinforcement-learning, dqn, ppo

Pytorch sac

PyTorch implementation of Soft Actor-Critic (SAC)

Stars: ✭ 174 (+222.22%)

Mutual labels: deep-reinforcement-learning, gym, actor-critic

Rainy

☔ Deep RL agents with PyTorch☔

Stars: ✭ 39 (-27.78%)

Mutual labels: deep-reinforcement-learning, dqn, ppo


Explorer

Explorer is a PyTorch reinforcement learning framework for exploring new ideas.

Implemented algorithms

Vanilla Deep Q-learning (VanillaDQN): No target network.
Deep Q-Learning (DQN)
Double Deep Q-learning (DDQN)
Maxmin Deep Q-learning (MaxminDQN)
Averaged Deep Q-learning (AveragedDQN)
Ensemble Deep Q-learning (EnsembleDQN)
Bootstrapped Deep Q-learning (BootstrappedDQN)
NoisyNet Deep Q-learning (NoisyNetDQN)
REINFORCE
Actor-Critic
Proximal Policy Optimisation (PPO)
Soft Actor-Critic (SAC)
Deep Deterministic Policy Gradients (DDPG)
Twin Delayed Deep Deterministic Policy Gradients (TD3)
Reward Policy Gradient (RPG)
Memory-efficient Deep Q-learning (MeDQN)

To do list

SAC with automatically adjusted temperature
SAC with discrete action spaces

The dependency tree of agent classes

Base Agent
  ├── Vanilla DQN
  |     ├── DQN
  |     |    ├── DDQN
  |     |    ├── NoisyNetDQN
  |     |    ├── BootstrappedDQN
  |     |    └── MeDQN_Uniform, MeDQN_Real
  |     ├── Maxmin DQN ── Ensemble DQN
  |     └── Averaged DQN
  └── REINFORCE 
        ├── Actor-Critic
        |     └── PPO ── RPG
        └── SAC ── DDPG ── TD3
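As a reading aid, the tree above can be written as plain Python inheritance. This is only a sketch with empty placeholder classes: the class names (for example BaseAgent and ActorCritic) are assumptions derived from the labels in the tree, and a horizontal link such as the one from PPO to RPG is read here as "RPG derives from PPO". Please check the code for the real definitions.

```python
# Placeholder classes mirroring the dependency tree; bodies are intentionally empty.
# Names are assumptions based on the tree labels, not copied from the repo.
class BaseAgent: pass

class VanillaDQN(BaseAgent): pass
class DQN(VanillaDQN): pass
class DDQN(DQN): pass
class NoisyNetDQN(DQN): pass
class BootstrappedDQN(DQN): pass
class MeDQN_Uniform(DQN): pass
class MeDQN_Real(DQN): pass
class MaxminDQN(VanillaDQN): pass
class EnsembleDQN(MaxminDQN): pass
class AveragedDQN(VanillaDQN): pass

class REINFORCE(BaseAgent): pass
class ActorCritic(REINFORCE): pass
class PPO(ActorCritic): pass
class RPG(PPO): pass
class SAC(REINFORCE): pass
class DDPG(SAC): pass
class TD3(DDPG): pass


def lineage(cls):
    """Return the class names from cls up to BaseAgent, nearest parent first."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(lineage(TD3))  # ['TD3', 'DDPG', 'SAC', 'REINFORCE', 'BaseAgent']
```

Under this reading, anything implemented in a parent (for example the REINFORCE placeholder) is available to every agent below it in the tree.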

Requirements

Python (>=3.6)
PyTorch
Gym && Gym Games: You can install only the parts of Gym you need (classic_control, box2d) with the command pip install 'gym[classic_control, box2d]'.
Optional:
- Gym Atari: pip install 'gym[atari,accept-rom-license]'
- Gym Mujoco:
  - Download MuJoCo version 1.50 from MuJoCo website.
  - Unzip the downloaded mjpro150 directory into ~/.mujoco/mjpro150, and place the activation key (the mjkey.txt file downloaded from here) at ~/.mujoco/mjkey.txt.
  - Install mujoco-py: pip install 'mujoco-py<1.50.2,>=1.50.1'
  - Install gym[mujoco]: pip install 'gym[mujoco]'
- PyBullet: pip install pybullet
- DeepMind Control Suite: pip install git+https://github.com/denisyarats/dmc2gym.git
Others: Please check requirements.txt.
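
To see quickly which of the packages above are importable in the current environment, a small check like the following may help. It is a sketch, not part of the repo: the helper missing_modules is hypothetical, and the import names in the example list (torch, gym, pybullet, mujoco_py, dmc2gym) are assumed to match the packages listed above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the required and optional packages listed above.
wanted = ["torch", "gym", "pybullet", "mujoco_py", "dmc2gym"]
print("Not installed:", missing_modules(wanted))
```

Only top-level module names should be passed in; the optional packages are needed only for the corresponding environments.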

Experiments

Train && Test

All hyperparameters, including those for grid search, are stored in a configuration file in the configs directory. To run an experiment, a configuration index is first used to generate the configuration dict for that index; the experiment defined by this dict is then run. All results, including log files, are saved in the logs directory. Please refer to the code for details.

For example, run the experiment with configuration file RPG.json and configuration index 1:

python main.py --config_file ./configs/RPG.json --config_idx 1
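
To make the idea of "configuration index to configuration dict" concrete, here is a minimal sketch of one way such a mapping could work. It is not the repo's sweeper: the function config_for_index, the key ordering, and the example hyperparameters (lr, batch_size, gamma) are all assumptions for illustration.

```python
def config_for_index(grid, idx):
    """Map a 1-based configuration index to one value per hyperparameter.

    grid maps each hyperparameter name to the list of values to search over.
    Indexes beyond the number of combinations wrap around.
    """
    keys = sorted(grid)  # fixed key order so the mapping is deterministic
    total = 1
    for key in keys:
        total *= len(grid[key])
    remaining = (idx - 1) % total
    config = {}
    for key in keys:
        values = grid[key]
        # Peel off one "digit" of the index per hyperparameter (mixed radix).
        remaining, position = divmod(remaining, len(values))
        config[key] = values[position]
    return config


# Hypothetical search grid with 3 * 2 * 2 = 12 combinations.
example_grid = {"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64], "gamma": [0.99, 0.999]}
print(config_for_index(example_grid, 1))
print(config_for_index(example_grid, 2))
```

Each index from 1 to 12 selects a different combination in this example, and larger indexes cycle back through the same combinations.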

The model is tested for one episode after every test_per_episodes training episodes; test_per_episodes can be set in the configuration file.
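
The train/test interleaving described above can be sketched as a simple loop. This is an illustration only: run_training and its callback arguments are hypothetical stand-ins, not functions from the repo; only the name test_per_episodes comes from the configuration file.

```python
def run_training(total_episodes, test_per_episodes, train_episode, test_episode):
    """Train for total_episodes, testing once after every test_per_episodes episodes.

    train_episode and test_episode are callables that take the episode number.
    Returns the episode numbers after which a test episode was run.
    """
    tested_after = []
    for episode in range(1, total_episodes + 1):
        train_episode(episode)
        if episode % test_per_episodes == 0:
            test_episode(episode)
            tested_after.append(episode)
    return tested_after


# With 10 training episodes and test_per_episodes = 3, tests follow episodes 3, 6 and 9.
print(run_training(10, 3, lambda ep: None, lambda ep: None))  # [3, 6, 9]
```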

Grid Search (Optional)

First, we calculate the total number of combinations in a configuration file (e.g. RPG.json):

python utils/sweeper.py

The output will be:

Number of total combinations in RPG.json: 12
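
For intuition, the total is the product of the number of candidate values for each hyperparameter. The sketch below assumes a flat JSON object in which every value is a list of candidates; the real configuration files may be structured differently, and both count_combinations and the example content are hypothetical.

```python
import json


def count_combinations(grid):
    """Multiply the number of candidate values of every hyperparameter."""
    total = 1
    for values in grid.values():
        total *= len(values)
    return total


# Hypothetical configuration content, not the actual RPG.json.
config_text = '{"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64], "gamma": [0.99, 0.999]}'
grid = json.loads(config_text)
print("Number of total combinations:", count_combinations(grid))  # 3 * 2 * 2 = 12
```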

Then we run through all configuration indexes from 1 to 12. The simplest way is using a bash script:

for index in {1..12}
do
  python main.py --config_file ./configs/RPG.json --config_idx $index
done

GNU Parallel is usually a better choice for scheduling a large number of jobs:

parallel --eta --ungroup python main.py --config_file ./configs/RPG.json --config_idx {1} ::: $(seq 1 12)
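
If GNU Parallel is not available, a rough equivalent can be written with the Python standard library. This is a sketch under stated assumptions: build_command and run_all are hypothetical helpers, main.py is assumed to be in the current working directory, and the worker count is arbitrary.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(config_file, config_idx):
    """Build the argument list for one experiment, mirroring the command line above."""
    return [sys.executable, "main.py",
            "--config_file", config_file,
            "--config_idx", str(config_idx)]


def run_all(config_file, indexes, max_workers=4):
    """Run one main.py process per index, at most max_workers at a time.

    Returns the exit codes in the same order as indexes.
    """
    commands = [build_command(config_file, idx) for idx in indexes]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread just waits on its own child process.
        results = list(pool.map(subprocess.run, commands))
    return [result.returncode for result in results]


# Usage (not executed here): run_all("./configs/RPG.json", range(1, 13))
```

Threads are sufficient here because the heavy work happens in the child processes, not in the launcher.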

Configuration indexes that leave the same remainder when divided by the total number of combinations map to the same configuration dict. So for multiple runs, we just need to add multiples of the total number of combinations to the configuration index. For example, 5 runs for configuration index 1:

for index in 1 13 25 37 49
do
  python main.py --config_file ./configs/RPG.json --config_idx $index
done

Or a simpler way:

parallel --eta --ungroup python main.py --config_file ./configs/RPG.json --config_idx {1} ::: $(seq 1 12 60)
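
The index arithmetic for multiple runs can be sketched as follows. Both helpers are hypothetical and only restate the remainder rule described above; they are not taken from the repo.

```python
def run_indexes(config_idx, total_combinations, runs):
    """Indexes that repeat configuration config_idx for the given number of runs."""
    return [config_idx + run * total_combinations for run in range(runs)]


def split_index(idx, total_combinations):
    """Return (base configuration index, run number), both 1-based."""
    run, offset = divmod(idx - 1, total_combinations)
    return offset + 1, run + 1


print(run_indexes(1, 12, 5))  # [1, 13, 25, 37, 49], the same values as seq 1 12 60
print(split_index(49, 12))    # (1, 5): the fifth run of configuration index 1
```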

Analysis (Optional)

To analyze the experimental results, just run:

python analysis.py

Inside analysis.py, unfinished_index prints the configuration indexes of unfinished jobs, determined by whether the result file exists. memory_info prints memory usage information and generates a histogram of the memory usage distribution in directory logs/RPG/0. Similarly, time_info prints timing information and generates a histogram of the time distribution in directory logs/RPG/0. Finally, analyze generates csv files that store training and test results. Please check analysis.py for more details. More functions are available in utils/plotter.py.
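
The idea behind detecting unfinished jobs from missing result files can be sketched like this. The directory layout (one subdirectory per configuration index) and the file name result.csv are assumptions made for the example; unfinished_indexes is a hypothetical helper, and the actual logic lives in analysis.py.

```python
from pathlib import Path


def unfinished_indexes(log_dir, indexes, result_name="result.csv"):
    """Return the indexes whose result file does not exist.

    Assumes results are stored as <log_dir>/<index>/<result_name>.
    """
    log_dir = Path(log_dir)
    return [idx for idx in indexes
            if not (log_dir / str(idx) / result_name).is_file()]


# Usage (hypothetical paths): unfinished_indexes("logs/RPG", range(1, 13))
```

The returned indexes can be passed back to main.py through --config_idx to rerun only the missing jobs.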

Enjoy!

Code of My Papers

Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White. Maxmin Q-learning: Controlling the Estimation Bias of Q-learning. ICLR, 2020. (Poster) [paper] [code]
Qingfeng Lan, Samuele Tosatto, Homayoon Farrahi, A. Rupam Mahmood. Model-free Policy Learning with Reward Gradients. AISTATS, 2022. (Poster) [paper] [code]
Qingfeng Lan, Yangchen Pan, Jun Luo, A. Rupam Mahmood. Memory-efficient Reinforcement Learning with Knowledge Consolidation. arXiv. [paper] [code]

Cite

If you find this repo useful for your research, please cite the relevant paper above if it is related to your work. Otherwise, please cite this repo:

@misc{Explorer,
  author = {Lan, Qingfeng},
  title = {A PyTorch Reinforcement Learning Framework for Exploring New Ideas},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/qlan3/Explorer}}
}

Acknowledgements
