Rahul-Choudhary-3614 / Deep-Reinforcement-Learning-Notebooks

Licence: MIT License
This repository contains a series of Google Colab notebooks that I created to help people dive into deep reinforcement learning. The notebooks contain both the theory and the implementation of different algorithms.

Programming Languages

Jupyter Notebook
python

Projects that are alternatives of or similar to Deep-Reinforcement-Learning-Notebooks

Deeprl
Modularized Implementation of Deep RL Algorithms in PyTorch
Stars: ✭ 2,640 (+17500%)
Mutual labels:  deep-reinforcement-learning, rainbow, dqn, ppo, a2c, prioritized-experience-replay
Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (+13573.33%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c, ppo, a2c
Easy Rl
A reinforcement learning tutorial in Chinese; read online at: https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+19926.67%)
Mutual labels:  deep-reinforcement-learning, dqn, sarsa, a3c, ppo
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+1380%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c, ppo, a2c
Deep RL with pytorch
A pytorch tutorial for DRL(Deep Reinforcement Learning)
Stars: ✭ 160 (+966.67%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c, soft-actor-critic
Reinforcement Learning
Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning
Stars: ✭ 3,329 (+22093.33%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c
Reinforcement Learning With Tensorflow
Simple Reinforcement learning tutorials, 莫烦Python (Morvan Python) AI tutorials in Chinese
Stars: ✭ 6,948 (+46220%)
Mutual labels:  dqn, sarsa, a3c, ppo
Deep Reinforcement Learning With Pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Stars: ✭ 1,345 (+8866.67%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c, ppo
Deeprl Tensorflow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Stars: ✭ 319 (+2026.67%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c, ppo
Slm Lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
Stars: ✭ 904 (+5926.67%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c, ppo
ReinforcementLearningZoo.jl
juliareinforcementlearning.org/
Stars: ✭ 46 (+206.67%)
Mutual labels:  rainbow, dqn, ppo, a2c
yarll
Combining deep learning and reinforcement learning.
Stars: ✭ 84 (+460%)
Mutual labels:  deep-reinforcement-learning, sarsa, a3c, soft-actor-critic
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (+160%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c
imitation learning
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Stars: ✭ 93 (+520%)
Mutual labels:  deep-reinforcement-learning, ppo, a2c
Machine Learning Is All You Need
🔥🌟《Machine Learning 格物志》: ML + DL + RL basic codes and notes by sklearn, PyTorch, TensorFlow, Keras & the most important, from scratch!💪 This repository is ALL You Need!
Stars: ✭ 173 (+1053.33%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (+260%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo
Deep Reinforcement Learning Algorithms
31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Stars: ✭ 167 (+1013.33%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (+17446.67%)
Mutual labels:  deep-reinforcement-learning, ppo, a2c
ElegantRL
Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥
Stars: ✭ 2,074 (+13726.67%)
Mutual labels:  dqn, ppo, a2c
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+18986.67%)
Mutual labels:  deep-reinforcement-learning, dqn, a3c

Deep-Reinforcement-Learning-Notebooks

This repository contains a series of Google Colab notebooks that I created to help people dive into deep reinforcement learning. The notebooks contain both the theory and the implementation of different algorithms.

The notebooks so far cover the following:

  1. The first notebook provides an introduction to deep reinforcement learning.

  2. The second notebook covers REINFORCE, the most basic policy-based algorithm. It explains the theory behind REINFORCE, the disadvantages of the algorithm, and a few remedies for them. The environment used is CartPole-v1.

  3. The third notebook covers SARSA, the most basic value-based algorithm. It starts with the definitions of value functions and temporal difference learning, then discusses the epsilon-greedy policy. Building on these, it presents SARSA and its implementation. The environment used is CliffWalking.

  4. The fourth notebook covers another value-based algorithm, Deep Q-Networks (DQN). It starts by explaining how DQN improves on SARSA, then discusses Boltzmann exploration, which is somewhat better than the epsilon-greedy policy. Finally, it explains how DQN improves on the sample efficiency of SARSA with the help of an experience replay memory. Combining all of these, it implements DQN. The environment used is Assault-v0.

  5. The fifth notebook covers a more advanced value-based algorithm: Double DQN with Prioritized Experience Replay (PER) and target networks. It discusses some problems that arise with DQN and how to address them, introducing the concepts of target networks, Double DQN, and Prioritized Experience Replay and showing how to combine them to make DQN more stable and efficient. The environment used is KungFuMaster-v0.

  6. The sixth notebook covers Advantage Actor-Critic (A2C), which elegantly combines the policy gradient with a learned value function. It starts with the concept of the advantage function and various methods of estimating the advantage, then discusses the implementation of A2C. The code also shows how to add regularisation to the actor model using the entropy of the policy.

  7. The seventh notebook covers Deep Deterministic Policy Gradient (DDPG), a model-free, off-policy algorithm for learning continuous actions. It combines ideas from DPG (Deterministic Policy Gradient) and DQN (Deep Q-Network): it uses experience replay and slow-learning target networks from DQN, and it is based on DPG, which can operate over continuous action spaces.

  8. The eighth notebook covers Proximal Policy Optimization (PPO), which extends the actor-critic algorithm. PPO is a family of policy gradient methods for reinforcement learning that alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent.

  9. The ninth notebook covers Soft Actor-Critic (SAC): Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. It is an off-policy maximum entropy actor-critic algorithm that provides both sample-efficient learning and stability, and it extends readily to very complex, high-dimensional tasks.

  10. The tenth notebook covers Asynchronous Advantage Actor-Critic (A3C). A3C is similar to A2C but asynchronous: multiple independent agents (networks), each with its own weights, interact with their own copies of the environment in parallel and thus explore a larger part of the state-action space in much less time.

  11. The eleventh notebook covers Noisy Nets, a way to improve exploration in the environment by adding noise to the network parameters.

  12. The twelfth notebook covers Rainbow, which takes the standard DQN algorithm and adds six variants of DQN to it.
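As a rough illustration of the quantity REINFORCE (notebook 2) weights its log-probabilities with, the sketch below computes discounted returns for one episode. The function name and the default `gamma` are illustrative assumptions, not taken from the notebooks.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

In a policy-gradient loss, each return would multiply the log-probability of the action taken at that step.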
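The SARSA material in notebook 3 can be sketched in tabular form as below. This is a minimal sketch, assuming the Q-table is a list of per-state lists; `epsilon_greedy`, `sarsa_update`, and the default hyperparameters are hypothetical names and values chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """On-policy TD update: the target uses the action actually taken next."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


if __name__ == "__main__":
    q_table = [[0.0, 0.0], [1.0, 2.0]]
    sarsa_update(q_table, 0, 1, -1.0, 1, 1, False, alpha=0.5, gamma=1.0)
    print(q_table)
```

Because the target uses `a_next` rather than the maximum over next actions, the update evaluates the same epsilon-greedy policy that generates the data.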
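Boltzmann exploration, mentioned for notebook 4, samples actions from a softmax over Q-values instead of choosing uniformly at random. The sketch below assumes a plain list of Q-values and a `temperature` parameter; both function names are illustrative.

```python
import math
import random


def boltzmann_probs(q_values, temperature=1.0):
    """Softmax over Q-values; higher temperature gives flatter probabilities."""
    highest = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    print(boltzmann_probs([1.0, 2.0, 3.0], temperature=1.0))
    print(boltzmann_action([1.0, 2.0, 3.0], temperature=0.5))
```

Unlike epsilon-greedy, actions with higher estimated value are explored more often than clearly poor ones.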
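For the Double DQN idea in notebook 5, the key step is that the online network selects the next action while the target network evaluates it. A minimal sketch for a single transition follows, assuming the two networks' next-state Q-values are already available as lists; the function and argument names are hypothetical.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """TD target where action selection and evaluation use different networks."""
    if done:
        return reward
    # Online network chooses the action ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network supplies its value.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    print(double_dqn_target(1.0, False, [0.2, 0.9], [5.0, 3.0], gamma=0.5))
```

Plain DQN would instead take the maximum of the target network's own values, which tends to overestimate Q-values.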
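Notebook 6 mentions several ways of estimating the advantage. One common choice is Generalized Advantage Estimation (GAE), sketched below; whether the notebook uses this exact estimator is an assumption. The sketch also assumes a single trajectory segment with no terminal state in the middle, and all names are illustrative.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory segment.

    values[t] is the critic's estimate V(s_t); last_value is V of the state
    after the final step (use 0.0 if the episode ended there).
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    print(gae_advantages([1.0, 1.0], [0.0, 0.0], 0.0, gamma=1.0, lam=1.0))
```

Setting `lam=0` reduces this to the one-step TD error, while `lam=1` gives the Monte Carlo return minus the value baseline.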
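The "slow-learning target networks" in the DDPG description (notebook 7) are commonly realised with a soft (Polyak) update. The sketch below treats parameters as flat lists of floats and uses a hypothetical `tau`; a real implementation would apply the same rule to each weight tensor.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Move each target parameter a small step towards the online parameter."""
    return [
        (1.0 - tau) * target + tau * online
        for target, online in zip(target_params, online_params)
    ]


if __name__ == "__main__":
    print(soft_update([0.0, 2.0], [1.0, 2.0], tau=0.5))
```

A small `tau` keeps the target network changing slowly, which stabilises the bootstrapped critic targets.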
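The "surrogate" objective in the PPO description (notebook 8) is most often used in its clipped form. The sketch below assumes that variant and assumes probability ratios and advantages are given as plain lists; `clip_eps` and the function name are illustrative.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximised).

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i.
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0], clip_eps=0.2))
```

The clipping removes any incentive to push the new policy far from the old one within a single update.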
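The Noisy Nets idea (notebook 11) can be illustrated with a linear layer whose weights and biases are perturbed by learnable-scale Gaussian noise. This sketch uses independent Gaussian noise per parameter and plain Python lists; the function name, argument names, and the choice of independent rather than factorised noise are assumptions for illustration.

```python
import random


def noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng=random):
    """Compute y = (w_mu + w_sigma * eps_w) @ x + (b_mu + b_sigma * eps_b).

    w_mu and w_sigma are lists of rows (one row per output unit); fresh
    Gaussian noise is drawn for every parameter on every call.
    """
    outputs = []
    for row_mu, row_sigma, bias_mu, bias_sigma in zip(w_mu, w_sigma, b_mu, b_sigma):
        y = bias_mu + bias_sigma * rng.gauss(0.0, 1.0)
        for x_i, mu, sigma in zip(x, row_mu, row_sigma):
            y += (mu + sigma * rng.gauss(0.0, 1.0)) * x_i
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    print(noisy_linear([1.0, 2.0], [[1.0, 1.0]], [[0.1, 0.1]], [0.5], [0.1]))
```

With all sigma values at zero the layer is an ordinary linear layer; training the sigmas lets the network decide how much exploration noise to keep.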

Credits: Much of the theory in the first few notebooks is taken from the book Foundations of Deep Reinforcement Learning by Laura Graesser and Wah Loon Keng.
