andri27-ts / Reinforcement Learning

Licence: MIT
Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning

Projects that are alternatives of or similar to Reinforcement Learning

Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (-38.39%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, dqn, ppo, a2c, policy-gradients
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+20.52%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, dqn, ppo
Lagom
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Stars: ✭ 364 (-89.07%)
Mutual labels:  artificial-intelligence, jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Deep reinforcement learning course
Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch
Stars: ✭ 3,232 (-2.91%)
Mutual labels:  jupyter-notebook, deep-reinforcement-learning, ppo, qlearning, a2c
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (-93%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, dqn, ppo
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-93.33%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (-98.83%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c
Hands On Reinforcement Learning With Python
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Stars: ✭ 640 (-80.78%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Deeprl Tutorials
Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch
Stars: ✭ 748 (-77.53%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-96.37%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, deepmind
Deep-Reinforcement-Learning-Notebooks
This Repository contains a series of google colab notebooks which I created to help people dive into deep reinforcement learning.This notebooks contain both theory and implementation of different algorithms.
Stars: ✭ 15 (-99.55%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c
Gdrl
Grokking Deep Reinforcement Learning
Stars: ✭ 304 (-90.87%)
Mutual labels:  artificial-intelligence, jupyter-notebook, reinforcement-learning, deep-reinforcement-learning
Applied Reinforcement Learning
Reinforcement Learning and Decision Making tutorials explained at an intuitive level and with Jupyter Notebooks
Stars: ✭ 229 (-93.12%)
Mutual labels:  artificial-intelligence, jupyter-notebook, reinforcement-learning, deep-reinforcement-learning
Learning To Communicate Pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Stars: ✭ 236 (-92.91%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, dqn, deepmind
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (-20.94%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, ppo, a2c
Rad
RAD: Reinforcement Learning with Augmented Data
Stars: ✭ 268 (-91.95%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Ml In Tf
Get started with Machine Learning in TensorFlow with a selection of good reads and implemented examples!
Stars: ✭ 45 (-98.65%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, dqn, deepmind
Easy Rl
强化学习中文教程,在线阅读地址:https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (-9.76%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, dqn, ppo
Deep Reinforcement Learning Algorithms
31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Stars: ✭ 167 (-94.98%)
Mutual labels:  jupyter-notebook, deep-reinforcement-learning, dqn, ppo
Deep RL with pytorch
A pytorch tutorial for DRL(Deep Reinforcement Learning)
Stars: ✭ 160 (-95.19%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, a2c

Course in Deep Reinforcement Learning

Explore the combination of neural networks and reinforcement learning. Algorithms and examples in Python & PyTorch

Have you heard about the amazing results achieved by DeepMind with AlphaGo Zero and by OpenAI in Dota 2? It's all about deep neural networks and reinforcement learning. Do you want to know more about it?
This is the right opportunity for you to finally learn Deep RL and use it on new and exciting projects and applications.

Here you'll find an in-depth introduction to these algorithms. You'll learn Q-learning, deep Q-learning, PPO, and actor-critic methods, and implement them using Python and PyTorch.

The ultimate aim is to use these general-purpose technologies and apply them to all sorts of important real world problems. - Demis Hassabis

This repository contains:


Lectures (& other content) primarily from the DeepMind and Berkeley YouTube channels.


Algorithms (like DQN, A2C, and PPO) implemented in PyTorch and tested on OpenAI Gym: RoboSchool & Atari.



Stay tuned and follow me on Twitter and GitHub. #60DaysRLChallenge

We now also have a Slack channel. To get an invitation, email me at [email protected]. Also, email me if you have any ideas, suggestions or improvements.

To learn Deep Learning, Computer Vision or Natural Language Processing, check my 1-Year-ML-Journey

Before starting: Prerequisites



Quick Note: my NEW BOOK is out!

To learn Reinforcement Learning and Deep RL more in depth, check out my book Reinforcement Learning Algorithms with Python!!

Table of Contents

  1. The Landscape of Reinforcement Learning
  2. Implementing RL Cycle and OpenAI Gym
  3. Solving Problems with Dynamic Programming
  4. Q learning and SARSA Applications
  5. Deep Q-Network
  6. Learning Stochastic and PG optimization
  7. TRPO and PPO implementation
  8. DDPG and TD3 Applications
  9. Model-Based RL
  10. Imitation Learning with the DAgger Algorithm
  11. Understanding Black-Box Optimization Algorithms
  12. Developing the ESBAS Algorithm
  13. Practical Implementation for Resolving RL Challenges



Index - Reinforcement Learning


Week 1 - Introduction

Other Resources


Week 2 - RL Basics: MDP, Dynamic Programming and Model-Free Control

Those who cannot remember the past are condemned to repeat it - George Santayana

This week, we will learn about the basic blocks of reinforcement learning, starting from the definition of the problem all the way through the estimation and optimization of the functions that are used to express the quality of a policy or state.

Lectures - Theory

  • Model-Free Prediction - David Silver (DeepMind)
    • Monte Carlo Learning
    • Temporal Difference Learning
    • TD(λ)
  • Model-Free Control - David Silver (DeepMind)
    • ε-greedy policy iteration
    • GLIE Monte Carlo Search
    • SARSA
    • Importance Sampling

Project of the Week - Q-learning

Q-learning applied to FrozenLake - As an exercise, you can solve the game using SARSA or implement Q-learning yourself. In the former case, only a few changes are needed.
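
To make the "only a few changes" remark concrete, here is a minimal tabular sketch, not the notebook from this repository. It assumes a hand-written, deterministic 4x4 grid in the spirit of FrozenLake instead of the Gym environment, and the names `GRID`, `step`, `train` and `greedy_rollout` are hypothetical. The only difference between Q-learning and SARSA is the line that builds the update target.

```python
import random

# Hypothetical stand-in for FrozenLake with deterministic (non-slippery) moves:
# S = start, F = frozen, H = hole (episode ends, reward 0), G = goal (reward 1).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), N - 1)  # walls keep the agent on the grid
    col = min(max(col + d_col, 0), N - 1)
    cell = GRID[row][col]
    return row * N + col, float(cell == "G"), cell in "HG"


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, otherwise a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, value in enumerate(q[state]) if value == best])


def train(use_sarsa=False, episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N * N)]
    for _episode in range(episodes):
        state = 0
        action = epsilon_greedy(q, state, epsilon, rng)
        for _t in range(100):  # cap the episode length
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            if done:
                target = reward
            elif use_sarsa:
                # SARSA (on-policy): bootstrap from the action actually taken next.
                target = reward + gamma * q[next_state][next_action]
            else:
                # Q-learning (off-policy): bootstrap from the best next action.
                target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
            if done:
                break
    return q


def greedy_rollout(q, max_steps=50):
    """Follow the greedy policy from the start state and return the total reward."""
    state, total = 0, 0.0
    for _t in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return total
```

Calling `train()` runs Q-learning and `train(use_sarsa=True)` runs SARSA; `greedy_rollout` can then be used to check whether the learned table reaches the goal.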

Other Resources


Week 3 - Value based algorithms - DQN

This week we'll learn more advanced concepts and apply deep neural networks to Q-learning algorithms.

Lectures - Theory

Project of the Week - DQN and variants

DQN and some variants applied to Pong - This week the goal is to develop a DQN algorithm to play an Atari game. To make it more interesting, I developed four extensions of DQN: Double Q-learning, Multi-step learning, Dueling networks and Noisy Nets. Play with them, and if you feel confident, you can implement Prioritized replay, Dueling networks or Distributional RL. To learn more about these improvements, read the papers!
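
As a rough illustration of two of the extensions named above, the sketch below compares the standard DQN target with the Double Q-learning target and computes a multi-step return. It uses plain Python lists in place of network outputs, and the function names are hypothetical; it is not the code from the project folder.

```python
def dqn_target(reward, next_q_target, gamma, done):
    """Standard DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double Q-learning: the online network selects the action,
    the target network evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def n_step_return(rewards, bootstrap_value, gamma):
    """Multi-step return: r_0 + gamma * r_1 + ... + gamma^n * bootstrap_value."""
    total = bootstrap_value
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Q-values for the next state as predicted by the two networks (made-up numbers).
online_q = [3.0, 1.0]
target_q = [0.5, 2.0]
standard = dqn_target(1.0, target_q, 0.9, done=False)                 # uses max(target_q)
double = double_dqn_target(1.0, online_q, target_q, 0.9, done=False)  # uses target_q[0]
```

In this made-up example the two targets differ because the online network prefers action 0 while the target network assigns its highest value to action 1.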


Papers

Must Read
Extensions of DQN

Other Resources


Week 4 - Policy gradient algorithms - REINFORCE & A2C

Week 4 introduces Policy Gradient methods, a class of algorithms that optimize the policy directly. You'll also learn about Actor-Critic algorithms, which combine a policy gradient (the actor) with a value function (the critic).

Lectures - Theory

  • Policy gradient Methods - David Silver (DeepMind)
    • Finite Difference Policy Gradient
    • Monte-Carlo Policy Gradient
    • Actor-Critic Policy Gradient
  • Policy gradient intro - Sergey Levine (RECAP, optional)
    • Policy Gradient (REINFORCE and Vanilla PG)
    • Variance reduction
  • Actor-Critic - Sergey Levine (More in depth)
    • Actor-Critic
    • Discount factor
    • Actor-Critic algorithm design (batch mode or online)
    • State-dependent baseline

Project of the Week - Vanilla PG and A2C

Vanilla PG and A2C applied to CartPole - This week's exercise is to implement a policy gradient method or a more sophisticated actor-critic. In the repository you can find an implemented version of PG and A2C. Bug Alert! Note that my A2C implementation gives strange results. If you find the implementation of PG and A2C easy, you can try the asynchronous version of A2C (A3C).
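
The quantities that separate vanilla PG from an actor-critic can be sketched in a few lines. The helpers below are hypothetical and work on plain lists; in a PyTorch implementation the log-probabilities would be tensors produced by the policy network, so that the loss can be backpropagated. This is an illustration under those assumptions, not the repository's PG or A2C code.

```python
def rewards_to_go(rewards, gamma):
    """Discounted return G_t from each time step to the end of the episode."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def td_advantages(rewards, values, gamma):
    """One-step advantage estimates: A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    `values` holds the critic's V(s_t) for each visited state. The value after
    the final step is taken to be 0, assuming the episode terminated there.
    """
    next_values = values[1:] + [0.0]
    return [r + gamma * nv - v for r, nv, v in zip(rewards, next_values, values)]


def policy_gradient_loss(log_probs, weights):
    """Surrogate loss whose gradient is the policy gradient estimate.

    Pass the returns as `weights` for REINFORCE, or advantages for actor-critic.
    """
    return -sum(lp * w for lp, w in zip(log_probs, weights)) / len(log_probs)
```

Using advantages instead of raw returns as weights is the variance-reduction step: the critic's value acts as a state-dependent baseline.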

Papers

Other Resources


Week 5 - Advanced Policy Gradients - PPO

This week is about advanced policy gradient methods that improve the stability and the convergence of the "Vanilla" policy gradient methods. You'll learn and implement PPO, an RL algorithm developed by OpenAI and adopted in OpenAI Five.

Lectures - Theory

  • Advanced policy gradients - Sergey Levine (UC Berkeley)
    • Problems with "Vanilla" Policy Gradient Methods
    • Policy Performance Bounds
    • Monotonic Improvement Theory
    • Algorithms: NPO, TRPO, PPO
  • Natural Policy Gradients, TRPO, PPO - John Schulman (Berkeley DRL Bootcamp) - (RECAP, optional)
    • Limitations of "Vanilla" Policy Gradient Methods
    • Natural Policy Gradient
    • Trust Region Policy Optimization, TRPO
    • Proximal Policy Optimization, PPO

Project of the Week - PPO

PPO applied to BipedalWalker - This week, you have to implement PPO or TRPO. I suggest PPO given its simplicity (compared to TRPO). In the Week5 project folder you can find an implementation of PPO that learns to play BipedalWalker. The folder also contains other resources that will help you develop the project. Have fun!
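
Part of PPO's simplicity comes from its clipped surrogate objective, which can be written in a few lines. The sketch below works on plain lists of per-sample log-probabilities and advantages; the function name and the default `clip_eps=0.2` are illustrative choices, and this is not the implementation from the Week5 folder.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    For each sample, the probability ratio between the new and old policy is
    clipped to [1 - clip_eps, 1 + clip_eps], and the smaller of the clipped and
    unclipped terms is kept, so large policy changes are not rewarded.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above `1 + clip_eps` brings no extra gain; with a negative advantage, a ratio below `1 - clip_eps` brings no extra gain either. In both cases the incentive to move further away from the old policy disappears.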


To learn more about PPO, read the paper and take a look at the Arxiv Insights video

Papers

Other Resources


Week 6 - Evolution Strategies and Genetic Algorithms - ES

In the last year, Evolution Strategies (ES) and Genetic Algorithms (GA) have been shown to achieve results comparable to RL methods. They are derivative-free black-box algorithms that require more data than RL to learn but are able to scale up across thousands of CPUs. This week we'll look at these black-box algorithms.

Lectures & Articles - Theory

Project of the Week - ES

Evolution Strategies applied to LunarLander - This week's project is to implement an ES or GA. In the Week6 folder you can find a basic implementation of the paper Evolution Strategies as a Scalable Alternative to Reinforcement Learning to solve LunarLanderContinuous. You can modify it to play more difficult environments or add your own ideas.
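
To show what "derivative-free" means in practice, here is a minimal ES sketch that estimates a search direction purely from fitness evaluations of randomly perturbed parameters. It is an assumption-laden toy: the fitness is a simple quadratic rather than an episode return, mirrored (antithetic) perturbations are used, and the names `evolution_strategy` and `fitness` are hypothetical. It is not the implementation from the Week6 folder.

```python
import random


def evolution_strategy(fitness, theta, iterations=200, population=20,
                       sigma=0.1, learning_rate=0.05, seed=0):
    """Maximize `fitness` using only function evaluations (no gradients)."""
    rng = random.Random(seed)
    theta = list(theta)
    for _iteration in range(iterations):
        direction = [0.0] * len(theta)
        for _member in range(population):
            noise = [rng.gauss(0.0, 1.0) for _ in theta]
            # Evaluate a mirrored pair of perturbed parameter vectors.
            plus = fitness([t + sigma * n for t, n in zip(theta, noise)])
            minus = fitness([t - sigma * n for t, n in zip(theta, noise)])
            for i, n in enumerate(noise):
                direction[i] += (plus - minus) * n
        # Average the fitness-weighted noise to get the update direction.
        scale = learning_rate / (2.0 * population * sigma)
        theta = [t + scale * d for t, d in zip(theta, direction)]
    return theta


TARGET = [3.0, -2.0]  # made-up optimum for the toy problem


def fitness(params):
    """Toy fitness: negative squared distance to TARGET (higher is better)."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


best = evolution_strategy(fitness, [0.0, 0.0])
```

In an RL setting, `fitness` would run one or more episodes with a policy parameterized by `params` and return the total reward. Each population member can be evaluated independently, which is why these methods scale across many CPUs.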


Papers

Other Resources


Week 7 - Model-Based reinforcement learning - MB-MF

The algorithms studied up to now are model-free, meaning that they only choose the best action given a state, without modeling the environment. These algorithms achieve very good performance but require a lot of training data. Model-based algorithms instead learn a model of the environment and plan the next actions according to the learned model. These methods are more sample-efficient than model-free ones but overall achieve worse performance. This week you'll learn the theory behind these methods and implement one of the latest algorithms.

Lectures - Theory

Project of the Week - MB-MF

MB-MF applied to RoboschoolAnt - This week I chose to implement the model-based algorithm described in this paper. You can find my implementation here. NB: Instead of implementing it on MuJoCo as in the paper, I used RoboSchool, an open-source robot simulator integrated with OpenAI Gym.
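
The general "learn a model, then plan with it" loop can be sketched on a toy problem. The example below is a generic illustration and makes no claim to reproduce the algorithm from the paper or my implementation: it assumes a hypothetical 1-D environment (`true_step`), fits a linear dynamics model by least squares instead of training a neural network, and plans with random shooting.

```python
import random


def true_step(state, action):
    """Hypothetical environment, unknown to the agent: a 1-D point pushed by the action."""
    return state + 0.5 * action


def collect_data(num_samples, rng):
    """Gather (state, action, next_state) transitions with random actions."""
    data = []
    for _sample in range(num_samples):
        state, action = rng.uniform(-5, 5), rng.uniform(-1, 1)
        data.append((state, action, true_step(state, action)))
    return data


def fit_linear_model(data):
    """Least-squares fit of next_state ~ w_s * state + w_a * action (2x2 normal equations)."""
    ss = sum(s * s for s, a, n in data)
    sa = sum(s * a for s, a, n in data)
    aa = sum(a * a for s, a, n in data)
    sn = sum(s * n for s, a, n in data)
    an = sum(a * n for s, a, n in data)
    det = ss * aa - sa * sa
    return (sn * aa - an * sa) / det, (an * ss - sn * sa) / det


def plan_action(model, state, goal, rng, horizon=5, candidates=200):
    """Random shooting: simulate random action sequences in the learned model and
    return the first action of the sequence with the lowest predicted cost."""
    w_s, w_a = model
    best_cost, best_action = float("inf"), 0.0
    for _candidate in range(candidates):
        actions = [rng.uniform(-1, 1) for _ in range(horizon)]
        predicted, cost = state, 0.0
        for action in actions:
            predicted = w_s * predicted + w_a * action
            cost += (predicted - goal) ** 2
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action


def run_episode(steps=30, seed=0):
    """Fit the model once on random data, then control the real system by replanning each step."""
    rng = random.Random(seed)
    model = fit_linear_model(collect_data(200, rng))
    state, goal = 4.0, 0.0
    for _t in range(steps):
        state = true_step(state, plan_action(model, state, goal, rng))
    return model, state
```

Only 200 random transitions are needed to fit the model here, which hints at the sample efficiency of the approach; the quality of the resulting behavior is bounded by the accuracy of the model and of the planner.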


Papers

Other Resources


Week 8 - Advanced Concepts and Project Of Your Choice

This last week is about advanced RL concepts and a project of your choice.

Lectures - Theory

The final project

Here you can find some project ideas.

Other Resources


Last 4 days - Review + Sharing

Congratulations on completing the 60 Days RL Challenge!! Let me know if you enjoyed it and share it!

See you!

Best resources

📚 Reinforcement Learning: An Introduction - by Sutton & Barto. The "Bible" of reinforcement learning. Here you can find the PDF draft of the second edition.

📚 Deep Reinforcement Learning Hands-On - by Maxim Lapan

📚 Deep Learning - Ian Goodfellow

📺 Deep Reinforcement Learning - UC Berkeley class by Levine; check their site here.

📺 Reinforcement Learning course - by David Silver, DeepMind. Great introductory lectures by Silver, a lead researcher on AlphaGo. They follow the book Reinforcement Learning by Sutton & Barto.

Additional resources

📚 Awesome Reinforcement Learning. A curated list of resources dedicated to reinforcement learning

📚 GroundAI on RL. Papers on reinforcement learning

A cup of coffee

Any contribution is highly appreciated! Cheers!
