
mimoralea / Gdrl

Licence: bsd-3-clause
Grokking Deep Reinforcement Learning


Grokking Deep Reinforcement Learning

Note: At the moment, only running the code from the Docker container (below) is supported. Docker provides a single, tested environment that is more likely to work on all systems. I install and configure all packages for you, except Docker itself, and you just run the code in that environment.

To install Docker, I recommend a web search for "installing docker on <your os here>". To run the code on a GPU, you additionally have to install nvidia-docker, which lets Docker containers use the host's GPUs. Once you have Docker (and nvidia-docker, if using a GPU) installed, follow the four steps below.

Running the code

  1. Clone this repo:
    git clone --depth 1 https://github.com/mimoralea/gdrl.git && cd gdrl
  2. Pull the gdrl image with:
    docker pull mimoralea/gdrl:v0.14
  3. Spin up a container:
    • On Mac or Linux:
      docker run -it --rm -p 8888:8888 -v "$PWD"/notebooks/:/mnt/notebooks/ mimoralea/gdrl:v0.14
    • On Windows:
      docker run -it --rm -p 8888:8888 -v %CD%/notebooks/:/mnt/notebooks/ mimoralea/gdrl:v0.14
    • NOTE: Use nvidia-docker if you are using a GPU.
  4. Open a browser and go to the URL shown in the terminal (likely to be: http://localhost:8888). The password is: gdrl

About the book

Book's website

https://www.manning.com/books/grokking-deep-reinforcement-learning

Table of contents

  1. Introduction to deep reinforcement learning
  2. Mathematical foundations of reinforcement learning
  3. Balancing immediate and long-term goals
  4. Balancing the gathering and utilization of information
  5. Evaluating agents' behaviors
  6. Improving agents' behaviors
  7. Achieving goals more effectively and efficiently
  8. Introduction to value-based deep reinforcement learning
  9. More stable value-based methods
  10. Sample-efficient value-based methods
  11. Policy-gradient and actor-critic methods
  12. Advanced actor-critic methods
  13. Towards artificial general intelligence

Detailed table of contents

1. Introduction to deep reinforcement learning

2. Mathematical foundations of reinforcement learning

  • (Livebook)
  • (Notebook)
    • Implementations of several MDPs:
      • Bandit Walk
      • Bandit Slippery Walk
      • Slippery Walk Three
      • Random Walk
      • Russell and Norvig's Gridworld from AIMA
      • FrozenLake
      • FrozenLake8x8
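
As a rough illustration of what such an MDP implementation can look like, the sketch below encodes a small corridor in the spirit of Bandit Walk as a transition dictionary, using the `P[state][action] = [(probability, next_state, reward, done), ...]` layout that Gym's FrozenLake exposes. The three-state layout, the +1 reward, and the names `LEFT`, `RIGHT`, `bandit_walk` and `check_mdp` are assumptions made for this example, not code taken from the notebook.

```python
# Hypothetical corridor: 0 = hole (terminal), 1 = start, 2 = goal (terminal).
LEFT, RIGHT = 0, 1

bandit_walk = {
    0: {LEFT: [(1.0, 0, 0.0, True)], RIGHT: [(1.0, 0, 0.0, True)]},
    1: {LEFT: [(1.0, 0, 0.0, True)], RIGHT: [(1.0, 2, 1.0, True)]},
    2: {LEFT: [(1.0, 2, 0.0, True)], RIGHT: [(1.0, 2, 0.0, True)]},
}

def check_mdp(P):
    """Return True if the probabilities of every (state, action) pair sum to 1."""
    return all(
        abs(sum(prob for prob, _, _, _ in transitions) - 1.0) < 1e-9
        for actions in P.values()
        for transitions in actions.values()
    )

print(check_mdp(bandit_walk))
```

A "slippery" variant would simply list more than one tuple per action, with probabilities that sum to 1.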

3. Balancing immediate and long-term goals

  • (Livebook)
  • (Notebook)
    • Implementations of methods for finding optimal policies:
      • Policy Evaluation
      • Policy Improvement
      • Policy Iteration
      • Value Iteration
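
To make the last of these methods concrete, here is a minimal sketch of value iteration over a transition dictionary of the form `P[s][a] = [(prob, next_state, reward, done), ...]`. The slippery three-state corridor, the hyperparameters and the function name `value_iteration` are assumptions chosen for illustration; the notebook's own implementation may differ.

```python
def value_iteration(P, gamma=0.99, theta=1e-10):
    """Return (V, pi) for an MDP given as P[s][a] = [(prob, next_s, reward, done), ...]."""
    V = {s: 0.0 for s in P}
    while True:
        # One-step lookahead: expected return of each action under the current V.
        Q = {
            s: {
                a: sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])
                for a in P[s]
            }
            for s in P
        }
        new_V = {s: max(Q[s].values()) for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < theta:  # stop once the value function no longer changes
            break
    # Greedy policy with respect to the final action values.
    pi = {s: max(Q[s], key=Q[s].get) for s in P}
    return V, pi

# Hypothetical slippery corridor: the chosen move succeeds with probability 0.8.
LEFT, RIGHT = 0, 1
P = {
    0: {LEFT: [(1.0, 0, 0.0, True)], RIGHT: [(1.0, 0, 0.0, True)]},
    1: {LEFT: [(0.8, 0, 0.0, True), (0.2, 2, 1.0, True)],
        RIGHT: [(0.8, 2, 1.0, True), (0.2, 0, 0.0, True)]},
    2: {LEFT: [(1.0, 2, 0.0, True)], RIGHT: [(1.0, 2, 0.0, True)]},
}
V, pi = value_iteration(P)
print(V[1], pi[1])
```

Policy iteration reaches the same answer by alternating full policy evaluation with greedy policy improvement; value iteration folds both into the single `max` step above.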

4. Balancing the gathering and utilization of information

  • (Livebook)
  • (Notebook)
    • Implementations of exploration strategies for bandit problems:
      • Random
      • Greedy
      • E-greedy
      • E-greedy with linearly decaying epsilon
      • E-greedy with exponentially decaying epsilon
      • Optimistic initialization
      • SoftMax
      • Upper Confidence Bound
      • Bayesian
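
For a feel of how these strategies work, the following sketch runs a fixed-epsilon greedy strategy on a Bernoulli bandit. The arm probabilities, step count, seed and the function name `epsilon_greedy_bandit` are assumptions for this example only.

```python
import random

def epsilon_greedy_bandit(arm_probs, n_steps=1000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit; return (Q estimates, pull counts)."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    Q = [0.0] * n_arms  # running mean reward per arm
    N = [0] * n_arms    # number of pulls per arm
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                  # explore: random arm
        else:
            arm = max(range(n_arms), key=Q.__getitem__)  # exploit: best estimate so far
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        N[arm] += 1
        Q[arm] += (reward - Q[arm]) / N[arm]             # incremental mean update
    return Q, N

Q, N = epsilon_greedy_bandit([0.2, 0.8])
print(Q, N)
```

The decaying variants in the list replace the constant `epsilon` with a schedule that shrinks over time, so the agent explores early and exploits later.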

5. Evaluating agents' behaviors

  • (Livebook)
  • (Notebook)
    • Implementation of algorithms that solve the prediction problem (policy estimation):
      • On-policy first-visit Monte-Carlo prediction
      • On-policy every-visit Monte-Carlo prediction
      • Temporal-Difference prediction (TD)
      • n-step Temporal-Difference prediction (n-step TD)
      • TD(λ)
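
As a hedged sketch of the prediction problem, the code below applies one-step TD to a uniform random policy on a small corridor with terminal ends. The seven-state layout with a +1 reward at the right end, the hyperparameters and the name `td_prediction` are assumptions for illustration.

```python
import random

def td_prediction(n_states=7, n_episodes=500, alpha=0.05, gamma=1.0, seed=0):
    """Estimate V for a uniform random policy on a corridor with terminal ends."""
    rng = random.Random(seed)
    V = [0.0] * n_states  # V[0] and V[-1] are terminal and stay at 0
    for _ in range(n_episodes):
        state = n_states // 2  # start in the middle
        while 0 < state < n_states - 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states - 1 else 0.0
            td_target = reward + gamma * V[next_state]   # bootstrap from the next estimate
            V[state] += alpha * (td_target - V[state])   # move V toward the TD target
            state = next_state
    return V

print([round(v, 2) for v in td_prediction()])
```

Monte-Carlo prediction would instead wait for the episode to end and use the full return as the target; n-step TD and TD(λ) interpolate between the two.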

6. Improving agents' behaviors

  • (Livebook)
  • (Notebook)
    • Implementation of algorithms that solve the control problem (policy improvement):
      • On-policy first-visit Monte-Carlo control
      • On-policy every-visit Monte-Carlo control
      • On-policy TD control: SARSA
      • Off-policy TD control: Q-Learning
      • Double Q-Learning
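
The off-policy nature of Q-learning can be sketched as follows: the agent behaves uniformly at random yet still learns action values for the greedy policy. The deterministic five-state corridor, the hyperparameters and the name `q_learning` are assumptions made for this example.

```python
import random

LEFT, RIGHT = 0, 1

def q_learning(n_states=5, n_episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a deterministic corridor; the right end pays +1."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(n_episodes):
        state = n_states // 2
        while 0 < state < n_states - 1:
            action = rng.choice((LEFT, RIGHT))  # uniform random behaviour policy
            next_state = state + (1 if action == RIGHT else -1)
            done = next_state in (0, n_states - 1)
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Off-policy target: bootstrap from the greedy action in next_state.
            target = reward + gamma * max(Q[next_state]) * (not done)
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
    return Q

Q = q_learning()
print(Q)
```

SARSA differs only in the target: it bootstraps from the action the behaviour policy actually takes next, rather than from `max(Q[next_state])`.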

7. Achieving goals more effectively and efficiently

  • (Livebook)
  • (Notebook)
    • Implementation of more effective and efficient reinforcement learning algorithms:
      • SARSA(λ) with replacing traces
      • SARSA(λ) with accumulating traces
      • Q(λ) with replacing traces
      • Q(λ) with accumulating traces
      • Dyna-Q
      • Trajectory Sampling

8. Introduction to value-based deep reinforcement learning

  • (Livebook)
  • (Notebook)
    • Implementation of a value-based deep reinforcement learning baseline:
      • Neural Fitted Q-iteration (NFQ)

9. More stable value-based methods

  • (Livebook)
  • (Notebook)
    • Implementation of "classic" value-based deep reinforcement learning methods:
      • Deep Q-Networks (DQN)
      • Double Deep Q-Networks (DDQN)
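
The difference between the two methods comes down to how the bootstrap target is built. The sketch below shows both targets for a single transition, with plain lists standing in for the networks' outputs on the next state. The function names and the numbers are hypothetical; a real implementation would operate on batched tensors.

```python
def dqn_target(reward, done, gamma, q_target_next):
    """DQN: the target network both selects and evaluates the next action."""
    return reward + gamma * max(q_target_next) * (1.0 - done)

def ddqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN: the online network selects, the target network evaluates."""
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action] * (1.0 - done)

# Made-up action values for the next state from each network.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [3.0, 1.5, 0.0]

print(dqn_target(1.0, 0.0, 0.9, q_target_next))
print(ddqn_target(1.0, 0.0, 0.9, q_online_next, q_target_next))
```

Decoupling selection from evaluation is what reduces the overestimation that taking a `max` over noisy estimates tends to cause.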

10. Sample-efficient value-based methods

  • (Livebook)
  • (Notebook)
    • Implementation of main improvements for value-based deep reinforcement learning methods:
      • Dueling Deep Q-Networks (Dueling DQN)
      • Prioritized Experience Replay (PER)

11. Policy-gradient and actor-critic methods

  • (Livebook)
  • (Notebook)
    • Implementation of classic policy-based and actor-critic deep reinforcement learning methods:
      • Policy Gradients with Monte-Carlo returns and no value function (REINFORCE)
      • Policy Gradients with value function baseline trained with Monte-Carlo returns (VPG)
      • Asynchronous Advantage Actor-Critic (A3C)
      • Generalized Advantage Estimation (GAE)
      • [Synchronous] Advantage Actor-Critic (A2C)
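
As one concrete piece of these methods, GAE can be sketched as a backward pass over the one-step TD errors of a trajectory. The function name `gae_advantages` and the convention that `values` carries one extra bootstrap entry are assumptions for this example.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one trajectory.

    `values` holds one more entry than `rewards`: the value of the final state
    (use 0.0 if the episode terminated there).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # one-step TD error
        running = delta + gamma * lam * running                 # discounted sum of TD errors
        advantages[t] = running
    return advantages

print(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```

With `lam=0` the result reduces to the one-step TD errors (low variance, more bias); with `lam=1` it becomes the Monte-Carlo return minus the value baseline (less bias, more variance).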

12. Advanced actor-critic methods

  • (Livebook)
  • (Notebook)
    • Implementation of advanced actor-critic methods:
      • Deep Deterministic Policy Gradient (DDPG)
      • Twin Delayed Deep Deterministic Policy Gradient (TD3)
      • Soft Actor-Critic (SAC)
      • Proximal Policy Optimization (PPO)
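
To give a flavor of the last method, the sketch below evaluates PPO's clipped surrogate objective for a single sample, where `ratio` is the probability of the action under the new policy divided by its probability under the old one. The function name and the default `clip_range` of 0.2 are assumptions for illustration; a real implementation averages this over a batch and maximizes it by gradient ascent.

```python
def ppo_clipped_objective(ratio, advantage, clip_range=0.2):
    """Clipped surrogate objective for one sample (to be maximized)."""
    low, high = 1.0 - clip_range, 1.0 + clip_range
    clipped_ratio = min(max(ratio, low), high)
    # Taking the minimum removes any incentive to move the ratio outside [low, high].
    return min(ratio * advantage, clipped_ratio * advantage)

print(ppo_clipped_objective(1.5, 1.0))   # ratio above the clip range, positive advantage
print(ppo_clipped_objective(0.5, -1.0))  # ratio below the clip range, negative advantage
```

The clipping keeps each policy update close to the policy that collected the data, which is the main source of PPO's stability.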

13. Towards artificial general intelligence
