Datatouille / Rl Workshop

Licence: MIT
Reinforcement Learning Workshop for Data Science BKK


Reinforcement Learning Workshop

How to Use Notebooks

Each notebook contains the content and code-along for one session. We recommend running the notebooks in Google Colaboratory to keep setup to a minimum. For coding assignments, edit the Fill in The Code sections, then compare your answers with ours in solutions.

Session 1 Escaping GridWorld with Simple RL Agents

Markov Decision Processes / Discrete States and Actions

  • What is Reinforcement Learning: Pavlov's kitties
  • How Useful is Reinforcement Learning: games, robotics, ad bidding, stock trading, etc.
  • Why is Reinforcement Learning Different: level of workflow automation across classes of machine learning algorithms
    • Use cases for reinforcement learning
  • Reinforcement Learning Framework and Markov Decision Processes
  • GridWorld example to explain:
    • Problems: Markov decision processes, states, actions, and rewards
    • Solutions: policies, state values, (state-)action values, discount factor, optimality equations
  • Words of Caution: a few reasons Deep Reinforcement Learning Doesn't Work Yet
  • Challenges:
    • Read up on Bellman's equations and find out where they hid in our workshop today.
    • What are your ideas about how we can find the optimal policy?
    • Play around with Gridworld. Tweak these variables and see what happens to state and action values:
      • Expand the grid and/or add some more traps
      • Wind probability
      • Move rewards
      • Discount factor
      • Epsilon and how to decay it (or not)
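As a rough illustration of the Bellman optimality equation and the discount factor covered in this session, here is a minimal value-iteration sketch. The 1x4 corridor, the reward values, and every name (`step`, `value_iteration`, `MOVE_REWARD`, `GOAL_REWARD`) are assumptions made for this example and are not taken from the workshop notebooks.

```python
# Hypothetical 1x4 corridor: states 0..3, state 3 is the terminal goal.
# Actions: 0 = left, 1 = right. Moves are deterministic (no wind).
N_STATES = 4
GOAL = 3
ACTIONS = (0, 1)
MOVE_REWARD = -1.0   # assumed cost per move
GOAL_REWARD = 10.0   # assumed reward for reaching the goal


def step(state, action):
    """Return (next_state, reward) for a deterministic move."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    reward = GOAL_REWARD if next_state == GOAL else MOVE_REWARD
    return next_state, reward


def value_iteration(gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality backup until values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the terminal state keeps value 0
            candidates = []
            for a in ACTIONS:
                next_state, reward = step(s, a)
                candidates.append(reward + gamma * values[next_state])
            best = max(candidates)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


values = value_iteration()
```

Changing `gamma` or `MOVE_REWARD` here is a small-scale version of the Gridworld tweaks suggested above: state values further from the goal shrink as the discount factor drops.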

Session 2 Win Big at Monte Carlo - Sponsored by Humanize, the company that helps your business grow with AI

Discrete States and Actions

  • Blackjack-v0 environment, human play and computer play
  • Optimal Strategy for Blackjack
  • What is the Monte Carlo Method
  • Monte Carlo Prediction
  • Monte Carlo Control: All-visit, First-visit, and GLIE
  • Challenges:
    • What are some other ways of solving reinforcement learning problems? How are they better or worse than Monte Carlo methods (e.g. in performance or data requirements)?
    • Solve at least one of the following OpenAI gym environments with discrete states and actions:
      • FrozenLake-v0
      • Taxi-v2
      • Blackjack-v0
      • Any other environments with discrete states and actions at OpenAI Gym
    • Check session2b.ipynb if you are interested in using the Monte Carlo method to solve Grid World. This will give you more insight into the difference between all-visit and first-visit Monte Carlo.
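To make the first-visit idea concrete, the following is a minimal sketch of first-visit Monte Carlo prediction in plain Python. The episode format (a list of `(state, reward)` pairs), the states "A" and "B", and the function name `first_visit_mc_prediction` are assumptions for illustration, not the notebook's implementation.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Estimate state values by averaging first-visit returns.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received after leaving that state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Compute the return following every time step, working backwards.
        g = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            g = reward + gamma * g
            returns[t] = g
        # Only the first occurrence of each state contributes to the average.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns[t]
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Two hand-written episodes over hypothetical states "A" and "B".
episodes = [
    [("A", 0.0), ("B", 2.0), ("A", 1.0)],
    [("B", 1.0)],
]
state_values = first_visit_mc_prediction(episodes)
```

In the first episode, state "A" is visited twice with returns 3 and 1. First-visit keeps only the 3, whereas an all-visit estimate would average both.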

Session 3 GET a Taxi with Temporal Difference Learning - Sponsored by GET, the new ride-hailing service in Thailand

Discrete States and Actions

  • Taxi-v2 environment
  • Comparison between Monte Carlo and TD
  • SARSA
  • Q-learning
  • Expected SARSA
  • Handling Continuous States
  • Challenges: Solve an environment with continuous states using discretization
    • Acrobot-v1
    • MountainCar-v0
    • CartPole-v0
    • LunarLander-v2
  • Points to consider:
    • What are other ways of handling continuous states? (See tile coding)
    • What are the state space, action space, and rewards of the environment?
    • What algorithms did you use to solve the environment and why?
    • How many episodes did you solve it in? Can you improve the performance? (Tweaking discount factor, learning rate, Monte Carlo vs TD)
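The three TD algorithms in this session differ mainly in the target they bootstrap from. The sketch below puts the targets side by side. It assumes a non-terminal next state and an epsilon-greedy policy for Expected SARSA, and the function names are hypothetical rather than taken from the notebooks.

```python
def sarsa_target(reward, next_q, next_action, gamma):
    """On-policy target: uses the action actually taken next."""
    return reward + gamma * next_q[next_action]


def q_learning_target(reward, next_q, gamma):
    """Off-policy target: uses the greedy action in the next state."""
    return reward + gamma * max(next_q)


def expected_sarsa_target(reward, next_q, epsilon, gamma):
    """Target that averages next-state values under an epsilon-greedy policy."""
    n = len(next_q)
    greedy = max(range(n), key=lambda a: next_q[a])
    probs = [epsilon / n + (1.0 - epsilon if a == greedy else 0.0) for a in range(n)]
    expected = sum(p * q for p, q in zip(probs, next_q))
    return reward + gamma * expected


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha towards the target."""
    return q_value + alpha * (target - q_value)


# Hypothetical transition: reward -1, two actions available in the next state.
next_q = [1.0, 3.0]
sarsa = sarsa_target(-1.0, next_q, next_action=0, gamma=0.9)
q_learning = q_learning_target(-1.0, next_q, gamma=0.9)
expected = expected_sarsa_target(-1.0, next_q, epsilon=0.2, gamma=0.9)
new_q = td_update(0.0, q_learning, alpha=0.5)
```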

Session 3b Neural Networks in PyTorch - Sponsored by GET, the new ride-hailing service in Thailand

Optional

  1. Building Blocks

Familiarize ourselves with basic building blocks of a neural network in PyTorch such as tensors and layers

  2. Your First Neural Network

Build your first neural network with the main components of architecture, loss and optimizer

  3. Spiral Example

Use your first neural network in a task challenging for linear models to understand why we even need deep learning

Session 4 Policy-gradient to The Moon - Sponsored by GET, the new ride-hailing service in Thailand

Continuous States and Discrete Actions

  • Replacing Q dictionaries with neural networks
  • LunarLander-v2 environment
  • Vanilla Policy Gradient aka Monte Carlo Policy Gradient aka REINFORCE aka Stochastic Policy Gradient
  • Train Your Own Vanilla Policy Gradient Agent:
    • Hyperparameter tuning
    • Reward engineering
  • Inside Policy Gradient Agent:
    • Policy network
    • Returns function
    • Trajectories
    • Gradient ascent
  • Bonus: How to Derive Gradients of Policy Network
  • Challenges:
    • Finetune the model and try to beat the OpenAI Leaderboard at 658 episodes. Pay attention to how you can improve on vanilla policy gradients, for example with reward shaping.
    • See if you can solve LunarLanderContinuous-v2 with continuous actions using more sophisticated policy gradient methods such as TRPO and PPO.
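The returns function and the gradient-ascent objective listed under Inside Policy Gradient Agent can be sketched without any deep learning library. This is a minimal illustration under assumed inputs (a list of rewards and a list of log-probabilities for the chosen actions). The names `discounted_returns` and `reinforce_loss` are hypothetical.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted reward-to-go for every step of one trajectory."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Negative of the REINFORCE objective sum_t log pi(a_t | s_t) * G_t.

    Minimizing this loss with gradient descent is the same as doing
    gradient ascent on the objective.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# One hypothetical three-step trajectory.
rewards = [1.0, 0.0, 2.0]
returns = discounted_returns(rewards, gamma=0.5)
loss = reinforce_loss([-0.5, -1.0, -0.25], returns)
```

In a real agent the log-probabilities would come from the policy network, so that the loss can be differentiated with respect to its parameters.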

Session 4.5 Moon Redux with Proximal Policy Optimization (PPO)

Continuous States and Continuous Actions

  • PPO
    • Parallel environments
    • Normalized rewards and actions
    • Future rewards
    • GAE rewards
    • Clipped surrogate function
  • Challenges: Implement PPO to solve LunarLanderContinuous-v2 and compare it to your last project
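Two of the PPO ingredients above, GAE and the clipped surrogate function, reduce to a few lines of arithmetic. The sketch below is a plain-Python illustration for a single trajectory with no episode boundary inside it. The function names and default hyperparameters are assumptions, not values from the workshop.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory.

    values[t] is the critic's estimate for the state at step t and
    last_value is its estimate for the state after the final step
    (use 0.0 if the trajectory ended in a terminal state).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # one-step TD error
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s). Taking the minimum removes the
    incentive to move the ratio outside [1 - clip_eps, 1 + clip_eps].
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)


advantages = gae_advantages([1.0, 1.0], [0.5, 0.5], last_value=0.0, gamma=1.0, lam=1.0)
gain_when_good = clipped_surrogate(1.5, 2.0)    # positive advantage: gain is capped
gain_when_bad = clipped_surrogate(1.5, -2.0)    # negative advantage: penalty is kept
```

With `gamma=1.0` and `lam=1.0`, the advantages equal the Monte Carlo return minus the value estimate, which is a handy sanity check.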

Session 5 Deep Deep Q-learning to Drive MountainCar - Sponsored by GET, the new ride-hailing service in Thailand

Continuous States and Discrete Actions

  • MountainCar-v0 environment
  • Deep Q-learning (DQN)
  • Train Your Own DQN Agent:
    • Hyperparameter tuning
    • Reward engineering
  • Inside DQN Agent:
    • Replay Memory
    • Q Networks
    • Agent action selection
    • Agent update: DQN and DDQN
  • Challenges:
    • Finetune the model and try to beat the OpenAI Leaderboard at 341 episodes. Use what you learned in this session, such as creative reward engineering and hyperparameter tuning.
    • Try to figure out how you can solve MountainCarContinuous-v0. It is almost exactly the same as MountainCar-v0 but with continuous action space of size 1. See NAF Q-learning and DDPG papers for some hints.
    • Read up on Rainbow and how to push DQN to its limits.
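The replay memory and the difference between the DQN and DDQN updates can be sketched as follows. Q-values are passed in as plain lists so that the example runs without a neural network. The class and function names are hypothetical and the transition format is an assumption.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer that drops the oldest transition when full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def dqn_target(reward, done, next_q_target, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


memory = ReplayMemory(capacity=2)
for i in range(3):
    memory.push(("state", i))  # placeholder transitions

dqn = dqn_target(1.0, False, [2.0, 5.0], gamma=0.9)
ddqn = ddqn_target(1.0, False, [3.0, 1.0], [2.0, 5.0], gamma=0.9)
```

In this example the online network prefers action 0, so the DDQN target uses the target network's value for action 0 rather than its maximum, which is how Double DQN reduces overestimation.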

Session 5.5 Rainbow

Continuous States and Discrete Actions

  • Rainbow
    • Vanilla DQN (experience replay + target network)
    • Double DQN
    • Prioritized experience replay
    • Dueling networks
    • Multi-step learning
    • Distributional RL
    • Noisy networks
  • Challenges: Implement Rainbow to solve MountainCar-v0 and compare it to your last project
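Several Rainbow components are small formulas layered on top of DQN. The sketch below illustrates three of them (multi-step learning, dueling networks, and prioritized experience replay) as stand-alone functions. The names, the `alpha` and `eps` defaults, and the list-based inputs are assumptions for illustration. Distributional RL and noisy networks are not covered here.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Multi-step target: n discounted rewards plus a discounted bootstrap."""
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


three_step = n_step_target([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
q_values = dueling_q_values(2.0, [1.0, 3.0])
probs = priority_probabilities([0.1, 2.0, 0.5])
```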

Session 6 Continuous Control with Deep Deterministic Policy Gradient

Continuous States and Continuous Actions

  • Policy-based vs Value-based Methods
  • Pendulum-v0 environment
  • Deep Deterministic Policy Gradient (DDPG)
  • Train Your Own DDPG Agent:
    • Hyperparameter tuning
    • Reward engineering
  • How DDPG Agent Learns:
    • Critic update
    • Actor update
  • Challenges:
    • Try to beat the OpenAI leaderboard 100-episode average of -123.11 ± 6.86 for Pendulum-v0
    • Implement DDPG to solve MountainCarContinuous-v0
    • Besides DDPG, what other methods can handle continuous action spaces? Look up Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC).
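The critic and actor updates can be illustrated with a toy example that needs no neural network. Everything here is an assumption for illustration: the quadratic toy critic, the one-parameter actor, and the function names. The soft target update is a standard part of DDPG, although the outline above does not list it.

```python
def critic_target(reward, done, next_q, gamma=0.99):
    """Critic regression target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the target actor's action.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


def toy_critic_grad(action):
    """dQ/da for a toy critic Q(s, a) = -(a - 2)^2, which peaks at a = 2."""
    return -2.0 * (action - 2.0)


def actor_step(theta, lr=0.25):
    """Gradient ascent on Q(s, mu(s)) for a toy actor mu(s) = theta."""
    return theta + lr * toy_critic_grad(theta)


y = critic_target(1.0, False, next_q=2.0, gamma=0.9)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
theta = 0.0
for _ in range(2):
    theta = actor_step(theta)  # moves 0.0 -> 1.0 -> 1.5, towards the peak at 2
```

The actor never sees a reward directly. It only follows the critic's gradient with respect to the action, which is why DDPG can handle continuous action spaces.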

Session 6.5 Advanced Actor-Critic Methods

Continuous States and Continuous Actions

  • A2C / A3C
  • PPO
  • SAC

Other Topics

  • Explore vs exploit: epsilon-greedy, UCB, Thompson sampling
  • Reward function setting
  • Monte Carlo Tree Search
  • Hackathon nights to play Blackjack, Poker, Pommerman, boardgames and self-driving cars
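The three explore-vs-exploit strategies named above can be sketched for a simple multi-armed bandit. This is a minimal illustration using only the standard library. The function names, the exploration constant `c`, and the Bernoulli-reward assumption for Thompson sampling are choices made for this example.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random arm with probability epsilon, else the greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ucb1(q_values, counts, c=2.0):
    """Pick the arm with the highest upper confidence bound."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try every arm once before comparing bounds
    total = sum(counts)
    scores = [q + c * math.sqrt(math.log(total) / n) for q, n in zip(q_values, counts)]
    return max(range(len(scores)), key=lambda a: scores[a])


def thompson_bernoulli(successes, failures, rng=random):
    """Sample each arm's win rate from its Beta posterior and pick the best."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda a: samples[a])


greedy_arm = epsilon_greedy([0.1, 0.9], epsilon=0.0)
untried_arm = ucb1([0.5, 0.4], counts=[0, 3])
uncertain_arm = ucb1([0.5, 0.4], counts=[100, 1])
```

In the last call, UCB picks arm 1 despite its lower estimate because it has been tried only once, so its confidence bound is much wider.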

Readings

Environments

  • Spinning Up - an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning (deep RL)
  • OpenAI Gym - a toolkit for developing and comparing reinforcement learning algorithms
  • Unity ML-Agent Toolkit - an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents
  • Holodeck - a high-fidelity simulator for reinforcement learning built on top of Unreal Engine 4
  • AirSim - a simulator for drones, cars and more, built on Unreal Engine
  • Carla - an open-source simulator for autonomous driving research
  • Pommerman - a clone of Bomberman built for AI research
  • MetaCar - a reinforcement learning environment for self-driving cars in the browser
  • Boardgame.io - a boardgame environment

Agents

  • Unity ML-Agent Toolkit - an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents
  • SLM Lab - a modular deep reinforcement learning framework in PyTorch
  • Dopamine - a research framework for fast prototyping of reinforcement learning algorithms
  • TRFL - a library built on top of TensorFlow that exposes several useful building blocks for implementing Reinforcement Learning agents
  • Horizon - an open source end-to-end platform for applied reinforcement learning (RL) developed and used at Facebook.