wumo / Reinforcement-Learning-An-Introduction

Licence: MIT License
Kotlin implementation of algorithms, examples, and exercises from Sutton and Barto's Reinforcement Learning: An Introduction (2nd Edition)


Reinforcement Learning: An Introduction

Kotlin implementation of algorithms, examples, and exercises from Sutton and Barto's Reinforcement Learning: An Introduction (2nd Edition). The purpose of this project is to help readers understand RL algorithms and experiment with them easily.

Inspired by ShangtongZhang/reinforcement-learning-an-introduction (Python) and idsc-frazzoli/subare (Java 8)

Features:

  • Algorithms and problems are separated, so you can experiment with various combinations of <algorithm, problem> or <algorithm, function approximator, problem>.
  • The implementation stays very close to the pseudocode in the book, so reading the source code will help you understand the original algorithms.
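
To illustrate what separating algorithms from problems can look like, here is a minimal sketch. It is not the project's actual API: the `Problem` interface, the `qLearning` function, and the `Corridor` toy task are hypothetical names made up for this example. The algorithm is written only against the interface, so any problem that implements it can be plugged in.

```kotlin
import kotlin.random.Random

// Hypothetical names for illustration only; not the project's actual API.

/** A problem exposes its actions and a one-step simulator, nothing else. */
interface Problem<S, A> {
    val initialState: S
    fun actions(state: S): List<A>
    fun isTerminal(state: S): Boolean
    /** Returns (next state, reward) for taking [action] in [state]. */
    fun step(state: S, action: A, rng: Random): Pair<S, Double>
}

/** Tabular Q-learning with an epsilon-greedy policy, written only against [Problem]. */
fun <S, A> qLearning(
    problem: Problem<S, A>,
    episodes: Int,
    alpha: Double = 0.1,
    gamma: Double = 1.0,
    epsilon: Double = 0.1,
    rng: Random = Random(0)
): Map<Pair<S, A>, Double> {
    val q = HashMap<Pair<S, A>, Double>()
    // Unvisited state-action pairs default to 0.0.
    fun value(s: S, a: A): Double = q[s to a] ?: 0.0
    repeat(episodes) {
        var s = problem.initialState
        while (!problem.isTerminal(s)) {
            val actions = problem.actions(s)
            val explore = rng.nextDouble() < epsilon
            val a = if (explore) {
                actions.random(rng)
            } else {
                actions.maxByOrNull { candidate -> value(s, candidate) }!!
            }
            val (next, reward) = problem.step(s, a, rng)
            // The value of a terminal state is defined as 0.
            val best = if (problem.isTerminal(next)) {
                0.0
            } else {
                problem.actions(next).maxOf { candidate -> value(next, candidate) }
            }
            q[s to a] = value(s, a) + alpha * (reward + gamma * best - value(s, a))
            s = next
        }
    }
    return q
}

/** Toy problem: cells 0..4, start at 0, cell 4 is terminal, every step costs -1. */
object Corridor : Problem<Int, Int> {
    override val initialState = 0
    override fun actions(state: Int) = listOf(-1, 1)
    override fun isTerminal(state: Int) = state == 4
    override fun step(state: Int, action: Int, rng: Random) =
        Pair((state + action).coerceIn(0, 4), -1.0)
}
```

Calling `qLearning(Corridor, episodes = 1000)` returns the learned action values. Swapping in another `Problem` implementation requires no change to the algorithm, which is the point of the separation.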

Implemented algorithms:

Model-based (Dynamic Programming):

Monte Carlo (episode backup):

Temporal Difference (one-step backup):

n-step Temporal Difference (unify MC and TD):

Dyna (Integrate Planning, Acting, and Learning):

On-policy Prediction with Function Approximation

On-policy Control with Function Approximation

Off-policy Methods with Approximation

Eligibility Traces

Policy Gradient Methods

Implemented problems:

Build

Built with Maven

Test cases

Try Testcases

Figure 7.2

Figure 7.2: Performance of n-step TD methods as a function of α, for various values of n, on a 19-state random walk task
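
For readers who want to see what this experiment measures, the following is a rough, self-contained sketch of n-step TD prediction on the 19-state random walk. It follows the structure of the book's pseudocode but is not taken from this project's source; the names `nStepTd`, `rmsError`, `STATES`, and `trueValues` are assumptions made for the example. States 1..19 are non-terminal, episodes start in the centre, and termination on the left or right yields a reward of -1 or +1.

```kotlin
import kotlin.math.min
import kotlin.math.sqrt
import kotlin.random.Random

const val STATES = 19 // non-terminal states 1..19; 0 and 20 are terminal

/** True values under the equiprobable random policy: -0.9, -0.8, ..., 0.9. */
val trueValues = DoubleArray(STATES + 2) { if (it in 1..STATES) (it - 10) / 10.0 else 0.0 }

/** n-step TD prediction; returns the value estimates after [episodes] episodes. */
fun nStepTd(n: Int, alpha: Double, episodes: Int, rng: Random, gamma: Double = 1.0): DoubleArray {
    val v = DoubleArray(STATES + 2)
    repeat(episodes) {
        val states = arrayListOf(10)    // S_0 is the centre state
        val rewards = arrayListOf(0.0)  // index 0 is padding so that rewards[t] is R_t
        var terminalTime = Int.MAX_VALUE
        var t = 0
        while (true) {
            if (t < terminalTime) {
                val move = if (rng.nextBoolean()) 1 else -1
                val next = states[t] + move
                val reward = when (next) {
                    0 -> -1.0
                    STATES + 1 -> 1.0
                    else -> 0.0
                }
                states.add(next)
                rewards.add(reward)
                if (next == 0 || next == STATES + 1) terminalTime = t + 1
            }
            val tau = t - n + 1 // time whose state estimate is updated now
            if (tau >= 0) {
                // n-step return: discounted rewards plus a bootstrapped tail.
                var g = 0.0
                var discount = 1.0
                for (i in (tau + 1)..min(tau + n, terminalTime)) {
                    g += discount * rewards[i]
                    discount *= gamma
                }
                if (tau + n < terminalTime) g += discount * v[states[tau + n]]
                v[states[tau]] += alpha * (g - v[states[tau]])
            }
            if (tau == terminalTime - 1) break
            t++
        }
    }
    return v
}

/** Root-mean-square error over the 19 non-terminal states. */
fun rmsError(v: DoubleArray): Double {
    var sum = 0.0
    for (s in 1..STATES) {
        val e = v[s] - trueValues[s]
        sum += e * e
    }
    return sqrt(sum / STATES)
}
```

A single run such as `rmsError(nStepTd(n = 4, alpha = 0.4, episodes = 10, rng = Random(0)))` gives one noisy sample of the error after 10 episodes. The figure plots the error as a function of α for each n, so reproducing its curves would mean repeating such runs over many seeds and a grid of α and n values, then averaging.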


Figure 10.1

Figure 10.1: The Mountain Car task and the cost-to-go function learned during one run


Figure 10.4

Figure 10.4: Effect of α and n on early performance of n-step semi-gradient Sarsa with tile-coding function approximation on the Mountain Car task


Figure 12.3

Figure 12.3: 19-state Random walk results: Performance of the offline λ-return algorithm.


Figure 12.6

Figure 12.6: 19-state Random walk results: Performance of TD(λ).


Figure 12.8

Figure 12.8: 19-state Random walk results: Performance of online λ-return algorithms


Figure 12.10

Figure 12.10: Early performance on the Mountain Car task of Sarsa(λ) with replacing traces


Figure 12.11

Figure 12.11: Summary comparison of Sarsa(λ) algorithms on the Mountain Car task.
