
qqiang00 / Reinforce

Reinforcement Learning Algorithm Package & PuckWorld, GridWorld Gym environments

Projects that are alternatives of or similar to Reinforce

Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Stars: ✭ 543 (-1.63%)
Mutual labels:  jupyter-notebook
Pdpbox
python partial dependence plot toolbox
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Data Structures Using Python
This is my repository for Data Structures using Python
Stars: ✭ 546 (-1.09%)
Mutual labels:  jupyter-notebook
Sentiment analysis fine grain
Multi-label Classification with BERT; Fine Grained Sentiment Analysis from AI challenger
Stars: ✭ 546 (-1.09%)
Mutual labels:  jupyter-notebook
Handson Ml
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.
Stars: ✭ 23,798 (+4211.23%)
Mutual labels:  jupyter-notebook
Bandits
Python library for Multi-Armed Bandits
Stars: ✭ 547 (-0.91%)
Mutual labels:  jupyter-notebook
Video Classification
Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101
Stars: ✭ 543 (-1.63%)
Mutual labels:  jupyter-notebook
Competitive Data Science
Materials for "How to Win a Data Science Competition: Learn from Top Kagglers" course
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Probabilistic Programming And Bayesian Methods For Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Stars: ✭ 23,912 (+4231.88%)
Mutual labels:  jupyter-notebook
Neural Collage
Collaging on Internal Representations: An Intuitive Approach for Semantic Transfiguration
Stars: ✭ 549 (-0.54%)
Mutual labels:  jupyter-notebook
Ember
Stars: ✭ 545 (-1.27%)
Mutual labels:  jupyter-notebook
Machine Learning Notes
My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes and demos (2000+ slides) 我不间断更新的机器学习,概率模型和深度学习的讲义(2000+页)和视频链接
Stars: ✭ 5,390 (+876.45%)
Mutual labels:  jupyter-notebook
Pythoncode Tutorials
The Python Code Tutorials
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Sqlworkshops
SQL Server Workshops
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Go Profiler Notes
felixge's notes on the various go profiling methods that are available.
Stars: ✭ 525 (-4.89%)
Mutual labels:  jupyter-notebook
Pierian Data Complete Python 3 Bootcamp
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Fuzzingbook
Project page for "The Fuzzing Book"
Stars: ✭ 549 (-0.54%)
Mutual labels:  jupyter-notebook
Deepnlp Course
Deep NLP Course
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Curve Text Detector
This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Gan
Tooling for GANs in TensorFlow
Stars: ✭ 547 (-0.91%)
Mutual labels:  jupyter-notebook

Learn reinforcement learning with the classic GridWorld and PuckWorld environments, compatible with the Gym library.

I wrote several basic classes describing the events that occur during an agent's interaction with an environment. In addition, to help RL beginners better understand how the classic RL algorithms work in discrete observation spaces, I wrote two classic environments: GridWorld and PuckWorld.

You can copy these two environments into your Gym library, and with just a few modifications they can be used the same way as the environments built into Gym.

Please go to the sub-folder "reinforce" to see how the whole package is organized:

core.py

In this file you will find some core classes modeling the objects needed in reinforcement learning. These are:

Transition

stores the information describing one state transition of an agent. A Transition is the basic unit of an Episode.
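A Transition can be sketched as a small record type. The field names below are illustrative, not necessarily the package's exact ones:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Transition:
    """One step of agent-environment interaction (field names assumed)."""
    s0: Any        # observation before the action
    a0: Any        # action taken
    reward: float  # immediate reward received
    is_done: bool  # whether s1 is a terminal state
    s1: Any        # observation after the action

# a single -1-reward step from state 0 to state 4
t = Transition(s0=0, a0=1, reward=-1.0, is_done=False, s1=4)
```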

Episode

stores the list of transitions an agent experiences until it reaches one of its end states.

Experience

stores a list of episodes. An Experience has a capacity limit; it also has a sample method to randomly select a certain number of transitions from its memory.
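The capacity limit plus random sampling is the usual replay-memory pattern. A minimal sketch (the method names and eviction policy are assumptions, not the package's exact API):

```python
import random

class Experience:
    """Sketch of a bounded replay memory storing episodes."""
    def __init__(self, capacity=100):
        self.capacity = capacity   # maximum number of episodes kept
        self.episodes = []         # each episode is a list of transitions

    def push(self, episode):
        self.episodes.append(episode)
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)   # discard the oldest episode

    def sample(self, batch_size):
        # flatten all stored transitions, then draw a random batch
        transitions = [t for ep in self.episodes for t in ep]
        return random.sample(transitions, min(batch_size, len(transitions)))
```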

Agent

this is the base class for all agents implementing a particular reinforcement learning algorithm. In the Agent class, an "act" method wraps the step() function of the environment the agent interacts with. You can implement your own agent class by deriving from this class.
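The act-wraps-step idea can be sketched as follows; the attribute names and the toy environment are illustrative, not the package's actual code:

```python
class ToyEnv:
    """One-dimensional toy environment, used only to exercise the sketch."""
    def __init__(self):
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos += 1 if action == 1 else -1
        done = abs(self.pos) >= 3
        return self.pos, (1.0 if done else 0.0), done, {}


class Agent:
    """Sketch of a base agent whose act() wraps env.step()."""
    def __init__(self, env):
        self.env = env
        self.state = env.reset()
        self.transitions = []  # raw record of what happened

    def act(self, action):
        s0 = self.state
        s1, reward, done, info = self.env.step(action)  # delegate to the environment
        self.transitions.append((s0, action, reward, done, s1))
        self.state = s1
        return s1, reward, done, info
```

A derived class would add an action-selection policy on top of act().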

agents.py

In this file, you will find agent classes that already implement particular reinforcement learning algorithms; more agent classes will be added as I practice. Currently, you can find agents using the Sarsa, Q-learning, and Sarsa(λ) algorithms.
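Sarsa and Q-learning agents differ mainly in their update target. As a reference point, one tabular Q-learning step (function and parameter names are illustrative) looks like:

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s1, done, n_actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Sarsa would instead use Q(s', a') for the actually chosen next action a'."""
    target = r if done else r + gamma * max(Q[(s1, b)] for b in range(n_actions))
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

Q = defaultdict(float)
q_learning_update(Q, s=0, a=1, r=-1.0, s1=2, done=False, n_actions=2)
```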

approximator.py

Here you can find classes that act as function approximators. Deep neural networks are used as function approximators in RL algorithms; this is so-called deep reinforcement learning. You will find different types of agents using different types of function approximators.
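The simplest instance of the idea is a linear approximator standing in for a table (or a neural network): it maps a feature vector to a value and is trained by gradient descent. This sketch is a generic illustration, not the package's actual classes:

```python
class LinearApproximator:
    """Sketch of a linear value approximator over feature vectors."""
    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, features):
        # value estimate is the dot product w . x
        return sum(w * x for w, x in zip(self.w, features))

    def update(self, features, target):
        # one step of gradient descent on the squared error
        error = target - self.predict(features)
        self.w = [w + self.lr * error * x for w, x in zip(self.w, features)]
```

A deep-RL agent replaces this linear map with a neural network but keeps the same predict/update interface.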

gridworld.py

A base GridWorld class is implemented for generating the more specific GridWorld environments used in David Silver's RL course, such as:

  • Simple 10×7 Grid world
  • Windy Grid world
  • Random Walk
  • Cliff Walk
  • Skull and Treasure environment, used to show how an agent can benefit from a random policy while a deterministic policy may lead to an endless loop.

You can build your own grid world object just by passing different parameters to its init function. Visit here for more details about how to generate a specific grid world environment object.
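The parameterized-grid-world idea can be sketched as below. The class and constructor arguments here are illustrative assumptions, not the package's real signature; the defaults mimic the windy grid world (10×7, wind pushing upward per column):

```python
class MiniGridWorld:
    """Toy sketch of a parameterized grid world (not the package's actual class)."""
    ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}  # up, down, left, right

    def __init__(self, n_width=10, n_height=7, start=(0, 3), goal=(7, 3), wind=None):
        self.n_width, self.n_height = n_width, n_height
        self.start, self.goal = start, goal
        self.wind = wind or [0] * n_width  # upward push per column
        self.pos = start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        dx, dy = self.ACTIONS[action]
        x, y = self.pos
        y += self.wind[x]                         # wind acts before the move
        x = min(max(x + dx, 0), self.n_width - 1)   # clip to the grid
        y = min(max(y + dy, 0), self.n_height - 1)
        self.pos = (x, y)
        done = self.pos == self.goal
        return self.pos, (0.0 if done else -1.0), done, {}
```

Changing only the constructor arguments yields the other variants, e.g. a nonzero wind list for the windy grid world or a row of penalty cells for the cliff walk.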

puckworld.py

This is another classic environment, "PuckWorld", whose idea comes from Karpathy's ReinforceJS. Unlike the gridworld environment, which has one-dimensional discrete observation and action spaces, PuckWorld has a continuous six-dimensional observation space and a discrete action space that can also easily be converted to a continuous one.

PuckWorld is considered one of the classic environments for training an agent with a Deep Q-Network.

examples

Several separate .py files are provided for understanding an RL algorithm without the classes mentioned above.

You can also find implementations of Policy Iteration and Value Iteration using dynamic programming in this folder.
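For orientation, value iteration on a small deterministic MDP can be sketched as follows (the transition-table format is an illustrative assumption, not the examples' actual code):

```python
def value_iteration(n_states, transitions, gamma=0.9, theta=1e-6):
    """Sketch of value iteration for a small deterministic MDP.
    transitions[s][a] = (next_state, reward); terminal states have no actions."""
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            if not transitions[s]:
                continue  # terminal state keeps value 0
            # Bellman optimality backup: V(s) = max_a [r + gamma * V(s')]
            best = max(r + gamma * V[s1] for (s1, r) in transitions[s].values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:   # stop once values change by less than theta
            break
    return V

# 3-state chain: 0 -> 1 -> 2 (terminal), -1 reward per step
chain = [{0: (1, -1.0)}, {0: (2, -1.0)}, {}]
V = value_iteration(3, chain)
```

Policy iteration alternates the same backup (for a fixed policy) with greedy policy improvement instead of taking the max every sweep.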

I hope you enjoy these classes, and contributions to this package are welcome.

Author: Qiang Ye.

Date: August 16, 2017

License: MIT
