
qqiang00 / Reinforce

Reinforcement Learning Algorithm Package & PuckWorld, GridWorld Gym environments

Projects that are alternatives of or similar to Reinforce

Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Stars: ✭ 543 (-1.63%)
Mutual labels:  jupyter-notebook
Pdpbox
python partial dependence plot toolbox
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Data Structures Using Python
This is my repository for Data Structures using Python
Stars: ✭ 546 (-1.09%)
Mutual labels:  jupyter-notebook
Sentiment analysis fine grain
Multi-label Classification with BERT; Fine Grained Sentiment Analysis from AI challenger
Stars: ✭ 546 (-1.09%)
Mutual labels:  jupyter-notebook
Handson Ml
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.
Stars: ✭ 23,798 (+4211.23%)
Mutual labels:  jupyter-notebook
Bandits
Python library for Multi-Armed Bandits
Stars: ✭ 547 (-0.91%)
Mutual labels:  jupyter-notebook
Video Classification
Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101
Stars: ✭ 543 (-1.63%)
Mutual labels:  jupyter-notebook
Competitive Data Science
Materials for "How to Win a Data Science Competition: Learn from Top Kagglers" course
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Probabilistic Programming And Bayesian Methods For Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Stars: ✭ 23,912 (+4231.88%)
Mutual labels:  jupyter-notebook
Neural Collage
Collaging on Internal Representations: An Intuitive Approach for Semantic Transfiguration
Stars: ✭ 549 (-0.54%)
Mutual labels:  jupyter-notebook
Ember
Stars: ✭ 545 (-1.27%)
Mutual labels:  jupyter-notebook
Machine Learning Notes
My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes and demos (2000+ slides) 我不间断更新的机器学习,概率模型和深度学习的讲义(2000+页)和视频链接
Stars: ✭ 5,390 (+876.45%)
Mutual labels:  jupyter-notebook
Pythoncode Tutorials
The Python Code Tutorials
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Sqlworkshops
SQL Server Workshops
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Go Profiler Notes
felixge's notes on the various go profiling methods that are available.
Stars: ✭ 525 (-4.89%)
Mutual labels:  jupyter-notebook
Pierian Data Complete Python 3 Bootcamp
Stars: ✭ 544 (-1.45%)
Mutual labels:  jupyter-notebook
Fuzzingbook
Project page for "The Fuzzing Book"
Stars: ✭ 549 (-0.54%)
Mutual labels:  jupyter-notebook
Deepnlp Course
Deep NLP Course
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Curve Text Detector
This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.
Stars: ✭ 551 (-0.18%)
Mutual labels:  jupyter-notebook
Gan
Tooling for GANs in TensorFlow
Stars: ✭ 547 (-0.91%)
Mutual labels:  jupyter-notebook

Learn reinforcement learning with the classic GridWorld and PuckWorld environments, compatible with the Gym library.

I wrote several basic classes describing the events that occur during an agent's interaction with an environment. In addition, to help RL beginners better understand how the classic RL algorithms work in discrete observation spaces, I wrote two classic environments: GridWorld and PuckWorld.

You can copy these two environments into your Gym library, and with just a few modifications they can be used the same way as the environments built into Gym.

Please go to the sub-folder "reinforce" to see how the whole package is organized:

core.py

In this file you will find some core classes modeling the objects needed in reinforcement learning. These are:

Transition

stores the information describing one state transition of an agent. A Transition is the basic unit of an Episode.
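A Transition can be sketched as a small record type. The field names below are illustrative, not necessarily the package's exact ones:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Transition:
    """One step of agent-environment interaction (field names assumed)."""
    s0: Any        # observation before the action
    a0: Any        # action taken
    reward: float  # immediate reward received
    is_done: bool  # whether s1 is a terminal state
    s1: Any        # observation after the action

# a single -1-reward step from state 0 to state 4
t = Transition(s0=0, a0=1, reward=-1.0, is_done=False, s1=4)
```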

Episode

stores the list of transitions an agent experiences until it reaches one of its end states.

Experience

stores a list of episodes. An Experience has a capacity limit; it also has a sample method to randomly select a certain number of transitions from its memory.
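The capacity limit plus random sampling is the usual replay-memory pattern. A minimal sketch (the method names and eviction policy are assumptions, not the package's exact API):

```python
import random

class Experience:
    """Sketch of a bounded replay memory storing episodes."""
    def __init__(self, capacity=100):
        self.capacity = capacity   # maximum number of episodes kept
        self.episodes = []         # each episode is a list of transitions

    def push(self, episode):
        self.episodes.append(episode)
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)   # discard the oldest episode

    def sample(self, batch_size):
        # flatten all stored transitions, then draw a random batch
        transitions = [t for ep in self.episodes for t in ep]
        return random.sample(transitions, min(batch_size, len(transitions)))
```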

Agent

this is the base class for all agents implementing a particular reinforcement learning algorithm. In the Agent class, an "act" method wraps the step() function of the environment the agent interacts with. You can implement your own agent class by deriving from this class.
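The act-wraps-step idea can be sketched as follows; the attribute names and the toy environment are illustrative, not the package's actual code:

```python
class ToyEnv:
    """One-dimensional toy environment, used only to exercise the sketch."""
    def __init__(self):
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos += 1 if action == 1 else -1
        done = abs(self.pos) >= 3
        return self.pos, (1.0 if done else 0.0), done, {}


class Agent:
    """Sketch of a base agent whose act() wraps env.step()."""
    def __init__(self, env):
        self.env = env
        self.state = env.reset()
        self.transitions = []  # raw record of what happened

    def act(self, action):
        s0 = self.state
        s1, reward, done, info = self.env.step(action)  # delegate to the environment
        self.transitions.append((s0, action, reward, done, s1))
        self.state = s1
        return s1, reward, done, info
```

A derived class would add an action-selection policy on top of act().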

agents.py

In this file, you will find agent classes that already implement particular reinforcement learning algorithms; more agent classes will be added as I practice. Currently, you can find agents using the Sarsa, Q-learning, and Sarsa(λ) algorithms.
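Sarsa and Q-learning agents differ mainly in their update target. As a reference point, one tabular Q-learning step (function and parameter names are illustrative) looks like:

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s1, done, n_actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Sarsa would instead use Q(s', a') for the actually chosen next action a'."""
    target = r if done else r + gamma * max(Q[(s1, b)] for b in range(n_actions))
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

Q = defaultdict(float)
q_learning_update(Q, s=0, a=1, r=-1.0, s1=2, done=False, n_actions=2)
```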

approximator.py

Here you can find classes that act as function approximators. Deep neural networks are used as function approximators in RL algorithms; this is so-called deep reinforcement learning. You will find different types of agents using different types of function approximators.
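The simplest instance of the idea is a linear approximator standing in for a table (or a neural network): it maps a feature vector to a value and is trained by gradient descent. This sketch is a generic illustration, not the package's actual classes:

```python
class LinearApproximator:
    """Sketch of a linear value approximator over feature vectors."""
    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, features):
        # value estimate is the dot product w . x
        return sum(w * x for w, x in zip(self.w, features))

    def update(self, features, target):
        # one step of gradient descent on the squared error
        error = target - self.predict(features)
        self.w = [w + self.lr * error * x for w, x in zip(self.w, features)]
```

A deep-RL agent replaces this linear map with a neural network but keeps the same predict/update interface.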

gridworld.py

A base GridWorld class is implemented for generating the more specific GridWorld environments used in David Silver's RL course, such as:

  • Simple 10×7 Grid world
  • Windy Grid world
  • Random Walk
  • Cliff Walk
  • Skull and Treasure environment, used to show how an agent can benefit from a random policy while a deterministic policy may lead to an endless loop.

You can build your own grid world object just by passing different parameters to its init function. Visit here for more details about how to generate a specific grid world environment object.
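The parameterized-grid-world idea can be sketched as below. The class and constructor arguments here are illustrative assumptions, not the package's real signature; the defaults mimic the windy grid world (10×7, wind pushing upward per column):

```python
class MiniGridWorld:
    """Toy sketch of a parameterized grid world (not the package's actual class)."""
    ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}  # up, down, left, right

    def __init__(self, n_width=10, n_height=7, start=(0, 3), goal=(7, 3), wind=None):
        self.n_width, self.n_height = n_width, n_height
        self.start, self.goal = start, goal
        self.wind = wind or [0] * n_width  # upward push per column
        self.pos = start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        dx, dy = self.ACTIONS[action]
        x, y = self.pos
        y += self.wind[x]                         # wind acts before the move
        x = min(max(x + dx, 0), self.n_width - 1)   # clip to the grid
        y = min(max(y + dy, 0), self.n_height - 1)
        self.pos = (x, y)
        done = self.pos == self.goal
        return self.pos, (0.0 if done else -1.0), done, {}
```

Changing only the constructor arguments yields the other variants, e.g. a nonzero wind list for the windy grid world or a row of penalty cells for the cliff walk.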

puckworld.py

This is another classic environment, "PuckWorld", whose idea comes from Karpathy's ReinforceJS. Unlike the gridworld environment, which has one-dimensional discrete observation and action spaces, PuckWorld has a continuous six-dimensional observation space and a discrete action space that can also easily be converted to a continuous one.

PuckWorld is considered one of the classic environments for training an agent with a Deep Q-Network.

examples

Several separate .py files are provided for understanding an RL algorithm without the classes mentioned above.

You can also find implementations of Policy Iteration and Value Iteration using dynamic programming in this folder.
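For orientation, value iteration on a small deterministic MDP can be sketched as follows (the transition-table format is an illustrative assumption, not the examples' actual code):

```python
def value_iteration(n_states, transitions, gamma=0.9, theta=1e-6):
    """Sketch of value iteration for a small deterministic MDP.
    transitions[s][a] = (next_state, reward); terminal states have no actions."""
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            if not transitions[s]:
                continue  # terminal state keeps value 0
            # Bellman optimality backup: V(s) = max_a [r + gamma * V(s')]
            best = max(r + gamma * V[s1] for (s1, r) in transitions[s].values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:   # stop once values change by less than theta
            break
    return V

# 3-state chain: 0 -> 1 -> 2 (terminal), -1 reward per step
chain = [{0: (1, -1.0)}, {0: (2, -1.0)}, {}]
V = value_iteration(3, chain)
```

Policy iteration alternates the same backup (for a fixed policy) with greedy policy improvement instead of taking the max every sweep.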

I hope you enjoy these classes, and contributions to this package are welcome.

Author: Qiang Ye.

Date: August 16, 2017

License: MIT
