
gsurma / Cartpole

License: MIT
OpenAI's cartpole env solver.

Programming Languages

python
139335 projects - #7 most used programming language
python27
39 projects

Projects that are alternatives of or similar to Cartpole

Snake Ai Reinforcement
AI for Snake game trained from pixels using Deep Reinforcement Learning (DQN).
Stars: ✭ 123 (+14.95%)
Mutual labels:  ai, reinforcement-learning, dqn
Pytorch Rl
This repository contains model-free deep reinforcement learning algorithms implemented in Pytorch
Stars: ✭ 394 (+268.22%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Doudizhu
AI for Dou Dizhu (Chinese card game)
Stars: ✭ 149 (+39.25%)
Mutual labels:  ai, reinforcement-learning, dqn
Atari
AI research environment for the Atari 2600 games 🤖.
Stars: ✭ 174 (+62.62%)
Mutual labels:  ai, reinforcement-learning, dqn
Gym Anytrading
The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)
Stars: ✭ 627 (+485.98%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Tensorflow Rl
Implementations of deep RL papers and random experimentation
Stars: ✭ 176 (+64.49%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+3649.53%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (+4.67%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Openaigym
Solving OpenAI Gym problems.
Stars: ✭ 98 (-8.41%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Mushroom Rl
Python library for Reinforcement Learning.
Stars: ✭ 442 (+313.08%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Aigames
use AI to play some games.
Stars: ✭ 422 (+294.39%)
Mutual labels:  ai, reinforcement-learning, dqn
Basic reinforcement learning
An introductory series to Reinforcement Learning (RL) with comprehensive step-by-step tutorials.
Stars: ✭ 826 (+671.96%)
Mutual labels:  ai, reinforcement-learning, openai-gym
Super Mario Bros Ppo Pytorch
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
Stars: ✭ 649 (+506.54%)
Mutual labels:  ai, reinforcement-learning, openai-gym
Rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Stars: ✭ 980 (+815.89%)
Mutual labels:  ai, reinforcement-learning, openai-gym
Reinforcepy
Collection of reinforcement learners implemented in python. Mainly including DQN and its variants
Stars: ✭ 54 (-49.53%)
Mutual labels:  reinforcement-learning, dqn
Gym Minigrid
Minimalistic gridworld package for OpenAI Gym
Stars: ✭ 1,047 (+878.5%)
Mutual labels:  reinforcement-learning, openai-gym
Reinforcement learning
Implementations of basic reinforcement learning algorithms
Stars: ✭ 100 (-6.54%)
Mutual labels:  reinforcement-learning, dqn
Treeqn
Stars: ✭ 77 (-28.04%)
Mutual labels:  reinforcement-learning, openai-gym
Holodeck Engine
High Fidelity Simulator for Reinforcement Learning and Robotics Research.
Stars: ✭ 48 (-55.14%)
Mutual labels:  ai, reinforcement-learning
Dmc2gym
OpenAI Gym wrapper for the DeepMind Control Suite
Stars: ✭ 75 (-29.91%)
Mutual labels:  reinforcement-learning, openai-gym

Cartpole

Reinforcement Learning solution to OpenAI's CartPole environment.

Check out the corresponding Medium article: Cartpole - Introduction to Reinforcement Learning (DQN - Deep Q-Learning)

About

A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center. (source)
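For readers unfamiliar with the Gym interface, the following minimal loop illustrates the observation/action/reward cycle described above. It uses random actions and assumes the classic (pre-0.26) gym API that this project was written against; it is a sketch, not code from the repository.

```python
import gym

# Minimal interaction loop for CartPole-v0 (classic gym API), shown only to
# illustrate the state/action/reward interface described above.
env = gym.make("CartPole-v0")
state = env.reset()                     # 4 values: cart position/velocity, pole angle/velocity
done = False
total_reward = 0
while not done:
    action = env.action_space.sample()  # 0 = push cart left, 1 = push cart right
    state, reward, done, info = env.step(action)
    total_reward += reward              # +1 for every timestep the pole stays upright
print("Episode reward:", total_reward)
```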

DQN

Standard DQN with Experience Replay; a minimal sketch of the agent follows the hyperparameter list below.

Hyperparameters:

  • GAMMA = 0.95
  • LEARNING_RATE = 0.001
  • MEMORY_SIZE = 1000000
  • BATCH_SIZE = 20
  • EXPLORATION_MAX = 1.0
  • EXPLORATION_MIN = 0.01
  • EXPLORATION_DECAY = 0.995
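
As a rough sketch of how these hyperparameters are typically wired into a DQN agent with epsilon-greedy exploration and experience replay (class and method names here are illustrative, not necessarily the repository's exact code; the model is assumed to be the Keras network described in the next section):

```python
import random
from collections import deque

import numpy as np

GAMMA = 0.95
MEMORY_SIZE = 1000000
BATCH_SIZE = 20
EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.01
EXPLORATION_DECAY = 0.995

class DQNSolver:
    """Illustrative DQN agent with an epsilon-greedy policy and experience replay."""

    def __init__(self, action_space, model):
        self.action_space = action_space          # number of discrete actions (2 for CartPole)
        self.model = model                        # Q-network mapping state -> Q-value per action
        self.memory = deque(maxlen=MEMORY_SIZE)   # replay buffer of past transitions
        self.exploration_rate = EXPLORATION_MAX

    def remember(self, state, action, reward, next_state, done):
        # States are expected as arrays of shape (1, 4).
        self.memory.append((state, action, reward, next_state, done))

    def act(self, state):
        # Explore with probability epsilon, otherwise act greedily w.r.t. predicted Q-values.
        if np.random.rand() < self.exploration_rate:
            return random.randrange(self.action_space)
        q_values = self.model.predict(state)
        return int(np.argmax(q_values[0]))

    def experience_replay(self):
        if len(self.memory) < BATCH_SIZE:
            return
        batch = random.sample(self.memory, BATCH_SIZE)
        for state, action, reward, next_state, done in batch:
            q_update = reward
            if not done:
                # Bellman target: reward plus discounted value of the best next action.
                q_update = reward + GAMMA * np.amax(self.model.predict(next_state)[0])
            q_values = self.model.predict(state)
            q_values[0][action] = q_update
            self.model.fit(state, q_values, verbose=0)
        # Decay exploration rate towards its minimum after each replay step.
        self.exploration_rate = max(EXPLORATION_MIN, self.exploration_rate * EXPLORATION_DECAY)
```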

Model structure (see the code sketch after this list):

  1. Dense layer - input: 4, output: 24, activation: relu
  2. Dense layer - input: 24, output: 24, activation: relu
  3. Dense layer - input: 24, output: 2, activation: linear
  • MSE loss function
  • Adam optimizer
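
A sketch of this network in Keras; the exact import paths and optimizer arguments may differ slightly across Keras/TensorFlow versions, and this assumes the older standalone-Keras style in use at the time:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

LEARNING_RATE = 0.001

# 4 inputs (cart/pole observation) -> 24 -> 24 -> 2 outputs (one Q-value per action).
model = Sequential()
model.add(Dense(24, input_shape=(4,), activation="relu"))
model.add(Dense(24, activation="relu"))
model.add(Dense(2, activation="linear"))
model.compile(loss="mse", optimizer=Adam(lr=LEARNING_RATE))
```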

Performance

CartPole-v0 defines "solving" as getting an average reward of 195.0 over 100 consecutive trials. (source)
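A simple way to track this criterion during training is a rolling window over the last 100 episode rewards; the helper below is illustrative and not part of the repository.

```python
from collections import deque

recent_scores = deque(maxlen=100)  # rewards of the last 100 episodes

def is_solved(episode_reward, threshold=195.0):
    """Return True once the average reward over the last 100 episodes reaches the threshold."""
    recent_scores.append(episode_reward)
    return (len(recent_scores) == recent_scores.maxlen
            and sum(recent_scores) / len(recent_scores) >= threshold)
```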

Example trial gif
Example trial chart
Solved trials chart

Author

Greg (Grzegorz) Surma

PORTFOLIO

GITHUB

BLOG

Support via PayPal