
shivaverma / Openaigym

Solving OpenAI Gym problems.


Projects that are alternatives to or similar to Openaigym

Pytorch Rl
This repository contains model-free deep reinforcement learning algorithms implemented in Pytorch
Stars: ✭ 394 (+302.04%)
Mutual labels:  reinforcement-learning, dqn, openai-gym, ddpg
Mushroom Rl
Python library for Reinforcement Learning.
Stars: ✭ 442 (+351.02%)
Mutual labels:  reinforcement-learning, dqn, openai-gym, ddpg
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+3993.88%)
Mutual labels:  reinforcement-learning, dqn, openai-gym, ddpg
pytorch-rl
Pytorch Implementation of RL algorithms
Stars: ✭ 15 (-84.69%)
Mutual labels:  openai-gym, dqn, ddpg
Rlcycle
A library for ready-made reinforcement learning agents and reusable components for neat prototyping
Stars: ✭ 184 (+87.76%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (+137.76%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (+14.29%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
TF2-RL
Reinforcement learning algorithms implemented for Tensorflow 2.0+ [DQN, DDPG, AE-DDPG, SAC, PPO, Primal-Dual DDPG]
Stars: ✭ 160 (+63.27%)
Mutual labels:  openai-gym, dqn, ddpg
Deeprl Tensorflow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Stars: ✭ 319 (+225.51%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Autonomous Learning Library
A PyTorch library for building deep reinforcement learning agents.
Stars: ✭ 425 (+333.67%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Elegantrl
Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch.
Stars: ✭ 575 (+486.73%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Tensorflow Rl
Implementations of deep RL papers and random experimentation
Stars: ✭ 176 (+79.59%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (+1992.86%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+126.53%)
Mutual labels:  openai-gym, dqn, ddpg
Machin
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Stars: ✭ 145 (+47.96%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Torchrl
Pytorch Implementation of Reinforcement Learning Algorithms ( Soft Actor Critic(SAC)/ DDPG / TD3 /DQN / A2C/ PPO / TRPO)
Stars: ✭ 90 (-8.16%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Easy Rl
Reinforcement learning tutorial in Chinese, readable online at: https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+2965.31%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Cartpole
OpenAI's cartpole env solver.
Stars: ✭ 107 (+9.18%)
Mutual labels:  reinforcement-learning, dqn, openai-gym
Deep Rl Keras
Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)
Stars: ✭ 395 (+303.06%)
Mutual labels:  reinforcement-learning, dqn, ddpg
Gym Anytrading
The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)
Stars: ✭ 627 (+539.8%)
Mutual labels:  reinforcement-learning, dqn, openai-gym

Requirements

  • python - 3.7
  • keras - 2.4.3
  • tensorflow - 2.2.0
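For reference, an environment matching these versions can be set up roughly as follows. The gym and box2d-py pins are assumptions (the list above does not specify them); box2d-py is needed for the Lunar-Lander and Bipedal-Walker environments:

```bash
pip install tensorflow==2.2.0 keras==2.4.3 gym==0.17.3 box2d-py
```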

Project 1: Cart-Pole

Introduction

  • In this task we have to balance a pole on top of a cart. The action space is discrete, with 2 actions:

  • 0 - move the cart to the left

  • 1 - move the cart to the right

  • I solved this problem using DQN in around 60 episodes. Following is a graph of score vs. episodes; a minimal sketch of the DQN pieces is given after it.
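As a rough illustration of the DQN pieces involved, here is a minimal sketch of the Q-network and the epsilon-greedy policy. The layer sizes and learning rate are illustrative assumptions, not necessarily the values used in this repo:

```python
# Minimal DQN building blocks for CartPole (sizes and hyperparameters
# are illustrative assumptions).
import gym
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

env = gym.make("CartPole-v1")
state_dim = env.observation_space.shape[0]  # 4: position, velocity, angle, angular velocity
n_actions = env.action_space.n              # 2 discrete actions: left, right

def build_q_network():
    # Maps a state to one Q-value per action.
    model = Sequential([
        Dense(24, activation="relu", input_dim=state_dim),
        Dense(24, activation="relu"),
        Dense(n_actions, activation="linear"),
    ])
    model.compile(loss="mse", optimizer=Adam(learning_rate=1e-3))
    return model

def act(model, state, epsilon):
    # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
    if np.random.rand() < epsilon:
        return env.action_space.sample()
    q_values = model.predict(state.reshape(1, -1))
    return int(np.argmax(q_values[0]))
```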

Project 2: Mountain-Car

Introduction

  • In this task we have to teach the car to reach the goal position at the top of the mountain. The action space is discrete, with 3 actions:

  • 0 - push the car to the left

  • 1 - do nothing

  • 2 - push the car to the right

  • I solved this problem using DQN in around 15 episodes. Following is a graph of score vs. episodes; the replay update at the core of the agent is sketched below.
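The DQN machinery is the same as for Cart-Pole; its core is the experience-replay update, sketched below. The buffer size, batch size, and discount factor are assumptions, and this simplified variant omits a separate target network:

```python
# Experience-replay update at the heart of DQN (hyperparameters are
# illustrative assumptions; no separate target network in this sketch).
import random
from collections import deque

import numpy as np

memory = deque(maxlen=100000)  # holds (state, action, reward, next_state, done)
gamma = 0.99                   # discount factor

def replay(model, batch_size=64):
    if len(memory) < batch_size:
        return
    batch = random.sample(memory, batch_size)
    states = np.array([t[0] for t in batch])
    next_states = np.array([t[3] for t in batch])
    q = model.predict(states)            # current Q-value estimates
    q_next = model.predict(next_states)  # used to bootstrap the targets
    for i, (_, action, reward, _, done) in enumerate(batch):
        # Bellman target: r for terminal steps, else r + gamma * max_a' Q(s', a').
        q[i][action] = reward if done else reward + gamma * np.max(q_next[i])
    model.fit(states, q, verbose=0)      # only the taken actions' targets change
```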

Project 3: Pendulum

Introduction

  • In this task we have to swing the pendulum up and balance it upright. The action space is continuous, with a single action: the torque applied to the joint.

  • 0 - torque in the range [-2, 2]

  • I solved this problem using DDPG in around 100 episodes. Following is a graph of score vs. episodes; a sketch of the DDPG actor/critic pair is given below.
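DDPG handles the continuous torque by pairing a deterministic actor with a Q-value critic. A sketch of the two networks follows; the layer sizes are assumptions, and the target networks, replay buffer, and exploration noise that full DDPG needs are omitted for brevity:

```python
# DDPG actor/critic pair for Pendulum (layer sizes are assumptions; target
# networks, replay, and exploration noise are omitted for brevity).
from tensorflow.keras.layers import Concatenate, Dense, Input, Lambda
from tensorflow.keras.models import Model

state_dim, action_dim, max_torque = 3, 1, 2.0  # Pendulum-v0 dimensions

def build_actor():
    # Deterministic policy: state -> torque in [-2, 2].
    s = Input(shape=(state_dim,))
    x = Dense(256, activation="relu")(s)
    x = Dense(256, activation="relu")(x)
    a = Dense(action_dim, activation="tanh")(x)  # output in [-1, 1]
    a = Lambda(lambda t: t * max_torque)(a)      # scale to [-2, 2]
    return Model(s, a)

def build_critic():
    # Q-function: (state, action) -> scalar value estimate.
    s = Input(shape=(state_dim,))
    a = Input(shape=(action_dim,))
    x = Concatenate()([s, a])
    x = Dense(256, activation="relu")(x)
    x = Dense(256, activation="relu")(x)
    q = Dense(1)(x)
    return Model([s, a], q)
```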

Project 4: Lunar-Lander

  • The task is to land the spaceship between the flags smoothly. The ship has 3 throttles: one points downward, and the other two point to the left and to the right. You have to control the ship with these. The task has 2 versions: a discrete one with a discrete action space, and a continuous one with a continuous action space.

  • In order to solve the environment you have to reach an average reward of +200 over 100 consecutive episodes. I solved both versions in under 400 episodes; a sketch of the solve check follows the plots below.

Discrete Version Plot

Continuous Version Plot
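Both versions share the same solve criterion, so the check can be sketched once. The random action below is a placeholder for the trained agent, and the classic gym step API (pre-0.26) is assumed:

```python
# Solve check for LunarLander: average score of +200 over the last 100
# episodes (random placeholder policy; classic gym API assumed).
from collections import deque

import gym
import numpy as np

env = gym.make("LunarLander-v2")              # discrete version
# env = gym.make("LunarLanderContinuous-v2")  # continuous version

scores = deque(maxlen=100)
for episode in range(400):
    state, done, score = env.reset(), False, 0.0
    while not done:
        action = env.action_space.sample()    # placeholder for the trained agent
        state, reward, done, _ = env.step(action)
        score += reward
    scores.append(score)
    if len(scores) == 100 and np.mean(scores) >= 200:
        print("Solved in", episode + 1, "episodes")
        break
```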

Project 5: Bipedal-Walker

  • BipedalWalker has 2 legs, and each leg has 2 joints. You have to teach the walker to walk by applying torque to these joints. Torque can be applied in the range (-1, 1). A positive reward is given for moving forward, and a small negative reward for applying torque to the motors. The environment's spaces are shown in the sketch below.
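For reference, the environment's spaces can be inspected directly; this assumes gym's BipedalWalker-v3 with the Box2D extras installed:

```python
# Quick look at BipedalWalker's observation and action spaces.
import gym

env = gym.make("BipedalWalker-v3")
print(env.observation_space.shape)  # (24,) hull and joint readings plus lidar
print(env.action_space)             # 4-dimensional Box in [-1, 1]: torque per joint
```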

Smooth Terrain

  • In the beginning, the agent behaves completely randomly; it does not know how to control and balance the legs.
  • After 300 episodes, it learns to crawl on one knee and one leg. The agent is playing it safe at this point, because tumbling costs it a reward of -100.
  • After 500 episodes, it starts to balance on both legs, but it still needs to learn how to walk properly.
  • After 600 episodes, it learns to maximize the reward. It walks in a somewhat unusual style; after all, it is an AI, not a human. This is just one way of walking that earns the maximum reward; if I trained it again, it might learn some other optimal gait.

Hardcore Terrain

  • I saved my weights from the previous training on the smooth terrain and resumed training on the hardcore terrain, since the agent already knew how to walk and now only needs to learn to cross obstacles while walking. A sketch of resuming from saved weights is given below.
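A sketch of how that carry-over can work: rebuild the same actor, load the weights saved after the smooth-terrain run, and continue training in the hardcore environment. The file name and layer sizes are assumptions:

```python
# Resuming on hardcore terrain from weights saved on the smooth terrain
# (file name and layer sizes are assumptions).
import gym
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

actor = Sequential([
    Dense(256, activation="relu", input_dim=24),  # 24-dim observation
    Dense(256, activation="relu"),
    Dense(4, activation="tanh"),                  # torque for the 4 joints
])
actor.load_weights("bipedal_smooth_actor.h5")     # weights from the smooth run

env = gym.make("BipedalWalkerHardcore-v3")        # same spaces, obstacles added
# ...continue the usual DDPG training loop from here.
```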