stevenpjg / Ddpg Aigym

License: MIT

Continuous control with deep reinforcement learning - Deep Deterministic Policy Gradient (DDPG) algorithm implemented in OpenAI Gym environments

Labels

deep-learning tensorflow reinforcement-learning

ddpg-aigym

Deep Deterministic Policy Gradient

Implementation of the Deep Deterministic Policy Gradient algorithm (Lillicrap et al., arXiv:1509.02971) in TensorFlow
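For readers new to DDPG, here is a minimal NumPy sketch of two pieces of the algorithm as described in the paper: the critic's TD target, y = r + gamma * Q'(s', mu'(s')), and the soft update of the target networks. It is not taken from this repository. The function names and the default values of `gamma` and `tau` are illustrative assumptions.

```python
import numpy as np

def critic_targets(rewards, next_q_values, dones, gamma=0.99):
    """TD targets y = r + gamma * Q'(s', mu'(s')).

    next_q_values are assumed to come from the target critic evaluated at the
    target actor's action. Terminal transitions (done = 1) do not bootstrap.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    next_q_values = np.asarray(next_q_values, dtype=np.float64)
    not_done = 1.0 - np.asarray(dones, dtype=np.float64)
    return rewards + gamma * not_done * next_q_values

def soft_update(target_params, online_params, tau=0.001):
    """Return tau * theta + (1 - tau) * theta_target for each parameter array."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

A small `tau` makes the target networks trail the learned networks slowly, which is what keeps the critic's regression targets from shifting too quickly.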

How to use

git clone https://github.com/stevenpjg/ddpg-aigym.git
cd ddpg-aigym
python main.py

During training

Once trained

Learning Curve

The learning curve for InvertedPendulum-v1 environment.

Dependencies

TensorFlow (developed with version 0.11.0rc0, CPU or GPU build)
OpenAI Gym
MuJoCo

Features

Batch Normalization (improvement in learning speed)
Grad-inverter (described in arXiv:1511.04143)
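The grad-inverter idea from arXiv:1511.04143 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code used in this repository: the function name and argument layout are hypothetical, and `grads` is assumed to be the gradient of Q with respect to the action, in the ascent direction.

```python
import numpy as np

def invert_gradients(grads, actions, low, high):
    """Scale dQ/da so that actions are kept inside [low, high].

    A gradient that would increase the action is scaled by the remaining
    headroom below the upper bound. A gradient that would decrease it is
    scaled by the headroom above the lower bound. If the action has already
    left the range, the scale turns negative and the gradient is inverted.
    """
    grads = np.asarray(grads, dtype=np.float64)
    actions = np.asarray(actions, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    width = high - low
    scale_up = (high - actions) / width    # applied where grads > 0
    scale_down = (actions - low) / width   # applied where grads <= 0
    return grads * np.where(grads > 0, scale_up, scale_down)
```

For example, with bounds [-1, 1], an action of 1.5 and a gradient of +1, the scale is (1 - 1.5) / 2 = -0.25, so the update pushes the action back toward the valid range instead of further out.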

Note

To use a different environment

experiment = 'InvertedPendulum-v1'  # specify the environment here

To use batch normalization

is_batch_norm = True  # batch normalization switch
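As a rough picture of what the switch enables, a training-mode batch normalization step can be sketched in NumPy. This is a generic illustration, not the repository's TensorFlow implementation: the function name, `gamma`, `beta` and `eps` are assumptions, and the running averages used at inference time are omitted.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # roughly zero mean, unit variance
    return gamma * x_hat + beta
```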

Let me know if you run into any issues or need clarification on hyperparameter tuning.
