
louaaron / GAN-Q-Learning

Licence: other
Unofficial Implementation of GAN Q Learning https://arxiv.org/abs/1805.04874

Programming Languages

python

Projects that are alternatives of or similar to GAN-Q-Learning

Reinforcement Learning
Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning
Stars: ✭ 3,329 (+7826.19%)
Mutual labels:  qlearning
Deep reinforcement learning course
Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch
Stars: ✭ 3,232 (+7595.24%)
Mutual labels:  qlearning
DOM-Q-NET
Graph-based Deep Q Network for Web Navigation
Stars: ✭ 30 (-28.57%)
Mutual labels:  qlearning
reinforced-race
A model car learns driving along a track using reinforcement learning
Stars: ✭ 37 (-11.9%)
Mutual labels:  qlearning
Reinforcement-Learning-An-Introduction
Kotlin implementation of algorithms, examples, and exercises from Sutton and Barto's Reinforcement Learning: An Introduction (2nd Edition)
Stars: ✭ 28 (-33.33%)
Mutual labels:  qlearning
Deep-QLearning-Demo-csharp
This demo is a C# port of ConvNetJS RLDemo (https://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html) by Andrej Karpathy
Stars: ✭ 34 (-19.05%)
Mutual labels:  qlearning
reinforcement-learning-flappybird
In-browser reinforcement learning for flappy bird 🐦
Stars: ✭ 41 (-2.38%)
Mutual labels:  qlearning
Q-learning-conv-net
A Q-learning AI bot that perceives its environment with a CNN
Stars: ✭ 14 (-66.67%)
Mutual labels:  qlearning
cartpole-rl-remote
CartPole game by Reinforcement Learning, a journey from training to inference
Stars: ✭ 24 (-42.86%)
Mutual labels:  qlearning
ReinforcementLearning Sutton-Barto Solutions
Solutions and figures for problems from Reinforcement Learning: An Introduction by Sutton & Barto
Stars: ✭ 20 (-52.38%)
Mutual labels:  qlearning

This code implements the "GAN Q-Learning" algorithm found in https://arxiv.org/abs/1805.04874.

Modifications From Paper

  • The published algorithm has a typo in the discriminator loss; for reference, the standard (corrected) form is sketched below this list.

  • Currently, the discriminator eventually learns to discriminate perfectly against the generator (even before the generator has learned the actual distribution) on the CartPole environment. I've experimented with different hyperparameters, but the problem persists: for example, even when I update the generator 10 times per discriminator update, the training graph still looks as follows.

(training graph)

Final Results

In the end, I was unable to reproduce the results given in the paper, as I couldn't sweep enough hyperparameters on my hardware. After verifying that the algorithm was implemented correctly, I found that the classic problems of GAN training arose: in particular, the discriminator easily overfit the reward distribution, so the generator got stuck and the reward function was never learned. Even with significant architecture modifications, these problems persisted.
