
paintception / Deep-Quality-Value-Family

Licence: other
Official implementation of the paper "Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning Algorithms" (https://arxiv.org/abs/1909.01779), to appear at the NeurIPS 2019 Deep Reinforcement Learning Workshop.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to Deep-Quality-Value-Family

deep-blueberry
If you've always wanted to learn about deep-learning but don't know where to start, then you might have stumbled upon the right place!
Stars: ✭ 17 (+41.67%)
Mutual labels:  deep-reinforcement-learning, keras-tensorflow
dqn-lambda
NeurIPS 2019: DQN(λ) = Deep Q-Network + λ-returns.
Stars: ✭ 20 (+66.67%)
Mutual labels:  deep-reinforcement-learning, atari-2600
semantic-guidance
Code for our CVPR-2021 paper on Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings.
Stars: ✭ 19 (+58.33%)
Mutual labels:  deep-reinforcement-learning
digit-recognizer-live
Recognize Digits using Deep Neural Networks in Google Chrome live!
Stars: ✭ 29 (+141.67%)
Mutual labels:  keras-tensorflow
Reinforcement Learning Course
Reinforcement Learning course, from 0 to 100, with notebooks and very simple slides to understand everything perfectly.
Stars: ✭ 18 (+50%)
Mutual labels:  deep-reinforcement-learning
Underflow
With Underflow, create traffic light clusters that interact to regulate traffic flow
Stars: ✭ 12 (+0%)
Mutual labels:  deep-reinforcement-learning
Real-Time-Violence-Detection-in-Video-
No description or website provided.
Stars: ✭ 54 (+350%)
Mutual labels:  keras-tensorflow
6502-npp-syntax
Notepad++ Syntax Highlighting for 6502 Assembly (and NESASM)
Stars: ✭ 21 (+75%)
Mutual labels:  atari-2600
gcnn keras
Graph convolution with tf.keras
Stars: ✭ 47 (+291.67%)
Mutual labels:  keras-tensorflow
keras tfrecord
Extending Keras to support tfrecord dataset
Stars: ✭ 61 (+408.33%)
Mutual labels:  keras-tensorflow
Deep-Reinforcement-Learning-CS285-Pytorch
Solutions of assignments of Deep Reinforcement Learning course presented by the University of California, Berkeley (CS285) in Pytorch framework
Stars: ✭ 104 (+766.67%)
Mutual labels:  deep-reinforcement-learning
GradCAM and GuidedGradCAM tf2
Implementation of GradCAM & Guided GradCAM with Tensorflow 2.x
Stars: ✭ 16 (+33.33%)
Mutual labels:  keras-tensorflow
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (+350%)
Mutual labels:  deep-reinforcement-learning
tf-faster-rcnn
Tensorflow 2 Faster-RCNN implementation from scratch, supporting batch processing with MobileNetV2 and VGG16 backbones
Stars: ✭ 88 (+633.33%)
Mutual labels:  keras-tensorflow
Keras ile Derin Ogrenmeye Giris
Prepared by Merve Ayyüce Kızrak for the BTK Akademi - 1 Million Employment Project.
Stars: ✭ 109 (+808.33%)
Mutual labels:  keras-tensorflow
G-SimCLR
This is the code base for paper "G-SimCLR : Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling" by Souradip Chakraborty, Aritra Roy Gosthipaty and Sayak Paul.
Stars: ✭ 69 (+475%)
Mutual labels:  keras-tensorflow
racing dreamer
Latent Imagination Facilitates Zero-Shot Transfer in Autonomous Racing
Stars: ✭ 31 (+158.33%)
Mutual labels:  deep-reinforcement-learning
Smart-Traffic-Signals-in-India-using-Deep-Reinforcement-Learning-and-Advanced-Computer-Vision
We have used Deep Reinforcement Learning and Advanced Computer Vision techniques to create smart traffic signals for Indian roads. We provide scripts that use SUMO as our environment for deploying all our RL models.
Stars: ✭ 131 (+991.67%)
Mutual labels:  deep-reinforcement-learning
pytorch-hdqn
Hierarchical-DQN in pytorch (not actively maintained)
Stars: ✭ 36 (+200%)
Mutual labels:  deep-reinforcement-learning
dl-relu
Deep Learning using Rectified Linear Units (ReLU)
Stars: ✭ 20 (+66.67%)
Mutual labels:  keras-tensorflow

A new family of Deep Reinforcement Learning algorithms: DQV, Dueling-DQV and DQV-Max Learning

This repo contains the code that releases a new family of Deep Reinforcement Learning (DRL) algorithms. These algorithms learn an approximation of the state-value function V(s) alongside an approximation of the state-action value function Q(s,a). Both approximations learn from each other's estimates, yielding faster and more robust training. This work is an in-depth extension of our original DQV-Learning paper and will be presented in December at the NeurIPS Deep Reinforcement Learning Workshop (DRLW) in Vancouver, Canada.
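The core idea of learning two value functions that bootstrap from each other can be sketched with the temporal-difference targets below. This is a simplified NumPy illustration based on our understanding of the DQV and DQV-Max update rules; the function names and array shapes are illustrative assumptions, not the repo's actual code.

```python
import numpy as np

def dqv_targets(rewards, next_states, dones, v_net, gamma=0.99):
    """Shared TD target used by DQV for BOTH networks.

    The state-value network V and the state-action value network Q
    both regress towards r + gamma * V(s'), so the two approximations
    learn from each other's estimates.
    """
    next_v = np.array([v_net(s) for s in next_states])
    return rewards + gamma * next_v * (1.0 - dones)

def dqv_max_v_target(rewards, q_next, dones, gamma=0.99):
    """V-network target used by DQV-Max: r + gamma * max_a Q(s', a).

    The Q-network in DQV-Max still regresses towards r + gamma * V(s').
    """
    return rewards + gamma * np.max(q_next, axis=1) * (1.0 - dones)

# Toy check with a constant value function V(s) = 1.0 over a batch of two
# transitions, the second of which is terminal (done = 1).
targets = dqv_targets(np.array([0.0, 1.0]), [None, None],
                      np.array([0.0, 1.0]), lambda s: 1.0)
max_targets = dqv_max_v_target(np.array([0.0, 1.0]),
                               np.array([[0.5, 2.0], [1.0, 0.0]]),
                               np.array([0.0, 1.0]))
```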

The benefits that these algorithms provide are discussed in depth in our new paper: 'Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning Algorithms'.

Be sure to check out arXiv for a pre-print of our work!

The main algorithms presented in this repo are:

  • Dueling Deep Quality-Value (Dueling-DQV) Learning: This Repo
  • Deep Quality-Value-Max (DQV-Max) Learning: This Repo
  • Deep Quality-Value (DQV) Learning: originally presented in 'DQV-Learning', now properly refactored.

while we also release implementations of:

  • Deep Q-Learning: DQN
  • Double Deep Q-Learning: DDQN

which have been used for some of the comparisons presented in our work.
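For reference, the standard DQN and Double-DQN (DDQN) targets used in these comparisons can be sketched as follows. This is a minimal, hypothetical NumPy sketch of the well-known update rules, not the repo's actual implementation.

```python
import numpy as np

def dqn_target(r, q_target_next, gamma=0.99, done=False):
    # Standard DQN: bootstrap from the max of the target network's
    # Q-values at the next state, which is prone to overestimation.
    return r + (0.0 if done else gamma * float(np.max(q_target_next)))

def ddqn_target(r, q_online_next, q_target_next, gamma=0.99, done=False):
    # Double DQN: the online network selects the greedy action, the
    # target network evaluates it, which reduces overestimation bias.
    a = int(np.argmax(q_online_next))
    return r + (0.0 if done else gamma * float(q_target_next[a]))

# Toy example: the online and target networks disagree on the best action.
t_dqn = dqn_target(1.0, np.array([0.5, 2.0]))
t_ddqn = ddqn_target(1.0, np.array([3.0, 0.0]), np.array([0.5, 2.0]))
```

Note how, with the same target-network Q-values, DDQN's target is smaller whenever the online network's greedy action is not the target network's maximizing one.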


If you want to train an agent from scratch on a game of the Atari Arcade Learning Environment (ALE) benchmark, run the training_job.sh script: it lets you choose which type of agent to train according to the type of policy learning it uses (online for DQV and Dueling-DQV, offline for all other algorithms). Note that depending on which game you choose, some modifications to the code might be required.

In ./models we release trained models obtained on Pong for both DQV and DQV-Max.

You can use these models to explore the behavior of the learned value functions with the ./src/test_value_functions.py script. The script computes the averaged expected return over all visited states and shows that the algorithms of the DQV family suffer less from the overestimation bias of the Q function. It also shows that our algorithms do not simply shift this overestimation bias from the Q function onto the V function.
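One way to quantify such an overestimation bias is to compare the average value predicted by a network over visited states against the empirical discounted return actually collected. A minimal sketch of this measurement, assuming a finished episode and access to the network's per-state predictions (the function names are hypothetical, not those of test_value_functions.py):

```python
import numpy as np

def empirical_return(rewards, gamma=0.99):
    # Discounted Monte-Carlo return of a finished episode,
    # accumulated backwards from the final reward.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def value_estimation_bias(predicted_values, rewards, gamma=0.99):
    # Positive bias -> the network overestimates the true return;
    # a well-calibrated value function keeps this close to zero.
    return float(np.mean(predicted_values)) - empirical_return(rewards, gamma)

# Toy example: a network predicting 2.0 everywhere on an episode whose
# true discounted return (gamma = 0.5) is 1.75 overestimates by 0.25.
bias = value_estimation_bias([2.0, 2.0, 2.0], [1.0, 1.0, 1.0], gamma=0.5)
```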

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].