mmcenta / Left Shift

Licence: MIT

Using deep reinforcement learning to tackle the game 2048.

left-shift

This repository contains the code used in our project for the INF581: Advanced Topics in A.I. at École Polytechnique.

In this project, we aim to train a game-playing agent for the game 2048. We implement an OpenAI Gym environment to model the game and use the Deep Q-Learning (DQN) algorithm from the Stable Baselines library to train multiple agents, varying the state encoding, reward function, network type and network structure. Results show that one-hot encoding of the states is crucial for good performance. We also conclude that Convolutional Neural Networks (CNN) are more efficient than Multilayer Perceptrons (MLP) for this game.

For a more in-depth discussion, give our report a read.
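To make the one-hot state encoding mentioned above concrete, here is a minimal sketch in plain Python. It is not the code used in this repository: the function name `one_hot_board`, the channel count of 16 and the channels-first layout are assumptions chosen for illustration only.

```python
from typing import List

# Assumption: tiles never exceed 2**15, so 16 channels are enough
# (channel 0 = empty cell, channel k = tile 2**k).
NUM_CHANNELS = 16


def one_hot_board(board: List[List[int]],
                  channels: int = NUM_CHANNELS) -> List[List[List[int]]]:
    """Encode a square board of raw tile values as a channels x N x N one-hot grid."""
    size = len(board)
    encoded = [[[0] * size for _ in range(size)] for _ in range(channels)]
    for r, row in enumerate(board):
        for c, tile in enumerate(row):
            # Empty cells (0) go to channel 0; a tile 2**k goes to channel k.
            k = tile.bit_length() - 1 if tile > 0 else 0
            if k >= channels:
                raise ValueError(f"tile {tile} does not fit in {channels} channels")
            encoded[k][r][c] = 1
    return encoded


board = [
    [0, 2, 4, 8],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [2048, 0, 0, 0],
]
encoded = one_hot_board(board)
```

Each cell activates exactly one channel, so the network sees tile identity as a category rather than as a raw magnitude such as 2 versus 2048.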

Project Structure

Below we detail the function of each directory:

agents: contains scripts to train and evaluate agents (more details in the Running section), as well as the code implementing custom callbacks and policies;
docs: contains the demo GIF and the final report of the project;
hyperparams: contains YAML files detailing hyperparameters for agents;
models: contains pretrained agents;
utils: contains auxiliary scripts for plotting results.

Installation

We recommend using a separate Python 3.7 environment for this project (there is an incompatibility issue when trying to load models created using Python 3.7 on other versions). Our dependencies are listed in the requirements files.

A quick way to install them is to run the following command: pip install -r [requirements.txt|requirements-gpu.txt], choosing the appropriate file depending on whether you wish to run the models on a CPU or a GPU.

To install the environment, execute the following commands:

git clone https://github.com/mmcenta/gym-text2048
pip install -e gym-text2048
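Because the pretrained models may fail to load outside Python 3.7, a small guard can catch a mismatched interpreter early. The following is only a sketch: the helper `matches_recommended` and the constant `RECOMMENDED_VERSION` are hypothetical names and are not part of this repository.

```python
import sys
from typing import Tuple

# Assumption: (3, 7) mirrors the recommendation above.
RECOMMENDED_VERSION = (3, 7)


def matches_recommended(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """Return True if the major.minor part of `version` is the recommended one."""
    return tuple(version[:2]) == RECOMMENDED_VERSION


if not matches_recommended():
    # Warn instead of exiting: training from scratch may still work on other versions.
    print(
        "Warning: pretrained models were created with Python 3.7 "
        "and may not load on this interpreter.",
        file=sys.stderr,
    )
```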

Running

Interactive player for humans:

python agents/play.py

Random agent:

python agents/random_agent.py

DQN:

To show an agent reaching 2048:

python agents/dqn.py --r2048

To demo pretrained models:

python agents/dqn.py -mn MODEL_NAME --demo

Example:

python agents/dqn.py -mn cnn_5l_4_v2 --demo

The models are in the models/ folder. If the model name contains _nohot, you have to launch it with the --no-one-hot flag.

Example:

python agents/dqn.py -mn cnn_5l_4_v2_nohot --no-one-hot --demo
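The naming rule above can be automated when demoing several models in a row. This is a minimal sketch: `demo_command` is a hypothetical helper, not part of the repository, and it only assembles the command line shown in the examples.

```python
from typing import List


def demo_command(model_name: str) -> List[str]:
    """Build the demo command for a pretrained model, adding --no-one-hot when needed."""
    cmd = ["python", "agents/dqn.py", "-mn", model_name]
    # Models whose name contains "_nohot" were trained without one-hot encoding.
    if "_nohot" in model_name:
        cmd.append("--no-one-hot")
    cmd.append("--demo")
    return cmd


one_hot_cmd = demo_command("cnn_5l_4_v2")
no_hot_cmd = demo_command("cnn_5l_4_v2_nohot")
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.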

To train:

python agents/dqn.py -mn MODEL_NAME --train

To evaluate:

python agents/dqn.py -mn MODEL_NAME --eval

Complete usage:

python agents/dqn.py -h

usage: python agents/dqn.py [-h] [--env ENV] [--tensorboard-log TENSORBOARD_LOG]
              [--hyperparams-file HYPERPARAMS_FILE] [--model-name MODEL_NAME]
              [--n-timesteps N_TIMESTEPS] [--log-interval LOG_INTERVAL]
              [--hist-freq HIST_FREQ] [--eval-episodes EVAL_EPISODES]
              [--save-freq SAVE_FREQ] [--save-directory SAVE_DIRECTORY]
              [--log-directory LOG_DIRECTORY] [--seed SEED]
              [--verbose VERBOSE] [--no-one-hot] [--train] [--eval]
              [--extractor EXTRACTOR] [--demo] [--r2048]

optional arguments:
  -h, --help            show this help message and exit
  --env ENV             Environment id.
  --tensorboard-log TENSORBOARD_LOG, -tb TENSORBOARD_LOG
                        Tensorboard log directory.
  --hyperparams-file HYPERPARAMS_FILE, -hf HYPERPARAMS_FILE
                        Hyperparameter YAML file location.
  --model-name MODEL_NAME, -mn MODEL_NAME
                        Model name (if it already exists, training will be
                        resumed).
  --n-timesteps N_TIMESTEPS, -n N_TIMESTEPS
                        Number of timesteps.
  --log-interval LOG_INTERVAL
                        Log interval.
  --hist-freq HIST_FREQ
                        Dumps histogram each n steps.
  --eval-episodes EVAL_EPISODES
                        Number of episodes to use for evaluation.
  --save-freq SAVE_FREQ
                        Save the model every n steps (if negative, no
                        checkpoint).
  --save-directory SAVE_DIRECTORY, -sd SAVE_DIRECTORY
                        Save directory.
  --log-directory LOG_DIRECTORY, -ld LOG_DIRECTORY
                        Log directory.
  --seed SEED           Random generator seed.
  --verbose VERBOSE     Verbose mode (0: no output, 1: INFO).
  --no-one-hot          Disable one-hot encoding
  --train               Enable training
  --eval                Enable evaluation
  --extractor EXTRACTOR
                        Change extractor
  --demo                Enable rendering and runs for 1 episode
  --r2048               Show an agent reaching 2048
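For readers who want to see how a few of these flags behave, the sketch below rebuilds a small subset of the interface with `argparse`. It is an illustration derived from the help text above, not the parser in agents/dqn.py. The default values (such as -1 for --save-freq) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild a subset of the flags listed in the help text (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="python agents/dqn.py")
    parser.add_argument("--model-name", "-mn",
                        help="Model name (if it already exists, training will be resumed).")
    # Assumption: checkpointing is off by default (a negative value disables it).
    parser.add_argument("--save-freq", type=int, default=-1,
                        help="Save the model every n steps (if negative, no checkpoint).")
    parser.add_argument("--no-one-hot", action="store_true",
                        help="Disable one-hot encoding")
    parser.add_argument("--train", action="store_true", help="Enable training")
    parser.add_argument("--eval", action="store_true", help="Enable evaluation")
    parser.add_argument("--demo", action="store_true",
                        help="Enable rendering and runs for 1 episode")
    return parser


# Parse the same arguments as the _nohot demo example.
args = build_parser().parse_args(["-mn", "cnn_5l_4_v2_nohot", "--no-one-hot", "--demo"])
```

Boolean switches such as --train and --demo stay False unless they are passed explicitly, which is why the demo example needs no extra flag to skip training.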

Plotting training logs:

To plot all the training logs:

python utils/plot_log_multi.py

To plot a specific training log:

python utils/plot_log.py PATH_TO_LOG

Example:

python utils/plot_log.py logs/cnn_5l4_fc.npz

Authors
