
FelipeMarcelino / 2048-Gym

Licence: other
This project aims to use reinforcement learning algorithms to play the game 2048.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to 2048-Gym

gym-mtsim
A general-purpose, flexible, and easy-to-use simulator alongside an OpenAI Gym trading environment for MetaTrader 5 trading platform (Approved by OpenAI Gym)
Stars: ✭ 196 (+188.24%)
Mutual labels:  openai-gym, gym-environment
rlflow
A TensorFlow-based framework for learning about and experimenting with reinforcement learning algorithms
Stars: ✭ 20 (-70.59%)
Mutual labels:  openai-gym
2048-rs
A rust implementation of the famous 2048 game
Stars: ✭ 48 (-29.41%)
Mutual labels:  2048
mgym
A collection of multi-agent reinforcement learning OpenAI gym environments
Stars: ✭ 41 (-39.71%)
Mutual labels:  gym-environment
2048
🎮 2048 game developed using javascript
Stars: ✭ 53 (-22.06%)
Mutual labels:  2048
2048-go
This is 2048 working on CLI
Stars: ✭ 21 (-69.12%)
Mutual labels:  2048
rl pytorch
Deep Reinforcement Learning Algorithms Implementation in PyTorch
Stars: ✭ 23 (-66.18%)
Mutual labels:  openai-gym
nelua-game2048
Clone of the 2048 game in Nelua using Raylib
Stars: ✭ 16 (-76.47%)
Mutual labels:  2048
cxk-2048-react
🐓 A noisy 2048 game. 一个魔性又吵闹的2048小游戏。
Stars: ✭ 13 (-80.88%)
Mutual labels:  2048
2048.c
CLI version of 2048, written in C.
Stars: ✭ 14 (-79.41%)
Mutual labels:  2048
dqn zoo
The implement of all kinds of dqn reinforcement learning with Pytorch
Stars: ✭ 42 (-38.24%)
Mutual labels:  ddqn
gym-line-follower
Line follower robot simulator environment for Open AI Gym.
Stars: ✭ 46 (-32.35%)
Mutual labels:  openai-gym
Autonomous-Drifting
Autonomous Drifting using Reinforcement Learning
Stars: ✭ 83 (+22.06%)
Mutual labels:  openai-gym
prl
Open-source library for a reinforcement learning research.
Stars: ✭ 53 (-22.06%)
Mutual labels:  openai-gym
stadium
A graphical interface for reinforcement learning and gym-based environments. Integrates tensorboard and various configuration utilities for ease of usage.
Stars: ✭ 26 (-61.76%)
Mutual labels:  gym-environment
2048
🎮 2048 clone (React/TypeScript/Redux). No canvas.
Stars: ✭ 55 (-19.12%)
Mutual labels:  2048
jiminy
Jiminy: a fast and portable Python/C++ simulator of poly-articulated systems with OpenAI Gym interface for reinforcement learning
Stars: ✭ 90 (+32.35%)
Mutual labels:  openai-gym
TF2-RL
Reinforcement learning algorithms implemented for Tensorflow 2.0+ [DQN, DDPG, AE-DDPG, SAC, PPO, Primal-Dual DDPG]
Stars: ✭ 160 (+135.29%)
Mutual labels:  openai-gym
atc-reinforcement-learning
Reinforcement learning for an air traffic control task. OpenAI gym based simulation.
Stars: ✭ 37 (-45.59%)
Mutual labels:  openai-gym
FinRL
FinRL: The first open-source project for financial reinforcement learning. Please star. 🔥
Stars: ✭ 3,497 (+5042.65%)
Mutual labels:  openai-gym

2048-Gym

Agent playing

This repository is a project about using DQN (Q-Learning) to play the game 2048, with the environment accelerated using Numba. The algorithm used is from Stable Baselines, and the environment is a custom OpenAI Gym environment. The environment supports two representations of the board: binary and non-binary. The binary representation uses a power-of-two matrix to represent each tile of the board, while the non-binary representation uses the raw board matrix.
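As a rough sketch of that idea (an illustration only, not the repository's exact encoding function), the binary representation can be viewed as a one-hot encoding of each tile's power-of-two exponent:

import numpy as np

def encode_binary(board, channels=16):
    # One-hot encode a 4x4 board by the power-of-two exponent of each tile.
    # board: 4x4 integer array with raw tile values (0, 2, 4, 8, ...).
    # Returns a 4x4xchannels binary array; the repository's actual layout may differ.
    encoded = np.zeros((4, 4, channels), dtype=np.float32)
    for i in range(4):
        for j in range(4):
            value = board[i, j]
            if value > 0:
                exponent = int(np.log2(value))
                encoded[i, j, exponent] = 1.0
    return encoded

# The non-binary representation simply uses the raw board matrix.
raw_board = np.array([[0, 2, 4, 0],
                      [2, 0, 0, 8],
                      [0, 0, 2, 0],
                      [0, 0, 0, 2]])
print(encode_binary(raw_board).shape)  # (4, 4, 16)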

The model uses two different types of neural networks: a CNN (Convolutional Neural Network) and an MLP (Multi-Layer Perceptron). The agent performed better using the CNN as a feature extractor than the MLP, probably because the CNN can capture spatial features of the board. As a result, the agent reached the 2048 tile in 10% of 1,000 played games.
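For reference, Stable Baselines lets you swap the feature extractor through policy_kwargs; the sketch below only illustrates that mechanism with a hypothetical small-board CNN (the layer sizes and filter counts are assumptions, not the network used in this project):

import tensorflow as tf
from stable_baselines import DQN
from stable_baselines.deepq.policies import CnnPolicy

def small_board_cnn(scaled_images, **kwargs):
    # Hypothetical extractor: 2x2 convolutions suit a 4x4 board better than
    # the default Nature CNN, which expects Atari-sized frames.
    activ = tf.nn.relu
    h = activ(tf.layers.conv2d(scaled_images, filters=128, kernel_size=2, strides=1, padding="valid"))
    h = activ(tf.layers.conv2d(h, filters=128, kernel_size=2, strides=1, padding="valid"))
    h = tf.layers.flatten(h)
    return activ(tf.layers.dense(h, 256))

# env is assumed to be the custom 2048 environment with the binary observation:
# model = DQN(CnnPolicy, env, policy_kwargs=dict(cnn_extractor=small_board_cnn), verbose=1)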

Optuna

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API. Thanks to its define-by-run API, code written with Optuna enjoys high modularity, and the user can dynamically construct the search spaces for the hyperparameters.

There is a guide on how to use this library here.
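A minimal define-by-run objective looks roughly like the sketch below; the hyperparameters and their ranges are placeholders, not the ones tuned in this project:

import optuna

def objective(trial):
    # The search space is built imperatively while the trial runs.
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.90, 0.9999)
    extractor = trial.suggest_categorical("extractor", ["mlp", "cnn"])
    # Here one would train the agent with these hyperparameters and return
    # its evaluation score; a dummy value keeps the sketch runnable.
    return learning_rate * gamma

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=10)
print(study.best_params)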

Numba

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code.

There is a guide on how to use this library here.
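A toy example of the kind of speedup Numba offers (not code from the environment itself):

import numpy as np
from numba import njit

@njit
def board_score(board):
    # Numba compiles this loop-heavy function to machine code on the first
    # call, so repeated calls are much faster than plain Python.
    total = 0
    for i in range(board.shape[0]):
        for j in range(board.shape[1]):
            total += board[i, j]
    return total

board = np.random.randint(0, 2048, size=(4, 4))
print(board_score(board))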

Installation

Install dependencies with pip install -r [requirements_cpu.txt|requirements-gpu.txt], choosing the appropriate file depending on whether you wish to run the models on a CPU or a GPU.

OR

Using a conda environment:

conda env create -f [conda_env_gpu.yml|conda_env_cpu.yml]

To install the environment, execute the following commands:

git clone https://github.com/FelipeMarcelino/2048-Gym/
cd 2048-gym/gym-game2048/
pip install -e .

Running

usage: model_optimize.py [-h] --agent AGENT
                         [--tensorboard-log TENSORBOARD_LOG]
                         [--study-name STUDY_NAME] [--trials TRIALS]
                         [--n-timesteps N_STEPS] [--save-freq SAVE_FREQ]
                         [--save-dir SAVE_DIR] [--log-interval LOG_INTERVAL]
                         [--no-binary] [--seed SEED]
                         [--eval-episodes EVAL_EPISODES]
                         [--extractor EXTRACTOR] [--layer-normalization]
                         [--num-cpus NUM_CPUS] [--layers LAYERS [LAYERS ...]]
                         [--penalty PENALTY] [--load_path LOAD_PATH]
                         [--num_timesteps_log NUM_TIMESTEPS_LOG]

optional arguments:
  -h, --help            show this help message and exit
  --agent AGENT, -ag AGENT
                        Algorithm to use to train the model - DQN, ACER, PPO2
  --tensorboard-log TENSORBOARD_LOG, -tl TENSORBOARD_LOG
                        Tensorboard log directory
  --study-name STUDY_NAME, -sn STUDY_NAME
                        The name of the study used by Optuna to create the
                        database.
  --trials TRIALS, -tr TRIALS
                        The number of trials for Optuna to optimize. 0 is the
                        default setting and runs until the script is finished.
  --n-timesteps N_STEPS, -nt N_STEPS
                        Number of timesteps the model is going to run.
  --save-freq SAVE_FREQ, -sf SAVE_FREQ
                        The interval between model saves.
  --save-dir SAVE_DIR, -sd SAVE_DIR
                        Directory to save models
  --log-interval LOG_INTERVAL, -li LOG_INTERVAL
                        Log interval
  --no-binary, -bi      Do not use binary observation space
  --seed SEED           Seed
  --eval-episodes EVAL_EPISODES, -ee EVAL_EPISODES
                        The number of episodes to test after training the
                        model
  --extractor EXTRACTOR, -ex EXTRACTOR
                        The extractor used to create the features from
                        observation space - (mlp or cnn)
  --layer-normalization, -ln
                        Use layer normalization - Only for DQN
  --num-cpus NUM_CPUS, -nc NUM_CPUS
                        Number of CPUs to use. DQN only accepts 1
  --layers LAYERS [LAYERS ...], -l LAYERS [LAYERS ...]
                        List of neurons to use in the DQN algorithm. The number
                        of elements in the list is the number of layers.
  --penalty PENALTY, -pe PENALTY
                        How much to penalize the model when it chooses an
                        invalid action
  --load_path LOAD_PATH, -lp LOAD_PATH
                        Path to load the model from
  --num_timesteps_log NUM_TIMESTEPS_LOG, -ntl NUM_TIMESTEPS_LOG
                        Continuing timesteps for tensorboard_log
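For example, a hypothetical training run with a DQN agent and the CNN extractor could look like the command below (the argument values are illustrative, not recommended settings):

python model_optimize.py --agent DQN --extractor cnn --study-name dqn_cnn --trials 10 --n-timesteps 1000000 --save-freq 50000 --save-dir ./models/ --num-cpus 1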

Playing

Play the game using a trained agent.

python play_game.py 

Note: it is necessary to set the model path and agent inside play_game.py.
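For reference, loading a saved Stable Baselines model and stepping through one game generally follows the pattern below; the checkpoint path is a placeholder, and the environment is assumed to be the custom 2048 env that play_game.py already constructs:

from stable_baselines import DQN

def play_one_game(env, model_path="models/dqn_cnn_best.zip"):
    # model_path is a placeholder; point it at a checkpoint saved by model_optimize.py.
    model = DQN.load(model_path)
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        # Pick the greedy action from the trained policy.
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward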

Visualization

See the best model's actions using Tkinter.

python show_played_game.py

Note: it is necessary to set the path to the pickled game data inside show_played_game.py.
