
wenkesj / Holdem

🃏 OpenAI Gym No Limit Texas Hold 'em Environment for Reinforcement Learning

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Holdem

Reinforcement learning
Implementation of selected reinforcement learning algorithms in Tensorflow. A3C, DDPG, REINFORCE, DQN, etc.
Stars: ✭ 132 (-2.22%)
Mutual labels:  reinforcement-learning, openai-gym
Ravens
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
Stars: ✭ 133 (-1.48%)
Mutual labels:  reinforcement-learning, openai-gym
Treeqn
Stars: ✭ 77 (-42.96%)
Mutual labels:  reinforcement-learning, openai-gym
Async Deeprl
Playing Atari games with TensorFlow implementation of Asynchronous Deep Q-Learning
Stars: ✭ 44 (-67.41%)
Mutual labels:  reinforcement-learning, openai-gym
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (-17.04%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Minigrid
Minimalistic gridworld package for OpenAI Gym
Stars: ✭ 1,047 (+675.56%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Electric Motor
Gym Electric Motor (GEM): An OpenAI Gym Environment for Electric Motors
Stars: ✭ 95 (-29.63%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Panda
An OpenAI Gym Env for Panda
Stars: ✭ 29 (-78.52%)
Mutual labels:  reinforcement-learning, openai-gym
Cartpole
OpenAI's cartpole env solver.
Stars: ✭ 107 (-20.74%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Ignition
Framework for developing OpenAI Gym robotics environments simulated with Ignition Gazebo
Stars: ✭ 97 (-28.15%)
Mutual labels:  reinforcement-learning, openai-gym
Deterministic Gail Pytorch
PyTorch implementation of Deterministic Generative Adversarial Imitation Learning (GAIL) for Off Policy learning
Stars: ✭ 44 (-67.41%)
Mutual labels:  reinforcement-learning, openai-gym
Hierarchical Actor Critic Hac Pytorch
PyTorch implementation of Hierarchical Actor Critic (HAC) for OpenAI gym environments
Stars: ✭ 116 (-14.07%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Gridworlds
Gridworld environments for OpenAI gym.
Stars: ✭ 43 (-68.15%)
Mutual labels:  reinforcement-learning, openai-gym
Dmc2gym
OpenAI Gym wrapper for the DeepMind Control Suite
Stars: ✭ 75 (-44.44%)
Mutual labels:  reinforcement-learning, openai-gym
Rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Stars: ✭ 980 (+625.93%)
Mutual labels:  reinforcement-learning, openai-gym
Cs234 Reinforcement Learning Winter 2019
My Solutions of Assignments of CS234: Reinforcement Learning Winter 2019
Stars: ✭ 93 (-31.11%)
Mutual labels:  reinforcement-learning, openai-gym
Rl Baselines Zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Stars: ✭ 839 (+521.48%)
Mutual labels:  reinforcement-learning, openai-gym
Gym Dart
OpenAI Gym environments using DART
Stars: ✭ 20 (-85.19%)
Mutual labels:  reinforcement-learning, openai-gym
Openaigym
Solving OpenAI Gym problems.
Stars: ✭ 98 (-27.41%)
Mutual labels:  reinforcement-learning, openai-gym
Stable Baselines
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Stars: ✭ 115 (-14.81%)
Mutual labels:  reinforcement-learning, openai-gym

holdem

⚠️ This is an experimental API; it will most definitely contain bugs, but that's why you are here!

pip install holdem

As far as I know, this is the first OpenAI Gym No-Limit Texas Hold'em* (NLTH) environment written in Python. It's an experiment to build a Gym environment that is synchronous and can support any number of players, but that also appeals to the general public that wants to learn how to "solve" NLTH.

*Python 3 supports arbitrary length integers 💸

Right now, this is a work in progress, but I believe the API is mature enough for some preliminary experiments. Join me in making some interesting progress on multi-agent Gym environments.

Usage

There is limited documentation at the moment. I'll try to make this less painful to understand.

env = holdem.TexasHoldemEnv(n_seats, max_limit=1e9, debug=False)

Creates a gym environment representing a NLTH table from the parameters:

  • n_seats - the number of seats available at the table. No players are initially allocated to the table; you must call env.add_player(seat_id, ...) to populate it.
  • max_limit - used to define the bounds of the gym.spaces API for the class (in particular gym.spaces.Discrete); it does not impose any actual NLTH betting limit.
  • debug - adds debug print statements during play; will probably be removed in the future.
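
For example, a minimal three-seat table could be created like the following sketch (nothing is seated yet; env.n_seats is the same attribute used in the example further below):

import holdem

# a three-seat NLTH table; max_limit only bounds the gym.spaces definitions
env = holdem.TexasHoldemEnv(3, max_limit=1e9, debug=False)
print(env.n_seats)  # 3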

env.add_player(seat_id, stack=2000)

Adds a player to the table at the specified seat (seat_id) with the initial amount of chips (stack) allocated to the player's stack. If the table does not have a matching seat, given the n_seats passed to the constructor, a gym.error.Error will be raised.
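
As a quick sketch of that error behavior (the two-seat setup here is just for illustration), seating a player outside the range allowed by n_seats should raise gym.error.Error:

import gym
import holdem

env = holdem.TexasHoldemEnv(2)     # a two-seat table
env.add_player(0, stack=2000)      # seat 0
env.add_player(1, stack=2000)      # seat 1
try:
    env.add_player(2, stack=2000)  # there is no seat 2 on a two-seat table
except gym.error.Error as exc:
    print('table is full:', exc)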

(player_states, community_states) = env.reset()

Calling env.reset resets the NLTH table to a new hand state. It does not reset any of the players' stacks or the blinds; that behavior is reserved for a future part of the API, yet another feature that is not standard in Gym environments and is still a work in progress.

The observation returned is a tuple with the following entries, by index (a short sketch for reading these fields follows the list):

  1. player_states - a tuple where each entry is tuple(player_info, player_hand); all states and hands can be gathered with (player_infos, player_hands) = zip(*player_states).
    • player_infos - a list of int features describing the individual player. It contains the following by index:
      0. [0, 1] - 0 - seat is empty, 1 - seat is not empty.
      1. [0, n_seats - 1] - player's id, where they are sitting.
      2. [0, inf] - player's current stack.
      3. [0, 1] - player is playing the current hand.
      4. [0, inf] - the player's current hand rank according to treys.Evaluator.evaluate(hand, community).
      5. [0, 1] - 0 - player has not played this round, 1 - player has played this round.
      6. [0, 1] - 0 - player is currently not betting, 1 - player is betting.
      7. [0, 1] - 0 - player is currently not all-in, 1 - player is all-in.
      8. [0, inf] - player's last side pot.
    • player_hands - a list of int features describing the cards in the player's pocket. The values are encoded based on the treys.Card integer representation.
  2. community_states - a tuple(community_infos, community_cards) where:
    • community_infos - a list of int features by index:
      0. [0, n_seats - 1] - location of the dealer button, where the big blind is posted.
      1. [0, inf] - the current small blind amount.
      2. [0, inf] - the current big blind amount.
      3. [0, inf] - the current total amount in the community pot.
      4. [0, inf] - the last posted raise amount.
      5. [0, inf] - the minimum required raise amount, if above 0.
      6. [0, inf] - the amount required to call.
      7. [0, n_seats - 1] - the current player required to take an action.
    • community_cards - a list of int features describing the community cards. The values are encoded based on the treys.Card integer representation. There are 5 ints in the list, where -1 represents that no card is present.
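
To make these indices concrete, here is a small sketch that reads a fresh observation. It assumes only the field layout documented above and uses the treys library (referenced above for the card encoding) to pretty-print card integers:

import holdem
from treys import Card

env = holdem.TexasHoldemEnv(2)
env.add_player(0, stack=2000)
env.add_player(1, stack=2000)

(player_states, (community_infos, community_cards)) = env.reset()
(player_infos, player_hands) = zip(*player_states)

# index 7 of community_infos is the seat that must act; index 6 is the amount to call
print('to act:', community_infos[7], 'to call:', community_infos[6])

# per player: index 1 is the seat id, index 2 is the current stack
for info, hand in zip(player_infos, player_hands):
  pocket = [Card.int_to_pretty_str(c) for c in hand if c != -1]
  print('seat', info[1], 'stack', info[2], 'pocket', pocket)

# -1 marks an empty community card slot (all five are empty preflop)
board = [Card.int_to_pretty_str(c) for c in community_cards if c != -1]
print('board:', board)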

Example

import gym
import holdem

def play_out_hand(env, n_seats):
  # reset environment, gather relevant observations
  (player_states, (community_infos, community_cards)) = env.reset()
  (player_infos, player_hands) = zip(*player_states)

  # display the table, cards and all
  env.render(mode='human')

  terminal = False
  while not terminal:
    # play safe actions: check when no one else has raised, call when raised.
    actions = holdem.safe_actions(community_infos, n_seats=n_seats)
    (player_states, (community_infos, community_cards)), rews, terminal, info = env.step(actions)
    env.render(mode='human')

env = gym.make('TexasHoldem-v1') # holdem.TexasHoldemEnv(2)

# start with 2 players
env.add_player(0, stack=2000) # add a player to seat 0 with 2000 "chips"
env.add_player(1, stack=2000) # add another player to seat 1 with 2000 "chips"
# play out a hand
play_out_hand(env, env.n_seats)

# add one more player
env.add_player(2, stack=2000) # add another player to seat 2 with 2000 "chips"
# play out another hand
play_out_hand(env, env.n_seats)