namoshizun / PyPOMDP

Licence: other
Python implementation of POMDP framework and PBVI & POMCP algorithms.

POMDP Solvers

An educational project with modules for building a POMDP (Partially Observable Markov Decision Process) model and for implementing and running POMDP solver algorithms. This package was developed during my bachelor thesis to help study POMDPs and their solvers.

Installation

Requires Python 3.5.2 or later. To install the dependencies, run:

pip install -r requirements.txt

How it works

POMDP Environment

To make it easier to construct a POMDP environment, the POMDP file grammar is used to encode the environment dynamics. Example environments can be found in the 'environments' folder. You can also create a new one, as long as it complies with the POMDP file conventions.

  • RockSample-7x8.POMDP: Semantic explanation can be found here
  • Tiger-2D.POMDP: Standard Tiger Problem.
  • Tiger-3D.POMDP: 3-door version of the Tiger Problem. The main difference is that, if the agent does not want to open a door, it must choose one of the doors to listen at. The observation it receives then depends on how far the tiger is from the door the agent puts its ear against.
  • GridWorld.POMDP:
    • A simple 2D grid environment where the agent can only move left, move right, or halt at its current position. The rewarding states are the two end states, and attempting to move off the edge of the grid incurs a penalty.
    • A much more general 2D grid world environment can be generated using /environments/grid_world_maker.py. Check out /environments/grid_world_example.py for how to use it.
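
To make concrete what such an environment file encodes, here is a minimal sketch of the two-door Tiger problem as plain Python dictionaries, together with the Bayesian belief update that a solver runs on top of it. The dictionary layout and the names `STATES`, `T`, `O` and `update_belief` are illustrative assumptions, not PyPOMDP's internal representation, and the 0.85 listening accuracy is the value commonly used for the Tiger problem rather than one read from the bundled file.

```python
# Hypothetical in-memory form of the two-door Tiger problem.
# These names are illustrative and are not taken from PyPOMDP.
STATES = ["tiger-left", "tiger-right"]

# T[action][state][next_state] = P(next_state | state, action)
# Listening does not move the tiger.
T = {
    "listen": {
        "tiger-left": {"tiger-left": 1.0, "tiger-right": 0.0},
        "tiger-right": {"tiger-left": 0.0, "tiger-right": 1.0},
    }
}

# O[action][next_state][observation] = P(observation | action, next_state)
# Assumed listening accuracy of 0.85.
O = {
    "listen": {
        "tiger-left": {"hear-left": 0.85, "hear-right": 0.15},
        "tiger-right": {"hear-left": 0.15, "hear-right": 0.85},
    }
}


def update_belief(belief, action, observation):
    """Bayes filter: b'(s') is proportional to O(o|a,s') * sum_s T(s'|s,a) * b(s)."""
    unnormalized = {}
    for s_next in STATES:
        predicted = sum(T[action][s][s_next] * belief[s] for s in STATES)
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Start from a uniform prior and listen once.
belief = {"tiger-left": 0.5, "tiger-right": 0.5}
belief = update_belief(belief, "listen", "hear-left")
print(belief)
```

After a single "hear-left" observation the belief in "tiger-left" rises from 0.5 to about 0.85, which is why listening repeatedly before opening a door pays off in this problem.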

POMDP Solvers

This package implements PBVI (Point-Based Value Iteration) and POMCP (Partially Observable Monte Carlo Planning). Variable names follow the notation used in the original papers, so reading the papers first is encouraged.
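
As a reading aid for the PBVI paper, the following is a minimal sketch of a single point-based backup at one belief point, written in plain Python. It follows the usual textbook formulation of the backup and is not PyPOMDP's code: the function name `point_based_backup`, the list-based layout of `T`, `O` and `R`, and the Tiger numbers used in the demo are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma):
    """Return the alpha vector produced by one backup at belief b.

    Assumed layout: T[a][s][s2] = P(s2 | s, a), O[a][s2][o] = P(o | a, s2),
    R[a][s] = immediate reward, alphas = current list of alpha vectors.
    """
    n_states = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])
        for o in range(len(O[a][0])):
            # Project every alpha vector back through (a, o).
            projected = [
                [gamma * sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                             for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            # Keep the projection that is best at this belief point.
            best = max(projected, key=lambda g: dot(g, b))
            vec = [x + y for x, y in zip(vec, best)]
        val = dot(vec, b)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


# Two-door Tiger demo. States: 0 = tiger-left, 1 = tiger-right.
# Actions: 0 = listen, 1 = open-left, 2 = open-right.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # listening leaves the tiger in place
    [[0.5, 0.5], [0.5, 0.5]],  # opening a door resets the problem
    [[0.5, 0.5], [0.5, 0.5]],
]
O = [
    [[0.85, 0.15], [0.15, 0.85]],  # listening is right 85% of the time
    [[0.5, 0.5], [0.5, 0.5]],      # opening a door is uninformative
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [
    [-1.0, -1.0],     # listen
    [-100.0, 10.0],   # open-left
    [10.0, -100.0],   # open-right
]

alphas = [[0.0, 0.0]]  # zero-initialised value function
uncertain = point_based_backup([0.5, 0.5], alphas, T, O, R, gamma=0.95)
certain = point_based_backup([0.0, 1.0], alphas, T, O, R, gamma=0.95)
print(uncertain)  # [-1.0, -1.0]
print(certain)    # [-100.0, 10.0]
```

With a zero-initialised value function the backup reduces to picking the best immediate reward: listening at the uniform belief, and opening the left door when the tiger is known to be on the right. A full PBVI solver would repeat this backup over a set of belief points and periodically expand that set.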

Solver algorithms extend the blueprint class 'POMDP' and are managed by the PomdpRunner. The runner class reads the algorithm configuration from the 'configs' folder, creates the environment model, and uses both to create the actual POMDP solver.
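
The runner pattern described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual PomdpRunner: the `Solver` base class stands in for the blueprint class, and the `SOLVERS` registry, the `create_solver` function and the `max_particles` parameter are hypothetical names invented for this example.

```python
import json
import os
import tempfile


class Solver:
    """Stand-in for the blueprint class that every solver extends (hypothetical)."""

    def __init__(self, model, **params):
        self.model = model
        self.params = params


class PBVI(Solver):
    pass


class POMCP(Solver):
    pass


# Hypothetical registry mapping a config name to a solver class.
SOLVERS = {"pbvi": PBVI, "pomcp": POMCP}


def create_solver(config_name, model, config_dir="configs"):
    """Read <config_dir>/<config_name>.json and build the matching solver."""
    path = os.path.join(config_dir, config_name + ".json")
    with open(path) as f:
        params = json.load(f)
    return SOLVERS[config_name](model, **params)


# Demo with a throwaway config directory; the parameter name is made up.
with tempfile.TemporaryDirectory() as config_dir:
    with open(os.path.join(config_dir, "pomcp.json"), "w") as f:
        json.dump({"max_particles": 200}, f)
    solver = create_solver("pomcp", model="Tiger-3D", config_dir=config_dir)

print(type(solver).__name__, solver.params)
```

Keeping the algorithm parameters in JSON files means a new solver only needs a class and a config file, while the code that builds the environment model stays unchanged.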

How to run it

usage: main.py [-h] [--env ENV] [--budget BUDGET] [--snapshot SNAPSHOT]
               [--logfile LOGFILE] [--random_prior RANDOM_PRIOR]
               [--max_play MAX_PLAY]
               config

Solve pomdp

positional arguments:
  config                The file name of algorithm configuration (without JSON
                        extension)

optional arguments:
  -h, --help            show this help message and exit
  --env ENV             The name of environment's config file
  --budget BUDGET       The total action budget (default to inf)
  --snapshot SNAPSHOT   Whether to snapshot the belief tree after each episode
  --logfile LOGFILE     Logfile path
  --random_prior RANDOM_PRIOR
                        Whether or not to use a randomly generated
                        distribution as prior belief, default to False
  --max_play MAX_PLAY   Maximum number of play steps (episodes)

* Example usage:
> python main.py pomcp --env Tiger-3D.POMDP --budget 10
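
For readers who want to script runs, the command line above could be reproduced with an argparse parser along these lines. This is a reconstruction from the help text only, not the project's main.py: the argument types, the `str2bool` helper and every default other than those stated in the help text are assumptions.

```python
import argparse


def str2bool(text):
    """Interpret common spellings of true (an assumed convention)."""
    return text.strip().lower() in ("1", "true", "yes", "y")


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py", description="Solve pomdp")
    parser.add_argument(
        "config",
        help="file name of the algorithm configuration (without JSON extension)")
    parser.add_argument("--env", help="name of the environment's config file")
    parser.add_argument("--budget", type=float, default=float("inf"),
                        help="total action budget")
    parser.add_argument("--snapshot", type=str2bool, default=False,
                        help="snapshot the belief tree after each episode")
    parser.add_argument("--logfile", help="logfile path")
    parser.add_argument("--random_prior", type=str2bool, default=False,
                        help="use a randomly generated prior belief")
    parser.add_argument("--max_play", type=int,
                        help="maximum number of play steps (episodes)")
    return parser


# Parse the example invocation from an explicit list instead of sys.argv.
args = build_parser().parse_args(
    ["pomcp", "--env", "Tiger-3D.POMDP", "--budget", "10"])
print(args.config, args.env, args.budget)
```

Parsing an explicit argument list, as in the last lines, makes it easy to drive several configurations from a single script without touching the shell.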

Improvements

  • Use POMDPX instead of POMDP file grammar. POMDPX is a much more concise grammar for defining a POMDP environment.
  • PomdpParser carries too much responsibility and needs to be refactored.
  • Configuration implementation still looks a bit messy.