AI-Toolbox

Library overview video

This C++ toolbox is aimed at representing and solving common AI problems, implementing an easy-to-use interface that should hopefully be extensible to many problems, while keeping the code readable.

Current development includes MDPs, POMDPs and related algorithms. This toolbox was originally developed taking inspiration from the Matlab MDPToolbox, which you can find here, and from the pomdp-solve software written by A. R. Cassandra, which you can find here.

An excellent introduction to the basics of reinforcement learning can be found freely online in this book.

If you use this toolbox for research, please consider citing our JMLR article:

@article{JMLR:v21:18-402,
  author  = {Eugenio Bargiacchi and Diederik M. Roijers and Ann Now\'{e}},
  title   = {AI-Toolbox: A C++ library for Reinforcement Learning and Planning (with Python Bindings)},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {102},
  pages   = {1-12},
  url     = {http://jmlr.org/papers/v21/18-402.html}
}

Example

// The model can be any custom class that respects a 10-method interface.
auto model = makeTigerProblem();
unsigned horizon = 10; // The horizon of the solution.

// The 0.0 is the convergence parameter. It gives a way to stop the
// computation if the policy has converged before the horizon.
AIToolbox::POMDP::IncrementalPruning solver(horizon, 0.0);

// Solve the model and obtain the optimal value function.
auto [bound, valueFunction] = solver(model);

// We create a policy from the solution to compute the agent's actions.
// The parameters are the size of the model (SxAxO), and the value function.
AIToolbox::POMDP::Policy policy(2, 3, 2, valueFunction);

// We begin a simulation with a uniform belief. We sample from the belief
// in order to get a "real" state for the world, since this code has to
// both emulate the environment and control the agent.
AIToolbox::POMDP::Belief b(2); b << 0.5, 0.5;

// A random engine is needed to sample the starting state from the belief.
std::default_random_engine rand(std::random_device{}());
auto s = AIToolbox::sampleProbability(b.size(), b, rand);

// We sample the first action. The id is to follow the policy tree later.
auto [a, id] = policy.sampleAction(b, horizon);

double totalReward = 0.0; // As an example, we store the overall reward.
for (int t = horizon - 1; t >= 0; --t) {
    // We advance the world one step.
    auto [s1, o, r] = model.sampleSOR(s, a);
    totalReward += r;

    // We select our next action from the observation we got.
    std::tie(a, id) = policy.sampleAction(id, o, t);

    s = s1; // Finally we update the world for the next timestep.
}

Documentation

The latest documentation is available here. Keep in mind that it may not always be 100% up to date with the latest commits, while the one you compile yourself will of course be.

For the Python documentation, type help(AIToolbox) from the interpreter. It shows the exported API for each class, along with any differences in input/output.

Features

Cassandra POMDP Format Parsing

Cassandra's POMDP format is a type of text file that contains a definition of an MDP or POMDP model. You can find some examples here. While it is absolutely not necessary to use this format, and you can define models via code, we do parse a reasonable subset of Cassandra's POMDP format, which allows you to reuse already defined problems with this library. The docs for this are here.
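
For reference, a file in this format is a plain-text description of the model's components. A fragment in the style of the tiger problem might look roughly like the following (incomplete and for illustration only; see Cassandra's documentation for the full syntax):

# Fragment of a model in Cassandra's format (tiger-style problem, incomplete).
discount: 0.95
values: reward
states: tiger-left tiger-right
actions: listen open-left open-right
observations: hear-left hear-right

T: listen
identity

O: listen : tiger-left : hear-left 0.85
O: listen : tiger-left : hear-right 0.15

R: open-left : tiger-left : * : * -100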

Python 2 and 3 Bindings!

The user interface of the library in Python is pretty much the same as what you would get by simply using C++. See the examples folder to see just how much the Python and C++ code resemble each other. Since Python does not support templates, the classes are bound with as many instantiations as possible.

Additionally, the library allows the usage of native Python generative models (where you don't need to specify the transition and reward functions; you only sample the next state and reward). This allows you, for example, to directly use OpenAI Gym environments with minimal code writing.

That said, if you need to customize a specific implementation to make it perform better on your specific use-cases, or if you want to try something completely new, you will have to use C++.

Utilities

The library has an extensive set of utilities which would be too long to enumerate here. In particular, we have utilities for combinatorics, polytopes, linear programming, sampling and distributions, automated statistics, belief updating, many data structures, logging, seeding and much more.

Bandit/Normal Games:

Policies: Exploring Selfish Reinforcement Learning (ESRL), Q-Greedy Policy, Softmax Policy, Linear Reward Penalty, Thompson Sampling (Student-t distribution), Random Policy

Single Agent MDP/Stochastic Games:

Models: Basic Model, Sparse Model, Maximum Likelihood Model, Sparse Maximum Likelihood Model, Thompson Model (Dirichlet + Student-t distributions)
Algorithms: Dyna-Q, Dyna2, Expected SARSA, Hysteretic Q-Learning, Importance Sampling, Linear Programming, Monte Carlo Tree Search (MCTS), Policy Evaluation, Policy Iteration, Prioritized Sweeping, Q-Learning, Double Q-Learning, Q(λ), R-Learning, SARSA(λ), SARSA, Retrace(λ), Tree Backup(λ), Value Iteration
Policies: Basic Policy, Epsilon-Greedy Policy, Softmax Policy, Q-Greedy Policy, PGA-APP, Win or Learn Fast Policy Iteration (WoLF)
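
As an illustration of how the tabular algorithms and policies listed above fit together, here is a minimal sketch (not taken from the library's examples) combining Q-Learning with a Q-greedy policy and epsilon-greedy exploration on a made-up chain environment. The chainStep helper is invented for the example, and the constructor arguments and exact signatures should be checked against the class documentation:

// A minimal, illustrative sketch of tabular Q-Learning with epsilon-greedy
// exploration. The toy chainStep environment is invented for this example.
#include <algorithm>
#include <cstddef>
#include <utility>

#include <AIToolbox/MDP/Algorithms/QLearning.hpp>
#include <AIToolbox/MDP/Policies/EpsilonPolicy.hpp>
#include <AIToolbox/MDP/Policies/QGreedyPolicy.hpp>

// Toy 5-state chain: action 1 moves right, action 0 moves left. Reaching the
// last state gives reward 1 and resets the agent to the start.
std::pair<size_t, double> chainStep(size_t s, size_t a) {
    size_t s1 = (a == 1) ? std::min<size_t>(s + 1, 4) : (s > 0 ? s - 1 : 0);
    const double r = (s1 == 4) ? 1.0 : 0.0;
    if (s1 == 4) s1 = 0;
    return {s1, r};
}

int main() {
    const size_t S = 5, A = 2;

    // Q-Learning with discount 0.9 and learning rate 0.1.
    AIToolbox::MDP::QLearning solver(S, A, 0.9, 0.1);

    // Greedy policy over the learned Q-function, wrapped so that 10% of the
    // actions are taken at random for exploration.
    AIToolbox::MDP::QGreedyPolicy greedy(solver.getQFunction());
    AIToolbox::MDP::EpsilonPolicy explore(greedy, 0.1);

    size_t s = 0;
    for (unsigned step = 0; step < 10000; ++step) {
        const auto a = explore.sampleAction(s);
        const auto [s1, r] = chainStep(s, a);
        solver.stepUpdateQ(s, a, s1, r); // one-step tabular Q-Learning update
        s = s1;
    }
}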

Single Agent POMDP:

Models: Basic Model, Sparse Model
Algorithms: Augmented MDP (AMDP), Blind Strategies, Fast Informed Bound, GapMin, Incremental Pruning, Linear Support, PERSEUS, POMCP with UCB1, Point Based Value Iteration (PBVI), QMDP, Real-Time Belief State Search (RTBSS), SARSOP, Witness, rPOMCP
Policies: Basic Policy

Factored/Joint Multi-Agent:

Bandits:

Not in Python yet.

Algorithms: Max-Plus, Multi-Objective Variable Elimination (MOVE), Upper Confidence Variable Elimination (UCVE), Variable Elimination
Policies: Q-Greedy Policy, Random Policy, Learning with Linear Rewards (LLR), Multi-Agent Upper Confidence Exploration (MAUCE), Multi-Agent Thompson-Sampling (Student-t distribution), Single-Action Policy

MDP:

Not in Python yet.

Models: Cooperative Basic Model, Cooperative Maximum Likelihood Model, Cooperative Thompson Model (Dirichlet + Student-t distributions)
Algorithms: FactoredLP, Multi Agent Linear Programming, Joint Action Learners, Sparse Cooperative Q-Learning, Cooperative Prioritized Sweeping
Policies: All Bandit Policies, Epsilon-Greedy Policy, Q-Greedy Policy

Build Instructions

Dependencies

To build the library you need CMake, the Boost library, the Eigen 3 library and, for the algorithms that use linear programming, the lp_solve library.

In addition, full C++17 support is now required (this means at least g++-7).

Building

Once you have all required dependencies, you can simply execute the following commands from the project's main folder:

mkdir build
cd build/
cmake ..
make

cmake can be called with a series of flags in order to customize the output, if building everything is not desirable. The following flags are available:

CMAKE_BUILD_TYPE # Defines the build type
MAKE_ALL         # Builds everything there is to build in the project, except the Python bindings.
MAKE_LIB         # Builds the whole core C++ libraries (MDP, POMDP, etc..)
MAKE_MDP         # Builds only the core C++ MDP library
MAKE_FMDP        # Builds only the core C++ Factored/Multi-Agent and MDP libraries
MAKE_POMDP       # Builds only the core C++ POMDP and MDP libraries
MAKE_TESTS       # Builds the library's tests for the compiled core libraries
MAKE_EXAMPLES    # Builds the library's examples using the compiled core libraries
MAKE_PYTHON      # Builds Python bindings for the compiled core libraries
PYTHON_VERSION   # Selects the Python version you want (2 or 3). If not
                 # specified, we try to guess based on your default interpreter.

These flags can be combined as needed. For example:

# Will build MDP and MDP Python 3 bindings
cmake -DCMAKE_BUILD_TYPE=Debug -DMAKE_MDP=1 -DMAKE_PYTHON=1 -DPYTHON_VERSION=3 ..

The default flags when nothing is specified are MAKE_ALL and CMAKE_BUILD_TYPE=Release.

Note that by default MAKE_ALL does not build the Python bindings, as they have a minor performance hit on the C++ static libraries. You can easily enable them by using the flag MAKE_PYTHON.

The static library files will be available directly in the build directory. Three separate libraries are built: AIToolboxMDP, AIToolboxPOMDP and AIToolboxFMDP. In case you want to link against either the POMDP library or the Factored MDP library, you will also need to link against the MDP one, since both of them use MDP functionality.

A number of small tests are included which you can find in the test/ folder. You can execute them after building the project using the following command directly from the build directory, just after you finish make:

ctest

The tests also offer a brief introduction to the framework, pending a more complete descriptive write-up. Only the tests for the parts of the library that you compiled will be built.

To compile the library's documentation you need Doxygen. To use it, it is sufficient to execute the following command from the project's root folder:

doxygen

After that the documentation will be generated into an html folder in the main directory.

Compiling a Program

To compile a program that uses this library, simply link it against the compiled libraries you need, and possibly to the lp_solve libraries (if using POMDP or FMDP).

Please note that since both POMDP and FMDP libraries rely on the MDP code, you MUST specify those libraries before the MDP library when linking, otherwise it may result in undefined reference errors. The POMDP and Factored MDP libraries are not currently dependent on each other so their order does not matter.
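
For example, a link line could look roughly like the following (the paths and the lp_solve library name are placeholders for your own setup):

g++ -std=c++17 my_program.cpp -I /path/to/AI-Toolbox/include -L /path/to/AI-Toolbox/build -lAIToolboxPOMDP -lAIToolboxMDP -llpsolve55 -o my_program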

For Python, you just need to import the AIToolbox.so module, and you will be able to use the classes as exported to Python. All classes are documented, and you can run the following from the Python CLI

help(AIToolbox.MDP)
help(AIToolbox.POMDP)

to see the documentation for each specific class.
