tensorforce / Tensorforce

License: Apache-2.0
Tensorforce: a TensorFlow library for applied reinforcement learning

Programming Languages

python
139,335 projects - #7 most used programming language
shell
77,523 projects

Projects that are alternatives of or similar to Tensorforce

Drq
DrQ: Data regularized Q
Stars: ✭ 268 (-91.25%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, control
Deepdrive
Deepdrive is a simulator that allows anyone with a PC to push the state-of-the-art in self-driving.
Stars: ✭ 628 (-79.49%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, control
Gym Gazebo2
gym-gazebo2 is a toolkit for developing and comparing reinforcement learning algorithms using ROS 2 and Gazebo
Stars: ✭ 257 (-91.61%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Gam
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).
Stars: ✭ 227 (-92.59%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Machine Learning Uiuc
🖥️ CS446: Machine Learning in Spring 2018, University of Illinois at Urbana-Champaign
Stars: ✭ 233 (-92.39%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Rad
RAD: Reinforcement Learning with Augmented Data
Stars: ✭ 268 (-91.25%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Drl4recsys
Courses on Deep Reinforcement Learning (DRL) and DRL papers for recommender systems
Stars: ✭ 196 (-93.6%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deep Rl Trading
Playing idealized trading games with deep reinforcement learning
Stars: ✭ 228 (-92.55%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Gym Pybullet Drones
PyBullet Gym environments for single and multi-agent reinforcement learning of quadcopter control
Stars: ✭ 168 (-94.51%)
Mutual labels:  reinforcement-learning, control
Learning To Communicate Pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Stars: ✭ 236 (-92.29%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Roboleague
A car soccer environment inspired by Rocket League for deep reinforcement learning experiments in an adversarial self-play setting.
Stars: ✭ 236 (-92.29%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Reinforcementlearning.jl
A reinforcement learning package for Julia
Stars: ✭ 192 (-93.73%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Naf Tensorflow
"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
Stars: ✭ 192 (-93.73%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (-14.04%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (-94.32%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Applied Reinforcement Learning
Reinforcement Learning and Decision Making tutorials explained at an intuitive level and with Jupyter Notebooks
Stars: ✭ 229 (-92.52%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (-6.5%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Accel Brain Code
The purpose of this repository is to provide prototypes as case studies in the context of proof of concept (PoC) and research and development (R&D) that I have written about on my website. The main research topics are auto-encoders in relation to representation learning, statistical machine learning for energy-based models, generative adversarial networks (GANs), deep reinforcement learning such as deep Q-networks, semi-supervised learning, and neural network language models for natural language processing.
Stars: ✭ 166 (-94.58%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
2048 Deep Reinforcement Learning
Trained a convolutional neural network to play 2048 using deep reinforcement learning
Stars: ✭ 169 (-94.48%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (-92.39%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning

Tensorforce: a TensorFlow library for applied reinforcement learning

Introduction

Tensorforce is an open-source deep reinforcement learning framework, with an emphasis on a modular, flexible library design and straightforward usability for applications in research and practice. Tensorforce is built on top of Google's TensorFlow framework and requires Python 3.

Tensorforce follows a set of high-level design choices which differentiate it from other similar libraries:

  • Modular component-based design: Feature implementations, above all, strive to be as generally applicable and configurable as possible, potentially at the cost of faithfully reproducing every detail of the paper that introduced them.
  • Separation of RL algorithm and application: Algorithms are agnostic to the type and structure of inputs (states/observations) and outputs (actions/decisions), as well as to the interaction with the application environment (see the sketch after this list).
  • Full-on TensorFlow models: The entire reinforcement learning logic, including control flow, is implemented in TensorFlow, to enable portable computation graphs independent of application programming language, and to facilitate the deployment of models.
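
For instance, reflecting the second design choice above, an agent can be created directly from state/action specifications instead of an environment object. A minimal sketch, assuming the specification-based Agent.create interface; the shapes and values below are illustrative, not from the original text:

from tensorforce import Agent

# Hypothetical application: 8-dimensional float observations, 4 discrete actions
agent = Agent.create(
    agent='ppo',
    states=dict(type='float', shape=(8,)),
    actions=dict(type='int', num_values=4),
    max_episode_timesteps=100,
    batch_size=10
)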

Installation

A stable version of Tensorforce is periodically published on PyPI and can be installed as follows:

pip3 install tensorforce

To always use the latest version of Tensorforce, install the GitHub version instead:

git clone https://github.com/tensorforce/tensorforce.git
pip3 install -e tensorforce

Note on installation on M1 Macs: At the moment TensorFlow, which is a core dependency of Tensorforce, cannot be installed on M1 Macs directly. Follow the "M1 Macs" section in the documentation for a workaround.

Environments require additional packages, for which setup options are available (ale, gym, retro, vizdoom, carla; or envs for all environments); however, some environments require additional tools to be installed separately (see the environments documentation). Other setup options include tfa for TensorFlow Addons and tune for HpBandSter, which is required by the tune.py script.
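
For example, the Gym dependencies can be installed via the corresponding setup option (extras names as listed above):

pip3 install tensorforce[gym]

# or all environment dependencies at once:
pip3 install tensorforce[envs]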

Note on GPU usage: Unlike (un)supervised deep learning, RL does not always benefit from running on a GPU, depending on the environment and agent configuration. In particular for environments with low-dimensional state spaces (i.e., no images), it can be worth running on CPU only.
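
One way to force CPU-only execution is to hide the GPUs at the TensorFlow level before creating any agent; a sketch using a generic TensorFlow setting rather than a Tensorforce-specific option:

import tensorflow as tf

# Make no GPU devices visible to TensorFlow, so all ops run on CPU
tf.config.set_visible_devices([], 'GPU')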

Quickstart example code

from tensorforce import Agent, Environment

# Pre-defined or custom environment
environment = Environment.create(
    environment='gym', level='CartPole', max_episode_timesteps=500
)

# Instantiate a Tensorforce agent
agent = Agent.create(
    agent='tensorforce',
    environment=environment,  # alternatively: states, actions, (max_episode_timesteps)
    memory=10000,
    update=dict(unit='timesteps', batch_size=64),
    optimizer=dict(type='adam', learning_rate=3e-4),
    policy=dict(network='auto'),
    objective='policy_gradient',
    reward_estimation=dict(horizon=20)
)

# Train for 300 episodes
for _ in range(300):

    # Initialize episode
    states = environment.reset()
    terminal = False

    while not terminal:
        # Episode timestep
        actions = agent.act(states=states)
        states, terminal, reward = environment.execute(actions=actions)
        agent.observe(terminal=terminal, reward=reward)
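
# Hedged addition, not part of the original example: evaluate the trained
# agent for a few episodes without further updates; the independent and
# deterministic flags for act-only mode are assumptions about the agent API.
sum_rewards = 0.0
for _ in range(10):
    states = environment.reset()
    terminal = False
    while not terminal:
        actions = agent.act(states=states, independent=True, deterministic=True)
        states, terminal, reward = environment.execute(actions=actions)
        sum_rewards += reward
print('Mean evaluation reward:', sum_rewards / 10)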

agent.close()
environment.close()

Command line usage

Tensorforce comes with a range of example configurations for different popular reinforcement learning environments. For instance, to run Tensorforce's implementation of the popular Proximal Policy Optimization (PPO) algorithm on the OpenAI Gym CartPole environment, execute the following line:

python3 run.py --agent benchmarks/configs/ppo.json --environment gym \
    --level CartPole-v1 --episodes 100

For more information, check out the documentation.
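
The same run can also be launched programmatically via the Runner utility; a brief sketch, assuming the Runner interface from the tensorforce.execution module (argument names may differ between versions):

from tensorforce.execution import Runner

# Mirror the command-line invocation above
runner = Runner(
    agent='benchmarks/configs/ppo.json',
    environment=dict(environment='gym', level='CartPole-v1')
)
runner.run(num_episodes=100)
runner.close()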

Features

  • Network layers: Fully-connected, 1- and 2-dimensional convolutions, embeddings, pooling, RNNs, dropout, normalization, and more; plus support for Keras layers (see the configuration sketch after this list).
  • Network architecture: Support for multi-state inputs and layer (block) reuse, simple definition of directed acyclic graph structures via register/retrieve layer, plus support for arbitrary architectures.
  • Memory types: Simple batch buffer memory, random replay memory.
  • Policy distributions: Bernoulli distribution for boolean actions, categorical distribution for (finite) integer actions, Gaussian distribution for continuous actions, Beta distribution for range-constrained continuous actions, multi-action support.
  • Reward estimation: Configuration options for estimation horizon, future reward discount, state/state-action/advantage estimation, and for whether to consider terminal and horizon states.
  • Training objectives: (Deterministic) policy gradient, state-(action-)value approximation.
  • Optimization algorithms: Various gradient-based optimizers provided by TensorFlow, such as Adam, AdaDelta, and RMSProp; an evolutionary optimizer; a natural-gradient-based optimizer; plus a range of meta-optimizers.
  • Exploration: Randomized actions, sampling temperature, variable noise.
  • Preprocessing: Clipping, deltafier, sequence, image processing.
  • Regularization: L2 and entropy regularization.
  • Execution modes: Parallelized execution of multiple environments based on Python's multiprocessing and socket modules.
  • Optimized act-only SavedModel extraction.
  • TensorBoard support.
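
To illustrate this modularity, the quickstart agent above could swap the 'auto' network for an explicit layer-wise definition and add an exploration setting; a sketch with illustrative layer sizes and values:

from tensorforce import Agent

# Assumes the `environment` object from the quickstart example
agent = Agent.create(
    agent='tensorforce',
    environment=environment,
    memory=10000,
    update=dict(unit='timesteps', batch_size=64),
    optimizer=dict(type='adam', learning_rate=3e-4),
    # Explicit network: two fully-connected layers instead of 'auto'
    policy=dict(network=[
        dict(type='dense', size=64, activation='tanh'),
        dict(type='dense', size=64, activation='tanh')
    ]),
    objective='policy_gradient',
    reward_estimation=dict(horizon=20),
    # Randomized-action exploration with probability 0.1
    exploration=0.1
)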

By combining these modular components in different ways, a variety of popular deep reinforcement learning models/features can be replicated.

Note that, in general, the replication is not 100% faithful, since the models as described in the corresponding papers often involve additional minor tweaks and modifications which are hard to support with a modular design (and whether it is important or desirable to support them is arguably questionable). On the upside, these models are just a few examples from the multitude of module combinations supported by Tensorforce.

Environment adapters

  • Arcade Learning Environment, a simple object-oriented framework that allows researchers and hobbyists to develop AI agents for Atari 2600 games.
  • CARLA, an open-source simulator for autonomous driving research.
  • OpenAI Gym, a toolkit for developing and comparing reinforcement learning algorithms which supports teaching agents everything from walking to playing games like Pong or Pinball.
  • OpenAI Retro, which lets you turn classic video games into Gym environments for reinforcement learning and comes with integrations for ~1000 games.
  • OpenSim, reinforcement learning with musculoskeletal models.
  • PyGame Learning Environment, a learning environment which allows a quick start to reinforcement learning in Python.
  • ViZDoom, which allows developing AI bots that play Doom using only visual information.
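
Beyond these adapters, applications can implement the Environment interface directly. A minimal sketch of a custom environment, based on the documented interface; the toy dynamics below are invented for illustration:

from tensorforce import Environment
import numpy as np

class CustomEnvironment(Environment):
    """Toy task: drive a scalar state toward zero."""

    def states(self):
        return dict(type='float', shape=(1,))

    def actions(self):
        return dict(type='float', shape=(1,), min_value=-1.0, max_value=1.0)

    def reset(self):
        self._state = np.random.uniform(-1.0, 1.0, size=(1,))
        return self._state

    def execute(self, actions):
        # Move the state by a fraction of the action; reward closeness to zero
        self._state = self._state + 0.1 * actions
        reward = -abs(float(self._state[0]))
        return self._state, False, reward

environment = Environment.create(
    environment=CustomEnvironment, max_episode_timesteps=100
)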

Support, feedback and donating

Please get in touch via mail or on Gitter if you have questions, feedback, ideas for features/collaboration, or if you seek support for applying Tensorforce to your problem.

If you want to support the Tensorforce core team (see below), please also consider donating: GitHub Sponsors or Liberapay.

Core team and contributors

Tensorforce is currently developed and maintained by Alexander Kuhnle.

Earlier versions of Tensorforce (<= 0.4.2) were developed by Michael Schaarschmidt, Alexander Kuhnle and Kai Fricke.

The advanced parallel execution functionality was originally contributed by Jean Rabault (@jerabaul29) and Vincent Belus (@vbelus). Moreover, the pretraining feature was largely developed in collaboration with Hongwei Tang (@thw1021) and Jean Rabault (@jerabaul29).

The CARLA environment wrapper is currently developed by Luca Anzalone (@luca96).

We are very grateful for our open-source contributors (listed according to GitHub, updated periodically):

Islandman93, sven1977, Mazecreator, wassname, lefnire, daggertye, trickmeyer, mkempers, mryellow, ImpulseAdventure, janislavjankov, andrewekhalel, HassamSheikh, skervim, beflix, coord-e, benelot, tms1337, vwxyzjn, erniejunior, Deathn0t, petrbel, nrhodes, batu, yellowbee686, tgianko, AdamStelmaszczyk, BorisSchaeling, christianhidber, Davidnet, ekerazha, gitter-badger, kborozdin, Kismuz, mannsi, milesmcc, nagachika, neitzal, ngoodger, perara, sohakes, tomhennigan.

Cite Tensorforce

Please cite the framework as follows:

@misc{tensorforce,
  author       = {Kuhnle, Alexander and Schaarschmidt, Michael and Fricke, Kai},
  title        = {Tensorforce: a TensorFlow library for applied reinforcement learning},
  howpublished = {Web page},
  url          = {https://github.com/tensorforce/tensorforce},
  year         = {2017}
}

If you use the parallel execution functionality, please additionally cite it as follows:

@article{rabault2019accelerating,
  title        = {Accelerating deep reinforcement learning strategies of flow control through a multi-environment approach},
  author       = {Rabault, Jean and Kuhnle, Alexander},
  journal      = {Physics of Fluids},
  volume       = {31},
  number       = {9},
  pages        = {094105},
  year         = {2019},
  publisher    = {AIP Publishing}
}

If you use Tensorforce in your research, you may additionally consider citing the following paper:

@article{lift-tensorforce,
  author       = {Schaarschmidt, Michael and Kuhnle, Alexander and Ellis, Ben and Fricke, Kai and Gessert, Felix and Yoneki, Eiko},
  title        = {{LIFT}: Reinforcement Learning in Computer Systems by Learning From Demonstrations},
  journal      = {CoRR},
  volume       = {abs/1808.07903},
  year         = {2018},
  url          = {http://arxiv.org/abs/1808.07903},
  archivePrefix = {arXiv},
  eprint       = {1808.07903}
}