Reinforce
Reinforce.jl is an interface for Reinforcement Learning in Julia. It is intended to connect modular environments, policies, and solvers through a simple, common API.
Packages which build on Reinforce:
- AtariAlgos: an environment which wraps Atari games using ArcadeLearningEnvironment
- OpenAIGym: a wrapper for OpenAI's Python package gym
Environment Interface
New environments are created by subtyping AbstractEnvironment and implementing a few methods:
reset!(env) -> env            # reset the environment to its initial state
actions(env, s) -> A          # the set of valid actions from state s
step!(env, s, a) -> (r, s′)   # apply action a, returning the reward and new state
finished(env, s′) -> Bool     # whether the episode has terminated
and optional overrides:
state(env) -> s
reward(env) -> r
which map to env.state and env.reward, respectively, when not overridden.
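In other words, the fallbacks behave as if defined like this conceptual sketch (assuming the environment stores state and reward fields):

# Conceptual fallbacks for environments that do not override state/reward:
state(env::AbstractEnvironment) = env.state
reward(env::AbstractEnvironment) = env.reward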
ismdp(env) -> Bool
An environment may be fully observable (MDP) or partially observable (POMDP). In the case of a partially observable environment, the state s is really an observation o. To maintain consistency, we call everything a state, and assume that an environment is free to maintain additional (unobserved) internal state. The ismdp query returns true when the environment is an MDP, and false otherwise.
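For instance, a partially observable environment could declare (the MyPOMDPEnv name is hypothetical):

ismdp(env::MyPOMDPEnv) = false  # observations do not expose the full internal state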
maxsteps(env) -> Int
The terminating condition of an episode is controlled by maxsteps() || finished(): an episode ends when either the step limit is reached or finished returns true. The default value of maxsteps is 0, which indicates no step limit.
A minimal example for testing purposes is test/foo.jl.
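For illustration, here is a sketch of a complete environment implementing the interface above. The WalkEnv type and its dynamics are hypothetical (not the contents of test/foo.jl): the agent starts at position 0 and is rewarded for reaching position 5.

using Reinforce
import Reinforce: reset!, actions, step!, finished, ismdp, maxsteps

# Hypothetical random-walk environment (illustration only).
mutable struct WalkEnv <: AbstractEnvironment
    state::Int      # current position; picked up by the default state(env)
    reward::Float64 # last reward; picked up by the default reward(env)
end
WalkEnv() = WalkEnv(0, 0.0)

reset!(env::WalkEnv) = (env.state = 0; env.reward = 0.0; env)

actions(env::WalkEnv, s) = [-1, 1]  # step left or right

function step!(env::WalkEnv, s, a)
    env.state = s + a
    env.reward = env.state == 5 ? 1.0 : 0.0  # reward only at the goal
    env.reward, env.state
end

finished(env::WalkEnv, s′) = s′ == 5

ismdp(env::WalkEnv) = true    # the position is fully observable
maxsteps(env::WalkEnv) = 100  # give up after 100 steps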
TODO: more details and examples
Policy Interface
Agents/policies are created by subtyping AbstractPolicy and implementing action.
The built-in random policy is a short example:
struct RandomPolicy <: AbstractPolicy end
action(π::RandomPolicy, r, s, A) = rand(A)
Where A is the action space. The action method maps the last reward and current state to the next chosen action: (r, s) -> a.
A policy may also optionally implement:
reset!(π::AbstractPolicy) -> π
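As an example of why reset! is useful, here is a hypothetical policy with internal state (StickyPolicy and its logic are illustrative, not part of the package): it repeats its previous action while that action keeps earning positive reward, and otherwise samples randomly.

using Reinforce
import Reinforce: action, reset!

mutable struct StickyPolicy <: AbstractPolicy
    last_action  # the most recent action, or nothing at the start of an episode
end
StickyPolicy() = StickyPolicy(nothing)

function action(π::StickyPolicy, r, s, A)
    # Stick with the previous action if it just earned a positive reward.
    a = (π.last_action !== nothing && r > 0) ? π.last_action : rand(A)
    π.last_action = a
    a
end

# Forget the remembered action so each episode starts fresh.
reset!(π::StickyPolicy) = (π.last_action = nothing; π)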
Episode Iterator
Iterate through episodes using the Episode iterator. A 4-tuple (s, a, r, s′) is returned from each step of the episode:
ep = Episode(env, π)
for (s, a, r, s′) in ep
    # do some custom processing of the sars-tuple
end
R = ep.total_reward  # total reward accumulated over the episode
T = ep.niter         # number of steps taken
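For example, the step tuples can be used to accumulate a discounted return (an illustrative helper; the discount factor γ is an assumption, not part of the API):

using Reinforce

# Run one episode and return its discounted sum of rewards.
function discounted_return(env, π; γ = 0.99)
    G, t = 0.0, 0
    for (s, a, r, s′) in Episode(env, π)
        G += γ^t * r
        t += 1
    end
    G
end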
There is also a convenience method run_episode. The following is equivalent to the previous example:
R = run_episode(env, π) do
    # anything you want... this section is called after each step
end
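For instance, since run_episode returns the total reward, it is easy to evaluate a policy over many episodes (an illustrative helper, not part of the package):

using Reinforce

# Average the total reward of a policy over n episodes.
function evaluate(env, π; n = 100)
    total = 0.0
    for _ in 1:n
        total += run_episode(env, π) do
            # no per-step processing needed for plain evaluation
        end
    end
    total / n
end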