robotsorcerer / gps

Licence: other
Guided policy search in Python and ROS Indigo.

GPS

This code is a reimplementation of the guided policy search algorithm, together with its iterative LQG-based trajectory optimization and supervised policy learning methods, meant to help others understand, reuse, and build upon existing work. For full documentation, see rll.berkeley.edu/gps.

The code base is a work in progress. See the FAQ for information on planned future additions to the code.

Mujoco dependency

Create a mujoco directory in your home folder and place the downloaded mjpro131 folder inside it. This matters because the activation function and the OpenSceneGraph bindings look for MuJoCo at this path.
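
A quick way to confirm the layout before running anything is sketched below. It only checks that a mjpro131 folder exists under a mujoco directory in the home folder, as described above. The function names mujoco_dir and mujoco_installed are hypothetical helpers, not part of the gps code base.

```python
import os


def mujoco_dir(home=None, version="mjpro131"):
    """Return the expected MuJoCo path, <home>/mujoco/<version>.

    Hypothetical helper: the layout follows the instructions above,
    but the function itself is not part of gps.
    """
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, "mujoco", version)


def mujoco_installed(home=None, version="mjpro131"):
    """True if the expected MuJoCo folder exists."""
    return os.path.isdir(mujoco_dir(home, version))


# Report where MuJoCo is expected and whether it is there.
status = "found" if mujoco_installed() else "missing"
print("{}: {}".format(mujoco_dir(), status))
```

If the folder is reported as missing, move or re-extract mjpro131 so that it sits directly inside the mujoco directory.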

iDG

This fork of the code implements the iterative Dynamic Game that was proposed in the paper:

  • iDG: A Robust Zero-Sum, Two-Player Reinforcement Learning

For details of the algorithm, please see the paper on arXiv, listed under the author name Olalekan Ogunmolu.

Running iDG

  • First train a protagonist agent by following the instructions on the rll.berkeley.edu/gps page.

  • Go to the experiments directory and run the copy_gps executable. This will copy the learned policy for the original system into a new folder.

  • Then make a few modifications to the hyperparams file in the new folder, as follows:

For box2d experiments, we will import the MDGPS class like so at the top of the hyperparams file:

from gps.algorithm.algorithm_mdgps import AlgorithmMDGPS # for new experiments
EXP_DIR: change this to point to the new experiment directory

common:
	|
	|--'experiment_name': 'name_of_new_experiment',
	|--'costs_filename': EXP_DIR + 'costs.csv',
	|--'mode': 'antagonist',  # whether we are running in block-alternating ascent mode
	|--'gamma': 1e8,   # the magnitude of the additive disturbance

A full common dict would then look like this, for example:

common = {
    'experiment_name': 'box2d_badmm_example' + '_' + \
            datetime.strftime(datetime.now(), '%m-%d-%y_%H-%M'),
    'experiment_dir': EXP_DIR,
    'data_files_dir': EXP_DIR + 'data_files/',
    'log_filename': EXP_DIR + 'log.txt',
    'costs_filename': EXP_DIR + 'costs.csv',
    'dists_filename': EXP_DIR + 'dist.txt',
    'conditions': 4,
    'mode': 'antagonist',
    'gamma': 1e8,
    'target_end_effector': np.array([0.0, 0.3, -0.5, 0.0, 0.3, -0.2]),
}

  • In the action_cost dict, add the gamma and mode terms as well, e.g.:
action_cost = {
    'type': CostAction,
    'wu': np.array([1, 1]),
    'gamma': 1e8,
    'mode': 'antagonist',
}

  • Do the same for algorithm['cost'], e.g.:
algorithm['cost'] = {
    'type': CostSum,
    'costs': [action_cost, state_cost],
    'weights': [1e-5, 1.0],
    'gamma': 1e8,
    'mode': 'antagonist',
}

  • Similarly, in the algorithm['init_traj_distr'] field, change the type of the LQR implementation to init_lqr_robust:
algorithm['init_traj_distr'] = {
    'type': init_lqr_robust,
}

This accounts for the new robust LQR algorithm in lin_gauss_init.

  • In agent, define the mode as
agent = {
    ... : ...
    'mode': 'robust',
}

Also set the mode in algorithm['traj_opt']:

algorithm['traj_opt'] = {
    'type': TrajOptLQRPython,
    'mode': 'robust'
}

  • Add the following entry to algorithm['policy_opt'] to account for the robust policy:
algorithm['policy_opt'] = {
    ... : ...
    'robust_weights_file_prefix': EXP_DIR + 'robust_policy',
}
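
Because gamma and mode are repeated across several dicts in the steps above, it is easy for them to drift apart. The sketch below shows one possible way to set them in a single place and check that they agree. It is an illustration only: with_disturbance and check_consistent are hypothetical helpers, the 'type' values are placeholder strings standing in for the gps classes, and state_cost is a stub because its contents are not shown above.

```python
GAMMA = 1e8          # magnitude of the additive disturbance (value used above)
MODE = 'antagonist'  # mode string used in common and the cost dicts above


def with_disturbance(cfg, gamma=GAMMA, mode=MODE):
    """Return a copy of cfg with 'gamma' and 'mode' entries added.

    Hypothetical helper, not part of gps. The input dict is left untouched.
    """
    out = dict(cfg)
    out['gamma'] = gamma
    out['mode'] = mode
    return out


def check_consistent(*cfgs):
    """True if all dicts carry the same 'gamma' and the same 'mode'."""
    gammas = set(c['gamma'] for c in cfgs)
    modes = set(c['mode'] for c in cfgs)
    return len(gammas) == 1 and len(modes) == 1


# Placeholder strings replace CostAction / CostSum so the sketch runs
# without gps installed; 'wu' is a plain list instead of np.array([1, 1]).
action_cost = with_disturbance({'type': 'CostAction', 'wu': [1, 1]})
state_cost = {'type': 'state cost placeholder'}  # stub, contents not shown above
cost = with_disturbance({
    'type': 'CostSum',
    'costs': [action_cost, state_cost],
    'weights': [1e-5, 1.0],
})

print(check_consistent(action_cost, cost))
```

In a real hyperparams file the same idea would apply to the actual class objects and NumPy arrays; the agent and traj_opt dicts use the 'robust' mode string in the steps above, so they are left out of this check.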

Docker Image

The Docker image for the base GPS code is available at lakehanne/gps/.
