BerkeleyAutomation / DART

Licence: other

Programming Languages

python
shell

Projects that are alternatives of or similar to DART

Dagger2Example
[No longer maintained] A simple Dagger2 usage example; Jetpack Hilt is recommended instead.
Stars: ✭ 39 (-2.5%)
Mutual labels:  dagger
AndroidDaggerSample
Android-dagger and Architecture component ViewModel sample
Stars: ✭ 30 (-25%)
Mutual labels:  dagger
DaggerAutoInject
Inject automatically your Activities & Fragments, just with a simple annotation
Stars: ✭ 49 (+22.5%)
Mutual labels:  dagger
Starwars-clean
Simple project with clean architecture
Stars: ✭ 34 (-15%)
Mutual labels:  dagger
kotlin-mvp-starter
MVP Starter with RxJava, Dagger 2 in Kotlin
Stars: ✭ 56 (+40%)
Mutual labels:  dagger
DaggerGpuMiner
Standalone GPU/CPU miner for Dagger coin
Stars: ✭ 21 (-47.5%)
Mutual labels:  dagger
Newandroidarchitecture Component Github
Sample project based on the new Android Component Architecture
Stars: ✭ 229 (+472.5%)
Mutual labels:  dagger
blog-resources
✍🏻 Resources and samples for my blog
Stars: ✭ 21 (-47.5%)
Mutual labels:  dagger
Li-MVPArms
This project will be updated continuously.
Stars: ✭ 17 (-57.5%)
Mutual labels:  dagger
MVPFramework
The basic framework is in place; more can be added later as needed.
Stars: ✭ 29 (-27.5%)
Mutual labels:  dagger
MVVMQuick
🚀 Quickly build your MVVM-structured project with MVVMQuick!
Stars: ✭ 23 (-42.5%)
Mutual labels:  dagger
AndroidVIP
Android project to experiment the VIPER approach using mosby, RxJava and dagger2
Stars: ✭ 21 (-47.5%)
Mutual labels:  dagger
avaje-inject
Dependency injection via APT (source code generation) ala "Server side Dagger DI"
Stars: ✭ 114 (+185%)
Mutual labels:  dagger
RestaurantsExplorer
Android application built with the MVVM pattern, using the Zomato API to search cities around the world and display each city's restaurants on a map.
Stars: ✭ 32 (-20%)
Mutual labels:  dagger
gilfoyle
A CLI to interactively remove useless apps from your Android device.
Stars: ✭ 23 (-42.5%)
Mutual labels:  dagger
android-mvp-kotlin
An Android MVP pattern implemented in Kotlin, using Dagger2, Retrofit, RxJava, and more.
Stars: ✭ 14 (-65%)
Mutual labels:  dagger
DaggerGPU.jl
GPU integrations for Dagger.jl
Stars: ✭ 33 (-17.5%)
Mutual labels:  dagger
sailer
Sailer is an Android Sample That shows the use of Coordinator pattern for navigation through Multi Module, Dagger, Navigation Component and much more.
Stars: ✭ 35 (-12.5%)
Mutual labels:  dagger
android-base-project
Android LateralView Base Project
Stars: ✭ 25 (-37.5%)
Mutual labels:  dagger
Kriptofolio
Free open source minimalistic cryptocurrencies portfolio app for Android.
Stars: ✭ 79 (+97.5%)
Mutual labels:  dagger

DART: Noise Injection for Imitation Learning

The code is based on work by Michael Laskey, Jonathan Lee, Roy Fox, Anca Dragan, and Ken Goldberg.

The purpose of this repository is to make available the simulation experiments used in the paper DART: Noise Injection for Robust Imitation Learning and to provide examples of how noise injection may be used to improve off-policy imitation learning by mitigating covariate shift.

Requirements

Clone this repo:

git clone https://github.com/BerkeleyAutomation/DART.git 
cd DART

Create a virtual environment (optional, but useful for exact reproduction of the experiments):

virtualenv env
source env/bin/activate

With the virtual environment active, install the required packages:

pip install -e .

Download mjpro131. Follow the instructions from mujoco-py for where to unzip it and where to place the license key. Other versions of MuJoCo may work, but they have not been tested with this project.

Reproducing the Experiments

The results from the paper can be reproduced with the exact same parameters by running the following shell scripts:

sh test.sh
sh plot.sh

The test.sh script will run all four domains (Hopper, Walker, HalfCheetah, and Humanoid) for several trials using each algorithm from the paper and the supplementary material. Note that this may take hours to complete. Once finished, the collected data can be found in the results/ directory, in subdirectories named after the parameters used.

By running plot.sh, reward and loss plots for each environment will be generated as in Fig. 2 and Fig. 4. Loss plots for the random covariance matrices with hand-chosen traces will also be generated as in Fig. 5 of the paper. For the loss plots, loss on the robot's distribution is shown with solid lines and error bars. Loss on the supervisor's distribution is shown with dashed lines.

The simulated error, i.e., the error simulated by a noisy supervisor, may also be plotted in a similar fashion. Although this data is collected from each test, the curves are left out by default so as not to clutter the plots.

For plot_reward.py, an optional --normalize flag may be added to normalize the reward between 0 and 1 as in the paper.

Explanations of Experiments and Parameters

The general methods for initializing the tasks, collecting the data, and evaluating the learners can be found in framework.py. The number of trials for each experiment may also be specified in framework.py.

Each test file (test_bc.py, test_dart.py, etc.) runs a different learning algorithm and takes a series of arguments:

  • test_bc.py runs behavior cloning without any noise as a baseline.
  • test_dagger.py runs the DAgger algorithm (Ross et al.).
  • test_dagger_b.py runs DAgger-B, a variant of DAgger in which the policy is only updated on select iterations to reduce the computational burden.
  • test_iso.py runs behavior cloning with a noisy supervisor with an isotropic covariance matrix.
  • test_rand.py runs behavior cloning with a Gaussian-noisy supervisor whose covariance matrix is sampled from an inverse Wishart distribution and scaled to a predetermined trace.
  • test_dart.py runs the DART iterative noise optimization algorithm.

Each experiment requires a series of arguments. Arguments common to all tests are given below:

  • --envname [string] Name of the OpenAI Gym environment, e.g., Hopper-v1
  • --t [integer] Number of time steps per trajectory
  • --iters [space-separated integers] Iterations at which to evaluate the learned policy
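
As a rough illustration of how these common arguments could be parsed, here is a minimal argparse sketch. The function name build_common_parser and the example values are assumptions made for illustration; the parsers in the repository's test files may be set up differently.

```python
import argparse


def build_common_parser():
    """Build a parser for the arguments shared by all tests (illustrative only)."""
    parser = argparse.ArgumentParser(description="Common experiment arguments (sketch)")
    # Name of the OpenAI Gym environment, e.g. Hopper-v1
    parser.add_argument("--envname", type=str, required=True)
    # Number of time steps per trajectory
    parser.add_argument("--t", type=int, required=True)
    # One or more iterations at which the learned policy is evaluated
    parser.add_argument("--iters", type=int, nargs="+", required=True)
    return parser


# Example values are made up for illustration.
args = build_common_parser().parse_args(
    ["--envname", "Hopper-v1", "--t", "500", "--iters", "5", "10", "15"]
)
print(args.envname, args.t, args.iters)  # Hopper-v1 500 [5, 10, 15]
```

The nargs="+" setting is what lets --iters accept a space-separated list of integers.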

The following are arguments specific to each algorithm:

DART Arguments

  • --update [space-separated integers] Iterations at which to update the noise parameter.
  • --partition [integer] Number of examples to use for noise optimization.
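
To give a feel for what a noise update might compute, the sketch below estimates a Gaussian noise covariance as the average outer product of the learner's errors relative to the supervisor on a set of examples (presumably the examples counted by --partition). The function name and the toy data are assumptions, and the estimator in test_dart.py may differ in its details.

```python
import numpy as np


def estimate_noise_covariance(learner_actions, supervisor_actions):
    """Average outer product of (learner - supervisor) action errors.

    Both inputs have shape (n_examples, action_dim).
    """
    errors = np.asarray(learner_actions) - np.asarray(supervisor_actions)
    return errors.T @ errors / errors.shape[0]


# Toy examples with made-up numbers (action_dim = 2).
supervisor = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
learner = supervisor + np.array([[0.1, 0.0], [-0.1, 0.2], [0.1, -0.2], [-0.1, 0.0]])

cov = estimate_noise_covariance(learner, supervisor)
print(cov)
```

The resulting matrix is symmetric and positive semidefinite, so it can be used directly as the covariance of the noise injected in the next round of data collection.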

Random Noise Arguments

  • --prior [float] Error to simulate, i.e., trace of covariance matrix of Gaussian-noisy supervisor.
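
The following sketch shows one way to draw a random covariance matrix and rescale it so that its trace equals the value given by --prior. It samples from an inverse Wishart distribution with identity scale by inverting a Wishart sample; the function name, the degrees of freedom, and the identity scale matrix are assumptions, not values taken from test_rand.py.

```python
import numpy as np


def sample_scaled_covariance(dim, prior, rng, dof=None):
    """Draw a random covariance and rescale it so that its trace equals `prior`."""
    dof = dim + 2 if dof is None else dof  # hypothetical choice of degrees of freedom
    # If W ~ Wishart(dof, I), then inv(W) ~ inverse-Wishart(dof, I).
    g = rng.standard_normal((dim, dof))
    wishart = g @ g.T
    cov = np.linalg.inv(wishart)
    cov = (cov + cov.T) / 2.0  # remove tiny numerical asymmetry
    return cov * (prior / np.trace(cov))


rng = np.random.default_rng(0)
cov = sample_scaled_covariance(dim=3, prior=0.5, rng=rng)
# Noise of this kind would be added to the supervisor's action.
noise = rng.multivariate_normal(np.zeros(3), cov)
print(np.trace(cov))
```

Rescaling by prior / trace keeps the shape of the sampled covariance while fixing the total amount of simulated error.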

DAgger Arguments

  • --beta [float] Decaying probability of taking the supervisor's action during training (see Ross et al.).
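
A minimal sketch of how a decaying --beta can mix the two policies during data collection: at iteration i, the supervisor's action is executed with probability beta**i and the learner's action otherwise. The exponential schedule is one of the options described by Ross et al.; whether test_dagger.py uses exactly this schedule is an assumption, as is the function name. In DAgger the supervisor still labels every visited state regardless of whose action is executed.

```python
import random


def choose_action(supervisor_action, learner_action, beta, iteration, rng):
    """Execute the supervisor's action with probability beta**iteration, else the learner's."""
    p_supervisor = beta ** iteration
    return supervisor_action if rng.random() < p_supervisor else learner_action


rng = random.Random(0)
beta = 0.5
for iteration in range(4):
    picks = [choose_action("supervisor", "learner", beta, iteration, rng) for _ in range(1000)]
    # Fraction of steps on which the supervisor's action was executed
    print(iteration, picks.count("supervisor") / 1000)
```

At iteration 0 the supervisor is always in control, and its share of executed actions shrinks geometrically afterwards.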

DAgger-B Arguments

  • --update [space-separated integers] Iterations at which to update the policy.
  • --beta [float] Decaying probability of taking the supervisor's action during training.
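
The batched update schedule can be sketched as a loop that collects labelled data on every iteration but refits the policy only on the iterations listed in --update. The names run_dagger_b, collect, and retrain are hypothetical stand-ins, and whether the repository counts iterations from 0 or 1 is not stated here.

```python
def run_dagger_b(n_iterations, update_iters, collect, retrain):
    """Collect data every iteration, but retrain the policy only on `update_iters`."""
    dataset = []
    update_set = set(update_iters)
    for iteration in range(n_iterations):
        dataset.extend(collect(iteration))
        if iteration in update_set:
            retrain(dataset)
    return dataset


retrain_sizes = []
data = run_dagger_b(
    n_iterations=6,
    update_iters=[1, 3, 5],
    collect=lambda i: [("state", "label", i)] * 2,  # stand-in for a labelled rollout
    retrain=lambda d: retrain_sizes.append(len(d)),  # stand-in for policy fitting
)
print(retrain_sizes)  # [4, 8, 12]
```

With three updates instead of six, the policy is fit half as often while still training on all of the aggregated data.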

Isotropic Noise Arguments

  • --scale [float] Amount by which to scale the identity matrix.
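
For the isotropic case, the covariance is simply the identity matrix multiplied by --scale. The sketch below perturbs a supervisor action with such noise; the function name and the example action are assumptions. In noise-injection schemes of this kind, the perturbed action is what gets executed, while the supervisor's intended action is typically what gets recorded as the training label.

```python
import numpy as np


def noisy_supervisor_action(action, scale, rng):
    """Add zero-mean Gaussian noise with covariance scale * I to the supervisor's action."""
    action = np.asarray(action, dtype=float)
    cov = scale * np.eye(action.shape[0])
    return rng.multivariate_normal(action, cov)


rng = np.random.default_rng(0)
executed = noisy_supervisor_action([0.2, -0.4, 1.0], scale=0.01, rng=rng)
print(executed)
```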

As mentioned before, the data collected from running any of these experiments will be stored in the results/ directory under subdirectories named after the provided arguments. The data are stored as CSV files, which may be loaded and plotted using pandas and matplotlib or inspected directly using any spreadsheet application. See experiments/plot_reward.py for examples of plotting code.
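
As a dependency-free illustration of working with such CSV data, the sketch below averages reward per iteration across trials using Python's standard csv module (pandas.read_csv would serve equally well). The column names iteration, trial, and reward and the numbers are hypothetical; check the header of an actual file in results/ for the real layout.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Stand-in for the contents of one results CSV; the column names are hypothetical.
csv_text = """iteration,trial,reward
5,0,100.0
5,1,120.0
10,0,200.0
10,1,220.0
"""

rewards_by_iteration = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    rewards_by_iteration[int(row["iteration"])].append(float(row["reward"]))

# Mean reward across trials at each evaluated iteration
mean_reward = {it: mean(vals) for it, vals in sorted(rewards_by_iteration.items())}
print(mean_reward)  # {5: 110.0, 10: 210.0}
```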
