google-research / dads

Licence: Apache-2.0 License

Code for 'Dynamics-Aware Unsupervised Discovery of Skills' (DADS). Enables skill discovery without supervision, which can be combined with model-based control.

Labels

reinforcement-learning deep-learning unsupervised-learning model-based-rl skill-discovery

Dynamics-Aware Discovery of Skills (DADS)

This repository is the open-source implementation of Dynamics-Aware Unsupervised Discovery of Skills (project page, arXiv). We propose a skill-discovery method that can learn skills for different agents without any rewards, while simultaneously learning a dynamics model for the skills that can be leveraged for model-based control on downstream tasks. This work was published at the International Conference on Learning Representations (ICLR), 2020.

We have also included an improved off-policy version of DADS, coined off-DADS. The details have been released in Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning.

In case of problems, contact Archit Sharma.

Table of Contents

Setup
Usage
Citation
Disclaimer

Setup

(1) Set up MuJoCo

Download and set up MuJoCo in ~/.mujoco. Set LD_LIBRARY_PATH in your ~/.bashrc:

export LD_LIBRARY_PATH="$HOME/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH"
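
As a quick sanity check, the following Python sketch reports whether the MuJoCo bin directory is actually listed in LD_LIBRARY_PATH. It is not part of the repository: the helper name mujoco_on_library_path is hypothetical, and the default path assumes the mjpro150 layout used above. A literal "~" entry is deliberately treated as a miss, because the dynamic linker does not expand it.

```python
import os


def mujoco_on_library_path(env=None, mujoco_bin="~/.mujoco/mjpro150/bin"):
    """Return True if the expanded MuJoCo bin directory is in LD_LIBRARY_PATH.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Expand "~" for the directory we are looking for.
    target = os.path.normpath(os.path.expanduser(mujoco_bin))
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    # Entries are compared as written: an unexpanded "~" will not match.
    return any(os.path.normpath(entry) == target for entry in entries if entry)


if __name__ == "__main__":
    print("MuJoCo on LD_LIBRARY_PATH:", mujoco_on_library_path())
```

If this prints False in a fresh shell, the variable is probably not exported or still contains an unexpanded "~".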

(2) Set up the environment

Clone the repository and set up the conda environment to run the DADS code:

cd <path_to_dads>
conda env create -f env.yml
conda activate dads-env

Usage

We give a high-level explanation of how to use the code. More details on the hyperparameters can be found in configs/template_config.txt, dads_off.py and Appendix A of the paper.

Every training run requires an experiment logging directory and a configuration file, which can be created starting from configs/template_config.txt. There are two phases: (a) training, where the new skills are learnt along with their skill-dynamics models, and (b) evaluation, where the learnt skills are evaluated on the task associated with the environment.
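
One way to prepare a run is sketched below: create the logging directory and derive a new configuration file from the template, replacing or appending individual flags. This is an illustration only. The helper names write_config and prepare_run are hypothetical, and the code assumes the template is a flagfile with one --name=value entry per line.

```python
import os


def write_config(template_path, config_path, overrides):
    """Copy a flagfile, replacing or appending `--name=value` lines.

    Assumes one flag per line; any other line (comments, blanks) is kept.
    """
    remaining = dict(overrides)
    out_lines = []
    with open(template_path) as f:
        for line in f:
            stripped = line.strip()
            name = stripped[2:].split("=", 1)[0] if stripped.startswith("--") else None
            if name in remaining:
                out_lines.append(f"--{name}={remaining.pop(name)}\n")
            else:
                out_lines.append(line if line.endswith("\n") else line + "\n")
    # Flags that were not present in the template are appended at the end.
    out_lines.extend(f"--{name}={value}\n" for name, value in remaining.items())
    with open(config_path, "w") as f:
        f.writelines(out_lines)


def prepare_run(template_path, logdir, config_path, overrides):
    """Create the experiment log directory and the run's configuration file."""
    os.makedirs(logdir, exist_ok=True)
    write_config(template_path, config_path, overrides)
    return config_path
```

For example, prepare_run("configs/template_config.txt", "<path_for_experiment_logs>", "configs/<config_name>.txt", {"run_train": 1}) would produce a training configuration without editing the template itself.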

For training, ensure --run_train=1 is set in the configuration file. For on-policy optimization, set --clear_buffer_every_iter=1 and ensure the replay buffer size is larger than the number of steps collected in every iteration. For off-policy optimization (details yet to be released), set --clear_buffer_every_iter=0. Set the environment name (ensure the environment is listed in get_environment() in dads_off.py). To change the observation used for skill-dynamics (for example, to learn in x-y space), set --reduced_observation and configure process_observation() in dads_off.py accordingly. The skill space can be configured to be discrete or continuous. The optimization parameters can be tweaked, and some basic values have already been set (more details in the paper).
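
To see why the buffer must be larger than the number of steps collected per iteration in the on-policy setting, consider the toy sketch below. It is not the buffer used by dads_off.py: the ReplayBuffer class and run_iterations function are hypothetical, and we assume a FIFO buffer that is cleared before each collection phase when clear_buffer_every_iter is set.

```python
from collections import deque


class ReplayBuffer:
    """Minimal FIFO buffer (a stand-in, not the repository's implementation)."""

    def __init__(self, capacity):
        # With maxlen set, the oldest transitions are evicted first.
        self._data = deque(maxlen=capacity)

    def add_batch(self, transitions):
        self._data.extend(transitions)

    def clear(self):
        self._data.clear()

    def __len__(self):
        return len(self._data)


def run_iterations(capacity, steps_per_iter, num_iters, clear_buffer_every_iter):
    """Return the number of transitions available to the optimizer per iteration."""
    buffer = ReplayBuffer(capacity)
    sizes = []
    for iteration in range(num_iters):
        if clear_buffer_every_iter:
            buffer.clear()  # on-policy: discard data from older policies
        buffer.add_batch((iteration, step) for step in range(steps_per_iter))
        sizes.append(len(buffer))
    return sizes


if __name__ == "__main__":
    print(run_iterations(1000, 200, 3, clear_buffer_every_iter=True))
    print(run_iterations(1000, 200, 3, clear_buffer_every_iter=False))
    print(run_iterations(100, 200, 2, clear_buffer_every_iter=True))
```

In the last call the capacity is smaller than the number of collected steps, so part of the freshly collected on-policy data is evicted before it can be used.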

For evaluation, ensure --run_eval=1 is set and that the experiment directory points to the same directory in which training happened. Set --num_evals if you want to record videos of skills sampled randomly from the prior distribution. After that, the script uses the learned models to execute MPC in the latent space to optimize the task reward. By default, the code calls get_environment() to load FLAGS.environment + '_goal' and goes through the list of goal coordinates specified in the eval section of the script.
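
The idea of planning in the skill (latent) space can be illustrated with a deliberately simplified sketch. It is not the planner implemented in dads_off.py: the learned skill-dynamics model is replaced by a hypothetical toy function in which a skill is an x-y velocity, candidate skills come from a fixed grid instead of being sampled, and the task reward is assumed to be the negative distance to a goal.

```python
import math

# Candidate skills: a 9x9 grid over [-1, 1]^2 (an assumption for this toy example).
SKILL_GRID = [(-1 + 0.25 * i, -1 + 0.25 * j) for i in range(9) for j in range(9)]


def toy_skill_dynamics(state, skill):
    """Hypothetical stand-in for a learned model: predicts the next x-y state."""
    return (state[0] + 0.1 * skill[0], state[1] + 0.1 * skill[1])


def rollout(state, skill, skill_dynamics, horizon):
    """Predict the state reached by holding one skill for `horizon` steps."""
    for _ in range(horizon):
        state = skill_dynamics(state, skill)
    return state


def plan_skill(state, goal, skill_dynamics, candidate_skills, horizon=5):
    """Pick the candidate skill whose predicted rollout ends closest to the goal."""
    def task_reward(skill):
        final = rollout(state, skill, skill_dynamics, horizon)
        return -math.hypot(final[0] - goal[0], final[1] - goal[1])
    return max(candidate_skills, key=task_reward)


def run_mpc(start, goal, steps=50):
    """Replan at every step, executing only the first step of each plan."""
    state = start
    for _ in range(steps):
        skill = plan_skill(state, goal, toy_skill_dynamics, SKILL_GRID)
        # In this toy setup the "environment" is the same function as the model.
        state = toy_skill_dynamics(state, skill)
    return state


if __name__ == "__main__":
    print(run_mpc((0.0, 0.0), (1.0, 1.0)))
```

The planner never uses the environment's reward during skill learning; the reward only enters at evaluation time, when predicted rollouts under each skill are scored against the task.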

We have provided configuration files in configs/ to reproduce the results of the experiments in the paper. Goal evaluation is currently only set up for the MuJoCo Ant environment. The goal distribution can be changed in the evaluation part of dads_off.py.

cd <path_to_dads>
python unsupervised_skill_learning/dads_off.py --logdir=<path_for_experiment_logs> --flagfile=configs/<config_name>.txt

The specified experiment log directory will contain the TensorBoard files, the saved checkpoints and the skill-evaluation videos.
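
When launching several runs, the command above can also be assembled from Python. The sketch below is a convenience wrapper, not part of the repository: build_command and launch are hypothetical names, and only the --logdir and --flagfile arguments shown above are used.

```python
import subprocess
import sys


def build_command(logdir, config_name, python=sys.executable):
    """Assemble the argument list for the dads_off.py command shown above."""
    return [
        python,
        "unsupervised_skill_learning/dads_off.py",
        f"--logdir={logdir}",
        f"--flagfile=configs/{config_name}.txt",
    ]


def launch(dads_root, logdir, config_name):
    """Run the script from the repository root.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    return subprocess.run(build_command(logdir, config_name), cwd=dads_root, check=True)
```

Because the script is started with the repository root as its working directory, a relative logdir is resolved relative to <path_to_dads>.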

Citation

To cite Dynamics-Aware Unsupervised Discovery of Skills:

@article{sharma2019dynamics,
  title={Dynamics-aware unsupervised discovery of skills},
  author={Sharma, Archit and Gu, Shixiang and Levine, Sergey and Kumar, Vikash and Hausman, Karol},
  journal={arXiv preprint arXiv:1907.01657},
  year={2019}
}

To cite off-DADS and Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning:

@article{sharma2020emergent,
    title={Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning},
    author={Sharma, Archit and Ahn, Michael and Levine, Sergey and Kumar, Vikash and Hausman, Karol and Gu, Shixiang},
    journal={arXiv preprint arXiv:2004.12974},
    year={2020}
}

Disclaimer

This is not an officially supported Google product.
