VinAIResearch / PC3-pytorch

Licence: MIT license
Predictive Coding for Locally-Linear Control (ICML-2020)

Predictive Coding for Locally-Linear Control

This is a PyTorch implementation of the paper "Predictive Coding for Locally-Linear Control". We propose PC3, an information-theoretic representation learning framework for optimal control from high-dimensional observations. In our experiments, PC3 significantly outperforms existing reconstruction-based approaches.

pc3 model

Details of the model architecture and experimental results can be found in our paper:

@InProceedings{pmlr-v119-shu20a,
  title = 	 {Predictive Coding for Locally-Linear Control},
  author =       {Shu, Rui and Nguyen, Tung and Chow, Yinlam and Pham, Tuan and Than, Khoat and Ghavamzadeh, Mohammad and Ermon, Stefano and Bui, Hung},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  year = 	 {2020},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  publisher =    {PMLR},
}

Please CITE our paper whenever this repository is used to help produce published results or incorporated into other software.

Installing

First, clone the repository:

git clone https://github.com/VinAIResearch/PC3-pytorch.git

Then install the dependencies as listed in pc3.yml and activate the environment:

conda env create -f pc3.yml
conda activate pc3

Training

The code currently supports training for the planar, pendulum, cartpole and 3-link environments. Run train_pc3.py with your own settings. For example:

python train_pc3.py \
    --env=planar \
    --armotized=False \
    --log_dir=planar_1 \
    --seed=1 \
    --data_size=5000 \
    --noise=0 \
    --batch_size=256 \
    --latent_noise=0.1 \
    --lam_nce=1.0 \
    --lam_c=1.0 \
    --lam_cur=7.0 \
    --norm_coeff=0.1 \
    --lr=0.0005 \
    --decay=0.001 \
    --num_iter=2000 \
    --iter_save=1000 \
    --save_map=False

The script first samples data according to the given data size and noise level, then trains the PC3 model with the specified settings.
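If you want several trained models for one environment (the iLQR section below refers to 10 trained models), the command above can be generated once per seed. The following is a minimal sketch, not part of the repository: the helper name build_train_command is hypothetical, the flags and values are copied from the example above, and the assumption is that the runs differ only in --seed and --log_dir.

```python
import shlex


def build_train_command(env, seed, log_dir):
    """Return the argument list for one train_pc3.py run (hypothetical helper).

    All flags and values are copied from the README example; only env, seed
    and log_dir vary.
    """
    return [
        "python", "train_pc3.py",
        f"--env={env}",
        "--armotized=False",
        f"--log_dir={log_dir}",
        f"--seed={seed}",
        "--data_size=5000",
        "--noise=0",
        "--batch_size=256",
        "--latent_noise=0.1",
        "--lam_nce=1.0",
        "--lam_c=1.0",
        "--lam_cur=7.0",
        "--norm_coeff=0.1",
        "--lr=0.0005",
        "--decay=0.001",
        "--num_iter=2000",
        "--iter_save=1000",
        "--save_map=False",
    ]


# Assumption: seeds 1..10, each logged to its own directory planar_{seed}.
commands = [
    build_train_command("planar", seed, f"planar_{seed}")
    for seed in range(1, 11)
]

for cmd in commands:
    # Print the shell command; to launch it, pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a log directory name ever contains spaces.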

If the argument save_map is set to True, the latent map is drawn every 10 epochs (planar only), and the resulting GIF file is saved in the same directory as the trained model.

You can also visualize the training process by running tensorboard --logdir={path_to_log_dir}, where path_to_log_dir has the form logs/{env}/{log_dir}. The trained model will be saved at result/{env}/{log_dir}.
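The two path patterns above can be captured in a small helper, which is handy when scripting over many runs. This is only an illustrative sketch: run_paths is a hypothetical name, and the layout is taken directly from the patterns logs/{env}/{log_dir} and result/{env}/{log_dir} stated above.

```python
from pathlib import PurePosixPath


def run_paths(env, log_dir):
    """Return the TensorBoard log directory and model directory for one run."""
    return {
        "tensorboard_logdir": PurePosixPath("logs") / env / log_dir,
        "model_dir": PurePosixPath("result") / env / log_dir,
    }


paths = run_paths("planar", "planar_1")
print(f"tensorboard --logdir={paths['tensorboard_logdir']}")
print(f"trained model directory: {paths['model_dir']}")
```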

Latent maps visualization

You can visualize the latent map for planar and pendulum. To do so, run:

python latent_map_planar.py --log_path={log_to_trained_model} --epoch={epoch}
or 
python latent_map_pendulum.py --log_path={log_to_trained_model} --epoch={epoch}

Data visualization

You can generate training images for visualization purposes by running:

cd data
python sample_{env_name}_data.py --sample_size={sample_size} --noise={noise}

Currently the code supports simulating 4 environments: planar, pendulum, cartpole and 3-link.

The raw data (images) is saved in data/{env_name}/raw_{noise}_noise

Running iLQR on latent space

The configuration file for running iLQR on each task is in the ilqr_config folder; you can modify it with your own settings. Run:

python ilqr.py --task={task} --setting_path={setting_path} --noise={noise} --epoch={epoch}

where task is one of {plane, swing, balance, cartpole, 3-link} and setting_path is the path to the directory containing your 10 trained models (e.g., result/pendulum/).

The code runs iLQR for all trained models for that specific task and computes summary statistics. The results are saved in iLQR_result.
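Because the task names differ from the environment names used in training (for example, plane rather than planar), it can help to validate the task before launching. The sketch below is a hypothetical wrapper, not part of the repository: the flags and task names are those listed above, while the function name, the pairing of the swing task with result/pendulum/, and the epoch value 2000 are assumptions for illustration.

```python
import shlex

# Task names accepted by ilqr.py, as listed in this README.
TASKS = ("plane", "swing", "balance", "cartpole", "3-link")


def build_ilqr_command(task, setting_path, noise, epoch):
    """Return the argument list for one ilqr.py run (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return [
        "python", "ilqr.py",
        f"--task={task}",
        f"--setting_path={setting_path}",
        f"--noise={noise}",
        f"--epoch={epoch}",
    ]


# Assumed values: the epoch must match a checkpoint that was actually saved.
cmd = build_ilqr_command("swing", "result/pendulum/", noise=0, epoch=2000)
print(shlex.join(cmd))
```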

Result

Quantitative result

We compare PC3 with two state-of-the-art LCE (learning controllable embedding) baselines: PCC (Levine et al., 2020) and SOLAR (Zhang et al., 2019). Specifically, we report the percentage of time steps the underlying system spends in the goal region.
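To make the metric concrete, here is a minimal sketch of how a "percentage of time in the goal region" score could be computed from one trajectory of true system states. It is not the repository's evaluation code: the function name is hypothetical, and modelling the goal region as a Euclidean ball of a given radius around a goal state is an assumption made for illustration.

```python
import math


def percent_time_in_goal(states, goal, radius):
    """Percentage of time steps whose state lies within `radius` of `goal`.

    Assumption: the goal region is a Euclidean ball around the goal state.
    """
    if not states:
        raise ValueError("trajectory must contain at least one state")
    inside = sum(1 for state in states if math.dist(state, goal) <= radius)
    return 100.0 * inside / len(states)


# Toy 2-D trajectory that reaches the goal in its last two steps.
trajectory = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.9, 3.0), (3.0, 3.0)]
score = percent_time_in_goal(trajectory, goal=(3.0, 3.0), radius=0.5)
print(score)  # 2 of 5 steps are inside the goal region -> 40.0
```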

result table

Below are videos showing the learned policies on the 5 tasks.

planar trajectory

swing trajectory

balance trajectory

cartpole trajectory

3-link

Qualitative result

We also compare the quality of learned latent maps between PCC and PC3 in planar and pendulum.

maps table
