xwhan / Walk_the_blocks

License: GPL-3.0
Implementation of Scheduled Policy Optimization for task-oriented language grounding


Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

Models and Algorithms

See files under walk_the_blocks/BlockWorldRoboticAgent/srcs/

  • learn_by_ppo.py: run this file for training. You can change the scheduling mechanism in the function ppo_update(); the options are:

    • do imitation every 50
    • do imitation based on rules
    • imitation 1 epoch and then RL 1 epoch

    example: python learn_by_ppo.py -lr 0.0001 -max_epochs 2 -entropy_coef 0.05
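The three scheduling options above can be sketched as a single dispatch function. This is an illustrative sketch only: the names scheduled_update, mode, and the stagnation rule are assumptions for exposition, not the repository's actual ppo_update() API.

```python
# Hypothetical sketch of the three scheduling options described above.
# "interval": imitation every 50 updates; "rule": imitation when recent
# reward stagnates; "alternate": imitation and RL on alternating epochs.

def scheduled_update(epoch, reward_history, mode="interval"):
    """Return which kind of update ("imitation" or "rl") to run this epoch."""
    if mode == "interval":
        # do imitation every 50 updates, RL otherwise
        return "imitation" if epoch % 50 == 0 else "rl"
    elif mode == "rule":
        # rule-based: fall back to imitation when the last 10 rewards
        # are nearly flat (stagnation threshold is an assumption)
        recent = reward_history[-10:]
        if len(recent) == 10 and max(recent) - min(recent) < 1e-3:
            return "imitation"
        return "rl"
    elif mode == "alternate":
        # imitation for one epoch, then RL for one epoch
        return "imitation" if epoch % 2 == 0 else "rl"
    raise ValueError(f"unknown schedule mode: {mode}")
```

The key design point shared by all three modes is that imitation (supervised) updates are interleaved with RL updates on a schedule, rather than used only for pretraining.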

  • policy_model.py: the network architecture and loss functions:

    • PPO Loss
    • Supervised Loss
    • Advantage Actor-Critic Loss
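The three loss terms listed above can be sketched in plain PyTorch. This is a minimal illustration under assumed tensor shapes and coefficient values; it is not the exact code in policy_model.py.

```python
# Illustrative sketch of the three loss terms, assuming log-probabilities,
# advantages, values, and returns are 1-D tensors and logits are 2-D.
import torch
import torch.nn.functional as F

def ppo_loss(new_logp, old_logp, advantage, clip_eps=0.2):
    # PPO clipped surrogate objective: clip the probability ratio
    # to keep the updated policy close to the old one
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()

def supervised_loss(logits, expert_actions):
    # imitation term: cross-entropy against expert (oracle) actions
    return F.cross_entropy(logits, expert_actions)

def a2c_loss(logp, advantage, value, returns, value_coef=0.5):
    # advantage actor-critic: policy gradient plus value regression
    policy_loss = -(logp * advantage.detach()).mean()
    value_loss = F.mse_loss(value, returns)
    return policy_loss + value_coef * value_loss
```

During scheduled training, the imitation epochs minimize the supervised term while the RL epochs minimize the PPO (or A2C) term, all over the same policy network.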

Instructions

For the usage of the Block-world environment, please refer to https://github.com/clic-lab/blocks

Train the RL agents

  • S-REIN *

If you use our code in your own research, please cite the following paper:

@article{xiong2018scheduled,
  title={Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents},
  author={Xiong, Wenhan and Guo, Xiaoxiao and Yu, Mo and Chang, Shiyu and Zhou, Bowen and Wang, William Yang},
  journal={arXiv preprint arXiv:1806.06187},
  year={2018}
}