xwhan / Walk_the_blocks

License: GPL-3.0
Implementation of Scheduled Policy Optimization for task-oriented language grounding


Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

Models and Algorithms

See files under walk_the_blocks/BlockWorldRoboticAgent/srcs/

  • learn_by_ppo.py: run this file for training. You can change the scheduling mechanism in the function ppo_update(); the options are:

    • do imitation every 50
    • do imitation based on rules
    • imitation 1 epoch and then RL 1 epoch

    example: python learn_by_ppo.py -lr 0.0001 -max_epochs 2 -entropy_coef 0.05
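The three scheduling options above can be sketched as a single dispatch function. This is an illustrative sketch only: the names scheduled_update, mode, and the stagnation rule are assumptions for exposition, not the repository's actual ppo_update() API.

```python
# Hypothetical sketch of the three scheduling options described above.
# "interval": imitation every 50 updates; "rule": imitation when recent
# reward stagnates; "alternate": imitation and RL on alternating epochs.

def scheduled_update(epoch, reward_history, mode="interval"):
    """Return which kind of update ("imitation" or "rl") to run this epoch."""
    if mode == "interval":
        # do imitation every 50 updates, RL otherwise
        return "imitation" if epoch % 50 == 0 else "rl"
    elif mode == "rule":
        # rule-based: fall back to imitation when the last 10 rewards
        # are nearly flat (stagnation threshold is an assumption)
        recent = reward_history[-10:]
        if len(recent) == 10 and max(recent) - min(recent) < 1e-3:
            return "imitation"
        return "rl"
    elif mode == "alternate":
        # imitation for one epoch, then RL for one epoch
        return "imitation" if epoch % 2 == 0 else "rl"
    raise ValueError(f"unknown schedule mode: {mode}")
```

The key design point shared by all three modes is that imitation (supervised) updates are interleaved with RL updates on a schedule, rather than used only for pretraining.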

  • policy_model.py: the network architecture and loss functions:

    • PPO Loss
    • Supervised Loss
    • Advantage Actor-Critic Loss
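The three loss terms listed above can be sketched in plain PyTorch. This is a minimal illustration under assumed tensor shapes and coefficient values; it is not the exact code in policy_model.py.

```python
# Illustrative sketch of the three loss terms, assuming log-probabilities,
# advantages, values, and returns are 1-D tensors and logits are 2-D.
import torch
import torch.nn.functional as F

def ppo_loss(new_logp, old_logp, advantage, clip_eps=0.2):
    # PPO clipped surrogate objective: clip the probability ratio
    # to keep the updated policy close to the old one
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()

def supervised_loss(logits, expert_actions):
    # imitation term: cross-entropy against expert (oracle) actions
    return F.cross_entropy(logits, expert_actions)

def a2c_loss(logp, advantage, value, returns, value_coef=0.5):
    # advantage actor-critic: policy gradient plus value regression
    policy_loss = -(logp * advantage.detach()).mean()
    value_loss = F.mse_loss(value, returns)
    return policy_loss + value_coef * value_loss
```

During scheduled training, the imitation epochs minimize the supervised term while the RL epochs minimize the PPO (or A2C) term, all over the same policy network.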

Instructions

For the usage of the Block-world environment, please refer to https://github.com/clic-lab/blocks

Train the RL agents

  • S-REIN *

If you use our code in your own research, please cite the following paper:

@article{xiong2018scheduled,
  title={Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents},
  author={Xiong, Wenhan and Guo, Xiaoxiao and Yu, Mo and Chang, Shiyu and Zhou, Bowen and Wang, William Yang},
  journal={arXiv preprint arXiv:1806.06187},
  year={2018}
}