AaronJi / RL

Licence: other
A set of RL experiments, currently including: (1) the MDP rank experiment, based on the policy gradient algorithm.

Programming Languages

python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to RL

Easy Rl
A Chinese-language reinforcement learning tutorial; read online at: https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+13554.55%)
Mutual labels:  policy-gradient
Tianshou
An elegant PyTorch deep reinforcement learning library.
Stars: ✭ 4,109 (+18577.27%)
Mutual labels:  policy-gradient
rpg
Ranking Policy Gradient
Stars: ✭ 22 (+0%)
Mutual labels:  policy-gradient
Mlds2018spring
Machine Learning and having it Deep and Structured (MLDS) in 2018 spring
Stars: ✭ 124 (+463.64%)
Mutual labels:  policy-gradient
Deep Algotrading
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading
Stars: ✭ 173 (+686.36%)
Mutual labels:  policy-gradient
SharkStock
Automate swing trading using deep reinforcement learning. The deep deterministic policy gradient-based neural network model trains to choose an action to sell, buy, or hold the stocks to maximize the gain in asset value. The paper also acknowledges the need for a system that predicts the trend in stock value to work along with the reinforcement …
Stars: ✭ 63 (+186.36%)
Mutual labels:  policy-gradient
Reinforcement learning
Implementations of basic reinforcement learning algorithms
Stars: ✭ 100 (+354.55%)
Mutual labels:  policy-gradient
TAA-PG
Usage of policy gradient reinforcement learning to solve portfolio optimization problems (Tactical Asset Allocation).
Stars: ✭ 26 (+18.18%)
Mutual labels:  policy-gradient
Multihopkg
Multi-hop knowledge graph reasoning learned via policy gradient with reward shaping and action dropout
Stars: ✭ 202 (+818.18%)
Mutual labels:  policy-gradient
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+909.09%)
Mutual labels:  policy-gradient
Policy Gradient
Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
Stars: ✭ 135 (+513.64%)
Mutual labels:  policy-gradient
A2c
A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow
Stars: ✭ 169 (+668.18%)
Mutual labels:  policy-gradient
yarll
Combining deep learning and reinforcement learning.
Stars: ✭ 84 (+281.82%)
Mutual labels:  policy-gradient
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (+450%)
Mutual labels:  policy-gradient
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (+45.45%)
Mutual labels:  policy-gradient
Torchrl
Highly Modular and Scalable Reinforcement Learning
Stars: ✭ 102 (+363.64%)
Mutual labels:  policy-gradient
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+12913.64%)
Mutual labels:  policy-gradient
LWDRLC
Lightweight deep RL Libraray for continuous control.
Stars: ✭ 14 (-36.36%)
Mutual labels:  policy-gradient
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision. Visit Website:
Stars: ✭ 59 (+168.18%)
Mutual labels:  policy-gradient
siamese dssm
siamese dssm sentence_similarity sentece_similarity_rank tensorflow
Stars: ✭ 59 (+168.18%)
Mutual labels:  ranking-algorithm

RL

LIRD

Replicates the list-wise recommendation algorithm from https://github.com/egipcy/LIRD. Some logic is extended, including:

  • user features are added

Related paper: [Deep Reinforcement Learning for List-wise Recommendations]
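
The user-feature extension is not detailed here. Below is a minimal sketch (not the repository's actual code) of one way to fold user features into the list-wise state: the user embedding is concatenated in front of the flattened item history that the actor conditions on. The names build_state, user_emb, and item_embs are illustrative.

import numpy as np

def build_state(user_emb, item_embs):
    # Hypothetical state builder: concatenate the user feature vector
    # with the embeddings of the user's k most recent positive items,
    # so the actor's ranking action conditions on both.
    #   user_emb:  (d_u,) user feature vector
    #   item_embs: (k, d_i) recent item embeddings
    return np.concatenate([user_emb, item_embs.reshape(-1)])

# Example: 8-dim user features plus a history of 5 items, 16 dims each
state = build_state(np.zeros(8), np.zeros((5, 16)))
assert state.shape == (8 + 5 * 16,)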

Run MovieLens example

python python/LIRD/LIRD_main.py movielens_lird_example

MDP rank:

Replicates the MDP rank algorithm from [Reinforcement Learning to Rank with Markov Decision Process. Wei, Xu, Lan, Guo, Cheng, SIGIR'17, 2017]

Related paper: [Adapting Markov Decision Process for Search Result Diversification. Xia, Xu, Lan, Guo, Zeng, Cheng, SIGIR’17, 2017]
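
For orientation, here is a minimal sketch of the MDPRank idea rather than the repository's actual implementation: a softmax-linear policy picks the next document from the remaining candidates one position at a time, and REINFORCE updates the weights using DCG-style per-position rewards. All names (rank_episode, reinforce_update, X, rel) are illustrative.

import numpy as np

def rank_episode(theta, X, rng):
    # Sample a ranking: at each step, choose one remaining document
    # with probability softmax(theta . x).
    remaining = list(range(len(X)))
    ranking, grads = [], []
    while remaining:
        feats = X[remaining]                      # (m, d)
        scores = feats @ theta
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        i = rng.choice(len(remaining), p=probs)
        grads.append(feats[i] - probs @ feats)    # grad of log pi(a|s)
        ranking.append(remaining.pop(i))
    return ranking, grads

def reinforce_update(theta, X, rel, rng, lr=0.01):
    # One REINFORCE step; the reward at position t is the DCG gain of
    # the document placed there.
    ranking, grads = rank_episode(theta, X, rng)
    rewards = [(2 ** rel[d] - 1) / np.log2(t + 2)
               for t, d in enumerate(ranking)]
    G = 0.0
    for t in reversed(range(len(ranking))):
        G += rewards[t]                           # return from position t
        theta = theta + lr * G * grads[t]
    return theta

# Toy usage: 4 documents with 3 features and graded relevance labels
rng = np.random.default_rng(0)
theta = reinforce_update(np.zeros(3), rng.normal(size=(4, 3)),
                         np.array([2, 0, 1, 0]), rng)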

Run OHSUMED example

python python/MDPrank/MDPrank_main.py letor_ohsumed_example

Run TREC example

python python/MDPrank/MDPrank_main.py letor_trec_example \
    --training_set Letor/TREC/TD2003/Data/Fold1/trainingset.txt \
    --valid_set Letor/TREC/TD2003/Data/Fold1/validationset.txt \
    --test_set Letor/TREC/TD2003/Data/Fold1/testset.txt

ADP (adaptive dynamic programming):

Related paper:

  • [An Adaptive Dynamic Programming Algorithm for Dynamic Fleet Management, I: Single Period Travel Times. Godfrey, Powell, Transportation Science, 2002]
  • [An Adaptive Dynamic Programming Algorithm for Dynamic Fleet Management, II: Multiperiod Travel Times. Godfrey, Powell, Transportation Science, 2002]
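
The CAVE updates referenced in the figures below maintain, for each location and time period, a piecewise-linear concave approximation of the value of holding n resources. The following is a minimal sketch under simplified assumptions (a single smoothing step on the two slopes adjacent to the observed resource level, followed by a concavity projection); the papers update an interval of slopes, and all names here are illustrative.

import numpy as np

def project_concave(slopes):
    # Pool adjacent violators: project the slope vector onto the
    # non-increasing cone, keeping the value function concave.
    vals, cnts = [], []
    for s in slopes:
        vals.append(float(s))
        cnts.append(1)
        while len(vals) > 1 and vals[-2] < vals[-1]:
            cnt = cnts[-2] + cnts[-1]
            val = (vals[-2] * cnts[-2] + vals[-1] * cnts[-1]) / cnt
            vals[-2:], cnts[-2:] = [val], [cnt]
    return np.array([v for v, c in zip(vals, cnts) for _ in range(c)])

def cave_update(slopes, r, left_grad, right_grad, alpha=0.5):
    # CAVE-style update: slopes[i] estimates V(i+1) - V(i). Smooth in
    # the observed marginal values on either side of resource level r,
    # then restore concavity.
    slopes = np.asarray(slopes, dtype=float).copy()
    if r > 0:
        slopes[r - 1] = (1 - alpha) * slopes[r - 1] + alpha * left_grad
    if r < len(slopes):
        slopes[r] = (1 - alpha) * slopes[r] + alpha * right_grad
    return project_concave(slopes)

# e.g. dual prices 5.0 / 3.0 observed around resource level r = 2
print(cave_update([4.0, 4.0, 2.0, 1.0], 2, 5.0, 3.0))  # [4.25 4.25 2.5 1.]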

Run time & space scheduling example:

python python/ADPscheduling/ADP_scheduling_main.py time_space_scheduling_example

The example schedules 5 resources within a 4x4 rectangular grid over 24 time intervals. Only relocations with a transfer period (tau) of less than 2 time intervals are considered; 30 iterations are executed.

  • Figure 0: Shape of the converged value function and the corresponding marginal values / derivatives of the value function / shadow variables of the optimization problem, at resource count = 0, tau = 0, t = 8, 12, 16, 24
  • Figure 1: The scheduling actions (arrows; color indicates the number of relocated resources) and the converged value function evaluated at the actual resource count, tau = 0, t = 8, 12, 16, 24
  • Figure 2: Detailed results for the 13th location: (1) the result of the CAVE update at t = 20 and iter = 29; (2) initial values of the value function at iter = 30 and t = 0, 6, 12, 18; (3) initial values of the value function at t = 20 and iter = 0, 9, 19, 29

For different experiments, the data paths in the arguments need to be changed accordingly.

Temporary issue:

Currently the code may not run directly on Windows, due to some path issues.
