
salesforce / Multihopkg

License: BSD-3-Clause
Multi-hop knowledge graph reasoning learned via policy gradient with reward shaping and action dropout

Projects that are alternatives of or similar to Multihopkg

Text summurization abstractive methods
Multiple implementations for abstractive text summarization, using Google Colab
Stars: ✭ 359 (+77.72%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-40.1%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Trpo
Trust Region Policy Optimization with TensorFlow and OpenAI Gym
Stars: ✭ 343 (+69.8%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Lagom
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Stars: ✭ 364 (+80.2%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Deep Algotrading
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading
Stars: ✭ 173 (-14.36%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Hands On Reinforcement Learning With Python
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Stars: ✭ 640 (+216.83%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Reinforcement learning tutorial with demo
Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc..
Stars: ✭ 442 (+118.81%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Rl Course Experiments
Stars: ✭ 73 (-63.86%)
Mutual labels:  jupyter-notebook, reinforcement-learning, policy-gradient
Modular Rl
[ICML 2020] PyTorch Code for "One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control"
Stars: ✭ 126 (-37.62%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Policy Gradient
Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
Stars: ✭ 135 (-33.17%)
Mutual labels:  reinforcement-learning, policy-gradient
Show Adapt And Tell
Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" in ICCV 2017
Stars: ✭ 146 (-27.72%)
Mutual labels:  reinforcement-learning, policy-gradient
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (-38.61%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Mlds2018spring
Machine Learning and having it Deep and Structured (MLDS) in 2018 spring
Stars: ✭ 124 (-38.61%)
Mutual labels:  reinforcement-learning, policy-gradient
Data Science Question Answer
A repo for data science related questions and answers
Stars: ✭ 2,000 (+890.1%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-40.1%)
Mutual labels:  jupyter-notebook, reinforcement-learning
2048 Deep Reinforcement Learning
Trained A Convolutional Neural Network To Play 2048 using Deep-Reinforcement Learning
Stars: ✭ 169 (-16.34%)
Mutual labels:  jupyter-notebook, reinforcement-learning
A2c
A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow
Stars: ✭ 169 (-16.34%)
Mutual labels:  reinforcement-learning, policy-gradient
Chess Alpha Zero
Chess reinforcement learning by AlphaGo Zero methods.
Stars: ✭ 1,868 (+824.75%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (-13.86%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Andrew Ng Notes
These are Andrew Ng's handwritten Coursera notes.
Stars: ✭ 180 (-10.89%)
Mutual labels:  jupyter-notebook, reinforcement-learning

Multi-Hop Knowledge Graph Reasoning with Reward Shaping

This is the official code release of the following paper:

Xi Victoria Lin, Richard Socher and Caiming Xiong. Multi-Hop Knowledge Graph Reasoning with Reward Shaping. EMNLP 2018.

[Figure: MultiHopKG model architecture overview (multihopkg_architecture)]
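
At a high level, the approach trains a policy network to walk over the knowledge graph with REINFORCE, replaces the sparse binary terminal reward with a soft score from a pretrained embedding model (reward shaping), and randomly masks part of each step's action space during training (action dropout). The sketch below illustrates these three ideas in PyTorch; every name in it (action_dropout, shaped_reward, reinforce_step, score_fn, and so on) is an illustrative assumption, not this repository's actual API.

# Minimal sketch of one policy-gradient update with reward shaping and action
# dropout. Illustrative only: names such as action_dropout, shaped_reward,
# reinforce_step and score_fn are assumptions, not this repository's API.
import torch

def action_dropout(action_logits, keep_prob=0.9):
    # Action dropout: randomly mask a fraction of the candidate actions so the
    # agent keeps exploring instead of collapsing onto a few high-scoring edges.
    keep_mask = (torch.rand_like(action_logits) < keep_prob).float()
    return action_logits + (1.0 - keep_mask) * -1e9

def shaped_reward(reached_answer, score_fn, source, relation, end_entity):
    # Reward shaping: a correct answer earns reward 1; otherwise fall back to
    # the pretrained embedding model's (soft) score for the predicted triple.
    if reached_answer:
        return torch.tensor(1.0)
    with torch.no_grad():
        return torch.sigmoid(score_fn(source, relation, end_entity))

def reinforce_step(log_action_probs, reward, optimizer):
    # REINFORCE: minimize -R * sum_t log pi(a_t | s_t) over the sampled path.
    loss = -reward * torch.stack(log_action_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()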

Quick Start

Environment variables & dependencies

Use Docker

Build the docker image

docker build - < Dockerfile -t multi_hop_kg:v1.0

Spin up a docker container and run experiments inside it.

nvidia-docker run -v `pwd`:/workspace/MultiHopKG -it multi_hop_kg:v1.0

The rest of this README assumes you are working interactively inside the container. If you prefer to run experiments outside a container, adjust the commands accordingly.

Manual setup

Alternatively, you can install PyTorch (>=0.4.1) manually and use the Makefile to set up the rest of the dependencies.

make setup
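
If you set up the environment manually, a quick sanity check like the following (not part of the repository) can confirm that the installed PyTorch version and GPU are visible before launching experiments.

# Optional sanity check (not part of this repository): verify the PyTorch
# version and GPU visibility before running experiments.
import torch

print('PyTorch version:', torch.__version__)    # expect >= 0.4.1
print('CUDA available: ', torch.cuda.is_available())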

Process data

First, unpack the data files

tar xvzf data-release.tgz

and run the following command to preprocess the datasets.

./experiment.sh configs/<dataset>.sh --process_data <gpu-ID>

<dataset> is the name of any dataset folder in the ./data directory. In our experiments, the five datasets used are: umls, kinship, fb15k-237, wn18rr and nell-995. <gpu-ID> is a non-negative integer representing the GPU index.
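
Preprocessing of this kind typically builds entity and relation vocabularies plus an adjacency structure from the raw triple files. The sketch below only illustrates that idea, assuming tab-separated head/relation/tail triples, one per line; the actual file layout inside data-release.tgz and the repository's preprocessing code may differ.

# Illustrative triple preprocessing sketch (not the repository's code):
# build entity/relation vocabularies and an adjacency list, assuming one
# tab-separated "head<TAB>relation<TAB>tail" triple per line.
from collections import defaultdict

def load_triples(path):
    entity2id, relation2id = {}, {}
    adjacency = defaultdict(list)   # head id -> list of (relation id, tail id)

    def index(vocab, key):
        # Assign the next integer id to an unseen entity or relation.
        if key not in vocab:
            vocab[key] = len(vocab)
        return vocab[key]

    with open(path) as f:
        for line in f:
            head, relation, tail = line.strip().split('\t')
            h = index(entity2id, head)
            r = index(relation2id, relation)
            t = index(entity2id, tail)
            adjacency[h].append((r, t))
    return entity2id, relation2id, adjacency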

Train models

The following commands can then be used to train the proposed models and the baselines from the paper. By default, dev set evaluation results are printed when training terminates.

  1. Train embedding-based models
./experiment-emb.sh configs/<dataset>-<emb_model>.sh --train <gpu-ID>

The following embedding-based models are implemented: distmult, complex and conve (see the scoring sketch after this list).

  2. Train RL models (policy gradient)
./experiment.sh configs/<dataset>.sh --train <gpu-ID>
  3. Train RL models (policy gradient + reward shaping)
./experiment-rs.sh configs/<dataset>-rs.sh --train <gpu-ID>
  • Note: To train the RL models with reward shaping, make sure 1) you have pre-trained the embedding-based models and 2) you have set the file path pointers to the pre-trained embedding-based models correctly (see the example configuration file).
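
As a point of reference for the embedding-based models listed above, DistMult (the simplest of the three) scores a candidate triple with a trilinear dot product of the head, relation and tail embeddings. The sketch below is illustrative only and assumes [batch, dim] embedding tensors; it is not this repository's implementation of distmult, complex or conve.

# Illustrative DistMult scoring sketch (assumed tensor shapes, not the
# repository's code): score(h, r, t) = sum_i h_i * r_i * t_i.
import torch

def distmult_score(head_emb, relation_emb, tail_emb):
    # Element-wise product of the three embeddings, summed over the embedding
    # dimension; higher scores indicate more plausible triples.
    return (head_emb * relation_emb * tail_emb).sum(dim=-1)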

Evaluate pretrained models

To generate the evaluation results of a pre-trained model, simply change the --train flag in the commands above to --inference.

For example, the following command performs inference with the RL models (policy gradient + reward shaping) and prints the evaluation results (on both dev and test sets).

./experiment-rs.sh configs/<dataset>-rs.sh --inference <gpu-ID>
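
The paper reports standard link-prediction metrics (Hits@k and mean reciprocal rank). The generic sketch below shows how such metrics are computed from the 1-based rank of the correct answer in each ranked candidate list; it is not the repository's evaluation code.

# Generic Hits@k / MRR computation over ranked link-prediction output
# (illustrative, not the repository's evaluation code). `ranks` holds the
# 1-based position of the correct answer for each test query.
def hits_at_k(ranks, k):
    return sum(1 for rank in ranks if rank <= k) / len(ranks)

def mean_reciprocal_rank(ranks):
    return sum(1.0 / rank for rank in ranks) / len(ranks)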

To print the inference paths generated by beam search during inference, use the --save_beam_search_paths flag:

./experiment-rs.sh configs/<dataset>-rs.sh --inference <gpu-ID> --save_beam_search_paths
  • Note for the NELL-995 dataset:

    On this dataset we split the original training data into train.triples and dev.triples, and the final model submitted for testing has to be trained on these two files combined.

    1. To obtain the correct test set results, you need to add the --test flag to all data pre-processing, training and inference commands.
    # You may need to adjust the number of training epochs based on dev set performance.
    
    ./experiment.sh configs/nell-995.sh --process_data <gpu-ID> --test
    ./experiment-emb.sh configs/nell-995-conve.sh --train <gpu-ID> --test
    ./experiment-rs.sh configs/nell-995-rs.sh --train <gpu-ID> --test
    ./experiment-rs.sh configs/nell-995-rs.sh --inference <gpu-ID> --test
    
    2. Leave out the --test flag during development.

Change the hyperparameters

To change the hyperparameters and other experiment settings, start from the configuration files.

More on implementation details

We use mini-batch training in our experiments. To reduce the amount of padding (which can cause memory issues and slow down computation for knowledge graphs containing nodes with large fan-outs), we group the action spaces of different nodes into buckets based on their sizes. A description of the bucket implementation can be found here and here.
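
A minimal sketch of this bucketing idea is shown below; the bucket interval, data structures and function names are illustrative assumptions, not the repository's actual implementation.

# Illustrative action-space bucketing sketch (not the repository's code).
# Each node's action space is its list of outgoing (relation, tail) edges;
# nodes are grouped so that padding only goes up to the bucket bound rather
# than the global maximum fan-out.
from collections import defaultdict

def bucket_action_spaces(action_spaces, bucket_interval=10):
    # action_spaces: dict mapping node id -> list of (relation, tail) actions.
    buckets = defaultdict(list)
    for node, actions in action_spaces.items():
        # Round the action-space size up to the nearest bucket boundary.
        bound = ((len(actions) + bucket_interval - 1) // bucket_interval) * bucket_interval
        buckets[bound].append(node)
    # Nodes within the same bucket can be padded to the same (small) width.
    return buckets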

Citation

If you find the resource in this repository helpful, please cite

@inproceedings{LinRX2018:MultiHopKG, 
  author = {Xi Victoria Lin and Richard Socher and Caiming Xiong}, 
  title = {Multi-Hop Knowledge Graph Reasoning with Reward Shaping}, 
  booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural
               Language Processing, {EMNLP} 2018, Brussels, Belgium, October
               31-November 4, 2018},
  year = {2018} 
}