
dasguptar / Treelstm.pytorch

License: MIT
Tree LSTM implementation in PyTorch

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Treelstm.pytorch

type4py
Type4Py: Deep Similarity Learning-Based Type Inference for Python
Stars: ✭ 41 (-91.39%)
Mutual labels:  machinelearning, deeplearning
Onepanel
The open and extensible integrated development environment (IDE) for computer vision with built-in modules for model building, automated labeling, data processing, model training, hyperparameter tuning and workflow orchestration.
Stars: ✭ 428 (-10.08%)
Mutual labels:  deeplearning, machinelearning
Groundbreaking-Papers
ML Research paper summaries, annotated papers and implementation walkthroughs
Stars: ✭ 90 (-81.09%)
Mutual labels:  machinelearning, deeplearning
Nearest-Celebrity-Face
Tensorflow Implementation of FaceNet: A Unified Embedding for Face Recognition and Clustering to find the celebrity whose face matches the closest to yours.
Stars: ✭ 30 (-93.7%)
Mutual labels:  machinelearning, deeplearning
Tensorwatch
Debugging, monitoring and visualization for Python Machine Learning and Data Science
Stars: ✭ 3,191 (+570.38%)
Mutual labels:  deeplearning, machinelearning
Kapsul-Aglari-ile-Isaret-Dili-Tanima
Recognition of Sign Language using Capsule Networks
Stars: ✭ 42 (-91.18%)
Mutual labels:  machinelearning, deeplearning
Forecasting-Solar-Energy
Forecasting Solar Power: Analysis of using an LSTM Neural Network
Stars: ✭ 23 (-95.17%)
Mutual labels:  machinelearning, deeplearning
Awesome Deep Learning And Machine Learning Questions
[Updated irregularly] A curated collection of valuable questions related to deep learning, machine learning, reinforcement learning, and data science, gathered from sites such as Zhihu, Quora, Reddit, and Stack Exchange
Stars: ✭ 203 (-57.35%)
Mutual labels:  deeplearning, machinelearning
Solaris
CosmiQ Works Geospatial Machine Learning Analysis Toolkit
Stars: ✭ 290 (-39.08%)
Mutual labels:  deeplearning, machinelearning
barrage
Barrage is an opinionated supervised deep learning tool built on top of TensorFlow 2.x designed to standardize and orchestrate the training and scoring of complicated models.
Stars: ✭ 16 (-96.64%)
Mutual labels:  machinelearning, deeplearning
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks, with Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-96.01%)
Mutual labels:  machinelearning, deeplearning
Text summurization abstractive methods
Multiple implementations for abstractive text summarization, using Google Colab
Stars: ✭ 359 (-24.58%)
Mutual labels:  deeplearning, machinelearning
awesome-conformal-prediction
A professionally curated list of awesome Conformal Prediction videos, tutorials, books, papers, PhD and MSc theses, articles and open-source libraries.
Stars: ✭ 998 (+109.66%)
Mutual labels:  machinelearning, deeplearning
datascience-mashup
In this repo I will try to gather all of the projects related to data science with clean datasets and high accuracy models to solve real world problems.
Stars: ✭ 36 (-92.44%)
Mutual labels:  machinelearning, deeplearning
Netron
Visualizer for neural network, deep learning, and machine learning models
Stars: ✭ 17,193 (+3511.97%)
Mutual labels:  deeplearning, machinelearning
dst
yet another custom data science template via cookiecutter
Stars: ✭ 59 (-87.61%)
Mutual labels:  machinelearning, deeplearning
Clearml Server
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, ML-Ops and Data-Management
Stars: ✭ 186 (-60.92%)
Mutual labels:  deeplearning, machinelearning
Clearml
ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
Stars: ✭ 2,868 (+502.52%)
Mutual labels:  deeplearning, machinelearning
Data-Scientist-In-Python
This repository contains notes and projects from the Data Scientist track of the Dataquest coursework.
Stars: ✭ 23 (-95.17%)
Mutual labels:  machinelearning, deeplearning
Awesome Segmentation Saliency Dataset
A collection of some datasets for segmentation / saliency detection. PRs welcome! 😄
Stars: ✭ 315 (-33.82%)
Mutual labels:  deeplearning, machinelearning

Tree-Structured Long Short-Term Memory Networks

This is a PyTorch implementation of Tree-LSTM as described in the paper Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks by Kai Sheng Tai, Richard Socher, and Christopher Manning. On the semantic similarity task using the SICK dataset, this implementation reaches:

  • Pearson's coefficient: 0.8492 and MSE: 0.2842 using hyperparameters --lr 0.010 --wd 0.0001 --optim adagrad --batchsize 25
  • Pearson's coefficient: 0.8674 and MSE: 0.2536 using hyperparameters --lr 0.025 --wd 0.0001 --optim adagrad --batchsize 25 --freeze_embed
  • Pearson's coefficient: 0.8676 and MSE: 0.2532 are the numbers reported in the original paper.
  • Known differences include the way the gradients are accumulated (normalized by batchsize or not).
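
For orientation, here is a minimal sketch of one Child-Sum Tree-LSTM node update, following the equations in the Tai et al. paper. The class and parameter names are illustrative, not necessarily those used in this repository:

```python
# A minimal sketch of a Child-Sum Tree-LSTM node update (Tai et al.);
# names here are illustrative, not this repo's API.
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, in_dim, mem_dim):
        super().__init__()
        # i, o, u gates are conditioned on the *sum* of the children's hidden states
        self.ioux = nn.Linear(in_dim, 3 * mem_dim)
        self.iouh = nn.Linear(mem_dim, 3 * mem_dim)
        # the forget gate is computed separately for each child
        self.fx = nn.Linear(in_dim, mem_dim)
        self.fh = nn.Linear(mem_dim, mem_dim)

    def forward(self, x, child_c, child_h):
        # x: (in_dim,) node input; child_c, child_h: (num_children, mem_dim)
        # for a leaf node, pass zero tensors of shape (1, mem_dim)
        h_tilde = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.ioux(x) + self.iouh(h_tilde), 3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.fx(x).unsqueeze(0) + self.fh(child_h))  # per-child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return c, h
```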

Requirements

  • Python (tested on 3.6.5, should work on >=2.7)
  • Java >= 8 (for Stanford CoreNLP utilities)
  • Other dependencies are in requirements.txt. Note: this currently works with PyTorch 0.4.0; switch to the pytorch-v0.3.1 branch if you want to use PyTorch 0.3.1.

Usage

Before delving into how to run the code, here is a quick overview of the contents:

  • Use the script fetch_and_preprocess.sh to download the SICK dataset, the Stanford Parser and Stanford POS Tagger, and the GloVe word vectors (Common Crawl 840; warning: this is a 2GB download!), and to preprocess the data, i.e. generate dependency parses using the Stanford Neural Network Dependency Parser.
  • main.py does the actual heavy lifting of training the model and testing it on the SICK dataset. For a list of all command-line arguments, have a look at config.py.
    • The first run caches the GloVe embeddings for words in the SICK vocabulary; later runs read the embeddings directly from this cache (see the sketch after this list).
    • Logs and model checkpoints are saved to the checkpoints/ directory with the name specified by the command line argument --expname.
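
The caching pattern itself is simple; here is a sketch, in which the cache path and builder function are hypothetical:

```python
# Sketch of the embedding cache; the path and builder function are hypothetical.
import os
import torch

def load_sick_embeddings(cache_path, build_from_glove):
    if os.path.isfile(cache_path):
        return torch.load(cache_path)   # later runs: read the small cached tensor
    emb = build_from_glove()            # first run: scan the full 2GB GloVe file
    torch.save(emb, cache_path)         # cache only the SICK-vocabulary rows
    return emb
```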

Next, here are the different ways to run the code to train a Tree-LSTM model.

Local Python Environment

If you have a working Python3 environment, simply run the following sequence of steps:

- bash fetch_and_preprocess.sh
- pip install -r requirements.txt
- python main.py

Pure Docker Environment

If you want to use a Docker container, simply follow these steps:

- docker build -t treelstm .
- docker run -it treelstm bash
- bash fetch_and_preprocess.sh
- python main.py

Local Filesystem + Docker Environment

If you want to use a Docker container, but want to persist data and checkpoints in your local filesystem, simply follow these steps:

- bash fetch_and_preprocess.sh
- docker build -t treelstm .
- docker run -it --mount type=bind,source="$(pwd)",target="/root/treelstm.pytorch" treelstm bash
- python main.py

NOTE: Setting the environment variable OMP_NUM_THREADS=1 usually gives a speedup on the CPU. Use it like OMP_NUM_THREADS=1 python main.py. To run on a GPU, set the CUDA_VISIBLE_DEVICES environment variable instead, e.g. CUDA_VISIBLE_DEVICES=0 python main.py. Usually, CUDA does not give much speedup here, since we are operating at a batch size of 1.
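
If you prefer to set the thread count from inside Python rather than via the environment, torch.set_num_threads is a roughly equivalent in-process knob:

```python
import torch
torch.set_num_threads(1)  # limit intra-op CPU threads, similar to OMP_NUM_THREADS=1
```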

Notes

  • (Apr 02, 2018) Added Dockerfile
  • (Apr 02, 2018) Now works on PyTorch 0.3.1 and Python 3.6, removed dependency on Python 2.7
  • (Nov 28, 2017) Added frozen embeddings, closed gap to paper.
  • (Nov 08, 2017) Refactored model to get 1.5x - 2x speedup.
  • (Oct 23, 2017) Now works with PyTorch 0.2.0.
  • (May 04, 2017) Added support for sparse tensors. Using the --sparse argument will enable sparse gradient updates for nn.Embedding, potentially reducing memory usage.
    • There are a couple of caveats, however, viz. weight decay will not work in conjunction with sparsity, and results from the original paper might not be reproduced using sparse embeddings.
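
As an illustration of the sparse path (a sketch, not code from this repo): nn.Embedding(sparse=True) produces sparse gradients, which Adagrad supports, though they cannot be combined with weight decay, matching the caveat above.

```python
# Sketch of sparse embedding updates; not code from this repo.
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10000, embedding_dim=300, sparse=True)
# Adagrad accepts sparse gradients; note: no weight_decay, per the caveat above
opt = torch.optim.Adagrad(emb.parameters(), lr=0.025)

ids = torch.tensor([1, 5, 42])
loss = emb(ids).sum()
loss.backward()   # emb.weight.grad is a sparse tensor touching only these rows
opt.step()        # only the touched rows are updated
```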

Acknowledgements

Shout-out to Kai Sheng Tai for the original LuaTorch implementation, and to the PyTorch team for the fun library.

Contact

Riddhiman Dasgupta

This is my first PyTorch based implementation, and might contain bugs. Please let me know if you find any!

License

MIT
