ketranm / Neuralhmm

License: MIT
Code for the Unsupervised Neural Hidden Markov Models paper

Programming Languages

lua
6591 projects

Projects that are alternatives of or similar to Neuralhmm

Sine
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).
Stars: ✭ 67 (+4.69%)
Mutual labels:  unsupervised-learning, torch
Hypergan
Composable GAN framework with api and user interface
Stars: ✭ 1,104 (+1625%)
Mutual labels:  unsupervised-learning
Uc Davis Cs Exams Analysis
📈 Regression and Classification with UC Davis student quiz data and exam data
Stars: ✭ 33 (-48.44%)
Mutual labels:  unsupervised-learning
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-32.81%)
Mutual labels:  unsupervised-learning
Gdynet
Unsupervised learning of atomic scale dynamics from molecular dynamics.
Stars: ✭ 37 (-42.19%)
Mutual labels:  unsupervised-learning
Voxelmorph
Unsupervised Learning for Image Registration
Stars: ✭ 1,057 (+1551.56%)
Mutual labels:  unsupervised-learning
Dlt
Deep Learning Toolbox for Torch
Stars: ✭ 20 (-68.75%)
Mutual labels:  torch
Weakly Supervised 3d Object Detection
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Stars: ✭ 61 (-4.69%)
Mutual labels:  unsupervised-learning
Hzproc
torch data augmentation toolbox (supports affine transform)
Stars: ✭ 56 (-12.5%)
Mutual labels:  torch
Student Teacher Anomaly Detection
Student–Teacher Anomaly Detection with Discriminative Latent Embeddings
Stars: ✭ 43 (-32.81%)
Mutual labels:  unsupervised-learning
Pytorch Cpp
C++ Implementation of PyTorch Tutorials for Everyone
Stars: ✭ 1,014 (+1484.38%)
Mutual labels:  torch
Unsuprevised seg via cnn
Stars: ✭ 38 (-40.62%)
Mutual labels:  unsupervised-learning
Lir For Unsupervised Ir
This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"
Stars: ✭ 53 (-17.19%)
Mutual labels:  unsupervised-learning
Iva
IVA: Independent Vector Analysis implementation
Stars: ✭ 35 (-45.31%)
Mutual labels:  unsupervised-learning
Dgi
TensorFlow implementation of Deep Graph Infomax
Stars: ✭ 58 (-9.37%)
Mutual labels:  unsupervised-learning
Discogan Pytorch
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"
Stars: ✭ 961 (+1401.56%)
Mutual labels:  unsupervised-learning
Susi
SuSi: Python package for unsupervised, supervised and semi-supervised self-organizing maps (SOM)
Stars: ✭ 42 (-34.37%)
Mutual labels:  unsupervised-learning
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+12243.75%)
Mutual labels:  unsupervised-learning
Dmgi
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)
Stars: ✭ 62 (-3.12%)
Mutual labels:  unsupervised-learning
Neuralamr
Sequence-to-sequence models for AMR parsing and generation
Stars: ✭ 60 (-6.25%)
Mutual labels:  torch

neuralHMM

Code for the end-to-end unsupervised neural HMM described in our paper: https://arxiv.org/abs/1609.09007

Unsupervised Neural Hidden Markov Models  
Ke Tran, Yonatan Bisk, Ashish Vaswani, Daniel Marcu, and Kevin Knight   
Workshop on Structured Prediction for Natural Language Processing, EMNLP 2016

Requirements

To run the code, you need a working Torch installation with CUDA support (our experiments use cutorch 1.0; see the TODO below).

Note

  • You can get some speed-up by using cudnn.LSTM instead. However, we see better performance with Justin's LSTM implementation, probably due to parameter initialization, to which unsupervised learning is sensitive.

Code Structure

This code is inspired by Edward. Each model file (e.g. EmiConv.lua, FFTran.lua) exposes a log_prob API that returns log probabilities. The log probabilities of the prior, emission, and transition are then passed to the inference module (BaumWelch), which computes the posteriors.
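
To make the flow concrete, here is a toy Lua sketch of that interface. The module stand-ins, tensor shapes, and names below are illustrative only, not the exact signatures used in this repository.

require 'torch'

-- toy sketch of the log_prob API: stand-ins for the real modules
-- (EmiConv.lua, FFTran.lua, ...); shapes and names are illustrative only
local K, T = 5, 7                                    -- number of hidden states, sentence length

local prior      = { log_prob = function() return torch.rand(K):log() end }
local emission   = { log_prob = function() return torch.rand(T, K):log() end }
local transition = { log_prob = function() return torch.rand(K, K):log() end }

-- each component returns log probabilities for the current batch
local log_prior = prior.log_prob()
local log_emit  = emission.log_prob()
local log_trans = transition.log_prob()

-- these three tensors are what the BaumWelch inference module consumes
-- to compute posteriors over states at every position
print(log_prior:size(1), log_emit:size(1), log_trans:size(1))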

Data preprocessing

  • tokenize and replace every digit with 0 (e.g. tr '0-9' 0); see the sketch after this list
  • DO NOT lowercase
  • one sentence per line
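
The digit replacement can be done with the tr command above or directly in Lua; here is a minimal sketch (the file names are placeholders):

-- replace every digit with 0, keeping case and tokenization intact;
-- input.tok.txt / input.proc.txt are placeholder file names
local fin  = assert(io.open('input.tok.txt', 'r'))
local fout = assert(io.open('input.proc.txt', 'w'))
for line in fin:lines() do
    fout:write((line:gsub('%d', '0')), '\n')
end
fin:close()
fout:close()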

Running the script

The file main.lua is the main script to train and evaluate an unsupervised neural Hidden Markov Model.

To train the full version of the NHMM with convolutional character emission and LSTM transition, run

$ th main.lua -datapath ../wsj -nstates 45 -niters 20 \
 -hidsize 512 -mnbz 256 -nloops 6 -cuda -maxlen 81 \
 -nlayers 3 -modelpath ../save -model lstmconv -conv -lstm

The -lstm option enables the LSTM transition, and -conv enables the Char-CNN emission.

This code is heavily optimized, so you can train on very long sentences. The -maxlen option specifies the maximum sentence length to use. The loader automatically re-arranges the data into batches of length 20, 40, 60, and 80 (when -maxlen 81 is set), as sketched below.
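
Roughly, the bucketing works like the following sketch; this illustrates the idea only and is not the repository's loader code.

-- illustrative sketch of the bucketing idea, assuming the
-- bucket boundaries 20/40/60/80 described above
local maxlen  = 81
local buckets = {20, 40, 60, 80}
local batches = {}
for _, b in ipairs(buckets) do batches[b] = {} end

-- toy input: each sentence is a table of tokens (digits already mapped to 0)
local sentences = {
    {'Pierre', 'Vinken', ',', '00', 'years', 'old'},
    {'The', 'dog', 'barks'},
}

for _, sent in ipairs(sentences) do
    if #sent < maxlen then
        -- put the sentence in the smallest bucket that fits it
        for _, b in ipairs(buckets) do
            if #sent <= b then
                table.insert(batches[b], sent)
                break
            end
        end
    end
end
print(#batches[20], #batches[40])   -- 2  0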

For evaluation, assuming you have a trained model file ../save/nhmm.iter19.t7, you can perform tagging with

$ th main.lua -datapath ../wsj -nstates 45 -hidsize 512 \
-nlayers 3 -cuda -model ../save/nhmm.iter19.t7 -conv -lstm \
-input your_text_file.txt -output tagged_file.txt

If -cuda is provided, the model runs on the GPU. The Char-CNN (-conv option) requires a GPU.

Bells and Whistles

Probing

Since unsupervised models are sensitive to initialization, ideally you would run dozens (or even hundreds) of instances of the NHMM and pick the best one for testing based on likelihood. This is rather expensive. We find that, across runs, the final likelihood (at the last iteration) correlates highly with the likelihood at the first iteration. We therefore introduce a trick called probing: we run the first iteration nprobes times, each time resetting all parameters and using a different random seed, and then continue from the run with the highest log-likelihood. Setting nprobes to 5 usually produces good results.
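
In pseudo-code, probing amounts to the toy Lua sketch below; the model and training helper are stand-ins, not the actual functions in main.lua.

-- toy demonstration of probing: run the "first iteration" nprobes times with
-- different seeds, then continue from the seed with the best log-likelihood
require 'torch'

local model = { reset = function(self) self.w = torch.randn(10) end }
local function train_one_iteration(m) return -torch.norm(m.w) end   -- fake log-likelihood

local nprobes = 5
local best_seed, best_loglik = 0, -math.huge
for probe = 1, nprobes do
    torch.manualSeed(probe)
    model:reset()                               -- re-initialize all parameters
    local loglik = train_one_iteration(model)
    if loglik > best_loglik then best_seed, best_loglik = probe, loglik end
end

-- restart from the best probe and run the remaining iterations from there
torch.manualSeed(best_seed)
model:reset()
print(('best seed: %d, first-iteration log-likelihood: %.3f'):format(best_seed, best_loglik))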

NOTE: We didn't use this trick in our paper. If you do not have development data to evaluate your model, using probing is strongly advised.

Adding noise

Another trick to get good performance is adding gradient noise (https://arxiv.org/abs/1511.06807). We didn't use this trick in our paper, but empirically we find that adding gradient noise often gives stable and good performance.
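
A minimal Torch sketch of annealed Gaussian gradient noise is shown below. The noise schedule (variance eta / (1 + t)^gamma with eta = 0.01 and gamma = 0.55) follows the paper linked above; none of these values come from this repository.

require 'nn'

-- sketch: decaying Gaussian noise added to the flattened gradients before
-- each update; eta and gamma follow the linked paper, not this repository
local eta, gamma = 0.01, 0.55

local model = nn.Linear(10, 5)                    -- stand-in for the NHMM modules
local params, gradParams = model:getParameters()  -- flattened parameter/gradient views

local function add_gradient_noise(t)
    -- t is the current update step (1-based)
    local sigma = math.sqrt(eta / math.pow(1 + t, gamma))
    gradParams:add(gradParams:clone():normal(0, sigma))
end

-- typical usage inside the training loop, right after the backward pass
-- and before the optimizer update:
--   model:backward(input, gradOutput)
--   add_gradient_noise(step)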

Evaluation

The evaluation script was prepared by Yonatan Bisk and is quite self-contained. It expects one sentence per line, with tags separated by whitespace.

$ python eval.py predicted_tagged.txt gold_tagged.txt

TODO:

We use cutorch 1.0 in our experiments. The code still needs to be made compatible with the latest cutorch:

  • Map the corresponding CPU tensors to CUDA tensors

Acknowledgment

Parts of this code are adapted from other open-source implementations, including Justin's LSTM implementation mentioned in the note above.
