swabhs / Joint Lstm Parser

Transition-based joint syntactic dependency parser and semantic role labeler using a stack LSTM RNN architecture.

Projects that are alternatives of or similar to Joint Lstm Parser

Lingua Franca
Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-10.53%)
Mutual labels:  natural-language-processing
Market Reporter
Automatic Generation of Brief Summaries of Time-Series Data
Stars: ✭ 54 (-5.26%)
Mutual labels:  natural-language-processing
Demos
Some JavaScript works published as demos, mostly ML or DS
Stars: ✭ 55 (-3.51%)
Mutual labels:  natural-language-processing
Nlp Various Tutorials
A repository of various tutorials on natural language processing
Stars: ✭ 52 (-8.77%)
Mutual labels:  natural-language-processing
Thot
Thot toolkit for statistical machine translation
Stars: ✭ 53 (-7.02%)
Mutual labels:  natural-language-processing
Scdv
Text classification with Sparse Composite Document Vectors.
Stars: ✭ 54 (-5.26%)
Mutual labels:  natural-language-processing
Corenlp
Stanford CoreNLP: A Java suite of core NLP tools.
Stars: ✭ 8,248 (+14370.18%)
Mutual labels:  natural-language-processing
Quaterniontransformers
Repository for ACL 2019 paper
Stars: ✭ 56 (-1.75%)
Mutual labels:  natural-language-processing
Nltk Book Resource
Notes and solutions to complement the official NLTK book
Stars: ✭ 54 (-5.26%)
Mutual labels:  natural-language-processing
Coarij
Corpus of Annual Reports in Japan
Stars: ✭ 55 (-3.51%)
Mutual labels:  natural-language-processing
Python Tutorial Notebooks
Python tutorials as Jupyter Notebooks for NLP, ML, AI
Stars: ✭ 52 (-8.77%)
Mutual labels:  natural-language-processing
Notes
The notes for Math, Machine Learning, Deep Learning and Research papers.
Stars: ✭ 53 (-7.02%)
Mutual labels:  natural-language-processing
Emotion Detector
Python code to detect emotions from text
Stars: ✭ 54 (-5.26%)
Mutual labels:  natural-language-processing
Iob2corpus
Japanese IOB2 tagged corpus for Named Entity Recognition.
Stars: ✭ 51 (-10.53%)
Mutual labels:  natural-language-processing
Research papers
Records papers I have read and the notes I have taken, along with some reading lists of notable papers and academic blog posts.
Stars: ✭ 55 (-3.51%)
Mutual labels:  natural-language-processing
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-12.28%)
Mutual labels:  natural-language-processing
Jieba Php
"結巴"中文分詞:做最好的 PHP 中文分詞、中文斷詞組件。 / "Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.
Stars: ✭ 1,073 (+1782.46%)
Mutual labels:  natural-language-processing
Li emnlp 2017
Deep Recurrent Generative Decoder for Abstractive Text Summarization in DyNet
Stars: ✭ 56 (-1.75%)
Mutual labels:  natural-language-processing
Hmtl
🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
Stars: ✭ 1,084 (+1801.75%)
Mutual labels:  natural-language-processing
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (-3.51%)
Mutual labels:  natural-language-processing

lstm-parser

Transition-based joint syntactic dependency parser and semantic role labeler using a stack LSTM RNN architecture.

Required software

Building the parser requires a C++ compiler with C++11 support, CMake, and the Eigen library (see the build instructions below). The preprocessing and evaluation steps additionally require a Java runtime (for jointOracle.jar) and Perl (for eval09.pl).

Checking out the project for the first time

The first time you clone the repository, you need to sync the cnn/ submodule.

git submodule init
git submodule update
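
Equivalently, the two submodule commands can be combined into one:

git submodule update --init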

Build instructions

mkdir build
cd build
cmake .. -DEIGEN3_INCLUDE_DIR=/path/to/eigen
make -j2

Train a parsing model

The training and development data must be in the CoNLL 2009 format; for best performance, it is suggested to projectivize the treebank first. As a preprocessing step, convert the CoNLL 2009 files into transition sequences, *.transitions (the format consumed by the joint parser), using the following commands. The relative paths here and in the parser invocations below suggest that all commands are run from the build/ directory.

java -jar ../jointOracle.jar -inp train.conll -lemmas true > train.transitions
java -jar ../jointOracle.jar -inp dev.conll > dev.transitions
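
To make the expected input concrete: CoNLL 2009 files are tab-separated, one token per line with a blank line between sentences, and the shared-task columns are ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD DEPREL PDEPREL FILLPRED PRED APRED1..APREDn. The following Python sketch (an illustrative helper, not part of this repository) reads such a file and flags non-projective sentences, using the standard characterization that a single-rooted dependency tree is projective iff no two of its arcs cross.

def read_conll09(path):
    # Yield sentences from a CoNLL 2009 file as lists of column lists.
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                 # a blank line ends a sentence
                if sentence:
                    yield sentence
                sentence = []
            else:
                sentence.append(line.split("\t"))
    if sentence:                         # tolerate a missing final blank line
        yield sentence

def is_projective(heads):
    # heads[d-1] is the head of token d (1-based); 0 denotes the root.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for i, (a, b) in enumerate(arcs):
        for c, d in arcs[i + 1:]:
            if a < c < b < d or c < a < d < b:  # spans strictly interleave
                return False
    return True

for n, sent in enumerate(read_conll09("train.conll"), start=1):
    heads = [int(cols[8]) for cols in sent]  # gold HEAD is the 9th column
    if not is_projective(heads):
        print("sentence", n, "is non-projective")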

Note that the option "lemmas" must be set to true for the training data, so that an auxiliary file, train.conll.pb.lemmas, is generated; it saves the lemmas of all the predicate words. The joint parser can now run on the generated files.

parser/lstm-parse -T train.transitions -d dev.transitions -w sskip.100.vectors --propbank_lemmas train.conll.pb.lemmas -g dev.conll -e ../eval09.pl -s dev.predictions.conll --out_model joint.model -t

The word vectors used for English in the ACL 2015 paper are the sskip.100.vectors embeddings. The evaluation script, eval09.pl, is provided by CoNLL 2009. The trained model is written to the current directory.
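
The -w file is assumed here to use the common plain-text embedding layout, one word followed by its 100 vector components per line; this layout is an assumption about sskip.100.vectors, not something this README specifies. A minimal loader under that assumption:

def load_vectors(path):
    # One embedding per line: "word v1 v2 ... v100" (assumed layout).
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) > 1:
                vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors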

Note-1: the parser can also be run without word embeddings by removing the -w option from both the training and parsing commands.

Note-2: training should be stopped once the development Macro F1 no longer improves substantially (see the stopping-rule sketch after these notes).

Note-3: the default hyperparameters match those used for the CoNLL 2009 English model in the paper; they can be changed via the command-line options defined in lstm-parse.cc.
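
A minimal sketch of the stopping rule in Note-2 (patience-based early stopping; the function and its parameters are illustrative, not options of lstm-parse):

def should_stop(dev_f1_history, patience=5, min_delta=0.1):
    # Stop once the best dev Macro F1 in the last `patience` evaluations
    # no longer beats the previous best by at least `min_delta`.
    if len(dev_f1_history) <= patience:
        return False
    best_before = max(dev_f1_history[:-patience])
    return max(dev_f1_history[-patience:]) < best_before + min_delta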

Parse data with your parsing model

The test.conll file must also be in the CoNLL 2009 format.

java -jar ../jointOracle.jar -inp test.conll > test.transitions

parser/lstm-parse -T train.transitions -d test.transitions -w sskip.100.vectors --propbank_lemmas train.conll.pb.lemmas -m joint.model -s test.predictions.conll -g test.conll -e ../eval09.pl 

The parser will output a file test.predictions.conll with predicted syntax and SRL dependencies.
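
For a quick sanity check on this output (eval09.pl remains the official scorer), here is a toy labeled attachment score over the gold and predicted files, reusing read_conll09 from the earlier sketch and assuming the predictions occupy the HEAD and DEPREL columns:

def las(gold_path, pred_path):
    # Fraction of tokens whose predicted HEAD and DEPREL match gold.
    correct = total = 0
    for gold, pred in zip(read_conll09(gold_path), read_conll09(pred_path)):
        for g, p in zip(gold, pred):
            total += 1
            correct += g[8] == p[8] and g[10] == p[10]
    return correct / total

print("LAS: %.2f%%" % (100 * las("test.conll", "test.predictions.conll")))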

Citation

If you make use of this software, please cite the following:

@inproceedings{swayamdipta:2016conll,
  author    = {Swabha Swayamdipta and Miguel Ballesteros and Chris Dyer and Noah A. Smith},
  title     = {Greedy, Joint Syntactic-Semantic Parsing with Stack {LSTM}s},
  booktitle = {Proc. of CoNLL},
  year      = {2016}
}

Contact

For questions and usage issues, please contact [email protected]
