
tatsuokun / context2vec

License: BSD-3-Clause
PyTorch implementation of context2vec from Melamud et al., CoNLL 2016

Programming Languages

python

Projects that are alternatives of or similar to context2vec

word2vec-on-wikipedia
A pipeline for training word embeddings using word2vec on wikipedia corpus.
Stars: ✭ 68 (+277.78%)
Mutual labels:  word-embeddings
Active-Explainable-Classification
A set of tools for leveraging pre-trained embeddings, active learning and model explainability for efficient document classification
Stars: ✭ 28 (+55.56%)
Mutual labels:  word-embeddings
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+227.78%)
Mutual labels:  word-embeddings
contextualLSTM
Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning
Stars: ✭ 28 (+55.56%)
Mutual labels:  word-embeddings
MorphologicalPriorsForWordEmbeddings
Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings
Stars: ✭ 53 (+194.44%)
Mutual labels:  word-embeddings
robot-mind-meld
A little game powered by word vectors
Stars: ✭ 31 (+72.22%)
Mutual labels:  word-embeddings
PersianNER
Named-Entity Recognition in Persian Language
Stars: ✭ 48 (+166.67%)
Mutual labels:  word-embeddings
lda2vec
Mixing Dirichlet topic models and word embeddings to make lda2vec, from the paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (+50%)
Mutual labels:  word-embeddings
compress-fasttext
Tools for shrinking fastText models (in gensim format)
Stars: ✭ 124 (+588.89%)
Mutual labels:  word-embeddings
SiameseCBOW
Implementation of Siamese CBOW using Keras with a TensorFlow backend.
Stars: ✭ 14 (-22.22%)
Mutual labels:  word-embeddings
word-benchmarks
Benchmarks for intrinsic word embeddings evaluation.
Stars: ✭ 45 (+150%)
Mutual labels:  word-embeddings
pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (+244.44%)
Mutual labels:  word-embeddings
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (+11.11%)
Mutual labels:  word-embeddings
dasem
Danish Semantic analysis
Stars: ✭ 17 (-5.56%)
Mutual labels:  word-embeddings
QuestionClustering
Question classifier written in Python 3, implemented in the following video: https://youtu.be/qnlW1m6lPoY
Stars: ✭ 15 (-16.67%)
Mutual labels:  word-embeddings
S-WMD
Code for Supervised Word Mover's Distance (SWMD)
Stars: ✭ 90 (+400%)
Mutual labels:  word-embeddings
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+205.56%)
Mutual labels:  word-embeddings
wikidata-corpus
Train Wikidata with word2vec for word embedding tasks
Stars: ✭ 109 (+505.56%)
Mutual labels:  word-embeddings
materials-synthesis-generative-models
Public release of data and code for materials synthesis generation
Stars: ✭ 47 (+161.11%)
Mutual labels:  word-embeddings
SIFRank
The code of our paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model"
Stars: ✭ 96 (+433.33%)
Mutual labels:  word-embeddings

context2vec: Learning Generic Context Embedding with Bidirectional LSTM, Melamud et al., CoNLL 2016

This is a PyTorch implementation of context2vec, which learns context vectors using a bidirectional LSTM.
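As a rough illustration of the architecture (a minimal sketch, not this repository's actual code; the class name, layer sizes, and the ReLU in the MLP are assumptions), the model reads the words to the left of a target position with one LSTM, the words to its right with a second LSTM running in the opposite direction, and merges the two final hidden states with a multi-layer perceptron into a single context vector:

import torch
import torch.nn as nn

class Context2VecSketch(nn.Module):
    """Minimal sketch of a context2vec-style encoder (sizes illustrative)."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One LSTM reads the left context left-to-right,
        # the other reads the right context right-to-left.
        self.l2r = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.r2l = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # An MLP merges both directions into one context vector.
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, left_ids, right_ids):
        # left_ids:  (batch, left_len), words preceding the target
        # right_ids: (batch, right_len), words following the target,
        #            pre-reversed so the LSTM consumes them backwards
        _, (h_l, _) = self.l2r(self.embed(left_ids))
        _, (h_r, _) = self.r2l(self.embed(right_ids))
        # The final hidden state of each LSTM summarizes one side.
        return self.mlp(torch.cat([h_l[-1], h_r[-1]], dim=1))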

Requirements

Framework

  • python (<= 3.6)
  • pytorch (<= 0.4.1)

Packages

  • torchtext
  • nltk

Quick Run

Train

python -m src --train

This runs on the CPU and learns context vectors from a small sample of the Penn Treebank included in the repository. The trained model and the embedding file are stored at models/model.param and models/embedding.vec, respectively. (Note that you must pass the --train flag to train the model; otherwise the program starts in inference mode.)
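Under the hood, training uses a word2vec-style negative-sampling objective: the context vector at each position is pulled toward the embedding of the word that actually fills that position and pushed away from sampled noise words. A hedged sketch of such a loss (the function and tensor names are illustrative, not this repository's exact code):

import torch
import torch.nn.functional as F

def negative_sampling_loss(context_vec, target_embed, target_ids, noise_ids):
    # context_vec: (batch, dim), output of the bidirectional-LSTM encoder
    # target_embed: nn.Embedding holding the target-word vectors
    # target_ids: (batch,), the true word at each position
    # noise_ids: (batch, k), sampled negative words
    pos = target_embed(target_ids)                   # (batch, dim)
    neg = target_embed(noise_ids)                    # (batch, k, dim)
    pos_score = (context_vec * pos).sum(dim=1)       # similarity to true word
    neg_score = torch.bmm(neg, context_vec.unsqueeze(2)).squeeze(2)
    # Maximize similarity with the target, minimize it with the noise.
    return -(F.logsigmoid(pos_score)
             + F.logsigmoid(-neg_score).sum(dim=1)).mean()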

Inference

python -m src
>> I am a [] .

(Note that you may not get good results with a model trained on the Penn Treebank sample (dataset/sample.txt), because it does not contain enough data to learn reliable context vectors. The sample is included only so you can quickly check that the program works.)
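Conceptually, filling in the [] slot amounts to encoding the surrounding words into a context vector and ranking every target-word embedding by cosine similarity to it. A minimal sketch of this lookup (names such as fill_blank and itos are assumptions for illustration, not this repository's API):

import torch

def fill_blank(model, target_embed, left_ids, right_ids, itos, topk=10):
    # Rank candidate words for the [] slot by cosine similarity
    # between the context vector and each target-word embedding.
    with torch.no_grad():
        ctx = model(left_ids, right_ids)              # (1, dim)
        ctx = ctx / ctx.norm(dim=1, keepdim=True)
        emb = target_embed.weight                     # (vocab, dim)
        emb = emb / emb.norm(dim=1, keepdim=True)
        scores = emb @ ctx.squeeze(0)                 # (vocab,)
        best = scores.topk(topk).indices
    return [itos[i] for i in best.tolist()]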

Running with GPU and other settings

Train

Run on GPU 0, reading the training corpus from INPUT_FILE and writing the embedding file to OUTPUT_EMBEDDING_FILE and the model parameters to MODEL_FILE. (Other detailed settings are configured in config.toml.)

python -m src -g 0 -i INPUT_FILE -w OUTPUT_EMBEDDING_FILE -m MODEL_FILE --train

Inference

python -m src -w WORD_EMBEDDING_FILE -m MODEL_FILE

Performance

Training Speed

Training is approximately 3x faster than the original (Chainer) implementation.

MSR Sentence Completion

After setting your question/answer file in config.toml, run

python -m src --task mscc -w WORD_EMBEDDING_FILE -m MODEL_FILE
       Reported score   This implementation
TEST   64.0             65.9
ALL    65.1             65.8
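The sentence-completion task follows the same recipe per question: encode the context around the blank once, score each candidate answer by the similarity between its target embedding and the context vector, and pick the argmax. A hedged sketch (names are illustrative, not this repository's exact code):

import torch

def answer_mscc_question(model, target_embed, left_ids, right_ids, candidate_ids):
    # Pick the candidate whose target embedding best matches the context.
    with torch.no_grad():
        ctx = model(left_ids, right_ids).squeeze(0)   # (dim,)
        ctx = ctx / ctx.norm()
        cand = target_embed(candidate_ids)            # (n_candidates, dim)
        cand = cand / cand.norm(dim=1, keepdim=True)
        scores = cand @ ctx                           # cosine per candidate
    return int(scores.argmax())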

Reference

  • The original implementation (written in Chainer) by the author is available at https://github.com/orenmel/context2vec.
@InProceedings{K16-1006,
  author    = "Melamud, Oren and Goldberger, Jacob and Dagan, Ido",
  title     = "context2vec: Learning Generic Context Embedding with Bidirectional LSTM",
  booktitle = "Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning",
  year      = "2016",
  publisher = "Association for Computational Linguistics",
  pages     = "51--61",
  location  = "Berlin, Germany",
  doi       = "10.18653/v1/K16-1006",
  url       = "http://www.aclweb.org/anthology/K16-1006"
}