
harsh19 / SPINE

Licence: other
Code for SPINE - Sparse Interpretable Neural Embeddings. Jhamtani H.*, Pruthi D.*, Subramanian A.*, Berg-Kirkpatrick T., Hovy E. AAAI 2018

Programming Languages

Python
Jupyter Notebook
Shell

Projects that are alternatives to or similar to SPINE

yggdrasil-decision-forests
A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models.
Stars: ✭ 156 (+254.55%)
Mutual labels:  interpretability
Naive-Resume-Matching
Text similarity applied to resumes: compares resumes with job descriptions and creates a score to rank them, similar to an ATS.
Stars: ✭ 27 (-38.64%)
Mutual labels:  word-embeddings
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (-20.45%)
Mutual labels:  word-embeddings
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+15.91%)
Mutual labels:  word-embeddings
summit
🏔️ Summit: Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations
Stars: ✭ 95 (+115.91%)
Mutual labels:  interpretability
conec
Context Encoders (ConEc) as a simple but powerful extension of the word2vec model for learning word embeddings
Stars: ✭ 20 (-54.55%)
Mutual labels:  word-embeddings
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-38.64%)
Mutual labels:  word-embeddings
yelp comments classification nlp
Yelp round-10 review comments classification using deep learning (LSTM and CNN) and natural language processing.
Stars: ✭ 72 (+63.64%)
Mutual labels:  word-embeddings
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-27.27%)
Mutual labels:  word-embeddings
textlytics
Text processing library for sentiment analysis and related tasks
Stars: ✭ 25 (-43.18%)
Mutual labels:  word-embeddings
sage
For calculating global feature importance using Shapley values.
Stars: ✭ 129 (+193.18%)
Mutual labels:  interpretability
Word-recognition-EmbedNet-CAB
Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"
Stars: ✭ 19 (-56.82%)
Mutual labels:  word-embeddings
neuron-importance-zsl
[ECCV 2018] code for Choose Your Neuron: Incorporating Domain Knowledge Through Neuron Importance
Stars: ✭ 56 (+27.27%)
Mutual labels:  interpretability
context2vec
PyTorch implementation of context2vec from Melamud et al., CoNLL 2016
Stars: ✭ 18 (-59.09%)
Mutual labels:  word-embeddings
knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
Stars: ✭ 72 (+63.64%)
Mutual labels:  interpretability
wikidata-corpus
Train Wikidata with word2vec for word embedding tasks
Stars: ✭ 109 (+147.73%)
Mutual labels:  word-embeddings
codenames
Codenames AI using Word Vectors
Stars: ✭ 41 (-6.82%)
Mutual labels:  word-embeddings
sembei
🍘 Word embeddings without going through word segmentation 🍘
Stars: ✭ 14 (-68.18%)
Mutual labels:  word-embeddings
shapeshop
Towards Understanding Deep Learning Representations via Interactive Experimentation
Stars: ✭ 16 (-63.64%)
Mutual labels:  interpretability
word embedding
Sample code for training Word2Vec and FastText on a wiki corpus, and their pretrained word embeddings.
Stars: ✭ 21 (-52.27%)
Mutual labels:  word-embeddings

SPINE: SParse Interpretable Neural Embeddings

SPINE is a tool to transform existing representations into more interpretable ones. It is a novel extension of the k-sparse autoencoder that enforces stricter sparsity constraints. Unlike existing linear matrix-factorization-based approaches, it is highly expressive and supports non-linear transformations.

Link to our AAAI 2018 paper

Figure: A k-sparse autoencoder. For an input X, an autoencoder attempts to construct an output X' at its output layer that is close to X. In a k-sparse autoencoder, only a few hidden units are active for any given input (denoted by the colored units in the figure).
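
For intuition, here is a minimal, illustrative PyTorch sketch of this kind of objective: a reconstruction loss plus penalties that keep the average activation of each hidden unit low and push individual activations towards 0 or 1. The class and function names, loss weights, and sparsity target below are illustrative assumptions, not the exact values used in this codebase.

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    # Illustrative SPINE-style autoencoder: hidden activations are
    # capped to [0, 1] so the sparsity penalties below are meaningful.
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, input_dim)

    def forward(self, x):
        h = torch.clamp(self.encoder(x), min=0.0, max=1.0)  # capped ReLU
        return self.decoder(h), h

def spine_style_loss(x, x_hat, h, rho=0.15, lam_asl=1.0, lam_psl=1.0):
    rl = ((x_hat - x) ** 2).mean()  # reconstruction loss
    # Average sparsity loss: penalize hidden units whose mean activation
    # over the batch exceeds the target rho (roughly 1 - sparsity).
    asl = (torch.clamp(h.mean(dim=0) - rho, min=0.0) ** 2).sum()
    # Partial sparsity loss: push each activation towards 0 or 1.
    psl = (h * (1.0 - h)).mean()
    return rl + lam_asl * asl + lam_psl * psl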

Requirements

python 3
numpy
pytorch 0.3 (along with torchvision)
tqdm

UPDATE [4th Feb 2020]: A PyTorch 1.3-compatible version of this codebase has been released by Jacob Danovitch at the following URL: https://github.com/jacobdanovitch/SPINE

Input format requirements

The input embeddings that you wish to transform should be in the following format: each line contains a word followed by its continuous representation, with values separated by spaces.

word1 0.4 0.2 0.42 ...
word2 0.23 0.54 0.123 ...
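
For example, a file in this format can be read with a few lines of Python. (load_embeddings is a hypothetical helper shown for illustration; it is not part of this codebase.)

import numpy as np

def load_embeddings(path):
    # Read `word v1 v2 ...` lines into a word list and a vector matrix.
    words, vectors = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            words.append(parts[0])
            vectors.append([float(v) for v in parts[1:]])
    return words, np.array(vectors, dtype=np.float32)

words, X = load_embeddings("input_file")
print(len(words), X.shape)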

For the help menu,

cd code/models/
python main.py -h

We recommend running with the following settings:

python3 main.py --input input_file \
		 --num_epochs 4000 \
		 --denoising \
		 --noise 0.2 \
		 --sparsity 0.85 \
		 --output output_file \
		 --hdim 1000

A note of caution: different input representations may require very different hyper-parameter settings. For instance, the amount of noise to add for the denoising autoencoder should be set based on the range of the input representations: for GloVe embeddings we found a noise level of 0.4 to work best, whereas for word2vec a noise level of 0.2 was more suitable. Hence, you may have to try a few hyper-parameter settings to attain the best accuracy/interpretability trade-off. We also suggest visualizing the obtained representations (see the Visualization section below).
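
One quick, hypothetical way to ground the noise-level choice is to inspect the scale of your input vectors first (this reuses the load_embeddings helper sketched above):

import numpy as np

_, X = load_embeddings("input_file")
print("value range:", X.min(), X.max())
print("mean absolute value:", np.abs(X).mean())
# A noise level that is a modest fraction of the typical magnitude is a
# reasonable starting point; 0.2 (word2vec) and 0.4 (GloVe) worked well
# for the representations used in the paper.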

Word Embeddings

SPINE word vectors for the original GloVe and word2vec vectors, along with word vectors from the baseline Sparse Overcomplete Word Vectors (SPOWV) method, are available here.

Visualization

To qualitatively assess the resulting representations, follow this short IPython tutorial.
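
The core idea behind such an inspection is simple: because SPINE dimensions are sparse and interpretable, listing the words that activate a dimension most strongly reveals what it encodes. Here is a minimal sketch of that inspection, again using the hypothetical load_embeddings helper from above (the tutorial itself remains the canonical reference):

import numpy as np

def top_words(words, Z, dim, k=10):
    # Words with the highest activation on dimension `dim`.
    order = np.argsort(-Z[:, dim])[:k]
    return [words[i] for i in order]

words, Z = load_embeddings("output_file")  # SPINE vectors from main.py
for d in range(5):  # inspect the first few dimensions
    print(d, top_words(words, Z, d))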

Evaluation

cd code/evaluation/

./setup.sh # to download the required datasets and dumps relevant pickle files

./run.sh <embeddings_absolute_path> # to evaluate embeddings on various extrinsic and intrinsic tasks
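
Independently of these scripts, a quick sanity check worth running is the achieved sparsity of the output vectors. A sketch, with a hypothetical near-zero threshold and the load_embeddings helper sketched earlier:

import numpy as np

_, Z = load_embeddings("output_file")
# Fraction of near-zero entries; with --sparsity 0.85 this should be
# roughly 0.85 if training succeeded.
print("achieved sparsity:", float((np.abs(Z) < 1e-6).mean()))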

Note

This is the official code for the following paper; if you use it, please consider citing it:

@article{subramanian2018spine,
  title={SPINE: SParse Interpretable Neural Embeddings},
  author={Subramanian, Anant and Pruthi, Danish and Jhamtani, Harsh and Berg-Kirkpatrick, Taylor and Hovy, Eduard},
  journal={Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI)},
  year={2018}
}