sismetanin / word2vec-tsne

License: other
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.

Programming Languages

Jupyter Notebook

Projects that are alternatives of or similar to word2vec-tsne

sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (-13.56%)
Mutual labels:  word2vec, word-embeddings, embeddings, machinelearning, computational-linguistics, nlp-machine-learning
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-66.1%)
Mutual labels:  word-embeddings, embeddings, computational-linguistics, nlp-machine-learning
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-45.76%)
Mutual labels:  word-embeddings, embeddings, computational-linguistics, nlp-machine-learning
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-67.8%)
Mutual labels:  word2vec, word-embeddings, nlp-machine-learning
Datastories Semeval2017 Task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+211.86%)
Mutual labels:  word-embeddings, embeddings, nlp-machine-learning
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-54.24%)
Mutual labels:  word2vec, word-embeddings, embeddings
Dict2vec
Dict2vec is a framework to learn word embeddings using lexical dictionaries.
Stars: ✭ 91 (+54.24%)
Mutual labels:  word2vec, word-embeddings, embeddings
Magnitude
A fast, efficient universal vector embedding utility package.
Stars: ✭ 1,394 (+2262.71%)
Mutual labels:  word2vec, word-embeddings, embeddings
Dna2vec
dna2vec: Consistent vector representations of variable-length k-mers
Stars: ✭ 117 (+98.31%)
Mutual labels:  word2vec, word-embeddings, embeddings
Fasttext.js
FastText for Node.js
Stars: ✭ 127 (+115.25%)
Mutual labels:  word2vec, word-embeddings, machinelearning
biovec
ProtVec can be used in protein interaction predictions, structure prediction, and protein data visualization.
Stars: ✭ 23 (-61.02%)
Mutual labels:  word2vec, tsne
Word2VecAndTsne
Scripts demo-ing how to train a Word2Vec model and reduce its vector space
Stars: ✭ 45 (-23.73%)
Mutual labels:  word2vec, tsne
word-embeddings-from-scratch
Creating word embeddings from scratch and visualize them on TensorBoard. Using trained embeddings in Keras.
Stars: ✭ 22 (-62.71%)
Mutual labels:  word2vec, embeddings
Simple-Sentence-Similarity
Exploring the simple sentence similarity measurements using word embeddings
Stars: ✭ 99 (+67.8%)
Mutual labels:  word2vec, word-embeddings
empythy
Automated NLP sentiment predictions- batteries included, or use your own data
Stars: ✭ 17 (-71.19%)
Mutual labels:  machinelearning, nlp-machine-learning
two-stream-cnn
A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data
Stars: ✭ 24 (-59.32%)
Mutual labels:  word2vec, word-embeddings
SentimentAnalysis
(BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset
Stars: ✭ 40 (-32.2%)
Mutual labels:  word2vec, embeddings
DeepLearningReading
Deep Learning and Machine Learning mini-projects. Current Project: Deepmind Attentive Reader (rc-data)
Stars: ✭ 78 (+32.2%)
Mutual labels:  embeddings, nlp-machine-learning
embedding evaluation
Evaluate your word embeddings
Stars: ✭ 32 (-45.76%)
Mutual labels:  embeddings, computational-linguistics
Koan
A word2vec negative sampling implementation with correct CBOW update.
Stars: ✭ 232 (+293.22%)
Mutual labels:  word2vec, word-embeddings

Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE

This repository contains the source code for visualizing high-dimensional Word2Vec word embeddings using t-SNE. The visualization can be useful for understanding how Word2Vec works and how to interpret the relations between vectors captured from your texts before feeding them into neural networks or other machine learning algorithms. As training data, we use articles from Google News and classical literary works by Leo Tolstoy, the Russian writer regarded as one of the greatest authors of all time.
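The core of the pipeline described above is dimensionality reduction: t-SNE maps each 300-dimensional word vector to a 2-D point that can be scattered on a plot. A minimal sketch, using random vectors as a stand-in for real Word2Vec embeddings (scikit-learn's `TSNE` is an assumption about the tooling; the repository's notebooks may differ):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for word vectors: 50 random 300-d embeddings. In practice these
# would come from a trained Word2Vec model, one row per vocabulary word.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(50, 300))

# t-SNE maps the 300-d space to 2-d while preserving local neighbourhoods,
# so semantically close words end up as nearby points on the plot.
# Perplexity must be smaller than the number of points.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(vectors)

print(coords.shape)  # (50, 2): one x/y pair per word, ready for a scatter plot
```

Plotting is then a single `matplotlib` scatter of `coords[:, 0]` against `coords[:, 1]`, with each point annotated by its word.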

Data

A pre-trained model, trained on part of the Google News dataset (about 100 billion words), is available at https://code.google.com/archive/p/word2vec/ (and also described in [2]). The model contains 300-dimensional vectors for 3 million words and phrases.

Tolstoy's novels in Russian are available at https://www.litres.ru/lev-tolstoy.

References

  1. L. van der Maaten and G. Hinton, "Visualizing Data using t-SNE", Journal of Machine Learning Research, vol. 9, pp. 2579-2605, 2008.
  2. T. Mikolov, I. Sutskever, K. Chen, G. Corrado and J. Dean, "Distributed Representations of Words and Phrases and their Compositionality", Advances in Neural Information Processing Systems, pp. 3111-3119, 2013.
  3. R. Rehurek and P. Sojka, "Software Framework for Topic Modelling with Large Corpora", Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, 2010.

License

See LICENSE.
