
zhangyafeikimi / Word2vec Win32

Licence: apache-2.0
A word2vec port for Windows.

Programming Languages

c
50402 projects - #5 most used programming language

Projects that are alternatives to or similar to Word2vec Win32

lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-34.15%)
Mutual labels:  word2vec, word-embeddings
NTUA-slp-nlp
💻 Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-53.66%)
Mutual labels:  word2vec, word-embeddings
wikidata-corpus
Train Wikidata with word2vec for word embedding tasks
Stars: ✭ 109 (+165.85%)
Mutual labels:  word2vec, word-embeddings
Arabic-Word-Embeddings-Word2vec
Arabic Word Embeddings Word2vec
Stars: ✭ 26 (-36.59%)
Mutual labels:  word2vec, word-embeddings
Text-Analysis
Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.
Stars: ✭ 48 (+17.07%)
Mutual labels:  word2vec, word-embeddings
pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (+51.22%)
Mutual labels:  word-embeddings, representation-learning
Bagofconcepts
Python implementation of bag-of-concepts
Stars: ✭ 18 (-56.1%)
Mutual labels:  word2vec, representation-learning
Simple-Sentence-Similarity
Exploring the simple sentence similarity measurements using word embeddings
Stars: ✭ 99 (+141.46%)
Mutual labels:  word2vec, word-embeddings
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (-14.63%)
Mutual labels:  word2vec, word-embeddings
word embedding
Sample code for training Word2Vec and FastText using a wiki corpus, and their pretrained word embeddings.
Stars: ✭ 21 (-48.78%)
Mutual labels:  word2vec, word-embeddings
word-benchmarks
Benchmarks for intrinsic word embeddings evaluation.
Stars: ✭ 45 (+9.76%)
Mutual labels:  word2vec, word-embeddings
Deep learning nlp
Keras, PyTorch, and NumPy Implementations of Deep Learning Architectures for NLP
Stars: ✭ 407 (+892.68%)
Mutual labels:  word2vec, word-embeddings
word2vec-on-wikipedia
A pipeline for training word embeddings using word2vec on a Wikipedia corpus.
Stars: ✭ 68 (+65.85%)
Mutual labels:  word2vec, word-embeddings
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+43.9%)
Mutual labels:  word2vec, word-embeddings
two-stream-cnn
A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data
Stars: ✭ 24 (-41.46%)
Mutual labels:  word2vec, word-embeddings
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+24.39%)
Mutual labels:  word2vec, word-embeddings
Chameleon recsys
Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+392.68%)
Mutual labels:  word2vec, word-embeddings
Koan
A word2vec negative sampling implementation with correct CBOW update.
Stars: ✭ 232 (+465.85%)
Mutual labels:  word2vec, word-embeddings
codenames
Codenames AI using Word Vectors
Stars: ✭ 41 (+0%)
Mutual labels:  word2vec, word-embeddings
Wego
Word Embeddings (e.g. Word2Vec) in Go!
Stars: ✭ 336 (+719.51%)
Mutual labels:  word2vec, word-embeddings

Tools for computing distributed representations of words

We provide an implementation of the Continuous Bag-of-Words (CBOW) and Skip-gram (SG) models, as well as several demo scripts.

Given a text corpus, the word2vec tool learns a vector for every word in the vocabulary using either the Continuous Bag-of-Words or the Skip-gram neural network architecture. The user should specify the following (an example invocation follows the list):

  • desired vector dimensionality
  • size of the context window for either the Skip-gram or the Continuous Bag-of-Words model
  • training algorithm: hierarchical softmax and/or negative sampling
  • threshold for downsampling frequent words
  • number of threads to use
  • format of the output word vector file (text or binary)
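For reference, a typical invocation might look like the following; the flags mirror those of the original word2vec tool, and text8 is the demo corpus used by the bundled scripts, so substitute your own corpus and output names:

    ./word2vec -train text8 -output vectors.bin -cbow 1 -size 200 -window 8 -negative 25 -hs 0 -sample 1e-4 -threads 20 -binary 1 -iter 15

Here -cbow 1 selects the CBOW architecture (use -cbow 0 for Skip-gram), -hs 0 together with -negative 25 selects negative sampling, -sample 1e-4 sets the downsampling threshold, and -binary 1 writes the vectors in binary format. On Windows the built binary would be word2vec.exe.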

Usually, the other hyper-parameters such as the learning rate do not need to be tuned for different training sets.

The script demo-word.sh downloads a small (100MB) text corpus from the web and trains a small word vector model. After training finishes, the user can interactively explore word similarities, as shown below.
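For example (assuming the binaries built by this port are on the path; vectors.bin is the output file produced by the demo):

    sh demo-word.sh           # download the corpus, train the model, and start the query tool
    ./distance vectors.bin    # query the nearest neighbours of a word, e.g. "paris"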

More information about the scripts is provided at https://code.google.com/p/word2vec/


To make the Python wrapper (https://github.com/danielfrg/word2vec) work with this port, support for the word2vec-doc2vec tool (https://github.com/nliu86/word2vec-doc2vec) has been added.
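A minimal sketch of driving the tool from Python, assuming the API documented in the danielfrg/word2vec README (word2vec.word2vec, word2vec.load, Model.cosine); method names may differ between wrapper versions:

    import word2vec

    # Train vectors by shelling out to the word2vec binary;
    # keyword arguments mirror the command-line flags shown above.
    word2vec.word2vec('text8', 'text8.bin', size=100, binary=True, verbose=True)

    # Load the trained model and query nearest neighbours by cosine similarity.
    model = word2vec.load('text8.bin')
    indexes, metrics = model.cosine('paris')
    print(model.generate_response(indexes, metrics).tolist())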
