ashokc / Word-Embeddings-and-Document-Vectors

Licence: other
An evaluation of word-embeddings for classification

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to Word-Embeddings-and-Document-Vectors

Persian-Sentiment-Analyzer
Persian sentiment analysis (emotion and sentiment analysis for Persian text)
Stars: ✭ 30 (-6.25%)
Mutual labels:  word2vec, fasttext-embeddings
Bayes
Naive Bayes Classifier in Swift for Mac and iOS
Stars: ✭ 30 (-6.25%)
Mutual labels:  naive-bayes-classifier
word-embeddings-from-scratch
Creating word embeddings from scratch and visualizing them in TensorBoard. Using the trained embeddings in Keras.
Stars: ✭ 22 (-31.25%)
Mutual labels:  word2vec
Trajectory-Analysis-and-Classification-in-Python-Pandas-and-Scikit-Learn
Formed trajectories of sets of points. Experimented on finding similarities between trajectories based on the DTW (Dynamic Time Warping) and LCSS (Longest Common SubSequence) algorithms. Modeled trajectories as strings based on a grid representation. Benchmarked KNN, Random Forest, and Logistic Regression classification algorithms to classify efficiently t…
Stars: ✭ 41 (+28.13%)
Mutual labels:  scikitlearn-machine-learning
christmAIs
Text to abstract art generation for the holidays!
Stars: ✭ 90 (+181.25%)
Mutual labels:  fasttext-embeddings
name2gender
Extrapolate gender from first names using Naïve-Bayes and PyTorch Char-RNN
Stars: ✭ 24 (-25%)
Mutual labels:  naive-bayes-classifier
Simple-Sentence-Similarity
Exploring simple sentence similarity measurements using word embeddings
Stars: ✭ 99 (+209.38%)
Mutual labels:  word2vec
lapis-bayes
Naive Bayes classifier for use in Lua
Stars: ✭ 26 (-18.75%)
Mutual labels:  naive-bayes-classifier
hyperstar
Hyperstar: Negative Sampling Improves Hypernymy Extraction Based on Projection Learning.
Stars: ✭ 24 (-25%)
Mutual labels:  word2vec
fake-fews
Candidate solution for Facebook's fake news problem using machine learning and crowd-sourcing. Takes the form of a Chrome extension. Developed in under 24 hours at the 2017 Crimson Code hackathon at Washington State University.
Stars: ✭ 13 (-59.37%)
Mutual labels:  naive-bayes-classifier
Recommendation-based-on-sequence-
Recommendation based on sequence
Stars: ✭ 23 (-28.12%)
Mutual labels:  word2vec
Word2VecAndTsne
Scripts demo-ing how to train a Word2Vec model and reduce its vector space
Stars: ✭ 45 (+40.63%)
Mutual labels:  word2vec
skip-gram-Chinese
skip-gram for Chinese word2vec, based on TensorFlow
Stars: ✭ 20 (-37.5%)
Mutual labels:  word2vec
Vaaku2Vec
Language Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (+112.5%)
Mutual labels:  word2vec
word2vec-movies
Bag of Words Meets Bags of Popcorn in Python 3 (tutorial in Chinese)
Stars: ✭ 54 (+68.75%)
Mutual labels:  word2vec
grad-cam-text
Implementation of Grad-CAM for text.
Stars: ✭ 37 (+15.63%)
Mutual labels:  word2vec
Word2Vec-iOS
Word2Vec iOS port
Stars: ✭ 23 (-28.12%)
Mutual labels:  word2vec
two-stream-cnn
A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data
Stars: ✭ 24 (-25%)
Mutual labels:  word2vec
doc2vec-api
document embedding and machine learning script for beginners
Stars: ✭ 92 (+187.5%)
Mutual labels:  word2vec
asm2vec
An unofficial implementation of asm2vec as a standalone python package
Stars: ✭ 127 (+296.88%)
Mutual labels:  word2vec

Word Embeddings and Document Vectors

This is the source code to go along with the series of blog articles.

The code employs:

  • Elasticsearch (localhost:9200) as the repository

    1. to save tokens to, and get them as needed.
    2. to save word vectors (pre-trained or custom) to, and get them as needed (see the sketch after this list).
  • See the Pipfile for Python dependencies
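
The repository's role is simple key/value storage. A minimal sketch of that idea, assuming the Elasticsearch 8.x Python client and hypothetical index and field names (the repo's actual schema may differ):

    # Hypothetical sketch: Elasticsearch at localhost:9200 as the token/vector store.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # Save the tokens of one document.
    es.index(index="twenty-news-tokens", id="doc-0",
             document={"tokens": ["nasa", "launch", "orbit"]})

    # Save one word vector (pre-trained or custom).
    es.index(index="word2vec-vectors", id="nasa",
             document={"vector": [0.12, -0.34, 0.56]})

    # Fetch them back when the pipeline needs them.
    tokens = es.get(index="twenty-news-tokens", id="doc-0")["_source"]["tokens"]
    vector = es.get(index="word2vec-vectors", id="nasa")["_source"]["vector"]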

Usage

  1. Generate tokens for the 20-news corpus & the movie review dataset and save them to Elasticsearch.

    • The 20-news dataset is downloaded as part of the script, but you need to download the movie review dataset separately.
    • The shell scripts & Python code are in the folders text-data/twenty-news & text-data/acl-imdb; a sketch of this step is shown below.
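
    A minimal sketch of this step for the 20-news corpus, assuming scikit-learn's fetch_20newsgroups and a crude regex tokenizer (index and field names here are assumptions, not the repo's schema):

      import re
      from sklearn.datasets import fetch_20newsgroups
      from elasticsearch import Elasticsearch

      es = Elasticsearch("http://localhost:9200")
      news = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))

      for i, text in enumerate(news.data):
          tokens = re.findall(r"[a-z]+", text.lower())   # crude word tokenizer
          es.index(index="twenty-news-tokens", id=str(i),
                   document={"tokens": tokens, "label": int(news.target[i])})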
  2. Generate custom word vectors for the two text corpora from step 1 above and save them to Elasticsearch. The text-data/twenty-news/vectors & text-data/acl-imdb/vectors directories have the scripts; a gensim-based sketch follows.
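
    A hedged sketch of custom vector training with gensim 4.x; the parameters and toy sentences are illustrative, not the values used in the articles:

      from gensim.models import FastText, Word2Vec

      # In the real pipeline, the token lists come from Elasticsearch.
      sentences = [["nasa", "launch", "orbit"], ["movie", "plot", "acting"]]

      w2v = Word2Vec(sentences, vector_size=300, window=5, min_count=1, workers=4)
      ft = FastText(sentences, vector_size=300, window=5, min_count=1, workers=4)

      vec = w2v.wv["nasa"]   # the trained 300-d vector for one token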

  3. Process pre-trained vectors and save them to Elasticsearch. Look into pre-trained-vectors/ for the code. You need to download the actual published vectors from their sources. We have used Word2Vec, GloVe, and FastText in these articles; a sketch of the GloVe case follows.
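
    A minimal sketch of indexing one published vector file, assuming the GloVe text format (one word followed by its floats per line) and a hypothetical index name:

      from elasticsearch import Elasticsearch

      es = Elasticsearch("http://localhost:9200")
      with open("glove.6B.300d.txt", encoding="utf-8") as f:
          for line in f:
              parts = line.rstrip().split(" ")
              word, vec = parts[0], [float(x) for x in parts[1:]]
              es.index(index="glove-vectors", id=word, document={"vector": vec})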

  4. The script run.sh can be configured to run any combination of the pipeline steps.

  5. The logs contain the F-scores and timing results. Create a logs directory before running the run.sh script:

    mkdir logs
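
To make the pipeline concrete, here is a self-contained sketch (with toy stand-in data and a generic scikit-learn classifier, not the repo's actual code) of the idea the articles evaluate: a document vector is the average of its tokens' word vectors, and those document vectors feed a classifier whose F-score is the kind of number the logs report.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score

    def doc_vector(tokens, vectors, dim=3):
        # Average the vectors of the tokens that have an embedding.
        hits = [vectors[t] for t in tokens if t in vectors]
        return np.mean(hits, axis=0) if hits else np.zeros(dim)

    # Toy stand-ins; the real pipeline pulls tokens and vectors from Elasticsearch.
    vectors = {"good": np.array([1.0, 0.2, 0.0]),
               "bad": np.array([-1.0, 0.1, 0.0]),
               "plot": np.array([0.0, 0.5, 0.3])}
    docs = [["good", "plot"], ["bad", "plot"], ["good"], ["bad"]]
    labels = [1, 0, 1, 0]

    X = np.array([doc_vector(toks, vectors) for toks in docs])
    clf = LogisticRegression().fit(X, labels)
    print(f1_score(labels, clf.predict(X)))   # the kind of F-score the logs record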
