
lokicui / doc2vec-golang

Licence: Apache-2.0 license

doc2vec and word2vec implemented in Go, for word embedding representation.

Labels

word2vec doc2vec doc2vec-golang

Projects that are alternatives of or similar to doc2vec-golang

Product-Categorization-NLP

Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).

Stars: ✭ 30 (-9.09%)

Mutual labels: word2vec, doc2vec

Python interface to Google word2vec

Stars: ✭ 2,370 (+7081.82%)

Mutual labels: word2vec, doc2vec

Summary of embedding model code and study notes

Stars: ✭ 25 (-24.24%)

Mutual labels: word2vec, doc2vec

Graph Embedding via Frequent Subgraphs

Stars: ✭ 39 (+18.18%)

Mutual labels: word2vec, doc2vec

Assessing Source Code Semantic Similarity with Unsupervised Learning

Stars: ✭ 42 (+27.27%)

Mutual labels: word2vec, doc2vec

document embedding and machine learning script for beginners

Stars: ✭ 92 (+178.79%)

Mutual labels: word2vec, doc2vec

word2vec-on-wikipedia

A pipeline for training word embeddings using word2vec on wikipedia corpus.

Stars: ✭ 68 (+106.06%)

Mutual labels: word2vec

sarcasm-detection-for-sentiment-analysis

Sarcasm Detection for Sentiment Analysis

Stars: ✭ 21 (-36.36%)

Mutual labels: word2vec

img classification deep learning

No description or website provided.

Stars: ✭ 19 (-42.42%)

Mutual labels: word2vec

Emotion-recognition-from-tweets

A comprehensive approach on recognizing emotion (sentiment) from a certain tweet. Supervised machine learning.

Stars: ✭ 17 (-48.48%)

Mutual labels: word2vec

Word2Vec In Java (2013 google word2vec opensource)

Stars: ✭ 13 (-60.61%)

Mutual labels: word2vec

wmd4j is a Java library for calculating Word Mover's Distance (WMD)

Stars: ✭ 31 (-6.06%)

Mutual labels: word2vec

AnnA Anki neuronal Appendix

Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

Stars: ✭ 39 (+18.18%)

Mutual labels: doc2vec

TextAugment: Text Augmentation Library

Stars: ✭ 280 (+748.48%)

Mutual labels: word2vec

text-classification-cn

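Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models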

Stars: ✭ 81 (+145.45%)

Mutual labels: word2vec

word2vec-pytorch

Extremely simple and fast word2vec implementation with Negative Sampling + Sub-sampling

Stars: ✭ 145 (+339.39%)

Mutual labels: word2vec

NLP Pretrained Embeddings, Models and Datasets Collections (NLP_PEMDC). The collection will keep updating.

Stars: ✭ 58 (+75.76%)

Mutual labels: word2vec

Name-disambiguation

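An engineering solution for disambiguating papers by same-name authors (based on the first-place solution of the 2019 Zhiyuan-AMiner name disambiguation competition)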

Stars: ✭ 17 (-48.48%)

Mutual labels: word2vec

🎨 🎨 NLP (natural language processing) tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/

Stars: ✭ 23 (-30.3%)

Mutual labels: word2vec

test word2vec uyghur

An example that tries out Uyghur script with the word2vec algorithm in Python's gensim library.

Stars: ✭ 15 (-54.55%)

Mutual labels: word2vec


doc2vec-golang

A Go implementation of Tomas Mikolov's word and document embeddings. For the basic idea, see Mikolov's two original papers on word2vec and doc2vec. More recently, Andrew M. Dai et al. from Google examined the method's performance in more detail.

usage

[@bjsjs_11_83 doc2vec-golang]$ ./control build
traning Exec build ok
build ok

# The training data (data/zhihu_data.1w) holds one document per line in two tab-separated columns:
# the first column is the document id, and the second is the segmented document, with words separated by spaces.
[@bjsjs_11_83 doc2vec-golang]$ ./train  data/zhihu_data.1w          
Skip-Gram Iter:48 Alpha: 0.000796  Progress: 96.81%  Words/sec: 24.27k  
2018-03-30 14:53:00.218536235 +0800 CST training end, 1342521 26861

[@bjsjs_11_83 doc2vec-golang]$ ./knn 2.model 

please select operation type:
        0:word2words
        1:doc_likelihood
        2:leave one out key words
        3:sen2words
        4:sen2docs
        5:word2docs
        6:doc2docs
        7:doc2words
0
Enter text:网页
        1       网页
        0.7823723719117796      不让
        0.7651260773728028      浏览
        0.7642516944020028      邮件
        0.7601415883811553      近
        0.7517607921006224      迷恋
        0.7492900066365179      等同
        0.7485966355448261      传说
        0.7463299535930537      基于
        0.7447865182221745      版

please select operation type:
        0:word2words
        1:doc_likelihood
        2:leave one out key words
        3:sen2words
        4:sen2docs
        5:word2docs
        6:doc2docs
        7:doc2words

Dependencies

golang
msgp

Implemented features

doc2vec supports both the CBOW and Skip-Gram models; the Negative Sampling and Hierarchical Softmax optimizations are both implemented
online infer document
likelihood of document
doc2words
doc2docs
word2words
word2docs

Unimplemented features

wmd
adding synonym semantic constraints to doc2vec
extracting core keywords from a sentence


Stars: ✭ 33
