
RilaShu / skip-gram-Chinese

Licence: other
Skip-gram word2vec for Chinese, based on TensorFlow

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to skip-gram-Chinese

Sensegram
Making sense embedding out of word embeddings using graph-based word sense induction
Stars: ✭ 209 (+945%)
Mutual labels:  word2vec
Movietaster Open
A practical movie recommendation project based on Item2vec.
Stars: ✭ 253 (+1165%)
Mutual labels:  word2vec
Vaaku2Vec
Language Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (+240%)
Mutual labels:  word2vec
Stocksensation
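A web application that visualizes sentiment classification of stock-market public opinion, based on sentiment dictionaries and machine learning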
Stars: ✭ 215 (+975%)
Mutual labels:  word2vec
Book deeplearning in pytorch source
Stars: ✭ 236 (+1080%)
Mutual labels:  word2vec
russe
RUSSE: Russian Semantic Evaluation.
Stars: ✭ 11 (-45%)
Mutual labels:  word2vec
Chameleon recsys
Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+910%)
Mutual labels:  word2vec
Recommendation-based-on-sequence-
Recommendation based on sequence
Stars: ✭ 23 (+15%)
Mutual labels:  word2vec
Aravec
AraVec is a pre-trained distributed word representation (word embedding) open source project which aims to provide the Arabic NLP research community with free to use and powerful word embedding models.
Stars: ✭ 239 (+1095%)
Mutual labels:  word2vec
word-embeddings-from-scratch
Creating word embeddings from scratch and visualize them on TensorBoard. Using trained embeddings in Keras.
Stars: ✭ 22 (+10%)
Mutual labels:  word2vec
Practical 1
Oxford Deep NLP 2017 course - Practical 1: word2vec
Stars: ✭ 220 (+1000%)
Mutual labels:  word2vec
Koan
A word2vec negative sampling implementation with correct CBOW update.
Stars: ✭ 232 (+1060%)
Mutual labels:  word2vec
Simple-Sentence-Similarity
Exploring the simple sentence similarity measurements using word embeddings
Stars: ✭ 99 (+395%)
Mutual labels:  word2vec
Gemsec
The TensorFlow reference implementation of 'GEMSEC: Graph Embedding with Self Clustering' (ASONAM 2019).
Stars: ✭ 210 (+950%)
Mutual labels:  word2vec
Word2VecAndTsne
Scripts demo-ing how to train a Word2Vec model and reduce its vector space
Stars: ✭ 45 (+125%)
Mutual labels:  word2vec
Word2vec
Python interface to Google word2vec
Stars: ✭ 2,370 (+11750%)
Mutual labels:  word2vec
Cukatify
Cukatify is a music social media project
Stars: ✭ 21 (+5%)
Mutual labels:  word2vec
two-stream-cnn
A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data
Stars: ✭ 24 (+20%)
Mutual labels:  word2vec
Word2Vec-iOS
Word2Vec iOS port
Stars: ✭ 23 (+15%)
Mutual labels:  word2vec
grad-cam-text
Implementation of Grad-CAM for text.
Stars: ✭ 37 (+85%)
Mutual labels:  word2vec

skip-gram-Chinese

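  • Overview
    A TensorFlow-based implementation of the skip-gram algorithm for Chinese corpus data. The experimental corpus is the complete works of Jin Yong (it can be replaced).
  • Code
    skipgram_chinese.py -- source code
    usage_example.py -- usage example (requires downloading word2vec.txt)
  • Corpus and model
    Corpus -- the complete works of Jin Yong (note: to produce general-purpose word vectors, use a different, standard corpus; see https://github.com/brightmart/nlp_chinese_corpus)
    Model -- word2vec.txt (100,000 words, 100-dimensional vector representations)
    The files are large, so both are provided through external download links
  • Example results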
pd.Series(word2vec_model.most_similar(u'乔峰'))

0 (鸠摩智, 0.5863361358642578)
1 (萧峰, 0.5798118114471436)
2 (任我行, 0.5723351836204529)
3 (慕容复, 0.5638849139213562)
4 (杨康, 0.5621821880340576)
5 (裘千仞, 0.5401000380516052)
6 (岳不群, 0.5394284725189209)
7 (张翠山, 0.5377693176269531)
8 (车尔库, 0.5314956903457642)
9 (令狐冲, 0.5277308821678162)
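
The ranking above is a nearest-neighbour lookup over the word vectors. The source does not show how `word2vec_model` is loaded or which library provides `most_similar`, so the following is only a minimal standard-library sketch of the idea. It assumes that word2vec.txt uses the common word2vec text layout (an optional `vocab_size dimension` header line, then one `word v1 v2 ...` row per word) and that similarity is cosine similarity. The helper names `parse_word2vec_text`, `cosine`, and `most_similar` are hypothetical, and the toy vectors are made-up values, not taken from the real model.

```python
import math


def parse_word2vec_text(lines):
    """Parse rows in the assumed word2vec text layout into {word: vector}."""
    vectors = {}
    for i, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        # Assumption: a first line with exactly two fields is the
        # "vocab_size dimension" header, not a word with a 1-d vector.
        if i == 0 and len(parts) == 2:
            continue
        if len(parts) < 2:
            continue  # skip blank or malformed rows
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(vectors, word, topn=10):
    """Return the topn (word, similarity) pairs closest to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # never return the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy 3-dimensional vectors with made-up values, only to exercise the code.
toy_lines = [
    "4 3",
    "乔峰 1.0 0.0 0.0",
    "萧峰 0.9 0.1 0.0",
    "慕容复 0.0 1.0 0.0",
    "令狐冲 -1.0 0.0 0.0",
]
toy_vectors = parse_word2vec_text(toy_lines)
ranking = most_similar(toy_vectors, "乔峰", topn=2)
print(ranking)
```

With the real file, the same functions could be fed the lines of word2vec.txt opened with `encoding="utf-8"`. A pure-Python scan over 100,000 words of 100 dimensions each would be slow, so a vectorized library would likely be preferable in practice.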
