BrambleXu / word2vec-movies

Licence: other
Bag of Words Meets Bags of Popcorn in Python 3: a Chinese-language tutorial

Programming language: Jupyter Notebook

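Project goal

Learn how to use Word2Vec to process text data.

Source

These notebooks are based on the Kaggle competition "Bag of Words Meets Bags of Popcorn". The competition page includes an NLP tutorial, so I turned Parts 1 to 3 of it into three notebooks written in Chinese.

The tutorial is fairly old and some of the library APIs it relies on no longer work. I ran into, and fixed, quite a few pitfalls while reimplementing it in Python 3, and all of the notebooks now run correctly.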
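As a rough illustration of the bag-of-words approach that the first part of the tutorial introduces, here is a minimal sketch using scikit-learn. The four reviews and their labels are made up for this example and stand in for the real labeled movie reviews; the parameter values are illustrative choices, not necessarily the ones used in the notebooks.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-ins for the labeled movie reviews (1 = positive, 0 = negative).
train_reviews = [
    "a wonderful film with a great cast",
    "great story and wonderful acting",
    "a boring plot and terrible acting",
    "terrible film, boring from start to finish",
]
train_labels = [1, 1, 0, 0]

# Turn each review into a vector of word counts over a capped vocabulary.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
train_features = vectorizer.fit_transform(train_reviews)

# Fit a random forest on the count vectors.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(train_features, train_labels)

# New text must be transformed with the vocabulary learned above,
# so use transform() here rather than fit_transform().
test_features = vectorizer.transform(["wonderful cast and a great story"])
prediction = forest.predict(test_features)
print(prediction)
```

The point to notice is that the vectorizer is fitted once on the training text and then reused for anything the model is asked to classify; refitting it on the test text would produce a different vocabulary and misaligned columns.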

Contents

Dependencies

All of the following libraries were used with Python 3.5.2:

  • pandas==0.20.3
  • scikit-learn==0.19.0
  • numpy==1.13.1
  • jupyter==1.0.0

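Plan

These notebooks only cover the tutorial material provided with the Kaggle competition, and the best score that material achieves is only 0.84. They are therefore at most an introduction to word2vec, and the content itself does not go very deep.

So I found this project: sentiment-analysis. Its author wrote three models. The first two already appear in the tutorial, and the third uses an ensemble to combine the first two, bringing the final score up to 0.96. The author's code is also well organized, so it is a good reference for learning how to write a complete project rather than working only in Jupyter Notebook.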
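To make the ensemble idea concrete, here is a minimal sketch of one common way to combine two classifiers: soft voting, which averages their predicted class probabilities. This is an assumption for illustration only. The referenced project's actual ensemble method is not described here, and the two base models below (a random forest and a logistic regression over word counts) as well as the toy reviews are stand-ins, not that project's models or data.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up reviews for illustration (1 = positive, 0 = negative).
reviews = [
    "a wonderful film with a great cast",
    "great story and wonderful acting",
    "a boring plot and terrible acting",
    "terrible film, boring from start to finish",
]
labels = [1, 1, 0, 0]

# Two stand-in base models combined by averaging their probabilities.
voter = VotingClassifier(
    estimators=[
        ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("logreg", LogisticRegression()),
    ],
    voting="soft",
)

# The pipeline vectorizes the text, then feeds the counts to both models.
ensemble = make_pipeline(CountVectorizer(), voter)
ensemble.fit(reviews, labels)

# One row per input review, one column per class (0, then 1).
probabilities = ensemble.predict_proba(["great acting but a boring plot"])
print(probabilities)
```

Soft voting tends to help when the base models make different kinds of mistakes, which is the usual motivation for combining a count-based model with an embedding-based one.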

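However, the author used Python 2 and many of the package APIs have since changed, so I plan to rewrite it in Python 3, sharing the result as I learn. The project is here: sentiment-analysis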

Recommended reading
