msgi / Nlp Journey

Licence: apache-2.0
Documents, papers and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.

Programming Languages

python

Projects that are alternatives to or similar to Nlp Journey

Nlp research
NLP research: a TensorFlow-based NLP deep learning project supporting four tasks: text classification, sentence matching, sequence labeling, and text generation
Stars: ✭ 141 (-89.07%)
Mutual labels:  classification, word2vec, ner, fasttext
NLP-paper
🎨 🎨 NLP (natural language processing) tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-98.22%)
Mutual labels:  word2vec, crf, lda, fasttext
Wordembeddings Elmo Fasttext Word2vec
Using pre-trained word embeddings (Fasttext, Word2Vec)
Stars: ✭ 146 (-88.68%)
Mutual labels:  classification, word2vec, fasttext, gensim
Ml Projects
ML based projects such as Spam Classification, Time Series Analysis, Text Classification using Random Forest, Deep Learning, Bayesian, Xgboost in Python
Stars: ✭ 127 (-90.16%)
Mutual labels:  word2vec, svm, gensim
Magnitude
A fast, efficient universal vector embedding utility package.
Stars: ✭ 1,394 (+8.06%)
Mutual labels:  word2vec, fasttext, gensim
Gensim
Topic Modelling for Humans
Stars: ✭ 12,763 (+889.38%)
Mutual labels:  word2vec, fasttext, gensim
Shallowlearn
An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.
Stars: ✭ 196 (-84.81%)
Mutual labels:  word2vec, fasttext, gensim
Macropodus
Macropodus, a natural language processing toolkit (tool) based on an Albert+BiLSTM+CRF deep learning architecture: CWS (Chinese word segmentation), POS (part-of-speech tagging), NER (named entity recognition), Find (new word discovery), Keyword (keyword extraction), Summarize (text summarization), Sim (text similarity), Calculate (scientific calculator), Chi2num (conversion between Chinese numerals and Arabic or Roman numerals), Traditional/Simplified Chinese conversion, and pinyin conversion.
Stars: ✭ 309 (-76.05%)
Mutual labels:  similarity, ner, crf
wordfish-python
extract relationships from standardized terms from corpus of interest with deep learning 🐟
Stars: ✭ 19 (-98.53%)
Mutual labels:  word2vec, gensim, lda
biovec
ProtVec can be used in protein interaction predictions, structure prediction, and protein data visualization.
Stars: ✭ 23 (-98.22%)
Mutual labels:  svm, word2vec, gensim
Text Classification Models Pytorch
Implementation of State-of-the-art Text Classification Models in Pytorch
Stars: ✭ 379 (-70.62%)
Mutual labels:  classification, attention, fasttext
Ner Bert
BERT-NER (nert-bert) with google bert https://github.com/google-research.
Stars: ✭ 339 (-73.72%)
Mutual labels:  classification, attention, ner
Lmdb Embeddings
Fast word vectors with little memory usage in Python
Stars: ✭ 404 (-68.68%)
Mutual labels:  word2vec, fasttext, gensim
Neural Networks
All about Neural Networks!
Stars: ✭ 34 (-97.36%)
Mutual labels:  word2vec, fasttext
Patternrecognition matlab
Feature reduction projections and classifier models are learned from the training dataset and applied to classify the testing dataset. Several feature reduction approaches are compared in this paper: principal component analysis (PCA), linear discriminant analysis (LDA) and their kernel methods (KPCA, KLDA). Correspondingly, several classification algorithms are implemented: Support Vector Machine (SVM), Gaussian Quadratic Maximum Likelihood, K-nearest neighbors (KNN) and Gaussian Mixture Model (GMM).
Stars: ✭ 33 (-97.44%)
Mutual labels:  svm, lda
Finalfusion Rust
finalfusion embeddings in Rust
Stars: ✭ 35 (-97.29%)
Mutual labels:  word2vec, fasttext
Named entity recognition
Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)
Stars: ✭ 995 (-22.87%)
Mutual labels:  ner, crf
Defactonlp
DeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention models.
Stars: ✭ 30 (-97.67%)
Mutual labels:  attention, ner
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-97.05%)
Mutual labels:  classification, similarity
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-96.67%)
Mutual labels:  word2vec, gensim

nlp journey

All implemented in TensorFlow 2.0, codes

1. Basics

2. Books (baiduyun code: txqx)

  1. Handbook of Graphical Models. online
  2. Deep Learning. online
  3. Neural Networks and Deep Learning. online
  4. Speech and Language Processing. online

3. Papers

01) Transformer papers

  1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. paper
  2. GPT-2: Language Models are Unsupervised Multitask Learners. paper
  3. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. paper
  4. XLNet: Generalized Autoregressive Pretraining for Language Understanding. paper
  5. RoBERTa: A Robustly Optimized BERT Pretraining Approach. paper
  6. DistilBERT: a distilled version of BERT: smaller, faster, cheaper and lighter. paper
  7. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. paper
  8. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
  9. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. paper
  10. GPT-3: Language Models are Few-Shot Learners. paper

02) Models

  1. LSTM(Long Short-term Memory). paper
  2. Sequence to Sequence Learning with Neural Networks. paper
  3. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. paper
  4. Residual Network(Deep Residual Learning for Image Recognition). paper
  5. Dropout(Improving neural networks by preventing co-adaptation of feature detectors). paper
  6. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. paper

03) Summaries

  1. An overview of gradient descent optimization algorithms. paper
  2. Analysis Methods in Neural Language Processing: A Survey. paper
  3. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
  4. A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. paper
  5. A Gentle Introduction to Deep Learning for Graphs. paper
  6. A Survey on Deep Learning for Named Entity Recognition. paper
  7. More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction. paper
  8. Deep Learning Based Text Classification: A Comprehensive Review. paper
  9. Pre-trained Models for Natural Language Processing: A Survey. paper
  10. A Survey on Contextual Embeddings. paper
  11. A Survey on Knowledge Graphs: Representation, Acquisition and Applications. paper
  12. Knowledge Graphs. paper

04) Pre-training

  1. A Neural Probabilistic Language Model. paper
  2. word2vec Parameter Learning Explained. paper
  3. Language Models are Unsupervised Multitask Learners. paper
  4. An Empirical Study of Smoothing Techniques for Language Modeling. paper
  5. Efficient Estimation of Word Representations in Vector Space. paper
  6. Distributed Representations of Sentences and Documents. paper
  7. Enriching Word Vectors with Subword Information(FastText). paper
  8. GloVe: Global Vectors for Word Representation. online
  9. ELMo (Deep contextualized word representations). paper
  10. Pre-Training with Whole Word Masking for Chinese BERT. paper

05) Classification

  1. Bag of Tricks for Efficient Text Classification (FastText). paper
  2. Convolutional Neural Networks for Sentence Classification. paper
  3. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. paper

06) Text generation

  1. A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. paper
  2. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. paper

07) Text Similarity

  1. Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. paper
  2. Learning Text Similarity with Siamese Recurrent Networks. paper
  3. A Deep Architecture for Matching Short Texts. paper

08) QA

  1. A Question-Focused Multi-Factor Attention Network for Question Answering. paper
  2. The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. paper
  3. A Knowledge-Grounded Neural Conversation Model. paper
  4. Neural Generative Question Answering. paper
  5. Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. paper
  6. Modeling Multi-turn Conversation with Deep Utterance Aggregation. paper
  7. Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. paper
  8. Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. paper

09) NMT

  1. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. paper
  2. Neural Machine Translation by Jointly Learning to Align and Translate. paper
  3. Transformer (Attention Is All You Need). paper

10) Summarization

  1. Get To The Point: Summarization with Pointer-Generator Networks. paper
  2. Deep Recurrent Generative Decoder for Abstractive Text Summarization. paper

11) Relation extraction

  1. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. paper
  2. Neural Relation Extraction with Multi-lingual Attention. paper
  3. FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. paper
  4. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. paper

4. Articles

  • How to Learn Natural Language Processing (Comprehensive Edition). url
  • TRANSFORMERS FROM SCRATCH. url
  • The Illustrated Transformer. url
  • Attention-based-model. url
  • Modern Deep Learning Techniques Applied to Natural Language Processing. url
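  • Unbelievable! LSTM and GRU Have Never Been Explained So Clearly (Animations + Video). url
  • From Language Models to Seq2Seq: Transformer Is Like a Play, It All Depends on the Mask. url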
  • Applying word2vec to Recommenders and Advertising. url
  • 2019 NLP Roundup: A Full Review of Papers, Blogs, Tutorials, and Engineering Progress. url

5. Github

6. Blog
