
ShawnyXiao / Textclassification Keras

License: MIT
Text classification models implemented in Keras, including: FastText, TextCNN, TextRNN, TextBiRNN, TextAttBiRNN, HAN, RCNN, RCNNVariant, etc.



TextClassification-Keras

This repository implements a variety of deep learning models for text classification in Keras, including FastText, TextCNN, TextRNN, TextBiRNN, TextAttBiRNN, HAN, RCNN, and RCNNVariant. Each model comes with a simplified example application.

Guidance

  1. Environment
  2. Usage
  3. Model
    1. FastText
    2. TextCNN
    3. TextRNN
    4. TextBiRNN
    5. TextAttBiRNN
    6. HAN
    7. RCNN
    8. RCNNVariant
    9. To Be Continued...
  4. Reference

Environment

  • Python 3.7
  • NumPy 1.17.2
  • TensorFlow 2.0.1

Usage

All code is located in the /model directory. Each model has its own subdirectory, which contains both the model definition and an example application.

For example, the FastText model and application are located under /model/FastText: the model is defined in fast_text.py, and the application is in main.py.

Model

1 FastText

FastText was proposed in the paper Bag of Tricks for Efficient Text Classification.

1.1 Description in Paper

  1. Using a look-up table, bags of n-grams are converted to word representations.
  2. Word representations are averaged into a text representation, which is a hidden variable.
  3. Text representation is in turn fed to a linear classifier.
  4. Use the softmax function to compute the probability distribution over the predefined classes.
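
The four steps above can be sketched with stock tf.keras layers. This is a minimal sketch, not the repository's fast_text.py: the function name `build_fast_text` and all hyperparameter defaults are assumptions chosen for illustration, and n-gram features are assumed to have been appended to the token id sequence beforehand.

```python
from tensorflow.keras import layers, Model


def build_fast_text(maxlen=100, vocab_size=5000, embed_dim=50, num_classes=2):
    """Hypothetical FastText-style classifier (all defaults are assumed)."""
    inputs = layers.Input(shape=(maxlen,))
    # Step 1: look-up table mapping token (and n-gram) ids to vectors.
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    # Step 2: average the vectors into a single text representation.
    x = layers.GlobalAveragePooling1D()(x)
    # Steps 3 and 4: linear classifier followed by softmax.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_fast_text()
model.summary()
```

Because the only trainable parts are the embedding table and one dense layer, a model of this shape trains quickly even on large vocabularies.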

1.2 Implementation Here

Network structure of FastText:

2 TextCNN

TextCNN was proposed in the paper Convolutional Neural Networks for Sentence Classification.

2.1 Description in Paper

  1. Represent sentence with static and non-static channels.
  2. Convolve with multiple filter widths and feature maps.
  3. Use max-over-time pooling.
  4. Use a fully connected layer with dropout and softmax output.
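
The steps above might look roughly like this in tf.keras. It is a sketch under assumptions: it uses a single trainable embedding channel rather than the static and non-static pair from the paper, and the name `build_text_cnn`, the filter widths (3, 4, 5), and the other defaults are illustrative choices rather than values taken from this repository.

```python
from tensorflow.keras import layers, Model


def build_text_cnn(maxlen=100, vocab_size=5000, embed_dim=50,
                   kernel_sizes=(3, 4, 5), filters=128, num_classes=2):
    """Hypothetical single-channel TextCNN (all defaults are assumed)."""
    inputs = layers.Input(shape=(maxlen,))
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    pooled = []
    for k in kernel_sizes:
        # One convolution per filter width, each with `filters` feature maps.
        c = layers.Conv1D(filters, k, activation="relu")(x)
        # Max-over-time pooling keeps the strongest response of each filter.
        pooled.append(layers.GlobalMaxPooling1D()(c))
    x = layers.Concatenate()(pooled)
    # Fully connected softmax layer with dropout on its input.
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_text_cnn()
model.summary()
```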

2.2 Implementation Here

Network structure of TextCNN:

3 TextRNN

TextRNN is described in the paper Recurrent Neural Network for Text Classification with Multi-Task Learning.

3.1 Description in Paper

3.2 Implementation Here

Network structure of TextRNN:

4 TextBiRNN

TextBiRNN improves on TextRNN by replacing the RNN layer in the network with a bidirectional RNN layer, so that both the forward and the backward encoding of the sequence are taken into account. No related paper has been found yet.
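
As a rough illustration of the change, the sketch below wraps a recurrent layer in tf.keras's `Bidirectional` wrapper. The choice of LSTM cells, the name `build_text_birnn`, and the default sizes are assumptions for this example and may differ from the repository's code.

```python
from tensorflow.keras import layers, Model


def build_text_birnn(maxlen=100, vocab_size=5000, embed_dim=50,
                     units=64, num_classes=2):
    """Hypothetical bidirectional RNN classifier (all defaults are assumed)."""
    inputs = layers.Input(shape=(maxlen,))
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    # The wrapper runs one LSTM forward and one backward over the sequence
    # and concatenates their final states, giving 2 * units features.
    x = layers.Bidirectional(layers.LSTM(units))(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_text_birnn()
model.summary()
```

Removing the `Bidirectional` wrapper gives a plain unidirectional model of the TextRNN kind, which makes the two easy to compare.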

Network structure of TextBiRNN:

5 TextAttBiRNN

TextAttBiRNN extends TextBiRNN with an attention mechanism. Given the representation vectors produced by the bidirectional RNN encoder, the attention mechanism lets the model focus on the information most relevant to the classification decision. The attention mechanism was first proposed in the paper Neural Machine Translation by Jointly Learning to Align and Translate; the implementation here follows the paper Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems.

5.1 Description in Paper

In the paper Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems, feed-forward attention is simplified as follows:

    e_t = a(h_t),    α_t = exp(e_t) / Σ_k exp(e_k),    c = Σ_t α_t h_t

Here a is a learnable function, realized as a feed-forward network. In this formulation, attention can be seen as producing a fixed-length embedding c of the input sequence by computing an adaptive weighted average of the state sequence h.
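
The adaptive weighted average can be worked through in NumPy. In this sketch the learnable function a is assumed to be a single tanh unit, a(h_t) = tanh(w · h_t + b); the function name and the random inputs are hypothetical and only serve to show the shapes involved.

```python
import numpy as np


def feed_forward_attention(h, w, b):
    """Pool a state sequence h of shape (T, d) into one vector of shape (d,).

    Assumes a(h_t) = tanh(w . h_t + b), with w of shape (d,) and scalar b.
    """
    e = np.tanh(h @ w + b)            # one score per time step, shape (T,)
    weights = np.exp(e - e.max())     # subtract the max for numerical stability
    alpha = weights / weights.sum()   # softmax over time steps
    c = alpha @ h                     # weighted average of the states
    return c, alpha


rng = np.random.default_rng(0)
h = rng.normal(size=(6, 4))           # 6 time steps, 4 features each
w = rng.normal(size=4)
c, alpha = feed_forward_attention(h, w, 0.0)
print(alpha.round(3), c.shape)
```

With w set to all zeros every score is equal, so the weights become uniform and c reduces to the plain mean of the states.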

5.2 Implementation Here

The implementation of attention is not described here; please refer to the source code.

Network structure of TextAttBiRNN:

6 HAN

HAN was proposed in the paper Hierarchical Attention Networks for Document Classification.

6.1 Description in Paper

  1. Word Encoder. A bidirectional GRU encodes each sentence. The annotation for a given word is obtained by concatenating its forward and backward hidden states, so it summarizes the information of the whole sentence centered on that word.
  2. Word Attention. A one-layer MLP followed by a softmax function computes normalized importance weights over the word annotations. The sentence vector is then the weighted sum of the word annotations under these weights.
  3. Sentence Encoder. As in the word encoder, a bidirectional GRU encodes the sentence vectors to obtain an annotation for each sentence.
  4. Sentence Attention. As in word attention, a one-layer MLP followed by a softmax function computes the weights over the sentence annotations. The document vector is then the weighted sum of the sentence annotations under these weights.
  5. Document Classification. A softmax function computes the probability of each class.

6.2 Implementation Here

The implementation of attention here is based on FeedForwardAttention, which is the same as the attention in TextAttBiRNN.

Network structure of HAN:

The TimeDistributed wrapper is used here because the parameters of the Embedding, Bidirectional RNN, and Attention layers are expected to be shared across the time-step (sentence) dimension.
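
To illustrate how TimeDistributed shares one sentence encoder across all sentences of a document, here is a minimal sketch. It builds the attention pooling from stock tf.keras layers instead of the repository's attention layer, and the helper `attention_pool`, the name `build_han`, and all default sizes are assumptions made for this example.

```python
from tensorflow.keras import layers, Model


def attention_pool(h, attn_units=32):
    """Hypothetical helper: pool (batch, T, d) annotations into (batch, d)."""
    u = layers.Dense(attn_units, activation="tanh")(h)   # one-layer MLP
    scores = layers.Dense(1, use_bias=False)(u)           # (batch, T, 1)
    weights = layers.Softmax(axis=1)(scores)              # normalize over T
    context = layers.Dot(axes=1)([weights, h])            # (batch, 1, d)
    return layers.Flatten()(context)


def build_han(max_sents=8, max_words=20, vocab_size=5000, embed_dim=50,
              units=32, num_classes=2):
    """Hypothetical HAN sketch (all defaults are assumed)."""
    # Sentence encoder: word encoder followed by word attention.
    word_in = layers.Input(shape=(max_words,))
    x = layers.Embedding(vocab_size, embed_dim)(word_in)
    x = layers.Bidirectional(layers.GRU(units, return_sequences=True))(x)
    sentence_encoder = Model(word_in, attention_pool(x))

    # Document model: the same encoder is applied to every sentence.
    doc_in = layers.Input(shape=(max_sents, max_words))
    s = layers.TimeDistributed(sentence_encoder)(doc_in)
    s = layers.Bidirectional(layers.GRU(units, return_sequences=True))(s)
    outputs = layers.Dense(num_classes, activation="softmax")(attention_pool(s))
    return Model(doc_in, outputs)


model = build_han()
model.summary()
```

Each document is fed as a (max_sents, max_words) matrix of token ids, so shorter sentences and documents would need padding to these fixed sizes.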

7 RCNN

RCNN was proposed in the paper Recurrent Convolutional Neural Networks for Text Classification.

7.1 Description in Paper

  1. Word Representation Learning. RCNN uses a recurrent structure, a bidirectional recurrent neural network, to capture the context of each word. Each word is then represented by combining the word with its left and right contexts, and a linear transformation followed by the tanh activation function is applied to this representation.
  2. Text Representation Learning. Once the representations of all words have been computed, an element-wise max-pooling layer captures the most important information across the entire text. Finally, a linear transformation followed by the softmax function produces the class probabilities.
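
The two stages can be approximated in tf.keras as below. This is a simplified sketch, not the paper's exact recurrence or this repository's model: it takes a single input and lets a bidirectional SimpleRNN stand in for the left and right contexts, so each context also sees the current word, whereas the paper builds each context from the neighbouring words only. The name `build_rcnn_sketch` and all defaults are assumptions.

```python
from tensorflow.keras import layers, Model


def build_rcnn_sketch(maxlen=100, vocab_size=5000, embed_dim=50,
                      units=64, hidden=64, num_classes=2):
    """Hypothetical single-input approximation of RCNN (defaults assumed)."""
    inputs = layers.Input(shape=(maxlen,))
    emb = layers.Embedding(vocab_size, embed_dim)(inputs)
    # Forward and backward passes approximate the left and right contexts.
    ctx = layers.Bidirectional(
        layers.SimpleRNN(units, return_sequences=True))(emb)
    # Combine each word with its context, then apply linear + tanh.
    x = layers.Concatenate()([emb, ctx])
    x = layers.Dense(hidden, activation="tanh")(x)
    # Element-wise max pooling over all positions in the text.
    x = layers.GlobalMaxPooling1D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_rcnn_sketch()
model.summary()
```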

7.2 Implementation Here

Network structure of RCNN:

8 RCNNVariant

RCNNVariant is a variant of RCNN; no related paper has been found yet. It makes the following changes:

  1. The three inputs are changed to single input. The input of the left and right contexts is removed.
  2. Use bidirectional LSTM/GRU instead of traditional RNN for encoding context.
  3. Use multi-channel CNN to represent the semantic vectors.
  4. Replace the Tanh activation layer with the ReLU activation layer.
  5. Use both AveragePooling and MaxPooling.
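
Taken together, the five changes suggest a model along the following lines. This is a sketch based only on the list above: the name `build_rcnn_variant`, the kernel sizes, the way the embedding is concatenated with the context, and every default size are assumptions and may not match the repository's implementation.

```python
from tensorflow.keras import layers, Model


def build_rcnn_variant(maxlen=100, vocab_size=5000, embed_dim=50, units=64,
                       kernel_sizes=(1, 2, 3), filters=64, num_classes=2):
    """Hypothetical RCNNVariant-style model (all defaults are assumed)."""
    # Change 1: a single input of token ids.
    inputs = layers.Input(shape=(maxlen,))
    emb = layers.Embedding(vocab_size, embed_dim)(inputs)
    # Change 2: bidirectional LSTM encodes the context.
    ctx = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(emb)
    x = layers.Concatenate()([emb, ctx])
    pooled = []
    for k in kernel_sizes:
        # Changes 3 and 4: multi-channel CNN with ReLU activation.
        c = layers.Conv1D(filters, k, activation="relu")(x)
        # Change 5: keep both the average and the max of each feature map.
        pooled.append(layers.GlobalAveragePooling1D()(c))
        pooled.append(layers.GlobalMaxPooling1D()(c))
    x = layers.Concatenate()(pooled)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_rcnn_variant()
model.summary()
```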

Network structure of RCNNVariant:

To Be Continued...

Reference

  1. Bag of Tricks for Efficient Text Classification
  2. Keras Example IMDB FastText
  3. Convolutional Neural Networks for Sentence Classification
  4. Keras Example IMDB CNN
  5. Recurrent Neural Network for Text Classification with Multi-Task Learning
  6. Neural Machine Translation by Jointly Learning to Align and Translate
  7. Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems
  8. cbaziotis's Attention
  9. Hierarchical Attention Networks for Document Classification
  10. Richard's HAN
  11. Recurrent Convolutional Neural Networks for Text Classification
  12. airalcorn2's RCNN