ahmedbesbes / overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification

Licence: other
NLP tutorial

Overview and benchmark of traditional and deep learning models in text classification

Original post: https://ahmedbesbes.com/overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification.html

This article extends a previous one I wrote while experimenting with sentiment analysis on Twitter data. Back then, I explored a simple model: a two-layer feed-forward neural network built with Keras. The input tweets were represented as document vectors, each computed as a weighted average of the embeddings of the words in the tweet.
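To make the document-vector idea concrete, here is a minimal sketch in plain Python. The function name `document_vector`, the toy two-dimensional embeddings, and the per-word weights are all hypothetical and chosen for illustration; the weighting scheme used in the earlier article is not specified here, and real code would typically use NumPy arrays and trained embeddings.

```python
def document_vector(tokens, embeddings, weights, dim):
    """Weighted average of the embeddings of the known words in `tokens`.

    `embeddings` maps a word to a list of floats of length `dim`.
    `weights` maps a word to its weight (assumption: any non-negative
    score, for example a tf-idf value); missing words default to 1.0.
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in embeddings:
            continue  # skip out-of-vocabulary words
        w = weights.get(token, 1.0)
        for i, value in enumerate(embeddings[token]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known word: fall back to the zero vector
    return [value / weight_sum for value in total]


# Toy data, purely illustrative.
embeddings = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
weights = {"good": 3.0, "movie": 1.0}

vec = document_vector(["good", "movie", "unknownword"], embeddings, weights, dim=2)
print(vec)  # [0.75, 0.25]
```

The resulting fixed-length vector is what a feed-forward classifier would take as input, regardless of how many words the tweet contains.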

The embeddings came from a word2vec model I trained from scratch on the corpus using gensim. The task was binary classification, and with this setup I achieved 79% accuracy.

The goal of this post is to explore other NLP models trained on the same dataset and then benchmark their respective performance on a given test set.

We'll go through different models, from simple ones relying on a bag-of-words representation to heavier machinery using convolutional and recurrent networks. Let's see whether we can score more than 79% accuracy!


Here are the models that have been tested:

  • Logistic regression with word ngrams
  • Logistic regression with character ngrams
  • Logistic regression with word and character ngrams
  • Recurrent neural network (bidirectional GRU) without pre-trained embeddings
  • Recurrent neural network (bidirectional GRU) with GloVe pre-trained embeddings
  • Multi-channel Convolutional Neural Network
  • RNN (Bidirectional GRU) + CNN model
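As a rough illustration of the third item in the list, logistic regression on combined word and character n-grams, here is a minimal scikit-learn sketch. The tf-idf weighting, the n-gram ranges, the helper name `build_ngram_classifier`, and the toy sentences are assumptions made for this example, not the settings used in the benchmark.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, make_pipeline


def build_ngram_classifier():
    """Logistic regression over concatenated word and character n-gram features."""
    features = FeatureUnion([
        # Word unigrams and bigrams (assumed range).
        ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        # Character n-grams of length 1 to 4 (assumed range).
        ("char", TfidfVectorizer(analyzer="char", ngram_range=(1, 4))),
    ])
    return make_pipeline(features, LogisticRegression(max_iter=1000))


# Tiny made-up dataset: 1 = positive, 0 = negative.
texts = [
    "i love this movie",
    "what a great day",
    "i hate this movie",
    "what a terrible day",
]
labels = [1, 1, 0, 0]

model = build_ngram_classifier()
model.fit(texts, labels)
predictions = model.predict(["i love it", "i hate it"])
print(predictions)
```

Dropping either entry from the `FeatureUnion` gives the word-only or character-only variant from the first two items of the list.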

By the end of this post, you will have boilerplate code for each of these NLP techniques. It'll help you kickstart your NLP project and eventually achieve state-of-the-art results (some of these models are really powerful).

Here's a sneak peek of the final result:

[Figure: benchmark]
