prrao87 / Fine Grained Sentiment

Licence: mit
A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.

Fine Grained Sentiment Classification

This repo shows a comparison and discussion of various NLP methods to perform 5-class sentiment classification on the Stanford Sentiment Treebank (SST-5) dataset. The goal is to predict classes on this dataset with multiple rule-based, linear and neural network-based classifiers and see how they differ from one another.

Currently the following classifiers have been implemented:

  • TextBlob: Rule-based, uses the internal polarity metric from the TextBlob library.
  • Vader: Rule-based, uses the compound polarity scores from the VADER library.
  • Logistic Regression: Trains a simple logistic regression model in scikit-learn after converting the vocabulary to feature vectors and considering the effect of word frequencies using TF-IDF.
  • SVM: Trains a simple linear support vector machine in scikit-learn after converting the vocabulary to feature vectors and considering the effect of word frequencies using TF-IDF.
  • FastText: Trains a FastText classifier using a combination of trigrams and a 3-word context window size.
  • Flair: Trains a Flair NLP classifier using "stacked" embeddings, i.e. a combined representation of either GloVe, Bert or ELMo word embeddings and Flair (forward and backward) string embeddings.
  • Causal Transformer: Trains a small transformer model based on OpenAI's GPT-2 architecture (but much smaller) using a causal (i.e. left-to-right) pre-trained language model trained on Wikitext-103 data. The pre-trained weights are obtained from HuggingFace's NAACL transfer learning tutorial. Once we download the pre-trained language model, we add a custom classification head to the base transformer as shown in training/transformer_utils/model.py, and then fine-tune it on the SST-5 dataset.

Installation

First, set up a virtual environment and install the dependencies from requirements.txt:

python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt

For further development, simply activate the existing virtual environment.

source venv/bin/activate

Training the Classifiers

The linear models (Logistic Regression and SVM) are trained at runtime by the classifier (next step), since they train very quickly on this small dataset. To train the methods that rely on word/string embeddings, however, we use separate scripts, which makes it easier to tune the hyperparameters.

Training code for the models is provided in the training directory.

FastText

To train the FastText model, it is strongly recommended to use automatic hyperparameter optimization as per the documentation.

First, build the fastText command line interface from source (Unix only):

$ git clone https://github.com/facebookresearch/fastText.git
$ cd fastText
$ make

Then run automatic tuning with the command below to find the optimal hyperparameters, specifying the paths to the training and dev sets. Quantization (to reduce model size) is also tuned in this process; in this case we set a maximum model size of 10 MB. Verbosity is enabled to show which hyperparameters gave the best F1-score.

./fasttext supervised -input ../data/sst/sst_train.txt -output ../model_hyperopt \
-autotune-validation ../data/sst/sst_dev.txt -autotune-modelsize 10M -verbose 3

This outputs the trained model (.ftz extension) that gives the best F1-score on our dataset.
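
fastText's supervised mode reads plain-text files in which each line starts with a label prefixed by `__label__`, followed by the text. A minimal sketch of producing such lines from (label, sentence) pairs is given here. The helper name `to_fasttext_line`, the sample pairs and their labels are illustrative assumptions, not taken from this repo's data preparation code.

```python
def to_fasttext_line(label, text):
    """Format one example as '__label__<label> <text>' for fastText supervised training."""
    # Collapse internal whitespace so each example stays on a single line.
    cleaned = " ".join(text.split())
    return f"__label__{label} {cleaned}"


# Hypothetical (label, sentence) pairs; labels are assumed to be integers 1 to 5.
examples = [
    (2, "It's not horrible, just horribly mediocre."),
    (4, "The cast is uniformly excellent."),
]

lines = [to_fasttext_line(label, text) for label, text in examples]
print("\n".join(lines))
```

Each resulting line can be written to a text file and passed to the -input or -autotune-validation options.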

The Python file training/train_fasttext.py can also be used for training the FastText model from within Python (without having to build the CLI from source) - however, the Python API does not have auto-tune capability so the hyperparameters have to be tuned manually (not recommended).

Flair

To train the Flair model, run train_flair.py. To enhance the model's context, we can stack word embeddings (either GloVe, ELMo or Bert) with Flair's string embeddings. This model takes significantly longer to train, even on a GPU-enabled machine (on the order of several hours).

The examples below show how to train Flair models with stacked word/string embeddings using the provided script. The --stack argument selects GloVe, ELMo (original) or Bert (Base, cased) word embeddings, which are combined with Flair forward/backward string embeddings to train the classifier.

cd training
python3 train_flair.py --stack glove --epochs 25
python3 train_flair.py --stack bert --epochs 25
python3 train_flair.py --stack elmo --epochs 25

To resume training from a checkpoint, just pass in the path to the checkpoint file.

cd training
python3 train_flair.py --stack elmo --checkpoint models/flair/elmo/checkpoint.pt --epochs 25

Training this model to 50-100 epochs can take more than a day on a single GPU.

Causal Transformer

The causal transformer in this repo is implemented as per HuggingFace's transfer learning tutorial example. An optional argument to include adapter modules as per the paper "Parameter-efficient Transfer Learning for NLP" is provided in the script train_transformer.py. A full description of the pre-training stage and the logic for implementing the transformer layers is provided in the tutorial slides.

Train the causal transformer (fine-tuning for classification only) as shown below. This version of the transformer has 50 million trainable parameters.

cd training
python3 train_transformer.py --n_epochs 3 --lr 6.5e-5 --gradient_acc_steps 2

To run the model with adapters, i.e. bottleneck layers (inserted within skip-connections just after the attention and feed-forward modules), use the adapters_dim argument. This will only train the adapters, linear layers and the added embeddings, while keeping the other parameters frozen.

In the example below, we include adapters with a dimensionality of 32. Adding this argument reduces the number of trainable parameters in the model to 25% of the original (around 12 million). Note that we scale up the learning rate by a factor of 10 when using adapters because we added a number of newly initialized parameters to the pre-trained model. Gradients are accumulated over two steps to simulate larger batch sizes, which helps bring down the losses faster.

python3 train_transformer.py --adapters_dim 32 --n_epochs 3 --lr 6.5e-4 --gradient_acc_steps 2
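
Why accumulating gradients over two steps simulates a larger batch can be seen in a toy setting. The sketch below is plain Python on a made-up one-parameter linear model (it is not the repo's training loop, and all names and data are hypothetical): averaging the gradients of two equally sized micro-batches gives the same gradient as one batch of twice the size.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error w.r.t. w for the model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, micro_batches):
    """Sum micro-batch gradients, scaling each by 1/steps as in gradient accumulation."""
    steps = len(micro_batches)
    total = 0.0
    for batch in micro_batches:
        # Dividing by the number of accumulation steps keeps the scale of a single batch.
        total += grad_mse(w, batch) / steps
    return total


# Made-up (x, y) pairs for illustration only.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]

full_batch = grad_mse(0.5, data)                           # one batch of 4
two_steps = accumulated_grad(0.5, [data[:2], data[2:]])    # two micro-batches of 2
print(full_batch, two_steps)
```

The optimizer step is then taken once per accumulation cycle, so memory use stays that of the smaller micro-batch.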

The transformer model needs only 3-4 epochs of training (beyond which it begins overfitting to this dataset) - which can take anywhere from minutes to a few hours on a single GPU with a batch size of 32.

Run sentiment analysis and output confusion matrix

Once the classifiers have been trained on the SST-5 data, run the file predictor.py to perform 5-class sentiment classification on the test set. This file accepts arguments for multiple classifier models at a time (just space-separate the model names, all in lower case). The requested classification models are run and evaluated on the test set and confusion matrix plots are output (for each model) in the ./Plots/ directory.
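
For reference, a confusion matrix simply counts how often each true class is predicted as each class. A minimal plain-Python sketch follows, assuming integer labels 1 to 5. The function names and sample labels are hypothetical, and the repo's own plotting code is not reproduced here.

```python
def confusion_matrix(y_true, y_pred, n_classes=5):
    """Return a grid where cell [i][j] counts examples of true class i+1 predicted as class j+1."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth - 1][pred - 1] += 1
    return matrix


def accuracy(matrix):
    """Fraction of examples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Made-up labels for illustration only.
y_true = [1, 2, 3, 4, 5, 3]
y_pred = [1, 3, 3, 4, 4, 3]

cm = confusion_matrix(y_true, y_pred)
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
```

Off-diagonal cells show which neighbouring classes a model tends to confuse, which is the main point of comparing the plots across methods.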

The method class and classifier model specification can be done using the dictionary within the file predictor.py. The rule-based and linear models do not have trained models associated with them, hence they are left as None. For any other learning-based models, the trained model file can be specified in this dictionary to avoid having to pass it as an argument.

METHODS = {
    'textblob': {
        'class': "TextBlobSentiment",
        'model': None
    },
    'vader': {
        'class': "VaderSentiment",
        'model': None
    },
    'logistic': {
        'class': "LogisticRegressionSentiment",
        'model': None
    },
    'svm': {
        'class': "SVMSentiment",
        'model': None
    },
    'fasttext': {
        'class': "FastTextSentiment",
        'model': "models/fasttext/sst-5.ftz"
    },
    'flair': {
        'class': "FlairSentiment",
        'model': "models/flair/best-model-elmo.pt"
    },
    'transformer': {
        'class': "TransformerSentiment",
        'model': "models/transformer",
    }
}

The above dictionary makes it easier to extend the framework with more models and methods over time: simply add an entry with the method name, its class name and its model file as shown above.
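
One way such a dictionary can drive the code is to look up the class by name and instantiate it with the configured model file. The sketch below uses two stand-in classes and a hypothetical `build_classifier` helper; the actual lookup logic in predictor.py may differ.

```python
class TextBlobSentiment:
    """Stand-in for a rule-based classifier that needs no model file."""
    def __init__(self, model_file=None):
        self.model_file = model_file


class FastTextSentiment:
    """Stand-in for a learned classifier that loads a trained model file."""
    def __init__(self, model_file=None):
        self.model_file = model_file


# Explicit name-to-class registry, so only known classes can be instantiated.
CLASSES = {cls.__name__: cls for cls in (TextBlobSentiment, FastTextSentiment)}

METHODS = {
    'textblob': {'class': "TextBlobSentiment", 'model': None},
    'fasttext': {'class': "FastTextSentiment", 'model': "models/fasttext/sst-5.ftz"},
}


def build_classifier(method, model_override=None):
    """Instantiate the classifier for `method`; an explicit path overrides the dictionary entry."""
    entry = METHODS[method]
    model_file = model_override or entry['model']
    return CLASSES[entry['class']](model_file)


clf = build_classifier('fasttext')
print(type(clf).__name__, clf.model_file)
```

Keeping class names as strings in the dictionary means a new method only needs a new entry plus a class registered under that name.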

To run a single case just pass one method as an argument:

python3 predictor.py --method textblob

Several methods from the dictionary can be run sequentially using a single command as follows:

python3 predictor.py --method textblob vader logistic svm fasttext flair

If at a later time, multiple versions of trained models need to be run sequentially, we can specify these using the --model argument - this will override the model specified in the dictionary within the file.

python3 predictor.py --method fasttext --model models/fasttext/sst-bigram.bin
python3 predictor.py --method fasttext --model models/fasttext/sst-trigram.bin

OR

python3 predictor.py --method flair --model models/flair/best-model-elmo.pt
python3 predictor.py --method flair --model models/flair/best-model-bert.pt

To run the predictor for a new transformer model, simply specify the model path. The path specified must contain the PyTorch config metadata file (.bin) and the PyTorch model weights (.pth).

python3 predictor.py --method transformer --model models/transformer
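
The command lines above suggest a parser that accepts one or more method names and an optional model path. A sketch with the standard argparse module follows. Only the --method and --model option names come from the commands above; the `build_parser` name, the defaults and the help strings are assumptions.

```python
import argparse

KNOWN_METHODS = ['textblob', 'vader', 'logistic', 'svm', 'fasttext', 'flair', 'transformer']


def build_parser():
    """Build a parser that takes space-separated method names and an optional model path."""
    parser = argparse.ArgumentParser(description="Run sentiment classifiers on the test set")
    parser.add_argument('--method', nargs='+', choices=KNOWN_METHODS, required=True,
                        help="One or more method names, space-separated and in lower case")
    parser.add_argument('--model', default=None,
                        help="Optional path to a trained model; overrides the dictionary entry")
    return parser


# Parse explicit argument lists here instead of sys.argv, for illustration.
multi = build_parser().parse_args(['--method', 'textblob', 'vader'])
single = build_parser().parse_args(['--method', 'fasttext',
                                    '--model', 'models/fasttext/sst-bigram.bin'])
print(multi.method, multi.model)
print(single.method, single.model)
```

With nargs='+', the method option always yields a list, so a single method and several methods can be handled by the same loop.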

Explain classifier results

Once a sentiment classifier has been trained, we can explain its predictions. To do this we make use of the LIME library. The LIME method perturbs the input sample (for text, by removing tokens), queries the classifier on the perturbed samples, and fits a local linear approximation of the model around the original sample (regardless of whether the model is globally nonlinear or not). The weights of this local model identify the features that influence the classification result the most. For multi-class cases such as this one, LIME produces a list of probabilities for each class, and also highlights the effect of each token feature on the predicted class using a one-vs-rest method.

As with the predictor, the explainer framework can be extended with more methods over time through the method dictionary in explainer.py.

METHODS = {
    'textblob': {
        'class': "TextBlobExplainer",
        'file': None
    },
    'vader': {
        'class': "VaderExplainer",
        'file': None
    },
    'logistic': {
        'class': "LogisticExplainer",
        'file': "data/sst/sst_train.txt"
    },
    'svm': {
        'class': "SVMExplainer",
        'file': "data/sst/sst_train.txt"
    },
    'fasttext': {
        'class': "FastTextExplainer",
        'file': "models/fasttext/sst-5.ftz"
    },
    'flair': {
        'class': "FlairExplainer",
        'file': "models/flair/best-model-elmo.pt"
    },
    'transformer': {
        'class': "TransformerExplainer",
        'file': "models/transformer"
    }
}

Note:

  • The rule-based approaches (TextBlob and Vader) do not output class probabilities (they simply output a float sentiment score in the range [-1, 1]). To explain these results using LIME, we artificially generate class probabilities in two steps: binning the float score to get an integer class in the range [1, 5], and then "simulating" the class probabilities using a normal distribution with the mean equal to the predicted class. This approach is hacky and is by no means the "right" way to do this, but it allows us to compare the outputs of rule-based classifiers like TextBlob and VADER on an equal footing (using similar metrics) with the learning-based classifiers.
  • For the logistic regression and SVM, specify the path to the training data (the logistic regression model is trained within the explainer class) while for the other learners, point to the trained classifier models directly.
  • For the transformer, specify just the model path (with the metadata and model file in that path).
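
The binning and "simulated" probabilities described in the first note can be sketched in plain Python as follows. Equal-width bins, a standard deviation of 1 and both function names are assumptions made for illustration; the repo's explainer may use different bin edges or a different spread.

```python
import math


def score_to_class(score, n_classes=5):
    """Map a polarity score in [-1, 1] to an integer class in 1..n_classes (equal-width bins)."""
    width = 2.0 / n_classes
    index = int((score + 1.0) / width) + 1
    # Clamp so that a score of exactly 1.0 falls into the top class.
    return min(max(index, 1), n_classes)


def simulate_probabilities(pred_class, n_classes=5, sigma=1.0):
    """Spread probability mass over the classes with a normal curve centred on pred_class."""
    weights = [math.exp(-0.5 * ((c - pred_class) / sigma) ** 2)
               for c in range(1, n_classes + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pred = score_to_class(0.0)           # a neutral score lands in the middle bin
probs = simulate_probabilities(pred)
print(pred, [round(p, 3) for p in probs])
```

The resulting list sums to one and peaks at the predicted class, which is the shape LIME needs from a probability-returning classifier function.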

The sentences whose classification results are to be explained are specified as a list in explainer.py.

samples = [
    "It's not horrible, just horribly mediocre.",
    "The cast is uniformly excellent... but the film itself is merely mildly charming.",
]

Run the explainer for the list of sentences using any or all of the classification methods as follows:

python3 explainer.py --method textblob vader logistic svm fasttext
python3 explainer.py --method flair
python3 explainer.py --method transformer

This outputs HTML files with embeds showing the explanations for each sample sentence for each classifier used.

Demo Front-end App

A simple Flask-based front-end application is developed that takes in a text sample and outputs LIME explanations for the different methods.

Play with your own text examples and see the fine-grained sentiment results explained!

NOTE: Because the PyTorch-based models (Flair and the causal transformer) are quite expensive to run inference with (they require a GPU), these methods are not deployed.
