
Priberam / SentimentAnalysis

Licence: other
Sentiment Analysis: Deep Bi-LSTM+attention model

Programming Languages

python

Projects that are alternatives of or similar to SentimentAnalysis

datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-37.5%)
Mutual labels:  word-embeddings, embeddings, lstm, computational-linguistics, semeval, attention-mechanism, nlp-machine-learning, twitter-messages
Datastories Semeval2017 Task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+475%)
Mutual labels:  sentiment-analysis, word-embeddings, embeddings, lstm, deeplearning, attention-mechanism, nlp-machine-learning
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-40.62%)
Mutual labels:  sentiment-analysis, word-embeddings, attention-mechanism, nlp-machine-learning, sentiment-classification
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+59.38%)
Mutual labels:  sentiment-analysis, word-embeddings, embeddings, computational-linguistics, nlp-machine-learning
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+84.38%)
Mutual labels:  word-embeddings, embeddings, computational-linguistics, nlp-machine-learning
Twitter Sentiment Analysis
Sentiment analysis on tweets using Naive Bayes, SVM, CNN, LSTM, etc.
Stars: ✭ 978 (+2956.25%)
Mutual labels:  sentiment-analysis, lstm, deeplearning, sentiment-classification
Multimodal Sentiment Analysis
Attention-based multimodal fusion for sentiment analysis
Stars: ✭ 172 (+437.5%)
Mutual labels:  sentiment-analysis, lstm, attention-mechanism, sentiment-classification
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+9928.13%)
Mutual labels:  sentiment-analysis, word-embeddings, lstm, sentiment-classification
ntua-slp-semeval2018
Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3.
Stars: ✭ 79 (+146.88%)
Mutual labels:  sentiment-analysis, lstm, semeval, attention-mechanism
Tensorflow Sentiment Analysis On Amazon Reviews Data
Implementing different RNN models (LSTM,GRU) & Convolution models (Conv1D, Conv2D) on a subset of Amazon Reviews data with TensorFlow on Python 3. A sentiment analysis project.
Stars: ✭ 34 (+6.25%)
Mutual labels:  sentiment-analysis, lstm, sentiment-classification
Persian-Sentiment-Analyzer
Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
Stars: ✭ 30 (-6.25%)
Mutual labels:  sentiment-analysis, embeddings, lstm
Text Classification Keras
📚 Text classification library with Keras
Stars: ✭ 53 (+65.63%)
Mutual labels:  sentiment-analysis, lstm, nlp-machine-learning
Awesome Sentiment Analysis
Repository with all what is necessary for sentiment analysis and related areas
Stars: ✭ 459 (+1334.38%)
Mutual labels:  sentiment-analysis, nlp-machine-learning, sentiment-classification
Context
ConText v4: Neural networks for text categorization
Stars: ✭ 120 (+275%)
Mutual labels:  sentiment-analysis, lstm, sentiment-classification
Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Stars: ✭ 124 (+287.5%)
Mutual labels:  sentiment-analysis, nlp-machine-learning, sentiment-classification
Absa keras
Keras Implementation of Aspect based Sentiment Analysis
Stars: ✭ 126 (+293.75%)
Mutual labels:  sentiment-analysis, attention-mechanism, sentiment-classification
Coursera Natural Language Processing Specialization
Programming assignments from all courses in the Coursera Natural Language Processing Specialization offered by deeplearning.ai.
Stars: ✭ 39 (+21.88%)
Mutual labels:  word-embeddings, deeplearning, nlp-machine-learning
SA-DL
Sentiment Analysis with Deep Learning models. Implemented with Tensorflow and Keras.
Stars: ✭ 35 (+9.38%)
Mutual labels:  sentiment-analysis, semeval, attention-mechanism
brand-sentiment-analysis
Scripts utilizing Heartex platform to build brand sentiment analysis from the news
Stars: ✭ 21 (-34.37%)
Mutual labels:  sentiment-analysis, nlp-machine-learning, sentiment-classification
domain-attention
codes for paper "Domain Attention Model for Multi-Domain Sentiment Classification"
Stars: ✭ 22 (-31.25%)
Mutual labels:  attention-mechanism, sentiment-classification

Priberam logo

Sentiment Analysis

Overview

Sentiment Analysis is a natural language processing (NLP) task whose goal is to assess the polarity/sentiment of a chunk of text. It is widely used in customer relationship management (CRM), for the automatic evaluation of reviews and survey responses, and in social media analysis.

Popular subtasks in Sentiment Analysis are:

  • Message Polarity Classification: Given a message, classify whether the overall contextual polarity of the message is of positive, negative, or neutral sentiment.
  • Topic-Based or Entity-Based Message Polarity Classification: Given a message and a topic or entity, classify the polarity of the message towards that topic or entity.

A popular workshop featuring a dedicated sentiment analysis task is SemEval (International Workshop on Semantic Evaluation). The overview paper for the most recent edition of that task (Task 4, 2017) is available at: http://www.aclweb.org/anthology/S17-2088.

This project currently targets only the Message Polarity Classification subtask.

The repository contains:

  • code for processing datasets, as well as a RESTful web service for on-demand sentiment analysis (dockerization is also available);
  • links to some pre-trained word embeddings;
  • corpora (from SemEval-2017 Task 4, subtask A).

Pre-trained word embeddings

You can download one of the following word embeddings (from "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis"):

Place the file(s) in the Embeddings folder.
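
If you want to inspect an embeddings file before training, the following is a minimal sketch for reading a plain-text word-vector file (one token followed by its vector per line, as in the DataStories releases). The file name used at the bottom is only a hypothetical example:

import numpy as np

def load_word_vectors(path):
    """Load a plain-text embeddings file: one token followed by its vector per line."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip empty or malformed lines
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings

# Hypothetical file name; use whichever embeddings file you placed in Embeddings/
vectors = load_word_vectors("Embeddings/datastories.twitter.300d.txt")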

Model

The implemented model is a Deep Bidirectional LSTM model with Attention, based on the work of Baziotis et al., 2017: DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis.
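
For readers unfamiliar with the architecture, here is a minimal, illustrative PyTorch sketch of a deep (stacked) bidirectional LSTM with attention pooling. Layer sizes, names, and the exact attention formulation are assumptions for illustration only and do not reproduce the implementation in this repository:

import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=150, num_layers=2, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Deep (stacked) bidirectional LSTM encoder
        self.lstm = nn.LSTM(emb_dim, hidden_dim, num_layers=num_layers,
                            bidirectional=True, batch_first=True)
        # Attention: score each time step, then softmax over the sequence
        self.attention = nn.Linear(2 * hidden_dim, 1)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)                    # (batch, seq, emb_dim)
        outputs, _ = self.lstm(embedded)                        # (batch, seq, 2*hidden_dim)
        weights = torch.softmax(self.attention(outputs), dim=1) # (batch, seq, 1)
        context = (weights * outputs).sum(dim=1)                # weighted sum over time
        return self.classifier(context)                         # (batch, num_classes)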

Installation

Requirements

Besides PyTorch, the required libraries are listed in requirements.txt.

Setting up a virtual environment

Linux Windows
pip install virtualenv pip install virtualenv
virtualenv --python /usr/bin/python3.6 venv virtualenv venv
source venv/bin/activate venv\Scripts\activate.bat
pip install -r requirements.txt pip install -r requirements.txt
pip install http://download.pytorch.org/whl/cpu/torch-0.4.1-cp36-cp36m-linux_x86_64.whl (1)* pip install http://download.pytorch.org/whl/cpu/torch-0.4.1-cp36-cp36m-win_amd64.whl (1)*
pip install torchvision pip install torchvision
pip install torchtext==0.2.3 pip install torchtext==0.2.3

(1)* Replace "cpu" in the link if you plan to use a GPU: "cu80" for CUDA 8, "cu90" for CUDA 9.0, "cu92" for CUDA 9.2, ...

If you require a newer version, please visit http://pytorch.org/ and follow their instructions to install the relevant PyTorch binary.

Running

Train

To train a new model, use the pre-configured config file "train_config.json". After making the appropriate modifications, for example changing the target embeddings file, execute the following command in the terminal (an example config is sketched after the argument list below):

python sentiment_analysis.py --train_config_path="train_config.json"

Train config file arguments

  • name : model name
  • labels : list of the target labels, matching the train, dev and test datasets
  • embeddings_path : path for embeddings
  • preprocessing_style : "english" (English Wikipedia) or "twitter" (English tweets);
  • save_in_REST_config : true to update the target REST config file with the trained models, false otherwise;
  • target_REST_config_path : path of target REST config file
  • batch_size : number of samples propagated through the network on each iteration. Assuming the dataset has T training samples and batch_size is set to B, the algorithm takes the first B samples (1st to Bth) from the training dataset and trains the network, then takes the next B samples and trains again, repeating until all samples have been propagated through the network;
  • hidden_dim : hidden dimension of the LSTM layers (number of features in the hidden state h);
  • num_layers : number of recurrent layers in the LSTM. E.g., setting num_layers=2 would mean stacking two LSTMs together to form a "stacked LSTM", with the second LSTM taking in the outputs of the first LSTM and computing the final results.
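
As a concrete illustration of these arguments, the snippet below writes a hypothetical train_config.json. Every value (and the exact set of fields the training script expects) is an assumption; compare it against the pre-configured file shipped with the repository before using it:

import json

# Hypothetical training configuration; all values are illustrative placeholders.
train_config = {
    "name": "EN300Twitter",
    "labels": ["positive", "neutral", "negative"],
    "embeddings_path": "Embeddings/datastories.twitter.300d.txt",
    "preprocessing_style": "twitter",
    "save_in_REST_config": True,
    "target_REST_config_path": "REST_config.json",
    "batch_size": 32,
    "hidden_dim": 150,
    "num_layers": 2,
}

with open("train_config.json", "w") as f:
    json.dump(train_config, f, indent=2)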

Run as a web service

To load the trained models and launch a web service for on-demand sentiment analysis, execute the following command in the terminal:

python sentiment_analysis.py --rest_config_path="REST_config.json"

Automated procedure

  1. Edit the train config file to select the parameters of the models to be trained.
  2. Run the script in train mode (see above). If save_in_REST_config is set to true, the target REST web service config file will be updated with the trained models.
  3. Building a Docker image afterwards (using the provided Dockerfile) creates an image with the REST web service, automatically configured with the trained models (model files and config file):
docker build -t sentimentanalysis:latest .
docker run --name sentimentanalysis -d -p 7000:7000 sentimentanalysis

It is not necessary to build a Docker image to test and run the web service. It is simply an extra that allows developers to package up an application with all of the parts it needs. Docker is a "containerization" technology that offers virtual isolation for deploying and running applications on a shared operating system (OS) kernel, without the need for virtual machines (VMs).

REST web service routes and input examples

Here's an example of a POST request for a single text chunk classification:

curl -X POST \
  'http://localhost:7000/sentiment_analysis/api/v1.0/inference?instance=EN300Twitter' \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -H 'Postman-Token: 3322b9bf-fe4d-4856-8871-834394aa1124' \
  -d '{"text":"Why does Tom Cruise take 10,000 times to figure things out in the movie Edge Of Tomorrow, but gets it right 1st time in Mission Impossible?"
} '
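
The same request can be issued from Python. The sketch below assumes the service is listening on localhost:7000, as in the curl example above, and that it returns JSON (an assumption about the response format):

import requests

URL = "http://localhost:7000/sentiment_analysis/api/v1.0/inference"

payload = {"text": "Why does Tom Cruise take 10,000 times to figure things out "
                   "in the movie Edge Of Tomorrow, but gets it right 1st time "
                   "in Mission Impossible?"}

# "instance" selects the trained model to query, as in the curl examples
response = requests.post(URL, params={"instance": "EN300Twitter"}, json=payload)
print(response.json())  # assumed to contain the predicted polarity

A batch request works the same way, replacing the "text" field with a "texts" list, as in the second curl example below.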

And here's an example of a POST request for a batch of text chunks to classify:

curl -X POST \
  'http://localhost:7000/sentiment_analysis/api/v1.0/inference?instance=EN300Twitter' \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -H 'Postman-Token: 3da2204c-f9ae-42f0-9ae6-8d4622129ca3' \
  -d '{"texts":
  ["Who'\''s in Milan in February? You won'\''t want to miss this! #Milano2016 https://t.co/J41jOrpTEa",
  "@boyalmxghty @WinterSoldierL is a universal feminism concerning everyone, as for taylor swift she may get into that category idk",
  "I'\''ve decided tomorrow night I'\''m going to re watch all of season 5 of teen wolf",
  "First the @InfernoCWHL, now the @NHLFlames march in the Pride parade - this is awesome.",
  "That Mexico vs USA commercial with trump gets your blood boiling. Race war October 10th. Imagine that parking lot. Gaddamnnnnnn VIOLENCE!!!",
  "\"Happy Captain America Day! David Wright batting 4th tonight. Mets, yo.\"",
  "Watching The Vow.... For the 4th time.",
  "LBJ out with cramps. Steps on Chalmers. Cu'\''s ball. Offensive foul. 100-89 Heat ball. 7:57 left in the 4th.",
  "@FFFabFFFab well destructo is on there and he'\''s not playing. but lineup and hours are released tomorrow.",
  "\"La Liga: Barca and Betis march on, Malaga held: Real Betis moved up to fourth in the table ... http://t.co/mdYFE4km http://t.co/iDWtFSZF\""
  ]
} '

Team

Priberam is a Portuguese SME founded in 1989 as a spin-off from Instituto Superior Técnico in Lisbon, offering cutting-edge semantic search and natural language processing technologies. Priberam licenses its technologies to clients such as Amazon and Microsoft, as well as the biggest media publishers in Portugal, Brazil and Spain. Priberam participates in several national and European R&D projects and maintains strong links with leading academic groups and research institutes.

Priberam Labs, the company’s research department, is focused on innovative technologies such as automatic media monitoring, recommendation, and social media analysis, with a strong component on machine learning research for Big Data.

To learn more about who specifically contributed to this codebase, see our contributors page.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].