trinker / stansent

Licence: other

Programming Language: R

Projects that are alternatives to or similar to stansent

NRC-Persian-Lexicon
NRC Word-Emotion Association Lexicon
Stars: ✭ 30 (+87.5%)
Mutual labels:  sentiment-analysis, sentiment
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+6975%)
Mutual labels:  sentiment-analysis, sentiment
Sentimentr
Dictionary based sentiment analysis that considers valence shifters
Stars: ✭ 325 (+1931.25%)
Mutual labels:  sentiment-analysis, sentiment
wink-sentiment
Accurate and fast sentiment scoring of phrases with #hashtags, emoticons :) & emojis 🎉
Stars: ✭ 51 (+218.75%)
Mutual labels:  sentiment-analysis, sentiment
LSTM-sentiment-analysis
LSTM sentiment analysis. See my other repo for SVM and Naive Bayes algorithms.
Stars: ✭ 19 (+18.75%)
Mutual labels:  sentiment-analysis, sentiment
stock-news-sentiment-analysis
This program uses the VADER SentimentIntensityAnalyzer to calculate the overall sentiment of news headlines for a stock
Stars: ✭ 21 (+31.25%)
Mutual labels:  sentiment-analysis, sentiment
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+6381.25%)
Mutual labels:  sentiment-analysis, sentiment
sentiment-analysis-using-python
Large Data Analysis Course Project
Stars: ✭ 23 (+43.75%)
Mutual labels:  sentiment-analysis, sentiment
Sentiment
AFINN-based sentiment analysis for Node.js.
Stars: ✭ 2,469 (+15331.25%)
Mutual labels:  sentiment-analysis, sentiment
Twitter Sentiment Analysis
This script can tell you people's sentiment regarding any event happening in the world by analyzing tweets related to that event
Stars: ✭ 94 (+487.5%)
Mutual labels:  sentiment-analysis, sentiment
brand-sentiment-analysis
Scripts utilizing the Heartex platform to build brand sentiment analysis from the news
Stars: ✭ 21 (+31.25%)
Mutual labels:  sentiment-analysis, sentiment
chronist
Long-term analysis of emotion, age, and sentiment using Lifeslice and text records.
Stars: ✭ 23 (+43.75%)
Mutual labels:  sentiment-analysis, sentiment
Emotion and Polarity SO
An emotion classifier of text containing technical content from the SE domain
Stars: ✭ 74 (+362.5%)
Mutual labels:  sentiment-analysis, sentiment
billboard
🎤 Lyrics/associated NLP data for Billboard's Top 100, 1950-2015.
Stars: ✭ 53 (+231.25%)
Mutual labels:  sentiment-analysis, sentiment
GroupDocs.Classification-for-.NET
GroupDocs.Classification-for-.NET samples and showcase (text and documents classification and sentiment analysis)
Stars: ✭ 38 (+137.5%)
Mutual labels:  sentiment-analysis, sentiment
Troll
Language sentiment analysis and neural networks... for trolls.
Stars: ✭ 330 (+1962.5%)
Mutual labels:  sentiment-analysis, sentiment
Pytreebank
😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (+481.25%)
Mutual labels:  sentiment-analysis, sentiment
Whatsapp-analytics
Performing sentiment analysis on WhatsApp chats.
Stars: ✭ 20 (+25%)
Mutual labels:  sentiment-analysis, sentiment
Senti4SD
An emotion-polarity classifier specifically trained on developers' communication channels
Stars: ✭ 41 (+156.25%)
Mutual labels:  sentiment-analysis, sentiment
ar-embeddings
Sentiment Analysis for Arabic Text (tweets, reviews, and standard Arabic) using word2vec
Stars: ✭ 83 (+418.75%)
Mutual labels:  sentiment-analysis

stansent

Project Status: Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.

stansent wraps Stanford's coreNLP sentiment tagger in a way that makes the tool easier to set up. The output is designed to look and behave like the objects from the sentimentr package: plotting and the sentimentr::highlight functionality work similarly to the sentiment/sentiment_by objects from sentimentr, so less learning is required to move between the two packages.

In addition to sentimentr and stansent, Matthew Jockers has created the syuzhet package, which utilizes dictionary lookups for the Bing, NRC, and AFINN methods. Similarly, Subhasree Bose has contributed RSentiment, which utilizes dictionary lookup and attempts to address negation and sarcasm. Click here for a comparison between stansent, sentimentr, syuzhet, and RSentiment; note the accuracy and run times of the packages.
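For orientation only, the snippet below sketches how the dictionary-based alternatives mentioned above are typically called. The example text and exact calls are illustrative (they are not taken from the comparison linked above), though get_sentiment() and sentiment() are real functions in syuzhet and sentimentr.

# Illustrative only: dictionary-based scoring with syuzhet and sentimentr
library(syuzhet)
library(sentimentr)

txt <- "But I hate really bad dogs"

get_sentiment(txt, method = "bing")   # Bing lexicon lookup
get_sentiment(txt, method = "nrc")    # NRC lexicon lookup
get_sentiment(txt, method = "afinn")  # AFINN lexicon lookup
sentiment(txt)                        # sentimentr: dictionary lookup with valence shifters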

Installation

To download the development version of stansent:

Download the zip ball or tar ball, decompress it, and run R CMD INSTALL on it; or use the pacman package to install the development version:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/coreNLPsetup", "trinker/stansent")

After installing, run the following to make sure the right version of Java is installed and coreNLP is set up in the right location:

check_setup()

Functions

There are two main functions in stansent along with a few helper functions. The main functions, task categories, and descriptions are summarized in the table below; a brief usage sketch follows the table.

Function                Task Category    Description
sentiment_stanford      sentiment        Sentiment at the sentence level
sentiment_stanford_by   sentiment        Aggregated sentiment by group(s)
uncombine               reshaping        Extract sentence level sentiment from sentiment_by
get_sentences           reshaping        Regex based string to sentence parser (or get sentences from sentiment/sentiment_by)
highlight                                Highlight positive/negative sentences as an HTML document
check_setup             initial set-up   Make sure Java and coreNLP are set up correctly
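As a quick orientation, here is a minimal sketch of how these functions fit together. It assumes Java and coreNLP are already set up (see check_setup above); the example text and object names are made up.

# Minimal sketch: sentence-level scoring, aggregation, and reshaping
library(stansent)

txt <- c("I love this camera.", "The battery life is terrible.")

sent_sentences <- sentiment_stanford(txt)     # sentiment at the sentence level
sent_grouped   <- sentiment_stanford_by(txt)  # aggregated sentiment by element
uncombine(sent_grouped)                       # back to sentence-level sentiment
get_sentences(sent_grouped)                   # extract the parsed sentences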

Contact

You are welcome to submit suggestions and bug reports via the project's GitHub issues page.

Demonstration

Load the Packages/Data

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh(c("trinker/stansent", "trinker/sentimentr"))
pacman::p_load(dplyr)

mytext <- c(
    'do you like it?  But I hate really bad dogs',
    'I am the best friend.',
    'Do you really like it?  I\'m not a fan'
)

data(presidential_debates_2012, cannon_reviews)
set.seed(100)
dat <- presidential_debates_2012[sample(1:nrow(presidential_debates_2012), 100), ]

sentiment_stanford

out1 <- sentiment_stanford(mytext) 
out1[["text"]] <- unlist(get_sentences(out1))
out1

##    element_id sentence_id word_count sentiment                       text
## 1:          1           1          4       0.0            do you like it?
## 2:          1           2          6      -0.5 But I hate really bad dogs
## 3:          2           1          5       0.5      I am the best friend.
## 4:          3           1          5       0.0     Do you really like it?
## 5:          3           2          4      -0.5              I'm not a fan

sentiment_stanford_by: Aggregation

To aggregate by element (column cell or vector element) use sentiment_stanford_by with by = NULL.

out2 <- sentiment_stanford_by(mytext) 
out2[["text"]] <- mytext
out2

##    element_id word_count        sd ave_sentiment
## 1:          1         10 0.3535534         -0.25
## 2:          2          5        NA          0.50
## 3:          3          9 0.3535534         -0.25
##                                           text
## 1: do you like it?  But I hate really bad dogs
## 2:                       I am the best friend.
## 3:       Do you really like it?  I'm not a fan

To aggregate by grouping variables, use sentiment_stanford_by, supplying the grouping variable(s) to the by argument.

(out3 <- with(dat, sentiment_stanford_by(dialogue, list(person, time))))

##        person   time word_count        sd ave_sentiment
##  1:     OBAMA time 2        207 0.4042260     0.1493099
##  2:     OBAMA time 1         34 0.7071068     0.0000000
##  3:    LEHRER time 1          2        NA     0.0000000
##  4:  QUESTION time 2          7 0.7071068     0.0000000
##  5: SCHIEFFER time 3         47 0.5000000     0.0000000
##  6:     OBAMA time 3        129 0.4166667    -0.1393260
##  7:   CROWLEY time 2         72 0.4166667    -0.1393260
##  8:    ROMNEY time 3        321 0.3746794    -0.1508172
##  9:    ROMNEY time 2        323 0.3875534    -0.2293311
## 10:    ROMNEY time 1         95 0.2236068    -0.4138598

Recycling

Note that the Stanford coreNLP functionality takes considerable time to compute (~14.5 seconds to compute the output above). The output from sentiment_stanford/sentiment_stanford_by can be recycled inside of sentiment_stanford_by, reusing the raw scoring and avoiding a new call to Java.

with(dat, sentiment_stanford_by(out3, list(role, time)))

##         role   time word_count        sd ave_sentiment
## 1: candidate time 1        129 0.3933979   -0.29271628
## 2: candidate time 2        530 0.4154046   -0.06751165
## 3: candidate time 3        450 0.3796283   -0.15455530
## 4: moderator time 1          2        NA    0.00000000
## 5: moderator time 2         72 0.4166667   -0.13932602
## 6: moderator time 3         47 0.5000000    0.00000000
## 7:     other time 2          7 0.7071068    0.00000000
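To see the savings for yourself, one could time a fresh call against a recycled one. The sketch below is illustrative; timings vary by machine and are not taken from this README.

# Sketch: fresh coreNLP scoring vs. recycling the raw scores already in out3
system.time(with(dat, sentiment_stanford_by(dialogue, list(role, time))))  # re-runs the Java scoring
system.time(with(dat, sentiment_stanford_by(out3, list(role, time))))      # reuses out3's raw scores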

Plotting

Plotting the Aggregated Sentiment

The possible sentiment values in the output are {-1, -0.5, 0, 0.5, 1}. The raw number of occurrences at each sentiment level is plotted as a bubble version of Cleveland's dot plot. The red cross represents the mean sentiment score (grouping variables are ordered by this score by default).

plot(out3)

Plotting at the Sentence Level

The plot method for the sentiment class uses syuzhet's get_transformed_values combined with ggplot2 to make a reasonable, smoothed plot for the duration of the text based on percentage, allowing comparison between plots of different texts. This plot gives the overall shape of the text's sentiment. See syuzhet::get_transformed_values for more details.

plot(uncombine(out3))
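For a rough idea of what this plot method does under the hood, the sketch below approximates the approach with base graphics; it is not the package's actual plotting code, and the variable names (sent, trans) are illustrative.

# Approximation: smooth sentence-level scores onto a 0-100% narrative-time axis
library(syuzhet)

sent  <- uncombine(out3)                    # sentence-level sentiment scores
trans <- get_transformed_values(sent$sentiment, low_pass_size = 3, scale_range = TRUE)
plot(trans, type = "l", xlab = "Narrative time (percent)", ylab = "Transformed sentiment")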

Text Highlighting

The user may wish to see the output from sentiment_stanford_by line by line with positive/negative sentences highlighted. The sentimentr::highlight function wraps a sentiment_by output to produce a highlighted HTML file (positive = green; negative = pink). Here we look at three random reviews from Hu and Liu's (2004) Cannon G3 Camera Amazon product reviews.

set.seed(2)
highlight(with(subset(cannon_reviews, number %in% sample(unique(number), 3)), sentiment_stanford_by(review, number)))
