dmuth / twitter-aws-comprehend

Licence: GPL-3.0 license
An app to analyze tweets using Amazon Comprehend's Sentiment Analysis service

Programming Languages

python
shell

Projects that are alternatives of or similar to twitter-aws-comprehend

French Sentiment Analysis Dataset
A collection of over 1.5 Million tweets data translated to French, with their sentiment.
Stars: ✭ 35 (+169.23%)
Mutual labels:  tweets, sentiment-analysis
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+292.31%)
Mutual labels:  tweets, sentiment-analysis
overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification
NLP tutorial
Stars: ✭ 41 (+215.38%)
Mutual labels:  tweets, sentiment-analysis
TwEater
A Python Bot for Scraping Conversations from Twitter
Stars: ✭ 16 (+23.08%)
Mutual labels:  tweets, sentiment-analysis
Sarcasm Detection
Detecting Sarcasm on Twitter using both traditional machine learning and deep learning techniques.
Stars: ✭ 73 (+461.54%)
Mutual labels:  tweets, sentiment-analysis
Twitter Sentiment Analysis
This script can tell you the sentiments of people regarding to any events happening in the world by analyzing tweets related to that event
Stars: ✭ 94 (+623.08%)
Mutual labels:  tweets, sentiment-analysis
arabic-sentiment-analysis
Sentiment Analysis in Arabic tweets
Stars: ✭ 64 (+392.31%)
Mutual labels:  tweets, sentiment-analysis
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+1030.77%)
Mutual labels:  sentiment-analysis
pghumor
Is This a Joke? Humor Detection in Spanish Tweets
Stars: ✭ 48 (+269.23%)
Mutual labels:  tweets
pytorch-sentiment-analysis
char-rnn implementation for sentiment analysis on twitter data
Stars: ✭ 32 (+146.15%)
Mutual labels:  sentiment-analysis
rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch
Stars: ✭ 25 (+92.31%)
Mutual labels:  sentiment-analysis
TwitterClone
Based on core principles of Twitter but different in many ways.
Stars: ✭ 27 (+107.69%)
Mutual labels:  tweets
senticnetapi
Simple API to use SenticNet
Stars: ✭ 69 (+430.77%)
Mutual labels:  sentiment-analysis
twitter mining
Twitter Mining in Java
Stars: ✭ 25 (+92.31%)
Mutual labels:  sentiment-analysis
Sentiment-Analysis-facebook-comments
Detection and Prediction of Users Attitude Based on Real-Time and Batch Sentiment Analysis of Facebook Comments
Stars: ✭ 63 (+384.62%)
Mutual labels:  sentiment-analysis
athena
Opinion mining
Stars: ✭ 25 (+92.31%)
Mutual labels:  sentiment-analysis
PlanSum
[AAAI2021] Unsupervised Opinion Summarization with Content Planning
Stars: ✭ 25 (+92.31%)
Mutual labels:  sentiment-analysis
Splunk TA paloalto
The Palo Alto Networks Add-on for Splunk allows a Splunk® Enterprise or Splunk Cloud administrator to collect data from Palo Alto Networks Next-Generation Firewall devices and Advanced Endpoint Protection.
Stars: ✭ 15 (+15.38%)
Mutual labels:  splunk
hfusion
Multimodal sentiment analysis using hierarchical fusion with context modeling
Stars: ✭ 42 (+223.08%)
Mutual labels:  sentiment-analysis
COVID-19-tweets-for-check-worthiness
COVID-19 Infodemic Twitter dataset
Stars: ✭ 14 (+7.69%)
Mutual labels:  tweets

NOTE: If you like this, you'll also like my next project, which performs sentiment analysis on Tweets by keyword!

Twitter AWS Comprehend

I recently learned of Amazon Comprehend and wanted to play around with its sentiment analysis.

So I built this app to download user timelines from Twitter, send them to AWS for analysis, and visualize them in Splunk. The following metrics are reported:

  • Start and end dates for tweets
  • Number of tweets
  • A graph of "Sentiment Over Time"
  • Number of F-bombs used
  • Net Happiness Index (percent of happy tweets minus percent of unhappy tweets)
  • Top Positive and Negative tweets
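
The Net Happiness Index in the list above is plain arithmetic. Here is a minimal sketch of it, assuming each analyzed tweet carries one of Comprehend's sentiment labels (POSITIVE, NEGATIVE, NEUTRAL, MIXED) and that "happy" means POSITIVE and "unhappy" means NEGATIVE. The function name is made up for illustration, and the dashboard itself may well compute this differently:

```python
from collections import Counter

def net_happiness_index(labels):
    """Percent of POSITIVE labels minus percent of NEGATIVE labels."""
    if not labels:
        return 0.0  # no tweets, so nothing to report
    counts = Counter(labels)
    return 100.0 * (counts["POSITIVE"] - counts["NEGATIVE"]) / len(labels)

# Two happy tweets and one unhappy tweet out of four: 50% - 25% = 25%.
labels = ["POSITIVE", "POSITIVE", "NEGATIVE", "NEUTRAL"]
print(net_happiness_index(labels))  # 25.0
```

NEUTRAL and MIXED tweets only count toward the total, so an account full of neutral tweets lands near zero.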

Screenshots

Additional screenshots are available in the img/ directory.

Requirements

Getting started

Downloading Tweets

You'll want to start off by running the script ./0-fetch-tweets -u username -n num_tweets_to_download to download Tweets via Twitter's API. When you first run the script, it will notice the lack of credentials and send you over to Twitter's App page, where you'll need to create an app. Then grab the App Key and App Secret and enter them when the script prompts you. Next, you'll be sent over to Twitter one more time and will receive a PIN to enter in the script. Do so, and you'll be authenticated to Twitter. This is a one-time process, so once you do it, you should not need to do it again.

The maximum number of tweets you can download from Twitter's API is 3200, but the actual number you get will be much lower as RTs are ignored and Twitter's API is really weird about giving you the actual number of tweets that you ask for. I do not understand it.

Analyzing Tweets

WARNING: This costs money! Based on AWS's pricing structure, a tweet will be treated as 3 "units", which will cost you $0.0003, or 3 hundredths of a cent, to analyze. So 100 tweets will cost 3 cents, while 1,000 tweets will cost 30 cents.
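
If you want to estimate the bill before a run, the arithmetic above can be sketched like this. The figures are the ones from the warning (3 units per tweet and $0.0003 per tweet, so $0.0001 per unit), and the names are made up for illustration. AWS pricing can change, so treat this as a rough guide only:

```python
from decimal import Decimal

# Figures from the warning above; check AWS's current pricing before relying on them.
UNITS_PER_TWEET = 3
PRICE_PER_UNIT = Decimal("0.0001")  # dollars

def estimate_cost(num_tweets):
    """Return the estimated cost in dollars of analyzing num_tweets tweets."""
    return num_tweets * UNITS_PER_TWEET * PRICE_PER_UNIT

for n in (100, 1000, 3200):
    print(f"{n} tweets: ${estimate_cost(n):.2f}")
# 100 tweets: $0.03
# 1000 tweets: $0.30
# 3200 tweets: $0.96
```

Decimal is used instead of float so that the tiny per-unit price does not pick up rounding noise.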

The syntax for the script to analyze sentiment is 1-analyze-sentiment -u username -n num_tweets [ --fake ]

I strongly encourage you to run the script with --fake on the first few tries so that you can fake calls to AWS and get comfortable running the script.
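
To give a feel for what a fake call versus a real call can look like, here is a rough sketch. It is not the script's actual code: the function name, the canned scores, and the English language code are all assumptions. The real branch uses boto3's Comprehend client and its detect_sentiment call:

```python
import random

def analyze_sentiment(text, fake=False):
    """Return (label, scores) for one tweet; never touches AWS when fake is True."""
    if fake:
        # Stand-in result shaped like Comprehend's response. Costs nothing.
        label = random.choice(["POSITIVE", "NEGATIVE", "NEUTRAL", "MIXED"])
        scores = {"Positive": 0.25, "Negative": 0.25, "Neutral": 0.25, "Mixed": 0.25}
        return label, scores

    # Imported lazily so that fake runs need neither boto3 nor AWS credentials.
    import boto3
    client = boto3.client("comprehend")
    # Assumption: tweets are in English.
    response = client.detect_sentiment(Text=text, LanguageCode="en")
    return response["Sentiment"], response["SentimentScore"]

label, scores = analyze_sentiment("I love this!", fake=True)
print(label, scores)
```

Since the fake branch returns the same shape as the real one, the rest of a pipeline can be exercised without spending anything.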

Feeding the analyzed tweets to Splunk

The syntax for the script to feed the tweets into Splunk is: 2-ingest-into-splunk -u username [ --splunk-port port ] [ --splunk-host hostname ] Defaults are 9997 and localhost, respectively.

The data is sent to Splunk over a raw TCP connection, so you'll want to configure Splunk accordingly. Here's a screenshot to help with that:

You'll want to have this source saving to the main Index.
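
For the curious, sending events to a raw TCP input boils down to something like the sketch below. The one-JSON-object-per-line format and the function name are assumptions for illustration, not necessarily what 2-ingest-into-splunk does. The host and port defaults match the ones mentioned above:

```python
import json
import socket

def send_events(events, host="localhost", port=9997):
    """Send each event as one JSON line over a plain TCP connection."""
    with socket.create_connection((host, port)) as sock:
        for event in events:
            line = json.dumps(event) + "\n"
            sock.sendall(line.encode("utf-8"))
    return len(events)

# Example (needs something listening on the port, such as a Splunk TCP input):
# send_events([{"username": "dmuth", "sentiment": "POSITIVE", "text": "Hello!"}])
```

Note that the sender gets nothing back from the other side here, so it cannot tell whether an event was already ingested.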

Visualization

This is the most interesting part. So far, we are making the following assumptions about Splunk:

  • Use of the main Index
  • Use of the Sourcetype twitter
  • Use of the Splunk app Search

Assuming those are the case, you're good to go! Just copy the file splunk/twitter_activity_sentiment.xml into $SPLUNK_HOME/etc/apps/search/local/data/ui/views, restart Splunk, and you should be all set!

Alternatively, a less convoluted way (which does not require restarting Splunk) would be to create a new dashboard, click Edit, click Source, and paste in the contents of twitter_activity_sentiment.xml.

A Word on Idempotency

I am a HUGE fan of Idempotency. Especially because AWS Comprehend costs money! Once I analyze a tweet, I never want to analyze it again. So I made a conscious choice to build my code that way. So, for example, if a tweet is analyzed and later the script 0-fetch-tweets is run, that code will not overwrite the sentiment fields. And once a tweet is analyzed by 1-analyze-sentiment, it will never be analyzed again!
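
Here is a small sketch of that idea, assuming a SQLite table with a nullable sentiment column. The schema and function names are hypothetical and are not this project's actual database layout:

```python
import sqlite3

def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tweets table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, sentiment TEXT)"
    )
    return db

def save_fetched(db, tweet_id, text):
    """Store a fetched tweet; an existing row (and its sentiment) is left alone."""
    db.execute("INSERT OR IGNORE INTO tweets (id, text) VALUES (?, ?)", (tweet_id, text))

def analyze_pending(db, analyze):
    """Analyze only tweets with no sentiment yet; return how many were analyzed."""
    rows = db.execute("SELECT id, text FROM tweets WHERE sentiment IS NULL").fetchall()
    for tweet_id, text in rows:
        db.execute("UPDATE tweets SET sentiment = ? WHERE id = ?", (analyze(text), tweet_id))
    db.commit()
    return len(rows)

db = open_db()
save_fetched(db, 1, "What a great day")
print(analyze_pending(db, lambda text: "POSITIVE"))  # 1: the first run pays for the call
save_fetched(db, 1, "What a great day")              # re-fetching does not reset sentiment
print(analyze_pending(db, lambda text: "POSITIVE"))  # 0: nothing left to analyze
```

The two pieces that matter are INSERT OR IGNORE, which keeps a re-fetch from clobbering an analyzed row, and the IS NULL filter, which keeps analyzed tweets from being sent to AWS again.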

One place where this does break down is with Splunk. Since the data is fed in through raw TCP and Splunk does not seem to give any acknowledgement (don't know why...), running that script twice will result in duplicate events. The way around that is to run a Splunk query like index=main sourcetype=twitter username=dmuth | delete before re-ingesting any data. I'm not thrilled with this particular workflow, and am looking at some alternatives.

Future TODO Items

  • Make tweet ingestion idempotent
  • See about using Twitter's search API to get older tweets. Seriously, Twitter. Let us get more than 3,200 Tweets through your API!
  • Come up with a metric to measure profanity on an account, not just f-bombs
  • Add "username" field to the database schema so we can analyze multiple users at once
  • Dockerize this to download a user's tweets, analyze them, export them, then load up a Splunk instance to ingest them

Contact

I had fun writing this, and I hope you enjoy using it. If there are any issues, feel free to file an issue against this project, hit me up on Twitter or Facebook, or drop me a line: dmuth AT dmuth DOT org.