pranau97 / reddit-opinion-mining

License: Unlicense
Sentiment analysis and opinion mining of Reddit data.

Reddit Opinion Mining and Sentiment Analysis

A project written in R and Python to mine a Reddit corpus.

Requirements

Python and its dependencies

  1. Python 3
  2. PRAW
  3. requests
  4. bs4
  5. numpy
  6. fuzzywuzzy
  7. nltk
  8. matplotlib

Recommended: install the Python packages in a virtual environment.

Install each package using pip install -U <package-name>. NLTK also requires that you download its tokenizer and English stopword corpora.
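
The NLTK corpora can be fetched from Python once nltk itself is installed. The following is a minimal sketch, not part of the project: the resource names "punkt" and "stopwords" are an assumption based on the description above, and the helper name download_nltk_resources is hypothetical. Newer NLTK releases may ask for "punkt_tab" instead of "punkt".

```python
# Assumed resource names: "punkt" (tokenizer models) and "stopwords"
# (stopword lists, which include English). Adjust if NLTK reports a
# different missing resource.
NLTK_RESOURCES = ["punkt", "stopwords"]


def download_nltk_resources(resources=NLTK_RESOURCES):
    """Download each NLTK resource and return the names that failed."""
    # Imported here so this file can be loaded before nltk is installed.
    import nltk

    failed = []
    for name in resources:
        # nltk.download returns True on success and False otherwise.
        if not nltk.download(name, quiet=True):
            failed.append(name)
    return failed
```

Calling download_nltk_resources() should return an empty list when everything was fetched. The same downloads can also be run from a shell with python3 -m nltk.downloader punkt stopwords.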

R and its dependencies

  1. R
  2. sna
  3. ggnetwork
  4. svglite
  5. igraph
  6. intergraph
  7. rsvg
  8. ggplot2

Install using install.packages("<package-name>").

Obtaining Reddit API access credentials

  1. Create a Reddit account, and while logged in, navigate to preferences > apps
  2. Click the "are you a developer? create an app..." button.
  3. Fill in the details:
    • name: Name of your bot/script
    • Select the option 'script'
    • description: Put in a description of your bot/script
    • redirect uri: http://localhost:8080
  4. Click on Create App.
  5. You will be given a client_id and a client_secret. Keep them confidential.
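
These credentials end up in the praw.ini file described under Running the scripts. As a pre-flight check, a sketch like the one below could confirm that the file contains all four keys before getdata.py is run. It uses only the standard library; the function name load_credentials is hypothetical and this is not how the project itself reads the file (PRAW does that on its own).

```python
import configparser

# The four keys listed in the praw.ini template.
REQUIRED_KEYS = ("username", "password", "client_id", "client_secret")


def load_credentials(path, section):
    """Return the credentials in `section` of `path`, or raise if incomplete."""
    # interpolation=None keeps a literal "%" in a password from being
    # treated as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"could not read {path}")
    if not parser.has_section(section):
        raise KeyError(f"no [{section}] section in {path}")
    creds = dict(parser[section])
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    return creds
```

For example, load_credentials("praw.ini", "<bot-name>") returns a dictionary of the four values, or raises an error naming whatever is missing. Avoid printing the returned values, since they are meant to stay confidential.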

Extracting edge data from the Pushshift Reddit dataset

  1. Sign up for or log in to Google BigQuery.
  2. Select or create a new project and click on 'Compose Query'.
  3. Paste the contents of the SQL script from the subreddit-viz folder into the editor and run it.
  4. Download the generated CSV file as reddit-edge-list.csv and save it within the subreddit-viz folder.
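
Before handing the file to the R script, it can be worth checking that the download is a usable edge list. The sketch below is an illustration only: it assumes the CSV has a header row and that the first two columns of each row are the two endpoints of an edge. The actual column layout depends on the SQL script and is not documented here, and summarize_edge_list is a hypothetical helper name.

```python
import csv


def summarize_edge_list(path):
    """Return (edge_count, node_count) for a CSV edge list with a header row."""
    nodes = set()
    edges = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            # Assumption: the first two columns are the edge endpoints.
            nodes.update(row[:2])
            edges += 1
    return edges, len(nodes)
```

Running summarize_edge_list("subreddit-viz/reddit-edge-list.csv") gives a quick count of edges and distinct nodes. A result of (0, 0) suggests the query returned nothing or the file was saved in a different format.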

Running the scripts

  1. Create an empty folder called subreddit-groups in the same folder as the R script, then run the script using R CMD BATCH reddit.R to obtain the subreddit visualizations.
  2. Create a file named praw.ini with the following contents:
    [<bot-name>]
    username: reddit username
    password: reddit password
    client_id: client_id that you got
    client_secret: client_secret that you got
    
  3. Run the script getdata.py via python3 getdata.py.
  4. Wait for the scrape to finish. It should collect all the necessary data in approximately 20-25 minutes.
  5. Run analysis.py using python3 analysis.py [args]. The script accepts the following arguments:
    • no arguments - Runs sentiment analysis on the entire data.
    • -h or --help - Prints the usage details.
    • -w <string> <type> or --words <string> <type> - Generates a word distribution for the given string, where type is either positive or negative. Sentiment analysis must already have been run for that string.
    • <string> - Looks for similar strings in the corpus and performs sentiment analysis on them.
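
To make the argument list above concrete, here is a hypothetical sketch of how such a command-line interface could be declared with argparse. It is not taken from analysis.py, whose real implementation may differ. The names query, build_parser and parse_args are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Declare the interface described above; -h/--help is added by argparse."""
    parser = argparse.ArgumentParser(
        prog="analysis.py",
        description="Sentiment analysis over the scraped Reddit data.",
    )
    # Optional positional: when omitted, the whole data set is analysed.
    parser.add_argument(
        "query",
        nargs="?",
        help="search term; omit to analyse the entire data",
    )
    # -w/--words takes exactly two values: the string and the type.
    parser.add_argument(
        "-w", "--words",
        nargs=2,
        metavar=("STRING", "TYPE"),
        help="word distribution for STRING; TYPE is positive or negative",
    )
    return parser


def parse_args(argv=None):
    """Parse argv and reject any TYPE other than positive or negative."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.words and args.words[1] not in ("positive", "negative"):
        parser.error("TYPE must be 'positive' or 'negative'")
    return args
```

With this sketch, parse_args([]) leaves both query and words unset, parse_args(["python"]) sets query to "python", and parse_args(["-w", "python", "negative"]) sets words to the pair of values. An unrecognised type exits with a usage message.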
