ParticipaPY / politic-bots

Licence: GPL-3.0 License
Tools and algorithms to analyze Paraguayan Tweets in times of elections

Programming Languages

Jupyter Notebook
HTML
Python
JavaScript

Politic Bots

Politic Bots is a side project that started within the research project ParticipaPY of the Catholic University "Nuestra Señora de la Asunción", which aims to design and implement solutions that generate spaces for civic participation through technology.

Motivated by a series of journalistic investigations (e.g., How a Russian 'troll soldier' stirred anger after the Westminster attack, Anti-Vaxxers Are Using Twitter to Manipulate a Vaccine Bill, A Russian Facebook page organized a protest in Texas. A different Russian page launched the counterprotest, La oscura utilización de Facebook y Twitter como armas de manipulación política, How Bots Ruined Clicktivism) into the use of social media to manipulate public opinion, especially in times of elections, we decided to study the use of Twitter during the presidential elections that took place in Paraguay in December 2017 (primary) and April 2018 (general).

To understand how the public and the political candidates use Twitter, we collected tweets published through the accounts of the candidates or containing hashtags used in the campaigns. The accounts and hashtags were recorded in a CSV file that was provided to the tweet collector. The source code of the collector is available here, and the CSV file used to pull data from Twitter during the primaries can be found here.

Data Augmentation

The accounts and hashtags used to collect tweets during the primary elections were augmented with information about the parties and internal movements of the candidates. A similar approach was followed to collect tweets for the general election. In this case, however, the hashtags and accounts were supplemented with information not only about the parties of the candidates but also about the region of the candidates, the name of their coalitions (if any), and the political positions that they stand for. The CSV file used to collect tweets during the general elections can be found here.

The functions create_flags and add_values_to_flags, which annotate the tweets with information about the candidate's party and movement, are implemented in the module add_flags.py in src/tweet_collector.
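As a rough illustration of this kind of annotation, the sketch below attaches party and movement information to a tweet based on the keywords found in its text. It is not the code in add_flags.py: the helper names build_metadata_index and annotate_tweet, the flags key, the party and movement column names, and the sample rows are all hypothetical.

```python
# Hypothetical sketch; it does not reproduce create_flags / add_values_to_flags.

def build_metadata_index(rows):
    """Map each lower-cased keyword to the extra columns of its CSV row."""
    index = {}
    for row in rows:
        keyword = row["keyword"].strip().lower()
        index[keyword] = {k: v for k, v in row.items() if k != "keyword"}
    return index


def annotate_tweet(tweet, index):
    """Return a copy of the tweet with a 'flags' dict built from matching keywords."""
    text = tweet["text"].lower()
    flags = {}
    for keyword, extra in index.items():
        if keyword in text:
            for column, value in extra.items():
                # Collect every distinct value seen for each column.
                values = flags.setdefault(column, [])
                if value not in values:
                    values.append(value)
    annotated = dict(tweet)
    annotated["flags"] = flags
    return annotated


# Made-up rows standing in for the CSV file of accounts and hashtags.
rows = [
    {"keyword": "#EjemploHashtag", "party": "Party A", "movement": "Movement X"},
    {"keyword": "@candidata_ejemplo", "party": "Party B", "movement": "Movement Y"},
]
index = build_metadata_index(rows)
tweet = {"text": "Vamos con todo #ejemplohashtag"}
annotated = annotate_tweet(tweet, index)
print(annotated["flags"])
```

Matching on the raw text keeps the sketch short; a real implementation would more likely rely on the hashtag and mention entities returned by the Twitter API.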

Data Cleaning

Some of the hashtags used by the candidates are generic Spanish words that are also used in other contexts and in other Spanish-speaking countries (e.g., a marketing campaign in Argentina). Before starting any analysis, we therefore had to ensure that the collected tweets were actually related to the elections in Paraguay. We labeled a collected tweet as relevant if it mentioned candidate accounts or if it contained more than one of the hashtags of interest. The class TweetEvaluator in src/utils/data_wrangler.py contains the code that labels the collected tweets as relevant or not for this project.
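The labeling rule can be sketched as follows. This is an illustration rather than the TweetEvaluator code: the function is_relevant and its parameters are hypothetical, the hashtag threshold is read as "more than one" (an assumption, exposed as min_hashtags), and tweets are assumed to carry the entities structure of the standard Twitter API tweet object.

```python
def is_relevant(tweet, candidate_accounts, hashtags_of_interest, min_hashtags=2):
    """Return True if the tweet mentions a candidate or uses enough tracked hashtags.

    candidate_accounts and hashtags_of_interest are sets of lower-cased names
    without the leading '@' or '#'.
    """
    entities = tweet.get("entities", {})
    mentioned = {m["screen_name"].lower() for m in entities.get("user_mentions", [])}
    used = {h["text"].lower() for h in entities.get("hashtags", [])}
    if mentioned & candidate_accounts:
        return True
    # Assumption: "more than one" hashtag of interest means at least two.
    return len(used & hashtags_of_interest) >= min_hashtags


# Made-up accounts, hashtags, and tweets.
accounts = {"candidata_ejemplo"}
hashtags = {"ejemplo1", "ejemplo2"}

mention_tweet = {"entities": {"user_mentions": [{"screen_name": "Candidata_Ejemplo"}], "hashtags": []}}
one_tag_tweet = {"entities": {"user_mentions": [], "hashtags": [{"text": "Ejemplo1"}]}}
two_tag_tweet = {"entities": {"user_mentions": [], "hashtags": [{"text": "ejemplo1"}, {"text": "ejemplo2"}]}}

for sample in (mention_tweet, one_tag_tweet, two_tag_tweet):
    print(is_relevant(sample, accounts, hashtags))
```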

Structure of the repository

├── LICENSE
├── README.md                           <- The top-level README for this project.
├── requirements.txt                    <- The requirements file for reproducing the project environment, e.g.
│                                          generated with `pip freeze > requirements.txt`
├── data
├── sna                                 <- Social Network Analysis
│   ├── gefx                            <- Files that record the interaction network among users
│   ├── img                             <- Images that illustrate the interaction network among users
├── reports                             <- Reports about the usage of Twitter during elections in Paraguay
│   ├── notebooks                       <- Jupyter notebooks used to conduct the analyses
│   ├── html                            <- HTML files with the results of the analyses
├── src                                 <- Source code of the project
│   ├── __init__.py                     <- Makes src a Python module
│   ├── run.py                          <- Main script to run analysis tasks
│   ├── config.json.example             <- Example of a configuration file
│   ├── analyzer
│   │   └── data_analyzer.py            <- Functions to conduct analyses on tweets
│   │   └── network_analysis.py         <- Class used to conduct Social Network Analysis
│   ├── bot_detector
│   │   └── bot_detector.py             <- Main class to conduct the detection of bots
│   │   └── run.py                      <- Main function to execute the detection of bots
│   │   └── heuristics
│   │   │   └── fake_handlers.py        <- Functions to execute the heuristic fake handlers
│   │   │   └── fake_promoter.py        <- Functions to execute the heuristic fake promoter
│   │   │   └── heuristic_config.json   <- Configuration file with the parameters of the heuristics
│   │   │   └── simple.py               <- Functions to execute a set of straightforward heuristics
│   ├── tweet_collector
│   │   └── add_flags.py                <- Functions used to augment tweets with information about the candidates
│   │   └── generales.csv               <- CSV file with the hashtags and accounts used to collect tweets related to
│   │   │                                  the general elections
│   │   └── internas_2017.csv           <- CSV file with the hashtags and accounts used to collect tweets related to
│   │   │                                  the primary elections
│   │   └── tweet_collector.py          <- Class implemented to collect tweets by hitting the API of Twitter
│   ├── utils
│   │   └── data_wrangler.py            <- Functions and classes to clean and pre-process the data
│   │   └── db_manager.py               <- Main class to operate the MongoDB used to store the tweets
│   │   └── utils.py                    <- General utilitarian functions

Analyses

The directory reports contains the analyses conducted to study the use of Twitter during the primary and general elections. Jupyter notebooks were used to document the analyses and report the results. HTML files were generated to make the analyses and results easier to access.

Getting Started

Installation guide

Links to packages are provided below.

  1. Download and install Python >= 3.4.4;
  2. Download and install MongoDB community version;
  3. Create a Twitter App by following the instructions here;
  4. Clone the repository by running git clone https://github.com/ParticipaPY/politic-bots.git;
  5. Get into the directory of the repository by running cd politic-bots;
  6. Create a virtual environment by running virtualenv env;
  7. Activate the virtual environment by executing source env/bin/activate;
  8. Inside the directory of the repository, install the project dependencies by running pip install -r requirements.txt.
Optional for social network analysis: Download and install Gephi

Run

Several tasks can be run from src/run.py. Below we explain each of them.

Pre-requirements

  1. Set the information of the MongoDB database used to store the tweets in src/config.json;
  2. Activate the virtual environment by executing source env/bin/activate.

Collect Political Tweets

The data sets of tweets collected during the presidential primary and general elections that took place in Paraguay in December 2017 and April 2018, respectively, are available for download.

If a new set of tweets needs to be downloaded, follow the steps listed below.

  1. Create a CSV file that contains the names of the Twitter accounts and hashtags of interest. The CSV file should have a column called keyword with the list of accounts and hashtags. Additional columns can be added to the CSV file to complement the information about accounts and hashtags. An example CSV file can be found here;
  2. Rename src/config.json.example to src/config.json;
  3. Set metadata in src/config.json to the path of the CSV file;
  4. Set consumer_key and consumer_secret in src/config.json to the authentication tokens of the Twitter App created during the installation process;
  5. Go to the src directory and execute python run.py --collect_tweets.

Depending on the number of hashtags and accounts, the collection can take several hours or even days.
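Before launching a long collection, it can help to check the two inputs described in the steps above. The sketch below is only an illustration: the helpers missing_config_keys and read_keywords are hypothetical, the three keys are assumed to sit at the top level of config.json, and in-memory strings with made-up values stand in for the real files.

```python
import csv
import io
import json

# Keys named in the steps above; assumed here to sit at the top level of config.json.
REQUIRED_KEYS = ("metadata", "consumer_key", "consumer_secret")


def missing_config_keys(config):
    """Return the required keys that are absent or empty in the config dict."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def read_keywords(csv_file):
    """Return the non-empty values of the 'keyword' column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    if "keyword" not in (reader.fieldnames or []):
        raise ValueError("the CSV file needs a column called 'keyword'")
    return [row["keyword"].strip() for row in reader if (row["keyword"] or "").strip()]


# In-memory stand-ins for src/config.json and the CSV file (made-up values).
config = json.loads('{"metadata": "keywords.csv", "consumer_key": "", "consumer_secret": "abc"}')
sample_csv = io.StringIO("keyword,party\n#EjemploHashtag,Party A\n@cuenta_ejemplo,Party B\n")

missing = missing_config_keys(config)
keywords = read_keywords(sample_csv)
print(missing)
print(keywords)
```

With real files, the same functions could be called on json.load(open("config.json")) and on the file opened at the path stored under metadata.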

Create database of tweet authors

Before conducting analyses on the tweets, a database of the authors of tweets should be created. To create the database of users, activate the virtual environment (source env/bin/activate) and execute python run.py --db_users from the src directory. A new MongoDB database called users is created as a result of this process.
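Conceptually, this step derives one record per author from the collected tweets. The sketch below shows the idea with plain dictionaries instead of MongoDB. The function build_author_index and the fields of the resulting records are hypothetical, and the project's own user schema may differ. Tweets are assumed to follow the standard Twitter API tweet object, with user.id_str and user.screen_name.

```python
def build_author_index(tweets):
    """Aggregate tweets into one record per author, keyed by the user id string."""
    authors = {}
    for tweet in tweets:
        user = tweet["user"]
        record = authors.setdefault(
            user["id_str"], {"screen_name": user["screen_name"], "tweets": 0}
        )
        record["tweets"] += 1
        # Keep the most recently seen screen name, since users can rename accounts.
        record["screen_name"] = user["screen_name"]
    return authors


# Made-up tweets with only the fields used above.
tweets = [
    {"user": {"id_str": "1", "screen_name": "ana"}},
    {"user": {"id_str": "2", "screen_name": "luis"}},
    {"user": {"id_str": "1", "screen_name": "ana_py"}},
]
authors = build_author_index(tweets)
print(authors)
```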

Analyze the sentiment of tweets

It is possible to analyze the tone of tweets by executing, from the src directory, python run.py --sentiment_analysis. The sentiment of each tweet is stored in the dictionary that contains the information of the tweet, under the key sentimiento. We use the library CCA-Core to analyze the sentiment embedded in tweets. See here for more information about the CCA-Core library.

Identify relevant tweets

Tweets should be evaluated to determine their relevance for this project. See the Data Cleaning section to understand the problems with the hashtags used to collect tweets. From the src directory of the repository and after activating your virtual environment (source env/bin/activate), run python run.py --flag_tweets to perform both tasks. The flag relevante, added to the dictionary that stores the information of the tweets, indicates whether the tweet is relevant or not for the purpose of this project.

Generate network of interactions

Once the database of users has been generated, a network that shows the interactions among them can be created for a follow-up social network analysis. From the src directory and after activating your virtual environment (source env/bin/activate), run python run.py --interaction_net to generate the network of interactions among the tweet authors. Examples of interaction networks can be found in the directory sna of the repo.
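To give an idea of what such a network contains, the sketch below derives weighted, directed edges between users from replies, retweets, and mentions. It is a simplified illustration under stated assumptions, not the logic of network_analysis.py: the function interaction_edges is hypothetical, and the field names are those of the standard Twitter API tweet object.

```python
from collections import Counter


def interaction_edges(tweets):
    """Count directed (author, target) interactions across a list of tweets."""
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"]["screen_name"]
        targets = set()
        if tweet.get("in_reply_to_screen_name"):
            targets.add(tweet["in_reply_to_screen_name"])
        if "retweeted_status" in tweet:
            targets.add(tweet["retweeted_status"]["user"]["screen_name"])
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            targets.add(mention["screen_name"])
        # Each tweet adds at most one unit of weight per target; self-loops are skipped.
        for target in targets:
            if target != author:
                edges[(author, target)] += 1
    return edges


# Made-up tweets with only the fields used above.
tweets = [
    {"user": {"screen_name": "ana"}, "in_reply_to_screen_name": "luis",
     "entities": {"user_mentions": [{"screen_name": "luis"}]}},
    {"user": {"screen_name": "ana"},
     "entities": {"user_mentions": [{"screen_name": "luis"}, {"screen_name": "ana"}]}},
    {"user": {"screen_name": "luis"},
     "retweeted_status": {"user": {"screen_name": "ana"}},
     "entities": {"user_mentions": []}},
]
edges = interaction_edges(tweets)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

An edge list of this kind can then be exported to a graph format that Gephi can open.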

Troubleshooting

If you get the error ImportError: No module named when trying to execute the scripts, make sure you are in the src directory. If you still get the same error from the src directory, you may need to make the src package importable by adding PYTHONPATH=../ at the beginning of the execution commands, e.g., PYTHONPATH=../ python analyzer/pre_analysis.py
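When changing the command line is inconvenient (for instance, inside a notebook), a roughly equivalent effect can be obtained from Python itself by extending sys.path before the project imports. The helper add_to_sys_path below is hypothetical and is not part of the repository. It adds the parent of the current working directory, which is what ../ points to when the script is run from src.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Put a directory at the front of sys.path unless it is already listed."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Run from the src directory, ".." resolves to the root of the repository.
added = add_to_sys_path("..")
print(added in sys.path)
```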

Technologies

  1. Python 3.4
  2. MongoDB
  3. Tweepy
  4. Gephi

Issues

Please use GitHub's issue tracker to report issues and suggestions.

Contributors

Jammily Ortigoza, Jorge Saldivar, Josué Ibarra, Laura Achón, Cristhian Parra