iterative / aita_dataset

Licence: other

AITA dataset based on r/AmItheAsshole/

Programming Languages

python

AITA Dataset

Great news! Since the original blog post was shared, we discovered that the API used to collect post scores excluded ~30K posts from AITA in 2018-2019. These have been added to the dataset in the latest release. We will be sharing an update to some of the metrics calculated in the blog shortly.

This repo contains code to replicate our scrape of the r/AmItheAsshole subreddit, as well as .dvc files linking this GitHub repo to an S3 bucket hosting the dataset.

The dataset is built by three scripts, run in order:

0_scraper_push_api.py collects Reddit post ids and scores from within a desired timeframe.
1_scraper_praw.py uses the praw library to query each post by id and retrieve the associated text and metadata.
2_clean_and_consolidate.py cleans the data and does some general tidying.
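
As a rough illustration of the second step, the sketch below flattens a praw submission into one row. It is not the code from 1_scraper_praw.py: the function name `submission_to_record` and the sample values are hypothetical, and the choice of attributes is an assumption based on the feature list of the dataset. With praw installed and configured, a real object would come from `reddit.submission(id=post_id)`; here a stand-in object is used so the example runs without credentials.

```python
from types import SimpleNamespace


def submission_to_record(submission) -> dict:
    """Flatten a praw Submission-like object into one dataset row.

    Only attribute access is used, so any object exposing the same
    attribute names works (hypothetical helper, not the repo's code).
    """
    return {
        "id": submission.id,
        "timestamp": submission.created_utc,  # epoch/Unix seconds
        "title": submission.title,
        "body": submission.selftext,
        "edited": submission.edited,  # False, or the edit timestamp
        "score": submission.score,
        "num_comments": submission.num_comments,
    }


# Stand-in for a real submission; every value here is made up.
fake_post = SimpleNamespace(
    id="abc123",
    created_utc=1546300800.0,
    title="AITA for writing an example?",
    selftext="Example body text.",
    edited=False,
    score=42,
    num_comments=7,
)

record = submission_to_record(fake_post)
print(record["id"], record["score"])
```

The verdict is left out on purpose, since the README does not say how it is extracted from a post.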

The dataset contained in aita_clean.csv has 9 features:

id, a unique string provided by Reddit's API to index every post
timestamp, the time of post creation in epoch/Unix format
title, a string
body, a string
edited, the timestamp at which a post was edited. If no edits occurred, this field is False.
verdict, a string in the set {"asshole", "not the asshole", "everyone sucks", "no assholes here"}
score, an integer corresponding to the difference between upvotes and downvotes
num_comments, an integer corresponding to the total number of comments (including nested discussion) on the post
is_asshole, a boolean corresponding to whether the verdict is in the set {"asshole", "everyone sucks"}
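
A minimal sketch of reading rows with these features using only the standard library. The column names follow the list above; the exact on-disk formatting of aita_clean.csv (for example how booleans are written) has not been checked, so `is_asshole` is recomputed from `verdict` rather than parsed. The helper names and the two sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Verdicts that set is_asshole to True, per the feature list.
ASSHOLE_VERDICTS = {"asshole", "everyone sucks"}


def derive_is_asshole(verdict: str) -> bool:
    """Recompute the is_asshole flag from the verdict string."""
    return verdict.strip().lower() in ASSHOLE_VERDICTS


def parse_row(row: dict) -> dict:
    """Convert the string fields of one CSV row into Python types.

    Assumes timestamp is epoch seconds and that score and num_comments
    are written as plain integers.
    """
    return {
        "id": row["id"],
        "created": datetime.fromtimestamp(float(row["timestamp"]), tz=timezone.utc),
        "title": row["title"],
        "verdict": row["verdict"],
        "score": int(row["score"]),
        "num_comments": int(row["num_comments"]),
        "is_asshole": derive_is_asshole(row["verdict"]),
    }


def read_posts(handle) -> list:
    """Read every row from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up sample with the same header as the feature list.
SAMPLE = (
    "id,timestamp,title,body,edited,verdict,score,num_comments,is_asshole\n"
    "abc123,1546300800,First title,First body,False,asshole,42,7,True\n"
    "def456,1546387200,Second title,Second body,False,not the asshole,10,3,False\n"
)

posts = read_posts(io.StringIO(SAMPLE))
print(sum(p["is_asshole"] for p in posts), "of", len(posts), "posts judged asshole")
```

For the real file, the same function can be called as `read_posts(open("aita_clean.csv", newline="", encoding="utf-8"))`.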

To get this dataset, install DVC and run:

$ dvc get https://github.com/iterative/aita_dataset aita_clean.csv

Or, to also download the associated .dvc files for dataset versioning, run:

$ dvc import https://github.com/iterative/aita_dataset aita_clean.csv
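
If the download needs to be scripted, one option is to call the same two commands from Python. This is only a sketch: the helper names `dvc_command` and `fetch_dataset` are hypothetical, it assumes the `dvc` executable is on the PATH, and it passes no options beyond the arguments shown above.

```python
import shutil
import subprocess

REPO_URL = "https://github.com/iterative/aita_dataset"
DATA_PATH = "aita_clean.csv"


def dvc_command(track: bool = False) -> list:
    """Build the command line: `dvc import` if track is True, else `dvc get`."""
    subcommand = "import" if track else "get"
    return ["dvc", subcommand, REPO_URL, DATA_PATH]


def fetch_dataset(track: bool = False) -> None:
    """Run the command, failing early if DVC is not installed."""
    if shutil.which("dvc") is None:
        raise RuntimeError("DVC is not installed or not on PATH")
    # check=True raises CalledProcessError if dvc exits with an error.
    subprocess.run(dvc_command(track), check=True)


print(" ".join(dvc_command()))
print(" ".join(dvc_command(track=True)))
```

Calling `fetch_dataset()` downloads the file only; `fetch_dataset(track=True)` should be run inside a DVC project, since `dvc import` records the source for versioning.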
