rudeboybert / Fivethirtyeight

Licence: other

R package of data and code behind the stories and interactives at FiveThirtyEight

Labels

data-science statistics cran

Projects that are alternatives of or similar to Fivethirtyeight

Openml R

R package to interface with OpenML

Stars: ✭ 81 (-80.81%)

Mutual labels: data-science, statistics, cran

Mlr

Machine Learning in R

Stars: ✭ 1,542 (+265.4%)

Mutual labels: data-science, statistics, cran

Collapse

Advanced and Fast Data Transformation in R

Stars: ✭ 184 (-56.4%)

Mutual labels: data-science, statistics, cran

Rweekly.org

R Weekly

Stars: ✭ 406 (-3.79%)

Mutual labels: data-science, statistics

Xlearn

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

Stars: ✭ 2,968 (+603.32%)

Mutual labels: data-science, statistics

Data Science Learning

Repository of code and resources related to different data science and machine learning topics. For learning, practice and teaching purposes.

Stars: ✭ 273 (-35.31%)

Mutual labels: data-science, statistics

Data Science Free

Free Resources For Data Science created by Shubham Kumar

Stars: ✭ 232 (-45.02%)

Mutual labels: data-science, statistics

120 Ds Interview Questions

My Answer to 120 Data Science Interview Questions

Stars: ✭ 304 (-27.96%)

Mutual labels: data-science, statistics

Uncertainty Baselines

High-quality implementations of standard and SOTA methods on a variety of tasks.

Stars: ✭ 278 (-34.12%)

Mutual labels: data-science, statistics

Edward2

A simple probabilistic programming language.

Stars: ✭ 419 (-0.71%)

Mutual labels: data-science, statistics

Csinva.github.io

Slides, paper notes, class notes, blog posts, and research on ML 📉, statistics 📊, and AI 🤖.

Stars: ✭ 342 (-18.96%)

Mutual labels: data-science, statistics

Notebooks Statistics And Machinelearning

Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog

Stars: ✭ 270 (-36.02%)

Mutual labels: data-science, statistics

Facet

Human-explainable AI.

Stars: ✭ 269 (-36.26%)

Mutual labels: data-science, statistics

Datascience Ai Machinelearning Resources

Alex Castrounis' curated set of resources for artificial intelligence (AI), machine learning, data science, internet of things (IoT), and more.

Stars: ✭ 414 (-1.9%)

Mutual labels: data-science, statistics

Datascience

Curated list of Python resources for data science.

Stars: ✭ 3,051 (+622.99%)

Mutual labels: data-science, statistics

Openintro Statistics

📚 An open-source textbook written at the college level. OpenIntro also offers a second college-level intro stat textbook and also a high school variant.

Stars: ✭ 283 (-32.94%)

Mutual labels: data-science, statistics

Scikit Mobility

scikit-mobility: mobility analysis in Python

Stars: ✭ 339 (-19.67%)

Mutual labels: data-science, statistics

Dataexplorer

Automate Data Exploration and Treatment

Stars: ✭ 362 (-14.22%)

Mutual labels: data-science, cran

Stats Maths With Python

General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python

Stars: ✭ 381 (-9.72%)

Mutual labels: data-science, statistics

Datascienceprojects

The code repository for projects and tutorials in R and Python that covers a variety of topics in data visualization, statistics sports analytics and general application of probability theory.

Stars: ✭ 223 (-47.16%)

Mutual labels: data-science, statistics

fivethirtyeight

An R package that provides access to the code and data sets published by FiveThirtyEight at https://github.com/fivethirtyeight/data. Note that while we received guidance from editors at 538, this package is not officially published by 538.

Installation

Get the latest released version from CRAN:

install.packages("fivethirtyeight")

Or the development version from GitHub:

# If you haven't installed the remotes package yet, do so:
# install.packages("remotes")
remotes::install_github("rudeboybert/fivethirtyeight", build_vignettes = TRUE)

Usage

All data in the fivethirtyeight package are lazy-loaded, so you can access any dataset without running data():

library(fivethirtyeight)

head(bechdel)
?bechdel

# If using RStudio:
View(bechdel)

To see a detailed list of all 128 datasets, including information on the corresponding articles published on FiveThirtyEight.com, click here.
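
The dataset names can also be listed from within R. The following is a minimal sketch, assuming the package is already installed; list_datasets() is a hypothetical helper written for this illustration and is not part of fivethirtyeight. It relies only on base R's data() function.

```r
# List the names of the datasets bundled with an installed package.
# list_datasets() is a hypothetical helper, not part of fivethirtyeight.
list_datasets <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed.")
    return(character(0))
  }
  # data(package = ...) returns an object whose $results matrix has
  # one row per dataset, with "Item" and "Title" columns.
  unname(data(package = pkg)$results[, "Item"])
}

datasets <- list_datasets("fivethirtyeight")
length(datasets)  # number of datasets found in the installed version
head(datasets)    # first few dataset names
```

The count you see depends on the installed version of the package, so it may differ from the number quoted above.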

Add-on Package

There are also 19 datasets that could not be included in fivethirtyeight due to CRAN package size restrictions:

#>  [1] "castle_solutions"           "castle_solutions_2"        
#>  [3] "castle_solutions_3"         "comic_characters"          
#>  [5] "goose"                      "house_district_forecast"   
#>  [7] "mayweather_mcgregor_tweets" "mlb_elo"                   
#>  [9] "nba_all_elo"                "nba_carmelo"               
#> [11] "nba_elo"                    "nfl_elo"                   
#> [13] "quasi_winshares"            "raptor_by_player"          
#> [15] "raptor_by_team"             "ratings"                   
#> [17] "senators"                   "spi_matches"               
#> [19] "twitter_presidents"

These 19 datasets are included in the fivethirtyeightdata add-on package, which you can install by running:

install.packages('fivethirtyeightdata', repos = 'https://fivethirtyeightdata.github.io/drat/', type = 'source')

So for example, to load the senators dataset, run:

library(fivethirtyeight)
library(fivethirtyeightdata)
senators
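
Because the add-on package is installed separately, a script may want to check that it is available before using it. The following is a rough sketch of one way to do that with base R only; load_addon_dataset() is a hypothetical helper name, not a function provided by either package.

```r
# Return a dataset from an add-on package, or NULL if it is unavailable.
# load_addon_dataset() is a hypothetical helper, not part of either package.
load_addon_dataset <- function(name, pkg = "fivethirtyeightdata") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; see the install command above.")
    return(NULL)
  }
  # Load the dataset into a private environment so the global
  # workspace is left untouched, then return it.
  env <- new.env()
  data(list = name, package = pkg, envir = env)
  if (!exists(name, envir = env, inherits = FALSE)) {
    return(NULL)
  }
  get(name, envir = env)
}

senators <- load_addon_dataset("senators")
if (!is.null(senators)) head(senators)
```

Returning NULL instead of stopping lets the calling script decide whether a missing add-on package is an error.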

Article in “Technology Innovations in Statistics Education”

The fivethirtyeight package was featured in The fivethirtyeight R Package: “Tame Data” Principles for Introductory Statistics and Data Science Courses by Kim, Ismay, and Chunn (2018) published in Volume 11, Issue 1 of the journal “Technology Innovations in Statistics Education”.

Abstract: As statistics and data science instructors, we often seek to use data in our courses that are rich, real, realistic, and relevant. To this end we created the fivethirtyeight R package of data and code behind the stories and interactives at the data journalism website FiveThirtyEight.com. After a discussion on the conflicting pedagogical goals of “minimizing prerequisites to research” (Cobb 2015) while at the same time presenting students with a realistic view of data as it exists “in the wild,” we articulate how a desired balance between these two goals informed the design of the package. The details behind this balance are articulated as our proposed “Tame data principles for introductory statistics and data science courses.” Details of the package’s construction and example uses are included as well.

Data Analysis Examples in Vignettes

For some datasets, there are user-contributed example analyses in the form of a package vignette. For example, see “Bechdel analysis using the tidyverse”, based on the bechdel dataset used in the article The Dollar-And-Cents Case Against Hollywood’s Exclusion of Women. For a complete list of vignettes, run:

vignette("user_contributed_vignettes", package = "fivethirtyeightdata")
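
To see which vignettes a particular installed package ships, base R's vignette() function can be queried directly. This is a small sketch under the assumption that the package was installed with its vignettes built; list_vignettes() is a hypothetical helper name used only for illustration.

```r
# Titles of the vignettes shipped with an installed package.
# list_vignettes() is a hypothetical helper, not part of fivethirtyeight.
list_vignettes <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  # vignette(package = ...) returns an object whose $results matrix
  # has one row per vignette, with "Item" and "Title" columns.
  res <- vignette(package = pkg)$results
  unname(res[, "Title"])
}

list_vignettes("fivethirtyeight")
```

An empty result usually means the package is missing or was installed without building its vignettes.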

More Information

Andrew Flowers gave a great demonstration of the package and the bechdel vignette during his rstudio::conf talk in Orlando, Florida in January 2017. The video of his talk is available here.

Click this Google Sheet for a master spreadsheet connecting:
1. the original 538 data on GitHub with
2. the data frames in the package with
3. information on the corresponding article
