petermeissner / Wikipediatrend

A convenience R package for getting Wikipedia article access statistics (and more).

Public Subject Attention via Wikipedia Page View Statistics

Status

lines of R code: 474, lines of test code: 160

Version

2.1.6 (2020-06-03 12:43:18)

Description

License

GPL (>= 2)
Peter Meissner [aut, cre], R Core Team [cph]

Credits

  • Parts of the package's code have been shamelessly copied and modified from the R base package written by the R Core Team. This concerns the wp_date() generic and its methods; details are given in the help files.

Citation

citation("wikipediatrend")

Meissner P (2020). wikipediatrend: Public Subject Attention via Wikipedia Page View Statistics. R package version 2.1.6.

BibTeX for citing

toBibtex(citation("wikipediatrend"))

Installation

Stable version from CRAN:

install.packages("wikipediatrend")

Latest development version from GitHub:

devtools::install_github("petermeissner/wikipediatrend")

Usage

starting up …

library(wikipediatrend)
## 
##   [wikipediatrend]
##     
##   Note:
##     
##     - Data before 2016-01-01 
##       * is provided by petermeissner.de and
##       * was prepared in a project commissioned by the Hertie School of Governance (Prof. Dr. Simon Munzert)
##       * and supported by the Daimler and Benz Foundation.
##     
##     - Data from 2016-01-01 onwards 
##       * is provided by the Wikipedia Foundation
##       * via its pageviews package and API.
## 

getting some data …

trend_data <- 
  wp_trend(
    page = c("Der_Spiegel", "Die_Zeit"), 
    lang = c("de", "en"), 
    from = "2007-01-01",
    to   = Sys.Date()
  )

having a look …

trend_data
##      language article     date       views
## 2    en       die_zeit    2007-12-10    74
## 1    de       der_spiegel 2007-12-10   798
## 4    en       die_zeit    2007-12-11    35
## 3    de       der_spiegel 2007-12-11   710
## 5    de       der_spiegel 2007-12-12   770
## 9114 en       die_zeit    2020-05-31   209
## 9116 en       die_zeit    2020-06-01   174
## 9115 de       der_spiegel 2020-06-01  1498
## 9118 en       die_zeit    2020-06-02   208
## 9117 de       der_spiegel 2020-06-02  1252
## 
## ... 9108 rows of data not shown
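
Because the result prints as one row per language, article and day, it can be summarised with base R. The following is a minimal sketch on invented mock data that mirrors the printed columns (language, article, date, views). It assumes the date column is a Date, or can be coerced with as.Date(). The names trend_mock and monthly_views are illustrative and not part of the package.

```r
# Mock data with the same columns as the printed wp_trend() result;
# the values are made up for illustration only.
trend_mock <- data.frame(
  language = c("de", "de", "de", "en", "en", "en"),
  article  = c("der_spiegel", "der_spiegel", "der_spiegel",
               "die_zeit", "die_zeit", "die_zeit"),
  date     = as.Date(c("2020-05-30", "2020-05-31", "2020-06-01",
                       "2020-05-30", "2020-05-31", "2020-06-01")),
  views    = c(1400, 1500, 1498, 200, 209, 174),
  stringsAsFactors = FALSE
)

# Derive a year-month key, then sum daily views per article and month
trend_mock$month <- format(trend_mock$date, "%Y-%m")
monthly_views <- aggregate(
  views ~ language + article + month,
  data = trend_mock,
  FUN  = sum
)

monthly_views
```

With real data, the same two steps could be applied to trend_data in place of trend_mock, provided its columns behave like plain data frame columns.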

having another look …

plot(
  trend_data[trend_data$views < 2500, ]
)
## `geom_smooth()` using formula 'y ~ x'

Usage 2

getting some data …

trend_data <- 
  wp_trend(
    page = 
      c(
        "Climate_crisis", 
        "2019–20_coronavirus_pandemic",
        "Donald_Trump",
        "Syria",
        "Crimea",
        "Influenza"
      ), 
    lang = "en", 
    from = "2007-01-01",
    to   = Sys.Date()
  )
## Warning in wpd_get_exact(page = page, lang = lang, from = from, to = to, : Unable to retrieve data for url:
## http://petermeissner.de:8880/article/exact/en/2019–20_coronavirus_pandemic. Status: error.
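
The warning above means that one requested page returned no data, so it will be missing from the result without stopping the call. A small check can make that explicit. This is a sketch on mock data: missing_pages() is a hypothetical helper, not a package function, and it assumes the article column holds lower-cased page titles, as the printed output suggests.

```r
# Hypothetical helper: which requested pages are absent from the result?
# Assumes `result$article` contains lower-cased page titles.
missing_pages <- function(requested, result) {
  setdiff(tolower(requested), unique(result$article))
}

requested <- c("Climate_crisis", "2019–20_coronavirus_pandemic", "Syria")

# Mock result in which the pandemic page could not be retrieved
result_mock <- data.frame(
  article = c("climate_crisis", "syria", "syria"),
  views   = c(0, 3205, 4969),
  stringsAsFactors = FALSE
)

not_found <- missing_pages(requested, result_mock)
not_found
```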

having a look …

trend_data
##       language article        date       views  
## 1     en       climate_crisis 2007-12-10       0
## 2     en       crimea         2007-12-10    1051
## 5     en       syria          2007-12-10    3205
## 4     en       influenza      2007-12-10    4153
## 3     en       donald_trump   2007-12-10    5050
## 22723 en       climate_crisis 2020-06-02     103
## 22726 en       influenza      2020-06-02    3437
## 22724 en       crimea         2020-06-02    3681
## 22727 en       syria          2020-06-02    4969
## 22725 en       donald_trump   2020-06-02  916742
## 
## ... 22717 rows of data not shown

having another look …

options(scipen = 1000000)

plot(trend_data) + 
  ggplot2::scale_y_log10()
## Warning: Transformation introduced infinite values in continuous y-axis

## Warning: Transformation introduced infinite values in continuous y-axis

## `geom_smooth()` using formula 'y ~ x'

## Warning: Removed 1202 rows containing non-finite values (stat_smooth).
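
The warnings come from days with zero views (for example climate_crisis on 2007-12-10 in the output above): the logarithm of zero is not finite, so those rows are dropped from the smoothed line. A minimal sketch on invented mock data shows the effect and one possible way to avoid it, namely subsetting to positive views before applying a log scale. The name zero_mock is illustrative only.

```r
# log10(0) is -Inf, which is what "infinite values" refers to
log10(0)

# Mock data in the shape of the printed result; values are invented
zero_mock <- data.frame(
  article = c("climate_crisis", "climate_crisis", "syria"),
  date    = as.Date(c("2007-12-10", "2020-06-02", "2007-12-10")),
  views   = c(0, 103, 3205),
  stringsAsFactors = FALSE
)

# Keep only days with at least one recorded view before log-scaling
views_positive <- zero_mock[zero_mock$views > 0, ]
log10(views_positive$views)
```

With real data, the same subsetting pattern as in the first plot could be used, for example trend_data[trend_data$views > 0, ]. Note that dropping zero-view days changes which observations the plot is based on, so whether this is appropriate depends on the question being asked.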
