Licence: other
Some tools for cleaning up messy 'Excel' files to be suitable for R

thinkr

{thinkr} is a set of tools for cleaning up messy files.

It contains tools for cleaning up messy ‘Excel’ files so that they are suitable for R. People who have worked with ‘Excel’ for years have built more or less complicated sheets whose names, characters, and formats are not homogeneous. To make these sheets usable in R, we built a set of functions that avoid most import problems while preserving as much of the data as possible.

Installation

CRAN version

install.packages("thinkr")

GitHub development version

# install.packages("devtools")
devtools::install_github("ThinkR-open/thinkr")

Once installed, you can load {thinkr}:

library(thinkr)

or without the package startup message:

suppressPackageStartupMessages(library(thinkr))

Usage

peep

The peep function prints intermediate outputs inside a {dplyr}/%>% workflow and passes the data on to the next step.

library(dplyr) # provides %>% and rename() used below
data(iris)
# just symbols
iris %>%
  peep(head, tail) %>%
  rename(species = Species) %>%
  summary()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 145          6.7         3.3          5.7         2.5 virginica
#> 146          6.7         3.0          5.2         2.3 virginica
#> 147          6.3         2.5          5.0         1.9 virginica
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#>   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
#>  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
#>  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
#>  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
#>  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
#>  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
#>  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
#>        species  
#>  setosa    :50  
#>  versicolor:50  
#>  virginica :50  
#>                 
#>                 
#> 
# expressions with .
iris %>%
  peep(head(., n = 2), tail(., n = 3)) %>%
  summary()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#>   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
#>  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
#>  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
#>  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
#>  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
#>  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
#>  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
#>        Species  
#>  setosa    :50  
#>  versicolor:50  
#>  virginica :50  
#>                 
#>                 
#> 
# or both
iris %>%
  peep(head, tail(., n = 3)) %>%
  summary()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#>   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
#>  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
#>  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
#>  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
#>  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
#>  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
#>  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
#>        Species  
#>  setosa    :50  
#>  versicolor:50  
#>  virginica :50  
#>                 
#>                 
#> 
# use verbose to see what happens
iris %>%
  peep(head, tail(., n = 3), verbose = TRUE) %>%
  summary()
#> head(.)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
#> tail(., n = 3)
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#>   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
#>  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
#>  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
#>  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
#>  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
#>  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
#>  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
#>        Species  
#>  setosa    :50  
#>  versicolor:50  
#>  virginica :50  
#>                 
#>                 
#> 
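
To make the idea behind peep concrete, here is a minimal base R sketch of a "print, then pass through" helper. The name peek is hypothetical and this is not the {thinkr} implementation: it only accepts bare functions, not expressions with `.`, and it has no verbose option.

```r
# Hypothetical helper, NOT part of {thinkr}: apply each function to the data,
# print the result, then return the data unchanged so a pipeline can continue.
peek <- function(data, ...) {
  funs <- list(...)
  for (f in funs) {
    stopifnot(is.function(f))
    print(f(data))
  }
  data
}

# Prints head(iris) and tail(iris); `result` is still the full data frame.
result <- peek(iris, head, tail)
summary(result)
```

Because the helper returns its input untouched, it can sit between any two steps of a pipeline without changing the final result.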

clean_*

The clean_names function cleans messy column names, removing special characters, spaces, …

data(iris)

iris %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
iris %>%
  clean_names() %>%
  head()
#>   sepal_length sepal_width petal_length petal_width species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

The clean_vec function cleans character vectors, removing special characters, spaces, …

vector <- c("Jean Sébastien", "Anne-Sophie", "44@Bernard2")
cleaned <- clean_vec(vector)
cleaned
#> [1] "jean_sebastien" "anne_sophie"    "x44_bernard2"
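
The kind of normalisation shown above can be approximated in base R. The sketch below is only an illustration under stated assumptions: sketch_clean is a hypothetical name, it is not how {thinkr} is implemented, and it does not transliterate accented characters (they would be replaced by underscores rather than mapped to plain letters as clean_vec does).

```r
# Hypothetical helper, NOT part of {thinkr}: a rough, ASCII-only approximation.
sketch_clean <- function(x) {
  x <- tolower(x)                      # lower-case everything
  x <- gsub("[^a-z0-9]+", "_", x)      # runs of other characters become "_"
  x <- gsub("^_+|_+$", "", x)          # drop leading/trailing underscores
  # names starting with a digit get an "x" prefix
  ifelse(grepl("^[0-9]", x), paste0("x", x), x)
}

sketch_clean(c("Anne-Sophie", "44@Bernard2", "Sepal.Length"))
#> [1] "anne_sophie"  "x44_bernard2" "sepal_length"
```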

Excel positions

Convert a column number to its Excel column name, and vice versa.

ncol_to_excel(6)
#> [1] "F"
excel_to_ncol("AF")
#> [1] 32
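
Excel column names behave like a base-26 numbering without a zero digit (A = 1, …, Z = 26, AA = 27). The following base R sketch shows one way to do the conversion by hand. The names num_to_col and col_to_num are hypothetical and this is not necessarily how ncol_to_excel and excel_to_ncol are implemented.

```r
# Hypothetical helpers, NOT part of {thinkr}: bijective base-26 conversion.
num_to_col <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 1)
  out <- character(0)
  while (n > 0) {
    r <- (n - 1) %% 26             # 0..25 maps to A..Z
    out <- c(LETTERS[r + 1], out)  # prepend the letter for this position
    n <- (n - 1) %/% 26
  }
  paste(out, collapse = "")
}

col_to_num <- function(col) {
  chars <- strsplit(toupper(col), "")[[1]]
  digits <- match(chars, LETTERS)  # A = 1, ..., Z = 26
  stopifnot(!anyNA(digits))
  Reduce(function(acc, d) acc * 26 + d, digits, 0)
}

num_to_col(6)
#> [1] "F"
col_to_num("AF")
#> [1] 32
```

Subtracting 1 before taking the remainder is what handles the missing zero digit, so that 26 maps to "Z" and 27 to "AA".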

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
