strengejacke / Sjmisc

Licence: gpl-3.0
Data transformation and utility functions for R

sjmisc - Data and Variable Transformation Functions

Data preparation is a common task in research, and it usually takes up most of the time in the analytical process. Packages for data preparation have been released recently as part of the tidyverse, focussing on the transformation of data sets. However, packages that focus specifically on the transformation of variables, and that fit into the workflow and design philosophy of the tidyverse, are missing.

sjmisc tries to fill this gap. It complements the dplyr package by taking over data transformation tasks on variables, such as recoding, dichotomizing or grouping variables, and setting or replacing missing values. A distinctive feature of sjmisc is its support for labelled data, which is especially useful for users who often work with data sets from other statistical software packages like SPSS or Stata.
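As a minimal sketch of what such variable transformations can look like, the following applies rec() and dicho() to a plain numeric vector. The vector `score` and its values are hypothetical illustration data, not part of the package, and the recode pattern is just one possible choice.

```r
library(sjmisc)

# Hypothetical data: a 4-point item answered by six respondents
score <- c(1, 2, 3, 4, 2, 3)

# Recode: collapse categories 1 and 2 into 0, categories 3 and 4 into 1
score_rec <- rec(score, rec = "1,2=0; 3,4=1")

# Dichotomize: split the variable into two groups at its median
score_dicho <- dicho(score, dich.by = "median")

print(score_rec)
print(score_dicho)
```

Both functions are called here on a single vector; see the package documentation for the full set of arguments and for how labelled data is handled.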

The functions of sjmisc are designed to work together seamlessly with other packages from the tidyverse, like dplyr. For instance, you can use the functions from sjmisc either within a pipe workflow to manipulate data frames or to create new variables with mutate(). See vignette("design_philosophy", "sjmisc") for more details.
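The two usages described above could be sketched as follows. The data frame `dat` and its columns are hypothetical, and `append` and `suffix` are set explicitly because their defaults may differ between package versions.

```r
library(dplyr)
library(sjmisc)

# Hypothetical data frame with one 4-point item
dat <- data.frame(id = 1:6, score = c(1, 2, 3, 4, 2, 3))

# 1. Pipe workflow: pass the data frame and name the variable to transform.
#    With append = TRUE, the recoded variable is added as a new column
#    carrying the given suffix (here: score_r).
out_pipe <- dat %>%
  rec(score, rec = "1,2=0; 3,4=1", append = TRUE, suffix = "_r")

# 2. Inside mutate(): the function receives a single vector and returns
#    the recoded vector, which mutate() stores under the chosen name.
out_mutate <- dat %>%
  mutate(score_r = rec(score, rec = "1,2=0; 3,4=1"))

print(names(out_pipe))
print(out_mutate)
```

The first form is convenient when several variables are transformed at once, while the second keeps full control over the name of the new variable.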

Contributing to the package

Please follow this guide if you would like to contribute to this package.

Installation

Latest development build

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

library(devtools)
devtools::install_github("strengejacke/sjmisc")

Official, stable release

To install the latest stable release from CRAN, type the following command into the R console:

install.packages("sjmisc")

References, documentation and examples

A cheatsheet can be downloaded from here (PDF) or from the RStudio cheatsheet collection.

For more examples, see package vignettes (browseVignettes("sjmisc")).

Please visit https://strengejacke.github.io/sjmisc/ for documentation and vignettes.

Citation

If you want or need to cite my package, please cite it as follows (see also citation('sjmisc')):

Lüdecke D (2018). sjmisc: Data and Variable Transformation Functions. Journal of Open Source Software, 3(26), 754. doi: 10.21105/joss.00754
