jacobkap / fastDummies

Licence: MIT (LICENSE.md); a second file, LICENSE, has an unrecognized licence type.
The goal of fastDummies is to quickly create dummy variables (columns) and dummy rows.

Programming Languages: HTML, R, C++, CSS

Overview

The goal of fastDummies is to quickly create dummy variables (columns) and dummy rows. Creating dummy variables is possible through base R or other packages, but this package is much faster than those methods.

Installation

To install this package from CRAN, run:
install.packages("fastDummies")

# The development version is available on GitHub.
# install.packages("devtools")
devtools::install_github("jacobkap/fastDummies")

Usage

library(fastDummies)

There are two functions in this package:

  • dummy_cols() lets you make dummy variables (dummy_columns() is identical to dummy_cols()).
  • dummy_rows() lets you make dummy rows.

Dummy Columns

Dummy variables (or binary variables) are commonly used in statistical analyses and in simpler descriptive statistics. A dummy column has a value of 1 when a categorical event occurs and 0 when it doesn’t. In most cases the category is a feature of the event, person, or object being described. For example, if the dummy variable were for the occupation of R programmer, you could ask, “is this person an R programmer?” When the answer is yes, they get a value of 1; when it is no, they get a value of 0.

We’ll start with a simple example and then go into using the function dummy_cols(). You can also use the function dummy_columns() which is identical to dummy_cols().

Imagine you have a data set about animals in a local shelter. One of the columns in your data is what animal it is: dog or cat.

animals
dog
dog
cat

To make dummy columns from this data, you would need to produce two new columns. One would indicate if the animal is a dog, and the other would indicate if the animal is a cat. Each row would get a value of 1 in the column indicating which animal they are, and 0 in the other column.

animals dog cat
dog 1 0
dog 1 0
cat 0 1

In the function dummy_cols(), each new column is named by joining the original column name and the category value with an underscore.

animals animals_dog animals_cat
dog 1 0
dog 1 0
cat 0 1

With an example like this, it is fairly easy to make the dummy columns yourself. dummy_cols() automates the process, and is useful when you have many columns to generate dummy variables from, or columns with many categories.
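The by-hand approach can be sketched in base R as follows. This is only an illustration of the idea, not how the package implements it; the data frame name shelter is made up for this example.

```r
# Base R sketch: build one 0/1 column per category by hand.
# `shelter` is a hypothetical data frame used only for illustration.
shelter <- data.frame(animals = c("dog", "dog", "cat"),
                      stringsAsFactors = FALSE)

# One indicator column per distinct value, named <column>_<value>.
for (value in sort(unique(shelter$animals))) {
  shelter[[paste0("animals_", value)]] <- as.integer(shelter$animals == value)
}

print(shelter)
```

Each pass through the loop compares the whole column against one category, so the cost of writing this by hand grows with both the number of columns and the number of categories.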

fastDummies_example <- data.frame(numbers = 1:3,
                    gender  = c("male", "male", "female"),
                    animals = c("dog", "dog", "cat"),
                    dates   = as.Date(c("2012-01-01", "2011-12-31",
                                          "2012-01-01")),
                    stringsAsFactors = FALSE)
knitr::kable(fastDummies_example)
numbers gender animals dates
1 male dog 2012-01-01
2 male dog 2011-12-31
3 female cat 2012-01-01

The object fastDummies_example has two character type columns, one integer column, and a Date column. By default, dummy_cols() will make dummy variables from factor or character columns only. This is because in most cases those are the only types of data you want dummy variables from. If those are the only columns you want, then the function takes your data set as the first parameter and returns a data.frame with the newly created variables appended to the end of the original data.
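To see which columns the default rule would pick up, the following base R sketch checks each column's type. It mirrors the rule as described above (character or factor columns only) and is an assumption about the selection logic, not the package's own code; the names is_categorical and default_columns are made up for this example.

```r
# Same example data as above.
fastDummies_example <- data.frame(numbers = 1:3,
                    gender  = c("male", "male", "female"),
                    animals = c("dog", "dog", "cat"),
                    dates   = as.Date(c("2012-01-01", "2011-12-31",
                                          "2012-01-01")),
                    stringsAsFactors = FALSE)

# TRUE for columns that are character or factor, FALSE otherwise.
is_categorical <- vapply(fastDummies_example,
                         function(col) is.character(col) || is.factor(col),
                         logical(1))

# Columns that would get dummy variables under the default rule.
default_columns <- names(fastDummies_example)[is_categorical]
print(default_columns)
```

Here the integer column numbers and the Date column dates are skipped, leaving gender and animals.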

results <- fastDummies::dummy_cols(fastDummies_example)
knitr::kable(results)
numbers gender animals dates gender_female gender_male animals_cat animals_dog
1 male dog 2012-01-01 0 1 0 1
2 male dog 2011-12-31 0 1 0 1
3 female cat 2012-01-01 1 0 1 0

Dummy Rows

When dealing with data, there are often missing rows. While truly handling missing data is far beyond the scope of this package, the function dummy_rows() lets you add those missing rows back into the data.

The function takes all character, factor, and Date columns, finds all possible combinations of their values, and adds the rows that are not in the original data set. Any columns not used in creating the combinations (e.g. numeric) are given a value of NA (unless otherwise specified with dummy_value).

Let's start with a simple example.

fastDummies_example <- data.frame(numbers = 1:3,
                    gender  = c("male", "male", "female"),
                    animals = c("dog", "dog", "cat"),
                    dates   = as.Date(c("2012-01-01", "2011-12-31",
                                          "2012-01-01")),
                    stringsAsFactors = FALSE)
knitr::kable(fastDummies_example)
numbers gender animals dates
1 male dog 2012-01-01
2 male dog 2011-12-31
3 female cat 2012-01-01

This data set has four columns: two character, one Date, and one numeric (integer). By default, the function uses the character and Date columns to create the combinations. First, a little arithmetic to explain the combinations. Each of these columns has two distinct values: gender (male and female), animals (dog and cat), and dates (2011-12-31 and 2012-01-01). To find the number of possible combinations, multiply the numbers of unique values in each column together: 2 * 2 * 2 = 8. Three of those combinations already appear in the data, so dummy_rows() adds the remaining five rows.
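The count can be checked with a few lines of base R. This is a sketch of the arithmetic only; the variable names key_columns, n_unique, total_combinations, and rows_to_add are made up for this example.

```r
# Same example data as above.
fastDummies_example <- data.frame(numbers = 1:3,
                    gender  = c("male", "male", "female"),
                    animals = c("dog", "dog", "cat"),
                    dates   = as.Date(c("2012-01-01", "2011-12-31",
                                          "2012-01-01")),
                    stringsAsFactors = FALSE)

# The character and Date columns take part in the combinations.
key_columns <- c("gender", "animals", "dates")

# Number of distinct values in each of those columns.
n_unique <- vapply(fastDummies_example[key_columns],
                   function(col) length(unique(col)),
                   integer(1))

# Product of the counts = number of possible combinations.
total_combinations <- prod(n_unique)

# Combinations already present, and how many rows are therefore missing.
present_combinations <- nrow(unique(fastDummies_example[key_columns]))
rows_to_add <- total_combinations - present_combinations

print(n_unique)
print(total_combinations)
print(rows_to_add)
```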

results <- fastDummies::dummy_rows(fastDummies_example)
knitr::kable(results)
numbers gender animals dates
1 male dog 2012-01-01
2 male dog 2011-12-31
3 female cat 2012-01-01
NA female cat 2011-12-31
NA male cat 2011-12-31
NA female dog 2011-12-31
NA male cat 2012-01-01
NA female dog 2012-01-01
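
For comparison, the same completed table can be approximated in base R with expand.grid(). This is a minimal sketch of the idea described above, not the package's implementation, and the order of the added rows may differ from the output of dummy_rows(). The helper make_key() and the names key_columns, all_combinations, missing_rows, and completed are made up for this example.

```r
# Same example data as above.
fastDummies_example <- data.frame(numbers = 1:3,
                    gender  = c("male", "male", "female"),
                    animals = c("dog", "dog", "cat"),
                    dates   = as.Date(c("2012-01-01", "2011-12-31",
                                          "2012-01-01")),
                    stringsAsFactors = FALSE)

key_columns <- c("gender", "animals", "dates")

# Every combination of the distinct values in the key columns.
all_combinations <- expand.grid(lapply(fastDummies_example[key_columns], unique),
                                stringsAsFactors = FALSE,
                                KEEP.OUT.ATTRS = FALSE)

# Hypothetical helper: collapse the key columns of each row into one string
# so rows can be compared across the two data frames.
make_key <- function(df) do.call(paste, c(df[key_columns], sep = "|"))

# Keep only the combinations that do not occur in the original data.
is_missing <- !(make_key(all_combinations) %in% make_key(fastDummies_example))
missing_rows <- all_combinations[is_missing, , drop = FALSE]

# Columns not used for the combinations get NA in the added rows.
missing_rows$numbers <- NA_integer_

# Append the added rows to the original data, matching the column order.
completed <- rbind(fastDummies_example,
                   missing_rows[names(fastDummies_example)])
rownames(completed) <- NULL

print(completed)
```

The result has the three original rows followed by five added rows whose numbers value is NA.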