tidyverse / Dtplyr

Licence: other
Data table backend for dplyr

dtplyr

Overview

dtplyr provides a data.table backend for dplyr. The goal of dtplyr is to allow you to write dplyr code that is automatically translated to the equivalent, but usually much faster, data.table code.

Compared to the previous release, this version of dtplyr is a complete rewrite that focusses only on lazy evaluation triggered by use of lazy_dt(). This means that no computation is performed until you explicitly request it with as.data.table(), as.data.frame() or as_tibble(). This has a considerable advantage over the previous version (which eagerly evaluated each step) because it allows dtplyr to generate significantly more performant translations. This is a large change that breaks all existing uses of dtplyr. But frankly, dtplyr was pretty useless before because it did such a bad job of generating data.table code. Fortunately few people used it, so a major overhaul was possible.

See vignette("translation") for details of the current translations, and table.express and rqdatatable for related work.

Installation

You can install from CRAN with:

install.packages("dtplyr")

Or try the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("tidyverse/dtplyr")

Usage

To use dtplyr, you must at least load dtplyr and dplyr. You may also want to load data.table so you can access the other goodies that it provides:

library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

Then use lazy_dt() to create a “lazy” data table that tracks the operations performed on it.

mtcars2 <- lazy_dt(mtcars)

You can preview the transformation (including the generated data.table code) by printing the result:

mtcars2 %>% 
  filter(wt < 5) %>% 
  mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
  group_by(cyl) %>% 
  summarise(l100k = mean(l100k))
#> Source: local data table [3 x 2]
#> Call:   `_DT1`[wt < 5][, `:=`(l100k = 235.21/mpg)][, .(l100k = mean(l100k)), 
#>     keyby = .(cyl)]
#> 
#>     cyl l100k
#>   <dbl> <dbl>
#> 1     4  9.05
#> 2     6 12.0 
#> 3     8 14.9 
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

In general, though, printing is best reserved for debugging. To indicate that you’re done with the transformation and want to access the results, use as.data.table(), as.data.frame(), or as_tibble():

mtcars2 %>% 
  filter(wt < 5) %>% 
  mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
  group_by(cyl) %>% 
  summarise(l100k = mean(l100k)) %>% 
  as_tibble()
#> # A tibble: 3 x 2
#>     cyl l100k
#>   <dbl> <dbl>
#> 1     4  9.05
#> 2     6 12.0 
#> 3     8 14.9
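
The lazy behaviour can also be checked step by step. The following is a minimal sketch, assuming dtplyr, dplyr and data.table are installed; the names `pipeline`, `result` and `mean_mpg` are illustrative and not part of dtplyr. It builds a pipeline without collecting it, inspects the generated data.table call with show_query(), and only then materialises the result:

```r
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Build the pipeline; only the steps are recorded, nothing is computed yet.
pipeline <- lazy_dt(mtcars) %>%
  filter(wt < 5) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

# Inspect the data.table translation without collecting the result.
show_query(pipeline)

# Collect: this is the point at which the data.table code is evaluated.
result <- as.data.table(pipeline)
```

Because `result` is an ordinary data.table, any further data.table or base R code can be applied to it from this point on.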

Why is dtplyr slower than data.table?

There are three primary reasons that dtplyr will always be somewhat slower than data.table:

  • Each dplyr verb must do some work to convert dplyr syntax to data.table syntax. This takes time proportional to the complexity of the input code, not the size of the input data, so the overhead should be negligible for large datasets. Initial benchmarks suggest that the overhead should be under 1ms per dplyr call.

  • Some data.table expressions have no direct dplyr equivalent. For example, there’s no way to express cross- or rolling-joins with dplyr.

  • To match dplyr semantics, mutate() does not modify in place by default. This means that most expressions involving mutate() must make a copy that would not be necessary if you were using data.table directly. (You can opt out of this behaviour by passing immutable = FALSE to lazy_dt().)
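
The copy-on-mutate point above can be illustrated with a small sketch. It assumes dtplyr, dplyr and data.table are installed; the object names (`dt_default`, `dt_inplace`, `out_default`, `out_inplace`) are hypothetical and chosen only for this example. It compares the default behaviour with immutable = FALSE by checking whether the input data.table itself gains the new column:

```r
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

dt_default <- data.table(x = 1:3)
dt_inplace <- data.table(x = 1:3)

# Default (immutable = TRUE): the input is copied before the column is added.
out_default <- lazy_dt(dt_default) %>%
  mutate(y = x * 2) %>%
  as.data.table()

# immutable = FALSE: dtplyr is allowed to modify the input by reference.
out_inplace <- lazy_dt(dt_inplace, immutable = FALSE) %>%
  mutate(y = x * 2) %>%
  as.data.table()

names(dt_default)  # the original input is left untouched
names(dt_inplace)  # the original input is expected to have gained column `y`
```

Skipping the copy saves time and memory on large inputs, but it means the object passed to lazy_dt() may change underneath any other code that still refers to it, so immutable = FALSE is probably best kept for data you own and no longer need in its original form.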

Code of Conduct

Please note that the dtplyr project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
