Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → nacnudus → Unpivotr

nacnudus / Unpivotr

Licence: other

Unpivot complex and irregular data layouts in R

Programming Languages

7636 projects

Labels

excel spreadsheet pivot-tables

Projects that are alternatives of or similar to Unpivotr

Luckysheet

Luckysheet is an online spreadsheet like excel that is powerful, simple to configure, and completely open source.

Stars: ✭ 9,772 (+6502.7%)

Mutual labels: excel, spreadsheet, pivot-tables

Documentbuilder

ONLYOFFICE Document Builder is powerful text, spreadsheet, presentation and PDF generating tool

Stars: ✭ 61 (-58.78%)

Mutual labels: excel, spreadsheet

Functional Data Grid

Data grids in functional style with ReactJS

Stars: ✭ 125 (-15.54%)

Mutual labels: excel, spreadsheet

X Spreadsheet

A web-based JavaScript（canvas） spreadsheet

Stars: ✭ 12,046 (+8039.19%)

Mutual labels: excel, spreadsheet

Desktopeditors

An office suite that combines text, spreadsheet and presentation editors allowing to create, view and edit local documents

Stars: ✭ 1,008 (+581.08%)

Mutual labels: excel, spreadsheet

Django Rest Pandas

📊📈 Serves up Pandas dataframes via the Django REST Framework for use in client-side (i.e. d3.js) visualizations and offline analysis (e.g. Excel)

Stars: ✭ 1,030 (+595.95%)

Mutual labels: excel, spreadsheet

Excelize

Golang library for reading and writing Microsoft Excel™ (XLSX) files.

Stars: ✭ 10,286 (+6850%)

Mutual labels: excel, spreadsheet

Phpspreadsheet

A pure PHP library for reading and writing spreadsheet files

Stars: ✭ 10,627 (+7080.41%)

Mutual labels: excel, spreadsheet

Jupyterlab Spreadsheet

JupyterLab plugin for viewing spreadsheets, such as Excel .xls/.xlsx workbooks and OpenOffice .ods files

Stars: ✭ 95 (-35.81%)

Mutual labels: excel, spreadsheet

Dev Ref

Stars: ✭ 90 (-39.19%)

Mutual labels: excel, spreadsheet

Spreadsheet

Spreadsheet for Vaadin Framework

Stars: ✭ 40 (-72.97%)

Mutual labels: excel, spreadsheet

Tui.grid

🍞🔡 The Powerful Component to Display and Edit Data. Experience the Ultimate Data Transformer!

Stars: ✭ 1,859 (+1156.08%)

Mutual labels: excel, spreadsheet

React Spreadsheet Grid

An Excel-like grid component for React with custom cell editors, performant scroll & resizable columns

Stars: ✭ 996 (+572.97%)

Mutual labels: excel, spreadsheet

Spreadsheet

The Ruby Spreadsheet by ywesee GmbH

Stars: ✭ 1,033 (+597.97%)

Mutual labels: excel, spreadsheet

Janitor

simple tools for data cleaning in R

Stars: ✭ 981 (+562.84%)

Mutual labels: excel, pivot-tables

Expss

expss: Tables and Labels in R

Stars: ✭ 65 (-56.08%)

Mutual labels: excel, pivot-tables

Vue Handsontable Official

Vue Data Grid with Spreadsheet Look & Feel. Official Vue wrapper for Handsontable.

Stars: ✭ 751 (+407.43%)

Mutual labels: excel, spreadsheet

Xlnt

📊 Cross-platform user-friendly xlsx library for C++11+

Stars: ✭ 876 (+491.89%)

Mutual labels: excel, spreadsheet

Poi Android

📈 Apache POI for Android

Stars: ✭ 77 (-47.97%)

Mutual labels: excel, spreadsheet

Hot Table

Handsontable - Best Data Grid Web Component with Spreadsheet Look and Feel.

Stars: ✭ 114 (-22.97%)

Mutual labels: excel, spreadsheet

View All Similar Projects ➔

unpivotr

unpivotr deals with non-tabular data, especially from spreadsheets. Use unpivotr when your source data has any of these ‘features’:

Multi-headered hydra
Meaningful formatting
Headers anywhere but at the top of each column
Non-text headers e.g. dates
Other stuff around the table
Several similar tables in one sheet
Sentinel values
Superscript symbols
Meaningful comments
Nested HTML tables

If that list makes your blood boil, you’ll enjoy the function names.

behead() deals with multi-headered hydra tables one layer of headers at a time, working from the edge of the table inwards. It’s a bit like using header = TRUE in read.csv(), but because it’s a function, you can apply it to as many layers of headers as you need. You end up with all the headers in columns.
spatter() is like tidyr::spread() but preserves mixed data types. You get into a mixed-data-type situation by delaying type coercion until after the table is tidy (rather than before, like read.csv() et al). And yes, it usually follows behead().

More positive, corrective functions:

justify() aligns column headers before behead()ing, and has deliberate moral overtones.
enhead() attaches a header to the body of the data, a la Frankenstein. The effect is the same as behead(), but is more powerful because you can choose exactly which header cells you want, paying attention to formatting (which behead() doesn’t understand).
isolate_sentinels() separates meaningful symbols like "N/A" or "confidential" from the rest of the data, giving them some time alone think about what they’ve done.
partition() takes a sheet with several tables on it, and slashes into pieces that each contain one table. You can then unpivot each table in turn with purrr::map() or similar.

Make cells tidy

Unpivotr uses data where each cells is represented by one row in a dataframe. Like this.

What can you do with tidy cells? The best places to start are:

Spreadsheet Munging Strategies, a free, online cookbook using tidyxl and unpivotr
Screencasts on YouTube.
Worked examples on GitHub.

Otherwise the basic idea is:

Read the data with a specialist tool.
- For spreadsheets, use tidyxl.
- For plain text files, you might soon be able to use readr, but for now you’ll have to install a pull-request on that package with devtools::install_github("tidyverse/readr#760").
- For tables in html pages, use unpivotr::tidy_html()
- For data frames, use unpivotr::as_cells() – this should be a last resort, because by the time the data is in a conventional data frame, it is often too late – formatting has been lost, and most data types have been coerced to strings.
Either behead() straight away, else dplyr::filter() separately for the header cells and the data cells, and then recombine with enhead().
spatter() so that each column has one data type.

library(unpivotr)
library(tidyverse)
#> ── Attaching packages ────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 3.2.1     ✔ purrr   0.3.2
#> ✔ tibble  2.1.3     ✔ dplyr   0.8.3
#> ✔ tidyr   0.8.3     ✔ stringr 1.4.0
#> ✔ readr   1.3.1     ✔ forcats 0.4.0
#> ── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
x <- purpose$`up-left left-up`
x # A pivot table in a conventional data frame.  Four levels of headers, in two
#>                            X2      X3     X4     X5    X6     X7
#> 1                        <NA>    <NA> Female   <NA>  Male   <NA>
#> 2                        <NA>    <NA>  0 - 6 7 - 10 0 - 6 7 - 10
#> 3           Bachelor's degree 15 - 24   7000  27000  <NA>  13000
#> 4                        <NA> 25 - 44  12000 137000  9000  81000
#> 5                        <NA> 45 - 64  10000  64000  7000  66000
#> 6                        <NA>     65+   <NA>  18000  7000  17000
#> 7                 Certificate 15 - 24  29000 161000 30000 190000
#> 8                        <NA> 25 - 44  34000 179000 31000 219000
#> 9                        <NA> 45 - 64  30000 210000 23000 199000
#> 10                       <NA>     65+  12000  77000  8000 107000
#> 11                    Diploma 15 - 24   <NA>  14000  9000  11000
#> 12                       <NA> 25 - 44  10000  66000  8000  47000
#> 13                       <NA> 45 - 64   6000  68000  5000  58000
#> 14                       <NA>     65+   5000  41000  1000  34000
#> 15           No Qualification 15 - 24  10000  43000 12000  37000
#> 16                       <NA> 25 - 44  11000  36000 21000  50000
#> 17                       <NA> 45 - 64  19000  91000 17000  75000
#> 18                       <NA>     65+  16000 118000  9000  66000
#> 19 Postgraduate qualification 15 - 24   <NA>   6000  <NA>   <NA>
#> 20                       <NA> 25 - 44   5000  86000  7000  60000
#> 21                       <NA> 45 - 64   6000  55000  6000  68000
#> 22                       <NA>     65+   <NA>  13000  <NA>  18000
  # rows and two columns.

y <- as_cells(x) # 'Tokenize' or 'melt' the data frame into one row per cell
y
#> # A tibble: 132 x 4
#>      row   col data_type chr              
#>    <int> <int> <chr>     <chr>            
#>  1     1     1 chr       <NA>             
#>  2     2     1 chr       <NA>             
#>  3     3     1 chr       Bachelor's degree
#>  4     4     1 chr       <NA>             
#>  5     5     1 chr       <NA>             
#>  6     6     1 chr       <NA>             
#>  7     7     1 chr       Certificate      
#>  8     8     1 chr       <NA>             
#>  9     9     1 chr       <NA>             
#> 10    10     1 chr       <NA>             
#> # … with 122 more rows

rectify(y) # useful for reviewing the melted form as though in a spreadsheet
#> # A tibble: 22 x 7
#>    `row/col` `1(A)`            `2(B)`  `3(C)` `4(D)` `5(E)` `6(F)`
#>        <int> <chr>             <chr>   <chr>  <chr>  <chr>  <chr> 
#>  1         1 <NA>              <NA>    Female <NA>   Male   <NA>  
#>  2         2 <NA>              <NA>    0 - 6  7 - 10 0 - 6  7 - 10
#>  3         3 Bachelor's degree 15 - 24 7000   27000  <NA>   13000 
#>  4         4 <NA>              25 - 44 12000  137000 9000   81000 
#>  5         5 <NA>              45 - 64 10000  64000  7000   66000 
#>  6         6 <NA>              65+     <NA>   18000  7000   17000 
#>  7         7 Certificate       15 - 24 29000  161000 30000  190000
#>  8         8 <NA>              25 - 44 34000  179000 31000  219000
#>  9         9 <NA>              45 - 64 30000  210000 23000  199000
#> 10        10 <NA>              65+     12000  77000  8000   107000
#> # … with 12 more rows

y %>%
  behead("up-left", "sex") %>%               # Strip headers
  behead("up", "life-satisfication") %>%  # one
  behead("left-up", "qualification") %>%     # by
  behead("left", "age-band") %>%            # one.
  select(-row, -col, -data_type, count = chr) %>% # cleanup
  mutate(count = as.integer(count))
#> # A tibble: 80 x 5
#>     count sex    `life-satisfication` qualification     `age-band`
#>     <int> <chr>  <chr>                <chr>             <chr>     
#>  1   7000 Female 0 - 6                Bachelor's degree 15 - 24   
#>  2  12000 Female 0 - 6                Bachelor's degree 25 - 44   
#>  3  10000 Female 0 - 6                Bachelor's degree 45 - 64   
#>  4     NA Female 0 - 6                Bachelor's degree 65+       
#>  5  27000 Female 7 - 10               Bachelor's degree 15 - 24   
#>  6 137000 Female 7 - 10               Bachelor's degree 25 - 44   
#>  7  64000 Female 7 - 10               Bachelor's degree 45 - 64   
#>  8  18000 Female 7 - 10               Bachelor's degree 65+       
#>  9     NA Male   0 - 6                Bachelor's degree 15 - 24   
#> 10   9000 Male   0 - 6                Bachelor's degree 25 - 44   
#> # … with 70 more rows

Note the compass directions in the code above, which hint to behead() where to find the header cell for each data cell.

"up-left" means the header (Female, Male) is positioned up and to the left of the columns of data cells it describes.
"up" means the header (0 - 6, 7 - 10) is positioned directly above the columns of data cells it describes.
"left-up" means the header (Bachelor's degree, Certificate, etc.) is positioned to the left and upwards of the rows of data cells it describes.
"left" means the header (15 - 24, 25 - 44, etc.) is positioned directly to the left of the rows of data cells it describes.

Installation

# install.packages("devtools") # If you don't already have devtools
devtools::install_github("nacnudus/unpivotr", build_vignettes = TRUE)

The version 0.4.0 release had somee breaking changes. See NEWS.md for details. The previous version can be installed as follow:

devtools::install_version("unpivotr", version = "0.3.1", repos = "http://cran.us.r-project.org")

Similar projects

unpivotr is inspired by Databaker, a collaboration between the United Kingdom Office of National Statistics and The Sensible Code Company. unpivotr.

jailbreaker attempts to extract non-tabular data from spreadsheets into tabular structures automatically via some clever algorithms. unpivotr differs by being less magic, and equipping you to express what you want to do.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 148

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (3) 🔗