pwwang / datar

Licence: MIT license
A Grammar of Data Manipulation in python

Projects that are alternatives of or similar to datar

tutorials
Short programming tutorials pertaining to data analysis.
Stars: ✭ 14 (-90.14%)
Mutual labels:  dplyr, pandas, tidyr
learning R
List of resources for learning R
Stars: ✭ 32 (-77.46%)
Mutual labels:  dplyr, data-manipulation, tidyr
Siuba
Python library for using dplyr like syntax with pandas and SQL
Stars: ✭ 605 (+326.06%)
Mutual labels:  dplyr, pandas
Sspipe
Simple Smart Pipe: python productivity-tool for rapid data manipulation
Stars: ✭ 96 (-32.39%)
Mutual labels:  dplyr, pandas
Tidyquery
Query R data frames with SQL
Stars: ✭ 138 (-2.82%)
Mutual labels:  dplyr, tidyverse
Timetk
A toolkit for working with time series in R
Stars: ✭ 371 (+161.27%)
Mutual labels:  dplyr, tidyverse
Tidylog
Tidylog provides feedback about dplyr and tidyr operations. It provides wrapper functions for the most common functions, such as filter, mutate, select, and group_by, and provides detailed output for joins.
Stars: ✭ 428 (+201.41%)
Mutual labels:  dplyr, tidyverse
Tidyquant
Bringing financial analysis to the tidyverse
Stars: ✭ 635 (+347.18%)
Mutual labels:  dplyr, tidyverse
R4ds Exercise Solutions
Exercise solutions to "R for Data Science"
Stars: ✭ 226 (+59.15%)
Mutual labels:  dplyr, tidyverse
eeguana
A package for manipulating EEG data in R.
Stars: ✭ 16 (-88.73%)
Mutual labels:  dplyr, tidyverse
CSSS508
CSSS508: Introduction to R for Social Scientists
Stars: ✭ 28 (-80.28%)
Mutual labels:  dplyr, tidyverse
Tidy
Tidy up your data with JavaScript, inspired by dplyr and the tidyverse
Stars: ✭ 307 (+116.2%)
Mutual labels:  dplyr, tidyverse
tidysq
tidy processing of biological sequences in R
Stars: ✭ 29 (-79.58%)
Mutual labels:  tidyverse, tibble
Moderndive book
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+271.13%)
Mutual labels:  dplyr, tidyverse
parcours-r
Teaching kit for R training
Stars: ✭ 25 (-82.39%)
Mutual labels:  dplyr, tidyverse
casewhen
Create reusable dplyr::case_when() functions
Stars: ✭ 64 (-54.93%)
Mutual labels:  dplyr, tidyverse
tbltools
🗜🔢 Tools for Working with Tibbles
Stars: ✭ 34 (-76.06%)
Mutual labels:  tidyverse, tibble
advanced-data-wrangling-in-R-legacy
Advanced-data-wrangling-in-R, Workshop
Stars: ✭ 14 (-90.14%)
Mutual labels:  dplyr, tidyverse
Tidyheatmap
Draw heatmap simply using a tidy data frame
Stars: ✭ 151 (+6.34%)
Mutual labels:  dplyr, tidyverse
datawizard
Magic potions to clean and transform your data 🧙
Stars: ✭ 149 (+4.93%)
Mutual labels:  dplyr, tidyr

datar

A Grammar of Data Manipulation in python

Documentation | Reference Maps | Notebook Examples | API | Blog

datar re-imagines the APIs of data manipulation libraries in Python (currently only pandas is supported) so that you can manipulate your data the way you would with dplyr in R.

datar is an in-depth port of tidyverse packages, such as dplyr, tidyr, forcats and tibble, as well as some functions from base R.

Installation

pip install -U datar

# install pdtypes support
pip install -U datar[pdtypes]

# install dependencies for modin as backend
pip install -U datar[modin]
# you may also need to install dependencies for modin engines
# pip install -U modin[ray]

Example usage

from datar import f
from datar.dplyr import mutate, filter, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter, if_else, tibble

df = tibble(
    x=range(4),  # or f[:4]
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
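
For readers coming from pandas, the last pipeline can be compared with a plain pandas version. This is only a rough sketch of an equivalent result, not a description of what datar does internally; the names `with_z` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the tibble in the example
df = pd.DataFrame({"x": range(4), "y": ["zero", "one", "two", "three"]})

# Roughly mutate(z=if_else(f.x>1, 1, 0)): add a column computed from x
with_z = df.assign(z=lambda d: np.where(d["x"] > 1, 1, 0))

# Roughly filter(f.z==1): keep matching rows, then renumber the index
# so it starts at 0 again, as in the output shown for the pipeline
result = with_z[with_z["z"] == 1].reset_index(drop=True)

print(result)
```

The datar version expresses both steps as one left-to-right chain, which is the main difference in style.
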
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar.base import sin, pi
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2*pi, 500))
(df >>
  mutate(y=sin(f.x), sign=if_else(f.y>=0, "positive", "negative")) >>
  ggplot(aes(x='x', y='y')) +
  theme_classic() +
  geom_line(aes(color='sign'), size=1.2))

# easy to integrate with other libraries
# for example: klib
import klib
from datar.core.factory import verb_factory
from datar.datasets import iris
from datar.dplyr import pull

dist_plot = verb_factory(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
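
The `data >> verb(...)` style used in these examples can be built on Python's reflected right-shift hook, `__rrshift__`. The sketch below shows the general idea in pure Python. It is not datar's implementation: `PipeableCall`, `make_verb`, `keep_above` and `total` are hypothetical names introduced only for this illustration.

```python
class PipeableCall:
    """Holds a function and its extra arguments until data is piped in."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python calls this for `data >> self` when type(data) does not
        # handle `>>` itself (a list, for instance, does not).
        return self.func(data, *self.args, **self.kwargs)


def make_verb(func):
    """Wrap func(data, ...) so that verb(...) can sit on the right of `>>`."""
    def verb(*args, **kwargs):
        return PipeableCall(func, *args, **kwargs)
    return verb


# Two toy verbs: one with an extra argument, one wrapping a builtin
keep_above = make_verb(lambda values, threshold: [v for v in values if v > threshold])
total = make_verb(sum)

result = [0, 1, 2, 3] >> keep_above(1) >> total()
print(result)  # 5
```

Because `>>` is left-associative, each step receives the output of the previous one, which is what lets a wrapped third-party function sit at the end of a chain.
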

See also some advanced examples from my answers on StackOverflow:
