has2k1 / Plydata

Licence: bsd-3-clause
A grammar for data manipulation in Python


Projects that are alternatives of or similar to Plydata

Mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
Stars: ✭ 2,308 (+1020.39%)
Mutual labels:  pandas
Zebras
Data analysis library for JavaScript built with Ramda
Stars: ✭ 192 (-6.8%)
Mutual labels:  pandas
Pandas flavor
The easy way to write your own flavor of Pandas
Stars: ✭ 201 (-2.43%)
Mutual labels:  pandas
Data Science Types
Mypy stubs, i.e., type information, for numpy, pandas and matplotlib
Stars: ✭ 180 (-12.62%)
Mutual labels:  pandas
Dtale
Visualizer for pandas data structures
Stars: ✭ 2,864 (+1290.29%)
Mutual labels:  pandas
Finance
Here you can find all the quantitative finance algorithms that I've worked on and refined over the past year!
Stars: ✭ 194 (-5.83%)
Mutual labels:  pandas
Panthera
Data-frames & arrays on Clojure
Stars: ✭ 168 (-18.45%)
Mutual labels:  pandas
Awkward 1.0
Manipulate JSON-like data with NumPy-like idioms.
Stars: ✭ 203 (-1.46%)
Mutual labels:  pandas
California Coronavirus Data
The Los Angeles Times' independent tally of coronavirus cases in California.
Stars: ✭ 188 (-8.74%)
Mutual labels:  pandas
Data Science Projects With Python
A Case Study Approach to Successful Data Science Projects Using Python, Pandas, and Scikit-Learn
Stars: ✭ 198 (-3.88%)
Mutual labels:  pandas
Andrew Ng Notes
This is Andrew NG Coursera Handwritten Notes.
Stars: ✭ 180 (-12.62%)
Mutual labels:  pandas
Choochoo
Training Diary
Stars: ✭ 186 (-9.71%)
Mutual labels:  pandas
Data Science Notebook
📖 Every great thought and action has a humble beginning
Stars: ✭ 196 (-4.85%)
Mutual labels:  pandas
Tensorflow Ml Nlp
Natural language processing starting with TensorFlow and machine learning (from logistic regression to a Transformer chatbot)
Stars: ✭ 176 (-14.56%)
Mutual labels:  pandas
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (-2.43%)
Mutual labels:  pandas
Ditching Excel For Python
Functionalities in Excel translated to Python
Stars: ✭ 172 (-16.5%)
Mutual labels:  pandas
Fashion Recommendation
A clothing retrieval and visual recommendation model for fashion images.
Stars: ✭ 193 (-6.31%)
Mutual labels:  pandas
Tianyancha
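A pip-installable Tianyancha scraper API that saves the business registration details of one or more specified companies to Excel/JSON in one step. A Battery-included Scraper API of Tianyancha, the best Chinese business data and investigation platform.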
Stars: ✭ 206 (+0%)
Mutual labels:  pandas
Joyful Pandas
A pandas tutorial in Chinese
Stars: ✭ 2,788 (+1253.4%)
Mutual labels:  pandas
Arctic
High performance datastore for time series and tick data
Stars: ✭ 2,525 (+1125.73%)
Mutual labels:  pandas

#######
plydata
#######

=========================  =======================
Latest Release             |release|_
License                    |license|_
Build Status               |buildstatus|_
Coverage                   |coverage|_
Documentation (Dev)        |documentation|_
Documentation (Release)    |documentation_stable|_
=========================  =======================

plydata is a library that provides a grammar for data manipulation. The grammar consists of verbs that can be applied to pandas dataframes or database tables. It is based on the R packages dplyr_, tidyr_ and forcats_. plydata uses the ``>>`` operator as a pipe symbol; alternatively, you can use the ``ply(data, *verbs)`` function instead of ``>>``.

At present the only supported data store is the pandas dataframe. We expect to support sqlite and maybe postgresql and mysql.
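
To make the piping idea concrete, here is a minimal sketch of how a ``>>`` pipe and a ``ply``-style function can be built in plain Python. It is not plydata's implementation: the ``Verb``, ``Define`` and ``Query`` classes are hypothetical, callables stand in for plydata's string expressions, and a list of dicts stands in for a dataframe.

```python
class Verb:
    """Base class for a pipeable operation (hypothetical, for illustration)."""

    def __call__(self, rows):
        raise NotImplementedError

    def __rrshift__(self, rows):
        # Python evaluates `rows >> verb` as verb.__rrshift__(rows) when the
        # left operand does not implement `>>` for this operand type.
        return self(rows)


class Define(Verb):
    """Add new columns, each computed from the current row."""

    def __init__(self, **columns):
        self.columns = columns

    def __call__(self, rows):
        return [
            {**row, **{name: func(row) for name, func in self.columns.items()}}
            for row in rows
        ]


class Query(Verb):
    """Keep only the rows for which the predicate is true."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, rows):
        return [row for row in rows if self.predicate(row)]


def ply(rows, *verbs):
    """Apply the verbs in order; an alternative to chaining with >>."""
    for verb in verbs:
        rows = verb(rows)
    return rows


rows = [{"x": 0}, {"x": 1}, {"x": 2}, {"x": 3}]

piped = rows >> Define(z=lambda r: int(r["x"] > 1)) >> Query(lambda r: r["z"] == 1)
called = ply(rows, Define(z=lambda r: int(r["x"] > 1)), Query(lambda r: r["z"] == 1))

print(piped)  # [{'x': 2, 'z': 1}, {'x': 3, 'z': 1}]
```

Both spellings run the same verbs in the same order, so ``piped`` and ``called`` are equal; the operator form is only syntactic sugar over the function form.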

Installation
============

plydata only supports Python 3.

Official version

.. code-block:: console

   $ pip install plydata

Development version

.. code-block:: console

   $ pip install git+https://github.com/has2k1/plydata.git@master

Example
=======

.. code-block:: python

   import pandas as pd
   import numpy as np
   from plydata import define, query, if_else, ply

   # NOTE: query is the equivalent of dplyr's filter but with
   #       slightly different python syntax for the expressions

   df = pd.DataFrame({
       'x': [0, 1, 2, 3],
       'y': ['zero', 'one', 'two', 'three']})

   df >> define(z='x')
   """
      x      y  z
   0  0   zero  0
   1  1    one  1
   2  2    two  2
   3  3  three  3
   """

   df >> define(z=if_else('x > 1', 1, 0))
   """
      x      y  z
   0  0   zero  0
   1  1    one  0
   2  2    two  1
   3  3  three  1
   """

   # You can pass the dataframe as the first argument
   query(df, 'x > 1')  # same as `df >> query('x > 1')`
   """
      x      y
   2  2    two
   3  3  three
   """

   # You can use the ply function instead of the >> operator
   ply(df,
       define(z=if_else('x > 1', 1, 0)),
       query('z == 1')
   )
   """
      x      y  z
   2  2    two  1
   3  3  three  1
   """
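
For orientation, the verbs in the example above roughly correspond to the following plain pandas calls. This is a sketch for comparison only; it uses ``DataFrame.assign``, ``DataFrame.query`` and ``numpy.where`` and does not import plydata, and the variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'x': [0, 1, 2, 3],
    'y': ['zero', 'one', 'two', 'three']})

# define(z='x'): copy column x into a new column z
copied = df.assign(z=df['x'])

# define(z=if_else('x > 1', 1, 0)): vectorised conditional
flagged = df.assign(z=np.where(df['x'] > 1, 1, 0))

# query('x > 1'): keep the rows that satisfy the expression
filtered = df.query('x > 1')

# ply(df, define(...), query('z == 1')): chain the two steps
chained = df.assign(z=np.where(df['x'] > 1, 1, 0)).query('z == 1')

print(chained)
```

The pandas version has to repeat ``df`` inside each expression, whereas the verbs take the column expressions as strings.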

plydata piping works with plotnine_.

.. code-block:: python

   import numpy as np
   import pandas as pd
   from plotnine import ggplot, aes, geom_line
   from plydata import define, if_else

   df = pd.DataFrame({'x': np.linspace(0, 2*np.pi, 500)})
   (df
    >> define(y='np.sin(x)')
    >> define(sign=if_else('y >= 0', '"positive"', '"negative"'))
    >> (ggplot(aes('x', 'y'))
        + geom_line(aes(color='sign'), size=1.5))
    )

.. figure:: ./doc/images/readme-image.png

What about dplython or pandas-ply?
==================================

dplython_ and pandas-ply_ are two other packages with a similar objective to plydata. The main difference is that plydata does not use a placeholder variable (``X``) as a stand-in for the dataframe; columns are referred to by name, as strings. For example:

.. code-block:: python

   diamonds >> select(X.carat, X.cut, X.price)  # dplython

   diamonds >> select('carat', 'cut', 'price')  # plydata
   select(diamonds, 'carat', 'cut', 'price')    # plydata
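
One way to see why no placeholder is needed: when a verb receives column names and expressions as strings, it can resolve them against whatever data it is handed. The sketch below illustrates that idea with a dict of lists and Python's built-in ``eval``. It is an assumption-laden illustration, not plydata's implementation; ``select_columns`` and ``define_column`` are hypothetical helpers and the sample values are made up.

```python
def select_columns(table, *names):
    """Return a new table holding only the named columns."""
    return {name: table[name] for name in names}


def define_column(table, name, expression):
    """Add a column by evaluating a string expression once per row."""
    n_rows = len(next(iter(table.values())))
    values = []
    for i in range(n_rows):
        # The row's column values act as the local namespace of the
        # expression, so 'price > 330' can refer to `price` directly.
        row = {col: vals[i] for col, vals in table.items()}
        # eval is fine for a sketch; never use it on untrusted input.
        values.append(eval(expression, {"__builtins__": {}}, row))
    return {**table, name: values}


# Made-up sample rows standing in for the diamonds dataset
diamonds = {
    "carat": [0.23, 0.21, 0.31],
    "cut": ["Ideal", "Premium", "Good"],
    "price": [326, 326, 335],
}

subset = select_columns(diamonds, "carat", "price")
with_flag = define_column(subset, "pricey", "price > 330")

print(with_flag["pricey"])  # [False, False, True]
```

A placeholder object such as ``X.price`` defers the column lookup through attribute access instead; the string form defers it through name resolution at evaluation time.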

For more, see the documentation_.

.. |release| image:: https://img.shields.io/pypi/v/plydata.svg
.. _release: https://pypi.python.org/pypi/plydata

.. |license| image:: https://img.shields.io/pypi/l/plydata.svg
.. _license: https://pypi.python.org/pypi/plydata

.. |buildstatus| image:: https://github.com/has2k1/plydata/workflows/build/badge.svg?branch=master
.. _buildstatus: https://github.com/has2k1/plydata/actions?query=branch%3Amaster+workflow%3A%22build%22

.. |coverage| image:: https://codecov.io/github/has2k1/plydata/coverage.svg?branch=master
.. _coverage: https://codecov.io/github/has2k1/plydata?branch=master

.. |documentation| image:: https://readthedocs.org/projects/plydata/badge/?version=latest
.. _documentation: https://plydata.readthedocs.io/en/latest/

.. |documentation_stable| image:: https://readthedocs.org/projects/plydata/badge/?version=stable
.. _documentation_stable: https://plydata.readthedocs.io/en/stable/

.. _dplyr: https://github.com/tidyverse/dplyr
.. _tidyr: https://github.com/tidyverse/tidyr
.. _forcats: https://github.com/tidyverse/forcats
.. _pandas-ply: https://github.com/coursera/pandas-ply
.. _dplython: https://github.com/dodger487/dplython
.. _plotnine: https://plotnine.readthedocs.io/en/stable/
