sspipe / Sspipe

License: MIT
Simple Smart Pipe: a Python productivity tool for rapid data manipulation

Simple Smart Pipe

SSPipe is a Python productivity tool for rapid data manipulation.

It helps you break up any complicated expression into a sequence of simple transformations, increasing human-readability and decreasing the need for matching parentheses!

For example, the following one-liner reads students' data from 'data.csv' and writes to 'report.csv' the students in class 'A19' whose score is above the average score of that class:

from sspipe import p, px
import pandas as pd

pd.read_csv('data.csv') | px[px['class'] == 'A19'] | px[px.score > px.score.mean()].to_csv('report.csv')

As another example, the following one-liner plots sin(x) in red for the points x between 0 and 2*pi where cos(x) is less than 0:

from sspipe import p, px
import numpy as np
import matplotlib.pyplot as plt

np.linspace(0, 2*np.pi, 100) | px[np.cos(px) < 0] | p(plt.plot, px, np.sin(px), 'r')

# The single-line code above is equivalent to the following code without SSPipe:
# X = np.linspace(0, 2*np.pi, 100)
# X = X[np.cos(X) < 0]
# plt.plot(X, np.sin(X), 'r')

If you are familiar with the | operator of Unix shells or the %>% operator of R's magrittr, sspipe provides the same functionality in Python.

Installation and Usage

Install sspipe using pip:

pip install --upgrade sspipe

Then import it in your scripts.

from sspipe import p, px

The entire functionality of this library is exposed through two objects: p, a wrapper for functions to be called on the piped object, and px, a placeholder for the piped object.

Examples

Each example gives a description, the Python expression using p and px, and the equivalent Python code.

Simple function call
  Using p and px:
    "hello world!" | p(print)
  Equivalent Python code:
    X = "hello world!"
    print(X)

Function call with extra args
  Using p and px:
    "hello" | p(print, "world", end='!')
  Equivalent Python code:
    X = "hello"
    print(X, "world", end='!')

Explicitly positioning piped argument with px placeholder
  Using p and px:
    "world" | p(print, "hello", px, "!")
  Equivalent Python code:
    X = "world"
    print("hello", X, "!")

Chaining pipes
  Using p and px:
    5 | px + 2 | px ** 5 + px | p(print)
  Equivalent Python code:
    X = 5
    X = X + 2
    X = X ** 5 + X
    print(X)

Tailored behavior for builtin map and filter
  Using p and px:
    (
      range(5)
      | p(filter, px % 2 == 0)
      | p(map, px + 10)
      | p(list) | p(print)
    )
  Equivalent Python code:
    X = range(5)
    X = filter((lambda x: x % 2 == 0), X)
    X = map((lambda x: x + 10), X)
    X = list(X)
    print(X)

NumPy expressions
  Using p and px:
    range(10) | np.sin(px)+1 | p(plt.plot)
  Equivalent Python code:
    X = range(10)
    X = np.sin(X) + 1
    plt.plot(X)

Pandas support
  Using p and px:
    people_df | px.loc[px.age > 10, 'name']
  Equivalent Python code:
    X = people_df
    X.loc[X.age > 10, 'name']

Assignment
  Using p and px:
    people_df['name'] |= px.str.upper()
  Equivalent Python code:
    X = people_df['name']
    X = X.str.upper()
    people_df['name'] = X

Pipe as variable
  Using p and px:
    to_upper = px.strip().upper()
    to_underscore = px.replace(' ', '_')
    normalize = to_upper | to_underscore
    " ab cde " | normalize | p(print)
  Equivalent Python code:
    _f1 = lambda x: x.strip().upper()
    _f2 = lambda x: x.replace(' ', '_')
    _f3 = lambda x: _f2(_f1(x))
    X = " ab cde "
    X = _f3(X)
    print(X)

Builtin data structures
  Using p and px:
    2 | p({px-1: p([px, p((px+1, 4))])})
  Equivalent Python code:
    X = 2
    X = {X-1: [X, (X+1, 4)]}

How it works

The expression p(func, *args, **kwargs) returns a Pipe object that overloads the __or__ and __ror__ operators. This object stores func, args, and kwargs until x | <Pipe> is evaluated, at which point Python calls Pipe.__ror__, which evaluates func(x, *args, **kwargs) and returns the result.
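
The mechanism described above can be illustrated with a minimal sketch. This is not SSPipe's actual source: the Pipe class and p function below are simplified, hypothetical stand-ins that implement only the __ror__ step.

```python
class Pipe:
    """Hypothetical, simplified stand-in for the object returned by p(...)."""

    def __init__(self, func, *args, **kwargs):
        # Store the call until a value is piped in.
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, x):
        # Python falls back to this for `x | pipe` when type(x) does not
        # handle `|` with a Pipe on the right-hand side.
        return self.func(x, *self.args, **self.kwargs)


def p(func, *args, **kwargs):
    # Assumed helper: wraps a function and its extra arguments in a Pipe.
    return Pipe(func, *args, **kwargs)


result = [3, 1, 2] | p(sorted, reverse=True)
print(result)        # [3, 2, 1]
print(type(result))  # <class 'list'>
```

The result is an ordinary list, the same value that sorted([3, 1, 2], reverse=True) would return.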

The px object is simply p(lambda x: x).
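
One way to picture how an identity pipe can act as a placeholder in expressions such as px + 2 or px.strip().upper() is sketched below. The operator set and the implementation are assumptions made for illustration, not SSPipe's real code: each overloaded operation returns a new Pipe that applies the operation after the previous step.

```python
class Pipe:
    """Hypothetical sketch of a pipe whose operators build deferred functions."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, x):
        # `x | pipe` applies the stored function to x.
        return self.func(x)

    def __add__(self, other):
        # `pipe + other` defers the addition until a value is piped in.
        return Pipe(lambda x: self.func(x) + other)

    def __getattr__(self, name):
        # `pipe.name` defers the attribute lookup on the piped value.
        return Pipe(lambda x: getattr(self.func(x), name))

    def __call__(self, *args, **kwargs):
        # `pipe(...)` defers calling whatever the previous step produced.
        return Pipe(lambda x: self.func(x)(*args, **kwargs))


# The placeholder is the identity pipe.
px = Pipe(lambda x: x)

print(5 | px + 2)                   # 7
print(" ab " | px.strip().upper())  # AB
```

Because + binds more tightly than |, the expression 5 | px + 2 is parsed as 5 | (px + 2), so the deferred addition is built first and then applied to 5.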

Note that SSPipe does not wrap piped objects; it wraps only the transforming functions. Therefore, when a variable x is not an instance of the Pipe class, the result of y = x | p(func) carries no trace of Pipe: y is exactly the same object as if we had evaluated y = func(x) directly.

Common Gotchas

  • Incompatibility with dict.items, dict.keys and dict.values:

    The objects returned by dict.keys(), dict.values() and dict.items() are called view objects. Python does not allow classes to override the | operator on these types. As a workaround, sspipe implements the / operator for view objects. Example:

    # WRONG (erroneous code):
    {1: 2, 3: 4}.items() | p(list) | p(print)
    
    # CORRECT CODE (With / operator):
    {1: 2, 3: 4}.items() / p(list) | p(print)
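
The / workaround can be pictured with a minimal sketch. It assumes the workaround amounts to a reflected division method that behaves like the reflected | method; the Pipe class here is a hypothetical stand-in, not SSPipe's source.

```python
class Pipe:
    """Hypothetical stand-in supporting both `x | pipe` and `x / pipe`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, x):
        # Handles `x | pipe`.
        return self.func(x)

    def __rtruediv__(self, x):
        # Handles `x / pipe`; view objects define no `/` of their own,
        # so Python falls back to this reflected method.
        return self.func(x)


items = {1: 2, 3: 4}.items()
pairs = items / Pipe(list)
print(pairs)  # [(1, 2), (3, 4)]
```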
    

Compatibility with JulienPalard/Pipe

This library is inspired by, and depends on, the intelligent and concise work of JulienPalard/Pipe. If you want a single pipe.py script or a lightweight library that implements core functionality and logic of SSPipe, Pipe is perfect.

SSPipe focuses on making pipes easier to use: it integrates with popular libraries, introduces the px concept, and overrides Python operators to make pipes a first-class citizen.

Every existing pipe implemented by JulienPalard/Pipe library is accessible through p.<original_name> and is compatible with SSPipe. SSPipe does not implement any specific pipe function and delegates implementation and naming of pipe functions to JulienPalard/Pipe.

For example, JulienPalard/Pipe's solution to the problem "Find the sum of all the even-valued terms in Fibonacci which do not exceed four million." can be rewritten using sspipe:

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

from sspipe import p, px

euler2 = (fib() | p.where(lambda x: x % 2 == 0)
                | p.take_while(lambda x: x < 4000000)
                | p.add())

You can also pass px shorthands to JulienPalard/Pipe API:

euler2 = (fib() | p.where(px % 2 == 0)
                | p.take_while(px < 4000000)
                | p.add())