Koalas: pandas API on Apache Spark
Pandas Datareader: Extract data from a wide range of Internet sources into a pandas DataFrame.
Kartothek: A consistent table management library in Python
Stumpy: STUMPY is a powerful and scalable Python library for modern time series analysis
Pymapd: Python client for OmniSci GPU-accelerated SQL engine and analytics platform
Pyvtreat: vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Dask: Parallel computing with task scheduling
Array API: RFC document, tooling and other content related to the array API standard
Pyjanitor: Clean APIs for data cleaning. Python implementation of the R package janitor
grblas: Python wrapper around GraphBLAS
PyData-Pseudolabelling-Keynote: Accompanying notebook and sources to "A Guide to Pseudolabelling: How to get a Kaggle medal with only one model" (Dec. 2020 PyData Boston-Cambridge Keynote)