Pyjanitor
Clean APIs for data cleaning. Python implementation of the R package Janitor
Stars: ✭ 647 (+4213.33%)
Arquero
Query processing and transformation of array-backed data tables.
Stars: ✭ 384 (+2460%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+900%)
Dataframe
C++ DataFrame for statistical, financial, and ML analysis -- in modern C++ using native types, continuous memory storage, and no pointers are involved
Stars: ✭ 828 (+5420%)
Qframe
Immutable data frame for Go
Stars: ✭ 282 (+1780%)
Mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
Stars: ✭ 2,308 (+15286.67%)
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+3833.33%)
Static Frame
Immutable and grow-only Pandas-like DataFrames with a more explicit and consistent interface.
Stars: ✭ 217 (+1346.67%)
Optopsy
A nimble options backtesting library for Python
Stars: ✭ 373 (+2386.67%)
Polars
Rust DataFrame library
Stars: ✭ 1,214 (+7993.33%)
Boltzmannclean
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: ✭ 23 (+53.33%)
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+640%)
Inspectdf
🛠️ 📊 Tools for Exploring and Comparing Data Frames
Stars: ✭ 195 (+1200%)
Vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Stars: ✭ 6,793 (+45186.67%)
Tablesaw
Java dataframe and visualization library
Stars: ✭ 2,785 (+18466.67%)
Smile
Statistical Machine Intelligence & Learning Engine
Stars: ✭ 5,412 (+35980%)
Panthera
Data-frames & arrays on Clojure
Stars: ✭ 168 (+1020%)
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+3586.67%)
tablexplore
Table analysis and plotting application written in PySide2/PyQt5
Stars: ✭ 89 (+493.33%)
Pandastable
Table analysis in Tkinter using pandas DataFrames.
Stars: ✭ 376 (+2406.67%)
Pandahouse
Pandas interface for the ClickHouse database
Stars: ✭ 126 (+740%)
Pandasvault
Advanced Pandas Vault: Utilities, Functions and Snippets (by @firmai).
Stars: ✭ 316 (+2006.67%)
Morpheus Core
The foundational library of the Morpheus data science framework
Stars: ✭ 203 (+1253.33%)
Nimdata
DataFrame API written in Nim, enabling fast out-of-core data processing
Stars: ✭ 261 (+1640%)
Dframcy
Dataframe Integration with spaCy.
Stars: ✭ 74 (+393.33%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+6093.33%)
pywedge
Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking
Stars: ✭ 49 (+226.67%)
Peroxide
Rust numeric library with R, MATLAB & Python syntax
Stars: ✭ 191 (+1173.33%)
Foxcross
AsyncIO serving for data science models
Stars: ✭ 18 (+20%)
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+1466.67%)
Spark Redis
A connector for Spark that allows reading from and writing to a Redis cluster
Stars: ✭ 773 (+5053.33%)
Pandasgui
PandasGUI is a GUI for viewing, plotting and analyzing Pandas DataFrames.
Stars: ✭ 2,495 (+16533.33%)
Modin
Modin: Speed up your Pandas workflows by changing a single line of code
Stars: ✭ 6,639 (+44160%)
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+86.67%)
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+3973.33%)
Datasheets
Read data from, write data to, and modify the formatting of Google Sheets
Stars: ✭ 593 (+3853.33%)
Technical
Various indicators developed or collected for Freqtrade
Stars: ✭ 222 (+1380%)
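Sequoia
Automatic stock-picking program for China A-shares; implements the Turtle Trading rules, the Chan theory (缠中说禅) bull-market buy points, and several other technical chart patterns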
Stars: ✭ 564 (+3660%)
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+913.33%)
Dataframe Go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Stars: ✭ 487 (+3146.67%)
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+380%)
Dataframe Js
A JavaScript library providing a new data structure for data scientists and developers
Stars: ✭ 376 (+2406.67%)
Datatable
A Go in-memory table
Stars: ✭ 215 (+1333.33%)
Pystore
Fast data store for Pandas time-series data
Stars: ✭ 325 (+2066.67%)
Danfojs
danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
Stars: ✭ 1,304 (+8593.33%)
Sparkflow
Easy-to-use library to bring TensorFlow to Apache Spark
Stars: ✭ 282 (+1780%)
Styleframe
A library that wraps pandas and openpyxl and allows easy styling of dataframes in Excel
Stars: ✭ 252 (+1580%)
Rust Dataframe
A Rust DataFrame implementation, built on Apache Arrow
Stars: ✭ 271 (+1706.67%)
Jardin
A pandas.DataFrame-based ORM.
Stars: ✭ 81 (+440%)
connector-x
Fastest library to load data from DB to DataFrames in Rust and Python
Stars: ✭ 550 (+3566.67%)
Tech.ml.dataset
A Clojure high performance data processing system
Stars: ✭ 205 (+1266.67%)
h3ron
Rust crates for the H3 geospatial indexing system
Stars: ✭ 52 (+246.67%)
daany
Daany - .NET DAta ANalYtics: a .NET library with implementations of DataFrame, time series decomposition, and the BLAS and LAPACK linear algebra routines.
Stars: ✭ 49 (+226.67%)
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+20193.33%)
Ballista
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+15060%)
Pandas Ta
Technical Analysis Indicators - Pandas TA is an easy-to-use Python 3 Pandas extension with 130+ indicators
Stars: ✭ 962 (+6313.33%)