DataFrame: DataFrame Library for Java
Stars: ✭ 51 (+2%)
Eland: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+370%)
daany: Daany - .NET DAta ANalYtics, a .NET library implementing DataFrame, time series decompositions, and the linear algebra routines BLAS and LAPACK.
Stars: ✭ 49 (-2%)
heidi: tidy data in Haskell
Stars: ✭ 24 (-52%)
Tablesaw: Java dataframe and visualization library
Stars: ✭ 2,785 (+5470%)
polars: Fast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+12636%)
Dominando-Pandas: This repository is intended for learning the Pandas library.
Stars: ✭ 22 (-56%)
Static Frame: Immutable and grow-only Pandas-like DataFrames with a more explicit and consistent interface.
Stars: ✭ 217 (+334%)
Dataframe Js: A JavaScript library providing a new data structure for data scientists and developers
Stars: ✭ 376 (+652%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+58%)
Datatable: A Go in-memory table
Stars: ✭ 215 (+330%)
Styleframe: A library that wraps pandas and openpyxl and allows easy styling of dataframes in Excel
Stars: ✭ 252 (+404%)
Datacleaner: The premier open source Data Quality solution
Stars: ✭ 391 (+682%)
Ether sql: A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-18%)
Pandastable: Table analysis in Tkinter using pandas DataFrames.
Stars: ✭ 376 (+652%)
Qframe: Immutable data frame for Go
Stars: ✭ 282 (+464%)
Dataframe: C++ DataFrame for statistical, financial, and ML analysis, written in modern C++ using native types and continuous memory storage, with no pointers involved
Stars: ✭ 828 (+1556%)
Morpheus Core: The foundational library of the Morpheus data science framework
Stars: ✭ 203 (+306%)
Nanny: A tidyverse suite for (pre-) machine-learning: cluster, PCA, permute, impute, rotate, redundancy, triangular, smart-subset, abundant and variable features.
Stars: ✭ 17 (-66%)
hamilton: A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+1124%)
Getting Started: This repository is a getting started guide to Singer.
Stars: ✭ 734 (+1368%)
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+9738%)
bow: Go data analysis / manipulation library built on top of Apache Arrow
Stars: ✭ 20 (-60%)
DataBridge.NET: Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-68%)
etlflow: EtlFlow is an ecosystem of functional libraries in Scala, based on ZIO, for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (-24%)
Infinite Stories with Data: This repo consists of my analysis of random datasets using various statistical and visualization techniques.
Stars: ✭ 21 (-58%)
DataProfiler: What's in your data? Extract schema, statistics and entities from datasets
Stars: ✭ 843 (+1586%)
advanced-pandas: Pandas is a powerful tool for data exploration and analysis (including timeseries).
Stars: ✭ 22 (-56%)
awesome-dev.to: [UNMAINTAINED] A collection of awesome blog series on DEV.to
Stars: ✭ 18 (-64%)
iMOKA: interactive Multi Objective K-mer Analysis
Stars: ✭ 19 (-62%)
woodwork: Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Stars: ✭ 97 (+94%)
ttbbeer: An R Dataset Package for US Beer Statistics From TTB 🍺
Stars: ✭ 23 (-54%)
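mydataharbor: 🇨🇳 MyDataHarbor is distributed, highly scalable, high-performance, transaction-level data synchronization middleware dedicated to syncing data from any data source to any other data source. It helps users reliably, quickly, and stably run near-real-time incremental synchronization or scheduled full synchronization of massive data volumes. It is primarily positioned to serve real-time transaction systems, and can also be used for big-data synchronization (the ETL domain).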
Stars: ✭ 28 (-44%)
Fraud-Detection-in-Online-Transactions: Detecting fraud in online transactions using anomaly detection techniques such as oversampling and undersampling. Because the ratio of frauds is less than 0.00005, simply applying a classification algorithm may result in overfitting.
Stars: ✭ 41 (-18%)
mixedvines: Python package for canonical vine copula trees with mixed continuous and discrete marginals
Stars: ✭ 36 (-28%)
go-bqloader: bqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
Stars: ✭ 16 (-68%)
cognipy: In-memory Graph Database and Knowledge Graph with Natural Language Interface, compatible with Pandas
Stars: ✭ 31 (-38%)
computational-neuroscience: Short undergraduate course taught at the University of Pennsylvania on computational and theoretical neuroscience. Provides an introduction to programming in MATLAB, single-neuron models, ion channel models, basic neural networks, and neural decoding.
Stars: ✭ 36 (-28%)
cobrix: A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Stars: ✭ 109 (+118%)
singer-runner: A CLI and library to run Singer Taps and Targets
Stars: ✭ 33 (-34%)
ipaddress: Data analysis of IP addresses and networks
Stars: ✭ 20 (-60%)
Moose: MOOSE - Platform for software and data analysis.
Stars: ✭ 110 (+120%)
ipychart: The power of Chart.js with Python
Stars: ✭ 48 (-4%)
arrow-datafusion: Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+4620%)
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+188%)
akshare: AKShare is an elegant and simple financial data interface library for Python, built for human beings! (An open-source financial data interface library.)
Stars: ✭ 5,155 (+10210%)
PDAP-Scrapers: Code relating to scraping public police data.
Stars: ✭ 72 (+44%)
ospi: Open Source Presence Infographic of Indian Startups
Stars: ✭ 25 (-50%)
dsr: Introduction to Data Science with R (2017)
Stars: ✭ 25 (-50%)
dask-awkward: Native Dask collection for awkward arrays, and the library to use it.
Stars: ✭ 25 (-50%)
saddle: SADDLE (Scala Data Library)
Stars: ✭ 23 (-54%)
torch-dataframe: Utility class to manipulate a dataset from a CSV file
Stars: ✭ 67 (+34%)