Top 763 data open source projects

Pdpipe
Easy pipelines for pandas DataFrames.
Iexfinance
Python SDK for IEX Cloud
Panini
A super simple flat file generator.
Machine Learning Mindmap
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
Footballdata
A hodgepodge of JSON and CSV Football/Soccer data
Disk.frame
Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
Sklearn Classification
Data Science Notebook on a Classification Task, using sklearn and Tensorflow.
Knowledge Repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Voice datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
Pybaseball
Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
✭ 484
python, data
Machine Learning Roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Core2d
A multi-platform data driven 2D diagram editor.
Data Engineering Book
Accumulated knowledge and experience in the field of Data Engineering
Rio
A Swiss-Army Knife for Data I/O
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, infrastructure setup on the cloud, Data Warehousing, and Data Lake development.
Bogus
📇 A simple and sane fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.
Fetch
Simple & Efficient data access for Scala and Scala.js
Tensorbase
TensorBase BE is building a high-performance, cloud-neutral big data warehouse for SMEs, fully in Rust.
Isp Data Pollution
ISP Data Pollution to Protect Private Browsing History with Obfuscation
Featran
A Scala feature transformation library for data science and machine learning
Data
This repository contains general data for Web technologies
React Spreadsheet
Simple, customizable yet performant spreadsheet for React
Datacleaner
The premier open source Data Quality solution
Bad Data Guide
An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
Samples
Sample projects using Material, Graph, and Algorithm.
Glide Data Grid
A high-performance React grid component, with rich rendering and first-class TypeScript support.
Arquero
Query processing and transformation of array-backed data tables.
React Query
⚛️ Hooks for fetching, caching and updating asynchronous data in React
Dataframe Js
A JavaScript library providing a new data structure for data scientists and developers
Keypathkit
KeyPathKit is a library that provides the standard functions to manipulate data along with a call-syntax that relies on typed keypaths to make the call sites as short and clean as possible.
✭ 376
swift, sql, data
Stm32 Usart Uart Dma Rx Tx
STM32 examples for USART using DMA for efficient RX and TX transmission
Migration data
Safely migrate data in ActiveRecord migrations and keep them up to date.
Django Smuggler
Django Smuggler is a pluggable application for the Django Web Framework that helps you import and export fixtures via the automatically generated administration interface.
J
❌ Multi-format spreadsheet CLI (now merged in http://github.com/sheetjs/js-xlsx)
Anon
A UNIX Command To Anonymise Data
React Refetch
A simple, declarative, and composable way to fetch data for React components
Morphism
⚡ Type-safe data transformer for JavaScript, TypeScript & Node.js.
Browser Compat Data
This repository contains compatibility data for Web technologies as displayed on MDN
Akshare
AKShare is an elegant and simple open source financial data interface library for Python, built for human beings!
Mimesis
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Tidy
Tidy up your data with JavaScript, inspired by dplyr and the tidyverse
Data Transfer Project
The Data Transfer Project makes it easy for people to transfer their data between online service providers. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.
Cartola
Data extraction from the CartolaFC API, exploratory data analysis, and predictive models in R and Python (2014-20). CartolaFC is the most popular fantasy football game in Brazil and maybe in the world. Data cover years 2014-19.
Baize
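Baize automated operations system: configuration management, network probing, asset management, business management, CMDB, CD, DevOps, job orchestration, task orchestration, and other features. Monitoring, alerting, log analysis, and big data analysis are planned for the future.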
Ghcrawler
Crawl GitHub APIs and store the discovered orgs, repos, commits, ...
Python
This repository helps you understand Python from scratch.
Surveykit
Android library to create beautiful surveys (aligned with ResearchKit on iOS)