Top 646 data open source projects

TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Http Fake Backend
Build a fake backend by providing the content of JSON files or JavaScript objects through configurable routes.
Vis Academy
A set of tutorials on how our frameworks make effective data visualization applications.
[DEPRECATED] Open data sharing powered by Dat
Wikibase Sdk
JS utility functions to query a Wikibase instance and simplify its results
A GPU-powered real-time analytics storage and query engine.
Blazor Table Component with Sorting, Paging and Filtering
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Pandas Gbq
Pandas Google BigQuery
Quickly download, clean up, and install public datasets into a database management system
Transforms PDF, Documents and Images into Enriched Structured Data
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Provides fake data to your Android apps :)
A converter that generates a bash one-liner from an SQL Select query (no DB necessary)
Data and code behind the articles and graphics at FiveThirtyEight
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Minecraft Data
Language independent module providing minecraft data for minecraft clients, servers and libraries.
Python package for creating beautiful interactive Chord Diagrams. Pro version available at
A taxonomic toolbelt for R
Create fake data in R
Hodur Engine
Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.
Splitgraph command line client and python library
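A pip-installable Tianyancha scraper API that saves the business registration information of one or more specified companies to Excel/JSON format in a single step. A batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform.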
Ruby API for parsing and generating ASC X12 EDI transactions
Go - Beginners | Intermediate | Advanced
A set of data tools in Python
☄️ Temporal is an easy-to-use, enterprise-grade interface into distributed and decentralized storage
Awesome Json Datasets
A curated list of awesome JSON datasets that don't require authentication.
#100DaysOfCode - Learn by developing 100 unique apps to explore exciting tech stacks
Climate Change Data
🌍 A curated list of APIs, open data and ML/AI projects on climate change
Elasticsearch Test Data
Generate and upload test data to Elasticsearch for performance and load testing
⚡ Embed data payloads inside of ordinary images or video with high-performance animated 2-D barcodes. (Python library)
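rediscompare is a command-line tool for comparing and verifying data consistency across multiple Redis databases. It supports scenarios such as single instance to single instance, single instance to native cluster, and multiple instances with multiple databases to a single instance.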
CRAN OpenData Task View
Vue Smooth Picker
🏄🏼 A SmoothPicker for Vue 2 (like native datetime picker of iOS)
California Coronavirus Data
The Los Angeles Times' independent tally of coronavirus cases in California.
Human Readable and Writable Data Interchange Format
Tool for visual exploration of complex data.
A Scala API for Apache Beam and Google Cloud Dataflow.
Dfuse Eosio
dfuse for EOSIO
Encrypted, taggable, searchable cloud storage
pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
Send and Receive files directly from your browser with end-to-end encryption
An empty state control to give visually appealing context when building iOS applications.
Nessie provides Git-like capabilities for your Data Lake
Generate databases filled with fake but valid data for test purposes, using the most popular patterns (AFAIK). Currently supports SQLite, MySQL, PostgreSQL, MongoDB, Redis, and CouchDB.
📦 R package for data and supplemental functions for OpenIntro resources
Ncov2019 data crawler
Databay is a Python interface for scheduled data transfer. It facilitates transfer of (any) data from A to B, on a scheduled interval.
Linked Data & RDF Manufacturing Tools in Clojure
Everypolitician Data
data for national legislatures worldwide
Lfai Landscape
🌄 Open Source AI Landscape - provides an overview of top-tier projects in the open source AI ecosystem and shows projects through GitHub data, funding or market cap, first and last commits, contributor count, and more.
General Store
Simple, flexible store implementation for Flux. #hubspot-open-source
Data Science Resources
👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋