Top 644 dataset open source projects

Chinese Names Corpus
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Cities of the world in JSON, based on GeoNames Gazetteer
Data loaders and abstractions for text and NLP
Recommendersystem Dataset
This repository contains some datasets that I have collected for recommender systems.
The official homepage of the (outdated) COCO-Stuff 10K dataset.
🌮 Trash Annotations in Context Dataset Toolkit
Quickly download, clean up, and install public datasets into a database management system
The tool to make NLP datasets ready to use
Covid 19 Repo Data
Data archive of identifiable COVID-19 related public projects on GitHub
Covid Chestxray Dataset
We are building an open database of COVID-19 cases with chest X-ray or CT images.
University1652 Baseline
ACM Multimedia2020 University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization 🚁 annotates 1652 buildings in 72 universities around the world.
Keep code, data, containers under control with git and git-annex
source{d} datasets ("big code") for source code analysis and machine learning on source code
[ECCV'20] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
A benchmark dataset for data-driven weather forecasting
Stocknet Dataset
A comprehensive dataset for stock movement prediction from tweets and historical stock prices.
Vehicle reid Collection
🚗 A collection of vehicle re-ID papers and datasets. 🚗
PyTorch dataset extended with map, cache, etc.
Get hourly meteorological data from one of thousands of global stations
Automated Resume Screening System
Automated Resume Screening System using Machine Learning (With Dataset)
H36m Fetch
Human 3.6M 3D human pose dataset fetcher
Collection Data for Cooper Hewitt, Smithsonian Design Museum
Bccd dataset
BCCD (Blood Cell Count and Detection) Dataset is a small-scale dataset for blood cell detection.
Dataset Serialize
JSON to DataSet and DataSet to JSON converter for Delphi and Lazarus (FPC)
EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
Short Jokes Dataset
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Ava downloader
⏬ Download AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)
KDD 2019: Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network
Create fake data in R
Mini Imagenet Tools
Tools for generating mini-ImageNet dataset and processing batches
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Split Folders
🗂 Split folders with files (i.e. images) into training, validation and test (dataset) folders
A Clojure high performance data processing system
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Awesome Json Datasets
A curated list of awesome JSON datasets that don't require authentication.
DALI: a large Dataset of synchronised Audio, LyrIcs and vocal notes.
Data Set
State-driven, all-in-one data processing for data visualization
Fifa18 All Player Statistics
A complete catalog of all the players in Fifa 18 and their complete statistics.
A Dataset for Multi-Turn Dialogue Reasoning
Intrinsic Image Popularity
The PyTorch code of the paper "Intrinsic Image Popularity Assessment"
Sign Language Digits Dataset
Turkey Ankara Ayrancı Anadolu High School's Sign Language Digits Dataset
Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images (TIP 2018)
Utilities, Baselines, Statistics and Descriptions Related to the MSMARCO DATASET
Everypolitician Data
data for national legislatures worldwide
Datasets For Good
List of datasets to apply stats/machine learning/technology to the world of social good.
Hand pose action
Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.
Data Science Resources
👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋
Faker is a Python package that generates fake data for you.
Python library to work with Music Information Retrieval datasets