Datasets - TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Cities.json - Cities of the world in JSON, based on GeoNames Gazetteer
Text - Data loaders and abstractions for text and NLP
Cocostuff10k - The official homepage of the (outdated) COCO-Stuff 10K dataset.
Taco - 🌮 Trash Annotations in Context Dataset Toolkit
Retriever - Quickly download, clean up, and install public datasets into a database management system
Chazutsu - The tool to make NLP datasets ready to use
University1652 Baseline - ACM Multimedia 2020 University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization 🚁 annotates 1652 buildings in 72 universities around the world.
Datalad - Keep code, data, containers under control with git and git-annex
Datasets - source{d} datasets ("big code") for source code analysis and machine learning on source code
Structured3d - [ECCV'20] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
Weatherbench - A benchmark dataset for data-driven weather forecasting
Stocknet Dataset - A comprehensive dataset for stock movement prediction from tweets and historical stock prices.
Torchdata - PyTorch dataset extended with map, cache etc. (tensorflow.data like)
Stationary - Get hourly meteorological data from one of thousands of global stations
H36m Fetch - Human 3.6M 3D human pose dataset fetcher
Collection - Collection Data for Cooper Hewitt, Smithsonian Design Museum
Bccd dataset - BCCD (Blood Cell Count and Detection) Dataset is a small-scale dataset for blood cell detection.
Dataset Serialize - JSON to DataSet and DataSet to JSON converter for Delphi and Lazarus (FPC)
Dialogrpt - EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
Ava downloader - ⏬ Download AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)
Omnianomaly - KDD 2019: Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network
Covid19za - Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Split Folders - 🗂 Split folders with files (e.g. images) into training, validation and test (dataset) folders
Semantic Segmentation Suite - Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Trump Lies - Tutorial: Web scraping in Python with Beautiful Soup
Dali - DALI: a large Dataset of synchronised Audio, LyrIcs and vocal notes.
Hdltex - HDLTex: Hierarchical Deep Learning for Text Classification
Data Set - State-driven, all-in-one data processing for data visualization
Mutual - A Dataset for Multi-Turn Dialogue Reasoning
Sice - Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images (TIP 2018)
Msmarco - Utilities, Baselines, Statistics and Descriptions Related to the MSMARCO DATASET
Datasets For Good - List of datasets to apply stats/machine learning/technology to the world of social good.
Hand pose action - Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.
Data Science Resources - You can learn about what data science is and why it's important in today's world. Are you interested in data science? 🔋
Faker - Faker is a Python package that generates fake data for you.
Mirdata - Python library to work with Music Information Retrieval datasets