Cdap - An open source framework for building data analytic applications.
Voice datasets - 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
Doccano - Open source annotation tool for machine learning practitioners.
Seq2seqchatbots - A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.
Lidar Bonnetal - Semantic and Instance Segmentation of LiDAR point clouds for autonomous driving
Mongodb Json Files - 📦 A curated list of JSON / BSON datasets from the web in order to practice / use in MongoDB
Io - Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Squad Explorer - Visually Explore the Stanford Question Answering Dataset
Imdb Face - A new large-scale noise-controlled face recognition dataset.
Cmu Multimodalsdk - CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing and processing multimodal datasets.
Comma2k19 - A driving dataset for the development and validation of fused pose estimators and mapping algorithms
Vpgnet - VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition (ICCV 2017)
Trashnet - Dataset of images of trash; Torch-based CNN for garbage image classification
Data - Python-related videos and metadata powering =>
Dukemtmc Reid evaluation - (ICCV 2017) Person re-ID evaluation code for the DukeMTMC-reID dataset (including dataset download)
Medmnist - [ISBI'21] MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis
Dsprites Dataset - Dataset to assess the disentanglement properties of unsupervised learning methods
Eseur Code Data - Code and data used to create the examples in "Evidence-based Software Engineering based on the publicly available data"
Pcam - The PatchCamelyon (PCam) deep learning classification benchmark.
Atsd Use Cases - Axibase Time Series Database: Usage Examples and Research Articles
Whylogs - Profile and monitor your ML data pipeline end-to-end
Browser Compat Data - This repository contains compatibility data for Web technologies as displayed on MDN
Toflow - TOFlow: Video Enhancement with Task-Oriented Flow
Covid19 twitter - Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development
Css10 - CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Datasets - A repository of pretty cool datasets that I collected for network science and machine learning research.
Cryptocmd - Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
Tape - Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.
Linusrants - Dataset of Linus Torvalds' rants classified by negativity using sentiment analysis
Surface Defect Detection - 🐎📈 A continually updated summary of open source datasets and key papers in the field of surface defect research. 🐋
Text2sql Data - A collection of datasets that pair questions with SQL queries.
Oie Resources - A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Realsr - Toward Real-World Single Image Super-Resolution: A New Benchmark and A New Model (ICCV 2019)
Meglass - An eyeglass face dataset collected and cleaned for face recognition evaluation, CCBR 2018.
Fx 1 Minute Data - HISTDATA: full dataset of 68 FX trading pairs, plus a simple API to retrieve 1-minute historical FX prices (up to June 2019).
Knowage Server - Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Data Science Hacks - Tips and tricks to help you become a better data scientist, for all levels from beginner to advanced. Covers Python, Jupyter Notebook, pandas, and more.
Exclusively Dark Image Dataset - The Exclusively Dark (ExDARK) dataset, which to the best of our knowledge is the largest collection to date of low-light images, taken in conditions ranging from very low light to twilight (i.e., 10 different conditions), with image-class and object-level annotations.
Semantic Kitti Api - SemanticKITTI API for visualizing dataset, processing data, and evaluating results.
Covid19canada - Epidemiological Data from the COVID-19 Epidemic in Canada