Top 765 dataset open source projects

Crnn With Stn
Implementation of CRNN in Keras with a Spatial Transformer Network
Crd3
The repo containing the Critical Role Dungeons and Dragons Dataset.
Google Covid19 Mobility Reports
Data extraction of Google's COVID-19 Mobility Reports
Vidvrd Helper
To keep up to date with the VRU Grand Challenge, please use https://github.com/NExTplusplus/VidVRD-helper
Atis dataset
The ATIS (Airline Travel Information System) Dataset
Recursive Cnns
Implementation of my paper "Real-time Document Localization in Natural Images by Recursive Application of a CNN."
Pointclouddatasets
3D point cloud datasets in HDF5 format, containing uniformly sampled 2048 points per shape.
Urbannavdataset
UrbanNav: an Open-Source Localization Dataset Collected in Asian Urban Canyons, Including Tokyo and Hong Kong
La3dm
Learning-aided 3D mapping
Facegrab
A tool to collect public images from Facebook and create an image dataset for training computer vision applications such as gender recognition and face detection
Pytorch Project Template
Deep Learning project template for PyTorch (Distributed Learning is supported)
Color Names
Large list of handpicked color names 🌈
Tju Dhd
A newly built high-resolution dataset for object detection and pedestrian detection (IEEE TIP 2020)
Sketchyscene
SketchyScene: Richly-Annotated Scene Sketches. (ECCV 2018)
Mmsa
CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotations of Modality (ACL2020)
Raccoon dataset
The dataset is used to train my own raccoon detector and I blogged about it on Medium
Covid19
JSON time-series of coronavirus cases (confirmed, deaths and recovered) per country - updated daily
Csvpack
csvpack library / gem - tools 'n' scripts for working with tabular data packages using comma-separated values (CSV) datafiles in text with meta info (that is, schema, datatypes, ..) in datapackage.json; download, read into and query CSV datafiles with your SQL database (e.g. SQLite, PostgreSQL, ...) of choice and much more
Toronto 3d
A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways
Icse Seip 2020 Replication Package
Replication package of the paper titled "How do you Architect your Robots? State of the Practice and Guidelines for ROS-based Systems" published at ICSE-SEIP 2020
Openpowerlifting
Read-Only Mirror of the OpenPowerlifting Project. Main Repo on GitLab.
Colour
Colour Science for Python
Extendedsumm
On Generating Extended Summaries of Long Documents
Legislator
Interface to the Comparative Legislators Database
Producttitlesummarizationcorpus
Dataset for CIKM 2018 paper "Multi-Source Pointer Network for Product Title Summarization"
Pysgs
📈 Python interface for the Brazilian Central Bank's Time Series Management System (SGS)
Dream
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension
Maskrcnn Modanet
A Mask R-CNN Keras implementation with Modanet annotations on the Paperdoll dataset
Char Rnn Tensorflow
Multi-layer Recurrent Neural Networks for character-level language models, implemented in TensorFlow
Stevens Vlp16 Dataset
This dataset was captured using a Velodyne VLP-16 mounted on a UGV (Clearpath Jackal) on the Stevens Institute of Technology campus
Geodata Br
Free open public domain geographic data of Brazil available in multiple languages and formats.
Animegan
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.
View Finding Network
A deep ranking network that learns to find good compositions in a photograph.
City Scapes Script
Download the Cityscapes dataset using this script
Covidnet Ct
COVID-Net Open Source Initiative - Models and Data for COVID-19 Detection in Chest CT
Fifa Fut Data
Web-scraping script that writes the data of all players from FutHead and FutBin to a CSV file or a DB
Coarij
Corpus of Annual Reports in Japan
Knyfe
knyfe is a Python utility for rapid exploration of datasets.
Codar
✅ CODAR is a framework built using PyTorch to analyze posts (text + media) and predict cyberbullying and offensive content. 💬📷
Covid 19
Novel Coronavirus 2019 time series data on cases
Images Web Crawler
This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename / resize / convert the images, and merge folders.
Courseraforums
Anonymized versions of the discussion threads from the forums of 60 Coursera MOOCs
Distil
💧 In memory dataset filtering, inspired by snikch/aggro
Chinesetrafficpolicepose
Detects Chinese traffic police command gestures
Mtnt
Code for the collection and analysis of the MTNT dataset
Multidigitmnist
Combine multiple MNIST digits to create datasets with 100/1000 classes for few-shot learning/meta-learning