Top 765 dataset open source projects

Dbg Pds
Deutsche Boerse's Financial Trading Public Data Set
Ember Impagination
An Ember Addon that puts the fun back in asynchronous, paginated datasets
Onepiece Kg
A knowledge graph project for ONEPIECE (《海贼王》)
Gis Dataset Brasil
Geographic Information Systems (GIS) Dataset Brasil - a collection of ready-to-use shapefiles, GeoJSON, and TopoJSON
Openvehiclevision
An open-source library for vehicle vision applications (written in MATLAB), including lane marking detection and road segmentation
Awesome Robotics Datasets
A collection of useful datasets for robotics and computer vision
Networkdata
R package containing several network datasets
Scrna.seq.datasets
Collection of public scRNA-Seq datasets used by our group
Adresse.data.gouv.fr
The official website for French addresses
Text Segmentation
Implementation of the paper: Text Segmentation as a Supervised Learning Task
Datasets knowledge embedding
Datasets for Knowledge Graph Completion with textual information about the entities
Know Your Intent
State-of-the-art results in intent classification using semantic hashing on three datasets: AskUbuntu, Chatbot, and WebApplication.
Pglib Opf
Benchmarks for the Optimal Power Flow Problem
Protest Detection Violence Estimation
Implementation of the model used in the paper Protest Activity Detection and Perceived Violence Estimation from Social Media Images (ACM Multimedia 2017)
Aesthetics
Image Aesthetics Toolkit - includes Fisher Vector implementation, AVA (Image Aesthetic Visual Analysis) dataset and fast multi-threaded downloader
Iros20 6d Pose Tracking
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
Autoannotationtool
A labeling tool that aims to reduce semantic segmentation labeling time; rectangle and polygon annotation are supported
Stanet
official implementation of the spatial-temporal attention neural network (STANet) for remote sensing image change detection
Graph Parser
GraphParser is a semantic parser which can convert natural language sentences to logical forms and graphs.
Utbm robocar dataset
EU Long-term Dataset with Multiple Sensors for Autonomous Driving
Personalized Dialog
Code for the paper 'Personalization in Goal-oriented Dialog' (NeurIPS 2017 Conversational AI Workshop)
Impy
Impy is a Python3 library with features that help you in your computer vision tasks.
Imagenetv2
A new test set for ImageNet
Race ar baselines
Baselines of the RACE Reading Comprehension Dataset
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Chatgirl
ChatGirl is an AI chatbot based on the TensorFlow Seq2Seq model. It includes a preprocessed English Twitter dataset along with training, inference, and utility code. QQ group: 167122861
Wb srgb
White balance camera-rendered sRGB images (CVPR 2019) [Matlab & Python]
Iso 3166 Countries With Regional Codes
ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets
C3
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
Scientificsummarizationdatasets
Datasets I have created for scientific summarization, and a trained BertSum model
Objectron
Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
Cubicasa5k
CubiCasa5k floor plan dataset
Exposure correction
Reference code for the paper "Learning Multi-Scale Photo Exposure Correction", CVPR 2021.
Deepweeds
A Multiclass Weed Species Image Dataset for Deep Learning
Dataloaders
PyTorch and TensorFlow data loaders for several audio datasets
Bond
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Body reconstruction references
Paper, dataset and code collection on human body reconstruction
Persian Swear Words
A dataset of inappropriate and offensive Persian words for filtering text
Ml Pyxis
Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
Face landmark dnn
Face Landmark Detector based on MobileNet V1
Core50
CORe50: a new Dataset and Benchmark for Continual Learning
Eval Vislam
Toolkit for VI-SLAM evaluation.
Hands Detection
Hands video tracker using the TensorFlow Object Detection API and a Faster R-CNN model. The data used is the Hand Dataset from the University of Oxford.
Cesi
WWW 2018: CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information
Sigsep Mus Db
Python parser and tools for MUSDB18 Music Separation Dataset
Dataset List
Lists of text corpora and more (mainly Japanese)
Conmask
The ConMask model described in the paper Open-world Knowledge Graph Completion.
Keypointnet
KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (CVPR2020)
Ccpd
[ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition