Top 765 dataset open source projects

Watermarkreco
Pytorch implementation of the paper "Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach"
Letsgodataset
This repository makes the complete Let's Go dataset publicly available.
Covid Ctset
Large Covid-19 CT scans dataset from paper: https://doi.org/10.1101/2020.06.08.20121541
Qri
you're invited to a data party!
People Counting Dataset
A large-scale dataset for people counting (LOI counting)
Pts
Quantized Mesh Terrain Data Generator and Server for CesiumJS Library
Okutama Action
Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection
French Sentiment Analysis Dataset
A collection of over 1.5 million tweets translated to French, with their sentiment.
Multi Plier
An unsupervised transfer learning approach for rare disease transcriptomics
Wikisql
A large annotated semantic parsing corpus for developing natural language interfaces.
Rstudioconf tweets
🖥 A repository for tracking tweets about rstudio::conf
Elastic data
Elasticsearch datasets ready for bulk loading
Day night dataset list
A list of datasets with day and night annotations
Dns Lots Of Lookups
dnslol is a command line tool for performing lots of DNS lookups.
Feversymmetric
Symmetric evaluation set based on the FEVER (fact verification) dataset
Jsut Lab
HTS-style full-context labels for JSUT v1.1
Covid 19 Api
Covid-19 Virus Data API from Johns Hopkins CSSE
Tedsds
Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Company Names Corpus
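Company name corpus and organization name corpus: company short names, abbreviations, brand words, and enterprise names. Usable for Chinese word segmentation and organization named entity recognition.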
Khayyam
106 Omar Khayyam quatrains in YAML format.
Synthetic Computer Vision
A list of synthetic datasets and tools for computer vision
Facerank
FaceRank: rank faces with a CNN model based on TensorFlow (Keras version added). QQ group: 167122861. Technical support: http://tensorflow123.com
Learning Python Predictive Analytics
Tracking, notes and programming snippets while learning predictive analytics
Musical Onset Efficient
Supplementary information and code for the paper: An efficient deep learning model for musical onset detection
Mobius
C# and F# language binding and extensions to Apache Spark
Cophy
"CoPhy: Counterfactual Learning of Physical Dynamics", F. Baradel, N. Neverova, J. Mille, G. Mori, C. Wolf, ICLR'2020
Imagenetscraper
👁 Bulk-download all thumbnails from an ImageNet synset, with optional rescaling
Rdhs
API Client and Data Munging for the Demographic and Health Survey Data
Bgg Analysis
What makes a game a good game?
Covid Ct
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Osint collection
Maintained collection of OSINT related resources. (All Free & Actionable)
Safety Helmet Wearing Dataset
Safety helmet wearing detection dataset, with pretrained model
Awesome Face
😎 Face-related algorithms, datasets, and papers
Clusterdata
cluster data collected from production clusters in Alibaba for cluster management research
Cluener2020
CLUENER2020: Chinese fine-grained named entity recognition
Chatito
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
Wilayah Administratif Indonesia
Data on the provinces, cities/regencies, districts, and urban/rural villages of Indonesia
Person search
Joint Detection and Identification Feature Learning for Person Search
Proteinnet
Standardized data set for machine learning of protein structure
Devblogs
2600+ developer-related blogs and publications.
Uhttbarcodereference
Universe-HTT barcode reference
Esc 50
ESC-50: Dataset for Environmental Sound Classification
Awesome chinese medical nlp
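A curated collection of public resources for Chinese medical NLP: terminologies, corpora, word vectors, pretrained models, knowledge graphs, named entity recognition, QA, information extraction, models, papers, etc.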
Gensim Data
Data repository for pretrained NLP models and NLP corpora.
Couplet Dataset
Dataset for couplets: a database of 700,000 couplets.
Total Text Dataset
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.
Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Nas Bench 201
NAS-Bench-201 API and instructions