Datasets - TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Retriever - Quickly download, clean up, and install public datasets into a database management system
Datasets - source{d} datasets ("big code") for source code analysis and machine learning on source code
Zr Obp - Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
Aidl kb - A Knowledge Base for the FB Group Artificial Intelligence and Deep Learning (AIDL)
Ner Datasets - Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English)
Mola - A Modular Optimization framework for Localization and mApping (MOLA)
Indonlu - The first large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code. (AACL-IJCNLP 2020)
Datasaurus - R Package 📦 Containing the Datasaurus Dozen datasets 📊
Corus - Links to Russian corpora + Python functions for loading and parsing
Awesome Nlp Polish - A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.
Idenprof - The IdenProf dataset is a collection of images of identifiable professionals. It has been collected to enable the development of AI systems that can identify people and the nature of their jobs simply by looking at an image, just as humans can.
Pins - Pin, Discover and Share Resources
Gekko Datasets - Gekko Trading Bot dataset dumps. Ready to use and download history files in SQLite format.
Pix2code - pix2code: Generating Code from a Graphical User Interface Screenshot
Remo Python - 🐰 Python lib for remo - the app for annotations and images management in Computer Vision
Pipedream - Connect APIs, remarkably fast. Free for developers.
Multi object datasets - Multi-object image datasets with ground-truth segmentation masks and generative factors.
Bird Recognition Review - A list of useful resources for bird sound (song and call) recognition, such as datasets, papers, and links to open source projects and competitions
Aesthetics - Image Aesthetics Toolkit - includes Fisher Vector implementation, AVA (Image Aesthetic Visual Analysis) dataset and fast multi-threaded downloader
Firstcoursenetworkscience - Tutorials, datasets, and other material associated with textbook "A First Course in Network Science" by Menczer, Fortunato & Davis
Cholera - R Package for Analyzing John Snow's 1854 Cholera Map
Chineseglue - Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
Wb srgb - White balance camera-rendered sRGB images (CVPR 2019) [Matlab & Python]
Codesearchnet - Datasets, tools, and benchmarks for representation learning of code.
Transitland Datastore - Transitland's centralized web service API for both querying and editing aggregated transit data from around the world
Exposure correction - Reference code for the paper "Learning Multi-Scale Photo Exposure Correction", CVPR 2021.
Doppelganger - [IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Crossweigh - CrossWeigh: Training Named Entity Tagger from Imperfect Annotations
Openml R - R package to interface with OpenML
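Gopup - Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima (high-growth) and unicorn companies; Xinwen Lianbo news broadcast transcripts; film and TV box office data; university lists; epidemic data…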
Atis dataset - The ATIS (Airline Travel Information System) Dataset
Coco Annotator - ✏️ Web-based image segmentation tool for object detection, localization, and keypoints
Colour - Colour Science for Python
Personas - Datasets for Deep learning Personas
Awesome Earth Artificial Intelligence - A curated list of Earth Science's Artificial Intelligence (AI) tutorials, notebooks, software, datasets, courses, books, video lectures and papers. Contributions most welcome.
Pytorch Cpp - C++ Implementation of PyTorch Tutorials for Everyone
Healthcheck - Health Check ✔ is a Machine Learning Web Application built with Flask that can predict three main diseases: diabetes, heart disease, and cancer.
Commons - ⛲️ Commons Marketplace client & server to explore, download, and publish open data sets in the Ocean Protocol Network.