Voice datasets: 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
Stars: ✭ 494 (+1800%)
Label Studio: Label Studio is a multi-type data labeling and annotation tool with standardized output format
Stars: ✭ 7,264 (+27838.46%)
Awesome Json Datasets: A curated list of awesome JSON datasets that don't require authentication.
Stars: ✭ 2,421 (+9211.54%)
Openml R: R package to interface with OpenML
Stars: ✭ 81 (+211.54%)
Aesthetics: Image Aesthetics Toolkit - includes Fisher Vector implementation, AVA (Image Aesthetic Visual Analysis) dataset and fast multi-threaded downloader
Stars: ✭ 113 (+334.62%)
Wb srgb: White balance camera-rendered sRGB images (CVPR 2019) [Matlab & Python]
Stars: ✭ 101 (+288.46%)
Doccano: Open source annotation tool for machine learning practitioners.
Stars: ✭ 5,600 (+21438.46%)
Colour: Colour Science for Python
Stars: ✭ 1,131 (+4250%)
Atis dataset: The ATIS (Airline Travel Information System) Dataset
Stars: ✭ 81 (+211.54%)
Retriever: Quickly download, clean up, and install public datasets into a database management system
Stars: ✭ 241 (+826.92%)
Elyra: Elyra extends JupyterLab Notebooks with an AI centric approach.
Stars: ✭ 839 (+3126.92%)
Meglass: An eyeglass face dataset collected and cleaned for face recognition evaluation, CCBR 2018.
Stars: ✭ 281 (+980.77%)
recurrent-defocus-deblurring-synth-dual-pixel: Reference github repository for the paper "Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data". We propose a procedure to generate realistic DP data synthetically. Our synthesis approach mimics the optical image formation found on DP sensors and can be applied to virtual scenes rendered with standard computer software. Lev…
Stars: ✭ 30 (+15.38%)
Datasets: source{d} datasets ("big code") for source code analysis and machine learning on source code
Stars: ✭ 231 (+788.46%)
Exposure correction: Reference code for the paper "Learning Multi-Scale Photo Exposure Correction", CVPR 2021.
Stars: ✭ 98 (+276.92%)
Persian Swear Words: A dataset of inappropriate and offensive Persian words for filtering text
Stars: ✭ 95 (+265.38%)
Datasets: TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Stars: ✭ 3,094 (+11800%)
craft-text-detector: Packaged, PyTorch-based, easy to use, cross-platform version of the CRAFT text detector
Stars: ✭ 151 (+480.77%)
HJDataset: A Large Dataset of Historical Japanese Documents with Complex Layouts
Stars: ✭ 19 (-26.92%)
HAR: Recognize one of six human activities such as standing, sitting, and walking using a Softmax Classifier trained on mobile phone sensor data.
Stars: ✭ 18 (-30.77%)
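QPT: [In closed beta] A forward-style quick packaging tool for Python environments that quickly packages Python into an EXE and adds support for CUDA, NoAVX, and more.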
Stars: ✭ 308 (+1084.62%)
user quality: Dataset for Software Evolution and Quality Improvement
Stars: ✭ 27 (+3.85%)
tracing-vs-freehand: Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)
Stars: ✭ 21 (-19.23%)
MaskedFaceRepresentation: Masked face recognition focuses on identifying people using their facial features while they are wearing masks. We introduce benchmarks on face verification based on masked face images for the development of COVID-safe protocols in airports.
Stars: ✭ 17 (-34.62%)
capsulecd: Continuous Delivery for automating package releases (npm, cookbooks, gems, pip, jars, etc)
Stars: ✭ 96 (+269.23%)
mxmortalitydb: A data-only R package containing all injury intent deaths registered in Mexico from 2004 to 2019
Stars: ✭ 20 (-23.08%)
ACVR2017: An Innovative Salient Object Detection Using Center-Dark Channel Prior
Stars: ✭ 20 (-23.08%)
intro-to-python: An Introduction to Programming in Python
Stars: ✭ 57 (+119.23%)
astrodash: Deep learning for the automated spectral classification of supernovae
Stars: ✭ 25 (-3.85%)
OTT-QA: Code and Data for ICLR2021 Paper "Open Question Answering over Tables and Text"
Stars: ✭ 92 (+253.85%)
TVQAplus: [ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
Stars: ✭ 99 (+280.77%)
squad-v1.1-pt: Portuguese translation of the SQuAD dataset
Stars: ✭ 13 (-50%)
climateR: An R 📦 for getting point and gridded climate data by AOI
Stars: ✭ 93 (+257.69%)
podium: Podium: a framework agnostic Python NLP library for data loading and preprocessing
Stars: ✭ 55 (+111.54%)
newsletter-archive: Markdown archive & RSS/Atom feeds for Data Is Plural.
Stars: ✭ 65 (+150%)
AITQA: Resources for the IBM Airlines Table-Question-Answering Benchmark
Stars: ✭ 12 (-53.85%)
BugZoo: Keep your bugs contained. A platform for studying historical software bugs.
Stars: ✭ 49 (+88.46%)
mindware: An efficient open-source AutoML system for automating the machine learning lifecycle, including feature engineering, neural architecture search, and hyper-parameter tuning.
Stars: ✭ 34 (+30.77%)
disent: 🧶 Modular VAE disentanglement framework for python built with PyTorch Lightning ▸ Including metrics and datasets ▸ With strongly supervised, weakly supervised and unsupervised methods ▸ Easily configured and run with Hydra config ▸ Inspired by disentanglement_lib
Stars: ✭ 41 (+57.69%)
covid-19-data-cleanup: Scripts to clean up data from https://github.com/CSSEGISandData/COVID-19
Stars: ✭ 25 (-3.85%)
TSForecasting: This repository contains the implementations for experiments on a set of publicly available datasets used in time series forecasting research.
Stars: ✭ 53 (+103.85%)
dplace-data: The data repository for the D-PLACE Project (Database of Places, Language, Culture and Environment)
Stars: ✭ 49 (+88.46%)
covid19-data-greece: Datasets and analysis of Novel Coronavirus (COVID-19) outbreak in Greece
Stars: ✭ 16 (-38.46%)
discodo: Enhanced Audio Player for Discord
Stars: ✭ 41 (+57.69%)
arpwitch: A modern arpwatch replacement with JSON formatted outputs and easy options to exec commands when network changes are observed.
Stars: ✭ 20 (-23.08%)