MLLabelUtils.jlUtility package for working with classification targets and label-encodings
Stars: ✭ 30 (-45.45%)
11K-HandsTwo-stream CNN for gender classification and biometric identification using a dataset of 11K hand images.
Stars: ✭ 44 (-20%)
kaggle-codeA repository for some of the code I used in kaggle data science & machine learning tasks.
Stars: ✭ 100 (+81.82%)
TextDatasetCleaner🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (-50.91%)
ml-datasets🌊 Machine learning dataset loaders for testing and example scripts
Stars: ✭ 40 (-27.27%)
let-it-beA dataset on the mental health status of China's higher-education population
Stars: ✭ 28 (-49.09%)
databrewerThe missing datasets manager. Like Homebrew but for datasets. A CLI tool to search and discover datasets!
Stars: ✭ 39 (-29.09%)
oxygenjsThis is a JavaScript library for numerical JavaScript and machine learning
Stars: ✭ 13 (-76.36%)
parlitoolsA collection of useful tools for UK politics
Stars: ✭ 22 (-60%)
tweets-preprocessorRepo containing the Twitter preprocessor module, developed by the AUTH OSWinds team
Stars: ✭ 26 (-52.73%)
sparklanesA lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-69.09%)
BrainPrepPreprocessing pipeline on Brain MR Images through FSL and ANTs, including registration, skull-stripping, bias field correction, enhancement and segmentation.
Stars: ✭ 107 (+94.55%)
ColdStorageLightweight data loading and caching library for android
Stars: ✭ 39 (-29.09%)
farabio🤖 PyTorch toolkit for biomedical imaging ❤️
Stars: ✭ 48 (-12.73%)
download audioset📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Stars: ✭ 53 (-3.64%)
covid19-datasetsA list of high quality open datasets for COVID-19 data analysis
Stars: ✭ 56 (+1.82%)
Few-Shot-Intent-DetectionFew-Shot-Intent-Detection includes popular challenging intent detection datasets with/without OOS queries and state-of-the-art baselines and results.
Stars: ✭ 63 (+14.55%)
pywedgeMakes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking
Stars: ✭ 49 (-10.91%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+25118.18%)
systematic-review-datasetsA collection of fully labeled systematic review datasets (title-abstract screening)
Stars: ✭ 25 (-54.55%)
AIODriveOfficial Python/PyTorch Implementation for "All-In-One Drive: A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds"
Stars: ✭ 32 (-41.82%)
Dataset-Sentimen-Analisis-Bahasa-IndonesiaThis repository is a collection of datasets for Indonesian-language sentiment analysis. If you use the datasets in this repository for research, please cite the journal article associated with each dataset. The available datasets have been used in several studies whose results have been published…
Stars: ✭ 38 (-30.91%)
SER-datasetsA collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (+34.55%)
PharmacoDBSearch across publicly available datasets to find instances where a drug or cell line of interest has been profiled.
Stars: ✭ 38 (-30.91%)
postcss-eachPostCSS plugin to iterate through values
Stars: ✭ 93 (+69.09%)
traj-pred-irlOfficial implementation codes of "Regularizing neural networks for future trajectory prediction via IRL framework"
Stars: ✭ 23 (-58.18%)
napkinXCExtremely simple and fast extreme multi-class and multi-label classifiers.
Stars: ✭ 38 (-30.91%)
dplace-dataThe data repository for the D-PLACE Project (Database of Places, Language, Culture and Environment)
Stars: ✭ 49 (-10.91%)
HINT3This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020's Insights Workshop https://insights-workshop.github.io/ Preprint for the paper is available here https://arxiv.org/abs/2009.13833
Stars: ✭ 27 (-50.91%)
panoptic partsThis repository contains code and tools for reading, processing, evaluating on, and visualizing Panoptic Parts datasets. Moreover, it contains code for reproducing our CVPR 2021 paper results.
Stars: ✭ 82 (+49.09%)
Text-Summarization-RepoA repository that organizes the main research topics, must-read papers, and available models and data in the field of text summarization, together with recommended resources.
Stars: ✭ 213 (+287.27%)
dropEstPipeline for initial analysis of droplet-based single-cell RNA-seq data
Stars: ✭ 71 (+29.09%)
Three-Filters-to-NormalThree-Filters-to-Normal: An Accurate and Ultrafast Surface Normal Estimator (RAL+ICRA'21)
Stars: ✭ 41 (-25.45%)
veridical-flowMaking it easier to build stable, trustworthy data-science pipelines.
Stars: ✭ 28 (-49.09%)
masaderThe largest public catalogue for Arabic NLP and speech datasets. There are 250+ datasets annotated with more than 25 attributes.
Stars: ✭ 66 (+20%)
TSForecastingThis repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.
Stars: ✭ 53 (-3.64%)
Clustering-DatasetsThis repository contains the collection of UCI (real-life) datasets and Synthetic (artificial) datasets (with cluster labels and MATLAB files) ready to use with clustering algorithms.
Stars: ✭ 189 (+243.64%)
awesome-sweden-datasetsA curated list of awesome datasets to use when coding for the Swedish market.
Stars: ✭ 17 (-69.09%)
akshareAKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Stars: ✭ 5,155 (+9272.73%)
ck-envCK repository with components and automation actions to enable portable workflows across diverse platforms including Linux, Windows, MacOS and Android. It includes software detection plugins and meta packages (code, data sets, models, scripts, etc) with the possibility of multiple versions to co-exist in a user or system environment:
Stars: ✭ 67 (+21.82%)
cmip6 preprocessingAnalysis ready CMIP6 data in python the easy way with pangeo tools.
Stars: ✭ 126 (+129.09%)
awesome-forests🌳 A curated list of ground-truth forest datasets for the machine learning and forestry community.
Stars: ✭ 111 (+101.82%)
ml4seA curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering
Stars: ✭ 46 (-16.36%)
datasetdataset is a command line tool, Go package, shared library and Python package for working with JSON objects as collections
Stars: ✭ 21 (-61.82%)
PharmacoGxR package to analyze large-scale pharmacogenomic datasets.
Stars: ✭ 42 (-23.64%)
SeqToolsA python library to manipulate and transform indexable data (lists, arrays, ...)
Stars: ✭ 42 (-23.64%)
skippaSciKIt-learn Pipeline in PAndas
Stars: ✭ 33 (-40%)
disent🧶 Modular VAE disentanglement framework for python built with PyTorch Lightning ▸ Including metrics and datasets ▸ With strongly supervised, weakly supervised and unsupervised methods ▸ Easily configured and run with Hydra config ▸ Inspired by disentanglement_lib
Stars: ✭ 41 (-25.45%)
preprocess-conll05Scripts for preprocessing the CoNLL-2005 SRL dataset.
Stars: ✭ 17 (-69.09%)
opendatasetsA Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
Stars: ✭ 161 (+192.73%)
RData.jlRead R data files from Julia
Stars: ✭ 49 (-10.91%)
allie🤖 A machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers).
Stars: ✭ 93 (+69.09%)