
MedMNIST / Medmnist

Licence: apache-2.0

[ISBI'21] MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis



MedMNIST

ISBI'21 Paper | Project Page | Dataset

Jiancheng Yang, Rui Shi, Bingbing Ni, Bilian Ke

We present MedMNIST, a collection of 10 pre-processed open medical datasets. MedMNIST is standardized to perform classification tasks on lightweight 28 × 28 images, which requires no background knowledge. Covering the primary data modalities in medical image analysis, it is diverse in data scale (from 100 to 100,000) and tasks (binary/multi-class, ordinal regression and multi-label). MedMNIST can be used for educational purposes, rapid prototyping, multi-modal machine learning or AutoML in medical image analysis. Moreover, the MedMNIST Classification Decathlon is designed to benchmark AutoML algorithms on all 10 datasets.

For more details, please refer to our paper:

MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis (ISBI'21)

Key Features

Educational: Our multi-modal data, drawn from multiple open medical image datasets with Creative Commons (CC) Licenses, is easy to use for educational purposes.
Standardized: Data is pre-processed into the same format, so users need no background knowledge.
Diverse: The multi-modal datasets cover diverse data scales (from 100 to 100,000) and tasks (binary/multi-class, ordinal regression and multi-label).
Lightweight: The small size of 28 × 28 is friendly for rapid prototyping and for experimenting with multi-modal machine learning and AutoML algorithms.

Please note that this dataset is NOT intended for clinical use.

Code Structure

medmnist/:
- dataset.py: PyTorch datasets and dataloaders of MedMNIST.
- models.py: ResNet-18 and ResNet-50 models.
- evaluator.py: Standardized evaluation functions.
- info.py: Dataset information dict for each subset of MedMNIST.
train.py: The training and evaluation script to reproduce the baseline results in the paper.
getting_started.ipynb: Explore the MedMNIST dataset with jupyter notebook. It is ONLY intended for a quick exploration, i.e., it does not provide full training and evaluation functionalities (please refer to train.py instead).
setup.py: The script to install medmnist as a module

Requirements

The code requires only a common Python environment for machine learning. Basically, it was tested with

Python 3 (Anaconda 3.6.3 specifically)
PyTorch==0.3.1
numpy==1.18.5, pandas==0.25.3, scikit-learn==0.22.2, tqdm

Higher (or lower) versions should also work (perhaps with minor modifications).

Dataset

You can download the dataset(s) for free from any of the following sources:

zenodo.org (recommended): You can also use our code to download the datasets from zenodo.org automatically.
Google Drive
Baidu Netdisk (code: gx6i)

The dataset contains ten subsets, and each subset (e.g., pathmnist.npz) is comprised of train_images, train_labels, val_images, val_labels, test_images and test_labels.
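
As a minimal sketch of reading one subset with NumPy, the snippet below loads the six arrays named above from an .npz file. To stay runnable without a download, it first writes a small synthetic file; the helper names (load_subset, make_dummy_subset), the array shapes (28 × 28 grayscale images, labels of shape (N, 1)) and the dtypes are assumptions for illustration and may differ from the real subsets.

```python
import os
import tempfile

import numpy as np

# The six array names listed in the README for each subset file.
KEYS = [
    "train_images", "train_labels",
    "val_images", "val_labels",
    "test_images", "test_labels",
]


def load_subset(npz_path):
    """Return a dict mapping each of the six array names to its NumPy array."""
    with np.load(npz_path) as data:
        return {key: data[key] for key in KEYS}


def make_dummy_subset(npz_path, n_train=8, n_val=2, n_test=4):
    """Write a synthetic stand-in file (hypothetical shapes, NOT real data)."""
    rng = np.random.default_rng(0)
    arrays = {}
    for split, n in (("train", n_train), ("val", n_val), ("test", n_test)):
        # Assumed layout: uint8 grayscale 28 x 28 images, one label per image.
        arrays[f"{split}_images"] = rng.integers(
            0, 256, size=(n, 28, 28), dtype=np.uint8)
        arrays[f"{split}_labels"] = rng.integers(
            0, 2, size=(n, 1), dtype=np.uint8)
    np.savez(npz_path, **arrays)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "dummy_subset.npz")
    make_dummy_subset(path)
    subset = load_subset(path)
    for key in KEYS:
        print(key, subset[key].shape, subset[key].dtype)
```

To inspect a real subset, point load_subset at a downloaded file such as pathmnist.npz and print the shapes before making any assumptions about channels or label format.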

How to run the experiments

Download the dataset manually or automatically (by setting download=True in dataset.py).
[optional] Install medmnist as a module with the command python setup.py install
Run the demo script train.py in a terminal.

First, change to the directory that contains train.py. Then run python train.py --data_name xxxmnist --input_root input --output_root output --num_epoch 100 --download True, where xxxmnist is a subset of MedMNIST (e.g., pathmnist), input is the path to the data files, output is the folder in which to save the results, num_epoch is the number of training epochs, and download is a boolean indicating whether to download the dataset.

For instance, to run the experiments on PathMNIST:
```
python train.py --data_name pathmnist --input_root <path/to/input/folder> --output_root <path/to/output/folder> --num_epoch 100 --download True
```
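
The flags in the command above could be parsed along the following lines. This is a hypothetical sketch using argparse, not the actual contents of train.py: the defaults, the help strings and the str2bool helper are assumptions. The helper is included because passing type=bool to argparse would treat any non-empty string, including "False", as true.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'True' or 'False' to a real bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser for the five flags named in the README (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of a MedMNIST run configuration")
    parser.add_argument("--data_name", type=str, default="pathmnist",
                        help="subset name, e.g. pathmnist")
    parser.add_argument("--input_root", type=str, default="input",
                        help="folder containing the data files")
    parser.add_argument("--output_root", type=str, default="output",
                        help="folder in which to save the results")
    parser.add_argument("--num_epoch", type=int, default=100,
                        help="number of training epochs")
    parser.add_argument("--download", type=str2bool, default=False,
                        help="whether to download the dataset")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```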

Citation

If you find this project useful, please cite our paper as:

  Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis," arXiv preprint arXiv:2010.14925, 2020.

or using bibtex:

 @article{medmnist,
 title={MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis},
 author={Yang, Jiancheng and Shi, Rui and Ni, Bingbing},
 journal={arXiv preprint arXiv:2010.14925},
 year={2020}
 }

LICENSE

The code is under Apache-2.0 License.

The datasets are generally under Creative Commons (CC) Licenses; please refer to the project page for details.
