chenpf1025 / noisy_label_understanding_utilizing

Licence: other
ICML 2019: Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

Programming Languages

python

Projects that are alternatives of or similar to noisy label understanding utilizing

Active-Passive-Losses
[ICML2020] Normalized Loss Functions for Deep Learning with Noisy Labels
Stars: ✭ 92 (+12.2%)
Mutual labels:  icml, noisy-labels
pySerialTransfer
Python package to transfer data in a fast, reliable, and packetized form
Stars: ✭ 78 (-4.88%)
Mutual labels:  robust
heltin
Robust client registry for individuals receiving mental healthcare services.
Stars: ✭ 18 (-78.05%)
Mutual labels:  robust
RobustGNSS
Robust GNSS Processing With Factor Graphs
Stars: ✭ 98 (+19.51%)
Mutual labels:  robust
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (-32.93%)
Mutual labels:  robust
robust-kalman
Robust Kalman filter with adaptive noise statistics estimation.
Stars: ✭ 89 (+8.54%)
Mutual labels:  robust
unicornn
Official code for UnICORNN (ICML 2021)
Stars: ✭ 21 (-74.39%)
Mutual labels:  icml
rest-ftp-daemon
A pretty simple but configurable and efficient FTP-client daemon, driven through a RESTful API, used by France Télévisions in production
Stars: ✭ 23 (-71.95%)
Mutual labels:  robust
imbalanced-regression
[ICML 2021, Long Talk] Delving into Deep Imbalanced Regression
Stars: ✭ 425 (+418.29%)
Mutual labels:  icml
NeuroAI
NeuroAI-UW seminar, a regular weekly seminar for the UW community, organized by NeuroAI Shlizerman Lab.
Stars: ✭ 36 (-56.1%)
Mutual labels:  icml
pairwiseComparisons
Pairwise comparison tests for one-way designs 🔬📝
Stars: ✭ 46 (-43.9%)
Mutual labels:  robust
rogme
Robust Graphical Methods For Group Comparisons
Stars: ✭ 69 (-15.85%)
Mutual labels:  robust
semi-supervised-NFs
Code for the paper Semi-Conditional Normalizing Flows for Semi-Supervised Learning
Stars: ✭ 23 (-71.95%)
Mutual labels:  icml
probnmn-clevr
Code for ICML 2019 paper "Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering" [long-oral]
Stars: ✭ 63 (-23.17%)
Mutual labels:  icml
FairAI
This is a collection of papers and other resources related to fairness.
Stars: ✭ 55 (-32.93%)
Mutual labels:  icml
Weather-Forecast
Weather Forecast is a simple app that shows you a weather. It comes with your standard features like your daily and hourly forecast along with access to additional information
Stars: ✭ 59 (-28.05%)
Mutual labels:  robust
stone paper scissor defeator using opencv keras
In this repository i tried to replicate a cool project by a japanese scientist who made a machine which had 100 % accuracy in defeating humans in the game of stone-paper and scissors
Stars: ✭ 22 (-73.17%)
Mutual labels:  robust
gmwm
Generalized Method of Wavelet Moments (GMWM) is an estimation technique for the parameters of time series models. It uses the wavelet variance in a moment matching approach that makes it particularly suitable for the estimation of certain state-space models.
Stars: ✭ 21 (-74.39%)
Mutual labels:  robust
ProSelfLC-2021
noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.
Stars: ✭ 45 (-45.12%)
Mutual labels:  noisy-labels
libdynamic
High performance utility library for C
Stars: ✭ 78 (-4.88%)
Mutual labels:  robust

Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

This is a Keras implementation of the paper Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels (Proceedings of ICML, 2019).

@inproceedings{chen2019understanding,
  title={Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels},
  author={Chen, Pengfei and Liao, Ben Ben and Chen, Guangyong and Zhang, Shengyu},
  booktitle={International Conference on Machine Learning},
  pages={1062--1070},
  year={2019}
}

Dependencies

Python 3.6.4, Keras 2.1.6, TensorFlow 1.7.0, NumPy, scikit-learn.

Please be aware of bugs caused by incompatibilities between Keras/TensorFlow versions. For example, in the callbacks passed to model.fit_generator, newer Keras versions use "val_accuracy" instead of "val_acc"; you may not get an explicit error, but the model may silently fail to be saved. Please check the Keras documentation carefully if you use a different version.
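One defensive way to handle the renamed metric is to pick the monitor key based on the installed Keras version. The helper below is our own sketch (not part of the repo):

```python
def val_acc_key(keras_version):
    """Return the validation-accuracy metric name for a Keras version string.

    Keras >= 2.3 renamed "val_acc" to "val_accuracy"; monitoring the wrong
    key makes callbacks such as ModelCheckpoint silently skip saving the
    model instead of raising an error.
    """
    major, minor = (int(x) for x in keras_version.split(".")[:2])
    return "val_accuracy" if (major, minor) >= (2, 3) else "val_acc"

# Usage with the version pinned in this repo (Keras 2.1.6):
#   import keras
#   checkpoint = keras.callbacks.ModelCheckpoint(
#       "best_model.h5", monitor=val_acc_key(keras.__version__),
#       save_best_only=True)
```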

Setup

To set up experiments, we need to download the CIFAR-10 data and extract it to:

data/cifar-10-batches-py

The code will automatically add noise to CIFAR-10 by randomly flipping original labels.
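The noise model can be sketched as follows. This is an illustrative implementation, not the repo's code: for the asymmetric case we flip to the next class index as a stand-in, whereas the repo uses CIFAR-10-specific class pairs (e.g. truck -> automobile):

```python
import numpy as np

def flip_labels(y, noise_ratio, num_classes=10, pattern="sym", seed=0):
    """Corrupt ground-truth labels y by randomly flipping a noise_ratio
    fraction of them.

    sym:  each selected label is flipped uniformly to one of the other
          num_classes - 1 classes.
    asym: each selected label is flipped to a fixed "similar" class
          (here (c + 1) % num_classes, purely for illustration).
    """
    rng = np.random.RandomState(seed)
    y_noisy = y.copy()
    flip = rng.rand(len(y)) < noise_ratio
    if pattern == "sym":
        # Offset 1..num_classes-1 guarantees the flipped label differs.
        offsets = rng.randint(1, num_classes, size=len(y))
        y_noisy[flip] = (y[flip] + offsets[flip]) % num_classes
    else:
        y_noisy[flip] = (y[flip] + 1) % num_classes
    return y_noisy
```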

Understanding noisy labels

Note

To quantitatively characterize the generalization performance of deep neural networks normally trained with noisy labels, we split the noisy dataset into two halves and perform cross-validation: training on one subset and testing on the other.

We first theoretically characterize, on the test set, the confusion matrix (w.r.t. ground-truth labels) and the test accuracy (w.r.t. noisy labels).

We then propose to select a test sample as clean if the trained model predicts the same label as its observed label. The selection is evaluated by label precision and label recall, which can be theoretically estimated from the noise ratio according to our paper.
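Under this selection rule, a sample is kept when the model's prediction agrees with its observed (noisy) label, and the selection is scored against the ground truth. A minimal sketch (the function name is ours):

```python
import numpy as np

def label_precision_recall(y_true, y_noisy, y_pred):
    """Select samples whose prediction matches the observed noisy label,
    then score the selection against the ground truth:

        label precision = |selected AND clean| / |selected|
        label recall    = |selected AND clean| / |clean|
    """
    selected = (y_pred == y_noisy)
    clean = (y_true == y_noisy)
    hits = (selected & clean).sum()
    lp = hits / max(selected.sum(), 1)
    lr = hits / max(clean.sum(), 1)
    return lp, lr
```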

Train

Experimental results justify our theoretical analysis. To reproduce them, run Verify_Theory.py and specify the noise pattern and noise ratio, e.g.,

  • Symmetric noise with ratio 0.5:

    python Verify_Theory.py --noise_pattern sym --noise_ratio 0.5

  • Asymmetric noise with ratio 0.4:

    python Verify_Theory.py --noise_pattern asym --noise_ratio 0.4

Results

Test accuracy, label precision, and label recall w.r.t. noise ratio on the manually corrupted CIFAR-10.

Confusion matrix M approximates the noise transition matrix T.
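The matrix M compared against T here is the row-normalized confusion matrix of the model's predictions w.r.t. the ground-truth labels, which can be computed as in this sketch (our own helper, not the repo's code):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Row-normalized confusion matrix w.r.t. ground-truth labels:
    M[i, j] = P(model predicts j | ground-truth class is i).
    The paper shows this M approximates the noise transition matrix T."""
    M = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    row_sums = M.sum(axis=1, keepdims=True)
    return M / np.maximum(row_sums, 1)
```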

Simply cleaning noisy datasets

Train

If you only want to use INCV to clean a noisy dataset, you can run INCV.py alone, e.g., on CIFAR-10 with

  • 50% symmetric noise:

    python INCV.py --noise_pattern sym --noise_ratio 0.5 --dataset cifar10

  • 40% asymmetric noise:

    python INCV.py --noise_pattern asym --noise_ratio 0.4 --dataset cifar10

The results will be saved in 'results/(dataset)/(noise_pattern)/(noise_ratio)/(XXX.csv)' with columns ('y', 'y_noisy', 'select', 'candidate', 'eval_ratio').
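The saved columns can be consumed like this to extract the selected clean subset (a sketch using the column names above; the concrete file name under results/ depends on the run, so the path in the usage comment is a placeholder):

```python
import csv

def load_selected(csv_path):
    """Read an INCV result file and return (row index, noisy label) pairs
    for the rows whose 'select' column marks them as clean."""
    selected = []
    with open(csv_path, newline="") as f:
        for idx, row in enumerate(csv.DictReader(f)):
            if int(float(row["select"])) == 1:
                selected.append((idx, int(float(row["y_noisy"]))))
    return selected

# Usage (placeholder path):
#   clean_subset = load_selected("results/cifar10/sym/0.5/XXX.csv")
```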

Results

Label precision and label recall on the manually corrupted CIFAR-10.

Our INCV accurately identifies most clean samples. For example, under symmetric noise of ratio 0.5, it selects about 90% (=LR) of the clean samples, and the noise ratio of the selected set is reduced to around 10% (=1−LP).

Cleaning noisy datasets and robustly training deep neural networks

Note

We present the Iterative Noisy Cross-Validation (INCV) to select a subset of clean samples, then modify the Co-teaching strategy to train noise-robust deep neural networks.
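The co-teaching idea the training stage builds on can be sketched as follows: within each batch, each of two networks keeps its small-loss samples and hands them to its peer for the peer's update. This is a hedged sketch of the selection step only; the names are ours, not the repo's:

```python
import numpy as np

def small_loss_indices(losses, forget_rate):
    """Co-teaching selection step: keep the (1 - forget_rate) fraction of
    the batch with the smallest per-sample loss. Small-loss samples are
    treated as likely clean; the kept indices are handed to the peer
    network for its gradient update."""
    keep = int(round(len(losses) * (1.0 - forget_rate)))
    return np.argsort(losses)[:keep]

# One co-teaching round (pseudocode around the selection step):
#   idx_for_net2 = small_loss_indices(per_sample_loss(net1, batch), forget_rate)
#   idx_for_net1 = small_loss_indices(per_sample_loss(net2, batch), forget_rate)
#   update net1 on batch[idx_for_net1]; update net2 on batch[idx_for_net2]
```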

Train

E.g., use our method to train on CIFAR-10 with

  • 50% symmetric noise:

    python INCV_main.py --noise_pattern sym --noise_ratio 0.5 --dataset cifar10

  • 40% asymmetric noise:

    python INCV_main.py --noise_pattern asym --noise_ratio 0.4 --dataset cifar10

Results

Average test accuracy (%, 5 runs) with standard deviation:

Method         Sym. 0.2     Sym. 0.5      Sym. 0.8     Asym. 0.4
F-correction   85.08±0.43   76.02±0.19    34.76±4.53   83.55±2.15
Decoupling     86.72±0.32   79.31±0.62    36.90±4.61   75.27±0.83
Co-teaching    89.05±0.32   82.12±0.59    16.21±3.02   84.55±2.81
MentorNet      88.36±0.46   77.10±0.44    28.89±2.29   77.33±0.79
D2L            86.12±0.43   67.39±13.62   10.02±0.04   85.57±1.21
Ours           89.71±0.18   84.78±0.33    52.27±3.50   86.04±0.54

Average test accuracy (%, 5 runs) during training:

Cite

Please cite our paper if you use this code in your research work.

Questions/Bugs

Please submit a GitHub issue or contact [email protected] if you have any questions or find any bugs.
