chenpf1025 / IDN

Licence: other
AAAI 2021: Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise

Programming Languages

python

Projects that are alternatives of or similar to IDN

Active-Passive-Losses
[ICML2020] Normalized Loss Functions for Deep Learning with Noisy Labels
Stars: ✭ 92 (+338.1%)
Mutual labels:  noisy-data, label-noise, noisy-labels, robust-learning
Advances-in-Label-Noise-Learning
A curated (most recent) list of resources for Learning with Noisy Labels
Stars: ✭ 360 (+1614.29%)
Mutual labels:  noisy-data, label-noise, noisy-labels, robust-learning
noisy label understanding utilizing
ICML 2019: Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
Stars: ✭ 82 (+290.48%)
Mutual labels:  noisy-labels
ProSelfLC-2021
noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.
Stars: ✭ 45 (+114.29%)
Mutual labels:  noisy-labels
FactorGraph.jl
The FactorGraph package provides the set of different functions to perform inference over the factor graph with continuous or discrete random variables using the belief propagation algorithm.
Stars: ✭ 17 (-19.05%)
Mutual labels:  noisy-data
Noisy-Labels-with-Bootstrapping
Keras implementation of Training Deep Neural Networks on Noisy Labels with Bootstrapping, Reed et al. 2015
Stars: ✭ 22 (+4.76%)
Mutual labels:  noisy-labels
wrench
WRENCH: Weak supeRvision bENCHmark
Stars: ✭ 185 (+780.95%)
Mutual labels:  robust-learning
clean-net
Tensorflow source code for "CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise" (CVPR 2018)
Stars: ✭ 86 (+309.52%)
Mutual labels:  label-noise
C2D
PyTorch implementation of "Contrast to Divide: self-supervised pre-training for learning with noisy labels"
Stars: ✭ 59 (+180.95%)
Mutual labels:  noisy-labels
Cleanlab
The standard package for machine learning with noisy labels, finding mislabeled data, and uncertainty quantification. Works with most datasets and models.
Stars: ✭ 2,526 (+11928.57%)
Mutual labels:  noisy-data
NLNL-Negative-Learning-for-Noisy-Labels
NLNL: Negative Learning for Noisy Labels
Stars: ✭ 70 (+233.33%)
Mutual labels:  noisy-labels

Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise.

This is the official repository for the AAAI 2021 paper Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise. One of the contributions of the paper is to provide rigorous motivation for studying instance-dependent label noise.

@inproceedings{chen2021beyond,
  title={Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise.},
  author={Chen, Pengfei and Ye, Junjie and Chen, Guangyong and Zhao, Jingwei and Heng, Pheng-Ann},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2021}
}

0. Requirements

  • python 3.6+
  • torch 1.2+

1. Instance-Dependent Noise (IDN)

1.1. Noisy labels used in this paper

In our experiments, we generated instance-dependent noisy labels for MNIST and CIFAR-10. Here we release the related files:

data/CIFAR10/label_noisy/dependent0.1.csv
data/CIFAR10/label_noisy/dependent0.2.csv
data/CIFAR10/label_noisy/dependent0.3.csv
data/CIFAR10/label_noisy/dependent0.4.csv
data/MNIST/label_noisy/dependent0.1.csv
data/MNIST/label_noisy/dependent0.2.csv
data/MNIST/label_noisy/dependent0.3.csv
data/MNIST/label_noisy/dependent0.4.csv

If you are developing novel methods, you are encouraged to use these files for a fair comparison with the results reported in our paper. The row index in each .csv file is consistent with the default sample order of the torchvision dataset. For example, to get a CIFAR-10 dataset with 40% IDN, you can use the following snippet in your code.

import pandas as pd
from torchvision import datasets

# Load the standard CIFAR-10 training set, then overwrite its labels with 40% IDN.
train_dataset_noisy = datasets.CIFAR10(root, train=True, download=True, transform=transform)
targets_noisy = list(pd.read_csv('./data/CIFAR10/label_noisy/dependent0.4.csv')['label_noisy'].values.astype(int))
train_dataset_noisy.targets = targets_noisy
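
As a quick sanity check, you can compare the noisy labels against the clean torchvision targets to confirm the realized noise fraction. The snippet below is a minimal sketch that reuses the root variable from the code above.

import pandas as pd
from torchvision import datasets

# Clean CIFAR-10 training labels from torchvision (same index order as the .csv file).
clean_targets = datasets.CIFAR10(root, train=True, download=True).targets
noisy_targets = pd.read_csv('./data/CIFAR10/label_noisy/dependent0.4.csv')['label_noisy'].values.astype(int)

# Fraction of flipped labels; for dependent0.4.csv this should be close to 0.40.
noise_rate = sum(int(c != n) for c, n in zip(clean_targets, noisy_targets)) / len(clean_targets)
print('Realized noise rate: %.4f' % noise_rate)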

To get an MNIST dataset with 40% IDN, you can use the following snippet in your code.

import pandas as pd
import torch
from torchvision import datasets

# Load the standard MNIST training set, then overwrite its labels with 40% IDN.
# MNIST stores targets as a tensor, so the noisy labels are converted accordingly.
train_dataset_noisy = datasets.MNIST(root, train=True, download=True, transform=transform)
targets_noisy = torch.Tensor(pd.read_csv('./data/MNIST/label_noisy/dependent0.4.csv')['label_noisy'].values.astype(int))
train_dataset_noisy.targets = targets_noisy
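
Either dataset can then be fed to a standard PyTorch DataLoader for training. The sketch below is only illustrative; the batch size and number of workers are placeholders, not the settings used in the paper.

from torch.utils.data import DataLoader

# Wrap the dataset carrying the noisy targets in a standard DataLoader.
train_loader = DataLoader(train_dataset_noisy, batch_size=128, shuffle=True, num_workers=2)

for images, noisy_targets in train_loader:
    # Train the model against the instance-dependent noisy targets here.
    pass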

1.2. Synthesizing IDN

If you prefer to synthesize IDN yourself, e.g., to synthesize 45% IDN for CIFAR-10, you can use the following command.

python cifar10_gen_dependent.py --noise_rate 0.45 --gen

The command trains a model on clean CIFAR-10, computes the averaged softmax output, and then synthesizes IDN. After running the command for the first time, the averaged softmax output is saved, so you can directly generate IDN at any other ratio by loading it, e.g.,

python cifar10_gen_dependent.py --noise_rate 0.35 --gen --load
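
Intuitively, the averaged softmax output indicates how confusable each sample is, so harder samples are more likely to receive a wrong label. The sketch below only illustrates this idea under simplified assumptions (flip the samples with the lowest averaged confidence in their clean class to their most confusable incorrect class); it is not the exact procedure implemented in cifar10_gen_dependent.py, and the function and argument names are hypothetical.

import numpy as np

def synthesize_idn(avg_softmax, clean_labels, noise_rate):
    """Illustrative IDN synthesis from averaged softmax outputs.

    avg_softmax: (N, C) array of per-sample softmax outputs averaged during training.
    clean_labels: (N,) integer array of clean labels.
    noise_rate: target fraction of labels to flip.
    """
    n = len(clean_labels)
    # Confidence of each sample in its clean class; low confidence = easily confused.
    true_conf = avg_softmax[np.arange(n), clean_labels]
    # Flip the least-confident samples so that roughly noise_rate * n labels change.
    flip_idx = np.argsort(true_conf)[: int(noise_rate * n)]

    noisy_labels = clean_labels.copy()
    for i in flip_idx:
        probs = avg_softmax[i].copy()
        probs[clean_labels[i]] = 0.0             # exclude the clean class
        noisy_labels[i] = int(np.argmax(probs))  # flip to the most confusable class
    return noisy_labels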

If you need to write a script to synthesize IDN for a new dataset, you can refer to the files mnist_gen_dependent.py and cifar10_gen_dependent.py.

2. Combating IDN using SEAL

2.1. MNIST

For SEAL, we use 10 iterations. We can run the commands one-by-one as follows.

python train_mnist.py --noise_rate 0.2 --SEAL 0 --save
python train_mnist.py --noise_rate 0.2 --SEAL 1 --save
...
python train_mnist.py --noise_rate 0.2 --SEAL 10 --save

The initial iteration is equivalent to training using the cross-entropy (CE) loss. To run experiments on different noise fractions, we can choose --noise_rate in {0.1,0.2,0.3,0.4}.
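
If you prefer not to launch the eleven runs by hand, a small driver script can run them back to back. The following is a minimal sketch using subprocess, with the flags taken from the commands above.

import subprocess

# Run SEAL iterations 0..10 for MNIST with 20% IDN; iteration 0 is plain CE training.
for iteration in range(11):
    subprocess.run(
        ['python', 'train_mnist.py', '--noise_rate', '0.2', '--SEAL', str(iteration), '--save'],
        check=True,
    )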

2.2. CIFAR-10

For SEAL, we use 3 iterations. We can run the commands one-by-one as follows.

python train_cifar10.py --noise_rate 0.2 --SEAL 0 --save
python train_cifar10.py --noise_rate 0.2 --SEAL 1 --save
python train_cifar10.py --noise_rate 0.2 --SEAL 2 --save
python train_cifar10.py --noise_rate 0.2 --SEAL 3 --save

The initial iteration is equivalent to training using the cross-entropy (CE) loss. To run experiments on different noise fractions, we can choose --noise_rate in {0.1,0.2,0.3,0.4}.
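
Conceptually, each SEAL iteration retrains the network against soft labels obtained by averaging its softmax predictions for every training sample over the epochs of the previous iteration; see the paper and train_cifar10.py for the exact procedure. The sketch below only illustrates this averaging step; the function name and the index-yielding data loader are assumptions, not part of the released code.

import torch

@torch.no_grad()
def accumulate_soft_labels(model, loader, soft_labels, num_epochs):
    """Add one epoch's softmax predictions into the running-average soft labels.

    soft_labels is an (N, C) tensor; calling this once per epoch for num_epochs
    epochs leaves it holding the per-sample average prediction, which can serve
    as the training target for the next SEAL iteration.
    """
    model.eval()
    for images, _, indices in loader:  # the loader is assumed to also yield sample indices
        probs = torch.softmax(model(images), dim=1)
        soft_labels[indices] += probs / num_epochs
    return soft_labels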

2.3. Clothing1M

By default, the training requires 4 GPUs. For SEAL, we use 3 iterations. We can run the commands one-by-one as follows.

python train_clothing.py --SEAL 0 --save
python train_clothing.py --SEAL 1 --save
python train_clothing.py --SEAL 2 --save
python train_clothing.py --SEAL 3 --save

The initial iteration is equivalent to training using the cross-entropy (CE) loss.

To run SEAL on top of DMI, we first use the official implementation of DMI to obtain a model, and then use the following commands one-by-one.

python train_clothing_dmi.py --SEAL 1 --save
python train_clothing_dmi.py --SEAL 2 --save
python train_clothing_dmi.py --SEAL 3 --save