
kuanghuei / clean-net

License: other
TensorFlow source code for "CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise" (CVPR 2018)

Programming Languages

python

Projects that are alternatives of or similar to clean-net

Kashgari
Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT2 language embeddings.
Stars: ✭ 2,235 (+2498.84%)
Mutual labels:  transfer-learning
Chinese ulmfit
Chinese ULMFiT for sentiment analysis and text classification
Stars: ✭ 208 (+141.86%)
Mutual labels:  transfer-learning
Transfer learning tutorial
A guide to transfer learning with inception-resnet-v2.
Stars: ✭ 228 (+165.12%)
Mutual labels:  transfer-learning
Galaxy Image Classifier Tensorflow
Classify whether an image is of a Spiral or an Elliptical Galaxy using Transfer Learning (Tensorflow)
Stars: ✭ 179 (+108.14%)
Mutual labels:  transfer-learning
Seg Uncertainty
IJCAI2020 & IJCV 2020 🌇 Unsupervised Scene Adaptation with Memory Regularization in vivo
Stars: ✭ 202 (+134.88%)
Mutual labels:  transfer-learning
Deepfake Detection
Towards deepfake detection that actually works
Stars: ✭ 213 (+147.67%)
Mutual labels:  transfer-learning
Accel Brain Code
This repository collects prototypes written as case studies for proof-of-concept (PoC) and research-and-development (R&D) work described on my website. The main research topics are auto-encoders in relation to representation learning, statistical machine learning for energy-based models, generative adversarial networks (GANs), deep reinforcement learning such as deep Q-networks, semi-supervised learning, and neural network language models for natural language processing.
Stars: ✭ 166 (+93.02%)
Mutual labels:  transfer-learning
Clan
( CVPR2019 Oral ) Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation
Stars: ✭ 248 (+188.37%)
Mutual labels:  transfer-learning
Face.evolve.pytorch
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
Stars: ✭ 2,719 (+3061.63%)
Mutual labels:  transfer-learning
Gam
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).
Stars: ✭ 227 (+163.95%)
Mutual labels:  transfer-learning
Bert Sklearn
a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+111.63%)
Mutual labels:  transfer-learning
Imageatm
Image classification for everyone.
Stars: ✭ 201 (+133.72%)
Mutual labels:  transfer-learning
Dureader Bert
BERT for the DuReader multi-document machine reading comprehension task (ranked 7th)
Stars: ✭ 215 (+150%)
Mutual labels:  transfer-learning
Xfer
Transfer Learning library for Deep Neural Networks.
Stars: ✭ 177 (+105.81%)
Mutual labels:  transfer-learning
Deeppicar
Deep Learning Autonomous Car based on Raspberry Pi, SunFounder PiCar-V Kit, TensorFlow, and Google's EdgeTPU Co-Processor
Stars: ✭ 242 (+181.4%)
Mutual labels:  transfer-learning
Pytorch Retraining
Transfer Learning Shootout for PyTorch's model zoo (torchvision)
Stars: ✭ 167 (+94.19%)
Mutual labels:  transfer-learning
Transfer Learning Suite
Transfer Learning Suite in Keras. Perform transfer learning using any built-in Keras image classification model easily!
Stars: ✭ 212 (+146.51%)
Mutual labels:  transfer-learning
TA3N
[ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (-47.67%)
Mutual labels:  transfer-learning
Awesome Domain Adaptation
A collection of AWESOME things about domain adaptation
Stars: ✭ 3,357 (+3803.49%)
Mutual labels:  transfer-learning
Retrieval 2017 Cam
Class-Weighted Convolutional Features for Image Retrieval (BMVC 2017)
Stars: ✭ 219 (+154.65%)
Mutual labels:  transfer-learning

Introduction

This repository contains the source code of CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise (CVPR 2018, project page) from Microsoft AI and Research. The implementation is based on TensorFlow.

CleanNet

CleanNet is a joint neural embedding network for image classification in the presence of label noise and for label noise detection. To reduce the amount of human supervision required for label noise cleaning, it needs only a fraction of the classes to be manually verified; the knowledge of label noise learned from those classes is transferred to the remaining classes.

Model architecture

[Figure: CleanNet model architecture]

Selected examples of CleanNet predictions

“F” denotes the cosine similarity predicted by a model using verification labels in all classes. “D” denotes the cosine similarity under transfer learning (for Food-101N, 50 of the 101 classes, including ramen, garlic bread, and cheese plate, are excluded). Class names and verification labels are shown at the bottom-left.

[Figure: example predictions]
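In essence, label noise detection with CleanNet amounts to thresholding the cosine similarity between a query image embedding and its class-level embedding. A minimal sketch of that decision rule (function and variable names are illustrative; the threshold mirrors the --val_sim_thres flag used below):

import numpy as np

def is_label_correct(query_embedding, class_embedding, sim_threshold=0.1):
    """Treat a sample's label as correct when the cosine similarity between
    its embedding and the class-level embedding reaches the threshold."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = class_embedding / np.linalg.norm(class_embedding)
    return float(np.dot(q, c)) >= sim_threshold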

Citation

If you use the code in your paper, please cite it as:

@inproceedings{lee2017cleannet,
  title={CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise},
  author={Lee, Kuang-Huei and He, Xiaodong and Zhang, Lei and Yang, Linjun},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2018}
}

Requirements and Installation

  • A computer running macOS, Linux, or Windows
  • For training new models, you'll also need an NVIDIA GPU
  • tensorflow (1.6)
  • numpy
  • opencv-python
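For example, the dependencies can be installed with pip (versions are indicative; the code targets TensorFlow 1.x):

$ pip install tensorflow==1.6.0 numpy opencv-python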

Quick Start

Training a New Model

Prepare Data

CleanNet works on feature vectors. In the original paper, we use feature vectors extracted from the pool5 layer of pre-trained ResNet-50 models to represent images. Here, we assume that each image is represented by an h-dimensional feature vector.
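For reference, a rough sketch of extracting 2048-dimensional globally pooled ResNet-50 features with tf.keras and OpenCV (an illustration only; the exact feature extractor and preprocessing used in the paper may differ):

import cv2
import numpy as np
import tensorflow as tf

# Global-average-pooled ResNet-50 features (2048-d), analogous to pool5.
model = tf.keras.applications.ResNet50(include_top=False, weights='imagenet', pooling='avg')

def extract_feature(image_path):
    """Load an image with OpenCV and return a 2048-d feature vector."""
    image_bgr = cv2.imread(image_path)
    image_rgb = cv2.resize(image_bgr, (224, 224))[:, :, ::-1]  # BGR -> RGB
    batch = tf.keras.applications.resnet50.preprocess_input(
        image_rgb.astype(np.float32)[np.newaxis])
    return model.predict(batch)[0]  # shape: (2048,)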

For the training and validation sets with verification labels, you need to prepare a TSV file for each of them, where the columns are: [sample key, class name, verification label, h-dimensional feature delimited by ','] or [sample key, image url, class name, verification label, h-dimensional feature delimited by ','].

For all image samples, including those with and without verification labels, you need to prepare a TSV file, where the columns are: [sample key, class name, h-dimensional feature delimited by ','] or [sample key, image url, class name, h-dimensional feature delimited by ','].

You will also need to prepare a text file that lists the unique class names.
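As an illustration, rows of the three-column variant (all image samples, no verification label) could be written like this; the sample keys, class names, and helper name are hypothetical:

import numpy as np

def write_all_samples_tsv(path, samples):
    """samples: iterable of (sample_key, class_name, feature), where feature is a
    1-D numpy array of length h. Columns: key, class name, comma-delimited feature."""
    with open(path, 'w') as f:
        for key, class_name, feature in samples:
            feature_str = ','.join('%.6f' % v for v in feature)
            f.write('\t'.join([key, class_name, feature_str]) + '\n')

# Hypothetical usage with random 2048-d features:
# write_all_samples_tsv('all_samples.tsv',
#                       [('img_000001', 'ramen', np.random.rand(2048))])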

Data Pre-processing

Once you have the required data files, use util/convert_data.py to convert each set to a numpy array that can be consumed by CleanNet. For example:

Convert the training set TSV with verification labels to train.npy:
$ python util/convert_data.py --split=train --class_list=${CLASS_LIST} --data_path=${TRAIN} --output_dir=${DATA_DIR}

Convert the validation set TSV with verification labels to val.npy:
$ python util/convert_data.py --split=val --class_list=${CLASS_LIST} --data_path=${VAL} --output_dir=${DATA_DIR}

Convert all image samples to all.npy:
$ python util/convert_data.py --split=all --class_list=${CLASS_LIST} --data_path=${ALL_IMAGE_SAMPLES} --output_dir=${DATA_DIR}

Then use util/find_reference.py to find reference feature vectors for each category. For example:

$ python util/find_reference.py --class_list=${CLASS_LIST} --input_npy=${DATA_DIR}/all.npy --output_dir=${DATA_DIR} --num_ref=32 --img_dim=2048

Training and Validation

Use train.py to train CleanNet and run validation every n steps. Here is an example:

$ python train.py \
    --data_dir=${DATA_DIR} \
    --checkpoint_dir=${MODEL_DIR}/checkpoints/ \
    --log_dir=${MODEL_DIR}/log/ \
    --val_interval=500 \
    --batch_size_sup=32 \
    --batch_size_unsup=32 \
    --val_sim_thres=0.1 \
    --dropout_rate=0.2

Prediction

inference.py is used both for running validation and for making predictions.

Use inference.py to run validation once with a trained CleanNet model. Here is an example:

$ python inference.py \
    --data_dir=${DATA_DIR} \
    --class_list=${CLASS_LIST_FILE} \
    --output_file=${OUTPUT_FILE} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --mode=val \
    --val_sim_thres=0.2

Use inference.py to make predictions. It takes a TSV file as input, where the columns are: [sample key, class name, h-dimensional feature delimited by ','] or [sample key, image url, class name, h-dimensional feature delimited by ',']. The output is a cosine similarity value for each sample. Here is an example:

$ python inference.py \
    --image_feature_list=${IMAGE_FEATURE_LIST_FILE} \
    --class_list=${CLASS_LIST_FILE} \
    --output_file=${OUTPUT_FILE} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --mode=inference
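The predicted similarities can then be thresholded to flag likely mislabeled samples. A sketch, assuming each output line ends with the cosine similarity in its last tab-separated column (verify against the actual format written by inference.py):

def flag_noisy_samples(output_file, sim_threshold=0.1):
    """Return sample keys whose predicted cosine similarity falls below the
    threshold, i.e. samples whose labels are likely noisy."""
    noisy = []
    with open(output_file) as f:
        for line in f:
            cols = line.rstrip('\n').split('\t')
            if float(cols[-1]) < sim_threshold:
                noisy.append(cols[0])
    return noisy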

License

Licensed under the MSR-LA Full Rights License [see license.txt]
