hi-zhengcheng / multi-label-classification

Licence: other
machine-learning tensorflow multi-label-classification

Programming Languages

python

Projects that are alternatives of or similar to multi-label-classification

MCAR
Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition
Stars: ✭ 32 (+18.52%)
Mutual labels:  multi-label-classification
CheXpert challenge 2019
Code for CheXpert Challenge 2019 [Top 2(07/24/2019)]
Stars: ✭ 14 (-48.15%)
Mutual labels:  multi-label-classification
GoEmotions-pytorch
Pytorch Implementation of GoEmotions 😍😢😱
Stars: ✭ 95 (+251.85%)
Mutual labels:  multi-label-classification
Generative MLZSL
[TPAMI Under Submission] Generative Multi-Label Zero-Shot Learning
Stars: ✭ 37 (+37.04%)
Mutual labels:  multi-label-classification
Explainable-Automated-Medical-Coding
Implementation and demo of explainable coding of clinical notes with Hierarchical Label-wise Attention Networks (HLAN)
Stars: ✭ 35 (+29.63%)
Mutual labels:  multi-label-classification
classifier multi label
multi-label, classifier, text classification, multi-label text classification, text classification, BERT, ALBERT, multi-label-classification
Stars: ✭ 127 (+370.37%)
Mutual labels:  multi-label-classification
mybabe
MyBB CAPTCHA Solver using Convolutional Neural Network in Keras
Stars: ✭ 18 (-33.33%)
Mutual labels:  multi-label-classification
extremeText
Library for fast text representation and extreme classification.
Stars: ✭ 141 (+422.22%)
Mutual labels:  multi-label-classification
multi-label-classification
A multi-label, multi-class classification model based on tf.keras
Stars: ✭ 72 (+166.67%)
Mutual labels:  multi-label-classification
Aspect-Based-Sentiment-Analysis
A python program that implements Aspect Based Sentiment Analysis classification system for SemEval 2016 Dataset.
Stars: ✭ 57 (+111.11%)
Mutual labels:  multi-label-classification
DECAF
DECAF: Deep Extreme Classification with Label Features
Stars: ✭ 46 (+70.37%)
Mutual labels:  multi-label-classification
MLIC-KD-WSD
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection (ACM MM 2018)
Stars: ✭ 58 (+114.81%)
Mutual labels:  multi-label-classification
awesome-image-tagging
A paper list of awesome Image Tagging
Stars: ✭ 61 (+125.93%)
Mutual labels:  multi-label-classification
multi-label-text-classification
Multi-label text classification using ConvNet and graph embedding (Tensorflow implementation)
Stars: ✭ 44 (+62.96%)
Mutual labels:  multi-label-classification
kaggle-human-protein-atlas-image-classification
Kaggle 2018 @ Human Protein Atlas Image Classification
Stars: ✭ 34 (+25.93%)
Mutual labels:  multi-label-classification
GalaXC
GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification
Stars: ✭ 28 (+3.7%)
Mutual labels:  multi-label-classification
single-positive-multi-label
Multi-Label Learning from Single Positive Labels - CVPR 2021
Stars: ✭ 63 (+133.33%)
Mutual labels:  multi-label-classification
Caver
Caver: a toolkit for multilabel text classification.
Stars: ✭ 38 (+40.74%)
Mutual labels:  multi-label-classification
napkinXC
Extremely simple and fast extreme multi-class and multi-label classifiers.
Stars: ✭ 38 (+40.74%)
Mutual labels:  multi-label-classification
omikuji
An efficient implementation of Partitioned Label Trees & its variations for extreme multi-label classification
Stars: ✭ 69 (+155.56%)
Mutual labels:  multi-label-classification

multi-label-classification

1. Data Preparation

To let TensorFlow read the data efficiently, first save it in TFRecord files.

  1. Create a directory and copy all images into it. We call it image_dir.

  2. Create an image_list text file. The format looks like this:

    COCO_val2014_000000320715.jpg 8
    COCO_val2014_000000379048.jpg 2
    COCO_val2014_000000014562.jpg 9
    ...
    

    Tip: create two files, one for training, one for evaluation.

  3. Create an image_label text file. The format looks like this:

    1 1 1 1 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    1 0 1 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
    ...
    

    The example above assumes 35 labels in total. Each line corresponds to the image on the same line of the image_list file. Each label has a fixed index: the value 1 means the image has that label, and 0 means it does not. The number in the second column of the image_list file is the number of labels the image has.

    Tip: create two files, one for training, one for evaluation.

  4. Create the TFRecords file. The model reads its data from the TFRecords format. Run the script:

    python create_tfrecord.py \
        --image_dir="/path/to/images_dir" \
        --imglist_file="/path/to/image_list_file" \
        --imglabel_file="/path/to/image_label_file" \
        --output_file="/path/to/xx.tfrecords" \
        --gpu="1"
    

    Tip: create train.tfrecords and eval.tfrecords separately. read_tfrecord.py is just a helper script that reads data back from a TFRecords file for testing purposes.
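
The two text files described in steps 2 and 3 can be generated from a simple mapping of image names to label indices. The sketch below is only an illustration: `build_lines`, the `annotations` dictionary, and the output file names are hypothetical and not part of this repository.

```python
import tempfile
from pathlib import Path


def build_lines(annotations, num_labels):
    """Build image_list and image_label lines in matching order.

    `annotations` maps an image file name to the label indices it carries
    (an assumed input structure, not something the repository defines).
    """
    list_lines, label_lines = [], []
    for file_name, label_ids in annotations.items():
        unique_ids = sorted(set(label_ids))
        if any(i < 0 or i >= num_labels for i in unique_ids):
            raise ValueError(f"label index out of range for {file_name}")
        row = ["0"] * num_labels
        for i in unique_ids:
            row[i] = "1"
        # Second column of image_list is the number of labels of the image.
        list_lines.append(f"{file_name} {len(unique_ids)}")
        label_lines.append(" ".join(row))
    return list_lines, label_lines


# Mirrors the second sample row above: labels 1 and 6 out of 35.
list_lines, label_lines = build_lines(
    {"COCO_val2014_000000379048.jpg": [1, 6]}, num_labels=35
)

# A temporary directory is used here; replace it with your own output directory.
out_dir = Path(tempfile.mkdtemp())
(out_dir / "image_list_train.txt").write_text("\n".join(list_lines) + "\n")
(out_dir / "image_label_train.txt").write_text("\n".join(label_lines) + "\n")
```

Building both files in one pass keeps line N of image_list and line N of image_label aligned, which is what the format requires.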

2. Base network definition and pre-trained checkpoints

  1. This library extracts image features with a pre-trained resnet_50 model. The network definition files (resnet_utils.py, resnet_v2.py) from ResNet V2 50 are already included; you still need to download the pre-trained checkpoint.

  2. You can also change multi_label_classification_model.py to use resnet101 or another model. Find the networks and pre-trained models from here.

3. Multi-label-classification Model

  1. The model is defined in multi_label_classification_model.py. It takes one endpoint of the pre-trained model and adds three conv2d layers on top.

  2. Input image processing is very important. The logic is:

    1. During training, first resize the image to a larger size, then randomly crop it to the target size and apply some image augmentations. The resulting randomly generated image is used for training.

    2. During evaluation, the image is simply resized to the target size.

    3. During inference, first resize the image to a larger size, then use the 10-crop evaluation method: pass 10 crops of the image (top-left, top-right, bottom-left, bottom-right, center, and their mirrors) through the model, and compute the mean or max of the 10 outputs.
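
The 10-crop procedure used for inference can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the nested-list image representation, and the 256/224 sizes are hypothetical and not taken from inference.py, which may implement the crops differently (for example with TensorFlow ops).

```python
def ten_crop_boxes(height, width, crop):
    """Return (top, left, mirrored) for the 10 crops of a height x width image."""
    if crop > height or crop > width:
        raise ValueError("crop size must not exceed the resized image size")
    corners = [
        (0, 0),                                       # top-left
        (0, width - crop),                            # top-right
        (height - crop, 0),                           # bottom-left
        (height - crop, width - crop),                # bottom-right
        ((height - crop) // 2, (width - crop) // 2),  # center
    ]
    # Five plain crops first, then the same five mirrored horizontally.
    return [(t, l, m) for m in (False, True) for (t, l) in corners]


def take_crop(image, top, left, crop, mirrored):
    """Cut a crop x crop window from a nested-list image; flip it if requested."""
    rows = [row[left:left + crop] for row in image[top:top + crop]]
    return [row[::-1] for row in rows] if mirrored else rows


def aggregate(scores_per_crop, mode="mean"):
    """Combine per-crop score vectors label by label with mean or max."""
    columns = list(zip(*scores_per_crop))
    if mode == "mean":
        return [sum(c) / len(c) for c in columns]
    if mode == "max":
        return [max(c) for c in columns]
    raise ValueError("mode must be 'mean' or 'max'")


# Assumed sizes: image resized to 256x256, target crop size 224x224.
boxes = ten_crop_boxes(height=256, width=256, crop=224)

# Toy 4x4 "image" to show cropping and mirroring.
toy = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
top_right_mirrored = take_crop(toy, top=0, left=2, crop=2, mirrored=True)
```

In a real pipeline each of the 10 crops would be passed through the model, and `aggregate` would be applied to the 10 resulting score vectors.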

4. Training

  1. Configure the model. In train.py, modify the parameters passed when creating ModelConfig.

  2. Configure training. In train.py, modify the parameters of TrainConfig.

  3. Run the script:

    python train.py
    

5. Evaluation

After starting the training script, start the evaluation script and let it run in parallel with training:

  1. Configure the model. In evaluate.py, modify the parameters passed when creating ModelConfig.

  2. Configure evaluation. In evaluate.py, modify the parameters of EvalConfig.

  3. Run the script:

    python evaluate.py
    

6. Inference

Use TensorBoard to monitor the training process. When the model starts to overfit, choose a good checkpoint and use it to run inference on the test dataset:

  1. Configure the model. In inference.py, modify the parameters passed when creating ModelConfig.

  2. Configure inference. In inference.py, modify the parameters of InferenceConfig.

  3. Implement the get_test_image_list function in inference.py so that it returns a list of image paths, like:

    [
        '/path/to/img1.jpg',
        '/path/to/img2.jpg',
        ...
    ]
    
  4. Run the script:

    python inference.py
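
One possible way to implement get_test_image_list is to list the image files in a directory. This is a minimal sketch, not the repository's code: the `test_dir` parameter, its placeholder default, and the set of accepted extensions are assumptions, and the function in inference.py may expect a different signature.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def get_test_image_list(test_dir="/path/to/test_images"):
    """Return the sorted paths of all image files directly under test_dir."""
    root = Path(test_dir)
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Sorting the paths keeps the order of the inference results reproducible between runs.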
    

7. Threshold calibration

After inference, the model produces a score (or confidence) value for each label. The next step is to choose threshold values that decide whether a given label belongs to an image. The method is:

  1. Use the trained model to run inference on the evaluation dataset. This produces the scores for the evaluation dataset.

  2. Use threshold_calibration.py to compute the optimal threshold for each label.

  3. Apply the computed thresholds to the inference results on the test dataset.
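
The calibration idea can be sketched in plain Python. The criterion that threshold_calibration.py actually optimizes is not stated here, so this sketch assumes a per-label grid search that maximizes F1 on the evaluation set; all function names and the toy data are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 for one label, given 0/1 ground truth and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def calibrate_thresholds(scores, labels, candidates=None):
    """For each label, pick the candidate threshold with the best F1 (assumed criterion)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    thresholds = []
    for j in range(len(labels[0])):
        y_true = [row[j] for row in labels]
        column = [row[j] for row in scores]
        best_t, best_f1 = 0.5, -1.0
        for t in candidates:
            f1 = f1_score(y_true, [1 if s >= t else 0 for s in column])
            if f1 > best_f1:  # strict comparison keeps the lowest best threshold
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


def apply_thresholds(scores, thresholds):
    """Turn per-label scores into 0/1 decisions using per-label thresholds."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)] for row in scores]


# Toy evaluation set: 4 images, 2 labels (made-up numbers for illustration).
eval_scores = [[0.9, 0.2], [0.8, 0.6], [0.3, 0.7], [0.1, 0.4]]
eval_labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
thresholds = calibrate_thresholds(eval_scores, eval_labels)

# Step 3: apply the thresholds to scores from the test set.
test_predictions = apply_thresholds([[0.5, 0.5]], thresholds)
```

Calibrating on the evaluation set rather than the test set keeps the test results an honest estimate of how the thresholds generalize.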
