All Projects → OpenFirework → pytorch_solov2

OpenFirework / pytorch_solov2

Licence: other
SOLOv2 pytorch (without mmdetection and mmcv)

Programming Languages

python
139335 projects - #7 most used programming language
Cuda
1817 projects
C++
36643 projects - #6 most used programming language

Projects that are alternatives of or similar to pytorch solov2

resume
My Chinese and English Resumes in LaTeX with Font Awesome 5
Stars: ✭ 296 (+335.29%)
Mutual labels:  cv
CAP augmentation
Cut and paste augmentation for object detection and instance segmentation
Stars: ✭ 93 (+36.76%)
Mutual labels:  instance-segmentation
BlackIQ.ir
BlackIQ.ir Source Code
Stars: ✭ 14 (-79.41%)
Mutual labels:  cv
cv-arxiv-daily
🎓Automatically Update CV Papers Daily using Github Actions (Update Every 12th hours)
Stars: ✭ 216 (+217.65%)
Mutual labels:  cv
cv
github.com/enfiskutensykkel/cv/raw/master/cv.pdf
Stars: ✭ 55 (-19.12%)
Mutual labels:  cv
py4chemoinformatics
Python for chemoinformatics
Stars: ✭ 78 (+14.71%)
Mutual labels:  cv
FaPN
[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
Stars: ✭ 173 (+154.41%)
Mutual labels:  instance-segmentation
deeplearning-paper-notes
Reading notes on deep learning papers---深度学习论文阅读笔记
Stars: ✭ 36 (-47.06%)
Mutual labels:  cv
Light-Condition-Style-Transfer
Lane Detection in Low-light Conditions Using an Efficient Data Enhancement : Light Conditions Style Transfer (IV 2020)
Stars: ✭ 133 (+95.59%)
Mutual labels:  instance-segmentation
fa.BlackIQ.ir
Persion version of BlackIQ.ir
Stars: ✭ 16 (-76.47%)
Mutual labels:  cv
opencv TLD
TLD:tracking-learning-detection 跟踪算法
Stars: ✭ 41 (-39.71%)
Mutual labels:  cv
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (-58.82%)
Mutual labels:  cv
PanBook
Pandoc LaTeX,Epub模板,用于生成书籍,幻灯片(beamer),简历,论文等(cv, thesis, ebook,beamer)
Stars: ✭ 190 (+179.41%)
Mutual labels:  cv
resume-parser
A java Spring Boot Resume Parser using GATE library.
Stars: ✭ 20 (-70.59%)
Mutual labels:  cv
seminar
ECNU ICA seminar materials
Stars: ✭ 14 (-79.41%)
Mutual labels:  cv
CS231n Spring 2019
CS231n_Spring(2019年秋季)计算机视觉课程
Stars: ✭ 31 (-54.41%)
Mutual labels:  cv
quant-jobs-zurich
A list of companies of possible interest for mathematicians (or related) that are looking for a job in quantitative finance in Zurich.
Stars: ✭ 59 (-13.24%)
Mutual labels:  cv
Quicksend
Send Quick Templated emails (with CV) to different companies
Stars: ✭ 27 (-60.29%)
Mutual labels:  cv
ML-ProjectYard
This repo consists of multiple machine learning based projects with frontend
Stars: ✭ 105 (+54.41%)
Mutual labels:  cv
pagedown-cv
R package for generating a CV with R Markdown + entries in a spreadsheet
Stars: ✭ 57 (-16.18%)
Mutual labels:  cv

English | 简体中文

Pytorch solov2 project

This code is a partial code extracted from the original code of the official author of SOLOV2 (the part of the lightweight implementation of SOLOV2_LIGHT), which does not rely on MMDetetion and MMCV. It is still relatively simple and has a lot of shortcomings to be improved.

official code: https://github.com/WXinlong/SOLO
paper: https://arxiv.org/abs/2003.10152

shortages

1、Only supports Resnet18, Resnet34 backbone to train.
2、Multi GPU parallel training is not supported.
3、Incomplete configuration items for training and testing.

Install

python 3.6+
pip install torch==1.5.1 torchvision==0.6.1 #Higher versions of PyTorch also tested OK
pip install pycocotools
pip install numpy
pip install scipy
cd pytorch_solov2/
python setup.py develop #Install the original FocalLoss of SoloV2

2021-05-17 update
Completely remove the dependency on MMCV

2020-10-13 update
Improve the evaluation code, save it as the picture after instance segmentation, and add video test code.

2020-07-23update
The implementation of FocalLoss in the latest version of MMCV-Full is different from that in the original SOLO version (the processing label of the background class is different). If the MMCV-Full FocalLoss is used for training, although the loss is reduced, the actual prediction is not accurate.

So replace it with the original FocalLoss implementation.

python setup.py develop

After the replacement, retraining, loss and prediction are normal.

Test Images

avatar

avatar

train

After configuring the items in the config.py,run command

python train.py  

config.py

  • if use COCO dataset
coco2017/
├── annotations
├── train2017
└── val2017

in 'data' path run command:

ln -s /path/coco2017 coco    

modify the 'coco2017_dataset' item in 'config.py' , for example:

coco2017_dataset = dataset_base.copy({
   'name': 'COCO 2017',
    'train_prefix': './data/coco/',
    'train_info': 'annotations/instances_train2017.json',
    'trainimg_prefix': 'train2017/',
    'train_images': './data/coco/',

    'valid_prefix': './data/coco/',
    'valid_info': 'annotations/instances_val2017.json',
    'validimg_prefix': 'val2017/',
    'valid_images': './data/coco/',

    'label_map': COCO_LABEL_MAP

})
  • if use casia-SPT_val dataset in this repo
casia_SPT_val = dataset_base.copy({
    'name': 'casia-SPT 2020',   #dataset's name 
    'train_prefix': './data/casia-SPT_val/val/',   #dataset path
    'train_info': 'val_annotation.json',           #label file
    'trainimg_prefix': '',
    'train_images': './data/casia-SPT_val/val/',   
    
    'valid_prefix': './data/casia-SPT_val/val/',
    'valid_info': 'val_annotation.json',
    'validimg_prefix': '',
    'valid_images': './data/casia-SPT_val/val',

    'label_map': COCO_LABEL_MAP
})
  • use casia_SPT_val dataset, set backbone resnet18_backbone, batchsize is (batchsize=imgs_per_gpu*workers_per_gpu)
solov2_base_config = coco_base_config.copy({
    'name': 'solov2_base', 
    'backbone': resnet18_backbone,              #backbone
    # Dataset stuff
    'dataset': casia_SPT_val,
    'num_classes': len(coco2017_dataset.class_names) + 1,
    'imgs_per_gpu': 6,
    'workers_per_gpu': 2,
    'num_gpus': 1,                       #only support single GPU for now
  • Complete Example Settings in 'data/config.py'
# ----------------------- SOLO v2.0 CONFIGS ----------------------- #

solov2_base_config = coco_base_config.copy({
    'name': 'solov2_base', 
    'backbone': resnet18_backbone,
    # Dataset stuff
    'dataset': casia_SPT_val,
    'num_classes': len(coco2017_dataset.class_names) + 1,
     #batchsize=imgs_per_gpu*workers_per_gpu
    'imgs_per_gpu': 4,
    'workers_per_gpu': 2,
    'num_gpus': 1,

    'train_pipeline':  [
        dict(type='LoadImageFromFile'),                                #read img process 
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True),     #load annotations 
        dict(type='Resize',                                              #Multiscale training, then select a size from the size behind
            img_scale=[(768, 512), (768, 480), (768, 448),
                    (768, 416), (768, 384), (768, 352)],
            multiscale_mode='value',
            keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),                    #Stochastic inversion, prob=0.5 
        dict(type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True),    #normallize                 
        dict(type='Pad', size_divisor=32),                                # SOLOV2 has requirements on the size of network input, and the image size needs to be a multiple of 32
        dict(type='DefaultFormatBundle'),                                #将数据转换为tensor,为后续网络计算
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape',
                            'scale_factor', 'flip', 'img_norm_cfg')),   
    ],

    'test_cfg': None,

    # learning policy
    'lr_config': dict(policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.01, step=[27, 33]),
    'total_epoch': 36,               #set epoch number for train 

    # optimizer
    'optimizer': dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001),  
    'optimizer_config': dict(grad_clip=dict(max_norm=35, norm_type=2)),   #Gradient balance strategy
    'resume_from': None,    #resume from a checkpoing file,  start epoch 1  if None
    'epoch_iters_start': 1,    #start epoch number

    'test_pipeline': [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(768, 448),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img']),
            ])
    ],

    'test_cfg': dict(
                nms_pre=500,
                score_thr=0.1,
                mask_thr=0.5,
                update_thr=0.05,
                kernel='gaussian',  # gaussian/linear
                sigma=2.0,
                max_per_img=30)
})
  • start trainning
python train.py  

eval

Modify the last code of the 'eval.py' code based on your dataset

# valmodel_weight     model weight
# data_path           images path
# benchmark         whether write result json to file 
# test_mode 
# save_imgs         whether save result image to results path
eval(valmodel_weight='pretrained/solov2_448_r18_epoch_36.pth',data_path="data/casia-SPT_val/val/JPEGImages", benchmark=False, test_mode="images", save_imgs=False)
#eval(valmodel_weight='pretrained/solov2_448_r18_epoch_36.pth',data_path="cam0.avi", benchmark=False, test_mode="video")

run eval.py

python eval.py

weight files

backbone pretrained weights
from torchvision, prtrained weight in imagenet

resnet18: ./pretrained/resnet18_nofc.pth
resnet34:  ./pretrained/resnet18_nofc.pth

solov2 light weight trained on coco2017

SOLOv2_Light_448_R18_36: ./pretrained/solov2_448_r18_epoch_36.pth 
SOLOv2_Light_448_R34_36:  https://drive.google.com/file/d/1F3VRX1nZPnjKrzAC7Z4EmAlwmG-GWF8u/view?usp=sharing
or
链接:https://pan.baidu.com/s/1MCVkAeKwTua-m9g1NLyRpw 
提取码:ljkk 
复制这段内容后打开百度网盘手机App,操作更方便哦

Other

1.该网络训练时会将resnset的第一个卷积层和第一个stage的BasicBlock卷积层freeze,并且训练时resnet中的bn层都设置为eval()模式
2. The inputs of solov2 require multiples of 32 in both length and width, depending on the grid he is drawing, and other dimensions may be impossible to calculate.
3. Solov2 is relatively simple on the whole, with no strange operations or layers. There is a convolution operation in post-processing (the convolution core is learned during training, and it takes little time to compute on GPU). Matrix_NMS also takes little time. After this part of processing, the output of the network is:

mask [nums, height, weight]    #masks's number, 
cls [nums]                     #class of masks
scores [nums]                  #score of masks

Reference

https://github.com/WXinlong/SOLO
https://github.com/open-mmlab/mmdetection
https://github.com/open-mmlab/mmcv

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].