AE TextSpotter
Introduction
This is the official implementation of AE TextSpotter, which introduces linguistic information to eliminate the ambiguity in text detection. This code is based on MMDetection v1.0rc1.
Recommended environment
Python 3.6+
PyTorch 1.1.0
torchvision 0.2.1
pytorch_transformers 1.1.0
mmcv 0.2.13
Polygon3
opencv-python 4.4.0
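Before training, it can help to compare the installed packages against the recommended versions listed above. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `report` are hypothetical, and the import names (for example `cv2` for opencv-python) are assumptions about how each package is imported. Polygon3 is left out because its import name and version attribute are not stated here.

```python
import importlib

# Import name -> version recommended in this README.
# The import names are assumptions (e.g. opencv-python is imported as cv2).
RECOMMENDED = {
    "torch": "1.1.0",
    "torchvision": "0.2.1",
    "pytorch_transformers": "1.1.0",
    "mmcv": "0.2.13",
    "cv2": "4.4.0",
}


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none, or None if not importable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report(recommended=RECOMMENDED):
    """Return (name, recommended version, found version) for each package."""
    return [(name, wanted, installed_version(name))
            for name, wanted in recommended.items()]


if __name__ == "__main__":
    for name, wanted, found in report():
        status = "missing" if found is None else found
        print(f"{name}: recommended {wanted}, found {status}")
```

A mismatch does not necessarily mean the code will fail; the versions above are only the recommended ones.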
Install
Please refer to MMDetection v1.0rc1 for installation.
Preparing data
Step1: Download the dataset from ICDAR 2019 ReCTS.
Step2: The directory "data/ReCTS" should be organized as follows:
data/ReCTS/
├── train
│ ├── img
│ ├── gt
├── test
│ ├── img
In the folder "data/ReCTS/", the files "TDA_ReCTS_train_list.txt" and "TDA_ReCTS_val_list.txt" are downloaded from TDA-ReCTS. The other JSON files can be generated by running "python tools/rects_prepare_data.py".
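The layout described in Step2 can be checked with a short script before running the data preparation tool. This is a sketch only: the function name `missing_entries` is hypothetical, and it checks just the folders and list files named above, not the generated JSON files, whose names are not listed here.

```python
import os

# Folders and list files named in Step2, relative to data/ReCTS.
EXPECTED = [
    "train/img",
    "train/gt",
    "test/img",
    "TDA_ReCTS_train_list.txt",
    "TDA_ReCTS_val_list.txt",
]


def missing_entries(root="data/ReCTS", expected=EXPECTED):
    """Return the expected entries that do not exist under root."""
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under data/ReCTS:")
        for path in missing:
            print("  " + path)
    else:
        print("data/ReCTS layout looks complete.")
```

Run it from the root of the repository so that the relative path "data/ReCTS" resolves correctly.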
Step3: Download and unzip bert-base-chinese.zip in the root of this repository.
unzip bert-base-chinese.zip
Training
Step1:
tools/rects_dist_train.sh local_configs/rects_ae_textspotter_r50_1x.py 8
Step2:
tools/rects_dist_train.sh local_configs/rects_ae_textspotter_lm_r50_1x.py 8
Test
TDA-ReCTS
tools/rects_dist_test.sh local_configs/rects_ae_textspotter_lm_r50_1x.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth 8 --json_out results.json
ICDAR 2019 ReCTS Task 4: End-to-End Text Spotting
tools/rects_dist_test.sh local_configs/rects_ae_textspotter_lm_r50_1x_test.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth 8 --json_out results_test.json
python tools/rects_trans2submit.py
Visualization
python tools/rects_test.py local_configs/rects_ae_textspotter_lm_r50_1x.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth --show
Evaluation
The training list, validation list, and evaluation script used in this code come from TDA-ReCTS.
python tools/rects_eval.py
The output of the evaluation script should be:
[Best F-Measure] p: 84.94, r: 78.10, f: 81.37, 1-ned: 51.02, best_score_th: 0.569
[Best 1-NED] p: 86.68, r: 76.09, f: 81.04, 1-ned: 51.51, best_score_th: 0.626
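The two summary lines above can be parsed for logging or comparison across runs. The sketch below assumes the output keeps exactly the format shown; the helper names `parse_eval_line` and `f_measure` are hypothetical and not part of the evaluation script. It also recomputes the F-measure as the harmonic mean of precision and recall, which should agree with the reported value up to rounding, since p and r are themselves printed with two decimals.

```python
import re


def parse_eval_line(line):
    """Split a summary line into its bracketed tag and a dict of float metrics."""
    tag = re.match(r"\[(.+?)\]", line).group(1)
    metrics = {key: float(value)
               for key, value in re.findall(r"([\w-]+): ([\d.]+)", line)}
    return tag, metrics


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


if __name__ == "__main__":
    lines = [
        "[Best F-Measure] p: 84.94, r: 78.10, f: 81.37, 1-ned: 51.02, best_score_th: 0.569",
        "[Best 1-NED] p: 86.68, r: 76.09, f: 81.04, 1-ned: 51.51, best_score_th: 0.626",
    ]
    for line in lines:
        tag, m = parse_eval_line(line)
        recomputed = f_measure(m["p"], m["r"])
        print(f"{tag}: reported f={m['f']:.2f}, recomputed f={recomputed:.2f}")
```

The two lines report the same model at two different score thresholds: one chosen to maximize F-measure, the other to maximize 1-NED.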
Results and Models
TDA-ReCTS
Method | Precision (%) | Recall (%) | F-measure (%) | 1-NED (%) | Model |
---|---|---|---|---|---|
AE TextSpotter | 84.94 | 78.10 | 81.37 | 51.51 | Google Drive |
AE TextSpotter (Paper) | 84.78 | 78.28 | 81.39 | 51.32 | - |
ICDAR 2019 ReCTS
Method | Precision (%) | Recall (%) | F-measure (%) | 1-NED (%) | Model |
---|---|---|---|---|---|
AE TextSpotter | 93.38 | 89.98 | 91.65 | 71.83 | Same as TDA-ReCTS |
AE TextSpotter (Paper) | 92.60 | 91.01 | 91.80 | 71.81 | - |
License
This project is released under the Apache 2.0 license.
Citation
If you use this work in your research, please cite us.
@inproceedings{wenhai2020ae,
title={AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting},
author={Wang, Wenhai and Liu, Xuebo and Ji, Xiaozhong and Xie, Enze and Liang, Ding and Yang, ZhiBo and Lu, Tong and Shen, Chunhua and Luo, Ping},
booktitle={European Conference on Computer Vision (ECCV)},
year={2020}
}
Other Projects:
PAN (ICCV 2019): https://github.com/whai362/pan_pp.pytorch
PSENet (CVPR 2019): https://github.com/whai362/PSENet