
cs-chan / Total Text Dataset

Licence: bsd-3-clause

Total-Text Dataset. It consists of 1555 images with more than three different text orientations (horizontal, multi-oriented, and curved), which makes it one of a kind.

Programming Languages

matlab

Labels

dataset text-recognition text-detection

Projects that are alternatives of or similar to Total Text Dataset

Adelaidet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Stars: ✭ 2,565 (+342.24%)

Mutual labels: text-detection, text-recognition

Textrecognitiondatagenerator

A synthetic data generator for text recognition

Stars: ✭ 2,075 (+257.76%)

Mutual labels: dataset, text-recognition

Awesome Deep Text Detection Recognition

A curated list of resources for text detection/recognition (optical character recognition) with deep learning methods.

Stars: ✭ 2,282 (+293.45%)

Mutual labels: text-detection, text-recognition

Cleval

CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks

Stars: ✭ 92 (-84.14%)

Mutual labels: text-detection, text-recognition

Chineseaddress ocr

OCR for Chinese addresses in photographed documents, implemented using CTPN + CTC + address correction.

Stars: ✭ 309 (-46.72%)

Mutual labels: text-detection, text-recognition

Awesome Scene Text Recognition

A curated list of resources dedicated to scene text localization and recognition

Stars: ✭ 1,637 (+182.24%)

Mutual labels: text-detection, text-recognition

Crnn With Stn

implement CRNN in Keras with Spatial Transformer Network

Stars: ✭ 83 (-85.69%)

Mutual labels: dataset, text-recognition

Ocr.pytorch

A pure pytorch implemented ocr project including text detection and recognition

Stars: ✭ 196 (-66.21%)

Mutual labels: text-detection, text-recognition

awesome-scene-text

A curated list of papers and resources for scene text detection and recognition

Stars: ✭ 43 (-92.59%)

Mutual labels: text-recognition, text-detection

doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Stars: ✭ 1,409 (+142.93%)

Mutual labels: text-recognition, text-detection

Chinese Text Detection And Recognition

Assignment of Image Analysis and Understanding

Stars: ✭ 53 (-90.86%)

Mutual labels: text-detection, text-recognition

Awesome Ocr Resources

A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).

Stars: ✭ 335 (-42.24%)

Mutual labels: text-detection, text-recognition

Training extensions

Trainable models and NN optimization tools

Stars: ✭ 857 (+47.76%)

Mutual labels: text-detection, text-recognition

Pan pp.pytorch

Official implementations of PSENet, PAN and PAN++.

Stars: ✭ 141 (-75.69%)

Mutual labels: text-detection, text-recognition

Image Text Localization Recognition

A general list of resources for image text localization and recognition: a collection of papers, resources, and implementations for scene text localization and recognition.

Stars: ✭ 788 (+35.86%)

Mutual labels: text-detection, text-recognition

AE TextSpotter

No description or website provided.

Stars: ✭ 68 (-88.28%)

Mutual labels: text-recognition, text-detection

Megreader

A research project for text detection and recognition using PyTorch 1.2.

Stars: ✭ 332 (-42.76%)

Mutual labels: text-detection, text-recognition

React Native Tesseract Ocr

Tesseract OCR wrapper for React Native

Stars: ✭ 384 (-33.79%)

Mutual labels: text-detection, text-recognition

Seq2seqchatbots

A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.

Stars: ✭ 466 (-19.66%)

Mutual labels: dataset

Cluepretrainedmodels

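A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity.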

Stars: ✭ 493 (-15%)

Mutual labels: dataset

Total-Text-Dataset (Official site)

Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.)

Updated on March 19, 2020 (Query on the new groundtruth of test set)

Updated on Sept. 08, 2019 (New training groundtruth of Total-Text is now available)

Updated on Sept. 07, 2019 (Updated Guided Annotation toolbox for scene text image annotation)

Updated on Sept. 07, 2019 (Updated baseline as to our IJDAR)

Updated on August 01, 2019 (Extended version with new baseline + annotation tool is accepted at IJDAR)

Updated on May 30, 2019 (Important announcement on Total-Text vs. ArT dataset)

Updated on April 02, 2019 (Updated table ranking with default vs. our proposed DetEval)

Updated on March 31, 2019 (Faster version DetEval.py, support Python3. Thank you princewang1994.)

Updated on March 14, 2019 (Updated table ranking with evaluation protocol info.)

Updated on November 26, 2018 (Table ranking is included for reference.)

Updated on August 24, 2018 (Newly added Guided Annotation toolbox folder.)

Updated on May 15, 2018 (Added groundtruth in '.txt' format.)

Updated on May 14, 2018 (Added feature - 'Do not care' candidates filtering is now available in the latest python scripts.)

Updated on April 03, 2018 (Added pixel level groundtruth)

Updated on November 04, 2017 (Added text level groundtruth)

Released on October 27, 2017

News

We have received some questions regarding a new groundtruth for the test set of Total-Text. Here is an update: we are not releasing a new version of the test set groundtruth because

 1) there is no need to standardise the length of the groundtruth vertices for testing purposes; this was proposed to facilitate training only, and
 2) a new version of the groundtruth would make the previous benchmarks irrelevant.

Please contact us if you think there is a valid reason to require a new groundtruth for the test set, and we will discuss it.

Total-Text is a word-level English curved-text dataset. If you are interested in a text-line-level dataset with both English and Chinese instances, we highly recommend SCUT-CTW1500. In addition, the Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT), which extends Total-Text and SCUT-CTW1500, was held at ICDAR2019 to stimulate more innovative ideas on the arbitrary-shaped text reading task. Congratulations to all winners and challengers. The technical report of ArT can be found at this https URL.

Important Announcement

Total-Text and SCUT-CTW1500 are now part of the training set of the largest curved text dataset, ArT (Arbitrary-Shaped Text dataset). To keep future benchmarking on Total-Text valid, anyone who intends to leverage the extra training data from ArT should first remove the Total-Text test-set images from the ArT dataset (the corresponding IDs are provided HERE). We rely on the research community to perform this removal so that benchmarking remains fair.
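The removal step can be sketched in a few lines of Python. This is only an illustration: the function name and the ID strings are hypothetical, and the real exclusion list is the one linked in the announcement, whose file format is not described on this page.

```python
def remove_total_text_test_images(art_ids, excluded_ids):
    """Return the ArT image IDs that are NOT in the exclusion list.

    art_ids:      iterable of ArT training image IDs (hypothetical format)
    excluded_ids: iterable of IDs that correspond to Total-Text test images
    """
    excluded = set(excluded_ids)  # set lookup keeps the filter linear in size
    return [image_id for image_id in art_ids if image_id not in excluded]


if __name__ == "__main__":
    # Placeholder IDs for illustration only; they are not real ArT IDs.
    art_training_ids = ["img_a", "img_b", "img_c", "img_d"]
    total_text_test_ids = ["img_b", "img_d"]
    kept = remove_total_text_test_images(art_training_ids, total_text_test_ids)
    print(kept)
```

Filtering on the ID list before loading any images keeps the excluded test images out of every later training step.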

Table Ranking

The results from recent papers on the Total-Text dataset are listed below, where P = Precision, R = Recall, and F = F-score.
If your result is missing or incorrect, please do not hesitate to contact us.
The baseline scores are based on our proposed [Poly-FRCNN-3] in this folder.
^*Pascal VOC IoU metric; ^**Polygon Regression
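The page does not spell out how F is derived from P and R. Assuming it is the usual harmonic mean (the F1 score), a minimal Python sketch for sanity-checking a leaderboard row looks like this. The function name is ours, and a reported value may differ slightly from the recomputed one because of rounding in the source papers.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (P, R) pairs copied from the "Reported on paper" columns.
    rows = {"CRAFTS": (89.5, 85.4), "TextFuseNet": (89.0, 85.3)}
    for method, (p, r) in rows.items():
        print(f"{method}: F = {f_score(p, r):.1f}")
```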

Detection Leaderboard

Method	Reported on paper			DetEval (tp=0.4, tr=0.8) (Default)			DetEval (tp=0.6, tr=0.7) (New Proposal)			Published at
Method	P	R	F	P	R	F	P	R	F	Published at
Our Baseline [paper]	78.0	68.0	73.0	-	-	-	78.0	68.0	73.0	IJDAR2020
CRAFTS [paper]	89.5	85.4	87.4	-	-	-	-	-	-	ECCV2020
^#ASTS_Weakly-ResNet101 (E2E) [paper]	-	-	87.3	-	-	-	-	-	-	TIP2020
TextFuseNet [paper]	89.0	85.3	87.1	-	-	-	-	-	-	IJCAI2020
^#Boundary (E2E) [paper]	88.9	85.0	87.0	-	-	-	-	-	-	AAAI2020
PolyPRNet [paper]	88.1	85.3	86.7	-	-	-	-	-	-	ACCV2020
^#Qin et al. (E2E) [paper]	87.8	85.0	86.4	-	-	-	-	-	-	ICCV2019
100%Poly [paper]	88.2	83.3	85.6	-	-	-	-	-	-	arXiv:2012
ContourNet [paper]	86.9	83.9	85.4	-	-	-	-	-	-	CVPR2020
^#Text Perceptron (E2E) [paper]	88.8	81.8	85.2	-	-	-	-	-	-	AAAI2020
PAN-640 [paper]	89.3	81.0	85.0	-	-	-	-	-	-	ICCV2019
DB-ResNet50 (800) [paper]	87.1	82.5	84.7	-	-	-	-	-	-	AAAI2020
TextCohesion [paper]	88.1	81.4	84.6	-	-	-	-	-	-	arXiv:1904
Feng et al. [paper]	87.3	81.1	84.1	-	-	-	-	-	-	IJCV2020
ReLaText [paper]	84.8	83.1	84.0	-	-	-	-	-	-	arXiv:2003
CRAFT [paper]	87.6	79.9	83.6	-	-	-	-	-	-	CVPR2019
LOMO MS [paper]	87.6	79.3	83.3	-	-	-	-	-	-	CVPR2019
SPCNet [paper]	83.0	82.8	82.9	-	-	-	-	-	-	AAAI2019
^#ABCNet (E2E) [paper]	85.4	80.1	82.7	-	-	-	-	-	-	CVPR2020
ICG [paper]	82.1	80.9	81.5	-	-	-	-	-	-	PR2019
FTSN [paper]	^*84.7	^*78.0	^*81.3	-	-	-	-	-	-	ICPR2018
PSENet-1s [paper]	84.02	77.96	80.87	-	-	-	-	-	-	CVPR2019
¹TextField [paper]	81.2	79.9	80.6	76.1	75.1	75.6	83.0	82.0	82.5	TIP2019
^#TextDragon (E2E) [paper]	85.6	75.7	80.3	-	-	-	-	-	-	ICCV2019
CSE [paper]	81.4 (^**80.9)	79.7 (^**80.3)	80.2 (^**80.6)	-	-	-	-	-	-	CVPR2019
MSR [paper]	85.2	73.0	78.6	82.7	68.3	74.9	81.4	72.5	76.7	arXiv:1901
ATTR [paper]	80.9	76.2	78.5	-	-	-	-	-	-	CVPR2019
TextSnake [paper]	82.7	74.5	78.4	-	-	-	-	-	-	ECCV2018
¹CTD [paper]	74.0	71.0	73.0	60.7	58.8	59.8	76.5	73.8	75.2	PR2019
^#TextNet (E2E) [paper]	68.2	59.5	63.5	-	-	-	-	-	-	ACCV2018
^#,²Mask TextSpotter (E2E) [paper]	69.0	55.0	61.3	68.9	62.5	65.5	82.5	75.2	78.6	ECCV2018
CENet [paper]	59.9	54.4	57.0	-	-	-	-	-	-	ACCV2018
^#Textboxes (E2E) [paper]	62.1	45.5	52.5	-	-	-	-	-	-	AAAI2017
EAST [paper]	50.0	36.2	42.0	-	-	-	-	-	-	CVPR2017
SegLink [paper]	30.3	23.8	26.7	-	-	-	-	-	-	CVPR2017

Note:

^# Framework that does end-to-end training (i.e. detection + recognition).

¹For TextField and CTD, results from the improved versions of their original papers were used, which explains why the performance is better.

²For Mask-TextSpotter, the relatively poor performance reported in their paper was due to a bug in the input reading module (which was fixed recently). The authors were informed about this issue.
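To illustrate why the two DetEval settings in the table can give different scores for the same detections, here is a simplified Python sketch of the one-to-one match test. It assumes the usual DetEval reading of the thresholds: tr is the minimum area recall (overlap divided by groundtruth area) and tp is the minimum area precision (overlap divided by detection area). Regions are modelled as sets of pixel coordinates and all helper names are hypothetical. The full protocol also handles one-to-many and many-to-one matches, which this sketch omits.

```python
def rect(x0, y0, x1, y1):
    """Pixel set of an axis-aligned rectangle (toy stand-in for a polygon mask)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def area_recall(gt, det):
    """Fraction of the groundtruth region covered by the detection."""
    return len(gt & det) / len(gt) if gt else 0.0


def area_precision(gt, det):
    """Fraction of the detection region that lies on the groundtruth."""
    return len(gt & det) / len(det) if det else 0.0


def is_one_to_one_match(gt, det, tr, tp):
    """One-to-one match: both area constraints must hold."""
    return area_recall(gt, det) >= tr and area_precision(gt, det) >= tp


if __name__ == "__main__":
    gt = rect(0, 0, 20, 10)   # 200 px groundtruth
    det = rect(0, 0, 15, 10)  # 150 px detection, entirely inside gt
    # Area recall is 0.75 and area precision is 1.0 for this pair.
    print("default  (tp=0.4, tr=0.8):", is_one_to_one_match(gt, det, tr=0.8, tp=0.4))
    print("proposal (tp=0.6, tr=0.7):", is_one_to_one_match(gt, det, tr=0.7, tp=0.6))
```

In this toy case the detection covers three quarters of the groundtruth, so it fails the default recall threshold but passes the proposed one.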

End-to-end Recognition Leaderboard
(None refers to recognition without any lexicon; Full lexicon contains all words in test set.)

Method	Backbone	None (%)	Full (%)	FPS	Published at
CRAFTS [paper]	ResNet50-FPN	78.7	-	-	ECCV2020
MANGO [paper]	ResNet50-FPN	72.9	83.6	4.3	AAAI2021
Text Perceptron [paper]	ResNet50-FPN	69.7	78.3	-	AAAI2020
ABCNet-MS [paper]	ResNet50-FPN	69.5	78.4	6.9	CVPR2020
CharNet H-88 MS [paper]	ResNet50-Hourglass57	69.2	-	1.2	ICCV2019
Qin et al. [paper]	ResNet50-MSF	67.8	-	-	ICCV2019
ASTS_Weakly [paper]	ResNet101-FPN	65.3	84.2	2.5	TIP2020
Boundary [paper]	ResNet50-FPN	65.0	76.1	-	AAAI2020
ABCNet [paper]	ResNet50-FPN	64.2	75.7	17.9	CVPR2020
CAPNet [paper]	ResNet50-FPN	62.7	-	-	ICASSP2020
Feng et al. [paper]	VGG	55.8	79.2	-	IJCV2020
TextNet [paper]	ResNet50-SAM	54.0	-	2.7	ACCV2018
Mask TextSpotter [paper]	ResNet50-FPN	52.9	71.8	4.8	ECCV2018
TextDragon [paper]	VGG16	48.8	74.8	-	ICCV2019
Textboxes [paper]	ResNet50-FPN	36.3	48.9	1.4	AAAI2017
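The page does not describe how a lexicon is applied to the recognition output. One common approach, shown here only as an assumption, is to replace each predicted word with the lexicon entry that has the smallest edit distance; with no lexicon ("None") the raw prediction is kept. The function names and the sample words below are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def snap_to_lexicon(word, lexicon):
    """Return the closest lexicon entry, or the word itself if there is no lexicon."""
    if not lexicon:
        return word
    return min(lexicon, key=lambda entry: edit_distance(word.lower(), entry.lower()))


if __name__ == "__main__":
    full_lexicon = ["STARBUCKS", "COFFEE", "MARKET"]  # illustrative words only
    print(snap_to_lexicon("STARBUCXS", full_lexicon))
    print(snap_to_lexicon("STARBUCXS", []))
```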

Description

To facilitate new research on text detection, we introduce the Total-Text dataset (IJDAR) (ICDAR-17 paper) (presentation slides), which is more comprehensive than existing text datasets. Total-Text consists of 1555 images with more than three different text orientations (horizontal, multi-oriented, and curved), which makes it one of a kind.

Citation

If you find this dataset useful for your research, please cite

@article{CK2019,
  author    = {Chee Kheng Ch’ng and
               Chee Seng Chan and
               Chenglin Liu},
  title     = {Total-Text: Towards Orientation Robustness in Scene Text Detection},
  journal   = {International Journal on Document Analysis and Recognition (IJDAR)},
  volume    = {23},
  pages     = {31-52},
  year      = {2020},
  doi       = {10.1007/s10032-019-00334-z},
}

Feedback

Suggestions and opinions on this dataset (both positive and negative) are very welcome. Please contact the authors by email at chngcheekheng at gmail.com or cs.chan at um.edu.my.

License and Copyright

The project is open source under the BSD-3 license (see the LICENSE file).

For commercial use, please contact Dr. Chee Seng Chan at cs.chan at um.edu.my.

Stars: ✭ 580
