yizt / Keras Ctpn

Licence: apache-2.0

A Keras reimplementation of the scene text detection network CTPN: "Detecting Text in Natural Image with Connectionist Text Proposal Network". You are welcome to try it, follow the project, and report issues.

Programming Languages

python

Labels

deep-learning keras ocr text-detection

keras-ctpn

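1. Overview
2. Prediction
3. Training
4. Examples
4.1 ICDAR2015
4.1.1 With side refinement
4.1.2 Without side refinement
4.1.3 Data augmentation: horizontal flip
4.2 ICDAR2017
4.3 Other datasets
5. toDoList
6. Summary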

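Overview

This project is a Keras implementation of CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network. The implementation mainly follows keras-faster-rcnn, and the model was trained and tested on the ICDAR2015 and ICDAR2017 datasets.

Project repository: keras-ctpn

CTPN paper translation: CTPN.md

Results:

Training on the 1,000 ICDAR2015 training images and evaluating on the 500 test images gives Recall: 37.07 %, Precision: 42.94 %, Hmean: 39.79 %. The original paper reports an F-measure of 61%, but it used an additional 3,000 training images.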
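The Hmean figure is the harmonic mean of recall and precision, so it can be rechecked from the two other numbers. The snippet below is only a small sanity check; the function name `hmean` is chosen here for illustration and is not taken from the project.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision.

    Both inputs must use the same scale (both percentages or both fractions).
    """
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# Values reported for the ICDAR2015 test set.
print(round(hmean(37.07, 42.94), 2))  # 39.79
```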

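Key points:

a. The backbone network is resnet50.

b. Training input images are 720*720: the long side is scaled to 720 with the aspect ratio preserved, and the short side is padded. The original paper scales the short side to 600. Prediction uses 1024*1024.

c. batch_size is 4; 128 anchors are trained per image, with a positive-to-negative sample ratio of 1:1.

d. The loss weights for classification, bounding-box regression, and side refinement are 1:1:1; the original paper uses 1:1:2.

e. Side refinement uses the same positive anchors as bounding-box regression; in the original paper they appear to be selected separately.

f. Side refinement does help (note: many people online say it has little effect).

g. Because of the bidirectional GRU, horizontal flipping hurts the results (see the example "Data augmentation: horizontal flip").

h. With random cropping as data augmentation, the network does not converge.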
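Points (b) and (c) can be made concrete with a little arithmetic. The sketch below is not the project's code: the function names are hypothetical, placing the padding at the bottom and right is an assumption, and filling leftover slots with negatives when fewer than 64 positives exist is a common Faster R-CNN-style convention that this project may or may not follow.

```python
import random


def resize_and_pad_geometry(height, width, max_dim=720):
    """Scale the long side to max_dim, keep the aspect ratio, pad the short side.

    Returns (scale, new_h, new_w, pad_bottom, pad_right).
    Padding on the bottom/right is an assumption made for this sketch.
    """
    scale = max_dim / max(height, width)
    new_h = min(max_dim, round(height * scale))
    new_w = min(max_dim, round(width * scale))
    return scale, new_h, new_w, max_dim - new_h, max_dim - new_w


def sample_anchors(pos_indices, neg_indices, total=128, rng=None):
    """Pick up to total anchors per image with a 1:1 positive/negative target.

    Assumption: if there are fewer than total // 2 positives, the remaining
    slots are filled with negatives.
    """
    rng = rng or random.Random()
    num_pos = min(len(pos_indices), total // 2)
    num_neg = min(len(neg_indices), total - num_pos)
    return rng.sample(list(pos_indices), num_pos), rng.sample(list(neg_indices), num_neg)


# A 1280x720 (width x height) image becomes 720x405 plus 315 rows of padding.
print(resize_and_pad_geometry(720, 1280))  # (0.5625, 405, 720, 315, 0)
```

With prediction at 1024*1024, the same function can be called with `max_dim=1024`.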

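Prediction

a. Download the project

git clone https://github.com/yizt/keras-ctpn

b. Download the pre-trained model

A model trained on the ICDAR2015 training set can be downloaded from: google drive, Baidu Cloud (extraction code: wm47)

c. Set the following attribute in the configuration class in config.py

	WEIGHT_PATH = '/tmp/ctpn.h5'

d. Detect text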

python predict.py --image_path image_3.jpg
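To run detection on several images, one option is to build one `predict.py` command per file. This is a rough sketch that relies only on the `--image_path` flag shown above; the helper name `build_predict_commands` and the `*.jpg` pattern are assumptions, and the script itself may offer better batch options that are not documented here.

```python
import sys
from pathlib import Path


def build_predict_commands(image_dir, script="predict.py", pattern="*.jpg"):
    """Return one command (as an argument list) per matching image, in sorted order."""
    commands = []
    for image_path in sorted(Path(image_dir).glob(pattern)):
        # Only --image_path is used, since it is the only flag documented above.
        commands.append([sys.executable, script, "--image_path", str(image_path)])
    return commands
```

Each command can then be passed to `subprocess.run(cmd, check=True)`. Every invocation starts a new process and reloads the weights, so this is convenient rather than fast.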

Evaluation

a. Run the following command, then compress the output txt files into a zip archive

python evaluate.py --weight_path /tmp/ctpn.100.h5 --image_dir /opt/dataset/OCR/ICDAR_2015/test_images/ --output_dir /tmp/output_2015/

b. Submit for online evaluation: upload the zip archive at http://rrc.cvc.uab.es/?ch=4&com=mymethods&task=1
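The zip step in (a) can be scripted with the standard library. This is a minimal sketch: the function name `zip_txt_results` is hypothetical, and storing the txt files at the root of the archive (no sub-directory) is an assumption about what the evaluation site expects.

```python
import zipfile
from pathlib import Path


def zip_txt_results(output_dir, zip_path):
    """Pack every .txt file in output_dir into zip_path and return the file count."""
    txt_files = sorted(Path(output_dir).glob("*.txt"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for txt_file in txt_files:
            # arcname drops the directory part so files sit at the archive root.
            archive.write(txt_file, arcname=txt_file.name)
    return len(txt_files)
```

For the command above this would be called as `zip_txt_results('/tmp/output_2015/', '/tmp/output_2015.zip')`.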

Training

a. Download the training data

#icdar2013
wget http://rrc.cvc.uab.es/downloads/Challenge2_Training_Task12_Images.zip
wget http://rrc.cvc.uab.es/downloads/Challenge2_Training_Task1_GT.zip
wget http://rrc.cvc.uab.es/downloads/Challenge2_Test_Task12_Images.zip

#icdar2015
wget http://rrc.cvc.uab.es/downloads/ch4_training_images.zip
wget http://rrc.cvc.uab.es/downloads/ch4_training_localization_transcription_gt.zip
wget http://rrc.cvc.uab.es/downloads/ch4_test_images.zip

#icdar2017
wget -c -t 0 http://datasets.cvc.uab.es/rrc/ch8_training_images_1~8.zip
wget -c -t 0 http://datasets.cvc.uab.es/rrc/ch8_training_localization_transcription_gt_v2.zip
wget -c -t 0 http://datasets.cvc.uab.es/rrc/ch8_test_images.zip

b. Download the resnet50 pre-trained model

wget https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5

c. Set the following attributes in the configuration class in config.py

    # Pre-trained model
    PRE_TRAINED_WEIGHT = '/opt/pretrained_model/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'

    # Dataset paths
    IMAGE_DIR = '/opt/dataset/OCR/ICDAR_2015/train_images'
    IMAGE_GT_DIR = '/opt/dataset/OCR/ICDAR_2015/train_gt'

d. Train

python train.py --epochs 50
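Before starting a long training run, it may be worth checking that every image in IMAGE_DIR has a matching annotation file in IMAGE_GT_DIR. The sketch below assumes the ICDAR2015 naming convention, where `img_1.jpg` is annotated by `gt_img_1.txt`; the helper name is hypothetical, and other datasets may name their files differently.

```python
from pathlib import Path


def find_unpaired_images(image_dir, gt_dir, image_suffix=".jpg"):
    """Return the names of images that have no gt_<stem>.txt file in gt_dir."""
    missing = []
    for image_path in sorted(Path(image_dir).glob(f"*{image_suffix}")):
        # Assumed ICDAR2015 convention: img_1.jpg <-> gt_img_1.txt
        gt_path = Path(gt_dir) / f"gt_{image_path.stem}.txt"
        if not gt_path.is_file():
            missing.append(image_path.name)
    return missing
```

An empty list means every image was paired; anything else is worth fixing before calling `train.py`.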

Examples

ICDAR2015

With side refinement

Without side refinement

Data augmentation: horizontal flip

ICDAR2017

Other datasets

toDoList

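Side refinement (done)
Training on the ICDAR2017 dataset (done)
Map detected text-line coordinates back to the original image (done)
Accuracy evaluation (done)
Side regression, constrained to lie within the bounding box (done)
Add horizontal flipping (done)
Add random cropping (done)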

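Summary

CTPN detects horizontal text well.
The network is very sensitive to the dataset: a model trained on ICDAR2017 performs poorly when tested on ICDAR2015, and likewise a model trained on ICDAR2015 performs poorly on ICDAR2013.
Possibly this is because the bidirectional GRU gives the network a form of memory? With random cropping as data augmentation the network does not converge, and with horizontal flipping the predictions also come out horizontally mirrored.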
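For reference, horizontal-flip augmentation has to mirror the ground-truth boxes together with the image. The sketch below shows that for one quadrilateral. It is not the project's implementation: the function name is hypothetical, the corner order (clockwise from top-left) is an assumption, and `image_width - 1 - x` assumes integer pixel indices.

```python
def flip_quad_horizontally(quad, image_width):
    """Mirror a quadrilateral left-to-right.

    quad is [(x, y), ...] with four corners ordered clockwise from top-left.
    The result uses the same ordering.
    """
    mirrored = [(image_width - 1 - x, y) for x, y in quad]
    # After mirroring, the old top-left is the new top-right, and so on,
    # so the corners are swapped pairwise to restore the clockwise order.
    new_tr, new_tl, new_bl, new_br = mirrored
    return [new_tl, new_tr, new_br, new_bl]


box = [(10, 20), (110, 20), (110, 50), (10, 50)]
print(flip_quad_horizontally(box, 1280))
# [(1169, 20), (1269, 20), (1269, 50), (1169, 50)]
```

Applying the function twice returns the original box, which is a quick way to check the corner reordering.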
