
RubanSeven / Craft_keras

Licence: apache-2.0
Keras implementation of Character Region Awareness for Text Detection (CRAFT)


CRAFT_keras

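Paper: Character Region Awareness for Text Detection

Inference code from the original authors: clovaai/CRAFT-pytorch

Overview

This project implements the CRAFT text detection algorithm in Keras. CRAFT detects text by predicting a Gaussian heatmap for each individual character together with the affinity (connectivity) between adjacent characters.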

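Network architecture

The original network uses a VGG16-BN backbone, and its upsampling path is built from UpConv Blocks. The network produces two outputs at 1/2 of the input resolution:

1. Region score: a character-level Gaussian heatmap

2. Affinity score: a Gaussian heatmap of the links between adjacent characters

The network implemented here differs from the original in two ways:

1. VGG16 is the one bundled with Keras, which has no batch normalization (BN) layers.

2. A sigmoid activation is applied to the output heatmaps; the original applies no activation function.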

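Gaussian heatmaps

Affinity Box generation in the original

1. Draw the diagonals of each Character Box to obtain an upper triangle and a lower triangle.

2. Connect the centers of the upper and lower triangles of two adjacent characters to obtain the Affinity Box.

Affinity Box generation in this implementation

1. Draw the diagonals of each Character Box to obtain two pairs of triangles: the upper (T) and lower (B) triangles, and the left (L) and right (R) triangles.

2. Combine the two pairs of triangles of character 1 with the two pairs of character 2, which gives 4 combinations of 4 triangles each.

3. Each group of 4 triangles forms one candidate Affinity Box.

4. Select the candidate that is a convex quadrilateral with the largest area. (Whether choosing the largest area is the right criterion still needs to be verified.)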
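The candidate selection above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names are hypothetical, each triangle is represented by its centroid, the mean of the four corners stands in for the intersection of the diagonals, and each candidate quadrilateral is assembled in the order (a1, a2, b2, b1), all of which are assumptions.

```python
import itertools
import numpy as np

def triangle_centroids(box):
    """Return centroid pairs for the (top, bottom) and (left, right) triangles.

    `box` is a (4, 2) array ordered top-left, top-right, bottom-right, bottom-left.
    Assumption: the mean of the corners approximates the diagonal intersection.
    """
    tl, tr, br, bl = np.asarray(box, dtype=np.float64)
    center = (tl + tr + br + bl) / 4.0
    top, bottom = (tl + tr + center) / 3.0, (bl + br + center) / 3.0
    left, right = (tl + bl + center) / 3.0, (tr + br + center) / 3.0
    return [(top, bottom), (left, right)]

def is_convex(quad):
    """A quadrilateral is strictly convex if all consecutive edge cross products share a sign."""
    edges = np.roll(quad, -1, axis=0) - quad
    nxt = np.roll(edges, -1, axis=0)
    cross = edges[:, 0] * nxt[:, 1] - edges[:, 1] * nxt[:, 0]
    return bool(np.all(cross > 0) or np.all(cross < 0))

def polygon_area(quad):
    """Shoelace formula."""
    x, y = quad[:, 0], quad[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def affinity_box(box1, box2):
    """Try the 4 pair combinations and keep the convex candidate with the largest area."""
    best, best_area = None, 0.0
    pairs = itertools.product(triangle_centroids(box1), triangle_centroids(box2))
    for (a1, b1), (a2, b2) in pairs:
        quad = np.array([a1, a2, b2, b1])
        if not is_convex(quad):
            continue  # rejects degenerate and self-intersecting candidates
        area = polygon_area(quad)
        if area > best_area:
            best, best_area = quad, area
    return best
```

For two characters side by side the (T, B) pairs win, and for two characters stacked vertically the (L, R) pairs win, which is presumably why both pairs are considered.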

Generating the Gaussian heatmap template

A square 2D Gaussian heatmap is generated, following CornerNet.

Paper: princeton-vl/CornerNet

GitHub: princeton-vl/CornerNet
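A square Gaussian template of this kind can be sketched in a few lines of NumPy. The function name, the default size, and the choice of sigma = size / 6 are assumptions for illustration; the values used in this repository may differ.

```python
import numpy as np

def gaussian_template(size=65, sigma=None):
    """Square 2D Gaussian heatmap whose peak sits at the centre pixel.

    Assumption: sigma defaults to size / 6, an illustrative choice.
    An odd `size` places the peak exactly on a pixel, so the maximum is 1.0.
    """
    if sigma is None:
        sigma = size / 6.0
    coords = np.arange(size, dtype=np.float64) - (size - 1) / 2.0
    xx, yy = np.meshgrid(coords, coords)
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
```

The template is generated once and then warped onto each character or affinity box, so its resolution only needs to be large enough to survive downscaling.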

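Generating the Region Score GT and Affinity Score GT

OpenCV's perspective transform is used to warp the Gaussian heatmap template to the shape of each box. Where heatmaps overlap, this implementation follows CenterNet and keeps the maximum score.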

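Model training

Model training has two key parts: computing the confidence map and computing the loss.

Computing the confidence map

For datasets that have only word-level annotations and no character-level ones (such as ICDAR2013 and ICDAR2015), the character-level labels produced by weak supervision are not necessarily accurate, so the paper introduces a confidence score to rate each pseudo label.

s_{conf}(w) = \frac{l(w) - \min(l(w), |l(w) - l^{c}(w)|)}{l(w)}

where l(w) is the number of characters in word w and l^{c}(w) is the number of Character Boxes in the pseudo label.

S_{c}(p) = \begin{cases} s_{conf}(w) & p \in R(w) \\ 1 & \text{otherwise} \end{cases}

R(w) denotes the region of word w, in which the pseudo labels are generated.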
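As a worked example of the confidence score defined in the paper: the score is 1 when the number of pseudo Character Boxes matches the word length, and it drops as the counts diverge. The sketch below also fills a confidence map, treating R(w) as an axis-aligned rectangle for simplicity; that simplification and the function names are assumptions.

```python
import numpy as np

def word_confidence(word_length, num_char_boxes):
    """s_conf(w) = (l(w) - min(l(w), |l(w) - l^c(w)|)) / l(w)."""
    diff = abs(word_length - num_char_boxes)
    return (word_length - min(word_length, diff)) / word_length

def confidence_map(shape, words):
    """Pixels inside a word region get that word's confidence; all others get 1.

    `words` is a list of (x0, y0, x1, y1, word_length, num_char_boxes).
    Assumption: each region R(w) is an axis-aligned rectangle.
    """
    conf = np.ones(shape, dtype=np.float32)
    for x0, y0, x1, y1, length, n_boxes in words:
        conf[y0:y1, x0:x1] = word_confidence(length, n_boxes)
    return conf
```

For a 6-character word split into 5 boxes the confidence is 5/6, and for a 3-character word split into 10 boxes it is 0.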

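Computing the loss

L = \sum_{p} S_{c}(p) \cdot \left( \| S_{r}(p) - S_{r}^{*}(p) \|_{2}^{2} + \| S_{a}(p) - S_{a}^{*}(p) \|_{2}^{2} \right)

where S_{r}(p) and S_{a}(p) are the region score and affinity score output by the network, S_{r}^{*}(p) and S_{a}^{*}(p) are the Region Score GT and Affinity Score GT, and S_{c}(p) is the value of the confidence map at pixel p.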
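The loss from the paper is a confidence-weighted sum of squared errors over both score maps. A NumPy sketch for checking the arithmetic (a Keras loss would use backend tensor operations instead; the function name is hypothetical):

```python
import numpy as np

def craft_loss(region_pred, affinity_pred, region_gt, affinity_gt, confidence):
    """Sum over pixels of S_c(p) * ((S_r - S_r*)^2 + (S_a - S_a*)^2)."""
    per_pixel = (region_pred - region_gt) ** 2 + (affinity_pred - affinity_gt) ** 2
    return float(np.sum(confidence * per_pixel))
```

Pixels inside a low-confidence word region contribute less to the total, so unreliable pseudo labels pull the weights less strongly.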

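Generating pseudo labels

For datasets that have only word-level annotations and no character-level ones (such as ICDAR2013 and ICDAR2015), character-level labels have to be generated.

Original method

1. Crop the text image using the word-level box coordinates.

2. Predict the Region Score Map of the text image with the model currently being trained.

3. Segment the Region Score Map with the watershed algorithm to obtain the Character Box coordinates.

4. Convert the Character Box coordinates back to the original image coordinates.

Method used here

1. Predict the Region Score Map of the whole image with the model currently being trained.

2. Crop the local Region Score Map using the word-level box coordinates.

3. Segment the Region Score Map with the watershed algorithm to obtain the Character Box coordinates.

4. Convert the Character Box coordinates back to the original image coordinates.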
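The cropping and coordinate conversion in the second method might look like the sketch below. It assumes an axis-aligned word box given in image coordinates, a score map at 1/2 of the image resolution, and character boxes in (x, y, w, h) form; the function names are hypothetical.

```python
import numpy as np

def crop_word_region(score_map, word_box, scale=0.5):
    """Crop the part of `score_map` covered by `word_box`.

    `word_box` is (x0, y0, x1, y1) in image coordinates; `score_map` is at
    `scale` times the image resolution. Returns the crop and its (x, y)
    offset in score-map coordinates.
    """
    height, width = score_map.shape
    x0, y0, x1, y1 = [int(round(v * scale)) for v in word_box]
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, width), min(y1, height)
    return score_map[y0:y1, x0:x1], (x0, y0)

def to_image_coords(char_boxes, offset, scale=0.5):
    """Map (x, y, w, h) boxes from crop coordinates back to image coordinates."""
    ox, oy = offset
    return [((x + ox) / scale, (y + oy) / scale, w / scale, h / scale)
            for x, y, w, h in char_boxes]
```

Predicting once per image and cropping the score map afterwards avoids a separate forward pass for every word, which is the practical difference from the original method.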

Watershed algorithm

The author's reply on GitHub is quoted below:

I just followed the instruction provided by opencv document (https://docs.opencv.org/3.3.1/d3/db4/tutorial_py_watershed.html).

We used thresholding for the binary maps for finding three areas such as sure_fg, sure_bg, and unknown in the example.

Two thresholds are used for separating those areas, and the values are 0.6 and 0.2, respectively.

These thresholds are not sensitive for distinguishing those areas since they play a role of the initial guess for the watershed labeling. The initial markers are created by labeling the regions inside surely foreground area.

In addition, we used opencv watershed labeling function.

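Training strategy

Training steps

1. Train with strong supervision on strongly labeled data (SynthText) for 50k iterations.

2. Fine-tune on the other datasets, mixing strongly labeled and weakly labeled data.

Training tips

During fine-tuning, weakly labeled and strongly labeled data are mixed at a ratio of 1:5 so that the character-level labels stay accurate.

For text marked "DO NOT CARE" in ICDAR2015 and ICDAR2017, the Confidence is set to 0 during training.

Common data augmentations are applied, such as crops, rotations, and/or color variations.

The Adam optimizer is used for training.

OHEM (online hard example mining) is applied at a ratio of 1:3.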
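One way to read the 1:3 OHEM ratio is as positive pixels to hard negative pixels: keep every positive pixel and only the highest-loss negatives, up to three per positive. The sketch below takes that reading; the positive threshold of 0.1, the fallback when there are no positives, and the function name are all assumptions.

```python
import numpy as np

def ohem_loss(pixel_loss, gt, pos_thresh=0.1, neg_ratio=3):
    """Mean loss over all positive pixels plus the hardest negative pixels.

    Assumption: a pixel is positive when its ground-truth score >= pos_thresh,
    and at most neg_ratio negatives are kept per positive.
    """
    positive = gt >= pos_thresh
    n_pos = int(positive.sum())
    pos_loss = pixel_loss[positive]
    neg_loss = np.sort(pixel_loss[~positive])[::-1]  # hardest negatives first
    n_neg = min(neg_loss.size, neg_ratio * max(n_pos, 1))
    kept = np.concatenate([pos_loss, neg_loss[:n_neg]])
    return float(kept.mean()) if kept.size else 0.0
```

Since most pixels in a scene image are background, capping the negatives this way keeps easy background pixels from dominating the gradient.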

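Inference

The authors' inference code is used directly, so it is not described in detail here.

Results reported in the paper

[Image: results reported in the paper]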
