
isee15 / Card Ocr

ID card recognition OCR

Programming Languages

python3

Projects that are alternatives of or similar to Card Ocr

Nkocr
🔎📝 This is a module to make specifics OCRs at food products and nutritional tables.
Stars: ✭ 15 (-95.65%)
Mutual labels:  ocr, tesseract
tesseract-ocr
Node.js wrapper for Tesseract OCR CLI.
Stars: ✭ 29 (-91.59%)
Mutual labels:  ocr, tesseract
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Stars: ✭ 6,560 (+1801.45%)
Mutual labels:  ocr, tesseract
ruzzle-solver
A python script that solves ruzzle boards
Stars: ✭ 46 (-86.67%)
Mutual labels:  ocr, tesseract
staff identity card ocr project
Staff Identity Card OCR Project
Stars: ✭ 15 (-95.65%)
Mutual labels:  ocr, tesseract
tesseract-unity
Standalone OCR plugin for Unity using Tesseract
Stars: ✭ 35 (-89.86%)
Mutual labels:  ocr, tesseract
LaraOCR
Laravel Optical Character Reader(OCR) package using ocr engines like Tesseract
Stars: ✭ 88 (-74.49%)
Mutual labels:  ocr, tesseract
saram
Get OCR in txt form from an image or pdf extension supporting multiple files from directory using pytesseract with auto rotation for wrong orientation. PYPI:
Stars: ✭ 51 (-85.22%)
Mutual labels:  ocr, tesseract
cordova-plugin-tesseract
Cordova Plugin for OCR process using Tesseract
Stars: ✭ 70 (-79.71%)
Mutual labels:  ocr, tesseract
TesseractStudio.Net
A free Windows graphical interface to the Tesseract 4.0 OCR engine.
Stars: ✭ 38 (-88.99%)
Mutual labels:  ocr, tesseract
How-to-use-tesseract-ocr-4.0-with-csharp
How to use Tesseract OCR 4.0 with C#
Stars: ✭ 60 (-82.61%)
Mutual labels:  ocr, tesseract
breach-protocol-autosolver
Solve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.
Stars: ✭ 28 (-91.88%)
Mutual labels:  ocr, tesseract
erpnext ocr
🐍 ⚗️ Optical Character Recognition using tesseract within Frappe.
Stars: ✭ 58 (-83.19%)
Mutual labels:  ocr, tesseract
nimtesseract
A Tesseract OCR wrapper for Nim
Stars: ✭ 23 (-93.33%)
Mutual labels:  ocr, tesseract
ocr2text
Convert a PDF via OCR to a TXT file in UTF-8 encoding
Stars: ✭ 90 (-73.91%)
Mutual labels:  ocr, tesseract
textocry
Textocry - Copy text from Images (chrome extension)
Stars: ✭ 29 (-91.59%)
Mutual labels:  ocr, tesseract
MouseTooltipTranslator
chrome extension - When mouse hover on text, it shows translated tooltip using google translate
Stars: ✭ 93 (-73.04%)
Mutual labels:  ocr, tesseract
memento
Organize your meme image cluster in a better format using OCR from the meme to sort them using tesseract along with editing memes by segmenting them using OpenCV within a directory
Stars: ✭ 70 (-79.71%)
Mutual labels:  ocr, tesseract
IdCardRecognition
Android ID card recognition based on OCR.
Stars: ✭ 35 (-89.86%)
Mutual labels:  ocr, tesseract
ocr
Simple app to extract text from pictures using Tesseract
Stars: ✭ 98 (-71.59%)
Mutual labels:  ocr, tesseract

Card-Ocr

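ID card OCR: automatically extracts the ID number from a picture of an ID card. The test pictures are sample images found through Baidu search. Only a few images were found, and all of them are currently recognized correctly. Usable datasets are hard for an individual to obtain.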

Dependencies

  • opencv
  • pytesseract
  • numpy
  • matplotlib

Pipeline

  • Locate the ID number region

image → grayscale → invert → dilate → findContours

  • Digit recognition

Recognition uses Tesseract; the traineddata file is generated with trainfont.py.

Using trainfont

  1. Generate the box file automatically with autoBox = 1
 trainFont(fontName, fontPath, fontsize, txt, "eng", 0, autoBox=1)
  2. Correct the box file with a tool such as jBoxEditor
  3. Generate the traineddata with autoBox = 0
 trainFont(fontName, fontPath, fontsize, txt, "eng", 0, autoBox=0)

Recognition

Once the ID card region has been located, crop the ID number, convert it to grayscale, and pass it to pytesseract.

 pytesseract.image_to_string(image, lang='ocrb', config=tessdata_dir_config)
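
The call above does not show how `tessdata_dir_config` is defined. One plausible way to build it is sketched below. The helper names, the `./tessdata` path and the `--psm 7` (single text line) setting are assumptions for illustration; only `--tessdata-dir` and `lang='ocrb'` follow from the call itself.

```python
def build_tess_config(tessdata_dir, psm=7):
    """Build the config string handed to pytesseract (hypothetical helper)."""
    # --tessdata-dir: folder that contains ocrb.traineddata.
    # --psm 7: treat the image as a single line of text (an assumption here).
    return f'--tessdata-dir "{tessdata_dir}" --psm {psm}'


def read_id_number(number_image, tessdata_dir="./tessdata"):
    """OCR a cropped grayscale ID-number image and drop all whitespace."""
    # Imported here so that build_tess_config can be used without Tesseract installed.
    import pytesseract

    config = build_tess_config(tessdata_dir)
    text = pytesseract.image_to_string(number_image, lang="ocrb", config=config)
    return "".join(text.split())
```

`read_id_number` needs both the Tesseract binary and the trained `ocrb.traineddata` file to be present in the given folder.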

Keras

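Besides Tesseract, recognition can also be trained with machine learning. This project uses Keras with TensorFlow: "two double convolution and pooling blocks at the start, followed by a dropout layer to prevent overfitting, then two fully connected layers, and finally a softmax that outputs the result." genData.py generates the training data. The cropped ID number image is split into 18 images (x-predict.png), and kerastrain.py is used to predict them. The trained model can sometimes tell 3 and 5 apart and sometimes cannot. Training was done on the CPU because no CUDA-capable graphics card was available.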
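
As a rough illustration of the two steps described above, the sketch below splits the number strip into 18 slices and builds a network with the quoted layer order. It is not the code in genData.py or kerastrain.py: the equal-width split, the 32x32 input size, the filter counts, the dropout rate and the 11 classes (digits 0-9 plus X) are all assumptions, and the softmax is taken to be the activation of the second fully connected layer.

```python
import numpy as np


def split_digits(number_image, count=18):
    """Split a cropped ID-number image into `count` slices along its width.

    Assumption: characters are evenly spaced, so equal-width slices suffice.
    np.array_split also handles widths that are not a multiple of `count`.
    """
    return np.array_split(number_image, count, axis=1)


def build_model(input_shape=(32, 32, 1), num_classes=11):
    """Two conv-conv-pool blocks, dropout, then two dense layers ending in softmax."""
    # Imported lazily so that split_digits works without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.5),  # guards against overfitting
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Each slice from `split_digits` would still need to be resized to the model's input shape and scaled before being passed to `model.predict`.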

Results

plot

TODO

  • [ ] Train the recognizer using Keras with TensorFlow
