UB-Mannheim / ocromore

Licence: Apache-2.0 License
Process, enhance and evaluate multiple OCR outputs.

ocromore

Overview

ocromore is a command-line post-processing tool for OCR outputs. Its main purpose is to combine the best parts of multiple OCR outputs into a single, more accurate result.
It can also be used to find optimal settings for OCR software, to visualize information about the OCR results or their context, or to run ad hoc queries. It is part of the Aktienführer-Datenarchiv work process, but can also be used independently.

First, the program parses the different OCR output files and saves the results to an SQLite database. The database serves as an exchange and storage platform, with pandas as the handler. An objectifier for the pandas DataFrame enables a wide range of performant use cases. The software has an implementation of the multiple sequence alignment (MSA) algorithm for combining multiple OCR outputs. To evaluate the results, you can either use the commonly used ISRI tools to generate an accuracy report or do a visual comparison with diff tools like meld.
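
The idea of using a database as an exchange platform can be sketched as follows. This is only an illustration, not ocromore's actual schema or code: the table name `words`, its columns, and the sample records are assumptions, and the sketch uses only `sqlite3` from the standard library instead of pandas to stay dependency-free.

```python
import sqlite3

# Hypothetical parsed records: (engine, line index, word index, text, confidence)
records = [
    ("tesseract", 0, 0, "Hello", 0.96),
    ("tesseract", 0, 1, "wor1d", 0.71),
    ("ocropus", 0, 0, "Hel1o", 0.80),
    ("ocropus", 0, 1, "world", 0.93),
]


def store_words(conn, rows):
    """Create the (assumed) words table if needed and insert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        "engine TEXT, line_idx INTEGER, word_idx INTEGER, text TEXT, conf REAL)"
    )
    conn.executemany("INSERT INTO words VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def words_by_position(conn):
    """Group the candidate words of all engines by (line, word) position."""
    grouped = {}
    query = (
        "SELECT line_idx, word_idx, engine, text, conf FROM words "
        "ORDER BY line_idx, word_idx, engine"
    )
    for line_idx, word_idx, engine, text, conf in conn.execute(query):
        grouped.setdefault((line_idx, word_idx), []).append((engine, text, conf))
    return grouped


conn = sqlite3.connect(":memory:")  # an in-memory db keeps the sketch self-contained
store_words(conn, records)
candidates = words_by_position(conn)
```

Once every engine's output is in one table, later steps (matching, combining, evaluating) can query the same store instead of re-parsing the OCR files.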

Beta Results

Our current character accuracy results (ignoring whitespace) are:

OCR-Engine AKF-II UNLV
Abbyy 99,35 % 98,46 %
Ocropus (default en-model) 92,49 %
Ocropus (trained) 98,76 %
Tesseract 99,00 % 98,23 %
MSA 99,60 % 98,65 %

You can find the AKF-II results in docs/results. The UNLV results are not optimized, but MSA still improves on each individual engine. You can find the UNLV results in Testfiles/results.
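
To make the metric concrete, here is a minimal sketch of a character accuracy that ignores whitespace. It assumes accuracy is defined as (ground-truth characters minus edit-distance errors) divided by ground-truth characters. The function names are hypothetical, and the numbers above come from the ISRI tools, not from this code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_accuracy(ground_truth: str, ocr_text: str) -> float:
    """Character accuracy in [.., 1.0], computed after removing all whitespace."""
    gt = "".join(ground_truth.split())
    ocr = "".join(ocr_text.split())
    if not gt:
        raise ValueError("ground truth contains no non-whitespace characters")
    errors = levenshtein(gt, ocr)
    return (len(gt) - errors) / len(gt)


# One wrong character out of ten; the stray space in "wor ld" is ignored.
accuracy = char_accuracy("Hello world", "Hel1o wor ld")
```

Note that the value can drop below zero when the OCR text contains many more characters than the ground truth.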

Roadmap

✓ Parse files to the database
✓ Preprocess file information
✓ Combine file information
✓ Evaluate results against groundtruth
✓ Visual comparison (result vs. gt) with diff-tool
✓ Store results in txt-file
✘ Store results in database/hocr-files
✘ Plot results in different ways (with matplotlib)

Supported file formats

✓ hocr (with confidences)
✓ abbyy-xml (with confidences "ASCII")

Installation

This installation has been tested on Ubuntu and should work in similar environments.

1. Requirements

2. Copy this repository

git clone https://github.com/UB-Mannheim/ocromore.git
cd ocromore
git submodule update --init --recursive

3. Dependencies can be installed into a Python Virtual Environment:

$ virtualenv ocromore_venv/
$ source ocromore_venv/bin/activate
$ pip install -r requirements.txt

Docker (alternative way)

If you want to use the CLI commands on Windows, we recommend using Docker:

git clone https://github.com/UB-Mannheim/ocromore.git
cd ocromore
git submodule update --init --recursive

# build it yourself
# (${PWD} expands to the current directory in bash and PowerShell)
docker build -t ocromore .
docker run -it -v "${PWD}":/home/developer/coding/ocromore ocromore

# or use the container from docker hub
docker pull ubma/ocromore
docker run -it -v "${PWD}":/home/developer/coding/ocromore ubma/ocromore

You can then run the scripts for visual results outside Docker in your OS. For that, you need Python and Meld installed, and Meld must be added to the environment variables (ENV):

  • Variable = "Path"
  • Value = {directory to meld}\meld.exe

Developing

The project was written in PyCharm 2017.3 (CE),
so developers are encouraged to use it.

Python 3.6.3 (default, Oct 6 2017, 08:44:35)
GCC 5.4.0 20160609 on linux
Tested on: Ubuntu 17.10

Meld is the default diff-tool,
but you can easily implement the diff-tool of your choice.
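
As a rough idea of what an alternative to Meld could look like, the sketch below produces a text diff with Python's standard `difflib` module. The function `text_diff` and its labels are hypothetical and not part of ocromore. How a custom diff-tool is actually plugged into the project is not shown here.

```python
import difflib


def text_diff(result_text: str, ground_truth_text: str) -> str:
    """Return a unified diff from the ground truth to the OCR result."""
    diff = difflib.unified_diff(
        ground_truth_text.splitlines(),
        result_text.splitlines(),
        fromfile="groundtruth",
        tofile="result",
        lineterm="",
    )
    return "\n".join(diff)


report = text_diff("Hel1o world\nsecond line", "Hello world\nsecond line")
print(report)
```

Lines starting with `-` come from the ground truth and lines starting with `+` from the OCR result, so remaining recognition errors show up as `-`/`+` pairs.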

The ISRI Tools are necessary for the evaluation, but not for the combine process.

Process steps

ocromore-overview

  1. Parse all OCR output files into a database
    (This step only has to be done once)

  2. Pre-process the gathered information
    The results of the following processes can also be stored directly in the database

    • Line-matching all files
    • Unspacing words in each file
      Unspacing means deleting the whitespace in spaced text
      (E.g. H e l l o => Hello)
    • Word-matching all files per line
  3. Combine file information

    • Different compare methods
      • Textdistance-Keying
        • Levenshtein
        • Damerau-Levenshtein
        • etc.
      • Multi-Sequence-Alignment (MSA)
        • pivot-based
        • linewise/wordwise
        • Adjustable search-space-processor correction
          • Matching similar characters
          • Whitespace/Wildcard improvements
        • Adjustable decision parameters
          • Char confidence
          • Best-of-n
  4. The output can be stored in the database and/or as *.txt or *.hocr.

  5. Evaluate the output against groundtruth files or against each other and generate an accuracy report, or compare the files visually via diff-tools.
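
Two of the steps above, unspacing and the final vote, can be illustrated with a short sketch. It is not ocromore's implementation. The regular expression, the gap marker `~`, and the functions `unspace` and `vote` are assumptions. The sketch also takes the alignment itself for granted: `vote` expects lines that an MSA step has already padded to equal length.

```python
import re
from collections import defaultdict

# A run of three or more single characters separated by single spaces.
_SPACED_RUN = re.compile(r"(?<!\S)(?:\S ){2,}\S(?!\S)")

GAP = "~"  # hypothetical marker an alignment step inserts for missing characters


def unspace(line: str) -> str:
    """Collapse spaced-out runs, e.g. 'H e l l o' -> 'Hello'."""
    return _SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), line)


def vote(aligned, confidences=None):
    """Pick one character per column from already aligned, equal-length lines.

    Without confidences this is a plain majority (best-of-n) vote. With
    per-character confidences, each vote is weighted by its confidence.
    Ties go to the line listed first, which plays the role of the pivot.
    """
    length = len(aligned[0])
    if any(len(text) != length for text in aligned):
        raise ValueError("all aligned lines must have the same length")
    out = []
    for col in range(length):
        scores = defaultdict(float)
        for row, text in enumerate(aligned):
            weight = confidences[row][col] if confidences else 1.0
            scores[text[col]] += weight
        best = max(scores, key=scores.get)
        if best != GAP:  # a winning gap means "no character here"
            out.append(best)
    return "".join(out)


print(unspace("H e l l o world"))
# Hello world
print(vote(["Hel1o world", "Hello wor1d", "Hello world"]))
# Hello world
```

The simple pattern in `unspace` cannot tell a one-letter word from the start of a spaced run, so "a t e s t" would become "atest". A real implementation needs more context than this.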

Running

Example

First, adjust the config files. There are two main config files in "./configuration/":

  • to_db_reader
    • path to the OCR files (e.g. hocr)
    • parameters for parsing hocr to db
      • naming etc.
  • voter
    • path to db
    • parameters for combining the information from the OCR files

The parameters needed to run the examples are set by default,
so you can just run the following commands.

At the current stage, it is recommended to use PyCharm for the next steps.

Parse files to db and do pre-processing:

# All parameters can be set in the to_db_reader config
# set HOCR2SQL to parse files to the db
# set the POS parameters to define the naming of the db and tables
# set PREPROCSSING (it is recommended, but not necessary, to perform the
# preprocessing steps directly after parsing)

$ python3 ./main_prepare_dataset.py

Combine files and generate an accuracy report:

# All parameters can be set in the voter config
# set DO_MSA_BEST to perform the MSA (not Textdistance) method
# set DO_ISRI_VAL to generate an accuracy report

$ python3 ./main_msa_ndist_charconf.py

To perform a visual comparison:

$ python3 ./result_visualization.py

The results are stored in ./Testfiles/tableparser_output/

Copyright and License

Copyright (c) 2017 Universitätsbibliothek Mannheim

Author:

ocromore is Free Software. You may use it under the terms of the Apache 2.0 License. See LICENSE for details.

Acknowledgements

The tools depend on some third-party libraries:

  • hocr-parser parses hocr files into a dictionary structure. Originally written by Athento.
  • ISRI Analytics Tool for measuring the performance of and experimenting with OCR output.
  • PySymSpell a pure Python port of SymSpell. It is an optional submodule for the project (MIT License).