Lucs1590 / Nkocr

Licence: Apache-2.0 license

🔎📝 This is a module for performing OCR specifically on food products and nutritional tables.

Programming Languages

python

Projects that are alternatives of or similar to Nkocr

React Native Tesseract Ocr

Tesseract OCR wrapper for React Native

Stars: ✭ 384 (+2460%)

Mutual labels: ocr, tesseract, tesseract-ocr

saram

Get OCR in txt form from an image or pdf extension supporting multiple files from directory using pytesseract with auto rotation for wrong orientation. PYPI:

Stars: ✭ 51 (+240%)

Mutual labels: ocr, tesseract, pytesseract

Textshot

Python tool for grabbing text via screenshot

Stars: ✭ 1,163 (+7653.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

TesseractStudio.Net

A free Windows graphical interface to the Tesseract 4.0 OCR engine.

Stars: ✭ 38 (+153.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

Tesseract

Bindings to Tesseract OCR engine for R

Stars: ✭ 192 (+1180%)

Mutual labels: ocr, tesseract, tesseract-ocr

Ccextractor

CCExtractor - Official version maintained by the core team

Stars: ✭ 356 (+2273.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

Tesseract

This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused on line recognition, but also still supports the legacy Tesseract OCR engine of Tesseract 3 which works by recognizing character patterns. Compatibility with Tesseract 3 is enabled by using the Legacy OCR Engine mode (--oem 0). It also needs traineddata files which support the legacy engine, for example those from the tessdata repository.

Stars: ✭ 43,199 (+287893.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

breach-protocol-autosolver

Solve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.

Stars: ✭ 28 (+86.67%)

Mutual labels: ocr, tesseract, tesseract-ocr

Tesseract4android

Fork of tess-two rewritten from scratch to support latest version of Tesseract OCR.

Stars: ✭ 148 (+886.67%)

Mutual labels: ocr, tesseract, tesseract-ocr

Tesseract Ocr for windows

Visual Studio Projects for Tesseract and dependencies

Stars: ✭ 122 (+713.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

Gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

Stars: ✭ 1,622 (+10713.33%)

Mutual labels: ocr, tesseract, tesseract-ocr

ruzzle-solver

A python script that solves ruzzle boards

Stars: ✭ 46 (+206.67%)

Mutual labels: ocr, tesseract, pytesseract

Aadhaar Card Ocr

Extract text information from Aadhaar Card using tesseract-ocr 😎

Stars: ✭ 112 (+646.67%)

Mutual labels: ocr, tesseract, tesseract-ocr

Image2text

📋 Python wrapper to grab text from images and save as text files using Tesseract Engine

Stars: ✭ 243 (+1520%)

Mutual labels: ocr, tesseract, tesseract-ocr

How-to-use-tesseract-ocr-4.0-with-csharp

How to use Tesseract OCR 4.0 with C#

Stars: ✭ 60 (+300%)

Mutual labels: ocr, tesseract, tesseract-ocr

pmOCR

A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR conversion on file activity

Stars: ✭ 53 (+253.33%)

Mutual labels: ocr, tesseract

BankCard-Recognizer

Identifying numbers from bankcard, based on Deep Learning with Keras [China Software Cup 2019]

Stars: ✭ 74 (+393.33%)

Mutual labels: ocr, east

ReadToMe

No description or website provided.

Stars: ✭ 51 (+240%)

Mutual labels: ocr, tesseract

ocr-machine-learning

OCR Machine Learning in python

Stars: ✭ 42 (+180%)

Mutual labels: ocr, spelling-correction

spell

Spelling correction and string segmentation written in Go

Stars: ✭ 24 (+60%)

Mutual labels: spelling-correction, symspell

Prerequisites
- Tesseract OCR
- OpenCV
Installation
- Pip
- Conda
Usage
- Example
Under the Hood
- Changing Language
- Operating Pipeline
Supporting

📝 Prerequisites

This project depends on the Tesseract library and OpenCV, so we will install these prerequisites first.

Tesseract OCR

On Linux, Tesseract can be installed with a single command:

$ sudo apt install tesseract-ocr libtesseract-dev

The same goes for macOS. The command differs between MacPorts and Homebrew; only the Homebrew version is shown here:

$ brew install tesseract

Once Tesseract is installed, you can run OCR with a single command and already extract some words from an image.
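As a minimal sketch of that one-command OCR, the snippet below calls the Tesseract command-line program from Python. The function names and the image file name are hypothetical; the only assumption about Tesseract itself is its standard `tesseract <image> stdout [-l <lang>]` invocation. This is independent of Nkocr and only illustrates the prerequisite.

```python
import subprocess


def build_tesseract_command(image_path, lang=None):
    """Build the argument list for a one-shot Tesseract CLI call."""
    # Using "stdout" as the output base prints the text instead of writing a file.
    command = ["tesseract", image_path, "stdout"]
    if lang:
        # -l selects the recognition language, e.g. "por" or "eng".
        command += ["-l", lang]
    return command


def run_tesseract(image_path, lang=None):
    """Run Tesseract on an image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout


# Example call (hypothetical file name, requires Tesseract on PATH):
# print(run_tesseract("label.png", lang="por"))
```

Keeping the command construction in its own function makes it easy to inspect or log the exact call before running it.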

OpenCV

On Linux, OpenCV can be installed with a single command:

$ sudo apt install python3-opencv

On macOS, run the following command:

$ brew install opencv
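To confirm that both prerequisites are visible from Python, a quick check along these lines may help. It is a sketch, not part of Nkocr: `check_prerequisites` is a hypothetical helper, and it assumes the Tesseract binary is named `tesseract` and the OpenCV Python bindings are importable as `cv2`.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report whether the Tesseract binary and the OpenCV bindings are visible."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        # find_spec returns None when the cv2 module cannot be located.
        "opencv": importlib.util.find_spec("cv2") is not None,
    }


# Example call:
# print(check_prerequisites())
```

A `False` value for either key suggests repeating the corresponding installation step above.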

⚙️ Installation

With the prerequisites installed, you are ready to set up the Nkocr environment to modify, contribute to, and work on the project!

If you just want to use the project, skip to the Usage section.

Pip

You can install the project requirements in a Python environment by running:

$ pip install -r requirements.txt --user

Conda

If you prefer a conda environment to keep everything organized, or want to try one this time, run the following command to create a dedicated environment for Nkocr.

$ conda env create -f environment.yml

👨‍💻 Usage

Using this package is very easy. First, install it by running:

$ pip install nkocr --user

After installing, you can import the classes in a Python script, as in the example below.

from nkocr import OcrTable, OcrProduct

Example

To make it even easier, here is an example code snippet.

from nkocr import OcrTable

# Replace the placeholder with the URL of the image to read
text = OcrTable("paste_image_url_here")
print(text)  # or print(text.text)
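When several images need to be processed, the single call above can be wrapped in a small loop. The helper below is a hypothetical sketch, not part of Nkocr: `ocr_many` is an invented name, and it assumes only what the example shows, namely that the class is built from one image source and exposes the result through `.text`. The exceptions Nkocr may raise are not documented here, so the sketch catches them broadly.

```python
def ocr_many(sources, ocr_class):
    """Run one OCR class over several image sources, keeping failures separate."""
    texts, errors = {}, {}
    for source in sources:
        try:
            # The example above reads the result through the `.text` attribute.
            texts[source] = ocr_class(source).text
        except Exception as exc:  # broad on purpose: library exceptions are unknown here
            errors[source] = str(exc)
    return texts, errors


# Example call (requires nkocr to be installed; the URLs are placeholders):
# from nkocr import OcrTable
# texts, errors = ocr_many(["first_image_url", "second_image_url"], OcrTable)
```

Passing the class as an argument means the same loop could be tried with `OcrProduct`, assuming it is constructed the same way.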

ℹ️ Under the Hood

From here on, we cover some more technical details of the library.

Changing Language

The default language is Portuguese, so text in other languages may not be captured correctly. If you want to work with another language, you will need to change the language that the algorithm uses.

The first step is to download the desired language with Tesseract support. Don't forget to replace <lang> with the desired language. If you would like more details, please refer to the Tesseract documentation. On Linux, this can be done by running the following command:

$ sudo apt install tesseract-ocr-<lang>

If you are a macOS user, the command is a little different. Run the following command, and don't worry about choosing a language: after running it, you will have access to all languages.

$ brew install tesseract-lang

After downloading the language support, to perform OCR in the desired language you will have to change the code in ocr_product.py, ocr_table.py and auxiliary.py.
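Before editing those files, it can be useful to confirm that Tesseract actually sees the newly installed language. The sketch below relies on the `tesseract --list-langs` command; the function names are hypothetical, and the parser assumes the usual output of a header line followed by one language code per line.

```python
import subprocess


def parse_language_list(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, which starts with "List of available languages".
    return [line for line in lines if not line.lower().startswith("list of")]


def installed_languages():
    """Ask the local Tesseract installation which languages it can use."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some Tesseract releases print the list to stderr, so both streams are read.
    return parse_language_list(result.stdout + result.stderr)


# Example call (requires Tesseract on PATH):
# print("eng" in installed_languages())
```

If the expected code is missing from the list, the language package was probably not installed for the Tesseract binary found on PATH.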

Operating Pipeline

The main algorithm is built mainly on structures and methods from computer vision and digital image processing. The image below shows the sequence of steps followed by the operating pipeline.

🤝 Supporting

Many hours of hard work have gone into this project. Your support will be greatly appreciated!
