Pdftabextract: A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Stars: ✭ 1,969 (-64.52%)
Pan card ocr project: To extract details from Indian National Identification Cards such as PAN (completed) & Aadhar, Passport, Driving License (WIP) in a structured format
Stars: ✭ 39 (-99.3%)
Lambda Text Extractor: AWS Lambda functions to extract text from various binary formats.
Stars: ✭ 159 (-97.13%)
Ccextractor: CCExtractor - Official version maintained by the core team
Stars: ✭ 356 (-93.58%)
Ambar: 🔍 Ambar: Document Search Engine
Stars: ✭ 1,829 (-67.04%)
Mayan Edms: Free Open Source Document Management System (mirror, no pull requests or issues)
Stars: ✭ 226 (-95.93%)
Parsr: Transforms PDF, Documents and Images into Enriched Structured Data
Stars: ✭ 2,736 (-50.69%)
Typefont: The first open-source library that detects the font of a text in an image.
Stars: ✭ 1,575 (-71.62%)
Ssocr: Seven Segment Optical Character Recognition
Stars: ✭ 133 (-97.6%)
Signature extractor: A super lightweight image processing algorithm for detection and extraction of overlapped handwritten signatures on scanned documents using OpenCV and scikit-image.
Stars: ✭ 205 (-96.31%)
pmOCR: A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR conversion on file activity
Stars: ✭ 53 (-99.04%)
Papermerge: Open Source Document Management System for Digital Archives (Scanned Documents)
Stars: ✭ 1,177 (-78.79%)
Koreader Base: Base framework offering a Lua scriptable environment for creating document readers
Stars: ✭ 81 (-98.54%)
Paperwork: Personal document manager (Linux/Windows) -- Moved to Gnome's Gitlab
Stars: ✭ 2,392 (-56.89%)
Open Paperless: Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)
Stars: ✭ 2,538 (-54.26%)
Tessdata: Trained models with support for the legacy and LSTM OCR engines
Stars: ✭ 4,173 (-24.8%)
Libvips: A fast image processing library with low memory needs.
Stars: ✭ 6,094 (+9.82%)
Scene Text Recognition: Scene text detection and recognition based on Extremal Region (ER)
Stars: ✭ 146 (-97.37%)
Govips: A lightning fast image processing and resizing library for Go
Stars: ✭ 442 (-92.03%)
erpnext ocr: 🐍 ⚗️ Optical Character Recognition using tesseract within Frappe.
Stars: ✭ 58 (-98.95%)
tesseract-unity: Standalone OCR plugin for Unity using Tesseract
Stars: ✭ 35 (-99.37%)
textocry: Textocry - Copy text from Images (chrome extension)
Stars: ✭ 29 (-99.48%)
nimtesseract: A Tesseract OCR wrapper for Nim
Stars: ✭ 23 (-99.59%)
LaraOCR: Laravel Optical Character Reader (OCR) package using OCR engines like Tesseract
Stars: ✭ 88 (-98.41%)
IdCardRecognition: Android ID card recognition based on OCR.
Stars: ✭ 35 (-99.37%)
Mybox: Easy tools for documents, images, files, network, location, color, and media.
Stars: ✭ 45 (-99.19%)
Tesstrain: Train Tesseract LSTM with make
Stars: ✭ 251 (-95.48%)
Remarks: Extract highlights, scribbles, and annotations from PDFs marked with the reMarkable tablet. Export to Markdown, PDF, PNG, and SVG
Stars: ✭ 94 (-98.31%)
Image2text: 📋 Python wrapper to grab text from images and save as text files using Tesseract Engine
Stars: ✭ 243 (-95.62%)
Open Semantic Etl: Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
Stars: ✭ 165 (-97.03%)
Tessdata fast: Fast integer versions of trained LSTM models
Stars: ✭ 221 (-96.02%)
Prlib: Pre-Recognition Library - library with algorithms for improving OCR quality.
Stars: ✭ 18 (-99.68%)
Dmsmsgrcg: A photo OCR project that aims to output DMS messages contained in sign structure images.
Stars: ✭ 18 (-99.68%)
ocr: Simple app to extract text from pictures using Tesseract
Stars: ✭ 98 (-98.23%)
breach-protocol-autosolver: Solve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.
Stars: ✭ 28 (-99.5%)
Docspell: Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources, with minimal effort.
Stars: ✭ 303 (-94.54%)
Easyocr: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Stars: ✭ 13,379 (+141.11%)
ScribeBot: A highly scriptable automation system full of cool features. Automate everything with a little bit of Lua.
Stars: ✭ 72 (-98.7%)
Tesseract: Bindings to Tesseract OCR engine for R
Stars: ✭ 192 (-96.54%)
ocr2text: Convert a PDF via OCR to a TXT file in UTF-8 encoding
Stars: ✭ 90 (-98.38%)
saram: Get OCR output in txt form from an image or PDF, with support for multiple files from a directory, using pytesseract with auto rotation for wrong orientation. PYPI:
Stars: ✭ 51 (-99.08%)
ruzzle-solver: A python script that solves ruzzle boards
Stars: ✭ 46 (-99.17%)
memento: Organize the meme images in a directory by sorting them on text read from each meme with Tesseract OCR, and edit memes by segmenting them with OpenCV
Stars: ✭ 70 (-98.74%)
OCRmyPDF: OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Stars: ✭ 6,560 (+18.22%)
Nkocr: 🔎📝 A module for performing specialized OCR on food products and nutritional tables.
Stars: ✭ 15 (-99.73%)
tesseract-ocr: Node.js wrapper for Tesseract OCR CLI.
Stars: ✭ 29 (-99.48%)
MouseTooltipTranslator: Chrome extension - When the mouse hovers over text, it shows a translated tooltip using Google Translate
Stars: ✭ 93 (-98.32%)
Qanswer: [Deprecated] 🥇🥇🥇 Answer assistant for live quiz games such as 冲顶大会; provides decision support for answering questions to help you win
Stars: ✭ 326 (-94.13%)
Pdfocr: Adds text to PDF files using the cuneiform OCR software
Stars: ✭ 287 (-94.83%)
Android Ocr: Experimental optical character recognition app
Stars: ✭ 2,177 (-60.77%)
ReadToMe: No description or website provided.
Stars: ✭ 51 (-99.08%)
TesseractStudio.Net: A free Windows graphical interface to the Tesseract 4.0 OCR engine.
Stars: ✭ 38 (-99.32%)
Exifcleaner: Cross-platform desktop GUI app to clean image metadata
Stars: ✭ 305 (-94.5%)