Open Semantic EtlPython-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, with an ingestor to a Solr or Elasticsearch index and a linked data graph database
Stars: ✭ 165 (-42.51%)
Lambda Text ExtractorAWS Lambda functions to extract text from various binary formats.
Stars: ✭ 159 (-44.6%)
PapermergeOpen Source Document Management System for Digital Archives (Scanned Documents)
Stars: ✭ 1,177 (+310.1%)
MyboxEasy tools for documents, images, files, network, location, color, and media.
Stars: ✭ 45 (-84.32%)
PdftabextractA set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Stars: ✭ 1,969 (+586.06%)
RemarksExtract highlights, scribbles, and annotations from PDFs marked with the reMarkable tablet. Export to Markdown, PDF, PNG, and SVG
Stars: ✭ 94 (-67.25%)
OcrmypdfOCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Stars: ✭ 5,549 (+1833.45%)
Ambar🔍 Ambar: Document Search Engine
Stars: ✭ 1,829 (+537.28%)
DocspellAssist in organizing your piles of documents, resulting from scanners, e-mails and other sources, with minimal effort.
Stars: ✭ 303 (+5.57%)
Open PaperlessScan, index, and archive all of your paper documents (acquired by Mayan EDMS)
Stars: ✭ 2,538 (+784.32%)
PaperworkPersonal document manager (Linux/Windows) -- Moved to Gnome's Gitlab
Stars: ✭ 2,392 (+733.45%)
Mayan EdmsFree Open Source Document Management System (mirror, no pull requests or issues)
Stars: ✭ 226 (-21.25%)
ParsrTransforms PDF, Documents and Images into Enriched Structured Data
Stars: ✭ 2,736 (+853.31%)
ocrSimple app to extract text from pictures using Tesseract
Stars: ✭ 98 (-65.85%)
Pdftilecutpdftilecut lets you sub-divide a PDF page(s) into smaller pages so you can print them on small form printers.
Stars: ✭ 258 (-10.1%)
PRLibPre-Recognition Library - library with algorithms for improving OCR quality.
Stars: ✭ 22 (-92.33%)
PdfRust library to read, manipulate and write PDF files.
Stars: ✭ 265 (-7.67%)
BoxableBoxable is a library that can be used to easily create tables in pdf documents.
Stars: ✭ 253 (-11.85%)
OCR-ReaderAn Android app to extract text from camera preview directly.
Stars: ✭ 43 (-85.02%)
attentionocrAttention OCR in Tensorflow 2.0
Stars: ✭ 45 (-84.32%)
Iron-OCR-Image-to-Text-in-CSharpImage to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/
Stars: ✭ 65 (-77.35%)
Seven-Segment-OCRComputer vision project to automatically recognize digit characters on a seven-segment display
Stars: ✭ 58 (-79.79%)
QuickbillCreate unlimited invoices for free.
Stars: ✭ 278 (-3.14%)
DeckSlide Decks
Stars: ✭ 261 (-9.06%)
BasicArabicOCRA very basic Arabic OCR based on tesseract OCR engine written in Java.
Stars: ✭ 19 (-93.38%)
tutorialsGit Repo for Articles on Ergo Sum blog and the youtube channel https://www.youtube.com/channel/UCiie9CN--dazA7iT2sry5FA
Stars: ✭ 42 (-85.37%)
ScreenAccessAnti Recoil system with built-in weapon type recognition based on OCR; currently supports the following games: Apex Legends
Stars: ✭ 41 (-85.71%)
pdf2xml-viewerA simple viewer and inspection tool for text boxes in PDF documents
Stars: ✭ 82 (-71.43%)
tesseract-serverA small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizing the power of Google Tesseract.
Stars: ✭ 15 (-94.77%)
Cloud ReportsScans your AWS cloud resources and generates reports. Check out free hosted version:
Stars: ✭ 255 (-11.15%)
VehicleInfoOCRUse your camera to read number plates and obtain vehicle details. A simple, ad-free and faster alternative to existing Play Store apps
Stars: ✭ 35 (-87.8%)
Attention ocr.pytorchThis repository implements an encoder-decoder model with attention for OCR
Stars: ✭ 278 (-3.14%)
screenshot-actionsDunst actions for screenshots (OCR, upload to 0x0.st, delete, rename, move to/from clipboard)
Stars: ✭ 49 (-82.93%)
idcardocrOffline recognition of information on second-generation resident ID cards
Stars: ✭ 358 (+24.74%)
easyocreasy to ocr
Stars: ✭ 49 (-82.93%)
meltsubConvert hardsub to softsub
Stars: ✭ 19 (-93.38%)
PSENet-TensorflowTensorFlow implementation of PSENet text detector (Shape Robust Text Detection with Progressive Scale Expansion Network)
Stars: ✭ 51 (-82.23%)
ReptileCrawls all computer-related books published by China Machine Press
Stars: ✭ 282 (-1.74%)
TextBoxGANGenerate text boxes from input words with a GAN.
Stars: ✭ 50 (-82.58%)
smart-docs-parserAn OCR based document parser to extract information from identity document images
Stars: ✭ 14 (-95.12%)
deep-text-recognition-benchmarkPyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Stars: ✭ 123 (-57.14%)
granblue-automation-androidEducational application written in Kotlin aimed at automating user-defined workflows for the mobile game, "Granblue Fantasy", using MediaProjection, AccessibilityService, and OpenCV.
Stars: ✭ 26 (-90.94%)
TableexporttableExport (exports tables to files; supports json, csv, txt, xml, word, excel, image, pdf)
Stars: ✭ 261 (-9.06%)
breach-protocol-autosolverSolve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.
Stars: ✭ 28 (-90.24%)
go-ocrA tool for extracting text from scanned documents (via OCR), with user-defined post-processing.
Stars: ✭ 31 (-89.2%)
python-ocr-exampleThe code for the blogpost A Python Approach to Character Recognition
Stars: ✭ 54 (-81.18%)
CTC-OCRA TensorFlow implementation of a hybrid CNN-LSTM model with CTC loss for the OCR problem
Stars: ✭ 27 (-90.59%)
doctr-tfjs-demoJavascript demo of docTR, powered by TensorFlowJS
Stars: ✭ 21 (-92.68%)
ibm-cloud-functions-serverless-ocr-openchecksServerless bank check deposit processing with object storage and optical character recognition using Apache OpenWhisk powered by IBM Cloud Functions. See the Tech Talk replay for a demo.
Stars: ✭ 40 (-86.06%)
Starter BookA book starter to kickstart your writing journey 🎉
Stars: ✭ 277 (-3.48%)
UxmpdfkitAn iOS PDF viewer and annotator written in Swift that can be embedded into any application.
Stars: ✭ 260 (-9.41%)
namselAn OCR application focused on machine-print Tibetan text
Stars: ✭ 22 (-92.33%)
KTP-OCRAn Open Source OCR tool for Indonesian ID cards (KTP).
Stars: ✭ 48 (-83.28%)