jurl: Fast and simple URL parsing for Java, with UTF-8 and path resolving support
Stars: ✭ 84 (+75%)
Gosseract: Go package for OCR (Optical Character Recognition), using the Tesseract C++ library
Stars: ✭ 1,622 (+3279.17%)
Textshot: Python tool for grabbing text via screenshot
Stars: ✭ 1,163 (+2322.92%)
Tomlplusplus: Header-only TOML config file parser and serializer for C++17 (and later!).
Stars: ✭ 403 (+739.58%)
Lingo: Text encoding for modern C++
Stars: ✭ 28 (-41.67%)
Gimagereader: A Gtk/Qt front-end to tesseract-ocr.
Stars: ✭ 786 (+1537.5%)
simdutf8: SIMD-accelerated UTF-8 validation for Rust.
Stars: ✭ 426 (+787.5%)
Aadhaar Card Ocr: Extract text information from Aadhaar Card using tesseract-ocr 😎
Stars: ✭ 112 (+133.33%)
Portable Utf8: 🉑 Portable UTF-8 library - performance optimized (unicode) string functions for php.
Stars: ✭ 405 (+743.75%)
Tiny Utf8: Unicode (UTF-8) capable std::string
Stars: ✭ 322 (+570.83%)
Voca rs: Voca_rs is the ultimate Rust string library inspired by Voca.js, string.py and Inflector, implemented as independent functions and on Foreign Types (String and str).
Stars: ✭ 167 (+247.92%)
unicode-c: A C library for handling Unicode, UTF-8, surrogate pairs, etc.
Stars: ✭ 32 (-33.33%)
idcardocr: Recognition of second-generation resident ID card information in an offline environment
Stars: ✭ 358 (+645.83%)
Idcardocr: Recognition of second-generation resident ID card information in an offline environment
Stars: ✭ 328 (+583.33%)
Ultimatemrz Sdk: Machine-readable zone/travel document (MRZ / MRTD) detector and recognizer using deep learning
Stars: ✭ 66 (+37.5%)
Nkocr: 🔎📝 A module for running specialized OCR on food products and nutritional tables.
Stars: ✭ 15 (-68.75%)
Tesseract: Bindings to Tesseract OCR engine for R
Stars: ✭ 192 (+300%)
Tesseract4java: Java GUI and Tools for Tesseract OCR
Stars: ✭ 214 (+345.83%)
characteristics: Character info under different encodings
Stars: ✭ 25 (-47.92%)
Image2text: 📋 Python wrapper to grab text from images and save as text files using Tesseract Engine
Stars: ✭ 243 (+406.25%)
Encoding.js: Convert or detect character encoding in JavaScript
Stars: ✭ 338 (+604.17%)
Bstr: A string type for Rust that is not required to be valid UTF-8.
Stars: ✭ 348 (+625%)
Awesome Unicode: 😂 👌 A curated list of delightful Unicode tidbits, packages and resources.
Stars: ✭ 693 (+1343.75%)
libWinTF8: A library that handles UTF-8 and Unicode issues when porting a program to Windows
Stars: ✭ 18 (-62.5%)
homoglyphs: Get similar letters, convert to ASCII, detect possible languages and UTF-8 group.
Stars: ✭ 70 (+45.83%)
Stringz: 💯 Super fast Unicode-aware string manipulation JavaScript library
Stars: ✭ 181 (+277.08%)
BasicArabicOCR: A very basic Arabic OCR based on tesseract OCR engine written in Java.
Stars: ✭ 19 (-60.42%)
breach-protocol-autosolver: Solve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.
Stars: ✭ 28 (-41.67%)
Ccextractor: CCExtractor - Official version maintained by the core team
Stars: ✭ 356 (+641.67%)
TesseractStudio.Net: A free Windows graphical interface to the Tesseract 4.0 OCR engine.
Stars: ✭ 38 (-20.83%)
Blackout: NaNoGenMo 2016 entry #2
Stars: ✭ 36 (-25%)
Pyocr: A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab
Stars: ✭ 932 (+1841.67%)
Image text reader: Extracts text from images using the Tesseract OCR engine. Text in images is often blurry or of uneven size, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws bounding boxes around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.
Stars: ✭ 97 (+102.08%)
Unibits: Visualize different Unicode encodings in the terminal
Stars: ✭ 125 (+160.42%)
Text Detection: Text detection with mainly MSER and SWT
Stars: ✭ 167 (+247.92%)
Tesseract4android: Fork of tess-two rewritten from scratch to support latest version of Tesseract OCR.
Stars: ✭ 148 (+208.33%)
Tesseract: This package contains an OCR engine - libtesseract and a command line program - tesseract.
Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused
on line recognition, but also still supports the legacy Tesseract OCR engine of
Tesseract 3 which works by recognizing character patterns. Compatibility with
Tesseract 3 is enabled by using the Legacy OCR Engine mode (--oem 0).
It also needs traineddata files which support the legacy engine, for example
those from the tessdata repository.
Stars: ✭ 43,199 (+89897.92%)
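The legacy-engine mode mentioned in the Tesseract entry above can be sketched as a command-line invocation. This is a minimal sketch, not part of any listed project: the helper names `legacy_ocr_command` and `run_legacy_ocr`, the file names, and the `eng` language default are assumptions for illustration. It relies only on the `--oem`, `-l`, and `--tessdata-dir` options of the `tesseract` command line program, and assumes the traineddata directory you point it at contains files that support the legacy engine.

```python
import subprocess
from typing import List, Optional


def legacy_ocr_command(
    image_path: str,
    output_base: str,
    tessdata_dir: Optional[str] = None,
    lang: str = "eng",
) -> List[str]:
    """Build a tesseract command that selects the legacy engine (--oem 0).

    The function name and defaults are hypothetical; only the CLI options
    themselves come from the tesseract program.
    """
    cmd = ["tesseract", image_path, output_base, "--oem", "0", "-l", lang]
    if tessdata_dir is not None:
        # Directory holding traineddata files that support the legacy engine.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_legacy_ocr(image_path: str, output_base: str, **kwargs) -> None:
    """Run the command; requires tesseract to be installed and on PATH."""
    subprocess.run(legacy_ocr_command(image_path, output_base, **kwargs), check=True)
```

Calling `run_legacy_ocr("scan.png", "scan_out")` would ask tesseract to write the recognized text to `scan_out.txt`, provided the input image and suitable traineddata files exist.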
UniObfuscator: Java obfuscator that hides code in comment tags and Unicode garbage by making use of Java's Unicode escapes.
Stars: ✭ 40 (-16.67%)
Transliteration: UTF-8 to ASCII transliteration / slugify module for node.js, browser, Web Worker, React Native, Electron and CLI.
Stars: ✭ 444 (+825%)
Unicopy: Unicode command-line codepoint dumper
Stars: ✭ 16 (-66.67%)
Contour: Modern C++ Terminal Emulator
Stars: ✭ 191 (+297.92%)
Encoding rs: A Gecko-oriented implementation of the Encoding Standard in Rust
Stars: ✭ 196 (+308.33%)
Diagon: Interactive ASCII art diagram generators. 🌟
Stars: ✭ 189 (+293.75%)
ResumeRise: An NLP tool which classifies and summarizes resumes
Stars: ✭ 29 (-39.58%)
cs string: Header-only library providing unicode aware string support for C++
Stars: ✭ 91 (+89.58%)
Rust Unic: UNIC: Unicode and Internationalization Crates for Rust
Stars: ✭ 189 (+293.75%)
Tehreer-Android: Standalone text engine for Android aimed to be free from platform limitations
Stars: ✭ 61 (+27.08%)
tensorflow ocr: OCR detection implemented with TensorFlow v1.4
Stars: ✭ 15 (-68.75%)
Words-away: Prevent sensitive-word detection in text.
Stars: ✭ 224 (+366.67%)
Textwrap: An efficient and powerful Rust library for word wrapping text.
Stars: ✭ 164 (+241.67%)
Rabbit: Another Zawgyi <=> Unicode Converter
Stars: ✭ 157 (+227.08%)