
eddieantonio / ocreval

License: Apache-2.0
Update of the ISRI Analytic Tools for OCR Evaluation with UTF-8 support

Programming Languages

C, Python, Roff, Makefile, Shell

Projects that are alternatives of or similar to ocreval

jurl
Fast and simple URL parsing for Java, with UTF-8 and path resolving support
Stars: ✭ 84 (+75%)
Mutual labels:  unicode, utf-8
Portable Utf8
🉑 Portable UTF-8 library - performance optimized (unicode) string functions for php.
Stars: ✭ 405 (+743.75%)
Mutual labels:  unicode, utf-8
Encoding.js
Convert or detect character encoding in JavaScript
Stars: ✭ 338 (+604.17%)
Mutual labels:  unicode, utf-8
unicode-c
A C library for handling Unicode, UTF-8, surrogate pairs, etc.
Stars: ✭ 32 (-33.33%)
Mutual labels:  unicode, utf-8
Unicopy
Unicode command-line codepoint dumper
Stars: ✭ 16 (-66.67%)
Mutual labels:  unicode, utf-8
libWinTF8
The library handling things related to UTF-8 and Unicode when you want to port your program to Windows
Stars: ✭ 18 (-62.5%)
Mutual labels:  unicode, utf-8
Tomlplusplus
Header-only TOML config file parser and serializer for C++17 (and later!).
Stars: ✭ 403 (+739.58%)
Mutual labels:  unicode, utf-8
simdutf8
SIMD-accelerated UTF-8 validation for Rust.
Stars: ✭ 426 (+787.5%)
Mutual labels:  unicode, utf-8
receipt-manager-app
Receipt parser application written in dart.
Stars: ✭ 140 (+191.67%)
Mutual labels:  ocr, tesseract-ocr
Awesome Unicode
😂 👌 A curated list of delightful Unicode tidbits, packages and resources.
Stars: ✭ 693 (+1343.75%)
Mutual labels:  unicode, utf-8
UniObfuscator
Java obfuscator that hides code in comment tags and Unicode garbage by making use of Java's Unicode escapes.
Stars: ✭ 40 (-16.67%)
Mutual labels:  unicode, utf-8
Voca rs
Voca_rs is the ultimate Rust string library inspired by Voca.js, string.py and Inflector, implemented as independent functions and on Foreign Types (String and str).
Stars: ✭ 167 (+247.92%)
Mutual labels:  unicode, utf-8
Lingo
Text encoding for modern C++
Stars: ✭ 28 (-41.67%)
Mutual labels:  unicode, utf-8
Tiny Utf8
Unicode (UTF-8) capable std::string
Stars: ✭ 322 (+570.83%)
Mutual labels:  unicode, utf-8
UnicodeBOMInputStream
Doing things right, in the name of Sun / Oracle
Stars: ✭ 36 (-25%)
Mutual labels:  unicode, utf-8
Bstr
A string type for Rust that is not required to be valid UTF-8.
Stars: ✭ 348 (+625%)
Mutual labels:  unicode, utf-8
homoglyphs
Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group.
Stars: ✭ 70 (+45.83%)
Mutual labels:  unicode, utf-8
characteristics
Character info under different encodings
Stars: ✭ 25 (-47.92%)
Mutual labels:  unicode, utf-8
Transliteration
UTF-8 to ASCII transliteration / slugify module for node.js, browser, Web Worker, React Native, Electron and CLI.
Stars: ✭ 444 (+825%)
Mutual labels:  unicode, utf-8
Unibits
Visualize different Unicode encodings in the terminal
Stars: ✭ 125 (+160.42%)
Mutual labels:  unicode, utf-8

ocreval

ocreval consists of 17 tools for measuring the performance of, and experimenting with, OCR output. See the user guide for more information.

ocreval is a modern port of the ISRI Analytic Tools for OCR Evaluation, with UTF-8 support and other improvements.

See the archived Google Code repository of the original project!

Install (macOS)

Using Homebrew:

brew install eddieantonio/eddieantonio/ocreval

Building

To build the library and all of the programs, ensure that you have all required dependencies.

Dependencies

ocreval requires utf8proc to build from source.

macOS

Using Homebrew:

brew install utf8proc

Ubuntu/Debian

You may need to install make and a C compiler:

sudo apt install build-essential

Then install libutf8proc-dev:

sudo apt install libutf8proc-dev

If libutf8proc-dev cannot be installed using apt, follow the instructions under Other Linux below.

Other Linux

Build and install utf8proc from source:

curl -OL https://github.com/JuliaStrings/utf8proc/archive/v1.3.1.tar.gz
tar xzf v1.3.1.tar.gz
cd utf8proc-1.3.1/
make
sudo make install
# Rebuild the shared object cache - needed to load the library
# at runtime <http://linux.die.net/man/8/ldconfig>
sudo ldconfig
cd -
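
Whichever route you used to install utf8proc, it can help to confirm that the compiler sees the header before running make. The following is a minimal sketch, not part of ocreval: the function names are hypothetical, and it assumes your compiler is reachable as cc (or via the CC variable) and accepts the GCC/Clang flags -x c and -fsyntax-only.

```shell
# Return success if the C compiler can find <utf8proc.h>.
# Assumption: the compiler accepts GCC/Clang-style flags.
have_utf8proc() {
    printf '#include <utf8proc.h>\nint main(void) { return 0; }\n' |
        "${CC:-cc}" -x c -fsyntax-only - >/dev/null 2>&1
}

# Print a one-line report based on the check above.
report_utf8proc() {
    if have_utf8proc; then
        echo "utf8proc headers found"
    else
        echo "utf8proc headers missing: install them before running make"
    fi
}

report_utf8proc
```

This only checks the header. A missing shared library at runtime is a separate problem, which is what the ldconfig step above addresses.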

Building the tools

Once all dependencies are installed, you may compile all of the utilities using make:

make

Installing

Install to /usr/local/:

sudo make install

Note: You will not need sudo on macOS if you have brew installed.
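
To confirm that the install put the programs on your PATH, a quick check along these lines may be useful. It is only a sketch: check_tools is a hypothetical helper, and accuracy and wordacc are used here as two representative programs from the suite; extend the list to whichever tools you rely on.

```shell
# Report where each named program lives, or that it is not on PATH.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

check_tools accuracy wordacc
```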

Installing "locally"

This does not copy any files. Instead, it prints the shell commands that add all executables, man pages, and libraries to the correct paths (replace ~/.bashrc with your start-up file):

make exports >> ~/.bashrc
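
Running the command above twice appends the same lines twice. If that is a concern, one option is to guard the append with a marker comment. This is a sketch under stated assumptions: add_once and the marker text are hypothetical and are not something ocreval provides or emits.

```shell
# Append a command's output to a start-up file only once.
# Usage: add_once FILE COMMAND [ARGS...]
add_once() {
    rcfile=$1
    shift
    marker='# ocreval exports'   # hypothetical marker, chosen for this sketch
    if grep -qF "$marker" "$rcfile" 2>/dev/null; then
        echo "already present in $rcfile"
        return 0
    fi
    # Run the generator first so a failure leaves the file untouched.
    exports=$("$@") || return 1
    printf '%s\n%s\n' "$marker" "$exports" >> "$rcfile"
    echo "added to $rcfile"
}

# From the ocreval source directory:
#   add_once ~/.bashrc make exports
```

Open a new shell (or source the start-up file) afterwards so the new paths take effect.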

Porting Credits

Ported by Eddie Antonio Santos, 2015, 2016. See NOTICE for copyright information regarding the original code.

Citation

@inproceedings{santos-2019-ocr,
    title = "{OCR} evaluation tools for the 21st century",
    author = "Santos, Eddie Antonio",
    booktitle = "Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)",
    month = feb,
    year = "2019",
    address = "Honolulu",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W19-6004",
    pages = "23--27",
}

See: https://www.aclweb.org/anthology/W19-6004/

License

ocreval

Copyright 2015–2017 Eddie Antonio Santos

Copyright © 2018–2021 National Research Council Canada

The ISRI Analytic Tools for OCR Evaluation

Copyright 1996 The Board of Regents of the Nevada System of Higher Education, on behalf of the University of Nevada, Las Vegas, Information Science Research Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
