
filak / hOCR-to-ALTO

Licence: MIT license
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets

Programming Languages

XSLT

Projects that are alternatives to, or similar to, hOCR-to-ALTO

ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Stars: ✭ 142 (+255%)
Mutual labels:  hocr, alto
mirador-textoverlay
Text Overlay plugin for Mirador 3
Stars: ✭ 35 (-12.5%)
Mutual labels:  hocr, alto
xslweb
Web application framework for XSLT and XQuery developers
Stars: ✭ 39 (-2.5%)
Mutual labels:  xsl
rexsl
Java RESTful XSL-based Web Framework
Stars: ✭ 16 (-60%)
Mutual labels:  xsl
xrechnung-visualization
XSL transformators for web and pdf rendering of German CIUS XRechnung or EN16931-1:2017 [MIRROR OF GitLab]
Stars: ✭ 26 (-35%)
Mutual labels:  xsl
europeananp-ner
Named Entities Recognition Annotator Tool for Europeana Newspapers
Stars: ✭ 58 (+45%)
Mutual labels:  alto
BnLMetsExporter
Command Line Interface (CLI) to export METS/ALTO documents to other formats.
Stars: ✭ 11 (-72.5%)
Mutual labels:  alto
kitodo-presentation
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Library Suite.
Stars: ✭ 33 (-17.5%)
Mutual labels:  alto
dinglehopper
An OCR evaluation tool
Stars: ✭ 38 (-5%)
Mutual labels:  alto

hOCR-to-ALTO

Convert between Tesseract hOCR and ALTO XML 2.0/2.1/3/4 using XSL stylesheets

The stylesheets use XSLT 2.0 features, so they require an XSLT 2.0 capable processor such as Saxon.
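
As a rough sketch of how such a transformer might be invoked, the Python helper below assembles a Saxon command line and optionally runs it. The jar path `saxon-he.jar` and all file names are placeholders, not files shipped with this project, and the helper names `saxon_command` and `run_transform` are hypothetical. The `-s:`, `-xsl:` and `-o:` options are Saxon's usual command-line options for source, stylesheet and output.

```python
import subprocess
from typing import List


def saxon_command(source: str, stylesheet: str, output: str,
                  saxon_jar: str = "saxon-he.jar") -> List[str]:
    """Build the argument list for one Saxon transformation.

    All paths are assumptions supplied by the caller; nothing here
    checks that the jar or the stylesheet actually exists.
    """
    return [
        "java", "-jar", saxon_jar,
        f"-s:{source}",        # input document (hOCR or ALTO)
        f"-xsl:{stylesheet}",  # one of the project's .xsl files
        f"-o:{output}",        # where the converted document is written
    ]


def run_transform(source: str, stylesheet: str, output: str,
                  saxon_jar: str = "saxon-he.jar") -> None:
    """Run the transformation; raises CalledProcessError if Saxon fails."""
    subprocess.run(
        saxon_command(source, stylesheet, output, saxon_jar),
        check=True,
    )
```

Building the argument list separately from running it makes the command easy to inspect or log before Java is started.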

See ocr-fileformat for an interface to these stylesheets.

hOCR specification: http://kba.cloud/hocr-spec/1.2/

File naming scheme: sourceFormatVersion__targetFormatVersion.xsl
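
The naming scheme can be handled programmatically, for example to pick a stylesheet for a given conversion. The following is a minimal sketch, assuming the source and target parts are separated by exactly one double underscore. The helper names and the example file name `hocr__alto2.1.xsl` are illustrative and should be checked against the actual files in the repository.

```python
from typing import NamedTuple


class StylesheetName(NamedTuple):
    source: str  # source format plus optional version, e.g. "hocr"
    target: str  # target format plus optional version, e.g. "alto2.1"


def build_stylesheet_name(source: str, target: str) -> str:
    """Compose a file name following source__target.xsl."""
    return f"{source}__{target}.xsl"


def parse_stylesheet_name(filename: str) -> StylesheetName:
    """Split a stylesheet file name into its source and target parts."""
    suffix = ".xsl"
    if not filename.endswith(suffix):
        raise ValueError(f"not an .xsl file: {filename!r}")
    stem = filename[:-len(suffix)]
    # partition splits at the first "__" only
    source, sep, target = stem.partition("__")
    if not sep or not source or not target:
        raise ValueError(f"name does not follow source__target.xsl: {filename!r}")
    return StylesheetName(source, target)
```

For instance, `parse_stylesheet_name("hocr__alto2.1.xsl")` yields a source of `hocr` and a target of `alto2.1`.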

