
andbue / nashi

Licence: GPL-3.0 license
Some bits of javascript to transcribe scanned pages using PageXML

Programming Languages

HTML, Python, JavaScript, CSS

nashi (nasḫī)

Some bits of JavaScript to transcribe scanned pages using PageXML. Both ltr and rtl languages are supported. Try it! But wait, there's more: download now and get a complete webapp written in Python/Flask that handles import and export of your scanned pages to and from LAREX for semi-automatic layout analysis, does the line segmentation for you (via kraken), and saves your precious PageXML in a database. All you've got to do is follow the instructions below and help me implement all the missing features... OCR training and recognition are currently not included because of our webhost's limited capacity.

Instructions for nashi.html

  • Put nashi.html in the same folder as (or in a folder above) your PageXML files (containing line segmentation data) and the page images. Serve the folder with a webserver of your choice or simply use the file:// protocol (only supported in Firefox at the moment).
  • In the browser, open the interface as .../path/to/nashi.html?pagexml=Test.xml&direction=rtl where Test.xml (or subfolder/Test.xml) is one of the PageXML files and rtl (or ltr) indicates the main direction of your text.
  • Install the "Andron Scriptor Web" font to use the additional range of characters.
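
The address from the second step can also be assembled programmatically. The following is a minimal sketch, assuming the folder is served at a hypothetical http://localhost:8000 (for instance with Python's built-in `python -m http.server`). Only the `pagexml` and `direction` parameters come from the instructions above; the helper name `nashi_url` is made up for illustration.

```python
from urllib.parse import urlencode


def nashi_url(base, pagexml, direction="ltr"):
    """Build the address of nashi.html for one PageXML file.

    `base` is wherever nashi.html is served from (an assumption here),
    `pagexml` is the path relative to nashi.html, `direction` is ltr or rtl.
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    # safe="/" keeps subfolder separators readable in the query string
    query = urlencode({"pagexml": pagexml, "direction": direction}, safe="/")
    return f"{base.rstrip('/')}/nashi.html?{query}"


print(nashi_url("http://localhost:8000", "subfolder/Test.xml", "rtl"))
```

File names containing spaces or other special characters are percent-encoded by `urlencode`, which is the main reason to build the query string this way instead of concatenating it by hand.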

The interface

  • Lines without any text are marked red, lines containing OCR data blue, and lines already transcribed green.
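
The colour coding amounts to a three-way classification per line. The sketch below is not nashi's actual code: it assumes the caller already knows whether a line carries OCR output and whether it has been transcribed, and it assumes that a transcription takes precedence over OCR data. The function name `line_colour` and its two arguments are hypothetical.

```python
def line_colour(ocr_text, transcription):
    """Return the colour a line would get under the rule described above.

    Both arguments are plain strings or None (hypothetical inputs).
    """
    if transcription:   # assumption: a transcription outranks OCR data
        return "green"
    if ocr_text:        # only OCR data is available
        return "blue"
    return "red"        # no text at all


for ocr, manual in [(None, None), ("Lorem", None), ("Lorem", "Lorem ipsum")]:
    print(line_colour(ocr, manual))
```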

Keyboard shortcuts in the text input area

  • Tab/Shift+Tab switches to the next/previous input.
  • Shift+Enter saves the edits for the current line.
  • Shift+Insert shows an additional range of characters to select as an alternative to the character next to the cursor. Input one of them using the corresponding number while holding Insert.
  • Shift+ArrowDown opens a new comment field (Shift+ArrowUp switches back to the transcription line).

Global keyboard shortcuts

  • Ctrl+Space zooms in to line width.
  • Ctrl+Shift+Space toggles zoom mode (always zoom in to line width).
  • Shift+PageUp/PageDown loads the next/previous page, provided the filenames of your PageXML files contain the page number.
  • Ctrl+Shift+ArrowLeft/ArrowRight changes orientation and input direction to ltr/rtl.
  • Ctrl+S downloads the PageXML file.
  • Ctrl+E enters or exits polygon edit mode.
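
Page navigation with Shift+PageUp/PageDown depends on a number in the file name. How nashi derives the neighbouring file is not spelled out here, so the following is only a sketch of one plausible rule: shift the last run of digits in the name and keep its zero padding. The function name `neighbour_page` is hypothetical.

```python
import re


def neighbour_page(filename, step=1):
    """Return `filename` with its last run of digits shifted by `step`.

    Zero padding is preserved ("0009.xml" -> "0010.xml"). Returns None if
    the name contains no number or the result would be negative.
    """
    matches = list(re.finditer(r"\d+", filename))
    if not matches:
        return None
    last = matches[-1]
    number = int(last.group()) + step
    if number < 0:
        return None
    padded = str(number).zfill(len(last.group()))
    return filename[:last.start()] + padded + filename[last.end():]


print(neighbour_page("0009.xml"))         # next page
print(neighbour_page("page_12.xml", -1))  # previous page
```

A file name without any digits, such as Test.xml, yields None, which matches the condition stated in the shortcut list.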

Edit mode

  • Click on a line area to activate its point handles. Points can be moved by dragging them; new points can be created by drawing the borders between existing points.
  • If points or lines are active, they can be deleted using the "Delete" key.
  • Hold the Shift key and draw to select multiple points.
  • New text lines can be created by clicking inside an existing text region and drawing a rectangle. New lines are always added at the end of the region.
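
In PageXML terms, drawing a new line adds a TextLine element with a Coords polygon to the enclosing TextRegion. The sketch below shows that operation with Python's standard library; it is an illustration, not the JavaScript nashi actually runs. The sample document, the ids and the helper name `append_line` are made up, and the 2013-07-15 namespace is just one of several published PAGE schema versions.

```python
import xml.etree.ElementTree as ET

PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"

SAMPLE = f"""<PcGts xmlns="{PAGE_NS}">
  <Page imageFilename="0001.png" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="50,50 950,50 950,400 50,400"/>
      <TextLine id="r1_l1">
        <Coords points="60,60 940,60 940,120 60,120"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def append_line(region, line_id, x0, y0, x1, y1):
    """Add a rectangular TextLine after the last existing line of `region`."""
    ns = region.tag[1:region.tag.index("}")]  # reuse the region's namespace
    line = ET.Element(f"{{{ns}}}TextLine", {"id": line_id})
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    points = " ".join(f"{x},{y}" for x, y in corners)
    ET.SubElement(line, f"{{{ns}}}Coords", {"points": points})
    # Insert right after the last TextLine; with no lines yet, simply append.
    children = list(region)
    line_idx = [i for i, child in enumerate(children) if child.tag == line.tag]
    position = line_idx[-1] + 1 if line_idx else len(children)
    region.insert(position, line)
    return line


root = ET.fromstring(SAMPLE)
region = root.find(f".//{{{PAGE_NS}}}TextRegion")
append_line(region, "r1_l2", 60, 130, 940, 190)
print([line.get("id") for line in region.iter(f"{{{PAGE_NS}}}TextLine")])
```

Because the new element always goes after the last existing TextLine, the document order of the lines mirrors the order in which they were drawn, which is why a separate reordering step would be needed to change it.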

Instructions for the server

  • Install Redis. The app uses Celery as a task queue for line segmentation jobs (and probably OCR jobs in the future).
  • Install LAREX for semi-automatic layout analysis.
  • Install the server from this repository or from PyPI:
pip install nashi
  • Create a config.py file. For more options see the file default_settings.py. If you want the app to send emails to users, change the mail settings there. Here is just a minimal example:
BOOKS_DIR = "/home/username/books/"
LAREX_DIR = "/home/username/larex_books/"
  • Set an environment variable containing your database url. If you don't, nashi will create a sqlite database called "test.db" in your working directory.
export DATABASE_URL="mysql+pymysql://user:pw@localhost/mydb?charset=utf8"
  • Create the database tables (and users, if needed) from a python prompt. Login is disabled in the default config file.
from nashi import user_datastore
from nashi.database import db_session, init_db
init_db()
user_datastore.create_user(email="[email protected]", password="secret")
db_session.commit()
  • Run the celery worker:
export NASHI_SETTINGS=/home/user/path/to/config.py
celery -A nashi.celery worker --loglevel=info
  • Run the app. Don't forget to export your DATABASE_URL again if you're using a new terminal:
export FLASK_APP=nashi
export NASHI_SETTINGS=/home/user/path/to/config.py
flask run
  • Open localhost:5000, log in, update your books list via "Edit, Refresh".
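
The database fallback described in the steps above can be pictured as follows. This is a sketch of the behaviour, not nashi's source: it assumes SQLAlchemy-style URLs (as the `mysql+pymysql://` example suggests), in which `sqlite:///test.db` names a file test.db relative to the working directory. The function name `database_url` is hypothetical.

```python
import os


def database_url(environ=None):
    """Return the configured database URL or the SQLite fallback."""
    environ = os.environ if environ is None else environ
    # An unset or empty DATABASE_URL falls back to a local SQLite file.
    return environ.get("DATABASE_URL") or "sqlite:///test.db"


print(database_url({}))
print(database_url({"DATABASE_URL": "mysql+pymysql://user:pw@localhost/mydb?charset=utf8"}))
```

Since the variable is read from the environment of each process, the Celery worker and the Flask app only share a database if DATABASE_URL is exported in both terminals.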

Planned features

  • Sorting of lines
  • Reading order
  • Creation and correction of regions
  • API for external OCR service
  • Advanced text editing capabilities
  • Help, examples, and documentation
  • Artificial general intelligence that writes the code for me