rakeshvar / chamanti_ocr_theano

License: Apache-2.0
Telugu OCR framework using RNN, CTC in Theano & Python3.

Programming Languages

Jupyter Notebook
Python
Shell

Chamanti OCR in Theano చామంతి

Discontinued

Theano has been discontinued and TensorFlow has taken over, so I am moving this project to TensorFlow and this work is discontinued. Code duplicated from the rnn_ctc library for Recurrent Neural Networks with Connectionist Temporal Classification has been deleted. The code for 'scribe'ing Indian-language text has also been moved to a new package, IndicScribe.

TensorFlow Package

The Chamanti OCR based on TensorFlow and IndicScribe will be up in my repositories.

Mission

This project aims to build a very ambitious OCR framework that should work on any language. It will not rely on segmentation algorithms (at the glyph level), making it ideal for highly agglutinative scripts like Arabic, Devanagari, etc. We will be starting with Telugu, however. The core technology behind this is going to be Recurrent Neural Networks using CTC from the repo rnn_ctc.
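To illustrate why CTC removes the need for glyph-level segmentation, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not taken from rnn_ctc or from this repository; the function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy scores are all assumptions made for the example. The network emits one score vector per image column (time step), and the decoder merges repeated labels and drops blanks, so no step ever has to decide where one glyph ends and the next begins.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Best-path CTC decoding (illustrative sketch, not the rnn_ctc API).

    frame_scores: list of per-frame score lists, one entry per class.
    blank: index of the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    previous = blank
    for scores in frame_scores:
        # Pick the highest-scoring class for this frame.
        label = max(range(len(scores)), key=scores.__getitem__)
        # CTC collapse rule: skip repeats of the previous frame's label
        # and skip blanks; a blank between two equal labels keeps both.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy scores for 7 frames over 3 classes (0 = blank, 1 and 2 = hypothetical glyphs).
# The per-frame best path is 1, 1, blank, 1, 2, 2, blank.
toy_scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
]

labels = ctc_greedy_decode(toy_scores)
print(labels)  # [1, 1, 2]
```

The blank between the second and fourth frames is what lets the same glyph appear twice in a row in the output. A full system would typically use a beam search rather than this greedy pass, and would map the integer labels back to Telugu glyphs.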

Python Dependencies

  1. numpy
  2. scipy
  3. theano

My other packages

  1. rnn_ctc
  2. IndicScribe

You can read the developer documentation for more details about the code and configurations.
