
watsonyanghx / Cnn_lstm_ctc_tensorflow

Licence: MIT
CNN+LSTM+CTC-based OCR implemented using TensorFlow.

Programming Languages

python

Projects that are alternatives of or similar to Cnn lstm ctc tensorflow

Caffe ocr
An experimental project for research on mainstream OCR algorithms; it currently implements the CNN+BLSTM+CTC architecture.
Stars: ✭ 1,156 (+237.03%)
Mutual labels:  lstm, ctc, ocr
Basicocr
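BasicOCR is a project dedicated to research on text recognition algorithms for natural scenes. It was initiated and is maintained by the 佟派 (Tongpai) AI team of 长城数字大数据应用技术研究院 (Great Wall Digital Big Data Application Technology Research Institute).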
Stars: ✭ 336 (-2.04%)
Mutual labels:  cnn, lstm, ocr
Icdar 2019 Sroie
ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
Stars: ✭ 202 (-41.11%)
Mutual labels:  lstm, ctc, ocr
Cnn lstm ctc ocr
Tensorflow-based CNN+LSTM trained with CTC-loss for OCR
Stars: ✭ 464 (+35.28%)
Mutual labels:  lstm, ctc, ocr
Lstm Ctc Ocr
using rnn (lstm or gru) and ctc to convert line image into text, based on torch7 and warp-ctc
Stars: ✭ 70 (-79.59%)
Mutual labels:  lstm, ctc, ocr
Rnn ctc
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
Stars: ✭ 220 (-35.86%)
Mutual labels:  lstm, ctc, ocr
Easyocr
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Stars: ✭ 13,379 (+3800.58%)
Mutual labels:  cnn, lstm, ocr
Crnn Pytorch
Pytorch implementation of CRNN (CNN + RNN + CTCLoss) for all language OCR.
Stars: ✭ 248 (-27.7%)
Mutual labels:  cnn, ocr
Automatic speech recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
Stars: ✭ 2,751 (+702.04%)
Mutual labels:  cnn, lstm
CRNN.tf2
Convolutional Recurrent Neural Network(CRNN) for End-to-End Text Recognition - TensorFlow 2
Stars: ✭ 131 (-61.81%)
Mutual labels:  ocr, ctc
Rus-SpeechRecognition-LSTM-CTC-VoxForge
Russian speech recognition using Tensorflow, trained on the Voxforge dataset
Stars: ✭ 50 (-85.42%)
Mutual labels:  lstm, ctc
Megreader
A research project for text detection and recognition using PyTorch 1.2.
Stars: ✭ 332 (-3.21%)
Mutual labels:  ctc, ocr
Crnn attention ocr chinese
CRNN with attention for OCR, with added Chinese recognition
Stars: ✭ 315 (-8.16%)
Mutual labels:  lstm, ocr
Lightnet
Efficient, transparent deep learning in hundreds of lines of code.
Stars: ✭ 243 (-29.15%)
Mutual labels:  cnn, lstm
Caption generator
A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
Stars: ✭ 243 (-29.15%)
Mutual labels:  cnn, lstm
Tess4Android
A new fork base on tess-two and Tesseract 4.0.0
Stars: ✭ 31 (-90.96%)
Mutual labels:  ocr, lstm
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+835.57%)
Mutual labels:  cnn, lstm
Natural Language Processing With Tensorflow
Natural Language Processing with TensorFlow, published by Packt
Stars: ✭ 222 (-35.28%)
Mutual labels:  cnn, lstm
Cs291k
🎭 Sentiment Analysis of Twitter data using combined CNN and LSTM Neural Network models
Stars: ✭ 287 (-16.33%)
Mutual labels:  cnn, lstm
Stock-Prediction
stock predict by cnn and lstm
Stars: ✭ 25 (-92.71%)
Mutual labels:  cnn, lstm

CNN_LSTM_CTC_Tensorflow

CNN+LSTM+CTC-based OCR (Optical Character Recognition) implemented using TensorFlow.

Note: there is no restriction on the number of characters in the image (variable length). Have a look at the image below.

I trained a model on 100k images using this code and got 99.75% accuracy on the test dataset (200k images) in the competition. The images in both datasets:

Update 2017.11.6:

The competition page is no longer available. If you want to reproduce this result, please see this issue about the dataset. The label file (a .txt file) is in the same folder as the images after extracting the .tar.gz file.

Update 2018.4.24:

Updated to TensorFlow 1.7 and fixed some bugs reported in issue #8.

Structure

The images are first processed by a CNN to extract features; these features are then fed into an LSTM for character recognition.
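The LSTM produces one prediction per time step, and CTC decoding turns that fixed-length sequence into a variable-length string. A minimal sketch of greedy (best-path) CTC decoding in plain Python follows. The character set, the blank index, and the frame values are made up for illustration and are not taken from this repository.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank_id):
    """Best-path CTC decoding: merge consecutive repeats, then drop blanks."""
    collapsed = [k for k, _ in groupby(frame_ids)]
    return [k for k in collapsed if k != blank_id]


# ASSUMPTION: a hypothetical character set; the blank symbol takes the last index.
charset = "0123456789+-*()"
blank_id = len(charset)

# Made-up per-frame argmax ids for a 12-frame sequence.
frames = [blank_id, 1, 1, blank_id, 10, 10, 10, blank_id, 4, blank_id, blank_id, 4]

decoded = "".join(charset[i] for i in ctc_greedy_decode(frames, blank_id))
print(decoded)  # 1+44
```

The two `4` frames are separated by a blank, so they survive as two characters, whereas the repeated `1` and `+` frames collapse into one character each.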

For simplicity, the CNN architecture is just Convolution + Batch Normalization + Leaky ReLU + Max Pooling, and the LSTM is a 2-layer stacked LSTM. You can also try a bidirectional LSTM.

You can experiment with the network architecture (add dropout to the CNN, change the number of stacked LSTM layers, etc.) and see what happens. Have a look at the CNN part and the LSTM part.
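To get a feel for what the LSTM receives, the arithmetic below estimates the feature-map size after the pooling layers. It is only a sketch under stated assumptions: four conv blocks, each ending in a 2x2 max pool with 'SAME' padding, and the image width used as the time axis. None of these is confirmed by this README. The image size and channel count are the values passed in the training command under "How to run".

```python
import math


def pooled_size(size, num_pools, stride=2):
    """Spatial size after `num_pools` max-pool layers with 'SAME' padding."""
    for _ in range(num_pools):
        size = math.ceil(size / stride)
    return size


# From the training command: --image_height=60 --image_width=180 --out_channels=64
height, width, out_channels = 60, 180, 64

# ASSUMPTION: four conv blocks, each ending in a stride-2 max pool.
num_pools = 4

feat_h = pooled_size(height, num_pools)  # 60 -> 30 -> 15 -> 8 -> 4
feat_w = pooled_size(width, num_pools)   # 180 -> 90 -> 45 -> 23 -> 12

# ASSUMPTION: width becomes the time axis; height and channels are flattened.
time_steps = feat_w
features_per_step = feat_h * out_channels
print(time_steps, features_per_step)  # 12 256
```

CTC cannot emit more characters than there are time steps, so adding pooling layers shortens the longest label the model can produce.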

Prerequisite

  1. Python 3.6.4

  2. TensorFlow 1.2

  3. OpenCV 3 (optional; used to read images).

How to run

There are many other parameters you can play with; have a look at utils.py.

Note that, for clarity, num_classes is not included in the parameters discussed above.
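As a rough illustration of how num_classes relates to the label alphabet, the sketch below derives it from a character set and encodes a label as integer ids. The character set and helper names are hypothetical and are not taken from utils.py. The `+ 1` reflects the convention of TensorFlow 1.x's `tf.nn.ctc_loss`, which reserves the largest class index (`num_classes - 1`) for the CTC blank.

```python
# ASSUMPTION: a hypothetical character set; replace it with your own alphabet.
CHARSET = "0123456789+-*()"

# One class per character, plus one extra class for the CTC blank.
NUM_CLASSES = len(CHARSET) + 1

ENCODE = {ch: i for i, ch in enumerate(CHARSET)}
DECODE = {i: ch for ch, i in ENCODE.items()}


def encode_label(text):
    """Map each character of a label to its integer class id."""
    return [ENCODE[ch] for ch in text]


def decode_label(ids):
    """Map integer class ids back to a string."""
    return "".join(DECODE[i] for i in ids)


ids = encode_label("(1+4)*2")
print(NUM_CLASSES, ids)  # 16 [13, 1, 10, 4, 14, 12, 2]
```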

# cd to your workspace.
# The code will evaluate the accuracy every validation_steps specified in parameters.

ls -R
  .:
  imgs  utils.py  helper.py  main.py  cnn_lstm_otc_ocr.py

  ./imgs:
  train  infer  val  labels.txt
  
  ./imgs/train:
  1.png  2.png  ...  50000.png
  
  ./imgs/val:
  1.png  2.png  ...  50000.png

  ./imgs/infer:
  1.png  2.png  ...  300000.png
   
  
# Train the model.
CUDA_VISIBLE_DEVICES=0 python ./main.py --train_dir=./imgs/train/ \
  --val_dir=./imgs/val/ \
  --image_height=60 \
  --image_width=180 \
  --image_channel=1 \
  --out_channels=64 \
  --num_hidden=128 \
  --batch_size=128 \
  --log_dir=./log/train \
  --num_gpus=1 \
  --mode=train

# Inference
CUDA_VISIBLE_DEVICES=0 python ./main.py --infer_dir=./imgs/infer/ \
  --checkpoint_dir=./checkpoint/ \
  --num_gpus=0 \
  --mode=infer

Run with your own data.

  1. Prepare your data. Make sure that all images are named in the format id_label.jpg, e.g. 004_(1+4)*2.jpg.

# Make sure the data path is correct; have a look at helper.py.
python helper.py

  2. Run the model as described in "How to run" above.
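The id_label.jpg naming convention can be parsed as sketched below. `parse_image_name` is a hypothetical helper written for illustration, not the code in helper.py. It splits on the first underscore only, so labels that themselves contain underscores stay intact.

```python
import os


def parse_image_name(path):
    """Split a file named 'id_label.ext' into (id, label)."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    image_id, sep, label = stem.partition("_")
    if not sep or not label:
        raise ValueError("expected a name like id_label.jpg, got: " + path)
    return image_id, label


print(parse_image_name("imgs/train/004_(1+4)*2.jpg"))  # ('004', '(1+4)*2')
```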