georgeretsi / HTR-ctc

Licence: MIT License

PyTorch implementation of HTR on the IAM dataset (word or line level, with CTC loss)

Programming Languages

python

Projects that are alternatives of or similar to HTR-ctc

OCR

Optical character recognition Using Deep Learning

Stars: ✭ 25 (+66.67%)

Mutual labels: lstm, ctc-loss

CRNN-OCR-lite

Lightweight CRNN for OCR (including handwritten text) with depthwise separable convolutions and spatial transformer module [keras+tf]

Stars: ✭ 130 (+766.67%)

Mutual labels: handwritten-text-recognition, ctc-loss

question-pair

A siamese LSTM to detect sentence/question pairs.

Stars: ✭ 25 (+66.67%)

Mutual labels: lstm

Stock-Prediction

LSTM RNN for sentiment-based stock prediction

Stars: ✭ 50 (+233.33%)

Mutual labels: lstm

lstm-crf-tagging

No description or website provided.

Stars: ✭ 13 (-13.33%)

Mutual labels: lstm

Machine-Learning

The projects I do in Machine Learning with PyTorch, keras, Tensorflow, scikit learn and Python.

Stars: ✭ 54 (+260%)

Mutual labels: lstm

Deep-Learning-for-Expression-Recognition-in-Image-Sequences

The project uses state of the art deep learning on collected data for automatic analysis of emotions.

Stars: ✭ 26 (+73.33%)

Mutual labels: lstm

Manhattan-LSTM

Keras and PyTorch implementations of the MaLSTM model for computing Semantic Similarity.

Stars: ✭ 28 (+86.67%)

Mutual labels: lstm

battery-rul-estimation

Remaining Useful Life (RUL) estimation of Lithium-ion batteries using deep LSTMs

Stars: ✭ 25 (+66.67%)

Mutual labels: lstm

rnn2d

CPU and GPU implementations of some 2D RNN layers

Stars: ✭ 26 (+73.33%)

Mutual labels: lstm

CS231n

My solutions for Assignments of CS231n: Convolutional Neural Networks for Visual Recognition

Stars: ✭ 30 (+100%)

Mutual labels: lstm

medical-diagnosis-cnn-rnn-rcnn

Disease diagnosis from patient descriptions, implemented separately with RNN, CNN, and RCNN models

Stars: ✭ 39 (+160%)

Mutual labels: lstm

deep-improvisation

Easy-to-use Deep LSTM Neural Network to generate song sounds like containing improvisation.

Stars: ✭ 53 (+253.33%)

Mutual labels: lstm

Sequence-Models-coursera

Sequence Models by Andrew Ng on Coursera. Programming Assignments and Quiz Solutions.

Stars: ✭ 53 (+253.33%)

Mutual labels: lstm

dts

A Keras library for multi-step time-series forecasting.

Stars: ✭ 130 (+766.67%)

Mutual labels: lstm

Persian-Sentiment-Analyzer

Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )

Stars: ✭ 30 (+100%)

Mutual labels: lstm

MogrifierLSTM

A quick walk-through of the innards of LSTMs and a naive implementation of the Mogrifier LSTM paper in PyTorch

Stars: ✭ 58 (+286.67%)

Mutual labels: lstm

dhs summit 2019 image captioning

Image captioning using attention models

Stars: ✭ 34 (+126.67%)

Mutual labels: lstm

Gradient-Samples

Samples for TensorFlow binding for .NET by Lost Tech

Stars: ✭ 53 (+253.33%)

Mutual labels: lstm

autonomio

Core functionality for the Autonomio augmented intelligence workbench.

Stars: ✭ 27 (+80%)

Mutual labels: lstm

HTR-ctc

PyTorch implementation of Handwritten Text Recognition (HTR) using CTC loss on the IAM dataset.

Selected Features:

The dataset is saved to a '.pt' file after the initial preprocessing, so that subsequent runs load faster.
The loader can handle both word-level and line-level segmentation (change the loader parameters in train_htr.py).
E.g. IAMLoader('train', level='line', fixed_size=(128, None)) or IAMLoader('train', level='word', fixed_size=(128, None))
Image resizing is configured through the loader's fixed_size argument, given as (height, width). If the width is None, the image is resized to the specified height (e.g. 128) while keeping its aspect ratio. This produces images of different widths, which cannot be stacked into a fixed-size batch. To work around this, the network is updated once every K single-image iterations (e.g. set batch_size = 1 and iter_size = 16 in train_code/config.py). If a fixed size is given for both dimensions, e.g. IAMLoader('train', level='line', fixed_size=(128, 1024)), a regular batch size can be used (e.g. batch_size = 16 and iter_size = 1).
The model architecture can be modified by changing the cnn_cfg and rnn_cfg variables in train_code/config.py. The CNN consists of multiple stacks of ResBlocks, and the default setting cnn_cfg = [(2, 32), 'M', (4, 64), 'M', (6, 128), 'M', (2, 256)] is interpreted as follows: the first stack consists of 2 ResBlocks with 32 output channels, the second of 4 ResBlocks with 64 output channels, and so on. 'M' denotes a max-pooling operation with kernel size and stride both equal to 2. The CNN backbone is topped by an RNN head, which produces the final character predictions. The recurrent network is a bidirectional LSTM whose basic configuration is given by the variable rnn_cfg. The default setting rnn_cfg = (256, 1) corresponds to a single-layer LSTM with a hidden size of 256.
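
To make the reading of cnn_cfg concrete, here is a minimal sketch in plain Python that expands the default configuration into a flat list of steps. The helper expand_cnn_cfg is hypothetical and is not taken from the repository; it only counts the 'M' pooling steps, so any additional striding in the actual network would change the downsampling factor.

```python
# Illustrative only: expand_cnn_cfg is a hypothetical helper, not repository code.
cnn_cfg = [(2, 32), 'M', (4, 64), 'M', (6, 128), 'M', (2, 256)]


def expand_cnn_cfg(cfg):
    """Expand a cnn_cfg list into one entry per ResBlock or pooling step.

    Returns the flat list of steps and the total spatial downsampling
    factor contributed by the 'M' (2x2 max-pooling, stride 2) entries.
    """
    layers = []
    downsample = 1
    for item in cfg:
        if item == 'M':
            layers.append('maxpool(kernel=2, stride=2)')
            downsample *= 2
        else:
            num_blocks, channels = item
            layers.extend(
                f'resblock(out_channels={channels})' for _ in range(num_blocks)
            )
    return layers, downsample


layers, downsample = expand_cnn_cfg(cnn_cfg)
num_resblocks = sum(1 for name in layers if name.startswith('resblock'))

for name in layers:
    print(name)
print('ResBlocks:', num_resblocks, '| pooling downsample factor:', downsample)
```

With the default configuration this lists 14 ResBlocks and three pooling steps. Under the assumption that pooling is the only source of downsampling, a 128 x 1024 line image would reach the RNN head as a 16 x 128 feature map.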

Example:
python train_htr.py -lr 1e-3 -gpu 0
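
When images keep their aspect ratio (batch_size = 1, iter_size = 16), the update scheme described under Selected Features amounts to gradient accumulation. The sketch below shows that pattern under stated assumptions: the tiny model, the MSE loss, and the random variable-width images are stand-ins and not the repository's network, CTC loss, or loader. Dividing the loss by iter_size is one common choice to average the accumulated gradients; whether the repository does the same is not confirmed here.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in model that accepts images of any width (illustration only).
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

iter_size = 16  # number of single-image steps per weight update

# 32 single-image "batches" with different widths, so they cannot be stacked.
samples = [
    (torch.randn(1, 1, 16, 20 + 3 * i), torch.randn(1, 2)) for i in range(32)
]

num_updates = 0
optimizer.zero_grad()
for step, (image, target) in enumerate(samples, start=1):
    # Scale the loss so the accumulated gradient is an average over iter_size images.
    loss = criterion(model(image), target) / iter_size
    loss.backward()  # gradients are summed into .grad across iterations
    if step % iter_size == 0:
        optimizer.step()       # one weight update per iter_size images
        optimizer.zero_grad()  # reset the accumulated gradients
        num_updates += 1

print('weight updates:', num_updates)
```

Here 32 single-image iterations produce two weight updates, which mimics an effective batch of 16 without requiring images of equal width.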

Note: The local paths to the IAM dataset (https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) are hardcoded in iam_data_loader/iam_config.py

Developed with PyTorch 0.4.1 and the warpctc_pytorch library (https://github.com/SeanNaren/warp-ctc)
A newer version using the built-in CTC loss of PyTorch (>1.0) is planned
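
For orientation, this is roughly how PyTorch's built-in torch.nn.CTCLoss is called. It is a standalone sketch with random tensors, not the planned version of this project: the sizes, the choice of index 0 as the blank label, and the zero_infinity setting are all assumptions made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumed sizes: T time steps, N sequences in the batch, C classes incl. blank.
T, N, C = 50, 2, 20

# Stand-in for the RNN head output, shaped (T, N, C).
logits = torch.randn(T, N, C, requires_grad=True)
log_probs = logits.log_softmax(dim=2)  # nn.CTCLoss expects log-probabilities

# Padded targets; labels are drawn from 1..C-1 because 0 is used as blank here.
targets = torch.randint(1, C, (N, 10), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([10, 7], dtype=torch.long)  # second target is shorter

ctc = nn.CTCLoss(blank=0, reduction='mean', zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()

print('CTC loss:', loss.item())
```

The loss takes the per-time-step log-probabilities together with the true lengths of each input and target sequence, so padded targets of different lengths can share one tensor.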
