All Projects → tonghe90 → Textspotter

tonghe90 / Textspotter

Projects that are alternatives of or similar to Textspotter

Sphereface
Implementation for <SphereFace: Deep Hypersphere Embedding for Face Recognition> in CVPR'17.
Stars: ✭ 1,483 (+359.13%)
Mutual labels:  jupyter-notebook, caffe
Cvpr18 Inaturalist Transfer
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018
Stars: ✭ 164 (-49.23%)
Mutual labels:  jupyter-notebook, cvpr2018
Seqface
SeqFace : Making full use of sequence information for face recognition
Stars: ✭ 125 (-61.3%)
Mutual labels:  jupyter-notebook, caffe
Endtoend Predictive Modeling Using Python
Stars: ✭ 56 (-82.66%)
Mutual labels:  jupyter-notebook, end-to-end
Raspberrypi Facedetection Mtcnn Caffe With Motion
MTCNN with Motion Detection, on Raspberry Pi with Love
Stars: ✭ 204 (-36.84%)
Mutual labels:  jupyter-notebook, caffe
Keras Oneclassanomalydetection
[5 FPS - 150 FPS] Learning Deep Features for One-Class Classification (AnomalyDetection). Corresponds RaspberryPi3. Convert to Tensorflow, ONNX, Caffe, PyTorch. Implementation by Python + OpenVINO/Tensorflow Lite.
Stars: ✭ 102 (-68.42%)
Mutual labels:  jupyter-notebook, caffe
Py Rfcn Priv
code for py-R-FCN-multiGPU maintained by bupt-priv
Stars: ✭ 153 (-52.63%)
Mutual labels:  jupyter-notebook, caffe
Teacher Student Training
This repository stores the files used for my summer internship's work on "teacher-student learning", an experimental method for training deep neural networks using a trained teacher model.
Stars: ✭ 34 (-89.47%)
Mutual labels:  jupyter-notebook, caffe
Auto Reid And Others
Auto-ReID and Other Person Re-Identification Projects
Stars: ✭ 198 (-38.7%)
Mutual labels:  jupyter-notebook, caffe
Up Down Captioner
Automatic image captioning model based on Caffe, using features from bottom-up attention.
Stars: ✭ 195 (-39.63%)
Mutual labels:  jupyter-notebook, caffe
Training toolbox caffe
Training Toolbox for Caffe
Stars: ✭ 51 (-84.21%)
Mutual labels:  jupyter-notebook, caffe
Image manipulation detection
Paper: CVPR2018, Learning Rich Features for Image Manipulation Detection
Stars: ✭ 210 (-34.98%)
Mutual labels:  jupyter-notebook, cvpr2018
Bottom Up Attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Stars: ✭ 989 (+206.19%)
Mutual labels:  jupyter-notebook, caffe
Facemaskdetection
开源人脸口罩检测模型和数据 Detect faces and determine whether people are wearing mask.
Stars: ✭ 1,677 (+419.2%)
Mutual labels:  jupyter-notebook, caffe
Picanet
Stars: ✭ 35 (-89.16%)
Mutual labels:  jupyter-notebook, caffe
Sphereface Plus
SphereFace+ Implementation for <Learning towards Minimum Hyperspherical Energy> in NIPS'18.
Stars: ✭ 151 (-53.25%)
Mutual labels:  jupyter-notebook, caffe
Fundamentals Of Deep Learning For Computer Vision Nvidia
The repository includes Notebook files and documents of the course I completed in NVIDIA Deep Learning Institute. Feel free to acess and work with the Notebooks and other files.
Stars: ✭ 16 (-95.05%)
Mutual labels:  jupyter-notebook, caffe
All Classifiers 2019
A collection of computer vision projects for Acute Lymphoblastic Leukemia classification/early detection.
Stars: ✭ 22 (-93.19%)
Mutual labels:  jupyter-notebook, caffe
Deformable Convnets Caffe
Deformable Convolutional Networks on caffe
Stars: ✭ 166 (-48.61%)
Mutual labels:  jupyter-notebook, caffe
Orn
Oriented Response Networks, in CVPR 2017
Stars: ✭ 207 (-35.91%)
Mutual labels:  jupyter-notebook, caffe

An End-to-End TextSpotter with Explicit Alignment and Attention

This is initially described in our CVPR 2018 paper.

Getting Started

Installation

  • Clone the code
git clone https://github.com/tonghe90/textspotter
cd textspotter
  • Install caffe. You can follow this this tutorial. If you have build problem about std::allocater, please refer to this #3
# make sure you set WITH_PYTHON_LAYER := 1
# change Makefile.config according to your library path
cp Makefile.config.example Makefile.config
make clean
make -j8
make pycaffe

Training

we provide part of the training code. But you can not run this directly. 
We have give the comment in the [train.pt](https://github.com/tonghe90/textspotter/models/train.pt).
You have to write your own layer, IOUloss layer. We cannot publish this for some IP reason. 
To be noticed: 
[L6902](https://github.com/tonghe90/textspotter/models/train.pt#L6902) 
[L6947](https://github.com/tonghe90/textspotter/models/train.pt#L6907)

Testing

  • install editdistance and pyclipper: pip install editdistance and pip install pyclipper

  • After Caffe is set up, you need to download a trained model (about 40M) from Google Drive. This model is trained with VGG800k and finetuned on ICDAR2015.

  • Run python test.py --img=./imgs/img_105.jpg

  • hyperparameters:

cfg.py --mean_val ==> mean value during the testing.
       --max_len ==> maximum length of the text string (here we take 25, meaning a word can contain 25 characters at most.)
       --recog_th ==> the threshold during the recognition process. The score for a word is the average mean of every character.
       --word_score ==> the threshold for those words that contain number or symbols for they are not contained in the dictionary.

test.py --weight ==> weights file of caffemodel
        --prototxt-iou ==> the prototxt file for detection.
        --prototxt-lstm ==> the prototxt file for recognition.
        --img ==> the folder or img file for testing. The format can be added in ./pylayer/tool is_image function.
        --scales-ms ==> multiscales input for input during the testing process.
        --thresholds-ms ==> corresponding thresholds of text region for multiscale inputs.
        --nms ==> nms threshold for testing
        --save-dir ==> the dir for save results in format of ICDAR2015 submition.
One thing should be noted: the recognition results are achieved by comparing direct output with words in dictionary, which has about 90K lexicons. 
These lexicons don't contain any number and symbol. You can delete dictionary reference part and directly output recognition results.

Citation

If you use this code for your research, please cite our papers.

@inproceedings{tong2018,
  title={An End-to-End TextSpotter with Explicit Alignment and Attention},
  author={T. He and Z. Tian and W. Huang and C. Shen and Y. Qiao and C. Sun},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},
  year={2018}
}

License

This code is for NON-COMMERCIAL purposes only. For commerical purposes, please contact Chunhua Shen [email protected]. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3. Please refer to http://www.gnu.org/licenses/ for more details.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].