interviewBubble / Tabulo

Licence: bsd-3-clause
Table Detection and Extraction Using Deep Learning. It is built in Python, using Luminoth, TensorFlow (<2.0) and Sonnet.

Tabulo


Tabulo is an open source toolkit for computer vision. Currently, we support table detection, but we are aiming for much more. It is built in Python, using Luminoth, TensorFlow and Sonnet.

Table of Contents

  1. Installation Instructions
  2. Available APIs
  3. Working with pretrained Models
  4. Running Tabulo
  5. Running Tabulo As Service
  6. Supported models
  7. Usage
  8. Working with datasets
  9. Training
  10. LICENSE

1. Installation Instructions

Tabulo currently supports Python 2.7 and 3.4–3.6.

1.1 Pre-requisites

To use Tabulo, TensorFlow must be installed beforehand. For GPU support, install the GPU build with pip install tensorflow-gpu; otherwise, install the CPU build with pip install tensorflow. Tabulo targets TensorFlow versions below 2.0, so pin the version if pip would otherwise install a newer release.

Tabulo uses Tesseract to extract data from tables, so Tesseract must be installed as well. Follow this link to install Tesseract.
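The two prerequisites above can be verified before installing Tabulo. The following is a minimal sketch for Python 3, not part of Tabulo itself: the helper names are hypothetical, and the "below 2.0" bound is taken from the project description.

```python
import shutil


def tensorflow_version_supported(version):
    """Return True if a TensorFlow version string is below 2.0."""
    major = int(version.split(".")[0])
    return major < 2


def tesseract_path():
    """Return the path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def report(tf_version):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not tensorflow_version_supported(tf_version):
        problems.append("TensorFlow %s is not below 2.0" % tf_version)
    if tesseract_path() is None:
        problems.append("tesseract executable not found on PATH")
    return problems
```

With TensorFlow installed, calling report(tensorflow.__version__) returns an empty list when both prerequisites are met.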

1.2 Installing Tabulo

First, clone the repo on your machine and then install with pip:

git clone https://github.com/interviewBubble/Tabulo.git
cd Tabulo
pip install -e .

1.3 Check that the installation worked

Simply run tabulo --help.

2. Available APIs

  • localhost:5000/api/fasterrcnn/predict/ - Detect tables in an image
  • localhost:5000/api/fasterrcnn/extract/ - Extract table content from the detected tables
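A client might wrap the two endpoints above as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the response shape (an "objects" list whose entries carry "bbox", "label" and "prob" keys) is assumed rather than documented in this README, so adjust the keys to whatever the server actually returns.

```python
import json

BASE_URL = "http://localhost:5000/api/fasterrcnn"


def endpoint(action, base_url=BASE_URL):
    """Return the URL of the 'predict' or 'extract' endpoint."""
    if action not in ("predict", "extract"):
        raise ValueError("unknown action: %r" % action)
    return "%s/%s/" % (base_url, action)


def confident_boxes(response_text, min_prob=0.5):
    """Return the bounding boxes whose probability is at least min_prob.

    ASSUMPTION: the response is JSON shaped like
    {"objects": [{"bbox": [x1, y1, x2, y2], "label": "table", "prob": 0.98}]}.
    """
    payload = json.loads(response_text)
    return [obj["bbox"] for obj in payload.get("objects", [])
            if obj.get("prob", 0.0) >= min_prob]
```

Filtering on the probability is optional; it simply drops low-confidence detections before any further processing.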

3. Working with pretrained Models:

  • Download the pretrained model from Google Drive.
  • Unzip it and copy the downloaded luminoth folder into the luminoth/utils/pretrained_models folder.
  • List all checkpoints with this command: tabulo checkpoint list
  • You will get output like this: Checkpoints
  • Now start the server with this command: tabulo server web --checkpoint 6aac7a1e8a8e
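The last steps can also be scripted. The sketch below (Python 3.5 or later) only wraps the two commands quoted in the list; the function names are hypothetical.

```python
import subprocess


def checkpoint_list_command():
    """Command that lists the available checkpoints."""
    return ["tabulo", "checkpoint", "list"]


def server_command(checkpoint_id):
    """Command that starts the web server with the given checkpoint."""
    if not checkpoint_id:
        raise ValueError("checkpoint_id must not be empty")
    return ["tabulo", "server", "web", "--checkpoint", checkpoint_id]


def run(command):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)
```

For example, run(checkpoint_list_command()) prints the checkpoints, and run(server_command("6aac7a1e8a8e")) starts the server and blocks until it is stopped.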

4. Running Tabulo

4.1 Running Tabulo as Web Server:

Running Tabulo

4.2 Example of Table Detection with Faster R-CNN By Tabulo:

Example of Table Detection with Faster R-CNN By Tabulo

4.3 Example of Table Data Extraction with tesseract By Tabulo:

Example of Table Data Extraction with tesseract By Tabulo
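This README does not show how Tabulo turns Tesseract output into table cells, so the following is only a naive illustration of the idea, not Tabulo's method. It assumes the table has already been cropped to its own image file, calls the tesseract command-line program with interword spaces preserved, and treats runs of two or more whitespace characters as cell separators. All function names are hypothetical.

```python
import re
import subprocess


def tesseract_command(image_path):
    """Command that OCRs an image to standard output, keeping interword spacing."""
    return ["tesseract", image_path, "stdout", "-c", "preserve_interword_spaces=1"]


def rows_from_text(text):
    """Split OCR text into rows of cells.

    Naive heuristic: each non-empty line is a row, and runs of two or more
    whitespace characters separate cells.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def extract_rows(image_path):
    """Run tesseract on a cropped table image and return its rows."""
    result = subprocess.run(tesseract_command(image_path), check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    return rows_from_text(result.stdout)
```

The whitespace heuristic breaks down when cells contain wide gaps or columns sit close together; a more robust approach would use word bounding boxes instead.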

5. Running Tabulo As Service:

5.1 Using the curl command

# With -F, curl sets the multipart Content-Type header and boundary itself.
# The form field name "image" is assumed here.
curl -X POST \
  http://localhost:5000/api/fasterrcnn/predict/ \
  -H 'cache-control: no-cache' \
  -F image=@/path/to/image/page_8-min.jpg
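The same request can be sent from Python without third-party packages. This is a minimal sketch for Python 3 using only the standard library; the function names are hypothetical, and the form field name "image" is an assumption that should be checked against the server.

```python
import os
import uuid
import urllib.request


def build_multipart(field_name, filename, data, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body.

    Returns a (body_bytes, content_type_header) pair.
    """
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{n}"; filename="{f}"\r\n'
        "Content-Type: {t}\r\n\r\n"
    ).format(b=boundary, n=field_name, f=filename, t=content_type)
    tail = "\r\n--{b}--\r\n".format(b=boundary)
    body = head.encode("utf-8") + data + tail.encode("utf-8")
    return body, "multipart/form-data; boundary=" + boundary


def predict(image_path, field_name="image",
            url="http://localhost:5000/api/fasterrcnn/predict/"):
    """POST an image to the predict endpoint and return the raw response text."""
    with open(image_path, "rb") as handle:
        body, header = build_multipart(
            field_name, os.path.basename(image_path), handle.read())
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": header})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling predict("/path/to/image/page_8-min.jpg") requires the Tabulo web server to be running on localhost:5000.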

5.2 With Postman

Header Section:

Table Detection using Postman

Data Section:

Table Detection using Postman

6. Supported models

Currently, we support the following models:

We also provide pre-trained checkpoints for the above models trained on popular datasets such as COCO and Pascal.

7. Usage

There is one main command-line interface, available through the tabulo command. Whenever you are unsure how to do something, just type:

tabulo --help or tabulo <subcommand> --help

and a list of available options with descriptions will show up.

8. Working with datasets

Dataset to train your custom model.

9. Training

See Training your own model to learn how to train locally or in Google Cloud.

10. LICENSE

Released under the BSD 3-Clause License.

