LodestoneHQ / Lodestone

Licence: gpl-3.0
Personal Document Archiving (DMS, EDMS for Personal/Home Office use)


Lodestone - Personal Document Search & Archive

NOTE: Lodestone is a Work-in-Progress and is not production ready.

Lodestone is designed to be the modern and digital equivalent of a home filing cabinet. If you've gone searching for something similar in the past, you might be familiar with terms like Electronic Document Management System (EDMS), Document Management System (DMS) or Personal Archival.

Lodestone is designed around a handful of core features:

  • Full text document search - It doesn't matter what format your document is in, we should be able to parse it (using OCR) and let you search for the text.
  • Rich tagging - Unlike a physical file cabinet where a document can only exist in one place, digital documents support tags, allowing you to create a flexible organizational structure that works for you.
  • Automated - Document collection & OCR processing should be automatic. Just saving a file to your network drive should be enough to start document processing.
  • Non-destructive - When Lodestone processes a document, the original file will be left untouched, exactly where you left it.
  • Web Accessible - Lodestone is designed to run on a trusted home server and be accessible 24x7.
  • Filesystem/Cloud Sync - Optionally synchronize your tagged documents via a cloud storage provider of your choice (Dropbox, GDrive, etc.) or access them via a FUSE filesystem mount.

Screenshot

Dashboard

More screenshots available in the docs/screenshots directory.

Installation

Lodestone is made up of a handful of open-source components, and as such it's easiest to deploy using Docker/Docker Compose:

docker-compose up

# then open the following url in your browser

http://localhost/

Place your documents in the /data/storage/documents directory, and the Filesystem Collector should automatically start processing them.

If you would like some test documents to play with safely, you can take a look at the LodestoneHQ/lodestone-test-docs repository.

Configuration

Lodestone follows a Convention over Configuration design, which means that it works out of the box with sane defaults, but you can customize them to match your needs.

Most of the configuration files are stored in the webapp image (source code here), and requested by various components when they start up.

  • filetypes.json (backend/data/filetypes.json) contains lists of includes and excludes that are used by the processor container to decide which files to process and load into the database.

  • tags.json (backend/data/tags.json) contains a nested structure of labels that can be used to group tags and search for your documents in the Lodestone web UI.

  • mapping.json (backend/data/mappings.json) is used to ensure that the elasticsearch container has a consistent data storage structure.

To override these files, set up a Docker volume binding to the specified file in the /lodestone/data/ directory in the webapp container.
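
As a rough sketch of what such a binding could look like, the snippet below writes a Compose override file that mounts a local filetypes.json over the copy inside the webapp container. The service name `webapp`, the override file name, the `version` value, and the local `./config` path are all assumptions for illustration; check them against the project's own docker-compose.yml before using this.

```shell
# Write a Compose override file next to your main docker-compose.yml.
# Assumptions: the service is named "webapp", and your customized copy
# of the config lives at ./config/filetypes.json on the host.
cat > docker-compose.override.yml <<'EOF'
# Assumption: this version should match the one in the main compose file.
version: "3"
services:
  webapp:
    volumes:
      # host file (assumed path) : file inside the webapp container
      - ./config/filetypes.json:/lodestone/data/filetypes.json
EOF
```

Docker Compose reads a docker-compose.override.yml that sits beside docker-compose.yml by default, so a plain `docker-compose up` should pick up the binding. The same pattern would apply to tags.json or the mappings file.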

Considerations

Lodestone is a very opinionated solution for personal document management. As such, there are a couple of things you should know before even considering it.

  • Currently there's no user management. Lodestone is designed to run at home, on your trusted network. This may be reconsidered at a future date.

  • Limited support for file types

    • doc, docx, xls, xlsx, ppt, pptx - Microsoft Office Documents

    • pages, numbers, key - Apple iWork Documents

    • pdf

    • rtf

    • jpg, jpeg, png, tiff, tif

      If you think there are additional document types that may be useful to support, please open an issue.
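
If you want to check ahead of time which files in a folder fall inside this list, a small shell helper along these lines may be useful. This is only an illustrative sketch: `is_supported` is a hypothetical function, not part of Lodestone, and Lodestone itself decides what to process from filetypes.json rather than from a script like this.

```shell
# Hypothetical helper: succeeds (exit 0) if the file name has one of the
# extensions listed above, and fails (exit 1) otherwise.
is_supported() {
  # A name without a dot has no extension at all.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it, so that
  # "SCAN.PDF" is treated the same as "scan.pdf".
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    doc|docx|xls|xlsx|ppt|pptx) return 0 ;;  # Microsoft Office
    pages|numbers|key)          return 0 ;;  # Apple iWork
    pdf|rtf)                    return 0 ;;
    jpg|jpeg|png|tiff|tif)      return 0 ;;  # images
    *)                          return 1 ;;
  esac
}

# Example use: report files in the current directory that would be skipped.
for f in ./*; do
  [ -f "$f" ] || continue
  is_supported "$f" || echo "not in the supported list: $f"
done
```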

What about...

As mentioned above, Lodestone isn't some magical new technology. EDMS and DMS systems have been around for a long time, but unfortunately they all seem to miss one or more features that I think are required for a modern filing cabinet.

Here's some of my research, but you should take a look at them yourselves.

Name Docker/Linux Web UI Modern UI Tagging Non-destructive OCR Watch Folder Email Import
MayanEDMS
Paperless ❗️

If the processor doesn't pick up your files, you may have to fake an update to them to change the timestamp. This is temporary and will be resolved in a future release. You can use the command below to update the timestamp and trigger the processor:

find . -exec touch {} \;
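
The command above also touches directories. If you would rather update only regular files, a variant like the following may work. `retouch` is a hypothetical helper name, and the directory argument is whatever folder holds your documents.

```shell
# Hypothetical helper: update the modification time of every regular file
# under the given directory (default: current directory), leaving the
# timestamps of the directories themselves unchanged.
retouch() {
  find "${1:-.}" -type f -exec touch {} +
}
```

For example, `retouch /data/storage/documents` would touch only the files under the documents directory mentioned earlier.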

Components

Name | Software Version | Docker Image
Elasticsearch | Elasticsearch v7.2.1 | lodestonehq/lodestone-elasticsearch
Document Processor | Go | lodestonehq/lodestone-document-processor
Thumbnail Processor | Go | lodestonehq/lodestone-thumbnail-processor
Web / Api | Angular v11.x / ExpressJS v4.16 | lodestonehq/lodestone-ui
Storage | minio 2019 (S3 compatible) | analogj/lodestone:storage
Queue | RabbitMQ | lodestonehq/lodestone-rabbitmq
OCR | Tika | lodestonehq/lodestone-tika

Future Development

Please see our Issues system for a list of items that have been reported. All issues for the project are contained in this repo. Issues are labeled by area affected, status, and other labels as appropriate. Below are some examples of filtering issues by label:

Please feel free to create an issue if you have an idea for a new feature, find a bug, or have a question.

Logo
