Open Paperless: Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)
Stars: ✭ 2,538 (+1712.86%)
Open Semantic Etl: Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
Stars: ✭ 165 (+17.86%)
Paperless: Scan, index, and archive all of your paper documents
Stars: ✭ 7,662 (+5372.86%)
papermerge-core: Papermerge RESTful backend structured as a reusable Django app
Stars: ✭ 103 (-26.43%)
paperbase: Open source document organizer with automatic OCR and full text search
Stars: ✭ 21 (-85%)
ingest-file: Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
Stars: ✭ 40 (-71.43%)
Ml: A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (+807.14%)
Textacy: NLP, before and after spaCy
Stars: ✭ 1,849 (+1220.71%)
Cocoaai: 🤖 The Cocoa Artificial Intelligence Lab
Stars: ✭ 134 (-4.29%)
Opaque: An encrypted data analytics platform
Stars: ✭ 129 (-7.86%)
Prenlp: Preprocessing Library for Natural Language Processing
Stars: ✭ 130 (-7.14%)
Mams For Absa: A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment analysis.
Stars: ✭ 135 (-3.57%)
Chars2vec: Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-7.14%)
Ncrfpp: NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (+1162.14%)
Zamia Ai: Free and open source A.I. system based on Python, TensorFlow and Prolog.
Stars: ✭ 133 (-5%)
Django Defectdojo: DefectDojo is an open-source application vulnerability correlation and security orchestration tool.
Stars: ✭ 1,926 (+1275.71%)
Openuba: A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-9.29%)
Footprints: 🐾 A simple registration attribution tracking solution for Laravel (UTM Parameters and Referrers)
Stars: ✭ 127 (-9.29%)
Lpr: Android license plate recognition (OCR)
Stars: ✭ 139 (-0.71%)
Sluice Networks: Code for Sluice networks: Learning what to share between loosely related tasks
Stars: ✭ 135 (-3.57%)
Tokenizer: Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-5.71%)
Trackless: Add a GDPR-friendly Google Analytics opt-in/opt-out button to your site
Stars: ✭ 127 (-9.29%)
Craft Remade: Implementation of CRAFT Text Detection
Stars: ✭ 127 (-9.29%)
Transformer str: PyTorch implementation of my new method for Scene Text Recognition (STR) based on Transformer. Equipped with Transformer, this method outperforms the best model of the aforementioned deep-text-recognition-benchmark by 7.6% on CUTE80.
Stars: ✭ 131 (-6.43%)
Neuro: 🔮 Neuro.js is a machine learning library for building AI assistants and chat-bots (WIP).
Stars: ✭ 126 (-10%)
Robin: RObust document image BINarization
Stars: ✭ 131 (-6.43%)
Qlik Py Tools: Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Stars: ✭ 135 (-3.57%)
Konoha: 🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.
Stars: ✭ 130 (-7.14%)
Hrcloud2: A full-featured home hosted Cloud Drive, Personal Assistant, App Launcher, File Converter, Streamer, Share Tool & More!
Stars: ✭ 134 (-4.29%)
Reddit Detective: Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (-7.86%)
Prosody: Helsinki Prosody Corpus and a System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (-0.71%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1129.29%)
Medquad: Medical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites
Stars: ✭ 129 (-7.86%)
Alfred Ocr: OCR & Translate using multiple interfaces for Alfred Workflow
Stars: ✭ 136 (-2.86%)
Ssocr: Seven Segment Optical Character Recognition
Stars: ✭ 133 (-5%)
Deep Lyrics: Lyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network
Stars: ✭ 127 (-9.29%)
Neuraldialog Larl: PyTorch implementation of latent space reinforcement learning for E2E dialog, published at NAACL 2019. It is released by Tiancheng Zhao (Tony) from the Dialog Research Center, LTI, CMU
Stars: ✭ 127 (-9.29%)
Zephyr Doc: Zephyr OS Documentation - Chinese Edition
Stars: ✭ 127 (-9.29%)
Rasa: 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Stars: ✭ 13,219 (+9342.14%)
Samsara: Samsara is a real-time analytics platform
Stars: ✭ 132 (-5.71%)
Rsitecatalyst: R package to access Adobe Analytics Reporting API v1.4
Stars: ✭ 125 (-10.71%)
Imhotep: Imhotep is a large-scale analytics platform built by Indeed.
Stars: ✭ 125 (-10.71%)
Timescaledb: An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
Stars: ✭ 12,211 (+8622.14%)
Kaggle Crowdflower: 1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.
Stars: ✭ 1,708 (+1120%)
Easyocr: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Stars: ✭ 13,379 (+9456.43%)
Scattertext Pydata: Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: ✭ 132 (-5.71%)
Etherpad Lite: Etherpad, a modern really-real-time collaborative document editor.
Stars: ✭ 11,937 (+8426.43%)
Walkoff Apps: WALKOFF-enabled applications. #nsacyber
Stars: ✭ 125 (-10.71%)
Uda: Unsupervised Data Augmentation (UDA)
Stars: ✭ 1,877 (+1240.71%)
Dashboards: Responsive dashboard templates 📊✨
Stars: ✭ 10,914 (+7695.71%)