
opensemanticsearch / Open Semantic Search

Licence: gpl-3.0
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)



Open Semantic Search

https://opensemanticsearch.org

Integrated search server, ETL framework for document processing (crawling, text extraction, text analysis, named entity recognition and OCR for images and embedded images in PDF), search user interfaces, text mining, text analytics and search apps for fulltext search, faceted search, exploratory search and knowledge graph search

Build

How to build the deb package for installation on a Debian or Ubuntu server, or the Docker images for running in Docker containers:

Build deb package

To build a deb package for Debian or Ubuntu, run the build script "build-deb" as root (switch user with su or sudo su):

./build-deb

Build docker images

Clone the repository including its dependencies (Git submodules):

git clone --recurse-submodules --remote-submodules https://github.com/opensemanticsearch/open-semantic-search.git

Inside the cloned directory open-semantic-search, build the Docker images using the docker-compose config docker-compose.yml:

cd open-semantic-search
docker-compose build

After the build, all Docker images, dependencies and services can be started together by docker-compose with the config file docker-compose.yml.

You can run the instance by typing:

docker-compose up

You can browse Open Semantic Search in your favourite browser at this URL:

http://localhost:8080/search/
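
The containers may need some time after "docker-compose up" before the search UI answers. The following is a minimal sketch, not part of the project, that polls the URL above until it responds with HTTP 200. The function names wait_until_up and _http_status, the timeout values and the injectable fetch parameter are assumptions made for illustration.

```python
import time
import urllib.request


def _http_status(url):
    """Return the HTTP status code of a GET request to url."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status


def wait_until_up(url, timeout=120.0, interval=2.0, fetch=None):
    """Poll url until it answers with HTTP 200 or timeout seconds pass.

    fetch is a hypothetical hook (url -> status code) so the loop can be
    exercised without a running server; by default a real request is made.
    """
    fetch = fetch or _http_status
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused/reset or an HTTP error status while the
            # services are still starting; urllib's errors derive from OSError.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_until_up("http://localhost:8080/search/") in a script after starting the containers returns True once the page is reachable and False if the timeout expires first.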

Automated tests

For CI/CD there are several kinds of automated tests:

Integration tests

The submodule Open Semantic ETL depends on several services such as Solr, the spaCy services and Tika-Server, which it accesses over HTTP and REST APIs. The automated tests therefore run as integration tests within the docker-compose environment configured in docker-compose.etl-test.yml, so that these services are available while the unit tests run.

End to end tests

Some automated integration tests and end-to-end (E2E) tests run within a web browser controlled by the browser automation framework Playwright and the Node.js/JavaScript-based test framework Jest.

You can extend the automated tests in test/test.js.

They are run by the Docker image built from Dockerfile-test and need the services of the docker-compose environment docker-compose.test.yml.
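
Both test environments are docker-compose configurations, so they can in principle be started by passing the config file with the -f option. The sketch below only illustrates that idea; the helper names compose_command and run_compose are hypothetical, and the exact invocation used by the project's CI is not documented here and may differ.

```python
import subprocess


def compose_command(compose_file, *args):
    """Build a docker-compose command line for the given config file."""
    return ["docker-compose", "-f", compose_file, *args]


def run_compose(compose_file):
    """Build and start the environment, stopping all containers as soon as
    one of them exits. Returns the exit status of docker-compose itself."""
    cmd = compose_command(
        compose_file, "up", "--build", "--abort-on-container-exit"
    )
    return subprocess.run(cmd).returncode
```

For example, run_compose("docker-compose.etl-test.yml") would start the integration test environment and run_compose("docker-compose.test.yml") the end-to-end test environment, assuming docker-compose is installed and the working directory is the repository root.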

Dependencies

Dependencies are resolved automatically when building or installing the Debian or Ubuntu packages or when building the Docker images.

The following documentation of these dependencies may help when debugging dependency issues or when installing in other environments:

Build dependencies on Source code (GIT)

Dependencies on other Git repositories / submodules of components like Open Semantic ETL are defined in the Git config file .gitmodules

The submodules are checked out automatically to the subdirectory "src" if you clone this repository with Git in recursive mode.
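
To see which submodules a checkout is expected to contain, the .gitmodules file can be read directly. The following is a minimal sketch assuming the usual .gitmodules layout of [submodule "name"] sections with key = value lines; the function name parse_gitmodules is hypothetical and not part of the project.

```python
import re


def parse_gitmodules(text):
    """Return {submodule name: {key: value}} parsed from .gitmodules text."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r'\[submodule "(.+)"\]', line)
        if header:
            current = modules.setdefault(header.group(1), {})
        elif current is not None and "=" in line:
            # Split on the first "=" only, since URLs may contain "=".
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return modules
```

Reading the file with open(".gitmodules").read() and passing the text to parse_gitmodules gives one entry per submodule, each with its path and url settings.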

Packaging dependencies of Java archives (JAR)

The submodules tika.deb and solr.deb need the JARs of Apache Tika-Server and Apache Solr.

If the JARs are not present, they are downloaded from the Apache Software Foundation by wget in the submodule's "build" script or its Dockerfile.

Installation dependencies on Debian/Ubuntu packages (DEB)

Dependencies on tools and libraries that are available in the Debian or Ubuntu package repositories are defined in the section "Depends" of the deb package config file DEBIAN/control:

https://github.com/opensemanticsearch/open-semantic-search/blob/master/DEBIAN/control
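
When installing in another environment it can help to list the package names from the "Depends" field. The sketch below is an illustrative parser, not a project tool. It assumes the standard control file syntax (fields as "Name: value", continuation lines starting with whitespace, entries separated by commas, alternatives by "|") and simply drops version constraints in parentheses. The function name parse_depends is hypothetical.

```python
import re


def parse_depends(control_text):
    """Return the Depends field as a list of alternative groups.

    Each group is a list of package names; more than one name in a group
    means the packages are alternatives ("a | b").
    """
    # Unfold continuation lines, which start with a space or tab.
    unfolded = re.sub(r"\n[ \t]+", " ", control_text)
    for line in unfolded.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "depends":
            groups = []
            for entry in value.split(","):
                entry = entry.strip()
                if not entry:
                    continue
                # Remove version constraints such as "(>= 1.0)".
                groups.append(
                    [re.sub(r"\s*\(.*?\)", "", alt).strip()
                     for alt in entry.split("|")]
                )
            return groups
    return []
```

Passing the content of DEBIAN/control to parse_depends yields the names that would have to be installed by other means on a distribution without these packages.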

Installation dependencies on Python packages (PIP)

Dependencies on Python libraries that are not available as packages of the Linux distribution but are available in the Python Package Index (PyPI) are defined in

https://github.com/opensemanticsearch/open-semantic-etl/blob/master/src/opensemanticetl/requirements.txt

These dependencies are installed automatically, either by DEBIAN/postinst when the Debian/Ubuntu packages are installed or by the docker build configured in the Dockerfile, using

pip3 install -r /usr/lib/python3/dist-packages/opensemanticetl/requirements.txt
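
When debugging an installation in another environment, it can be useful to check which of the listed Python packages are not installed. The following is a minimal sketch using the standard library module importlib.metadata (Python 3.8 or later). The function name missing_requirements is hypothetical, and the parsing is deliberately simple: it handles plain "name" or "name>=version" lines only, skips pip options and URL requirements, and does not check version constraints.

```python
import re
from importlib import metadata


def missing_requirements(requirement_lines):
    """Return names from requirements.txt lines that are not installed."""
    missing = []
    for raw in requirement_lines:
        if "://" in raw:
            continue  # URL/VCS requirements are out of scope for this sketch
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # blank lines and pip options such as "-r other.txt"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if not match:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, missing_requirements(open("/usr/lib/python3/dist-packages/opensemanticetl/requirements.txt")) lists the packages that the pip3 command above would still have to install.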
