
kreeben / Resin

Licence: MIT
Hardware-accelerated vector-based search engine. Available as an HTTP service or as an embedded library.

Projects that are alternatives of or similar to Resin

Rated Ranking Evaluator
Search Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures
Stars: ✭ 134 (-74.67%)
Mutual labels:  search, search-engine, information-retrieval
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (-31.57%)
Mutual labels:  search, search-engine, information-retrieval
Pisa
PISA: Performant Indexes and Search for Academia
Stars: ✭ 489 (-7.56%)
Mutual labels:  search, search-engine, information-retrieval
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+544.42%)
Mutual labels:  search, search-engine, information-retrieval
Lucene Solr
Apache Lucene and Solr open-source search software
Stars: ✭ 4,217 (+697.16%)
Mutual labels:  search, search-engine, information-retrieval
Alfred Npms
Alfred 3 workflow to search for npm packages with npms.io
Stars: ✭ 312 (-41.02%)
Mutual labels:  search, search-engine
Bitfunnel
A signature-based search engine
Stars: ✭ 313 (-40.83%)
Mutual labels:  search, search-engine
Xapiand
Xapiand: A RESTful Search Engine
Stars: ✭ 347 (-34.4%)
Mutual labels:  search, search-engine
Awesome Search
Awesome Search - this is all about the (e-commerce) search and its awesomeness
Stars: ✭ 361 (-31.76%)
Mutual labels:  search, search-engine
Redisearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, and aggregations.
Stars: ✭ 3,393 (+541.4%)
Mutual labels:  search, search-engine
Minsql
High-performance log search engine.
Stars: ✭ 356 (-32.7%)
Mutual labels:  search, search-engine
Jivesearch
A search engine that doesn't track you.
Stars: ✭ 364 (-31.19%)
Mutual labels:  search, search-engine
Toshi
A full-text search engine in rust
Stars: ✭ 3,373 (+537.62%)
Mutual labels:  search, search-engine
Search Engine
A math-aware search engine.
Stars: ✭ 278 (-47.45%)
Mutual labels:  search-engine, information-retrieval
Hexo Generator Search
A plugin to generate search data for Hexo.
Stars: ✭ 318 (-39.89%)
Mutual labels:  search, search-engine
Go Cyber
Your 🔵 Superintelligence
Stars: ✭ 270 (-48.96%)
Mutual labels:  search, search-engine
Instantsearch Ios
⚡️ A library of widgets and helpers to build instant-search applications on iOS.
Stars: ✭ 498 (-5.86%)
Mutual labels:  search, search-engine
Dbreeze
C# .NET MONO NOSQL ( key value store embedded ) ACID multi-paradigm database management system.
Stars: ✭ 383 (-27.6%)
Mutual labels:  search, search-engine
Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (-27.03%)
Mutual labels:  search, search-engine
Darksearch
🔍 Search engine for hidden material. Scraping dark web onions, irc logs, deep web etc...
Stars: ✭ 260 (-50.85%)
Mutual labels:  search, search-engine

⍼ Resin.Search


HTTP search engine/embedded library

Launch a Resin HTTP server or use the Resin search library to search through any vector space. With hardware-accelerated vector operations from MathNet, Resin is especially well suited to problem spaces that can be defined as vector spaces.

Vector spaces are configured by implementing IModel.
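
As a rough illustration of what a vector space configuration does, the sketch below turns text into sparse bag-of-characters vectors, one per token, and compares them by cosine similarity. It is written in Python and is not Resin's actual C# IModel interface: the names `BagOfCharsModel`, `tokenize` and `cosine` are hypothetical, and the tokenization rule (lowercase, split on whitespace) is an assumption made for the example.

```python
import math
from collections import Counter


class BagOfCharsModel:
    """Hypothetical text model (not Resin's IModel): each whitespace-separated
    token becomes a sparse vector mapping a character's code point to the
    number of times that character occurs in the token."""

    def tokenize(self, text):
        return [dict(Counter(ord(ch) for ch in token))
                for token in text.lower().split()]


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {dimension: value}."""
    dot = sum(value * b.get(dim, 0) for dim, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


model = BagOfCharsModel()
resin, rinse, oak = model.tokenize("Resin rinse oak")
print(cosine(resin, rinse))  # anagrams have identical character counts
print(cosine(resin, oak))    # these two tokens share no characters
```

A bag-of-characters vector ignores character order, so anagrams are indistinguishable from one another. A model that needs to tell them apart would have to encode position or context in some other way.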

Document database

Resin stores data as document collections. It applies your preferred IModel to your data as you write and query it. The write pipeline produces a set of indices (graphs), one for each document field, that you can interact with through the Resin web GUI, the Resin read/write JSON HTTP API, or programmatically.

Vector-based indices

Resin indices are binary search trees that cluster similar vectors together as you populate them with your data. Graph nodes are created in the Tokenize method of your model. When a node is added to the graph, its cosine angle to the existing nodes, i.e. its similarity to them, determines its position (path) within the graph.
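
The placement rule can be sketched as a small binary tree in which cosine similarity decides the branch taken at each node. This is a minimal Python illustration under stated assumptions, not Resin's implementation: the `Node` class, the `insert` function, the two threshold values, and the choice to merge near-identical vectors into an existing node are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Node:
    def __init__(self, vector, doc_id):
        self.vector = vector
        self.doc_ids = [doc_id]
        self.left = None    # subtree of vectors similar to this node
        self.right = None   # subtree of vectors dissimilar to this node


# Assumed thresholds, chosen for illustration only.
IDENTICAL = 0.999
SIMILAR = 0.5


def insert(root, vector, doc_id):
    """Walk down from the root and return the path taken, e.g. 'L', 'LR' or '='."""
    node, path = root, ""
    while True:
        similarity = cosine(node.vector, vector)
        if similarity >= IDENTICAL:
            # Treat as the same vector: record the document on the existing node.
            node.doc_ids.append(doc_id)
            return path + "="
        side = "left" if similarity > SIMILAR else "right"
        path += side[0].upper()
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(vector, doc_id))
            return path
        node = child


root = Node([1.0, 0.0], 0)
print(insert(root, [0.9, 0.1], 1))  # close to the root, goes left
print(insert(root, [0.0, 1.0], 2))  # orthogonal to the root, goes right
print(insert(root, [1.0, 0.0], 3))  # same as the root, merged into it
```

Because similar vectors keep branching the same way, they end up in the same subtree, which is what produces the clusters. A lookup can follow the same comparisons and stop at the most similar node it meets.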

Customizable vector spaces

Resin comes pre-loaded with two IModel vector space configurations: one for text and another for MNIST images. The text model has been tested by validating indices generated from Wikipedia search engine backup files, as well as by parsing Common Crawl WAT, WET and WARC files, to determine at what scale and with what accuracy Resin can operate.

The image model is included mostly as an example of how to implement your own preferred machine-learning algorithm for building custom search indices. The error rate of the image classifier is ~5%.

Performance

Currently, Wikipedia-sized data sets produce indices capable of sub-second phrase searching.

You may also

  • build, validate and optimize indices using the command-line tool Sir.Cmd
  • read efficiently by specifying which fields to return in the JSON result
  • implement messaging formats such as XML (or any other, really) if JSON is not suitable for your use case
  • construct queries that join between fields and even between collections, which you can post as JSON to the read endpoint or create programmatically
  • construct any type of indexing scheme that produces any type of embeddings, with virtually any dimensionality, using either sparse or dense vectors

Applications

Executables

  • Sir.HttpServer: HTTP search service with HTML GUI and HTTP JSON API for reading and writing.
  • Sir.Cmd: Command line tool that executes commands that implement Sir.ICommand. Write, validate, optimize and more via command-line.

Libraries

  • Sir.CommonCrawl: Command for downloading and indexing Common Crawl WAT and WET files.
  • Sir.Mnist: Command for training and testing the accuracy of an index of MNIST images.
  • Sir.Wikipedia: Command for indexing Wikipedia.
  • Sir.Search: In-process search engine.
  • Sir.Core: Shared interfaces and types, such as IModel, ICommand and IVector.

Roadmap

  • [x] v0.1a - bag-of-characters vector space language model
  • [x] v0.2a - HTTP API
  • [x] v0.3a - query language
  • [x] v0.4 - linear classifier image model
  • [ ] v0.5 - semantic language model
  • [ ] v1.0 - voice model
  • [ ] v2.0 - image-to-voice
  • [ ] v2.1 - voice-to-text
  • [ ] v2.2 - text-to-image
  • [ ] v2.3 - AI