
tantivy-search / Tantivy

License: MIT
Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

Programming Languages

rust
11053 projects

Projects that are alternatives of or similar to Tantivy

Xapiand
Xapiand: A RESTful Search Engine
Stars: ✭ 347 (-93.87%)
Mutual labels:  search-engine
Elasticsearch
The missing elasticsearch ORM for Laravel, Lumen and Native php applications
Stars: ✭ 375 (-93.38%)
Mutual labels:  search-engine
Lucene Solr
Apache Lucene and Solr open-source search software
Stars: ✭ 4,217 (-25.51%)
Mutual labels:  search-engine
Minsql
High-performance log search engine.
Stars: ✭ 356 (-93.71%)
Mutual labels:  search-engine
Jivesearch
A search engine that doesn't track you.
Stars: ✭ 364 (-93.57%)
Mutual labels:  search-engine
Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (-93.18%)
Mutual labels:  search-engine
Iveely.search
Pure java realize search engine, try to directly hit the user's search for answers.
Stars: ✭ 320 (-94.35%)
Mutual labels:  search-engine
Awesome Privacy
💡Limiting personal data leaks on the internet
Stars: ✭ 488 (-91.38%)
Mutual labels:  search-engine
Maryam
Maryam: Open-source Intelligence(OSINT) Framework
Stars: ✭ 371 (-93.45%)
Mutual labels:  search-engine
Sis
Simple image search engine
Stars: ✭ 438 (-92.26%)
Mutual labels:  search-engine
Algoliasearch Wordpress
❌🗑🙅‍♂️ Algolia Search plugin for WordPress is no longer supported. Please use our API client guide instead
Stars: ✭ 357 (-93.69%)
Mutual labels:  search-engine
Awesome Search
Awesome Search - this is all about the (e-commerce) search and its awesomeness
Stars: ✭ 361 (-93.62%)
Mutual labels:  search-engine
Para
Open source back-end server for web, mobile and IoT. The backend for busy developers. (self-hosted or hosted)
Stars: ✭ 389 (-93.13%)
Mutual labels:  search-engine
Vespa
The open big data serving engine. https://vespa.ai
Stars: ✭ 3,747 (-33.81%)
Mutual labels:  search-engine
Picky
Picky is an easy to use and fast Ruby semantic search engine that helps your users find what they are looking for.
Stars: ✭ 441 (-92.21%)
Mutual labels:  search-engine
Helm Bibtex
Search and manage bibliographies in Emacs
Stars: ✭ 328 (-94.21%)
Mutual labels:  search-engine
Dbreeze
C# .NET MONO NOSQL ( key value store embedded ) ACID multi-paradigm database management system.
Stars: ✭ 383 (-93.23%)
Mutual labels:  search-engine
Instantsearch Ios
⚡️ A library of widgets and helpers to build instant-search applications on iOS.
Stars: ✭ 498 (-91.2%)
Mutual labels:  search-engine
Pisa
PISA: Performant Indexes and Search for Academia
Stars: ✭ 489 (-91.36%)
Mutual labels:  search-engine
Opensearchserver
Open-source Enterprise Grade Search Engine Software
Stars: ✭ 408 (-92.79%)
Mutual labels:  search-engine

Join the chat at https://discord.gg/MT27AG5EVE

Tantivy

Tantivy is a full text search engine library written in Rust.

It is closer to Apache Lucene than to Elasticsearch or Apache Solr in the sense that it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine.

Tantivy is, in fact, strongly inspired by Lucene's design.

Benchmark

The following benchmark breaks down performance for different types of queries and collections.

Your mileage WILL vary depending on the nature of queries and their load.

Features

  • Full-text search
  • Configurable tokenizer (stemming available for 17 Latin languages, with third-party support for Chinese (tantivy-jieba and cang-jie), Japanese (lindera and tantivy-tokenizer-tiny-segmenter) and Korean (lindera + lindera-ko-dic-builder))
  • Fast (check out the 🐎 benchmark 🐎)
  • Tiny startup time (<10ms), perfect for command line tools
  • BM25 scoring (the same as Lucene)
  • Natural query language (e.g. (michael AND jackson) OR "king of pop")
  • Phrase query search (e.g. "michael jackson")
  • Incremental indexing
  • Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
  • Mmap directory
  • SIMD integer compression when the platform/CPU includes the SSE2 instruction set
  • Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
  • &[u8] fast fields
  • Text, i64, u64, f64, dates, and hierarchical facet fields
  • LZ4 compressed document store
  • Range queries
  • Faceted search
  • Configurable indexing (optional term frequency and position indexing)
  • Cheesy logo with a horse
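
As a rough illustration of the BM25 scoring listed above, here is a minimal standalone sketch of the textbook formula in plain Rust. It does not use Tantivy's API. The function names and the parameters k1 = 1.2 and b = 0.75 (commonly cited defaults) are assumptions made for this example, and Tantivy's internal implementation may differ in detail.

```rust
// Assumed BM25 parameters (commonly cited defaults, not read from Tantivy).
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5)),
/// where N is the number of documents and n the number containing the term.
fn idf(doc_count: u64, doc_freq: u64) -> f64 {
    let n = doc_count as f64;
    let df = doc_freq as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of a single term to a single document's score.
/// Longer-than-average documents are penalized through the `norm` term.
fn bm25_term_score(term_freq: u32, doc_len: u32, avg_doc_len: f64, idf_weight: f64) -> f64 {
    let tf = term_freq as f64;
    let norm = K1 * (1.0 - B + B * doc_len as f64 / avg_doc_len);
    idf_weight * tf * (K1 + 1.0) / (tf + norm)
}

fn main() {
    // A rare term (10 of 1000 docs) weighs more than a common one (500 of 1000).
    let rare = idf(1_000, 10);
    let common = idf(1_000, 500);
    println!("idf rare = {rare:.3}, idf common = {common:.3}");

    // Same term frequency, different document lengths.
    let short_doc = bm25_term_score(2, 50, 100.0, rare);
    let long_doc = bm25_term_score(2, 200, 100.0, rare);
    println!("short doc = {short_doc:.3}, long doc = {long_doc:.3}");
}
```

A document's score for a multi-term query is the sum of these per-term contributions. Rare terms and shorter documents score higher, all else being equal.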

Non-features

  • Distributed search is out of the scope of Tantivy. That being said, Tantivy is a library upon which one could build a distributed search engine. Features such as serializable/mergeable collector state, for instance, are within the scope of Tantivy.
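
To make the idea of mergeable collector state more concrete, here is a small hypothetical sketch in plain Rust. It is not Tantivy's collector API: the `TopKState` type, its fields, and its methods are invented for illustration, and it assumes document ids are globally unique across shards.

```rust
/// Hypothetical per-shard collector state: a hit count plus the best `limit` hits.
/// It is plain data, so it could be serialized and shipped between nodes.
#[derive(Debug, Clone, PartialEq)]
struct TopKState {
    limit: usize,
    total_hits: u64,
    /// (score, doc id) pairs, kept sorted by descending score.
    hits: Vec<(f32, u64)>,
}

impl TopKState {
    fn new(limit: usize) -> Self {
        TopKState { limit, total_hits: 0, hits: Vec::new() }
    }

    /// Record one matching document on the local shard.
    fn collect(&mut self, score: f32, doc_id: u64) {
        self.total_hits += 1;
        self.hits.push((score, doc_id));
        self.trim();
    }

    /// Fold another shard's state into this one.
    fn merge(&mut self, other: TopKState) {
        self.total_hits += other.total_hits;
        self.hits.extend(other.hits);
        self.trim();
    }

    /// Re-sort and keep only the best `limit` hits.
    /// Sorting on every call is simple rather than fast; a heap would scale better.
    fn trim(&mut self) {
        self.hits
            .sort_by(|a, b| b.0.total_cmp(&a.0).then(a.1.cmp(&b.1)));
        self.hits.truncate(self.limit);
    }
}

fn main() {
    let mut shard_a = TopKState::new(2);
    shard_a.collect(1.5, 1);
    shard_a.collect(0.3, 2);
    shard_a.collect(2.0, 3);

    let mut shard_b = TopKState::new(2);
    shard_b.collect(1.8, 10);

    // The merged state is what a coordinating node would report.
    shard_a.merge(shard_b);
    println!("{:?}", shard_a);
}
```

Because merging only needs the two states, each shard can search independently and the results can be combined in any order.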

Getting started

Tantivy works on stable Rust (>= 1.27) and supports Linux, macOS, and Windows.

How can I support this project?

There are many ways to support this project.

  • Use Tantivy and tell us about your experience on Discord or by email ([email protected])
  • Report bugs
  • Write a blog post
  • Help with documentation by asking questions or submitting PRs
  • Contribute code (you can join our Discord server)
  • Talk about Tantivy around you

Contributing code

We use the GitHub Pull Request workflow: reference a GitHub ticket and/or include a comprehensive commit message when opening a PR.

Clone and build locally

Tantivy compiles on stable Rust but requires Rust >= 1.27. To check out and build the project, run:

    git clone https://github.com/quickwit-inc/tantivy.git
    cd tantivy
    cargo build

Run tests

Some tests will not run with just cargo test because of fail-rs. To run the tests exhaustively, run ./run-tests.sh.

Debug

You might find it useful to step through the program with a debugger.

A failing test

Make sure you haven't run cargo clean after the most recent cargo test or cargo build to guarantee that the target/ directory exists. Use this bash script to find the name of the most recent debug build of Tantivy and run it under rust-gdb:

    find target/debug/ -maxdepth 1 -executable -type f -name "tantivy*" -printf '%TY-%Tm-%Td %TT %p\n' | sort -r | head -n 1 | cut -d " " -f 3 | xargs -I RECENT_DBG_TANTIVY rust-gdb RECENT_DBG_TANTIVY

Now that you are in rust-gdb, you can set breakpoints on lines and methods that match your source code and run the debug executable with flags that you normally pass to cargo test like this:

    (gdb) run --test-threads 1 --test $NAME_OF_TEST

An example

By default, cargo test also compiles everything in the examples/ directory in debug mode. This makes it easy to write examples that reproduce bugs and then debug them:

    rust-gdb target/debug/examples/$EXAMPLE_NAME
    (gdb) run