All Projects → teamtnt → Tntsearch

teamtnt / Tntsearch

Licence: mit
A fully featured full text search engine written in PHP

Programming Languages

PHP
23972 projects - #3 most used programming language

Projects that are alternatives of or similar to Tntsearch

Flexsearch
Next-Generation full text search library for Browser and Node.js
Stars: ✭ 8,108 (+201.08%)
Mutual labels:  search, search-engine, fuzzy-search, full-text-search, fuzzy
Typesense
Fast, typo tolerant, fuzzy search engine for building delightful search experiences ⚡ 🔍 ✨ An Open Source alternative to Algolia and an Easier-to-Use alternative to ElasticSearch.
Stars: ✭ 8,644 (+220.98%)
Mutual labels:  search, search-engine, fuzzy-search, full-text-search
Fess
Fess is very powerful and easily deployable Enterprise Search Server.
Stars: ✭ 561 (-79.17%)
Mutual labels:  search, search-engine, full-text-search
Redisearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, and aggregations.
Stars: ✭ 3,393 (+25.99%)
Mutual labels:  search, search-engine, fulltext
Scout
RESTful search server written in Python, powered by SQLite.
Stars: ✭ 213 (-92.09%)
Mutual labels:  search, search-engine, full-text-search
Manticoresearch
Database for search
Stars: ✭ 610 (-77.35%)
Mutual labels:  search, search-engine, full-text-search
Minisearch
Tiny and powerful JavaScript full-text search engine for browser and Node
Stars: ✭ 737 (-72.63%)
Mutual labels:  search, search-engine, fuzzy-search
Leaderf
An efficient fuzzy finder that helps to locate files, buffers, mrus, gtags, etc. on the fly for both vim and neovim.
Stars: ✭ 1,733 (-35.65%)
Mutual labels:  fuzzy-search, fuzzy-matching, fuzzy
Lolcate Rs
Lolcate -- A comically fast way of indexing and querying your filesystem. Replaces locate / mlocate / updatedb. Written in Rust.
Stars: ✭ 191 (-92.91%)
Mutual labels:  search, search-engine
Similar Text Finder
🐝 PHP Similar Text Finder aka Fuzzy search. `Did you mean "banana"?`
Stars: ✭ 126 (-95.32%)
Mutual labels:  search, fuzzy-search
Fuse
🔍 Fuzzy search for PHP based on the Bitap algorithm
Stars: ✭ 189 (-92.98%)
Mutual labels:  search, fuzzy-search
List.js
The perfect library for adding search, sort, filters and flexibility to tables, lists and various HTML elements. Built to be invisible and work on existing HTML.
Stars: ✭ 10,650 (+295.47%)
Mutual labels:  search, fuzzy-search
Downloadsearch
search for any kinds of files to download
Stars: ✭ 124 (-95.4%)
Mutual labels:  search, search-engine
Instantsearch Android
A library of widgets and helpers to build instant-search applications on Android.
Stars: ✭ 129 (-95.21%)
Mutual labels:  search, search-engine
Whoogle Search
A self-hosted, ad-free, privacy-respecting metasearch engine
Stars: ✭ 4,645 (+72.48%)
Mutual labels:  search, search-engine
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (-26.62%)
Mutual labels:  fuzzy-search, fuzzy-matching
Sonic
🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
Stars: ✭ 12,347 (+358.48%)
Mutual labels:  search, search-engine
Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (-40.22%)
Mutual labels:  search, fuzzy-search
Rated Ranking Evaluator
Search Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures
Stars: ✭ 134 (-95.02%)
Mutual labels:  search, search-engine
Search
An Open Source Search Engine
Stars: ✭ 139 (-94.84%)
Mutual labels:  search, search-engine

Latest Version on Packagist Total Downloads Software License Build Status Slack Status

TNTSearch

TNTSearch

TNTSearch is a full-text search (FTS) engine written entirely in PHP. A simple configuration allows you to add an amazing search experience in just minutes. Features include:

  • Fuzzy search
  • Search as you type
  • Geo-search
  • Text classification
  • Stemming
  • Custom tokenizers
  • Bm25 ranking algorithm
  • Boolean search
  • Result highlighting
  • Dynamic index updates (no need to reindex each time)
  • Easily deployable via Packagist.org

We also created some demo pages that show tolerant retrieval with n-grams in action. The package has a bunch of helper functions like Jaro-Winkler and Cosine similarity for distance calculations. It supports stemming for English, Croatian, Arabic, Italian, Russian, Portuguese and Ukrainian. If the built-in stemmers aren't enough, the engine lets you easily plugin any compatible snowball stemmer. Some forks of the package even support Chinese. And please contribute other languages!

Unlike many other engines, the index can be easily updated without doing a reindex or using deltas.

View online demo  |  Follow us on Twitter, or Facebook  |  Visit our sponsors:


Demo

Tutorials

Premium products

If you're using TNT Search and finding it useful, take a look at our premium analytics tool:

Support us on Open Collective

Installation

The easiest way to install TNTSearch is via composer:

composer require teamtnt/tntsearch

Requirements

Before you proceed, make sure your server meets the following requirements:

  • PHP >= 7.1
  • PDO PHP Extension
  • SQLite PHP Extension
  • mbstring PHP Extension

Examples

Creating an index

In order to be able to make full text search queries, you have to create an index.

Usage:

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig([
    'driver'    => 'mysql',
    'host'      => 'localhost',
    'database'  => 'dbname',
    'username'  => 'user',
    'password'  => 'pass',
    'storage'   => '/var/www/tntsearch/examples/',
    'stemmer'   => \TeamTNT\TNTSearch\Stemmer\PorterStemmer::class//optional
]);

$indexer = $tnt->createIndex('name.index');
$indexer->query('SELECT id, article FROM articles;');
//$indexer->setLanguage('german');
$indexer->run();

Important: "storage" settings marks the folder where all of your indexes will be saved so make sure to have permission to write to this folder otherwise you might expect the following exception thrown:

  • [PDOException] SQLSTATE[HY000] [14] unable to open database file *

Note: If your primary key is different than id set it like:

$indexer->setPrimaryKey('article_id');

Making the primary key searchable

By default, the primary key isn't searchable. If you want to make it searchable, simply run:

$indexer->includePrimaryKey();

Searching

Searching for a phrase or keyword is trivial:

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig($config);
$tnt->selectIndex("name.index");

$res = $tnt->search("This is a test search", 12);

print_r($res); //returns an array of 12 document ids that best match your query

// to display the results you need an additional query against your application database
// SELECT * FROM articles WHERE id IN $res ORDER BY FIELD(id, $res);

The ORDER BY FIELD clause is important, otherwise the database engine will not return the results in the required order.

Boolean Search

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig($config);
$tnt->selectIndex("name.index");

//this will return all documents that have romeo in it but not juliet
$res = $tnt->searchBoolean("romeo -juliet");

//returns all documents that have romeo or hamlet in it
$res = $tnt->searchBoolean("romeo or hamlet");

//returns all documents that have either romeo AND juliet or prince AND hamlet
$res = $tnt->searchBoolean("(romeo juliet) or (prince hamlet)");

Fuzzy Search

The fuzziness can be tweaked by setting the following member variables:

public $fuzzy_prefix_length  = 2;
public $fuzzy_max_expansions = 50;
public $fuzzy_distance       = 2; //represents the Levenshtein distance;
use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig($config);
$tnt->selectIndex("name.index");
$tnt->fuzziness = true;

//when the fuzziness flag is set to true, the keyword juleit will return
//documents that match the word juliet, the default Levenshtein distance is 2
$res = $tnt->search("juleit");

Updating the index

Once you created an index, you don't need to reindex it each time you make some changes to your document collection. TNTSearch supports dynamic index updates.

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig($config);
$tnt->selectIndex("name.index");

$index = $tnt->getIndex();

//to insert a new document to the index
$index->insert(['id' => '11', 'title' => 'new title', 'article' => 'new article']);

//to update an existing document
$index->update(11, ['id' => '11', 'title' => 'updated title', 'article' => 'updated article']);

//to delete the document from index
$index->delete(12);

Custom Tokenizer

First, create your own Tokenizer class. It should extend AbstractTokenizer class, define word split $pattern value and must implement TokenizerInterface:

use TeamTNT\TNTSearch\Support\AbstractTokenizer;
use TeamTNT\TNTSearch\Support\TokenizerInterface;

class SomeTokenizer extends AbstractTokenizer implements TokenizerInterface
{
    static protected $pattern = '/[\s,\.]+/';

    public function tokenize($text) {
        return preg_split($this->getPattern(), strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    }
}

This tokenizer will split words using spaces, commas and periods.

After you have the tokenizer ready, you should pass it to TNTIndexer via setTokenizer method.

$someTokenizer = new SomeTokenizer;

$indexer = new TNTIndexer;
$indexer->setTokenizer($someTokenizer);

Another way would be to pass the tokenizer via config:

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

$tnt->loadConfig([
    'driver'    => 'mysql',
    'host'      => 'localhost',
    'database'  => 'dbname',
    'username'  => 'user',
    'password'  => 'pass',
    'storage'   => '/var/www/tntsearch/examples/',
    'stemmer'   => \TeamTNT\TNTSearch\Stemmer\PorterStemmer::class//optional,
    'tokenizer' => \TeamTNT\TNTSearch\Support\SomeTokenizer::class
]);

$indexer = $tnt->createIndex('name.index');
$indexer->query('SELECT id, article FROM articles;');
$indexer->run();

Geo Search

Indexing

$candyShopIndexer = new TNTGeoIndexer;
$candyShopIndexer->loadConfig($config);
$candyShopIndexer->createIndex('candyShops.index');
$candyShopIndexer->query('SELECT id, longitude, latitude FROM candy_shops;');
$candyShopIndexer->run();

Searching

$currentLocation = [
    'longitude' => 11.576124,
    'latitude'  => 48.137154
];

$distance = 2; //km

$candyShopIndex = new TNTGeoSearch();
$candyShopIndex->loadConfig($config);
$candyShopIndex->selectIndex('candyShops.index');

$candyShops = $candyShopIndex->findNearest($currentLocation, $distance, 10);

Classification

use TeamTNT\TNTSearch\Classifier\TNTClassifier;

$classifier = new TNTClassifier();
$classifier->learn("A great game", "Sports");
$classifier->learn("The election was over", "Not sports");
$classifier->learn("Very clean match", "Sports");
$classifier->learn("A clean but forgettable game", "Sports");

$guess = $classifier->predict("It was a close election");
var_dump($guess['label']); //returns "Not sports"

Saving the classifier

$classifier->save('sports.cls');

Loading the classifier

$classifier = new TNTClassifier();
$classifier->load('sports.cls');

Drivers

PS4Ware

You're free to use this package, but if it makes it to your production environment, we would highly appreciate you sending us a PS4 game of your choice. This way you support us to further develop and add new features.

Our address is: TNT Studio, Sv. Mateja 19, 10010 Zagreb, Croatia.

We'll publish all received games here

Support OpenCollective OpenCollective

Buy Me a Coffee at ko-fi.com

Backers

Support us with a monthly donation and help us continue our activities. [Become a backer]

Sponsors

Become a sponsor and get your logo on our README on Github with a link to your site. [Become a sponsor]

Credits

License

The MIT License (MIT). Please see License File for more information.


From Croatia with by TNT Studio (@tntstudiohr, blog)

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].