All Projects → gmamaladze → Trienet

gmamaladze / Trienet

Licence: mit
.NET Implementations of Trie Data Structures for Substring Search, Auto-completion and Intelli-sense. Includes: patricia trie, suffix trie and a trie implementation using Ukkonen's algorithm.

Projects that are alternatives of or similar to Trienet

C Plus Plus
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.
Stars: ✭ 17,151 (+13958.2%)
Mutual labels:  algorithms, search, data-structures
C
Collection of various algorithms in mathematics, machine learning, computer science, physics, etc implemented in C for educational purposes.
Stars: ✭ 11,897 (+9651.64%)
Mutual labels:  algorithms, search, data-structures
Java
All Algorithms implemented in Java
Stars: ✭ 42,893 (+35058.2%)
Mutual labels:  algorithms, search, data-structures
Go
Algorithms Implemented in GoLang
Stars: ✭ 7,385 (+5953.28%)
Mutual labels:  algorithms, search, data-structures
Algorithms
A collection of common algorithms and data structures implemented in java, c++, and python.
Stars: ✭ 142 (+16.39%)
Mutual labels:  algorithms, data-structures, trie
Dsa.js Data Structures Algorithms Javascript
🥞Data Structures and Algorithms explained and implemented in JavaScript + eBook
Stars: ✭ 6,251 (+5023.77%)
Mutual labels:  algorithms, search, data-structures
Algorithms
In case you want to contribute, ping on https://gitter.im/NITSkmOS/algo.
Stars: ✭ 95 (-22.13%)
Mutual labels:  algorithms, data-structures
Softuni
SoftUni Courses
Stars: ✭ 98 (-19.67%)
Mutual labels:  algorithms, data-structures
Competitiveprogrammingquestionbank
This repository contains all the popular competitive programming and DSA questions with solutions.
Stars: ✭ 122 (+0%)
Mutual labels:  algorithms, data-structures
Data Structures And Algorithms
A collection of some implementations of data structures and algorithms.
Stars: ✭ 101 (-17.21%)
Mutual labels:  algorithms, data-structures
Javascript
A repository for All algorithms implemented in Javascript (for educational purposes only)
Stars: ✭ 16,117 (+13110.66%)
Mutual labels:  search, data-structures
Data Structures
Data-Structures using C++.
Stars: ✭ 121 (-0.82%)
Mutual labels:  data-structures, trie
Haskell
Stars: ✭ 91 (-25.41%)
Mutual labels:  algorithms, data-structures
Awesome Coding Javascript
📌 持续构建个人的源码库(JavaScript 原生、常用库、数据结构、算法)
Stars: ✭ 88 (-27.87%)
Mutual labels:  algorithms, data-structures
Algorithms
Algorithms and data structures implemented in JavaScript with explanations, for further readings
Stars: ✭ 99 (-18.85%)
Mutual labels:  algorithms, data-structures
Data Structures And Algorithms
Python implementation of common algorithms and data structures interview questions
Stars: ✭ 84 (-31.15%)
Mutual labels:  algorithms, data-structures
Flatqueue
A very fast and simple JavaScript priority queue
Stars: ✭ 98 (-19.67%)
Mutual labels:  algorithms, data-structures
Code With Love
Open source programming algorithms
Stars: ✭ 107 (-12.3%)
Mutual labels:  algorithms, data-structures
Index
Metarhia educational program index 📖
Stars: ✭ 2,045 (+1576.23%)
Mutual labels:  algorithms, data-structures
Ultimate Go
This repo contains my notes on working with Go and computer systems.
Stars: ✭ 1,530 (+1154.1%)
Mutual labels:  algorithms, data-structures

Build status NuGet version

TrieNet - The library provides .NET Data Structures for Prefix String Search and Substring (Infix) Search to Implement Auto-completion and Intelli-sense.

usage

  nuget install TrieNet
using Gma.DataStructures.StringSearch;
	
...

var trie = new UkkonenTrie<int>(3);
//var trie = new SuffixTrie<int>(3);

trie.Add("hello", 1);
trie.Add("world", 2);
trie.Add("hell", 3);

var result = trie.Retrieve("hel");

updates

Added UkkonenTrie<T> which is a trie implementation using Ukkonen's algorithm. Finally I managed to port (largely rewritten) a java implementation of Generalized Suffix Tree using Ukkonen's algorithm by Alessandro Bahgat (THANKS!).

I have not made all measurements yet, but it occurs to have significatly imroved build-up and look-up times.

trienet

you liked it, you find it useful

so I migrated it from dying https://trienet.codeplex.com/

  nuget install TrieNet

and created a NuGet package.

motivation

If you are implementing a modern user friendly peace of software you will very probably need something like this:

Or this:

I have seen manyquestions about an efficient way of implementing a (prefix or infix) search over a key value pairs where keys are strings (for instance see:http://stackoverflow.com/questions/10472881/search-liststring-for-string-startswith).

So it depends:

  • If your data source is aSQL or some other indexed database holdig your data it makes sense to utilize it’s search capabilities and issue a query to find maching records.

  • If you have a small ammount of data, a linear scan will be probably the most efficient.

IEnumerable> keyValuePairs;
...
var result = keyValuePairs.Select(pair => pair.Key.Contains(searchString));
  • If you are seraching in a large set of key value records you may need a special data structure to perform your seach efficiently.

trie

There is a family of data structures reffered as Trie. In this post I want to focus on a c# implementations and usage of Trie data structures. If you want to find out more about the theory behind the data structure itself Google will be probably your best friend. In fact most of popular books on data structures and algorithms describe tries (see.: Advanced Data Structures by Peter Brass)

implementation

The only working .NET implementation I found so far was this one:http://geekyisawesome.blogspot.de/2010/07/c-trie.html

Having some concerns about interface usability, implementation details and performance I have decided to implement it from scratch.

My small library contains a bunch of trie data structures all having the same interface:

public interface ITrie
{
  IEnumerable Retrieve(string query);
  void Add(string key, TValue value);
}
Class Description
Trie the simple trie, allows only prefix search, like .Where(s => s.StartsWith(searchString))
SuffixTrie allows also infix search, like .Where(s => s.Contains(searchString))
PatriciaTrie compressed trie, more compact, a bit more efficient during look-up, but a quite slower durig build-up.
SuffixPatriciaTrie the same as PatriciaTrie, also enabling infix search.
ParallelTrie very primitively implemented parallel data structure which allows adding data and retriving results from different threads simultaneusly.

preformance

Important: all diagrams are given in logarithmic scale on x-axis.

To answer the question about when to use trie vs. linear search beter I’v experimeted with real data. As you can see below using a trie data structure may already be reasonable after 10.000 records if you are expecting many queries on the same data set.

Look-up times on patricia are slightly better, advantages of patricia bacame more noticable if you work with strings having many repeating parts, like quelified names of classes in sourcecode files, namespaces, variable names etc. So if you are indexing source code or something similar it makes sense to use patricia …

… even if the build-up time of patricia is higher compared to the normal trie.

demo app

The app demonstrates indexing of large text files and look-up inside them. I have experimented with huge texts containing millions of words. Indexing took usually only several seconds and the look-up delay was still unnoticable for the user.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].