dguo / Strsim Rs

License: MIT
🔤 Rust implementations of string similarity metrics

Projects that are alternatives to or similar to Strsim Rs

Go Edlib
Golang string comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.
Stars: ✭ 253 (+20.48%)
Mutual labels:  levenshtein
Symspellcompound
SymSpellCompound: compound aware automatic spelling correction
Stars: ✭ 61 (-70.95%)
Mutual labels:  levenshtein
Fastenshtein
The fastest .Net Levenshtein around
Stars: ✭ 115 (-45.24%)
Mutual labels:  levenshtein
Closestmatch
Golang library for fuzzy matching within a set of strings 📃
Stars: ✭ 353 (+68.1%)
Mutual labels:  levenshtein
Node Damerau Levenshtein
Damerau-Levenshtein distance function for Node
Stars: ✭ 27 (-87.14%)
Mutual labels:  levenshtein
Str metrics
Ruby gem (native extension in Rust) providing implementations of various string metrics
Stars: ✭ 68 (-67.62%)
Mutual labels:  levenshtein
RepostCheckerBot
Bot for checking reposts on reddit
Stars: ✭ 36 (-82.86%)
Mutual labels:  levenshtein
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+840.95%)
Mutual labels:  levenshtein
Levenshtein
Levenshtein distance and similarity metrics with customizable edit costs and Winkler-like bonus for common prefix.
Stars: ✭ 57 (-72.86%)
Mutual labels:  levenshtein
Jellyfish
🎐 a python library for doing approximate and phonetic matching of strings.
Stars: ✭ 1,571 (+648.1%)
Mutual labels:  levenshtein
Symspellpy
Python port of SymSpell
Stars: ✭ 420 (+100%)
Mutual labels:  levenshtein
Rapidfuzz
Rapid fuzzy string matching in Python using the Levenshtein Distance
Stars: ✭ 809 (+285.24%)
Mutual labels:  levenshtein
Stopwords
Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics.
Stars: ✭ 83 (-60.48%)
Mutual labels:  levenshtein
Js Levenshtein
The most efficient JS implementation calculating the Levenshtein distance, i.e. the difference between two strings.
Stars: ✭ 269 (+28.1%)
Mutual labels:  levenshtein
Dictomaton
Finite state dictionaries in Java
Stars: ✭ 124 (-40.95%)
Mutual labels:  levenshtein
similar-english-words
Give me a word and I’ll give you an array of words that differ by a single letter.
Stars: ✭ 25 (-88.1%)
Mutual labels:  levenshtein
Edit Distance
Python library for computing edit distance between arbitrary Python sequences.
Stars: ✭ 61 (-70.95%)
Mutual labels:  levenshtein
Textdistance
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
Stars: ✭ 2,575 (+1126.19%)
Mutual labels:  levenshtein
Levenshtein
Go implementation to calculate Levenshtein Distance.
Stars: ✭ 125 (-40.48%)
Mutual labels:  levenshtein
Abydos
Abydos NLP/IR library for Python
Stars: ✭ 91 (-56.67%)
Mutual labels:  levenshtein

strsim-rs

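Rust implementations of string similarity metrics: Hamming, Levenshtein (distance and normalized), optimal string alignment (OSA), Damerau-Levenshtein (distance and normalized), Jaro, Jaro-Winkler, and Sørensen-Dice.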

The normalized versions return values between 0.0 and 1.0, where 1.0 means an exact match.
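
To make the normalized scale concrete, here is a minimal sketch of how an edit distance can be mapped onto the 0.0 to 1.0 range. It assumes the common convention of dividing by the longer input's length, which matches the 0.571 shown for "kitten" and "sitting" in the examples (1 - 3/7). The function name normalize_distance is hypothetical and not part of strsim's API.

```rust
/// Map an edit distance onto a similarity score in [0.0, 1.0].
/// Hypothetical helper for illustration; not part of strsim.
/// Assumes `distance` never exceeds the longer input's length.
pub fn normalize_distance(distance: usize, len_a: usize, len_b: usize) -> f64 {
    let max_len = len_a.max(len_b);
    if max_len == 0 {
        // Two empty inputs are treated as an exact match.
        return 1.0;
    }
    1.0 - (distance as f64) / (max_len as f64)
}
```

With a distance of 3 and lengths 6 and 7, this gives roughly 0.571. A distance of 0 gives 1.0, and a distance equal to the longer length gives 0.0.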

There are also generic versions of the functions for non-string inputs.

Installation

strsim is available on crates.io. Add it to your Cargo.toml:

[dependencies]
strsim = "0.10.0"

Usage

See Docs.rs for the full documentation. You can also clone the repo and run $ cargo doc --open to build and browse the documentation locally.

Examples

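extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance,
             damerau_levenshtein, normalized_damerau_levenshtein, jaro,
             jaro_winkler, sorensen_dice};

fn main() {
    // Hamming distance is only defined for equal-length strings, so
    // `hamming` returns a Result; an Err here would be a bug.
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    // Levenshtein: minimum number of single-character insertions,
    // deletions, and substitutions.
    assert_eq!(levenshtein("kitten", "sitting"), 3);

    // Normalized: 1 - 3/7 ≈ 0.571
    assert!((normalized_levenshtein("kitten", "sitting") - 0.571).abs() < 0.001);

    // OSA counts an adjacent transposition as one edit, but never edits
    // the same substring twice.
    assert_eq!(osa_distance("ac", "cba"), 3);

    // Damerau-Levenshtein has no such restriction: "ac" -> "ca" -> "cba".
    assert_eq!(damerau_levenshtein("ac", "cba"), 2);

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.272).abs() <
            0.001);

    // Jaro and Jaro-Winkler return similarity scores, not distances.
    assert!((jaro("Friedrich Nietzsche", "Jean-Paul Sartre") - 0.392).abs() <
            0.001);

    assert!((jaro_winkler("cheeseburger", "cheese fries") - 0.911).abs() <
            0.001);

    // Sørensen-Dice similarity, based on shared character bigrams.
    assert_eq!(sorensen_dice("web applications", "applications of the web"),
        0.7878787878787878);
}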
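
The example above gives different results for osa_distance and damerau_levenshtein on the same inputs. The following sketch shows where the OSA value of 3 comes from. It uses the textbook optimal string alignment recurrence and only the standard library. It is an illustration under that assumption, not strsim's actual implementation, and the name osa_distance_sketch is hypothetical.

```rust
/// Textbook optimal string alignment (OSA) distance over Unicode scalar values.
/// Illustrative sketch only; strsim's own implementation may differ.
pub fn osa_distance_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // d[i][j] = distance between the first i chars of `a` and the first j of `b`.
    let mut d = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for (i, row) in d.iter_mut().enumerate() {
        row[0] = i; // delete i characters
    }
    for (j, cell) in d[0].iter_mut().enumerate() {
        *cell = j; // insert j characters
    }

    for i in 1..=a.len() {
        for j in 1..=b.len() {
            let cost = usize::from(a[i - 1] != b[j - 1]);
            let mut best = (d[i - 1][j] + 1) // deletion
                .min(d[i][j - 1] + 1) // insertion
                .min(d[i - 1][j - 1] + cost); // match or substitution

            // Adjacent transposition: the last two characters are swapped.
            // It builds on d[i - 2][j - 2], so the swapped pair cannot be
            // edited again afterwards.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[a.len()][b.len()]
}
```

For "ac" and "cba" this returns 3. The shorter path "ac" -> "ca" -> "cba" inserts a character between the two that were just swapped, which OSA does not allow. Unrestricted Damerau-Levenshtein does allow it, which is why it reports 2.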

Using the generic versions of the functions:

extern crate strsim;

use strsim::generic_levenshtein;

fn main() {
    assert_eq!(2, generic_levenshtein(&[1, 2, 3], &[0, 2, 5]));
}
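
To illustrate why a generic version is possible, here is a minimal sketch of Levenshtein distance over slices of any element type that supports equality comparison. It depends only on the standard library. The name levenshtein_slices and its signature are hypothetical and chosen for this example, so they do not reflect how generic_levenshtein is declared in strsim.

```rust
/// Levenshtein distance between two slices of comparable elements.
/// Hypothetical helper for illustration; not strsim's API.
pub fn levenshtein_slices<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] is the distance between the elements of `a` processed so far
    // (minus the current one) and the first j elements of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];

    for (i, item_a) in a.iter().enumerate() {
        // Turning i + 1 elements into an empty slice takes i + 1 deletions.
        curr[0] = i + 1;
        for (j, item_b) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(item_a != item_b);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}
```

Because the only requirement on the elements is PartialEq, the same function works for integers, chars collected from a string, or tokens of a user-defined type. Only two rows of the table are kept, so memory use grows with the length of b rather than with the product of both lengths.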

Contributing

If you have Docker installed and don't want to install Rust itself, you can run $ ./dev for a development CLI.

Benchmarks require a Nightly toolchain. Run $ cargo +nightly bench.

License

MIT