dguo / Strsim Rs
License: MIT
🔤 Rust implementations of string similarity metrics
Projects that are alternatives of or similar to Strsim Rs
Go Edlib
Golang string comparison and edit distance algorithms library, featuring : Levenshtein, LCS, Hamming, Damerau levenshtein (OSA and Adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc...
Stars: ✭ 253 (+20.48%)
Mutual labels: levenshtein
Symspellcompound
SymSpellCompound: compound aware automatic spelling correction
Stars: ✭ 61 (-70.95%)
Mutual labels: levenshtein
Closestmatch
Golang library for fuzzy matching within a set of strings 📃
Stars: ✭ 353 (+68.1%)
Mutual labels: levenshtein
Node Damerau Levenshtein
Damerau - Levenstein distance function for node
Stars: ✭ 27 (-87.14%)
Mutual labels: levenshtein
Str metrics
Ruby gem (native extension in Rust) providing implementations of various string metrics
Stars: ✭ 68 (-67.62%)
Mutual labels: levenshtein
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+840.95%)
Mutual labels: levenshtein
Levenshtein
Levenshtein distance and similarity metrics with customizable edit costs and Winkler-like bonus for common prefix.
Stars: ✭ 57 (-72.86%)
Mutual labels: levenshtein
Jellyfish
🎐 a python library for doing approximate and phonetic matching of strings.
Stars: ✭ 1,571 (+648.1%)
Mutual labels: levenshtein
Rapidfuzz
Rapid fuzzy string matching in Python using the Levenshtein Distance
Stars: ✭ 809 (+285.24%)
Mutual labels: levenshtein
Stopwords
Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics.
Stars: ✭ 83 (-60.48%)
Mutual labels: levenshtein
Js Levenshtein
The most efficient JS implementation calculating the Levenshtein distance, i.e. the difference between two strings.
Stars: ✭ 269 (+28.1%)
Mutual labels: levenshtein
similar-english-words
Give me a word and I’ll give you an array of words that differ by a single letter.
Stars: ✭ 25 (-88.1%)
Mutual labels: levenshtein
Edit Distance
Python library for computing edit distance between arbitrary Python sequences.
Stars: ✭ 61 (-70.95%)
Mutual labels: levenshtein
Textdistance
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
Stars: ✭ 2,575 (+1126.19%)
Mutual labels: levenshtein
Levenshtein
Go implementation to calculate Levenshtein Distance.
Stars: ✭ 125 (-40.48%)
Mutual labels: levenshtein
strsim-rs
Rust implementations of string similarity metrics:
- Hamming
- Levenshtein - distance & normalized
- Optimal string alignment
- Damerau-Levenshtein - distance & normalized
- Jaro and Jaro-Winkler - this implementation of Jaro-Winkler does not limit the common prefix length
- Sørensen-Dice
The normalized versions return values between 0.0 and 1.0, where 1.0 means an exact match.
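As a rough illustration of what "normalized" means here, the sketch below computes a Levenshtein distance and scales it by the length of the longer input. It is a minimal standalone sketch, not the crate's source: the function names `levenshtein_sketch` and `normalized_levenshtein_sketch` are hypothetical, and dividing by the longer input's character count is an assumption chosen to be consistent with the 0.571 value for "kitten" and "sitting" in the examples.

```rust
/// Levenshtein distance over `char`s, keeping only two rows of the DP table.
/// Hypothetical helper for illustration; not part of strsim.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = Vec::with_capacity(b_chars.len() + 1);
        curr.push(i + 1); // turning i + 1 chars into an empty string takes i + 1 deletions
        for (j, cb) in b_chars.iter().enumerate() {
            let cost = if ca == *cb { 0 } else { 1 };
            let best = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)   // deletion
                .min(curr[j] + 1);      // insertion
            curr.push(best);
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Similarity in [0.0, 1.0]; 1.0 means an exact match.
/// Assumption: normalize by the character count of the longer input.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0; // two empty strings are treated as identical
    }
    1.0 - levenshtein_sketch(a, b) as f64 / max_len as f64
}
```

With these definitions, "kitten" and "sitting" have distance 3 and a longer length of 7, so the similarity is 1 - 3/7, roughly 0.571.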
There are also generic versions of the functions for non-string inputs.
Installation
strsim is available on crates.io. Add it to your Cargo.toml:
[dependencies]
strsim = "0.10.0"
Usage
Go to Docs.rs for the full documentation. You can also clone the repo and run $ cargo doc --open.
Examples
extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance,
             damerau_levenshtein, normalized_damerau_levenshtein, jaro,
             jaro_winkler, sorensen_dice};

fn main() {
    // hamming returns a Result, so handle the error case explicitly.
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    assert_eq!(levenshtein("kitten", "sitting"), 3);

    assert!((normalized_levenshtein("kitten", "sitting") - 0.571).abs() < 0.001);

    // Optimal string alignment and Damerau-Levenshtein can disagree on the same input.
    assert_eq!(osa_distance("ac", "cba"), 3);

    assert_eq!(damerau_levenshtein("ac", "cba"), 2);

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.272).abs() <
            0.001);

    assert!((jaro("Friedrich Nietzsche", "Jean-Paul Sartre") - 0.392).abs() <
            0.001);

    assert!((jaro_winkler("cheeseburger", "cheese fries") - 0.911).abs() <
            0.001);

    assert_eq!(sorensen_dice("web applications", "applications of the web"),
               0.7878787878787878);
}
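To show where a value like the Sørensen-Dice score above can come from, here is a minimal standalone sketch of a bigram-based Dice coefficient. It is not the crate's source: the name `sorensen_dice_sketch` is hypothetical, and stripping whitespace and counting repeated bigrams as a multiset are assumptions chosen so that the result agrees with the 0.7878 figure in the example.

```rust
use std::collections::HashMap;

/// Dice coefficient over character bigrams: 2 * shared / (bigrams_a + bigrams_b).
/// Hypothetical helper for illustration; not part of strsim.
fn sorensen_dice_sketch(a: &str, b: &str) -> f64 {
    // Assumption: whitespace is ignored before bigrams are formed.
    let a: Vec<char> = a.chars().filter(|c| !c.is_whitespace()).collect();
    let b: Vec<char> = b.chars().filter(|c| !c.is_whitespace()).collect();

    if a == b {
        return 1.0; // identical inputs (including two empty ones) match exactly
    }
    if a.len() < 2 || b.len() < 2 {
        return 0.0; // no bigrams can be formed from fewer than two characters
    }

    // Count each bigram of `a`, so repeated bigrams are matched at most as often as they occur.
    let mut counts: HashMap<(char, char), usize> = HashMap::new();
    for w in a.windows(2) {
        *counts.entry((w[0], w[1])).or_insert(0) += 1;
    }

    // Consume one count for every bigram of `b` that is still available.
    let mut shared = 0usize;
    for w in b.windows(2) {
        if let Some(n) = counts.get_mut(&(w[0], w[1])) {
            if *n > 0 {
                *n -= 1;
                shared += 1;
            }
        }
    }

    2.0 * shared as f64 / ((a.len() - 1) + (b.len() - 1)) as f64
}
```

For "web applications" and "applications of the web", the inputs yield 14 and 19 bigrams with 13 in common, giving 26/33.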
Using the generic versions of the functions:
extern crate strsim;

use strsim::generic_levenshtein;

fn main() {
    assert_eq!(2, generic_levenshtein(&[1, 2, 3], &[0, 2, 5]));
}
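The idea behind a generic metric can be sketched without the crate: the algorithm only needs to compare elements for equality, so it can work on any slice. The example below is a minimal standalone sketch of a generic Hamming distance, not strsim's API. The names `hamming_sketch` and `DifferentLengths` are hypothetical, and returning an error for inputs of unequal length is an assumption that mirrors the `Result` handled in the string example.

```rust
/// Hypothetical error type for this sketch; not a strsim type.
#[derive(Debug, PartialEq)]
struct DifferentLengths;

/// Counts the positions at which two equal-length slices differ.
/// Works for any element type that supports `==`.
fn hamming_sketch<T: PartialEq>(a: &[T], b: &[T]) -> Result<usize, DifferentLengths> {
    if a.len() != b.len() {
        // Hamming distance is only defined for sequences of the same length.
        return Err(DifferentLengths);
    }
    Ok(a.iter().zip(b.iter()).filter(|(x, y)| x != y).count())
}
```

Calling `hamming_sketch(&[1, 2, 3], &[0, 2, 5])` yields `Ok(2)`, and the same function accepts a `Vec<char>` collected from a string, which is one way to treat text as a generic sequence.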
Contributing
If you don't want to install Rust itself, you can run $ ./dev for a development CLI if you have Docker installed.
Benchmarks require a Nightly toolchain. Run $ cargo +nightly bench.
License
MIT