strutil
Golang metrics for calculating string similarity and other string utility functions
Stars: ✭ 114 (+90%)
eddie
No description or website provided.
Stars: ✭ 18 (-70%)
stringosim
String similarity functions, string distances, Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS - Longest Common Subsequence, Cosine similarity...
Stars: ✭ 47 (-21.67%)
set-sketch-paper
SetSketch: Filling the Gap between MinHash and HyperLogLog
Stars: ✭ 23 (-61.67%)
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+3193.33%)
Jellyfish
🎐 A Python library for doing approximate and phonetic matching of strings.
Stars: ✭ 1,571 (+2518.33%)
edits.cr
Edit distance algorithms including Jaro, Damerau-Levenshtein, and Optimal Alignment
Stars: ✭ 16 (-73.33%)
spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (-15%)
Java String Similarity
Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Stars: ✭ 2,403 (+3905%)
Textdistance
Compute distance between sequences. 30+ algorithms, pure Python implementation, common interface, optional external libs usage.
Stars: ✭ 2,575 (+4191.67%)
Levenshtein
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
Stars: ✭ 38 (-36.67%)
Fastenshtein
The fastest .Net Levenshtein around
Stars: ✭ 115 (+91.67%)
Abydos
Abydos NLP/IR library for Python
Stars: ✭ 91 (+51.67%)
Symspellpy
Python port of SymSpell
Stars: ✭ 420 (+600%)
Closestmatch
Golang library for fuzzy matching within a set of strings 📃
Stars: ✭ 353 (+488.33%)
String Similarity
Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.
Stars: ✭ 2,254 (+3656.67%)
Fuzzball.js
Easy to use and powerful fuzzy string matching, port of fuzzywuzzy.
Stars: ✭ 225 (+275%)
strsim
String similarity based on Dice's coefficient in Go
Stars: ✭ 39 (-35%)
Quickenshtein
Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
Stars: ✭ 204 (+240%)
spellchecker-wasm
SpellcheckerWasm is an extremely fast spellchecker for WebAssembly based on SymSpell
Stars: ✭ 46 (-23.33%)
tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Stars: ✭ 92 (+53.33%)
ceja
PySpark phonetic and string matching algorithms
Stars: ✭ 24 (-60%)
LinSpell
Fast approximate string search & spelling correction
Stars: ✭ 52 (-13.33%)
simetric
String similarity metrics for Elixir
Stars: ✭ 59 (-1.67%)
Content-based-Recommender-System
A content-based recommender system that uses tf-idf and cosine similarity to find the N most similar items in a dataset
Stars: ✭ 64 (+6.67%)
affinegap
📐 A Cython implementation of the affine gap string distance
Stars: ✭ 57 (-5%)
Faint
Extensible TUI fuzzy file explorer
Stars: ✭ 82 (+36.67%)
Refinr
Cluster and merge similar char values: an R implementation of Open Refine clustering algorithms
Stars: ✭ 91 (+51.67%)
bns-short-text-similarity
📖 Use Bi-normal Separation to find document vectors, which are used to compute similarity for shorter sentences.
Stars: ✭ 24 (-60%)
Gitgot
Semi-automated, feedback-driven tool to rapidly search through troves of public data on GitHub for sensitive secrets.
Stars: ✭ 964 (+1506.67%)
Fuse Swift
A lightweight fuzzy-search library, with zero dependencies
Stars: ✭ 767 (+1178.33%)
Talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (+873.33%)
fuzzychinese
A small package to fuzzy match Chinese words
Stars: ✭ 50 (-16.67%)
lsh-rs
Locality Sensitive Hashing in Rust with Python bindings
Stars: ✭ 64 (+6.67%)
koolsla
Food recommendation tool with machine learning.
Stars: ✭ 21 (-65%)
Fuzzywuzzy
Java fuzzy string matching implementation of Python's well-known fuzzywuzzy algorithm. Fuzzy search for Java
Stars: ✭ 506 (+743.33%)
spaczz
Fuzzy matching and more functionality for spaCy.
Stars: ✭ 215 (+258.33%)
Persian Tools
An anthology of a variety of tools for the Persian language in JavaScript
Stars: ✭ 458 (+663.33%)
Liquidmetal
💦🤘 A mimetic poly-alloy of the Quicksilver scoring algorithm, essentially LiquidMetal. </Schwarzenegger Voice>
Stars: ✭ 279 (+365%)
Re Flex
The regex-centric, fast lexical analyzer generator for C++ with full Unicode support. Faster than Flex. Accepts Flex specifications. Generates reusable source code that is easy to understand. Introduces indent/dedent anchors, lazy quantifiers, functions for lex/syntax error reporting, and more. Seamlessly integrates with Bison and other parsers.
Stars: ✭ 274 (+356.67%)
SymSpellCppPy
Fast SymSpell written in C++ and exposed to Python via pybind11
Stars: ✭ 28 (-53.33%)
edit-distance-papers
A curated list of papers dedicated to edit-distance as objective function
Stars: ✭ 49 (-18.33%)
solr-vector-scoring
Vector Plugin for Solr: calculate dot product / cosine similarity on documents
Stars: ✭ 28 (-53.33%)
fuzzywuzzy
Fuzzy string matching for PHP
Stars: ✭ 60 (+0%)
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+201.67%)
fuzzy-search
A collection of algorithms for fuzzy search like in Sublime Text.
Stars: ✭ 49 (-18.33%)
fuzzy-matcher
Fuzzy Matching Library for Rust
Stars: ✭ 140 (+133.33%)
bolt.nvim
⚡ Ultrafast multi-pane file manager for Neovim with fuzzy matching
Stars: ✭ 100 (+66.67%)
Tntsearch
A fully featured full text search engine written in PHP
Stars: ✭ 2,693 (+4388.33%)
fish-fzy
fzy integration with fish. Search history, navigate directories and more. Blazingly fast.
Stars: ✭ 18 (-70%)
fuzzy-match
Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target.
Stars: ✭ 31 (-48.33%)
Pg similarity
Set of functions and operators for executing similarity queries
Stars: ✭ 250 (+316.67%)
seqalign pathing
Rust implementation of sequence alignment / Levenshtein distance by A* acceleration of the DP algorithm
Stars: ✭ 17 (-71.67%)