All Projects → dedupeio → affinegap

dedupeio / affinegap

Licence: MIT license
📐 A Cython implementation of the affine gap string distance

Programming Languages

cython
566 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to affinegap

Quickenshtein
Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
Stars: ✭ 204 (+257.89%)
Mutual labels:  levenshtein-distance, string-distance
Java String Similarity
Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Stars: ✭ 2,403 (+4115.79%)
Mutual labels:  levenshtein-distance, string-distance
seqalign
Collection of sequence alignment algorithms.
Stars: ✭ 20 (-64.91%)
Mutual labels:  string-distance
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+3366.67%)
Mutual labels:  levenshtein-distance
Levenshtein
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
Stars: ✭ 38 (-33.33%)
Mutual labels:  levenshtein-distance
stance
Learned string similarity for entity names using optimal transport.
Stars: ✭ 27 (-52.63%)
Mutual labels:  string-distance
FastFuzzyStringMatcherDotNet
A BK tree implementation for fast fuzzy string matching
Stars: ✭ 23 (-59.65%)
Mutual labels:  levenshtein-distance
strutil
Golang metrics for calculating string similarity and other string utility functions
Stars: ✭ 114 (+100%)
Mutual labels:  string-distance
levenshtein finder
Similar string search in Levenshtein distance
Stars: ✭ 19 (-66.67%)
Mutual labels:  string-distance
LinSpell
Fast approximate strings search & spelling correction
Stars: ✭ 52 (-8.77%)
Mutual labels:  levenshtein-distance
sentences-similarity-cluster
Calculate similarity of sentences & Cluster the result.
Stars: ✭ 14 (-75.44%)
Mutual labels:  levenshtein-distance
spellchecker-wasm
SpellcheckerWasm is an extrememly fast spellchecker for WebAssembly based on SymSpell
Stars: ✭ 46 (-19.3%)
Mutual labels:  levenshtein-distance
seqalign pathing
Rust implementation of sequence alignment / Levenshtein distance by A* acceleration of the DP algorithm
Stars: ✭ 17 (-70.18%)
Mutual labels:  levenshtein-distance
customized-symspell
Java port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm
Stars: ✭ 51 (-10.53%)
Mutual labels:  levenshtein-distance
beda
Beda is a golang library for detecting how similar a two string
Stars: ✭ 34 (-40.35%)
Mutual labels:  string-distance
Textdistance
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
Stars: ✭ 2,575 (+4417.54%)
Mutual labels:  levenshtein-distance
stringosim
String similarity functions, String distance's, Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS - Longest Common Subsequence, Cosine similarity...
Stars: ✭ 47 (-17.54%)
Mutual labels:  string-distance
polyleven
Fast Levenshtein Distance Library for Python 3
Stars: ✭ 37 (-35.09%)
Mutual labels:  levenshtein-distance
levenshtein.c
Levenshtein algorithm in C
Stars: ✭ 77 (+35.09%)
Mutual labels:  levenshtein-distance
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+229.82%)
Mutual labels:  levenshtein-distance

affinegap

A Cython implementation of the affine gap penalty string distance also known as the Smith–Waterman algorithm

Part of the Dedupe.io cloud service and open source toolset for de-duplicating and finding fuzzy matches in your data.

Build Status

To install

pip install affinegap

To use

import affinegap
d1 = affinegap.affineGapDistance('foo', 'bar')
d2 = affinegap.affineGapDistance('foo', 'bar',
                                 matchWeight = 1,
                                 mismatchWeight = 11,
                                 gapWeight = 10,
                                 spaceWeight = 7,
                                 abbreviation_scale = .125)
d3 = affinegap.normalizedAffineGapDistance('foo', 'bar')

To get set up for development

git clone https://github.com/dedupeio/affinegap.git
cd affinegap
pip install -r requirements.txt
cython affinegap/*.pyx
python setup.py develop
pytest

Team

  • Forest Gregg, Dedupeio

Errors and Bugs

If something is not behaving intuitively, it is a bug and should be reported. Report it here by creating an issue: https://github.com/dedupeio/affinegap/issues

Help us fix the problem as quickly as possible by following Mozilla's guidelines for reporting bugs.

Patches and Pull Requests

Your patches are welcome. Here's our suggested workflow:

  • Fork the project.
  • Make your feature addition or bug fix.
  • Send us a pull request with a description of your work. Bonus points for topic branches!

Copyright and Attribution

Copyright (c) 2016 Forest Gregg and Dedupeio. Released under the MIT License.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].