Licence: gpl-3.0
Abydos NLP/IR library for Python


Abydos

+------------------+------------------------------------------------------+
| CI & Test Status | |travis| |circle| |azure| |semaphore| |coveralls|    |
+------------------+------------------------------------------------------+
| Code Quality     | |codeclimate| |scrutinizer| |codacy| |codefactor|    |
+------------------+------------------------------------------------------+
| Dependencies     | |requires| |snyk| |pyup| |cii| |black|               |
+------------------+------------------------------------------------------+
| Local Analysis   | |pylint| |flake8| |pydocstyle| |sloccount| |mypy|    |
+------------------+------------------------------------------------------+
| Usage            | |docs| |mybinder| |license| |sourcerank| |zenodo|    |
+------------------+------------------------------------------------------+
| Contribution     | |openhub| |gh-commits| |gh-issues| |gh-stars|        |
+------------------+------------------------------------------------------+
| PyPI             | |pypi| |pypi-dl| |pypi-ver|                          |
+------------------+------------------------------------------------------+
| conda-forge      | |conda| |conda-dl| |conda-platforms|                 |
+------------------+------------------------------------------------------+

.. |travis| image:: https://travis-ci.org/chrislit/abydos.svg?branch=master
    :target: https://travis-ci.org/chrislit/abydos
    :alt: Travis-CI Build Status

.. |circle| image:: https://circleci.com/gh/chrislit/abydos/tree/master.svg?style=shield
    :target: https://circleci.com/gh/chrislit/abydos/tree/master
    :alt: Circle-CI Build Status

.. |azure| image:: https://dev.azure.com/chrislit/abydos/_apis/build/status/chrislit.abydos?branchName=master
    :target: https://dev.azure.com/chrislit/abydos/_build/latest?definitionId=1
    :alt: Azure Pipelines Build Status

.. |semaphore| image:: https://semaphoreci.com/api/v1/chrislit/abydos/branches/master/shields_badge.svg
    :target: https://semaphoreci.com/chrislit/abydos
    :alt: Semaphore Build Status

.. |coveralls| image:: https://coveralls.io/repos/github/chrislit/abydos/badge.svg?branch=master
    :target: https://coveralls.io/github/chrislit/abydos?branch=master
    :alt: Coverage Status

.. |codeclimate| image:: https://codeclimate.com/github/chrislit/abydos/badges/gpa.svg
    :target: https://codeclimate.com/github/chrislit/abydos
    :alt: Code Climate

.. |scrutinizer| image:: https://scrutinizer-ci.com/g/chrislit/abydos/badges/quality-score.png?b=master
    :target: https://scrutinizer-ci.com/g/chrislit/abydos/?branch=master
    :alt: Scrutinizer

.. |codacy| image:: https://api.codacy.com/project/badge/Grade/db79f2c31ea142fb9b5938abe87b0854
    :target: https://www.codacy.com/app/chrislit/abydos?utm_source=github.com&utm_medium=referral&utm_content=chrislit/abydos&utm_campaign=Badge_Grade
    :alt: Codacy

.. |codefactor| image:: https://www.codefactor.io/repository/github/chrislit/abydos/badge
    :target: https://www.codefactor.io/repository/github/chrislit/abydos
    :alt: CodeFactor

.. |requires| image:: https://requires.io/github/chrislit/abydos/requirements.svg?branch=master
    :target: https://requires.io/github/chrislit/abydos/requirements/?branch=master
    :alt: Requirements Status

.. |snyk| image:: https://snyk.io/test/github/chrislit/abydos/badge.svg?targetFile=requirements.txt
    :target: https://snyk.io/test/github/chrislit/abydos?targetFile=requirements.txt
    :alt: Known Vulnerabilities

.. |pyup| image:: https://pyup.io/repos/github/chrislit/abydos/shield.svg
    :target: https://pyup.io/repos/github/chrislit/abydos/
    :alt: Updates

.. |cii| image:: https://bestpractices.coreinfrastructure.org/projects/1598/badge
    :target: https://bestpractices.coreinfrastructure.org/projects/1598
    :alt: CII Best Practices

.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/ambv/black
    :alt: black

.. |pylint| image:: https://img.shields.io/badge/Pylint-9.13/10-yellowgreen.svg
    :target: #
    :alt: Pylint Score

.. |flake8| image:: https://img.shields.io/badge/flake8-0-brightgreen.svg
    :target: #
    :alt: flake8 Errors

.. |pydocstyle| image:: https://img.shields.io/badge/pydocstyle-0-brightgreen.svg
    :target: #
    :alt: pydocstyle Errors

.. |sloccount| image:: https://img.shields.io/badge/SLOCCount-40,079-blue.svg
    :target: #
    :alt: SLOCCount

.. |mypy| image:: https://img.shields.io/badge/mypy-1.87%25%20imprecise-1F5082.svg
    :target: #
    :alt: mypy Imprecision

.. |docs| image:: https://readthedocs.org/projects/abydos/badge/?version=latest
    :target: https://abydos.readthedocs.org/en/latest/
    :alt: Documentation Status

.. |mybinder| image:: https://img.shields.io/badge/launch-binder-579aca.svg?logo=
    :target: https://mybinder.org/v2/gh/chrislit/abydos/master?filepath=binder
    :alt: Binder

.. |license| image:: https://img.shields.io/badge/License-GPL%20v3+-blue.svg?logo=gnu
    :target: https://www.gnu.org/licenses/gpl-3.0
    :alt: License: GPL v3.0+

.. |sourcerank| image:: https://img.shields.io/librariesio/sourcerank/pypi/abydos.svg
    :target: https://libraries.io/pypi/abydos
    :alt: Libraries.io SourceRank

.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.3603514.svg
    :target: https://doi.org/10.5281/zenodo.3603514
    :alt: Zenodo

.. |openhub| image:: https://www.openhub.net/p/abydosnlp/widgets/project_thin_badge.gif
    :target: https://www.openhub.net/p/abydosnlp
    :alt: OpenHUB

.. |gh-commits| image:: https://img.shields.io/github/commit-activity/y/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/graphs/commit-activity
    :alt: GitHub Commits

.. |gh-issues| image:: https://img.shields.io/github/issues-closed/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/issues?q=
    :alt: GitHub Issues Closed

.. |gh-stars| image:: https://img.shields.io/github/stars/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/stargazers
    :alt: GitHub Stars

.. |pypi| image:: https://img.shields.io/pypi/v/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI

.. |pypi-dl| image:: https://img.shields.io/pypi/dm/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI downloads/month

.. |pypi-ver| image:: https://img.shields.io/pypi/pyversions/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI versions

.. |conda| image:: https://img.shields.io/conda/vn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge

.. |conda-dl| image:: https://img.shields.io/conda/dn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge downloads

.. |conda-platforms| image:: https://img.shields.io/conda/pn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge platforms

|

.. image:: https://raw.githubusercontent.com/chrislit/abydos/master/abydos-small.png
    :target: https://github.com/chrislit/abydos
    :alt: abydos
    :align: right

|
| `Abydos NLP/IR library <https://github.com/chrislit/abydos>`_
| Copyright 2014-2020 by Christopher C. Little

Abydos is a library of phonetic algorithms, string distance measures & metrics, stemmers, and string fingerprinters including:

  • Phonetic algorithms

    • Robert C. Russell's Index
    • American Soundex
    • Refined Soundex
    • Daitch-Mokotoff Soundex
    • Kölner Phonetik
    • NYSIIS
    • Match Rating Algorithm
    • Metaphone
    • Double Metaphone
    • Caverphone
    • Alpha Search Inquiry System
    • Fuzzy Soundex
    • Phonex
    • Phonem
    • Phonix
    • SfinxBis
    • phonet
    • Standardized Phonetic Frequency Code
    • Statistics Canada
    • Lein
    • Roger Root
    • Oxford Name Compression Algorithm (ONCA)
    • Eudex phonetic hash
    • Haase Phonetik
    • Reth-Schek Phonetik
    • FONEM
    • Parmar-Kumbharana
    • Davidson's Consonant Code
    • SoundD
    • PSHP Soundex/Viewex Coding
    • an early version of Henry Code
    • Norphone
    • Dolby Code
    • Phonetic Spanish
    • Spanish Metaphone
    • MetaSoundex
    • SoundexBR
    • NRL English-to-phoneme
    • Beider-Morse Phonetic Matching
  • String distance metrics

    • Levenshtein distance
    • Optimal String Alignment distance
    • Levenshtein-Damerau distance
    • Hamming distance
    • Tversky index
    • Sørensen–Dice coefficient & distance
    • Jaccard similarity coefficient & distance
    • overlap similarity & distance
    • Tanimoto coefficient & distance
    • Minkowski distance & similarity
    • Manhattan distance & similarity
    • Euclidean distance & similarity
    • Chebyshev distance
    • cosine similarity & distance
    • Jaro distance
    • Jaro-Winkler distance (incl. the strcmp95 algorithm variant)
    • Longest common substring
    • Ratcliff-Obershelp similarity & distance
    • Match Rating Algorithm similarity
    • Normalized Compression Distance (NCD) & similarity
    • Monge-Elkan similarity & distance
    • Matrix similarity
    • Needleman-Wunsch score
    • Smith-Waterman score
    • Gotoh score
    • Length similarity
    • Prefix, Suffix, and Identity similarity & distance
    • Modified Language-Independent Product Name Search (MLIPNS) similarity & distance
    • Bag distance
    • Editex distance
    • Eudex distances
    • Sift4 distance
    • Baystat distance & similarity
    • Typo distance
    • Indel distance
    • Synoname
  • Stemmers

    • the Lovins stemmer
    • the Porter and Porter2 (Snowball English) stemmers
    • Snowball stemmers for German, Dutch, Norwegian, Swedish, and Danish
    • CLEF German, German plus, and Swedish stemmers
    • Caumanns' German stemmer
    • UEA-Lite Stemmer
    • Paice-Husk Stemmer
    • Schinke Latin stemmer
    • S stemmer
  • String Fingerprints

    • string fingerprint
    • q-gram fingerprint
    • phonetic fingerprint
    • Pollock & Zamora's skeleton key
    • Pollock & Zamora's omission key
    • Cisłak & Grabowski's occurrence fingerprint
    • Cisłak & Grabowski's occurrence halved fingerprint
    • Cisłak & Grabowski's count fingerprint
    • Cisłak & Grabowski's position fingerprint
    • Synoname Toolcode
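To give a feel for two of the categories listed above, here is a minimal, dependency-free sketch of a Levenshtein distance and a simple string fingerprint. It does not use Abydos and is not Abydos's implementation: the function names ``levenshtein`` and ``string_fingerprint``, the unit edit costs, and the ASCII-only punctuation handling are all assumptions made for illustration.

```python
import string


def levenshtein(src: str, tar: str) -> int:
    """Return the Levenshtein edit distance between two strings.

    Assumption of this sketch: insertion, deletion, and substitution
    each cost 1.
    """
    # previous[j] is the distance between the processed prefix of src
    # and the first j characters of tar.
    previous = list(range(len(tar) + 1))
    for i, src_char in enumerate(src, start=1):
        current = [i]
        for j, tar_char in enumerate(tar, start=1):
            cost = 0 if src_char == tar_char else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def string_fingerprint(phrase: str) -> str:
    """Return a simple order-insensitive key for a phrase (hypothetical helper)."""
    # Lowercase, drop ASCII punctuation, then sort the unique tokens.
    cleaned = phrase.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(string_fingerprint("The quick brown fox, the QUICK one."))
    # brown fox one quick the
```

Two phrases that differ only in case, punctuation, word order, or repeated words map to the same fingerprint, which is what makes such keys useful for clustering near-duplicate strings.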

Installation

Required libraries:

  • NumPy
  • deprecation

Optional libraries (all available on PyPI, some available on conda or conda-forge):

  • `SyllabiPy <http://syllabipy.com/>`_
  • `NLTK <https://www.nltk.org/>`_
  • `PyLZSS <https://github.com/rumbah/pylzss>`_
  • `paq <https://github.com/observerss/paq>`_

To install Abydos (master) from GitHub source::

    git clone https://github.com/chrislit/abydos.git --recursive
    cd abydos
    python setup.py install

If your default python command calls Python 2.7 but you want to install for Python 3, you may instead need to call::

    python3 setup.py install

To install Abydos (latest release) from PyPI using pip::

    pip install abydos

To install from `conda-forge <https://anaconda.org/conda-forge/abydos>`_::

    conda install -c conda-forge abydos

It should run on Python 3.5-3.8.
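
After installing, a quick sanity check can confirm that the interpreter falls in the documented range and that the package can be found. The sketch below uses only the standard library; the helper names ``python_in_documented_range`` and ``abydos_is_installed`` are hypothetical, and the version bounds are simply copied from the sentence above rather than read from the package metadata.

```python
import importlib.util
import sys

# Bounds copied from the README text (an assumption, not package metadata).
MIN_VERSION = (3, 5)
MAX_VERSION = (3, 8)


def python_in_documented_range(version=None):
    """Return True if (major, minor) lies within 3.5-3.8 inclusive."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


def abydos_is_installed():
    """Return True if a top-level package named 'abydos' can be located."""
    return importlib.util.find_spec("abydos") is not None


if __name__ == "__main__":
    print("Python version in documented range:", python_in_documented_range())
    print("abydos importable:", abydos_is_installed())
```

A result of ``False`` from the version check does not necessarily mean the library will fail; it only means the interpreter is outside the range this README names.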

Testing & Contributing

To run the whole test suite, just call tox::

    tox

The tox setup has the following environments: black, py37, doctest, regression, fuzz, pylint, pydocstyle, flake8, doc8, docs, sloccount, badges, & build. So if you only want to generate documentation (in HTML, EPUB, & PDF formats), just call::

    tox -e docs

To run only Flake8 and generate its report, call::

    tox -e flake8

Contributions such as bug reports, PRs, suggestions, desired new features, etc. are welcome through `GitHub Issues <https://github.com/chrislit/abydos/issues>`_ & `Pull requests <https://github.com/chrislit/abydos/pulls>`_.
