Lilykos / Pyphonetics

Licence: MIT
A Python 3 phonetics library.

Pyphonetics

Pyphonetics is a Python 3 library for phonetic algorithms. It currently implements the following algorithms:

  • Soundex
  • Metaphone
  • Refined Soundex
  • Fuzzy Soundex
  • Lein
  • Matching Rating Approach

In addition, the following distance metrics are supported:

  • Hamming
  • Levenshtein

More will be added in the future.

Installation

The module is available on PyPI; install it with pip install pyphonetics.

Usage

>>> from pyphonetics import Soundex
>>> soundex = Soundex()
>>> soundex.phonetics('Rupert')
'R163'
>>> soundex.phonetics('Robert')
'R163'
>>> soundex.sounds_like('Robert', 'Rupert')
True
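
To see why 'Rupert' and 'Robert' end up with the same code, here is a minimal standalone sketch of the classic American Soundex rules. It is not the code pyphonetics ships; the name simple_soundex and the _CODES table are made up for this illustration, and edge cases may be handled differently by the library.

```python
# Standalone sketch of classic American Soundex; NOT pyphonetics' own code.
# simple_soundex and _CODES are hypothetical names used only for illustration.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        _CODES[letter] = digit


def simple_soundex(word):
    """Return a four-character Soundex-style code for an ASCII word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    code = first                       # the first letter is kept as-is
    previous = _CODES.get(first, "")   # its digit still suppresses a repeat
    for letter in letters[1:]:
        digit = _CODES.get(letter, "")  # vowels, H, W and Y have no digit
        if digit and digit != previous:
            code += digit
        # H and W do not separate letters with the same digit; vowels do.
        if letter not in "HW":
            previous = digit
    return (code + "000")[:4]          # pad with zeros, truncate to 4 chars


print(simple_soundex("Rupert"))  # R163
print(simple_soundex("Robert"))  # R163
```

Both names reduce to R, then 1 (P or B), 6 (R) and 3 (T), which is why sounds_like reports a match.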

The same API applies to every algorithm, e.g.:

>>> from pyphonetics import Metaphone
>>> metaphone = Metaphone()
>>> metaphone.phonetics('discrimination')
'TSKRMNXN'

You can also use the distance(word1, word2, metric='levenshtein') method to find the distance between two phonetic representations.

>>> from pyphonetics import RefinedSoundex
>>> rs = RefinedSoundex()
>>> rs.distance('Rupert', 'Robert')
0
>>> rs.distance('assign', 'assist', metric='hamming')
2
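
To make the two metrics concrete, the sketch below computes Hamming and Levenshtein distances on plain strings, such as two phonetic codes. The functions hamming and levenshtein are written for this illustration and are not taken from pyphonetics. In particular, how the library treats strings of unequal length under Hamming is not stated here, so counting each surplus character as a mismatch is an assumption of this sketch.

```python
# Standalone sketch of the two metrics; NOT pyphonetics' own implementation.

def hamming(a, b):
    """Count the positions at which a and b differ.

    Assumption of this sketch: when the lengths differ, each surplus
    character counts as one additional mismatch.
    """
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(hamming("karolin", "kathrin"))     # 3
print(levenshtein("kitten", "sitting"))  # 3
```

Identical codes give a distance of 0 under either metric, which matches the rs.distance('Rupert', 'Robert') result above.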

Credits

The module was largely based on the implementation of phonetic algorithms found in the Talisman.js Node NLP library.
