Lilykos / Pyphonetics
Licence: MIT
A Python 3 phonetics library.
Pyphonetics
Pyphonetics is a Python 3 library for phonetic algorithms. Right now, the following algorithms are implemented and supported:
- Soundex
- Metaphone
- Refined Soundex
- Fuzzy Soundex
- Lein
- Matching Rating Approach
In addition, the following distance metrics are supported:
- Hamming
- Levenshtein
More will be added in the future.
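As a rough illustration of what the two metrics measure, here is a minimal pure-Python sketch. The function names hamming and levenshtein are hypothetical and are not Pyphonetics' own code. The sketch also assumes the classical definition of Hamming distance, which rejects inputs of unequal length; the library may treat that case differently.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption of this sketch: unequal lengths are an error.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# Two identical phonetic codes are at distance 0 under either metric.
print(hamming("R163", "R163"), levenshtein("R163", "R163"))
```

Both functions compare plain strings; in the phonetic setting they would be applied to the codes produced by an algorithm rather than to the raw words.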
Installation
The module is available on PyPI. Install it with: pip install pyphonetics
Usage
>>> from pyphonetics import Soundex
>>> soundex = Soundex()
>>> soundex.phonetics('Rupert')
'R163'
>>> soundex.phonetics('Robert')
'R163'
>>> soundex.sounds_like('Robert', 'Rupert')
True
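To see why 'Rupert' and 'Robert' end up with the same code, the following is a minimal sketch of the textbook American Soundex rules. It is not the library's implementation: the names CODES and soundex_sketch are hypothetical, and edge cases (non-alphabetic input, for instance) may be handled differently by Pyphonetics.

```python
# Consonant groups of classic American Soundex. Vowels and Y, H, W get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(word: str) -> str:
    """Return a 4-character Soundex code (textbook rules, illustrative only)."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    encoded = [letters[0]]              # the first letter is kept as-is
    prev = CODES.get(letters[0], "")    # its digit still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not break a run of equal digits
        code = CODES.get(ch, "")        # vowels map to "" and reset the run
        if code and code != prev:
            encoded.append(code)
        prev = code
        if len(encoded) == 4:
            break
    return "".join(encoded).ljust(4, "0")  # pad short codes with zeros


print(soundex_sketch("Rupert"), soundex_sketch("Robert"))
```

Under these rules the vowels are dropped, P and B share the digit 1, and both names reduce to R, 1, 6, 3, which matches the 'R163' shown above.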
The same API applies to every algorithm, e.g.:
>>> from pyphonetics import Metaphone
>>> metaphone = Metaphone()
>>> metaphone.phonetics('discrimination')
'TSKRMNXN'
You can also use the distance(word1, word2, metric='levenshtein') method to compute the distance between the phonetic representations of two words.
>>> from pyphonetics import RefinedSoundex
>>> rs = RefinedSoundex()
>>> rs.distance('Rupert', 'Robert')
0
>>> rs.distance('assign', 'assist', metric='hamming')
2
Credits
The module was largely based on the implementation of phonetic algorithms found in the Talisman.js Node NLP library.