obulat / zeyrek

Licence: MIT license
A Python morphological analyzer for the Turkish language. A partial port of ZemberekNLP.

Projects that are alternatives to or similar to zeyrek

simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-11.11%)
Mutual labels:  morphological-analysis, lemmatization
udar
UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-58.33%)
Mutual labels:  morphological-analysis, lemmatization
libmorph
libmorph rus/ukr - fast & accurate morphological analyzer/analyses for Russian and Ukrainian
Stars: ✭ 16 (-55.56%)
Mutual labels:  morphological-analysis, lemmatization
mlmorph
Malayalam Morphological Analyzer using Finite State Transducer
Stars: ✭ 40 (+11.11%)
Mutual labels:  morphology, morphological-analysis
syntaxdot
Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.
Stars: ✭ 32 (-11.11%)
Mutual labels:  morphology, lemmatization
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+88.89%)
Mutual labels:  morphological-analysis, lemmatization
retinal-exudates-detection
exudates detection using hybrid approach (Image Morphology & Machine Learning)
Stars: ✭ 53 (+47.22%)
Mutual labels:  morphology, morphological-analysis
HebPipe
An NLP pipeline for Hebrew
Stars: ✭ 15 (-58.33%)
Mutual labels:  morphological-analysis, lemmatization
aot
Russian morphology for Java
Stars: ✭ 41 (+13.89%)
Mutual labels:  morphology, morphological-analysis
Turkish-Lemmatizer
Lemmatization for Turkish Language
Stars: ✭ 72 (+100%)
Mutual labels:  turkish, lemmatization
Neural-Morphological-Disambiguation-for-Turkish-DEPRECATED
Neural morphological disambiguation for Turkish. Implemented in DyNet
Stars: ✭ 11 (-69.44%)
Mutual labels:  turkish, morphological-analysis
treestoolbox
TREES toolbox
Stars: ✭ 20 (-44.44%)
Mutual labels:  morphology, morphological-analysis
lemma
A Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (+25%)
Mutual labels:  morphology, lemmatization
alyahmor
Arabic inflectional morphology generator
Stars: ✭ 22 (-38.89%)
Mutual labels:  morphology
Basic-Image-Processing
Implementation of Basic Digital Image Processing Tasks in Python / OpenCV
Stars: ✭ 102 (+183.33%)
Mutual labels:  morphology
OpenHebrewBible
Open Hebrew Bible Project; aligning BHS with WLC; bridging ETCBC, OpenScriptures & Berean data on Hebrew Bible
Stars: ✭ 43 (+19.44%)
Mutual labels:  morphology
MorphIO
A python and C++ library for reading and writing neuronal morphologies
Stars: ✭ 25 (-30.56%)
Mutual labels:  morphology
Morphos-Blade
Morphos adapter for Blade
Stars: ✭ 32 (-11.11%)
Mutual labels:  morphology
frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+94.44%)
Mutual labels:  morphology
awesome-cytodata
A curated list of awesome cytodata resources
Stars: ✭ 40 (+11.11%)
Mutual labels:  morphology

Zeyrek: Morphological Analyzer and Lemmatizer

Zeyrek is a partial Python port of the Zemberek library for lemmatizing and morphologically analyzing Turkish words. It is in the alpha stage, and the API will probably change.

Basic Usage

To use Zeyrek, first create an instance of the MorphAnalyzer class:

>>> import zeyrek
>>> analyzer = zeyrek.MorphAnalyzer()

Then, you can call its analyze method on words or texts to get all possible analyses:

>>> print(analyzer.analyze('benim'))
Parse(word='benim', lemma='ben', pos='Noun', morphemes=['Noun', 'A3sg', 'P1sg'], formatted='[ben:Noun] ben:Noun+A3sg+im:P1sg')
Parse(word='benim', lemma='ben', pos='Pron', morphemes=['Pron', 'A1sg', 'Gen'], formatted='[ben:Pron,Pers] ben:Pron+A1sg+im:Gen')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Pron', 'A1sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg')
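Because one word can have several analyses, it is often convenient to group them by part of speech. The following is a minimal sketch that does not call Zeyrek itself: it assumes the results expose the fields shown in the output above (word, lemma, pos, morphemes, formatted) and uses a hypothetical stand-in Parse tuple and a hypothetical helper, group_by_pos, with the sample data copied from that output.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in that mirrors the fields printed above;
# it is not imported from zeyrek.
Parse = namedtuple("Parse", ["word", "lemma", "pos", "morphemes", "formatted"])

# Sample analyses of 'benim', copied from the output above.
parses = [
    Parse("benim", "ben", "Noun", ["Noun", "A3sg", "P1sg"],
          "[ben:Noun] ben:Noun+A3sg+im:P1sg"),
    Parse("benim", "ben", "Pron", ["Pron", "A1sg", "Gen"],
          "[ben:Pron,Pers] ben:Pron+A1sg+im:Gen"),
    Parse("benim", "ben", "Verb", ["Noun", "A3sg", "Zero", "Verb", "Pres", "A1sg"],
          "[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg"),
    Parse("benim", "ben", "Verb", ["Pron", "A1sg", "Zero", "Verb", "Pres", "A1sg"],
          "[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg"),
]


def group_by_pos(parses):
    """Map each part-of-speech tag to the morpheme lists of its analyses."""
    grouped = defaultdict(list)
    for parse in parses:
        grouped[parse.pos].append(parse.morphemes)
    return dict(grouped)


by_pos = group_by_pos(parses)
for pos in sorted(by_pos):
    print(pos, len(by_pos[pos]))
```

With real analyzer output, the same helper should work as long as each result has pos and morphemes attributes.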

If you only need the base forms of words (their lemmas), call lemmatize instead. It returns a list of tuples, each pairing a word with a list of its possible lemmas:

>>> print(analyzer.lemmatize('benim'))
[('benim', ['ben'])]
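A list of (word, lemmas) tuples is easy to turn into a lookup table. The sketch below assumes only the return shape shown above. The helper name lemma_lookup is hypothetical, and the sample list is copied from the output rather than produced by calling Zeyrek.

```python
def lemma_lookup(lemmatized):
    """Build a word -> lemmas dict from a list of (word, lemmas) tuples.

    Lemmas for a repeated word are merged, preserving first-seen order.
    """
    lookup = {}
    for word, lemmas in lemmatized:
        known = lookup.setdefault(word, [])
        for lemma in lemmas:
            if lemma not in known:
                known.append(lemma)
    return lookup


# Sample data in the shape shown above (not produced by calling zeyrek here).
sample = [("benim", ["ben"])]
lookup = lemma_lookup(sample)
print(lookup)
```

A dict keeps later lookups simple, and merging repeated words avoids duplicate lemmas when the same word occurs more than once in a text.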

Credits

This package is a Python port of part of the Zemberek package by Ahmet A. Akın.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
