obulat / zeyrek

Licence: MIT license

A Python morphological analyzer for the Turkish language. A partial port of ZemberekNLP.

Labels

nlp morphology turkish morphological-analysis lemmatization

Zeyrek: Morphological Analyzer and Lemmatizer

Zeyrek is a partial Python port of the Zemberek library for lemmatizing and analyzing Turkish words. It is in the alpha stage, and the API is likely to change.

Free software: MIT license
Documentation: https://zeyrek.readthedocs.io.

Basic Usage

To use Zeyrek, first create an instance of the MorphAnalyzer class:

>>> import zeyrek
>>> analyzer = zeyrek.MorphAnalyzer()

Then, you can call its analyze method on words or texts to get all possible analyses:

>>> print(analyzer.analyze('benim'))
Parse(word='benim', lemma='ben', pos='Noun', morphemes=['Noun', 'A3sg', 'P1sg'], formatted='[ben:Noun] ben:Noun+A3sg+im:P1sg')
Parse(word='benim', lemma='ben', pos='Pron', morphemes=['Pron', 'A1sg', 'Gen'], formatted='[ben:Pron,Pers] ben:Pron+A1sg+im:Gen')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Pron', 'A1sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg')
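Each printed analysis carries the fields word, lemma, pos, morphemes and formatted, so the results can be summarized with ordinary Python. The sketch below groups lemmas by part of speech. It does not import zeyrek: the `Parse` namedtuple is a stand-in that only mirrors the fields shown above, `lemmas_by_pos` is a hypothetical helper, and the exact container type returned by analyze is not assumed.

```python
from collections import defaultdict, namedtuple

# Stand-in record mirroring the fields in the printed output above.
# This is an assumption for illustration, not zeyrek's own class.
Parse = namedtuple("Parse", ["word", "lemma", "pos", "morphemes", "formatted"])


def lemmas_by_pos(parses):
    """Map each part-of-speech tag to the sorted lemmas seen with it."""
    grouped = defaultdict(set)
    for parse in parses:
        grouped[parse.pos].add(parse.lemma)
    return {pos: sorted(lemmas) for pos, lemmas in grouped.items()}


# Sample data copied from the first three analyses of 'benim' shown above.
sample = [
    Parse("benim", "ben", "Noun", ["Noun", "A3sg", "P1sg"],
          "[ben:Noun] ben:Noun+A3sg+im:P1sg"),
    Parse("benim", "ben", "Pron", ["Pron", "A1sg", "Gen"],
          "[ben:Pron,Pers] ben:Pron+A1sg+im:Gen"),
    Parse("benim", "ben", "Verb",
          ["Noun", "A3sg", "Zero", "Verb", "Pres", "A1sg"],
          "[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg"),
]

print(lemmas_by_pos(sample))
```

With real analyzer output, the same helper would apply once the results are flattened into a single sequence of parse records.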

If you only need the base forms of words (their lemmas), you can call lemmatize. It returns a list of tuples, each pairing the word itself with a list of its possible lemmas:

>>> print(analyzer.lemmatize('benim'))
[('benim', ['ben'])]
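Because the result is a plain list of (word, lemmas) tuples, it can be post-processed without any library support. As a minimal sketch, the hypothetical helper `unique_lemmas` below collects every distinct lemma in first-seen order. The first input is the result shown above; the second is made-up data to illustrate deduplication.

```python
def unique_lemmas(pairs):
    """Collect distinct lemmas from (word, lemmas) tuples in first-seen order."""
    seen = []
    for _word, lemmas in pairs:
        for lemma in lemmas:
            if lemma not in seen:
                seen.append(lemma)
    return seen


# Result copied from the lemmatize call shown above.
result = [("benim", ["ben"])]
print(unique_lemmas(result))

# Made-up input (not analyzer output) showing that repeats are dropped.
made_up = [("w1", ["a", "b"]), ("w2", ["b", "c"])]
print(unique_lemmas(made_up))
```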

Credits

This package is a Python port of part of the Zemberek package by Ahmet A. Akın.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
