
natasha / Yargy

License: MIT
Rule-based fact extraction for the Russian language

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Yargy

Invoicenet
Deep neural network to extract intelligent information from invoice documents.
Stars: ✭ 1,886 (+773.15%)
Mutual labels:  information-extraction
Openpapyrus
Sophisticated ERP, CRM, Point-Of-Sale, etc. Now open source. The system has been in development since 1996.
Stars: ✭ 158 (-26.85%)
Mutual labels:  russian
Event Registry Python
Python package for API access to news articles and events in the Event Registry
Stars: ✭ 179 (-17.13%)
Mutual labels:  information-extraction
Razdel
Rule-based token and sentence segmentation for the Russian language
Stars: ✭ 144 (-33.33%)
Mutual labels:  russian
Chemdataextractor
Automatically extract chemical information from scientific documents
Stars: ✭ 152 (-29.63%)
Mutual labels:  information-extraction
Sypht Python Client
A Python client for the Sypht API
Stars: ✭ 160 (-25.93%)
Mutual labels:  information-extraction
The Road To Learn React Russian
The Road to Learn React - Russian translation
Stars: ✭ 128 (-40.74%)
Mutual labels:  russian
Multi Tacotron Voice Cloning
Phoneme multilingual(Russian-English) voice cloning based on
Stars: ✭ 192 (-11.11%)
Mutual labels:  russian
Corus
Links to Russian corpora + Python functions for loading and parsing
Stars: ✭ 154 (-28.7%)
Mutual labels:  russian
Nel
Entity linking framework
Stars: ✭ 176 (-18.52%)
Mutual labels:  information-extraction
Nl2sql
Top 6 in the first Alibaba Tianchi Chinese NL2SQL Challenge
Stars: ✭ 146 (-32.41%)
Mutual labels:  information-extraction
Open Ie Papers
Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.
Stars: ✭ 150 (-30.56%)
Mutual labels:  information-extraction
Russian Words
List of Russian words
Stars: ✭ 168 (-22.22%)
Mutual labels:  russian
Triggerner
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Stars: ✭ 141 (-34.72%)
Mutual labels:  information-extraction
Rust book ru
The Rust Programming Language in Russian
Stars: ✭ 188 (-12.96%)
Mutual labels:  russian
Snowball
Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)
Stars: ✭ 131 (-39.35%)
Mutual labels:  information-extraction
Pzad
Course "Applied Problems of Data Analysis" (CMC faculty, Lomonosov Moscow State University)
Stars: ✭ 160 (-25.93%)
Mutual labels:  russian
Ru.javascript.info
The Modern JavaScript Tutorial
Stars: ✭ 2,648 (+1125.93%)
Mutual labels:  russian
Ail Framework
AIL framework - Analysis Information Leak framework
Stars: ✭ 191 (-11.57%)
Mutual labels:  information-extraction
Interpy Ru
Intermediate Python book Russian translation
Stars: ✭ 175 (-18.98%)
Mutual labels:  russian


Yargy is an Earley parser similar to the Tomita parser. It uses rules and dictionaries to extract structured information from Russian texts.
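
For readers unfamiliar with the term, here is a rough, self-contained sketch of what an Earley recognizer does, in plain Python. It is not Yargy's implementation: the function name `earley_recognize`, the dict-based grammar format, and the toy grammar are all assumptions made for illustration. Terminals here are literal strings, whereas Yargy rules match tokens with morphological predicates and dictionaries.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides
    (tuples of symbols). Any symbol that is not a key of `grammar`
    is treated as a terminal and compared to the token literally.
    Simplification for this sketch: empty right-hand sides are
    not supported.
    """
    # chart[i] holds items (head, body, dot, origin) valid after i tokens.
    chart = [set() for _ in range(len(tokens) + 1)]
    for body in grammar[start]:
        chart[0].add((start, body, 0, 0))

    for i in range(len(tokens) + 1):
        queue = list(chart[i])
        while queue:
            head, body, dot, origin = queue.pop()
            if dot < len(body):
                symbol = body[dot]
                if symbol in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for alt in grammar[symbol]:
                        item = (symbol, alt, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            queue.append(item)
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scan: the terminal matches the next token.
                    chart[i + 1].add((head, body, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for `head`.
                for p_head, p_body, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_body) and p_body[p_dot] == head:
                        item = (p_head, p_body, p_dot + 1, p_origin)
                        if item not in chart[i]:
                            chart[i].add(item)
                            queue.append(item)

    return any(
        head == start and dot == len(body) and origin == 0
        for head, body, dot, origin in chart[len(tokens)]
    )


# Hypothetical toy grammar with literal word forms only.
GRAMMAR = {
    'PERSON': [('POSITION', 'NAME')],
    'POSITION': [('управляющий', 'директор'), ('вице-мэр',)],
    'NAME': [('FIRST', 'LAST')],
    'FIRST': [('Иван',)],
    'LAST': [('Ульянов',)],
}

tokens = 'управляющий директор Иван Ульянов'.split()
print(earley_recognize(GRAMMAR, 'PERSON', tokens))  # True
```

The sketch only answers whether the input matches. A parser such as Yargy additionally builds a tree from the completed items and interprets it into structured facts.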

Install

Yargy supports Python 3.5+ and PyPy 3, and depends only on Pymorphy2.

$ pip install yargy

Usage

from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline


# Fact schemas: the structured records the parser will produce
Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

# A surname ('Surn' grammeme) that is not an abbreviation
LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
# A first name ('Name' grammeme) that is not an abbreviation
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

# Dictionary of job titles, matched in any inflected form
POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

# Require agreement in gender, number and case between matched parts
gnc = gnc_relation()
NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

parser = Parser(PERSON)

match = parser.match('управляющий директор Иван Ульянов')
# The interpreted Person record is available as match.fact
print(match.fact)

Person(
    position='управляющий директор',
    name=Name(
        first='Иван',
        last='Ульянов'
    )
)

Documentation

All materials are in Russian:

Support

Development

Test:

make test

Package:

make version
git push
git push --tags

make clean wheel upload