natasha / Yargy
Licence: MIT
Rule-based fact extraction for the Russian language
Stars: ✭ 216
Projects that are alternatives of or similar to Yargy
Invoicenet
Deep neural network to extract intelligent information from invoice documents.
Stars: ✭ 1,886 (+773.15%)
Mutual labels: information-extraction
Openpapyrus
Sophisticated ERP, CRM, Point-Of-Sale, etc. Now open source. This system has been in development since 1996.
Stars: ✭ 158 (-26.85%)
Mutual labels: russian
Event Registry Python
Python package for API access to news articles and events in the Event Registry
Stars: ✭ 179 (-17.13%)
Mutual labels: information-extraction
Razdel
Rule-based token and sentence segmentation for the Russian language
Stars: ✭ 144 (-33.33%)
Mutual labels: russian
Chemdataextractor
Automatically extract chemical information from scientific documents
Stars: ✭ 152 (-29.63%)
Mutual labels: information-extraction
Sypht Python Client
A python client for the Sypht API
Stars: ✭ 160 (-25.93%)
Mutual labels: information-extraction
The Road To Learn React Russian
The Road to Learn React - Russian translation
Stars: ✭ 128 (-40.74%)
Mutual labels: russian
Multi Tacotron Voice Cloning
Phoneme multilingual(Russian-English) voice cloning based on
Stars: ✭ 192 (-11.11%)
Mutual labels: russian
Corus
Links to Russian corpora + Python functions for loading and parsing
Stars: ✭ 154 (-28.7%)
Mutual labels: russian
Open Ie Papers
Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.
Stars: ✭ 150 (-30.56%)
Mutual labels: information-extraction
Triggerner
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Stars: ✭ 141 (-34.72%)
Mutual labels: information-extraction
Rust book ru
The Rust Programming Language in Russian
Stars: ✭ 188 (-12.96%)
Mutual labels: russian
Snowball
Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)
Stars: ✭ 131 (-39.35%)
Mutual labels: information-extraction
Pzad
Course "Applied Problems in Data Analysis" (CMC Faculty, Lomonosov Moscow State University)
Stars: ✭ 160 (-25.93%)
Mutual labels: russian
Ail Framework
AIL framework - Analysis Information Leak framework
Stars: ✭ 191 (-11.57%)
Mutual labels: information-extraction
Yargy is an Earley parser similar to the Tomita parser. It uses rules and dictionaries to extract structured information from Russian text.
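For readers unfamiliar with Earley parsing, the sketch below shows the three core steps (predict, scan, complete) as a minimal recognizer in plain Python. It is an illustration only and is not Yargy's implementation: the `GRAMMAR` dictionary, the `Item` tuple, and the `recognize` function are hypothetical names, the toy grammar matches literal English tokens instead of morphological predicates, and empty rules are not handled.

```python
from collections import namedtuple

# An Earley item: a rule, how far the dot has advanced in it, and where it started.
Item = namedtuple('Item', ['head', 'body', 'dot', 'start'])

# Toy grammar (hypothetical). Keys are nonterminals; any other symbol
# is matched against the input as a literal token.
GRAMMAR = {
    'PERSON': [('POSITION', 'NAME')],
    'POSITION': [('managing', 'director'), ('vice-mayor',)],
    'NAME': [('FIRST', 'LAST')],
    'FIRST': [('Ivan',), ('Anna',)],
    'LAST': [('Ulyanov',), ('Petrova',)],
}


def recognize(tokens, grammar=GRAMMAR, top='PERSON'):
    """Return True if the whole token list can be derived from `top`."""
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(position, item):
        if item not in chart[position]:
            chart[position].append(item)

    for body in grammar[top]:
        add(0, Item(top, body, 0, 0))

    for position in range(len(tokens) + 1):
        # chart[position] can grow while it is processed, so walk it by index.
        index = 0
        while index < len(chart[position]):
            item = chart[position][index]
            index += 1
            if item.dot < len(item.body):
                symbol = item.body[item.dot]
                if symbol in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for body in grammar[symbol]:
                        add(position, Item(symbol, body, 0, position))
                elif position < len(tokens) and tokens[position] == symbol:
                    # Scan: the literal after the dot matches the next token.
                    add(position + 1, item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for this head.
                for waiting in chart[item.start]:
                    if (waiting.dot < len(waiting.body)
                            and waiting.body[waiting.dot] == item.head):
                        add(position, waiting._replace(dot=waiting.dot + 1))

    return any(
        item.head == top and item.dot == len(item.body) and item.start == 0
        for item in chart[len(tokens)]
    )


print(recognize('managing director Ivan Ulyanov'.split()))  # True
print(recognize('Ivan Ulyanov'.split()))                    # False
```

A real extractor additionally has to search for matches anywhere in the text and build a result from the parse, which this sketch leaves out.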
Install
Yargy supports Python 3.5+ and PyPy 3, and depends only on Pymorphy2.
$ pip install yargy
Usage
from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline


Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

gnc = gnc_relation()
NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

parser = Parser(PERSON)

match = parser.match('управляющий директор Иван Ульянов')
print(match.fact)

Person(
    position='управляющий директор',
    name=Name(
        first='Иван',
        last='Ульянов'
    )
)
Documentation
All materials are in Russian.
Support
- Chat — https://telegram.me/natural_language_processing
- Issues — https://github.com/natasha/yargy/issues
- Commercial support — https://lab.alexkuk.ru
Development
Test:
make test
Package:
make version
git push
git push --tags
make clean wheel upload