zaibacu / rita-dsl

Licence: MIT license
A Domain Specific Language (DSL) for building language patterns. These can later be compiled into spaCy patterns, pure regex, or any other format.

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects

Projects that are alternatives of or similar to rita-dsl

Spacy Api Docker
spaCy REST API, wrapped in a Docker container.
Stars: ✭ 222 (+270%)
Mutual labels:  parsing, spacy
ATGValidator
iOS validation framework with form validation support
Stars: ✭ 51 (-15%)
Mutual labels:  regex, rule-based
SkillNER
A (smart) rule based NLP module to extract job skills from text
Stars: ✭ 69 (+15%)
Mutual labels:  spacy, rule-based
librxvm
non-backtracking NFA-based regular expression library, for C and Python
Stars: ✭ 57 (-5%)
Mutual labels:  parsing, regex
Comby
A tool for structural code search and replace that supports ~every language.
Stars: ✭ 912 (+1420%)
Mutual labels:  parsing, regex
python-hslog
Python module to parse Hearthstone Power.log files
Stars: ✭ 37 (-38.33%)
Mutual labels:  parsing, regex
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (-53.33%)
Mutual labels:  parsing, regex
Rosie Pattern Language
Rosie Pattern Language (RPL) and the Rosie Pattern Engine have MOVED!
Stars: ✭ 146 (+143.33%)
Mutual labels:  parsing, regex
spaczz
Fuzzy matching and more functionality for spaCy.
Stars: ✭ 215 (+258.33%)
Mutual labels:  regex, spacy
pcre-heavy
A Haskell regular expressions library that doesn't suck | now on https://codeberg.org/valpackett/pcre-heavy
Stars: ✭ 52 (-13.33%)
Mutual labels:  regex
please
please, a sudo clone
Stars: ✭ 40 (-33.33%)
Mutual labels:  regex
assemblyscript-regex
A regex engine for AssemblyScript
Stars: ✭ 81 (+35%)
Mutual labels:  regex
ltreesitter
Standalone tree sitter bindings for the Lua language
Stars: ✭ 62 (+3.33%)
Mutual labels:  parsing
Ruby Regexp
Learn Ruby Regexp step by step from beginner to advanced levels with plenty of examples and exercises
Stars: ✭ 79 (+31.67%)
Mutual labels:  regex
bllip-parser
BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser) See http://pypi.python.org/pypi/bllipparser/ for Python module.
Stars: ✭ 217 (+261.67%)
Mutual labels:  parsing
pyrser
A PEG Parsing Tool
Stars: ✭ 32 (-46.67%)
Mutual labels:  parsing
spacymoji
💙 Emoji handling and meta data for spaCy with custom extension attributes
Stars: ✭ 174 (+190%)
Mutual labels:  spacy
JagTag
📝 JagTag is a simple - yet powerful and customizable - interpreted text parsing language!
Stars: ✭ 40 (-33.33%)
Mutual labels:  parsing
DrawRacket4Me
DrawRacket4Me draws trees and graphs from your code, making it easier to check if the structure is what you wanted.
Stars: ✭ 43 (-28.33%)
Mutual labels:  parsing
BencodeNET
.NET library for encoding/decoding bencode and reading/writing torrent files
Stars: ✭ 133 (+121.67%)
Mutual labels:  parsing

RITA DSL

RITA is a language, loosely based on Apache UIMA RUTA, for writing manual language rules. It compiles into either spaCy-compatible patterns or pure regex. These patterns can be used for manual NER as well as in other processes, such as retokenizing and pure matching.

Support

reddit Gitter

If you need consulting or some custom work done, you can Contact Us

Install

pip install rita-dsl

Simple Rules example

rules = """
cuts = {"fitted", "wide-cut"}
lengths = {"short", "long", "calf-length", "knee-length"}
fabric_types = {"soft", "airy", "crinkled"}
fabrics = {"velour", "chiffon", "knit", "woven", "stretch"}

{IN_LIST(cuts)?, IN_LIST(lengths), WORD("dress")}->MARK("DRESS_TYPE")
{IN_LIST(lengths), IN_LIST(cuts), WORD("dress")}->MARK("DRESS_TYPE")
{IN_LIST(fabric_types)?, IN_LIST(fabrics)}->MARK("DRESS_FABRIC")
"""

Loading in spaCy

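import spacy
from rita.shortcuts import setup_spacy

# spaCy 3 removed the "en" shortcut; load an installed pipeline by its full name,
# e.g. after running `python -m spacy download en_core_web_sm`
nlp = spacy.load("en_core_web_sm")
# Compile the rules string defined above and add the patterns to the pipeline
setup_spacy(nlp, rules_string=rules)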

And using it:

>>> r = nlp("She was wearing a short wide-cut dress")
>>> [{"label": e.label_, "text": e.text} for e in r.ents]
[{'label': 'DRESS_TYPE', 'text': 'short wide-cut dress'}]

Loading using Regex (standalone)

import rita

patterns = rita.compile_string(rules, use_engine="standalone")

And using it:

>>> list(patterns.execute("She was wearing a short wide-cut dress"))
[{'end': 38, 'label': 'DRESS_TYPE', 'start': 18, 'text': 'short wide-cut dress'}]
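
To give a feel for what "compiling to pure regex" means, here is a minimal hand-written sketch of the second rule above using only Python's `re` module. It is an illustration under assumptions, not the regex RITA actually generates: the helper names `alternation` and `execute_sketch` are hypothetical, and only the `{IN_LIST(lengths), IN_LIST(cuts), WORD("dress")}` rule is covered.

```python
import re

# Hypothetical, hand-written stand-in for the rule
#   {IN_LIST(lengths), IN_LIST(cuts), WORD("dress")}->MARK("DRESS_TYPE")
# This is NOT the pattern RITA emits; it only illustrates the idea.
cuts = ["fitted", "wide-cut"]
lengths = ["short", "long", "calf-length", "knee-length"]


def alternation(words):
    """Build a non-capturing group matching any one of the given words."""
    # Longest words first, so longer alternatives are tried before shorter ones.
    ordered = sorted(words, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(w) for w in ordered) + ")"


DRESS_TYPE = re.compile(
    r"\b" + alternation(lengths) + r"\s+" + alternation(cuts) + r"\s+dress\b"
)


def execute_sketch(text):
    """Yield one dict per match, shaped like the standalone output above."""
    for m in DRESS_TYPE.finditer(text):
        yield {
            "start": m.start(),
            "end": m.end(),
            "label": "DRESS_TYPE",
            "text": m.group(),
        }


results = list(execute_sketch("She was wearing a short wide-cut dress"))
print(results)
```

Each set in the rules becomes an alternation group, and the rule's sequence becomes a concatenation separated by whitespace. Optional elements such as `IN_LIST(cuts)?` would need an optional group in the same style.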