goodmami / pe

Licence: MIT license
Fastest general-purpose parsing library for Python with a familiar API

Parsing Expressions

pe is a library for parsing expressions, including parsing expression grammars (PEGs). It aims to join the expressive power of parsing expressions with the familiarity of regular expressions. For example:

>>> import pe
>>> pe.match(r'"-"? [0-9]+', '-38')  # match an integer
<Match object; span=(0, 3), match='-38'>
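
For comparison, here is a sketch of the same integer match written with the standard library's re module, whose API pe is modeled on. The regular expression is an assumed equivalent of the parsing expression above, not something taken from pe's documentation:

```python
import re

# Regex counterpart of the parsing expression "-"? [0-9]+
m = re.match(r'-?[0-9]+', '-38')

print(m.span())   # (0, 3)
print(m.group())  # -38
```

The call shape (pattern first, then input) and the resulting span are the same in both libraries; only the pattern notation differs.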

A grammar can be used for more complicated or recursive patterns:

>>> float_parser = pe.compile(r'''
...   Start    <- INTEGER FRACTION? EXPONENT?
...   INTEGER  <- "-"? ("0" / [1-9] [0-9]*)
...   FRACTION <- "." [0-9]+
...   EXPONENT <- [Ee] [-+]? [0-9]+
... ''')
>>> float_parser.match('6.02e23')
<Match object; span=(0, 7), match='6.02e23'>

Features and Goals

  • Grammar notation is backward-compatible with standard PEG, with a few extensions
  • A specification describes the semantic effect of parsing (e.g., for mapping expressions to function calls)
  • Parsers are often faster than those of other parsing libraries, sometimes by a lot; see the benchmarks
  • The API is intuitive and familiar; it's modeled on the standard library's re module
  • Grammar definitions and parser implementations are separate

Syntax Overview

pe is backward compatible with standard PEG syntax and it is conservative with extensions.

# terminals
.            # any single character
"abc"        # string literal
'abc'        # string literal
[abc]        # character class

# repeating expressions
e            # exactly one
e?           # zero or one (optional)
e*           # zero or more
e+           # one or more

# combining expressions
e1 e2        # sequence of e1 and e2
e1 / e2      # ordered choice of e1 and e2
(e)          # subexpression

# lookahead
&e           # positive lookahead
!e           # negative lookahead

# (extension) capture substring
~e           # result of e is matched substring

# (extension) binding
name:e       # bind result of e to 'name'

# grammars
Name <- ...  # define a rule named 'Name'
... <- Name  # refer to rule named 'Name'
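
The operators above follow standard PEG semantics, which differ from regular expressions in one important way: an ordered choice commits to the first alternative that succeeds and is never revisited. The following is a toy model of a few operators in plain Python, written only to illustrate that behaviour. The helper names (lit, seq, choice, star, not_) are hypothetical and this is not how pe is implemented:

```python
import re

# Each toy parser takes (text, pos) and returns the end position, or None on failure.

def lit(s):
    """String literal, e.g. "abc"."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Sequence: e1 e2 ..."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Ordered choice: e1 / e2 ..."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end  # commit to the first alternative that succeeds
        return None
    return parse

def star(p):
    """Zero or more: e*"""
    def parse(text, pos):
        while True:
            end = p(text, pos)
            if end is None or end == pos:  # stop on failure or no progress
                return pos
            pos = end
    return parse

def not_(p):
    """Negative lookahead: !e (consumes nothing)."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

# ("a" / "ab") "c"
expr = seq(choice(lit("a"), lit("ab")), lit("c"))
peg_abc = expr("abc", 0)  # None: the choice took "a", then "c" failed at "b"
peg_ac = expr("ac", 0)    # 2

# The regex (?:a|ab)c backtracks into the alternation and matches "abc"
regex_abc = re.match(r'(?:a|ab)c', 'abc')

repeated = star(lit("ab"))("ababx", 0)          # 4
guarded = seq(not_(lit("-")), lit("5"))("-5", 0)  # None: lookahead rejects "-"
```

A practical consequence is that longer alternatives should usually be listed first, as in "ab" / "a".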

Matching Inputs with Parsing Expressions

When a parsing expression matches an input, it returns a Match object similar to the match objects of Python's re module. By default, nothing is captured, but the capture operator (~) emits the substring matched by its expression, similar to a capturing group in a regular expression:

>>> e = pe.compile(r'[0-9] [.] [0-9]')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
()
>>> e = pe.compile(r'~([0-9] [.] [0-9])')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
('1.4',)
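
For readers coming from regular expressions, the same pair of cases can be sketched with the standard re module. The patterns are assumed regex equivalents of the two parsing expressions above:

```python
import re

# No parentheses: nothing is captured, like a parsing expression without ~
m = re.match(r'[0-9][.][0-9]', '1.4')
print(m.group())   # 1.4
print(m.groups())  # ()

# A capturing group plays the role that ~ plays in the parsing expression
m2 = re.match(r'([0-9][.][0-9])', '1.4')
print(m2.group())   # 1.4
print(m2.groups())  # ('1.4',)
```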

Value Bindings

A value binding extracts the emitted values of a match and associates them with a name that is made available in the Match.groupdict() dictionary. This is similar to named capture groups in regular expressions, except that it extracts the emitted values and not the substring of the bound expression.

>>> e = pe.compile(r'~[0-9] x:(~[.]) ~[0-9]')
>>> m = e.match('1.4')
>>> m.groups()
('1', '4')
>>> m.groupdict()
{'x': '.'}
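
The contrast with regular expressions can be sketched using the standard re module and an assumed regex counterpart of the expression above. In re, a named group still appears in groups(), whereas in the pe example the bound value appears only in groupdict():

```python
import re

# Three capturing groups; the middle one is also named 'x'
m = re.match(r'([0-9])(?P<x>[.])([0-9])', '1.4')

print(m.groups())     # ('1', '.', '4')
print(m.groupdict())  # {'x': '.'}
```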

Actions

Actions (also called "semantic actions") are callables that transform parse results. When an arbitrary function is given, it is called as follows:

func(*match.groups(), **match.groupdict())

The result of this function call becomes the only emitted value going forward and all bound values are cleared.

For more control, pe provides the Action class and a number of subclasses for various use cases. These actions have access to more information about a parse result and more control over the match. For example, the Pack class takes a function and calls it with the emitted values packed into a list:

func(match.groups())

And the Join class joins all emitted strings with a separator:

func(sep.join(match.groups()), **match.groupdict())
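
The three calling conventions above can be compared side by side in plain Python. This sketch uses stand-in values for match.groups() and match.groupdict() and a hypothetical describe function; it does not call into pe:

```python
# Stand-ins for match.groups() and match.groupdict(); no pe objects are involved.
groups = ('1', '4')
groupdict = {'x': '.'}

def describe(*args, **kwargs):
    """Hypothetical action that reports the arguments it received."""
    return args, kwargs

# Plain function: func(*match.groups(), **match.groupdict())
plain_result = describe(*groups, **groupdict)

# Pack-style with func=list: func(match.groups())
pack_result = list(groups)

# Join-style with an empty separator (an assumed value for sep):
# func(sep.join(match.groups()), **match.groupdict())
sep = ''
join_result = describe(sep.join(groups), **groupdict)

print(plain_result)  # (('1', '4'), {'x': '.'})
print(pack_result)   # ['1', '4']
print(join_result)   # (('14',), {'x': '.'})
```

A plain function therefore needs one positional parameter per emitted value, while a Pack-style action receives them all as a single argument, which suits variable-length results such as lists.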

Example

Here is one way to parse a list of comma-separated integers:

>>> from pe.actions import Pack
>>> p = pe.compile(
...   r'''
...     Start  <- "[" Values? "]"
...     Values <- Int ("," Int)*
...     Int    <- ~( "-"? ("0" / [1-9] [0-9]*) )
...   ''',
...   actions={'Values': Pack(list), 'Int': int})
>>> m = p.match('[5,10,-15]')
>>> m.value()
[5, 10, -15]
