jwtowner / Lug

Licence: other
Parsing expression grammar (PEG) embedded domain specific language and parsing machine for C++17

Tags: cpp17, dsl, grammar

Projects that are alternatives of or similar to Lug

ParsecSharp
The faster monadic parser combinator library for C#
Stars: ✭ 23 (-47.73%)
Mutual labels:  parsing, parser-combinators, peg
Cpp Peglib
A single file C++ header-only PEG (Parsing Expression Grammars) library
Stars: ✭ 435 (+888.64%)
Mutual labels:  parser-generator, peg, parsing
Pom
PEG parser combinators using operator overloading without macros.
Stars: ✭ 310 (+604.55%)
Mutual labels:  parser-combinators, peg, parsing
pe
Fastest general-purpose parsing library for Python with a familiar API
Stars: ✭ 21 (-52.27%)
Mutual labels:  parsing, parser-generator, peg
Pegtl
Parsing Expression Grammar Template Library
Stars: ✭ 1,295 (+2843.18%)
Mutual labels:  parser-combinators, peg, parsing
Rust Peg
Parsing Expression Grammar (PEG) parser generator for Rust
Stars: ✭ 836 (+1800%)
Mutual labels:  parser-generator, peg, parsing
parser-combinators
Lightweight package providing commonly useful parser combinators
Stars: ✭ 41 (-6.82%)
Mutual labels:  parsing, parser-combinators
Covfefe
A parser for nondeterministic context free languages
Stars: ✭ 49 (+11.36%)
Mutual labels:  parsing, parser-generator
chumsky
A parser library for humans with powerful error recovery.
Stars: ✭ 740 (+1581.82%)
Mutual labels:  parser-combinators, peg
copper
An integrated context-aware scanner and parser generator
Stars: ✭ 14 (-68.18%)
Mutual labels:  parsing, parser-generator
inmemantlr
ANTLR as a library for JVM-based languages
Stars: ✭ 87 (+97.73%)
Mutual labels:  parsing, parser-generator
Pegjs
PEG.js: Parser generator for JavaScript
Stars: ✭ 4,176 (+9390.91%)
Mutual labels:  parser-generator, peg
ohm-editor
An IDE for the Ohm language (JavaScript edition)
Stars: ✭ 78 (+77.27%)
Mutual labels:  parsing, peg
YaccConstructor
Platform for parser generators and other grammarware research and development. GLL, RNGLR, graph parsing algorithms, and many others are included.
Stars: ✭ 36 (-18.18%)
Mutual labels:  parsing, parser-generator
latex2unicode
Convert LaTeX markup to Unicode (in Scala and Java)
Stars: ✭ 28 (-36.36%)
Mutual labels:  parsing, peg
leftry
Leftry - A left-recursion enabled recursive-descent parser combinator library for Lua.
Stars: ✭ 32 (-27.27%)
Mutual labels:  parser-combinators, parser-generator
Comby
A tool for structural code search and replace that supports ~every language.
Stars: ✭ 912 (+1972.73%)
Mutual labels:  parser-combinators, parsing
Pidgin
C#'s fastest parser combinator library
Stars: ✭ 469 (+965.91%)
Mutual labels:  parser-combinators, parsing
Scala Parser Combinators
Simple combinator-based parsing for Scala. Formerly part of the Scala standard library, now a separate community-maintained module.
Stars: ✭ 523 (+1088.64%)
Mutual labels:  parser-combinators, parsing
peg
Import of Ian Piumarta's peg/leg recursive-descent parser generators for C
Stars: ✭ 41 (-6.82%)
Mutual labels:  parser-generator, peg

lug

An embedded domain specific language for expressing parsers as extended parsing expression grammars (PEGs) in C++17

Features

  • Natural syntax more akin to external parser generator languages
  • Separation of syntactic and lexical rules, with customizable implicit whitespace skipping
  • Direct and indirect left recursion with precedence levels to disambiguate subexpressions with mixed left/right recursion
  • Traditional PEG syntax has been extended to support attribute grammars
  • Cut operator to commit to currently matched parse prefix and prune all backtrack entries
  • Deferred evaluation of semantic actions, ensuring actions do not execute on failed branches or invalid input
  • Generated parsers are compiled to special-purpose bytecode and executed in a virtual parsing machine
  • UTF-8 text parsing with complete Level 1 and partial Level 2 support of the UTS #18 Unicode Regular Expressions technical standard
  • Automatic line and column tracking with customizable tab width and alignment
  • Uses expression template functors to implement the rules of the domain specific language
  • Header only library using C++17 language and library features
  • Relatively small, with the intent that the parser core remain under 1500 lines of terse code
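
The deferred evaluation of semantic actions listed above can be illustrated with a small standalone sketch. It is not lug's implementation and does not use lug's API; `DeferredParser` and `parse_number_list` are hypothetical names invented here. The idea shown is only that actions are queued while matching, dropped when a branch backtracks, and executed once the whole input has been accepted.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration only; this is not lug's implementation.
struct DeferredParser {
    std::string_view input;
    std::size_t pos = 0;
    std::vector<std::function<void()>> actions; // queued, not yet executed

    // Match one or more decimal digits and queue an action that records them.
    bool number(std::vector<int>& out) {
        std::size_t start = pos;
        while (pos < input.size() &&
               std::isdigit(static_cast<unsigned char>(input[pos]))) ++pos;
        if (pos == start) return false;
        int value = std::stoi(std::string(input.substr(start, pos - start)));
        actions.push_back([&out, value] { out.push_back(value); });
        return true;
    }

    // Match a single literal character.
    bool literal(char c) {
        if (pos < input.size() && input[pos] == c) { ++pos; return true; }
        return false;
    }

    // list <- number (',' number)* followed by end of input
    bool list(std::vector<int>& out) {
        if (!number(out)) return false;
        for (;;) {
            // Save state so that a failed iteration can be undone.
            std::size_t saved_pos = pos;
            std::size_t saved_actions = actions.size();
            if (literal(',') && number(out)) continue;
            pos = saved_pos;
            actions.resize(saved_actions); // drop actions from the failed branch
            break;
        }
        return pos == input.size();
    }
};

// Returns true and fills `out` only when the whole input matches.
bool parse_number_list(std::string_view text, std::vector<int>& out) {
    DeferredParser p{text};
    if (!p.list(out)) return false;     // queued actions are discarded unrun
    for (auto& act : p.actions) act();  // run actions in match order
    return true;
}
```

Calling `parse_number_list("10,20,x", values)` returns false and leaves `values` untouched, because the queued actions for `10` and `20` are never run.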

It is based on research introduced in the following papers:

Bryan Ford, Parsing expression grammars: a recognition-based syntactic foundation, Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, p.111-122, January 2004

Sérgio Medeiros et al., A parsing machine for PEGs, Proceedings of the 2008 symposium on Dynamic Languages, p.1-12, July 2008

Kota Mizushima et al., Packrat parsers can handle practical grammars in mostly constant space, Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering, p.29-36, June 2010

Sérgio Medeiros et al., Left recursion in Parsing Expression Grammars, Science of Computer Programming, v.96 n.P2, p.177-190, December 2014

Leonardo Reis et al., The formalization and implementation of Adaptable Parsing Expression Grammars, Science of Computer Programming, v.96 n.P2, p.191-210, December 2014

Sérgio Medeiros et al., A parsing machine for parsing expression grammars with labeled failures, Proceedings of the 31st Annual ACM symposium on Applied Computing, p.1960-1967, April 2016

Building

As a header only library, lug itself does not need to be built. Simply ensure the lug header directory is in your include path and you're good to go.

As a baseline, the following compiler versions are known to work with lug.

| Compiler | Language Mode |
|---|---|
| Clang 5.0.0 (September 2017) | -std=c++17 or -std=gnu++17 |
| GCC 7.1.0 (May 2017) | -std=c++17 or -std=gnu++17 |
| Microsoft Visual C++ 2017 15.5 (December 2017) | Platform Toolset: Visual Studio 2017 Toolset (v141), Language Standard: ISO C++17 Standard (/std:c++17) |

To build the sample programs and unit tests, a makefile is provided for Linux and BSD platforms and a Visual Studio solution is available for use on Windows.

Syntax Reference

| Operator | Syntax |
|---|---|
| Sequence | `e1 > e2` |
| Ordered Choice | `e1 \| e2` |
| Zero-or-More | `*e` |
| One-or-More | `+e` |
| Optional | `~e` |
| Positive Lookahead | `&e` |
| Negative Lookahead | `!e` |
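
To show how operators like these can be expressed with C++ operator overloading, here is a minimal standalone sketch. It is an assumption-laden toy, not lug's expression templates or parsing machine: the `Parser` struct, `ch`, `full_match`, and `example_grammar` are hypothetical names, and the combinators are interpreted directly through `std::function` rather than compiled to bytecode.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical mini-combinators for illustration; not lug's actual types.
// A parser maps (input, position) to the new position, or nullopt on failure.
using Result = std::optional<std::size_t>;
struct Parser {
    std::function<Result(std::string_view, std::size_t)> run;
};

// Terminal: match a single character.
inline Parser ch(char c) {
    return {[c](std::string_view s, std::size_t i) -> Result {
        if (i < s.size() && s[i] == c) return i + 1;
        return std::nullopt;
    }};
}

// Sequence: e1 > e2
inline Parser operator>(Parser a, Parser b) {
    return {[a = std::move(a), b = std::move(b)](std::string_view s, std::size_t i) -> Result {
        if (auto m = a.run(s, i)) return b.run(s, *m);
        return std::nullopt;
    }};
}

// Ordered choice: e1 | e2 (e2 is tried only if e1 fails)
inline Parser operator|(Parser a, Parser b) {
    return {[a = std::move(a), b = std::move(b)](std::string_view s, std::size_t i) -> Result {
        if (auto m = a.run(s, i)) return m;
        return b.run(s, i);
    }};
}

// Zero-or-more: *e (greedy, never fails; stops if e makes no progress)
inline Parser operator*(Parser e) {
    return {[e = std::move(e)](std::string_view s, std::size_t i) -> Result {
        while (auto m = e.run(s, i)) {
            if (*m == i) break;
            i = *m;
        }
        return i;
    }};
}

// One-or-more: +e, defined as e followed by *e
inline Parser operator+(Parser e) { return e > *e; }

// Optional: ~e
inline Parser operator~(Parser e) {
    return {[e = std::move(e)](std::string_view s, std::size_t i) -> Result {
        if (auto m = e.run(s, i)) return m;
        return i;
    }};
}

// Positive lookahead: &e (succeeds without consuming input)
inline Parser operator&(Parser e) {
    return {[e = std::move(e)](std::string_view s, std::size_t i) -> Result {
        if (e.run(s, i)) return i;
        return std::nullopt;
    }};
}

// Negative lookahead: !e (succeeds only if e fails, consumes nothing)
inline Parser operator!(Parser e) {
    return {[e = std::move(e)](std::string_view s, std::size_t i) -> Result {
        if (e.run(s, i)) return std::nullopt;
        return i;
    }};
}

// True when `p` matches the entire input.
inline bool full_match(const Parser& p, std::string_view s) {
    auto m = p.run(s, 0);
    return m && *m == s.size();
}

// Example: 'a' ('b' | 'c')* 'd'? !'e'
inline Parser example_grammar() {
    return ch('a') > *(ch('b') | ch('c')) > ~ch('d') > !ch('e');
}
```

The choice of `>` for sequencing works with C++ precedence rules: the unary operators bind tightest, then `>`, then `|`, which matches the usual PEG reading of prefix operators, sequence, and choice.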

| Terminal | Description |
|---|---|
| `chr(c)` | Matches the UTF-8, UTF-16, or UTF-32 character c |
| `chr(c1, c2)` | Matches characters in the UTF-8, UTF-16, or UTF-32 interval [c1-c2] |
| `str(s)` | Matches the sequence of characters in a string |
| `bre(s)` | POSIX Basic Regular Expression (BRE) |
| `any` | Matches any single character |
| `any(flags)` | Matches a character exhibiting any of the character properties |
| `all(flags)` | Matches a character with all of the character properties |
| `none(flags)` | Matches a character with none of the character properties |
| `eps` | Matches the empty string |
| `eoi` | Matches the end of the input sequence |
| `eol` | Matches a Unicode line-ending |
| `nop` | No operation, does not emit any instructions |
| `cut` | Emits a cut operation into the stream of semantic actions without matching |
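
As a rough illustration of what matching a character interval over UTF-8 input involves, the following standalone sketch decodes one code point and tests it against a range. It does not reflect how lug implements `chr(c1, c2)`; `decode_utf8` and `match_range` are hypothetical helpers, and the decoder is deliberately simplified (it does not reject overlong encodings or surrogates).

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <utility>

// Decode one UTF-8 code point at `pos`; returns {code point, byte length}.
// Simplified: rejects truncated input and bad continuation bytes, but does
// not check for overlong encodings or surrogate code points.
inline std::optional<std::pair<char32_t, std::size_t>>
decode_utf8(std::string_view s, std::size_t pos) {
    if (pos >= s.size()) return std::nullopt;
    unsigned char b0 = static_cast<unsigned char>(s[pos]);
    std::size_t len = 0;
    char32_t cp = 0;
    if (b0 < 0x80) { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else return std::nullopt; // stray continuation byte or invalid lead byte
    if (pos + len > s.size()) return std::nullopt;
    for (std::size_t i = 1; i < len; ++i) {
        unsigned char b = static_cast<unsigned char>(s[pos + i]);
        if ((b & 0xC0) != 0x80) return std::nullopt;
        cp = (cp << 6) | (b & 0x3F);
    }
    return std::make_pair(cp, len);
}

// Hypothetical analogue of an interval terminal: if the code point at `pos`
// lies in [lo, hi], return the position just past it.
inline std::optional<std::size_t>
match_range(std::string_view s, std::size_t pos, char32_t lo, char32_t hi) {
    auto d = decode_utf8(s, pos);
    if (!d || d->first < lo || d->first > hi) return std::nullopt;
    return pos + d->second;
}
```

For example, the two-byte sequence `"\xC3\xA9"` (U+00E9) matches the interval 0xC0 to 0xFF and advances the position by two bytes, while ASCII `"a"` does not match that interval.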

| Literal | Name | Description |
|---|---|---|
| `_cx` | Character Expression | Matches the UTF-8, UTF-16, or UTF-32 character literal |
| `_sx` | String Expression | Matches the sequence of characters in a string literal |
| `_rx` | Regular Expression | POSIX Basic Regular Expression (BRE) |
| `_icx` | Case Insensitive Character Expression | Same as `_cx` but case insensitive |
| `_isx` | Case Insensitive String Expression | Same as `_sx` but case insensitive |
| `_irx` | Case Insensitive Regular Expression | Same as `_rx` but case insensitive |
| `_scx` | Case Sensitive Character Expression | Same as `_cx` but case sensitive |
| `_ssx` | Case Sensitive String Expression | Same as `_sx` but case sensitive |
| `_srx` | Case Sensitive Regular Expression | Same as `_rx` but case sensitive |
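
Suffixes such as these are C++ user-defined literals. The sketch below shows the general mechanism in a hypothetical `sketch` namespace; the `StringTerminal` type and the operator bodies are assumptions made for illustration and are not lug's definitions. Case folding here is ASCII-only via `std::tolower`, whereas lug advertises Unicode-aware matching.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical terminal type; lug's real expression types differ.
struct StringTerminal {
    std::string text;
    bool ignore_case;

    // True when `input` begins with `text` (ASCII case folding only).
    bool matches_prefix(std::string_view input) const {
        if (input.size() < text.size()) return false;
        for (std::size_t i = 0; i < text.size(); ++i) {
            unsigned char a = static_cast<unsigned char>(input[i]);
            unsigned char b = static_cast<unsigned char>(text[i]);
            bool same = ignore_case ? std::tolower(a) == std::tolower(b) : a == b;
            if (!same) return false;
        }
        return true;
    }
};

inline namespace literals {

// "abc"_sx : case-sensitive string terminal
inline StringTerminal operator""_sx(const char* s, std::size_t n) {
    return {std::string(s, n), false};
}

// "abc"_isx : case-insensitive string terminal
inline StringTerminal operator""_isx(const char* s, std::size_t n) {
    return {std::string(s, n), true};
}

// 'a'_cx : single-character terminal
inline StringTerminal operator""_cx(char c) {
    return {std::string(1, c), false};
}

} // namespace literals
} // namespace sketch

// Usage: the suffix makes a terminal read like part of the grammar.
inline bool starts_with_select(std::string_view sql) {
    using namespace sketch::literals;
    return ("select"_isx).matches_prefix(sql);
}
```

With these definitions, `starts_with_select("SELECT * FROM t")` is true, while the case-sensitive `"select"_sx` would reject the same input.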

TODO

  • parser error recovery
  • add an interactive processing mode flag to input sources?
  • handle exceptions thrown from semantic actions in semantics::accept?
  • feature: symbol tables and parsing conditions
  • feature: Adams-Nestra grammars and whitespace alignment
  • feature: syntax to specify number range of allowed iteration
  • optimization: tail recursion
  • optimization: reduce number of false-positive left-recursive calls even further by lazily evaluating rule mandate
  • optimization: additional instructions (test_char, test_any, test_range, test_class)
  • more samples, testing, and bug fixing
  • increase compiler warning level and fix any issues
  • documentation