rmosolgo / lingo

Licence: MIT license
parser generator

Lingo

A parser generator for Crystal, inspired by Parslet.

Lingo processes text in two steps:

  • parsing the string into a tree of nodes
  • providing a visitor that lets you build results from that tree

Installation

Add this to your application's shard.yml:

dependencies:
  lingo:
    github: rmosolgo/lingo

Usage

Let's write a parser for highway names. The result will be a method for turning strings into useful objects:

def parse_road(input_str)
  ast = RoadParser.new.parse(input_str)
  visitor = RoadVisitor.new
  visitor.visit(ast)
  visitor.road
end

road = parse_road("I-5N")
# <Road @interstate=true, @number=5, @direction="N">

(See more examples in /examples.)
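
The snippet above relies on a Road class that this README does not define. Here is a minimal sketch of what it might look like, assuming only the four fields that the visitor in this README assigns. The property types and defaults are guesses, not the project's actual definition:

```crystal
# Hypothetical Road class: the README calls Road.new but never shows
# its definition. Field types and defaults are assumptions.
class Road
  property interstate : Bool = false
  property number : Int32 = 0
  property direction : String? = nil
  property business : Bool = false
end

# Fill it in by hand, the way the visitor hooks would.
road = Road.new
road.interstate = true
road.number = 5
road.direction = "N"
p road
```

Giving every property a default means Road.new works without arguments, which is how the visitor's #initialize uses it.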

In the USA, we write highway names like this:

50    # Route 50
I-64  # Interstate 64
I-95N # Interstate 95, Northbound
29B   # Business Route 29

Parser

The general structure is {interstate?}{number}{direction?}{business?}. Let's express that with Lingo rules:

class RoadParser < Lingo::Parser
  # Match a string:
  rule(:interstate) { str("I-") }
  rule(:business) { str("B") }

  # Match a regex:
  rule(:digit) { match(/\d/) }
  # Express repetition with `.repeat`
  rule(:number) { digit.repeat }

  rule(:north) { str("N") }
  rule(:south) { str("S") }
  rule(:east) { str("E") }
  rule(:west) { str("W") }
  # Compose rules by name
  # Express alternation with |
  rule(:direction) { north | south | east | west }

  # Express sequence with >>
  # Express optionality with `.maybe`
  # Name matched strings with `.named`
  rule(:road_name) {
    interstate.named(:interstate).maybe >>
      number.named(:number) >>
      direction.named(:direction).maybe >>
      business.named(:business).maybe
  }
  # You MUST name a starting rule:
  root(:road_name)
end

Applying the Parser

An instance of a Lingo::Parser subclass has a .parse method which returns a tree of Lingo::Nodes.

RoadParser.new.parse("250B") # => <Lingo::Node ... >

It uses the rule named by root.

Making Rules

These methods help you create rules:

  • str("string") matches string exactly
  • match(/[abc]/) matches the regex exactly
  • a | b matches a or b
  • a >> b matches a followed by b
  • a.maybe matches a or nothing
  • a.repeat matches one or more occurrences of a
  • a.repeat(0) matches zero or more occurrences of a
  • a.absent matches only where a does not match
  • a.named(:a) names the result :a for handling by a visitor
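
As a rough illustration of how these helpers combine, here is a sketch of a parser for comma-separated integers such as 1,-22,333. IntListParser and its rule names are invented for this example. Only the helper methods listed above come from Lingo, and the code assumes the shard is installed as described under Installation:

```crystal
require "lingo"

# Hypothetical example parser; not part of Lingo itself.
class IntListParser < Lingo::Parser
  rule(:digit) { match(/\d/) }
  rule(:minus) { str("-") }
  rule(:comma) { str(",") }

  # An optional sign followed by one-or-more digits
  rule(:integer) { minus.maybe >> digit.repeat }

  # A comma followed by another integer
  rule(:tail) { comma >> integer.named(:integer) }

  # One integer, then zero-or-more comma-prefixed integers
  rule(:list) { integer.named(:integer) >> tail.repeat(0) }

  root(:list)
end

ast = IntListParser.new.parse("1,-22,333")
puts ast.full_value
```

Each integer is named :integer, so a visitor could collect the values with a single enter(:integer) hook.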

Visitor

After parsing, you get a tree of Lingo::Nodes. To turn that into an application object, write a visitor.

The visitor may define enter and exit hooks for nodes named with .named in the Parser. It may set up some state during #initialize; hooks can then reach that state through the visitor variable.

class RoadVisitor < Lingo::Visitor
  # Set up an accumulator
  getter :road
  def initialize
    @road = Road.new
  end

  # When you find a named node, you can do something with it.
  # You can access the current visitor as `visitor`
  enter(:interstate) {
    # since we found this node, this is an interstate
    visitor.road.interstate = true
  }

  # You can access the named Lingo::Node as `node`.
  # Get the matched string with `.full_value`
  enter(:number) {
    visitor.road.number = node.full_value.to_i
  }

  enter(:direction) {
    visitor.road.direction = node.full_value
  }

  enter(:business) {
    visitor.road.business = true
  }
end

Visitor Hooks

During the depth-first visitation of the resulting tree of Lingo::Nodes, you can handle visits to nodes named with .named:

  • enter(:match) is called when entering a node named :match
  • exit(:match) is called when exiting a node named :match

Within the hooks, you can access two magic variables:

  • visitor is the Visitor itself
  • node is the matched Lingo::Node which exposes:
    • #full_value: the full matched string
    • #line, #column: position information for this match
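
To see the order in which hooks fire, a small sketch can log each enter and exit along with the node's position. KeyValueParser, TraceVisitor, and the log array are made up for this example. It relies only on the hooks and node methods listed above, and it prints line and column without assuming whether Lingo counts them from 0 or 1:

```crystal
require "lingo"

# Hypothetical parser for input like "name=lingo"
class KeyValueParser < Lingo::Parser
  rule(:letter) { match(/[a-z]/) }
  rule(:word) { letter.repeat }
  rule(:equals) { str("=") }
  rule(:pair) { word.named(:key) >> equals >> word.named(:value) }
  root(:pair)
end

# Hypothetical visitor that records every hook call in order
class TraceVisitor < Lingo::Visitor
  getter :log
  def initialize
    @log = [] of String
  end

  enter(:key) {
    visitor.log << "enter key #{node.full_value} at #{node.line}:#{node.column}"
  }

  exit(:key) {
    visitor.log << "exit key"
  }

  enter(:value) {
    visitor.log << "enter value #{node.full_value} at #{node.line}:#{node.column}"
  }
end

ast = KeyValueParser.new.parse("name=lingo")
tracer = TraceVisitor.new
tracer.visit(ast)
tracer.log.each { |entry| puts entry }
```

Because the visit is depth-first, the exit hook for :key should be logged before the enter hook for :value.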

About this Project

Goals

  • Low barrier to entry: easy-to-learn API, short zero-to-working time
  • Easy-to-read code, therefore easy-to-modify
  • Useful errors (not accomplished)

Non-goals

  • Blazing-fast performance
  • Theoretical correctness

TODO

  • Add some kind of debug output

How slow is it?

Let's compare the built-in JSON parser to a Lingo JSON parser:

./lingo/benchmark $ crystal run --release slow_json.cr
Stdlib JSON 126.45k (± 1.55%)        fastest
Lingo::JSON 660.18  (± 1.28%) 191.54× slower

Ouch, that's a lot slower.

But it's on par with Parslet, the Ruby library that inspired this project:

$ ruby parslet_json_benchmark.rb
Calculating -------------------------------------
       Parslet JSON      4.000  i/100ms
       Built-in JSON     3.657k i/100ms
-------------------------------------------------
       Parslet JSON      45.788  (± 4.4%) i/s -    232.000
       Built-in JSON     38.285k (± 5.3%) i/s -    193.821k

Comparison:
       Built-in JSON:    38285.2 i/s
       Parslet JSON :       45.8 i/s - 836.13x slower

Both Parslet and Lingo are slower than handwritten parsers. But they're easier to write!
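
To put the two benchmarks side by side, the iterations-per-second figures above can be converted into time per iteration. This is a back-of-the-envelope sketch: the numbers are copied from the output above (so they are rounded), and it assumes nothing about whether the two benchmarks parse the same input or ran on the same machine:

```crystal
# Iterations per second, copied from the benchmark output above.
rates = {
  "Stdlib JSON (Crystal)" => 126_450.0,
  "Lingo::JSON"           => 660.18,
  "Built-in JSON (Ruby)"  => 38_285.2,
  "Parslet JSON"          => 45.788,
}

# Convert each rate into milliseconds per iteration.
rates.each do |name, ips|
  ms_per_iteration = 1000.0 / ips
  puts "#{name}: #{ms_per_iteration.round(3)} ms per iteration"
end

# Slowdown of each parser generator relative to its language's built-in parser.
lingo_slowdown = rates["Stdlib JSON (Crystal)"] / rates["Lingo::JSON"]
parslet_slowdown = rates["Built-in JSON (Ruby)"] / rates["Parslet JSON"]
puts "Lingo slowdown:   #{lingo_slowdown.round(2)}x"
puts "Parslet slowdown: #{parslet_slowdown.round(2)}x"
```

On these figures, Lingo's slowdown relative to Crystal's stdlib parser (about 192×) is smaller than Parslet's relative to Ruby's built-in parser (about 836×).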

Development

  • Run the tests with crystal spec
  • Install Ruby & guard, then start a watcher with guard