christophevg / human-parser-generator

Licence: MIT license
A straightforward recursive descent Parser Generator with a focus on "human" code generation and ease of use.

Human Parser Generator

A straightforward recursive descent Parser Generator with a focus on "human" code generation and ease of use.
Christophe VG
https://github.com/christophevg/human-parser-generator

Rationale

Although many parser generators exist, I feel like there is room for one more, which generates a parser in a more "human" way.

The objectives are:

  • start from a standard EBNF grammar, so that existing grammars can be copy-pasted and used with little or no modification.
  • generate code, as if it were written by a human developer:
    • generate functional classes to construct the AST
    • generate parser logic that is readable and understandable
  • be self hosting: the generator should be able to generate a parser for itself.

EBNF is a (meta-)syntax that can be used to express (context-free) grammars. EBNF is an "extension" to BNF. The Human Parser Generator takes EBNF grammars as input to generate parsers for the language expressed by the grammar.

The project initially targets C#, which is the language of the generator itself. Once the generator is stable, support for generating other languages can be added.

Current Status - Version 1.1

Get the Human Parser Generator

Downloads of the repository and a binary build of hpg.exe are available from the project's GitHub releases page.

Minimal Survival Commands:

$ git clone https://github.com/christophevg/human-parser-generator
$ cd human-parser-generator
$ msbuild
Microsoft (R) Build Engine version 14.1.0.0
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 3/6/2017 1:46:48 PM.
Project "/Users/xtof/Workspace/human-parser-generator/hpg.csproj" on node 1 (default targets).
MakeBuildDirectory:
  Creating directory "bin/Debug/".
Gen0Parser:
  /Library/Frameworks/Mono.framework/Versions/4.6.2/lib/mono/4.5/csc.exe /debug+ /out:bin/Debug/hpg.gen0.exe /target:exe generator/parsable.cs generator/generator.cs generator/factory.cs generator/emitter.csharp.cs generator/emitter.bnf.cs generator/format.csharp.cs generator/AssemblyInfo.cs generator/grammar.cs generator/bootstrap.cs
Gen1Source:
  mono bin/Debug/hpg.gen0.exe generator/hpg.bnf | LC_ALL="C" astyle -s2 -xt0 -xe -Y -xC80 > generator/parser.gen1.cs
Gen1Parser:
  /Library/Frameworks/Mono.framework/Versions/4.6.2/lib/mono/4.5/csc.exe /debug+ /out:bin/Debug/hpg.gen1.exe /target:exe generator/parsable.cs generator/generator.cs generator/factory.cs generator/emitter.csharp.cs generator/emitter.bnf.cs generator/format.csharp.cs generator/AssemblyInfo.cs generator/parser.gen1.cs generator/hpg.cs
HPGSource:
  mono bin/Debug/hpg.gen1.exe generator/hpg.bnf | LC_ALL="C" astyle -s2 -xt0 -xe -Y -xC80 > generator/parser.cs
Build:
  /Library/Frameworks/Mono.framework/Versions/4.6.2/lib/mono/4.5/csc.exe /debug+ /out:bin/Debug/hpg.exe /target:exe generator/parsable.cs generator/generator.cs generator/factory.cs generator/emitter.csharp.cs generator/emitter.bnf.cs generator/format.csharp.cs generator/AssemblyInfo.cs generator/parser.cs generator/hpg.cs
Done Building Project "/Users/xtof/Workspace/human-parser-generator/hpg.csproj" (default targets).

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:02.38
$ mono bin/Debug/hpg.exe --help
Human Parser Generator version 1.1.6274.24805
Usage: hpg.exe [options] [file ...]

    --help, -h              Show usage information
    --version, -v           Show version information

    --output, -o FILENAME   Output to file, not stdout

Output options.
Select one of the following:
    --parser, -p            Generate parser (DEFAULT)
    --ast, -a               Show AST
    --model, -m             Show parser model
    --grammar, -g           Show grammar
Formatting options.
    --text, -t              Generate textual output (DEFAULT).
    --dot, -d               Generate Graphviz/Dot format output. (model)
Emission options.
    --info, -i              Suppress generation of info header
    --rule, -r              Suppress generation of rule comment
    --namespace, -n NAME    Embed parser in namespace

When running on a unix-like environment (e.g. macOS, Linux, ...) the generated parsers are styled using AStyle. On Windows this dependency is suppressed by default. To avoid using AStyle, set the AStyle build property to an empty string: msbuild /Property:AStyle=.

A Complete Example

The following example is taken from the Wikipedia page on EBNF:

(* a simple program syntax in EBNF − Wikipedia *)
 program = 'PROGRAM', white space, identifier, white space, 
            'BEGIN', white space, 
            { assignment, ";", white space }, 
            'END.' ;
 identifier = alphabetic character, { alphabetic character | digit } ;
 number = [ "-" ], digit, { digit } ;
 string = '"' , { all characters - '"' }, '"' ;
 assignment = identifier , ":=" , ( number | identifier | string ) ;
 alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G"
                      | "H" | "I" | "J" | "K" | "L" | "M" | "N"
                      | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
                      | "V" | "W" | "X" | "Y" | "Z" ;
 digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
 white space = ? white space characters ? ;
 all characters = ? all visible characters ? ;

This grammar can be used to parse a Pascal-like program consisting of assignments:

 PROGRAM DEMO1
 BEGIN
   A:=3;
   B:=45;
   H:=-100023;
   C:=A;
   D123:=B34A;
   BABOON:=GIRAFFE;
   TEXT:="Hello world!";
 END.

To take advantage of the [extended grammar features of the Human Parser Generator](https://github.com/christophevg/human-parser-generator/wiki/HPG Grammar), the grammar above can be rewritten as:

(* a simple program syntax in HPG-flavoured EBNF - based on example from Wikipedia *)

program      = "PROGRAM" identifier
               "BEGIN"
               { assignment ";" }
               "END."
             ;

assignment   = identifier ":=" expression ;

expression   = identifier
             | string
             | number
             ;

identifier   = name  @ ? /([A-Z][A-Z0-9]*)/ ? ;
string       = text  @ ? /"([^"]*)"|'([^']*)'/ ? ;
number       = value @ ? /(-?[1-9][0-9]*)/ ? ;
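
The last three rules extract a value using a regular expression with a capturing group. As a rough illustration of what each pattern accepts, the sketch below applies them with .NET's Regex class. The TokenPatterns class and its Extract helper are hypothetical and not part of HPG, and anchoring the match with \G is an assumption about how a parser would match at its current position.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of HPG: tries a pattern at the start of the
// input and returns the text of the first capturing group that matched.
public static class TokenPatterns {
  // The three patterns from the grammar above, written as C# verbatim strings.
  public const string IdentifierPattern = @"([A-Z][A-Z0-9]*)";
  public const string StringPattern     = @"""([^""]*)""|'([^']*)'";
  public const string NumberPattern     = @"(-?[1-9][0-9]*)";

  public static string Extract(string pattern, string input) {
    // \G anchors the match at the start position (an assumption of this sketch).
    Match match = Regex.Match(input, @"\G(?:" + pattern + ")");
    if (!match.Success) { return null; }
    // Return the first group that took part in the match, e.g. the
    // double-quoted or the single-quoted alternative of the string pattern.
    for (int i = 1; i < match.Groups.Count; i++) {
      if (match.Groups[i].Success) { return match.Groups[i].Value; }
    }
    return match.Value;
  }

  public static void Main() {
    Console.WriteLine(Extract(IdentifierPattern, "D123:=B34A;"));     // D123
    Console.WriteLine(Extract(NumberPattern, "-100023;"));            // -100023
    Console.WriteLine(Extract(StringPattern, "\"Hello world!\";"));   // Hello world!
    Console.WriteLine(Extract(NumberPattern, "0;") == null);          // True
  }
}
```

The last line shows a consequence of the number pattern as written: because the first digit must be in the range 1 to 9, a lone 0 is not accepted.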

We can now feed this grammar to the Human Parser Generator:

$ mono hpg.exe example/pascal/pascal.bnf

The generated parser is returned on standard output:

// DO NOT EDIT THIS FILE
// This file was generated using the Human Parser Generator
// (https://github.com/christophevg/human-parser-generator)
// on Monday, March 6, 2017 at 1:10:56 PM
// Source : example/pascal/pascal.bnf

using System;
using System.IO;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Linq;

// program ::= "PROGRAM" identifier "BEGIN" { assignment ";" } "END." ;
public class Program {
  public Identifier Identifier { get; set; }
  public List<Assignment> Assignments { get; set; }
  public Program() {
    this.Assignments = new List<Assignment>();
  }
  // ...
}
// ...
public class Parser : ParserBase<Program> {

  // program ::= "PROGRAM" identifier "BEGIN" { assignment ";" } "END." ;
  public override Program Parse() {
    Program program = new Program();
    Log( "ParseProgram" );
    Parse( () => {
      Consume("PROGRAM");
      program.Identifier = ParseIdentifier();
      Consume("BEGIN");
      Repeat( () => {
        program.Assignments.Add(ParseAssignment());
        Consume(";");
      });
      Consume("END.");
    }).OrThrow("Failed to parse Program");
    return program;
  }
// ...
}
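
The generated Parse method reads almost like the grammar rule because it is built from a few primitives such as Consume and Repeat. To give an idea of how primitives of this kind can work, here is a minimal, self-contained sketch. It is not the code from parsable.cs: MiniParser, Extract and ParseAssignedNames are hypothetical names, and skipping whitespace and signalling failure with FormatException are assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the helpers used by generated parsers.
// This is NOT the code in parsable.cs; it only illustrates the idea.
public class MiniParser {
  private readonly string source;
  private int position = 0;

  public MiniParser(string source) { this.source = source; }

  private void SkipWhitespace() {
    while (position < source.Length && char.IsWhiteSpace(source[position])) {
      position++;
    }
  }

  // Match a literal at the current position or fail.
  public void Consume(string literal) {
    SkipWhitespace();
    if (!source.Substring(position).StartsWith(literal, StringComparison.Ordinal)) {
      throw new FormatException("Expected '" + literal + "' at " + position);
    }
    position += literal.Length;
  }

  // Match a regular expression at the current position and return group 1.
  public string Extract(string pattern) {
    SkipWhitespace();
    Match match = new Regex(@"\G(?:" + pattern + ")").Match(source, position);
    if (!match.Success) {
      throw new FormatException("Expected /" + pattern + "/ at " + position);
    }
    position += match.Length;
    return match.Groups[1].Value;
  }

  // Run the step zero or more times. When an attempt fails, the position
  // is restored, so a partially consumed attempt leaves no trace.
  public void Repeat(Action step) {
    while (true) {
      int saved = position;
      try { step(); }
      catch (FormatException) { position = saved; return; }
    }
  }

  // Hand-written counterpart of the program rule, restricted to numeric
  // assignments; it returns the names that were assigned to.
  public static List<string> ParseAssignedNames(string text) {
    var parser = new MiniParser(text);
    var names = new List<string>();
    parser.Consume("PROGRAM");
    parser.Extract("([A-Z][A-Z0-9]*)");
    parser.Consume("BEGIN");
    parser.Repeat(() => {
      string name = parser.Extract("([A-Z][A-Z0-9]*)");
      parser.Consume(":=");
      parser.Extract("(-?[1-9][0-9]*)");
      parser.Consume(";");
      names.Add(name);  // only recorded once the whole step succeeded
    });
    parser.Consume("END.");
    return names;
  }

  public static void Main() {
    var names = ParseAssignedNames("PROGRAM DEMO1 BEGIN A:=3; B:=45; END.");
    Console.WriteLine(string.Join(",", names));  // A,B
  }
}
```

In this sketch the third pass through Repeat first reads END as an identifier, then fails on the missing ":=". Restoring the saved position is what allows the final Consume("END.") to succeed.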

If no file is provided, input is read from standard input.

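Save the generated parser to a file (for example as pascal.cs, using the --output option or by redirecting standard output), combine it with parsable.cs, and add a minimal driver application: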

// run.cs - a minimal driver application of HPG generated parsers
using System;
using System.IO;

public class Runner {
  public static void Main(string[] args) {
    string source = File.ReadAllText(args[0]);

    Parser parser = new Parser();
    parser.Parse(source);

    Console.WriteLine(parser.AST);
  }
}

Compile and run ...

$ mcs run.cs pascal.cs generator/parsable.cs 
$ mono run.exe example/pascal/example.pascal

The output is a string representation of the resulting AST:

new Program() {
  Identifier = new Identifier() { Name = "DEMO1"},
  Assignments = new List<Assignment>() {
    new Assignment() {
      Identifier = new Identifier() { Name = "A"},
      Expression = new Number() { Value = "3" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "B"},
      Expression = new Number() { Value = "45" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "H"},
      Expression = new Number() { Value = "-100023" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "C"},
      Expression = new Identifier() { Name = "A" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "D123"},
      Expression = new Identifier() { Name = "B34A" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "BABOON"},
      Expression = new Identifier() { Name = "GIRAFFE" }
    },
    new Assignment() {
      Identifier = new Identifier() { Name = "TEXT"},
      Expression = new String() { Text = "Hello world!" }
    }
  }
}
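
Because the AST consists of plain classes with properties, it can be processed with ordinary C# code. The sketch below resolves every assignment to a literal value. So that it compiles on its own, it declares stand-in classes that mirror the shape of the printed AST. These are assumptions rather than the generated code: in particular the Expression interface is a guess at how the alternatives are related, and Evaluator is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Stand-in AST classes mirroring the printed AST above. The real classes
// are generated by HPG; the Expression interface is an assumption.
public interface Expression {}
public class Identifier : Expression { public string Name  { get; set; } }
public class Number     : Expression { public string Value { get; set; } }
public class String     : Expression { public string Text  { get; set; } }

public class Assignment {
  public Identifier Identifier { get; set; }
  public Expression Expression { get; set; }
}

public class Program {
  public Identifier Identifier { get; set; }
  public List<Assignment> Assignments { get; set; }
}

// Hypothetical consumer of the AST, not part of HPG.
public static class Evaluator {
  // Map each assigned name to a literal value, following identifiers to the
  // value they received earlier (null if they were never assigned).
  public static Dictionary<string, string> Evaluate(Program program) {
    var values = new Dictionary<string, string>();
    foreach (Assignment assignment in program.Assignments) {
      Expression expression = assignment.Expression;
      string value = null;
      if (expression is Number) {
        value = ((Number)expression).Value;
      } else if (expression is String) {
        value = ((String)expression).Text;
      } else if (expression is Identifier) {
        values.TryGetValue(((Identifier)expression).Name, out value);
      }
      values[assignment.Identifier.Name] = value;
    }
    return values;
  }

  public static void Main() {
    var program = new Program() {
      Identifier = new Identifier() { Name = "DEMO1" },
      Assignments = new List<Assignment>() {
        new Assignment() {
          Identifier = new Identifier() { Name = "A" },
          Expression = new Number() { Value = "3" }
        },
        new Assignment() {
          Identifier = new Identifier() { Name = "C" },
          Expression = new Identifier() { Name = "A" }
        }
      }
    };
    Console.WriteLine(Evaluate(program)["C"]);  // 3
  }
}
```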

Documentation

Consult the repository's wiki for more background, tutorials and annotated examples.
