All Projects → pfmiles → dropincc.java

pfmiles / dropincc.java

Licence: other
A small and easy to use parser generator. Specify your grammar in pure java and compile dynamically. Especially suitable for DSL creation in java.

Programming Languages

java
68154 projects - #9 most used programming language
HTML
75241 projects

Projects that are alternatives of or similar to dropincc.java

RBNF
This project's lifetime has ended. The successor is https://github.com/thautwarm/frontend-for-free which is WIP. You can check lark-parser project which is a good alt.
Stars: ✭ 39 (-56.67%)
Mutual labels:  parser-generator, ebnf
ebnf
EBNF parser and generic parser generator for Ruby.
Stars: ✭ 101 (+12.22%)
Mutual labels:  parser-generator, ebnf
nimly
Lexer Generator and Parser Generator as a Library in Nim.
Stars: ✭ 113 (+25.56%)
Mutual labels:  parser-generator, ebnf
YaccConstructor
Platform for parser generators and other grammarware research and development. GLL, RNGLR, graph parsing algorithms, and many others are included.
Stars: ✭ 36 (-60%)
Mutual labels:  parser-generator, ebnf
human-parser-generator
A straightforward recursive descent Parser Generator with a focus on "human" code generation and ease of use.
Stars: ✭ 27 (-70%)
Mutual labels:  parser-generator, ebnf
Covfefe
A parser for nondeterministic context free languages
Stars: ✭ 49 (-45.56%)
Mutual labels:  parser-generator, ebnf
Rust Peg
Parsing Expression Grammar (PEG) parser generator for Rust
Stars: ✭ 836 (+828.89%)
Mutual labels:  parser-generator
Irony
A modified version of the Irony project (https://irony.codeplex.com) with .NET Core support
Stars: ✭ 127 (+41.11%)
Mutual labels:  parser-generator
Pigeon
Command pigeon generates parsers in Go from a PEG grammar.
Stars: ✭ 603 (+570%)
Mutual labels:  parser-generator
Pegjs
PEG.js: Parser generator for JavaScript
Stars: ✭ 4,176 (+4540%)
Mutual labels:  parser-generator
Reduce.jl
Symbolic parser generator for Julia language expressions using REDUCE algebra term rewriter
Stars: ✭ 172 (+91.11%)
Mutual labels:  parser-generator
Ecsharp
Home of LoycCore, the LES language of Loyc trees, the Enhanced C# parser, the LeMP macro preprocessor, and the LLLPG parser generator.
Stars: ✭ 141 (+56.67%)
Mutual labels:  parser-generator
Antlr4
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
Stars: ✭ 11,227 (+12374.44%)
Mutual labels:  parser-generator
Lug
Parsing expression grammar (PEG) embedded domain specific language and parsing machine for C++17
Stars: ✭ 44 (-51.11%)
Mutual labels:  parser-generator
Php Zephir Parser
The Zephir Parser delivered as a C extension for the PHP language.
Stars: ✭ 129 (+43.33%)
Mutual labels:  parser-generator
Owl
A parser generator for visibly pushdown languages.
Stars: ✭ 645 (+616.67%)
Mutual labels:  parser-generator
Npeg
PEGs for Nim, another take
Stars: ✭ 163 (+81.11%)
Mutual labels:  parser-generator
Cpp Peglib
A single file C++ header-only PEG (Parsing Expression Grammars) library
Stars: ✭ 435 (+383.33%)
Mutual labels:  parser-generator
Pasukon
JavaScript practical parser generator library using combinators
Stars: ✭ 107 (+18.89%)
Mutual labels:  parser-generator
Lalrpop
LR(1) parser generator for Rust
Stars: ✭ 1,929 (+2043.33%)
Mutual labels:  parser-generator

dropincc.java Quick Start.

What is dropincc.java?

  • A small, and easy to use parser generator.
  • It is designed to ease your DSL building process in java.
  • No new grammar need to learn, no additional command line tool need to use. All you need to do is define your DSL grammar in your favorite editor using pure java.
  • It has very nice BNF-alike fluent interface API to define grammar rules.
  • Recognizes LL(*) grammar, and supports powerful semantic predicates that could even handle context-sensitive parsing.
  • No other dependencies except for the built-in java library, requires JDK 1.6 or above.

Hello World.

As to parser generators, a full-featured calculator is more or less like a Hello World example.
Here is this resulting Hello World example in dropincc.java:

/**
 * EBNF of Calculator:
 * <pre>
 * calc ::= expr $
 * expr ::= addend (('+'|'-') addend)*
 * addend ::= factor (('*'|'/') factor)*
 * factor ::= '(' expr ')'
 *          | '\\d+(\\.\\d+)?'
 * </pre>
 */
public static void main(String... args) throws Throwable {
    Lang c = new Lang("Calculator");
    Grule expr = c.newGrule();
    c.defineGrule(expr, CC.EOF).action(new Action() {
        public Double act(Object matched) {
            return (Double) ((Object[]) matched)[0];
        }
    });
    TokenDef a = c.newToken("\\+");
    Grule addend = c.newGrule();
    expr.define(addend, CC.ks(a.or("\\-"), addend)).action(new Action() {
        public Double act(Object matched) {
            Object[] ms = (Object[]) matched;
            Double a0 = (Double) ms[0];
            Object[] aPairs = (Object[]) ms[1];
            for (Object p : aPairs) {
                String op = (String) ((Object[]) p)[0];
                Double a = (Double) ((Object[]) p)[1];
                if ("+".equals(op)) {
                    a0 += a;
                } else {
                    a0 -= a;
                }
            }
            return a0;
        }
    });
    TokenDef m = c.newToken("\\*");
    Grule factor = c.newGrule();
    addend.define(factor, CC.ks(m.or("/"), factor)).action(new Action() {
        public Double act(Object matched) {
            Object[] ms = (Object[]) matched;
            Double f0 = (Double) ms[0];
            Object[] fPairs = (Object[]) ms[1];
            for (Object p : fPairs) {
                String op = (String) ((Object[]) p)[0];
                Double f = (Double) ((Object[]) p)[1];
                if ("*".equals(op)) {
                    f0 *= f;
                } else {
                    f0 /= f;
                }
            }
            return f0;
        }
    });
    factor.define("\\(", expr, "\\)").action(new Action() {
        public Double act(Object matched) {
            return (Double) ((Object[]) matched)[1];
        }
    }).alt("\\d+(\\.\\d+)?").action(new Action() {
        public Double act(Object matched) {
            return Double.parseDouble((String) matched);
        }
    });
    Exe exe = c.compile();
    System.out.println(exe.eval("1 +2+3+(4 +5*6*7*(64/8/2/(2/1 )/1)*8  +9  )+   10"));
}

As you can see, these dozens of lines of pure java code defined a non-trivial calculator which supports parentheses operation.
You can run the code above to get the output: 3389.0. You can check this answer in any calculator app shipped along with your operating system.

How to build such a Calculator? Step by step:

First, write down the grammar rules as EBNF:

calc ::= expr $
expr ::= addend (('+'|'-') addend)*
addend ::= factor (('*'|'/') factor)*
factor ::= '(' expr ')'
         | '\\d+(\\.\\d+)?'

Terminals are in single quotes and non-terminals are those meaningful words. '$' as the EOF terminal.
Then, translate the EBNF line-by-line using dropincc.java API:

Lang c = new Lang("Calculator");
Grule expr = c.newGrule();
// calc ::= expr $
c.defineGrule(expr, CC.EOF);

TokenDef a = c.newToken("\\+");
Grule addend = c.newGrule();
// expr ::= addend (('+'|'-') addend)*
expr.define(addend, CC.ks(a.or("\\-"), addend));

TokenDef m = c.newToken("\\*");
Grule factor = c.newGrule();
// addend ::= factor (('*'|'/') factor)*
addend.define(factor, CC.ks(m.or("/"), factor));

/*
 * factor ::= '(' expr ')'
 *          | '\\d+(\\.\\d+)?'
 */
factor.define("\\(", expr, "\\)")
.alt("\\d+(\\.\\d+)?");

It's a very straightforward step. Just need a little explaination:

  • Basic grammar elements are 'TokenDef'(terminal) and 'Grule'(non-terminal), and you are not restricted to define TokenDefs in prior to all other elements, you can just define TokenDefs instantly, as regular expressions, while you defining the structure of your grammar. Like expr.define(addend, CC.ks(a.or("\\-"), addend));, the "\\-" is a regex, it defines a TokenDef on-the-fly.
  • Currently use built-in java regular expression to define token schemes, beware of java regExp special characters: <([{\^-=$!|]})?*+.>, they should be escaped when you need the character itself.
  • Elements concatenations are expressed in comma separated sequences, expr, CC.EOF means expr $ for example.
  • Alternative productions are expressed as a alt method call, for example, the factor rule has two alternative productions, the second one is defined as .alt("\\d+(\\.\\d+)?");.
  • Inline alternatives are expressed as a or method call on elements. See a.or("\\-") or m.or("/") in above definitions.
  • Kleene star/cross notations is expressed as CC.ks and CC.kc function call, and CC.op for optional grammar element.

Ok, as you have successfully translated EBNF to the form of dropincc.java required, it's time to add actions to your grammar rules.

Well, those actions are in fact 'closures', it can catch-up any variables in your current context. In java, this is done by defining an anonymous inner class.
For example, I first added two actions for the two alternative productions of factor rule:

factor.define("\\(", expr, "\\)").action(new Action() {
    public Double act(Object matched) {
        return (Double) ((Object[]) matched)[1];
    }
}).alt("\\d+(\\.\\d+)?").action(new Action() {
    public Double act(Object matched) {
        return Double.parseDouble((String) matched);
    }
});

Action is the interface to define such a closure. The matched param of the only method act in Action are an array of matched tokens or a single matched token.
Whether it is array or single token is up to the number of elements in the corresponding alternative production:

  • The matched parameter in the action of the first alternative production is a three tokens array: ["(", returnedValueOfEnclosingExprRule, ")"]
  • The second alt's matched parameter is a single string object represents the matched digit.
  • The return value of actions are treated as the returned value of the whole rule when parsing. It could be used for further processing by the invoking rule, for example, the second element of the first matched array is a returned value from the enclosing expr rule.

When you have done adding all proper actions to all grammar rule's productions, you are almost winning. The last step is to compile your new created language:

Exe exe = c.compile();

Then, enjoy it:

System.out.println(exe.eval("1 +2+3+(4 +5*6*7*(64/8/2/(2/1 )/1)*8  +9  )+   10"));

The resulting exe object is thread-safe and is suggested to be cached for future use.

There lies many stuff of complexity in the implementation of dropincc.java, but to the end user, I believe it could turns out to be a form of simplicity.

That's it, dropincc.java, a new parser generator which you have never seen. It is not just 'yet another compiler compiler', because there is already a whole bunch of tools to do such kind of general purposed parser generation. It is aimed to help you create DSLs in java. Which is a neglected topic in java community.

In order to make the example code above run, make sure that you are using jdk 1.6 or above and all you need to do is put the dropincc.java's compiled jar file in your classpath, no other dependencies.

More examples and documentation coming soon... You could explore the wiki page or my blog to find more information.

NOTES

  • The current stable release is v0.2.3.
  • The developing v0.3.0 and master branch may not be compatible with v0.2.x, but v0.2.x would be continuously maintained at branch 0.2.x.
  • v0.3.0 will be a BIG release, and it surely will provide you a new powerful DSL tool. But... currently, we could just hold on v0.2.x because it's released and runs dozens of millions of times every day in more than one important production systems. The correctness is proven.
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].