weltling / parle

Licence: other
Parser and lexer for PHP

Parle provides lexing and parsing facilities for PHP

Lexing and parsing are used widely in the PHP core and extensions. Such functionality is usually implemented in C/C++ and depends on tools like flex, re2c, Bison, LEMON or similar. With Parle, it is possible to implement lexing and parsing in PHP while relying on the features and principles of the parser/lexer generator tools for C/C++. The Lexer and Parser classes live in the Parle namespace. The implementation is based on the work of Ben Hanson.

The lexer uses regex pattern matching similar to flex. The parser is LALR(1).
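As a rough illustration of what "pattern matching similar to flex" usually means, the sketch below applies the flex convention in plain PHP: at each position the rule with the longest match wins, and a tie goes to the rule that was listed first. It does not use Parle and says nothing about its internals. The function name `tokenize` and the rule table are hypothetical, and that Parle resolves rule conflicts the same way is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: a tiny flex-style matcher in plain PHP.
 * `tokenize` and the rule table are hypothetical names for this sketch.
 *
 * @param array<string,string> $rules token name => PCRE pattern body
 * @return array<int,array{0:string,1:string}> list of [name, matched text]
 */
function tokenize(string $input, array $rules): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $bestName = null;
        $bestText = '';

        foreach ($rules as $name => $pattern) {
            // The A modifier anchors the match at $offset.
            $matched = preg_match('/' . $pattern . '/A', $input, $m, 0, $offset);
            // Only a strictly longer match replaces the current best,
            // so on a tie the earlier rule is kept.
            if ($matched === 1 && strlen($m[0]) > strlen($bestText)) {
                $bestName = $name;
                $bestText = $m[0];
            }
        }

        if ($bestName === null) {
            throw new RuntimeException("Unknown input at offset $offset");
        }

        $tokens[] = [$bestName, $bestText];
        $offset += strlen($bestText);
    }

    return $tokens;
}

$rules = [
    'COMMA'   => ',',
    'CRLF'    => '\r\n',
    'DECIMAL' => '\d+',
];

foreach (tokenize("3,42\r\n", $rules) as [$name, $text]) {
    echo $name, PHP_EOL;
}
```

For the input `"3,42\r\n"` this prints DECIMAL, COMMA, DECIMAL, CRLF, one per line. A generated lexer compiles all rules into a single state machine instead of retrying every pattern at each offset; the loop above only shows the selection rule.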

PHP 7.4 and above is supported. A C++14 capable compiler is required. As of version 0.7.3, parle can optionally be compiled with internal UTF-32 support, which makes it possible to use Unicode character classes in patterns.

The full extension documentation is available in the PHP Manual.

Installation

Read the INSTALL.md documentation.

Example: tokenizing a comma-separated integer list

use Parle\Token;
use Parle\Lexer;
use Parle\LexerException;

/* name => id */
$token = array(
        "COMMA" => 1,
        "CRLF" => 2,
        "DECIMAL" => 3,
);
/* id => name */
$tokenIdToName = array_flip($token);

$lex = new Lexer;
$lex->push("[\x2c]", $token["COMMA"]);
$lex->push("[\r][\n]", $token["CRLF"]);
$lex->push("[\d]+", $token["DECIMAL"]);
$lex->build();

$in = "0,1,2\r\n3,42,5\r\n6,77,8\r\n";

$lex->consume($in);

while (true) {
        $lex->advance();
        $tok = $lex->getToken();

        if (Token::EOI == $tok->id) {
                /* End of input; EOI has no entry in $tokenIdToName. */
                break;
        }

        if (Token::UNKNOWN == $tok->id) {
                throw new LexerException("Unknown token '{$tok->value}' at offset {$lex->marker}.");
        }

        echo "TOKEN: ", $tokenIdToName[$tok->id], PHP_EOL;
}

Example: parsing a comma-separated number list

use Parle\Lexer;
use Parle\Parser;
use Parle\ParserException;

$p = new Parser;
$p->token("CRLF");
$p->token("COMMA");
$p->token("INTEGER");
$p->token("'\"'");
$p->push("START", "RECORDS");
$prod_record_0 = $p->push("RECORDS", "RECORD CRLF");
$prod_record_1 = $p->push("RECORDS", "RECORDS RECORD CRLF");
$prod_int_0 = $p->push("RECORD", "INTEGER");
$prod_int_1 = $p->push("RECORD", "RECORD COMMA INTEGER");
$p->push("DECIMAL", "INTEGER COMMA INTEGER"); /* Production index unused. */
$prod_dec_0 = $p->push("RECORD", "'\"' DECIMAL '\"'");
$prod_dec_1 = $p->push("RECORD", "RECORD COMMA '\"' DECIMAL '\"'");
$p->build();

$lex = new Lexer;
$lex->push("[\x2c]", $p->tokenId("COMMA"));
$lex->push("[\r][\n]", $p->tokenId("CRLF"));
$lex->push("[\d]+", $p->tokenId("INTEGER"));
$lex->push("[\x22]", $p->tokenId("'\"'"));
$lex->build();

/* Specifically using comma as both list separator and as a decimal mark. */
$in = "000,111,222\r\n\"333,3\",444,555\r\n666,777,\"888,8\"\r\n";

$p->consume($in, $lex);

do {
	switch ($p->action) {
		case Parser::ACTION_ERROR:
			$err = $p->errorInfo();
			if (Parser::ERROR_UNKNOWN_TOKEN == $err->id) {
				$tok = $err->token;
				$msg = "Unknown token '{$tok->value}' at offset {$err->position}";
			} else if (Parser::ERROR_NON_ASSOCIATIVE == $err->id) {
				$tok = $err->token;
				$msg = "Token '{$tok->id}' at offset {$lex->marker} is not associative";
			} else if (Parser::ERROR_SYNTAX == $err->id) {
				$tok = $err->token;
				$msg = "Syntax error at offset {$lex->marker}";
			} else {
				$msg = "Parse error";
			}
			throw new ParserException($msg);
		case Parser::ACTION_SHIFT:
		case Parser::ACTION_GOTO:
		case Parser::ACTION_ACCEPT:
			/* Nothing to do here. A bare "continue" inside a switch acts like "break" and triggers a warning as of PHP 7.3. */
			break;
		case Parser::ACTION_REDUCE:
			switch ($p->reduceId) {
				case $prod_int_0:
					/* INTEGER */
					echo $p->sigil(), PHP_EOL;
					break;
				case $prod_int_1:
					/* RECORD COMMA INTEGER */
					echo $p->sigil(2), PHP_EOL;
					break;
				case $prod_dec_0:
					/* '\"' DECIMAL '\"' */
					echo $p->sigil(1), PHP_EOL;
					break;
				case $prod_dec_1:
					/* RECORD COMMA '\"' DECIMAL '\"' */
					echo $p->sigil(3), PHP_EOL;
					break;
				case $prod_record_0:
				case $prod_record_1:
					echo "=====", PHP_EOL;
					break;
			}
			break;
	}
	$p->advance();
} while (Parser::ACTION_ACCEPT != $p->action);