SilentByte / sb-dynlex

Licence: MIT license
Configurable lexer for PHP featuring a fluent API.

DynLex Dynamically Configurable Lexer Library

This is the main repository of the SilentByte DynLex Lexer Library.

DynLex is an easy-to-use PHP library for creating and using dynamically configurable lexers through a fluent interface.

The official documentation can be found here: http://docs.silentbyte.com/dynlex

Installation

To install the latest version, either check out the repository and include the source directly, or use Composer:

$ composer require silentbyte/sb-dynlex

General Usage

DynLex allows the definition of a set of lexer rules that determine how the input is scanned and what tokens can be created. The following code is a simple example that tokenizes words and numbers:

<?php

use SilentByte\DynLex\DynLexUtils;
use SilentByte\DynLex\DynLexBuilder;

$input = "Hello world 8273 this 919 28 is a 12 39 44 string "
    . "consisting of 328 words 003 and numbers 283";

$lexer = (new DynLexBuilder())
    ->rule('[a-zA-Z]+', 'word')
    ->rule('[0-9]+',    'number')
    ->skip('.')
    ->build();

$tokens = $lexer->collect($input);
DynLexUtils::dumpTokens($tokens);

// [Output]
// -------------------------------------
// tag            off    ln   col  value
// -------------------------------------
// word             0     1     1  Hello
// word             6     1     7  world
// number          12     1    13  8273
// word            17     1    18  this
// number          22     1    23  919
// ...

?>

DynLex also allows the specification of lexer actions that will be executed each time the associated token is matched in the input stream. Extending the previous example, we can implement a program that counts the number of words and numbers within the input stream:

<?php

use SilentByte\DynLex\DynLexUtils;
use SilentByte\DynLex\DynLexBuilder;

$words = 0;
$numbers = 0;

$input = "hello world 8273 this 919 28 is a 12 39 44 string "
    . "consisting of 328 words 003 and numbers 283";

$lexer = (new DynLexBuilder())
    ->rule('[a-z]+', 'word',   function() use (&$words)   { $words++; })
    ->rule('[0-9]+', 'number', function() use (&$numbers) { $numbers++; })
    ->skip('.')
    ->build();

$tokens = $lexer->collect($input);
DynLexUtils::dumpTokens($tokens);

echo "$words words found.\n";
echo "$numbers numbers found.\n";

// [Output]
// -------------------------------------
// tag            off    ln   col  value
// -------------------------------------
// word             0     1     1  hello
// word             6     1     7  world
// number          12     1    13  8273
// word            17     1    18  this
// number          22     1    23  919
// ...
// -------------------------------------
// 11 words found.
// 9 numbers found.

?>

Using this concept, it is possible to easily create lexers for different kinds of applications. A more elaborate example that demonstrates how to use DynLex to create HTML syntax highlighters for programming languages can be found under examples/04-syntax-highlighting.php.

The examples folder contains further examples of how to use DynLex, and the source code contains more detailed documentation.

Contributing

See CONTRIBUTING.md.

FAQ

Under what license is DynLex released?

MIT license. Check out license.txt for details. More information regarding the MIT license can be found here: https://opensource.org/licenses/MIT

Why do rules sometimes not get matched correctly?

Rules that may conflict with each other must be listed in order from most specific to most general. For example, if you want to tokenize integers ([0-9]+) and floats ([0-9]+\.[0-9]+), the rule for floats must be listed before the rule for integers, because the integer rule also matches the leading digits of a float.
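
The effect of rule order can be illustrated with a small, self-contained sketch in plain PHP. It does not use DynLex and is not DynLex's implementation: the function tokenizeFirstMatch and its [tag, value] token format are hypothetical, and the sketch assumes a simple "first matching rule wins" strategy in which unmatched characters are skipped.

```php
<?php

/**
 * Hypothetical first-match-wins tokenizer (illustration only, not DynLex).
 * Rules are tried in the order given; the first rule that matches at the
 * current offset produces the token.
 *
 * @param array<string, string> $rules Map of tag => regex body without
 *                                     delimiters (must not contain '/').
 * @return array<int, array{0: string, 1: string}> List of [tag, value] pairs.
 */
function tokenizeFirstMatch(array $rules, string $input): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $matched = false;
        foreach ($rules as $tag => $pattern) {
            // \G anchors the match at the offset passed to preg_match().
            $regex = '/\G(?:' . $pattern . ')/';
            if (preg_match($regex, $input, $m, 0, $offset) === 1 && $m[0] !== '') {
                $tokens[] = [$tag, $m[0]];
                $offset += strlen($m[0]);
                $matched = true;
                break; // First matching rule wins.
            }
        }
        if (!$matched) {
            $offset++; // No rule matched: skip one character.
        }
    }

    return $tokens;
}

$input = '3.14 42';

// General rule first: the integer rule consumes the digits before the dot.
$generalFirst = tokenizeFirstMatch(
    ['integer' => '[0-9]+', 'float' => '[0-9]+\.[0-9]+'],
    $input
);

// Specific rule first: the float rule gets the chance to match "3.14".
$specificFirst = tokenizeFirstMatch(
    ['float' => '[0-9]+\.[0-9]+', 'integer' => '[0-9]+'],
    $input
);

// Render a token list as "tag(value) tag(value) ...".
$format = fn(array $tokens): string => implode(' ', array_map(
    fn(array $t): string => "{$t[0]}({$t[1]})",
    $tokens
));

echo $format($generalFirst), "\n";  // integer(3) integer(14) integer(42)
echo $format($specificFirst), "\n"; // float(3.14) integer(42)
```

With the general rule listed first, the float 3.14 is split into two integers and the float rule never fires. Listing the float rule first yields the intended tokens.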
