lucasb-eyer / flex-bison-indentation

Licence: MIT license
An example of how to correctly parse python-like indentation-scoped files using flex (and bison).

Programming languages: Lex, Yacc, CMake

flex-bison-indentation

An example of how to correctly parse python-like indentation-scoped files using flex (and bison).

Besides that, this project also serves as a template for a CMake-based flex and bison parser project, and it includes rules to track the scanner's current line and column.

Quick overview

All the magic happens in the scanner, which emits TOK_INDENT and TOK_OUTDENT tokens whenever the level of indentation increases or decreases. The parser in this project just echoes the tokens.

The scanner starts in <normal> mode, which is where you put your regular rules. Whenever a newline is encountered in that mode, the scanner enters <indent> mode, in which it keeps counting spaces and tabs (ignoring blank lines) until it sees any other character. At that point it emits a TOK_INDENT, one or more TOK_OUTDENT tokens, or neither, depending on how the indentation changed, and goes back to <normal> mode.
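The bookkeeping described above can be sketched in plain C. This is not the repository's flex code, only a minimal illustration under stated assumptions: the names IndentStack, indent_init, indent_update and leading_width are hypothetical, the token values are made up, and a tab is assumed to advance to the next multiple of 8 columns.

```c
#include <assert.h>

/* Token kinds for this sketch; the names mirror the README, the values are made up. */
enum { TOK_NONE = 0, TOK_INDENT = 1, TOK_OUTDENT = 2 };

#define MAX_DEPTH 64

/* Stack of the indentation widths of all currently open blocks. */
typedef struct {
    int levels[MAX_DEPTH];
    int depth; /* index of the top of the stack */
} IndentStack;

void indent_init(IndentStack *s)
{
    s->levels[0] = 0; /* column 0 is always open */
    s->depth = 0;
}

/*
 * Width of the leading whitespace of a line, or -1 for a blank line
 * (blank lines do not affect indentation).
 * Assumption: a tab advances to the next multiple of 8.
 */
int leading_width(const char *line)
{
    int w = 0;
    for (; *line == ' ' || *line == '\t'; line++)
        w = (*line == '\t') ? ((w + 8) & ~7) : w + 1;
    return (*line == '\n' || *line == '\0') ? -1 : w;
}

/*
 * Call with the leading width of each non-blank line.
 * Returns the kind of token to emit and stores in *count how many
 * of them are needed (0, exactly 1 indent, or 1..n outdents).
 * An outdent to a width that matches no open level is not reported
 * as an error in this sketch.
 */
int indent_update(IndentStack *s, int width, int *count)
{
    *count = 0;
    if (width > s->levels[s->depth]) {
        assert(s->depth + 1 < MAX_DEPTH);
        s->levels[++s->depth] = width;
        *count = 1;
        return TOK_INDENT;
    }
    while (s->depth > 0 && width < s->levels[s->depth]) {
        s->depth--;
        (*count)++;
    }
    return *count > 0 ? TOK_OUTDENT : TOK_NONE;
}
```

Calling indent_update with a width of 0 at the end of the input closes every block that is still open, which corresponds to the outdents emitted when the end of the file is reached.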

The scanner also does its best to keep track of the column where the current match starts, which can be accessed (and changed) through yycolumn. Flex keeps track of the line number internally.

All of this means that you can write the parser as usual: use the TOK_INDENT and TOK_OUTDENT tokens to handle indentation, read the line of a token through @1.first_line (and @1.last_line if the token spans multiple lines, which I don't recommend), and read its column range through @1.first_column and @1.last_column.

One caveat is that if one of your rules includes a newline character and matches text longer than one character, you will need to reset yycolumn by hand.
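To see why a newline inside a longer match needs special care, here is a minimal C sketch of position tracking that walks the matched text character by character. It is an illustration, not the repository's code: Cursor, Location and track_match are hypothetical names, and 1-based lines and columns are an assumption. In a flex scanner, logic of this kind is commonly placed in the YY_USER_ACTION macro, with yytext and yyleng as the matched text and its length.

```c
#include <assert.h>
#include <stddef.h>

/* Range covered by one token, with the same four fields as a bison location. */
typedef struct {
    int first_line, first_column;
    int last_line, last_column;
} Location;

/* Position of the next unread character (assumed 1-based in this sketch). */
typedef struct {
    int line;
    int column;
} Cursor;

/*
 * Record where the match starts and ends, then advance the cursor past it.
 * A newline anywhere in the match moves the cursor to column 1 of the next
 * line, which is the "reset by hand" step a simple column += length
 * update would miss.
 */
void track_match(Cursor *cur, const char *text, size_t len, Location *loc)
{
    assert(cur != NULL && text != NULL && loc != NULL);

    loc->first_line = loc->last_line = cur->line;
    loc->first_column = loc->last_column = cur->column;

    for (size_t i = 0; i < len; i++) {
        /* The end of the range is the position of the last matched character. */
        loc->last_line = cur->line;
        loc->last_column = cur->column;

        if (text[i] == '\n') {
            cur->line++;
            cur->column = 1;
        } else {
            cur->column++;
        }
    }
}
```

For a rule that matches only a single newline, resetting the column in that rule's action is enough; the loop above only matters when the newline is embedded in a longer match.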

Another caveat is that, for technical reasons, the column range of the TOK_INDENT and TOK_OUTDENT tokens is the first character of the line or, for outdents caused by reaching the end of the file, 0-0.

Until I write a full tutorial, I recommend you look at the code; it is short and fully commented.
