
BioJulia / Automa.jl

Licence: other
A Julia code generator for regular expressions

Programming Languages

julia
2034 projects

Projects that are alternatives of or similar to Automa.jl

Shallow Clone
Make a shallow clone of an object, array or primitive.
Stars: ✭ 23 (-79.28%)
Mutual labels:  regular-expression
Regexr
For composing regular expressions without the need for double-escaping inside strings.
Stars: ✭ 53 (-52.25%)
Mutual labels:  regular-expression
Eval Sql.net
SQL Eval Function | Dynamically Evaluate Expression in SQL Server using C# Syntax
Stars: ✭ 84 (-24.32%)
Mutual labels:  regular-expression
Regex
A sane interface for php's built in preg_* functions
Stars: ✭ 909 (+718.92%)
Mutual labels:  regular-expression
Tempreites
One-file semantic DSL-free templates direto da roça for the browser and server.
Stars: ✭ 31 (-72.07%)
Mutual labels:  regular-expression
Hyperscan Java
Match tens of thousands of regular expressions within milliseconds - Java bindings for Intel's hyperscan 5
Stars: ✭ 66 (-40.54%)
Mutual labels:  regular-expression
Libphorward
C/C++ library for dynamic data structures, regular expressions, lexical analysis & more...
Stars: ✭ 18 (-83.78%)
Mutual labels:  regular-expression
To Regex Range
Pass two numbers, get a regex-compatible source string for matching ranges. Fast compiler, optimized regex, and validated against more than 2.78 million test assertions. Useful for creating regular expressions to validate numbers, ranges, years, etc.
Stars: ✭ 97 (-12.61%)
Mutual labels:  regular-expression
Rexrex
🦖 Composable JavaScript regular expressions
Stars: ✭ 34 (-69.37%)
Mutual labels:  regular-expression
Regex
A Regular Expression game for Android
Stars: ✭ 80 (-27.93%)
Mutual labels:  regular-expression
Regexanalyzer
Regular Expression Analyzer and Composer for Node.js / XPCOM / Browser Javascript, PHP, Python
Stars: ✭ 29 (-73.87%)
Mutual labels:  regular-expression
Place2live
Analysis of the characteristics of different countries
Stars: ✭ 30 (-72.97%)
Mutual labels:  regular-expression
Nanomatch
Fast, minimal glob matcher for node.js. Similar to micromatch, minimatch and multimatch, but without support for extended globs (extglobs), posix brackets or braces, and with complete Bash 4.3 wildcard support: ("*", "**", and "?").
Stars: ✭ 79 (-28.83%)
Mutual labels:  regular-expression
Whitespace Regex
Regular expression for matching the whitespace in a string.
Stars: ✭ 9 (-91.89%)
Mutual labels:  regular-expression
Globbing
Introduction to "globbing" or glob matching, a programming concept that allows "filepath expansion" and matching using wildcards.
Stars: ✭ 86 (-22.52%)
Mutual labels:  regular-expression
R4ds
📖 R for data import/export , clean, wrangling, exploration, visualization, & analysis with R https://xiangyunhuang.github.io/r4ds/
Stars: ✭ 19 (-82.88%)
Mutual labels:  regular-expression
Emoji Regex
A regular expression to match all Emoji-only symbols as per the Unicode Standard.
Stars: ✭ 1,134 (+921.62%)
Mutual labels:  regular-expression
Orchestra
One language to be RegExp's Successor. Visually readable and rich, technically safe and extended, naturally scalable, advanced, and optimized
Stars: ✭ 103 (-7.21%)
Mutual labels:  regular-expression
Computer Science Resources
A list of resources in different fields of Computer Science (multiple languages)
Stars: ✭ 1,316 (+1085.59%)
Mutual labels:  regular-expression
Regen
Tool to generate random strings from Go/RE2 regular expressions (Migrated to https://git.sr.ht/~nilium/regen)
Stars: ✭ 79 (-28.83%)
Mutual labels:  regular-expression

Automa.jl


A Julia package for text validation, parsing, and tokenizing based on a state machine compiler.

[Figure: schema of Automa.jl]

Automa.jl compiles regular expressions into Julia code, which the Julia compiler then compiles into low-level machine code. It is designed to generate very efficient code for scanning large text data, and the generated code is often much faster than handcrafted code. Automa.jl can also insert arbitrary Julia code to be executed on state transitions, which makes it possible, for example, to extract the substrings that match part of a regular expression.
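To make the idea of "code executed in state transitions" concrete, here is a minimal hand-written sketch in plain Julia. It does not use Automa.jl and is not the code Automa.jl generates; the function name `scan_integers` and the two-state layout are assumptions made for illustration only. It accepts unsigned decimal integers separated by single spaces and runs a "mark" action when a number starts and an "emit" action when it ends.

```julia
# Hypothetical hand-written state machine (illustration only, not Automa.jl output).
# States: 1 = expecting the first digit of a number, 2 = inside a number.
function scan_integers(data::String)
    tokens = String[]
    bytes = codeunits(data)
    cs = 1      # current state
    mark = 0    # byte position where the current number started
    p = 1       # current byte position
    while p <= length(bytes)
        b = bytes[p]
        digit = UInt8('0') <= b <= UInt8('9')
        if cs == 1 && digit
            mark = p                        # "enter" action: remember the start
            cs = 2
        elseif cs == 2 && digit
            # stay inside the number; no action
        elseif cs == 2 && b == UInt8(' ')
            push!(tokens, data[mark:p-1])   # "exit" action: emit the substring
            cs = 1
        else
            return tokens, :error           # no valid transition for this byte
        end
        p += 1
    end
    if cs == 2
        push!(tokens, data[mark:p-1])       # emit the final number at end of input
        return tokens, :ok
    end
    return tokens, :incomplete              # input ended where a digit was expected
end

tokens, status = scan_integers("12 7 305")
```

Writing such a loop by hand quickly becomes error-prone as the pattern grows, which is the part Automa.jl automates by deriving the states and transitions from the regular expression.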

This is a number literal tokenizer using Automa.jl (numbers.jl):

# A tokenizer of octal, decimal, hexadecimal and floating point numbers
# =====================================================================

import Automa
import Automa.RegExp: @re_str
const re = Automa.RegExp

# Describe the patterns as regular expressions.
oct      = re"0o[0-7]+"
dec      = re"[-+]?[0-9]+"
hex      = re"0x[0-9A-Fa-f]+"
prefloat = re"[-+]?([0-9]+\.[0-9]*|[0-9]*\.[0-9]+)"
float    = prefloat | re.cat(prefloat | re"[-+]?[0-9]+", re"[eE][-+]?[0-9]+")
number   = oct | dec | hex | float
numbers  = re.cat(re.opt(number), re.rep(re" +" * number), re" *")

# Register action names to regular expressions.
number.actions[:enter] = [:mark]
oct.actions[:exit]     = [:oct]
dec.actions[:exit]     = [:dec]
hex.actions[:exit]     = [:hex]
float.actions[:exit]   = [:float]

# Compile a finite-state machine.
machine = Automa.compile(numbers)

# This generates a PNG file to visualize the state machine (requires Graphviz).
# write("numbers.dot", Automa.machine2dot(machine))
# run(`dot -Tpng -o numbers.png numbers.dot`)

# Bind an action code for each action name.
actions = Dict(
    :mark  => :(mark = p),
    :oct   => :(emit(:oct)),
    :dec   => :(emit(:dec)),
    :hex   => :(emit(:hex)),
    :float => :(emit(:float)),
)

# Generate a tokenizing function from the machine.
context = Automa.CodeGenContext()
@eval function tokenize(data::String)
    tokens = Tuple{Symbol,String}[]
    mark = 0
    $(Automa.generate_init_code(context, machine))
    p_end = p_eof = lastindex(data)
    emit(kind) = push!(tokens, (kind, data[mark:p-1]))
    $(Automa.generate_exec_code(context, machine, actions))
    return tokens, cs == 0 ? :ok : cs < 0 ? :error : :incomplete
end

tokens, status = tokenize("1 0x0123BEEF 0o754 3.14 -1e4 +6.022045e23")

This emits tokens and the final status:

~/.j/v/Automa (master) $ julia -qL example/numbers.jl
julia> tokens
6-element Array{Tuple{Symbol,String},1}:
 (:dec,"1")
 (:hex,"0x0123BEEF")
 (:oct,"0o754")
 (:float,"3.14")
 (:float,"-1e4")
 (:float,"+6.022045e23")

julia> status
:ok
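As a follow-up, the emitted `(kind, text)` pairs can be turned into numeric values with Julia's built-in `parse`. The sketch below is an illustration rather than part of Automa.jl: the helper name `token_value` is hypothetical, and the token list is copied by hand from the output above so that the snippet runs without the package.

```julia
# Hypothetical helper: convert one (kind, text) token into a number.
function token_value(kind::Symbol, text::AbstractString)
    if kind === :oct
        return parse(Int, text[3:end]; base=8)    # drop the "0o" prefix
    elseif kind === :hex
        return parse(Int, text[3:end]; base=16)   # drop the "0x" prefix
    elseif kind === :dec
        return parse(Int, text)
    elseif kind === :float
        return parse(Float64, text)
    else
        throw(ArgumentError("unknown token kind: $kind"))
    end
end

# Tokens copied from the tokenizer output shown above.
tokens = [(:dec, "1"), (:hex, "0x0123BEEF"), (:oct, "0o754"),
          (:float, "3.14"), (:float, "-1e4"), (:float, "+6.022045e23")]

vals = [token_value(kind, text) for (kind, text) in tokens]
```

Keeping the conversion separate from the tokenizer means the generated scanning loop only records positions and kinds, and any parsing cost is paid afterwards for the tokens that are actually needed.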

The compiled deterministic finite automaton (DFA) looks like this: [Figure: DFA]

For more details, see fasta.jl and read the docs page.
