
gnidan / solregex

Licence: MIT license
Regex compilation to Solidity

Programming Languages

javascript
184084 projects - #8 most used programming language
Handlebars
879 projects

Projects that are alternatives of or similar to solregex

Regexpu
A source code transpiler that enables the use of ES2015 Unicode regular expressions in ES5.
Stars: ✭ 201 (+443.24%)
Mutual labels:  regex, code-generation
IronRure
.NET Bindings to the Rust Regex Crate
Stars: ✭ 16 (-56.76%)
Mutual labels:  regex
wb-toolbox
Simulink toolbox to rapidly prototype robot controllers
Stars: ✭ 20 (-45.95%)
Mutual labels:  code-generation
simplematch
Minimal, super readable string pattern matching for python.
Stars: ✭ 147 (+297.3%)
Mutual labels:  regex
oag
Idiomatic Go (Golang) client package generation from OpenAPI documents
Stars: ✭ 51 (+37.84%)
Mutual labels:  code-generation
acgt
Auto Code Generation Tools (DSL)
Stars: ✭ 18 (-51.35%)
Mutual labels:  code-generation
pyodesys
∫ Straightforward numerical integration of systems of ordinary differential equations
Stars: ✭ 85 (+129.73%)
Mutual labels:  code-generation
es6-template-regex
Regular expression for matching es6 template delimiters in a string.
Stars: ✭ 15 (-59.46%)
Mutual labels:  regex
gdbus-codegen-glibmm
Code generator for C++ D-Bus stubs and proxies using Giomm/Glibmm
Stars: ✭ 21 (-43.24%)
Mutual labels:  code-generation
AutomateWithPython
If you've ever spent hours renaming files or updating hundreds of spreadsheet cells, you know how tedious tasks like these can be. But what if you could have your computer do them for you? In Automate the Boring Stuff with Python, you'll learn how to use Python to write programs that do in minutes what would take you hours to do by hand-no prior…
Stars: ✭ 22 (-40.54%)
Mutual labels:  regex
Regex
🔤 Swifty regular expressions
Stars: ✭ 311 (+740.54%)
Mutual labels:  regex
UI2CODE
A tidied repo for UI2CODE, a reverse engineering system convert UI design to code automatically and precisely.
Stars: ✭ 24 (-35.14%)
Mutual labels:  code-generation
regexp-expand
Show the ELisp regular expression at point in rx form.
Stars: ✭ 18 (-51.35%)
Mutual labels:  regex
cregex
A small implementation of regular expression matching engine in C
Stars: ✭ 72 (+94.59%)
Mutual labels:  regex
regXwild
⏱ Superfast ^Advanced wildcards++? | Unique algorithms that was implemented on native unmanaged C++ but easily accessible in .NET via Conari (with caching of 0x29 opcodes +optimizations) etc.
Stars: ✭ 20 (-45.95%)
Mutual labels:  regex
url-regex-safe
Regular expression matching for URL's. Maintained, safe, and browser-friendly version of url-regex. Resolves CVE-2020-7661 for Node.js servers.
Stars: ✭ 59 (+59.46%)
Mutual labels:  regex
parse-author
Parse a person, author, contributor or maintainer string into an object with name, email and url properties following NPM conventions. Useful for the `authors` property in package.json or for parsing an AUTHORS file into an array of person objects.
Stars: ✭ 23 (-37.84%)
Mutual labels:  regex
Friendly Code Editor
Try this Friendly Code Editor. You'll love it. I made it with a lot of effort. It has some great features. I will update it adequately later. Very helpful for developers. Enjoy and share.
Stars: ✭ 20 (-45.95%)
Mutual labels:  code-generation
bhedak
A replacement of "qsreplace", accepts URLs as standard input, replaces all query string values with user-supplied values and stdout.
Stars: ✭ 77 (+108.11%)
Mutual labels:  regex
termco
Regular Expression Counts of Terms and Substrings
Stars: ✭ 24 (-35.14%)
Mutual labels:  regex

solregex


Tool to generate a Solidity smart contract for a given regular expression.

Installing

npm install -g solregex

Usage

Pass the regular expression as the argument. The optional --name parameter names the generated contract.

$ solregex --name EmailRegex '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-_]+\.[a-zA-Z]{2,}'

solregex prints the contents of a standalone Solidity source file (.sol) to your terminal's standard output.

You may want to:

  • Save to disk (e.g. solregex 'ab*c?' > Regex.sol)
  • Copy/paste it into an IDE (e.g. solregex '.+abc.*' | pbcopy on macOS, or select with mouse)

Workflow Usage

There are many different workflows for developing / deploying applications with Ethereum.

Generally the process follows these steps:

  1. Run solregex with a regular expression (generates Solidity smart contract)
  2. Compile smart contract
  3. Deploy compiled contract
  4. Use in another contract

Graphviz Output

solregex supports generating Graphviz DOT output for a given regular expression's DFA (deterministic finite automaton).

To generate DOT output instead of Solidity, pass the --dot parameter.

Sample Regex DFA

Sample DOT Output: solregex --dot '[a-f]x|[d-i]y|[g-l]z' | dot -Tsvg > sample-regex.svg
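
To give a feel for what DOT text describing a DFA looks like, here is a minimal sketch. It is not solregex's code and its output is not claimed to match what --dot prints: the `toDot` function, the shape of the `sampleDfa` object, and the state numbering are all assumptions made for illustration. The DFA itself was derived by hand for the sample regex above.

```javascript
// Hand-derived DFA for [a-f]x|[d-i]y|[g-l]z (hypothetical layout, not
// solregex output). Edge labels are the non-overlapping character ranges.
const sampleDfa = {
  accepting: new Set([5]),
  transitions: {
    0: { "[a-c]": 1, "[d-f]": 2, "[g-i]": 3, "[j-l]": 4 },
    1: { x: 5 },
    2: { x: 5, y: 5 },
    3: { y: 5, z: 5 },
    4: { z: 5 },
    5: {},
  },
};

// Render the DFA as Graphviz DOT: accepting states get a double circle,
// every transition becomes a labelled edge.
function toDot(dfa) {
  const lines = ["digraph dfa {", "  rankdir=LR;"];
  for (const state of Object.keys(dfa.transitions)) {
    const shape = dfa.accepting.has(Number(state)) ? "doublecircle" : "circle";
    lines.push(`  ${state} [shape=${shape}];`);
  }
  for (const [from, edges] of Object.entries(dfa.transitions)) {
    for (const [label, to] of Object.entries(edges)) {
      lines.push(`  ${from} -> ${to} [label="${label}"];`);
    }
  }
  lines.push("}");
  return lines.join("\n");
}

console.log(toDot(sampleDfa));
```

Text in this form can be piped through dot -Tsvg in the same way as the command above.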

Examples

A contract to match email addresses is deployed at 0x537837D00047C874D19B68E94ADbA107674C21b8 (Etherscan)

A contract to match Ethereum addresses is deployed at 0x62C8b4aC2aEF3Ed13B929cA9FB20caCB222E3fA6 (Etherscan)

Approach

Compiling a regular expression to Solidity is done via several steps:

  1. Parse regex using regjsparser

  2. Build NFA (non-deterministic finite automaton) from parse result. Use graph.js for underlying state machine data.

  3. Split overlapping character class ranges into non-overlapping subset ranges (e.g. [a-f], [d-i] become [a-c], [d-f], [g-i]) using interval trees. See the Graphviz output above for an example that highlights this behavior.

  4. Use powerset construction to convert NFA to DFA (deterministic finite automaton)

  5. Convert DFA into Solidity source using a handlebars template.
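
The range splitting in step 3 can be sketched in a few lines. This is an illustration only: it uses a plain sweep over range boundaries rather than the interval trees mentioned above, and the `splitRanges` name and the `[low, high]` pair representation are assumptions, not part of solregex.

```javascript
// Split possibly overlapping inclusive character ranges into disjoint pieces.
// Plain boundary sweep, swapped in here for the interval-tree approach.
function splitRanges(ranges) {
  // A new piece must start wherever a range starts, and just past
  // wherever a range ends.
  const cuts = new Set();
  for (const [lo, hi] of ranges) {
    cuts.add(lo.codePointAt(0));
    cuts.add(hi.codePointAt(0) + 1);
  }
  const points = [...cuts].sort((a, b) => a - b);

  const pieces = [];
  for (let i = 0; i + 1 < points.length; i++) {
    const start = points[i];
    const end = points[i + 1] - 1;
    // Keep the piece only if at least one input range covers it.
    const covered = ranges.some(
      ([lo, hi]) => lo.codePointAt(0) <= start && end <= hi.codePointAt(0)
    );
    if (covered) {
      pieces.push([String.fromCodePoint(start), String.fromCodePoint(end)]);
    }
  }
  return pieces;
}

console.log(JSON.stringify(splitRanges([["a", "f"], ["d", "i"]])));
// prints [["a","c"],["d","f"],["g","i"]]
```

Because every range boundary is a cut point, each piece lies either entirely inside or entirely outside any given input range, so a DFA transition can be labelled with whole pieces.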

Status

What's Supported

Supports disjunctions (|), concatenation (e.g. ab), quantifiers (+, *, ?, {n}, {n,m}, {n,}), wildcard matching (.), quantified groups ((...)*, etc.), and character classes (positive, negative, ranges).

Supports true/false result for string matching against a regex.
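
As a rough picture of how a DFA yields a true/false answer, the sketch below walks a hand-built transition table for the pattern ab*c?. It is written in JavaScript rather than Solidity, and the table layout and the `matches` function are assumptions for illustration; the generated contract's actual structure may differ.

```javascript
// Hand-built DFA for ab*c? (hypothetical; not generated by solregex).
// State 0: start. State 1: seen "a" and any number of "b". State 2: seen "c".
const dfa = {
  start: 0,
  accepting: new Set([1, 2]),
  transitions: {
    0: { a: 1 },
    1: { b: 1, c: 2 },
    2: {},
  },
};

// Consume the input one character at a time. A missing transition rejects
// immediately; otherwise the answer is whether the final state is accepting.
function matches(dfa, input) {
  let state = dfa.start;
  for (const ch of input) {
    const next = dfa.transitions[state][ch];
    if (next === undefined) return false;
    state = next;
  }
  return dfa.accepting.has(state);
}

console.log(matches(dfa, "abbbc")); // true
console.log(matches(dfa, "abcc")); // false
```

In this sketch the whole input has to be consumed, so a string with extra leading or trailing characters is rejected.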

What's Missing

  • Assertions (^, $ for start/end). Both are currently implied: a regex only matches if it matches the entire input string.
  • Capturing groups (e.g. (a*)(b*) indicating a/b groups in input string)
  • Backreferences (e.g. (a*)\1)
  • Escape sequences for things like tabs, newlines, word characters, etc.
  • Any kind of multi-line smartness
  • Unicode support
  • Probably more, let me know!

Known Inefficiencies

  • Quantifiers using numeric literals (e.g. a{40}) expand into many DFA states, so the output code grows large very quickly.

    It may be possible to add support for compressing mostly-identical states into a single state with parameters, to avoid so much output code.
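
To make the growth concrete, the sketch below runs a textbook powerset construction on a chain-shaped NFA for a{n} and counts the resulting DFA states. The `repeatNfa` and `toDfa` names and the Map-based NFA layout are assumptions for illustration. The sketch has no epsilon transitions, no dead state, and no minimization, so the state counts solregex actually produces may differ.

```javascript
// Epsilon-free NFA for a{n}: states 0..n in a chain, each step consuming "a".
function repeatNfa(n) {
  const transitions = new Map(); // state -> Map(symbol -> array of next states)
  for (let s = 0; s < n; s++) {
    transitions.set(s, new Map([["a", [s + 1]]]));
  }
  transitions.set(n, new Map());
  return { start: 0, accepting: new Set([n]), transitions };
}

// Textbook powerset construction: each DFA state is a set of NFA states,
// identified here by a sorted, comma-joined key.
function toDfa(nfa) {
  const key = (set) => [...set].sort((x, y) => x - y).join(",");
  const startSet = new Set([nfa.start]);
  const dfaStates = new Map([[key(startSet), startSet]]);
  const dfaTransitions = new Map();
  const queue = [startSet];

  while (queue.length > 0) {
    const current = queue.shift();
    const bySymbol = new Map(); // symbol -> Set of reachable NFA states
    for (const s of current) {
      for (const [symbol, targets] of nfa.transitions.get(s)) {
        if (!bySymbol.has(symbol)) bySymbol.set(symbol, new Set());
        for (const t of targets) bySymbol.get(symbol).add(t);
      }
    }
    const edges = new Map();
    for (const [symbol, targetSet] of bySymbol) {
      const k = key(targetSet);
      if (!dfaStates.has(k)) {
        dfaStates.set(k, targetSet);
        queue.push(targetSet);
      }
      edges.set(symbol, k);
    }
    dfaTransitions.set(key(current), edges);
  }
  return { states: dfaStates, transitions: dfaTransitions };
}

for (const n of [5, 40, 200]) {
  console.log(`a{${n}} -> ${toDfa(repeatNfa(n)).states.size} DFA states`);
}
// a{5} -> 6 DFA states
// a{40} -> 41 DFA states
// a{200} -> 201 DFA states
```

Each of these states differs from its neighbour only in how many repetitions have been seen, which is the kind of near-duplicate state that a parameterized state could collapse.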

Contributing

Feel free to contact me in the Gitter channel for this repository with any comments, concerns, questions. Let me know if anything is unclear about usage or if you encounter any problems!

If you are interested in helping improve the state of efficient string pattern matching on the EVM, get in touch or open a pull request! Feedback, fixes, and improvements of all kinds are most appreciated :). Thank you!
