xysun / Regex

Licence: MIT
Regular expression engine in Python using Thompson's algorithm.

A regex engine in Python that follows Thompson's algorithm. On some pathological patterns it performs significantly better than the backtracking approach used by Python's re module.
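To make the idea concrete, here is a minimal sketch of the simulation step behind Thompson's approach: instead of trying one path at a time and backtracking, it keeps the set of all states the NFA could be in and advances them together. The `eps`/`step` dictionary encoding, the function names, and the hand-built NFA for `a?a` are assumptions made for this illustration; they are not taken from parse.py. The sketch also assumes the whole input must match.

```python
from typing import Dict, Set, Tuple

# Hypothetical NFA encoding for this sketch (not the project's data structures):
# eps[s] is the set of states reachable from s without consuming input,
# step[(s, c)] is the state reached from s by consuming character c.
Eps = Dict[int, Set[int]]
Step = Dict[Tuple[int, str], int]


def closure(states: Set[int], eps: Eps) -> Set[int]:
    """Return all states reachable from `states` through epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def simulate(text: str, start: int, accept: int, eps: Eps, step: Step) -> bool:
    """Advance every live state in lockstep; no backtracking is needed."""
    current = closure({start}, eps)
    for ch in text:
        moved = {step[(s, ch)] for s in current if (s, ch) in step}
        current = closure(moved, eps)
        if not current:
            return False  # no state survived this character
    return accept in current


# Hand-built NFA for the pattern "a?a": state 0 may skip the optional 'a'.
eps = {0: {1}}
step = {(0, "a"): 1, (1, "a"): 2}
print(simulate("a", 0, 2, eps, step))    # True
print(simulate("aa", 0, 2, eps, step))   # True
print(simulate("aaa", 0, 2, eps, step))  # False
```

Because the live set can never hold more states than the NFA has, the work per input character is bounded by the size of the pattern, which is why pathological patterns stay cheap.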

I wrote a blog post about this project here

It follows the compile-then-match interface of Python's re module:

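import regex  # this project's regex.py, not the third-party "regex" package on PyPI

n = 20
# Pathological pattern: n optional 'a's followed by n required 'a's
p = 'a?' * n + 'a' * n
nfa = regex.compile(p)
input_string = 'a' * n
matched = nfa.match(input_string)
print(matched) # True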

Currently it supports the following:

  • Repetition operators: * + ?
  • Parentheses
  • Literal characters (character sets are not supported)
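As an illustration of how this feature set maps onto Thompson's construction, the sketch below builds an NFA for patterns made of literal characters, parentheses, and the postfix operators `*`, `+`, `?`, then matches by tracking the set of live states. The `State` class and the functions `parse_sequence`, `compile_nfa`, and `match` are hypothetical names for this example and are not the project's parse.py; the sketch assumes the whole input must match.

```python
class State:
    """NFA state: `char` labels the consuming edge to `out`; `eps` holds free edges."""

    def __init__(self):
        self.char = None
        self.out = None
        self.eps = []


def parse_sequence(pattern, i):
    """Parse concatenated atoms from pattern[i:]; return (start, end, next_index)."""
    start = end = State()
    while i < len(pattern) and pattern[i] != ')':
        if pattern[i] == '(':
            s, e, i = parse_sequence(pattern, i + 1)
            if i >= len(pattern) or pattern[i] != ')':
                raise ValueError("missing ')'")
            i += 1
        elif pattern[i] in '*+?':
            raise ValueError("operator with nothing to repeat")
        else:
            s, e = State(), State()
            s.char, s.out = pattern[i], e  # single edge consuming one literal
            i += 1
        # Wrap the atom once for each postfix operator that follows it.
        while i < len(pattern) and pattern[i] in '*+?':
            op = pattern[i]
            new_start, new_end = State(), State()
            new_start.eps.append(s)
            e.eps.append(new_end)
            if op in '*?':
                new_start.eps.append(new_end)  # zero occurrences allowed
            if op in '*+':
                e.eps.append(s)                # loop back to repeat the atom
            s, e = new_start, new_end
            i += 1
        end.eps.append(s)  # concatenate the atom onto the sequence so far
        end = e
    return start, end, i


def compile_nfa(pattern):
    start, accept, i = parse_sequence(pattern, 0)
    if i != len(pattern):
        raise ValueError("unbalanced ')'")
    return start, accept


def closure(states):
    """Follow epsilon edges from `states` until no new state appears."""
    stack, seen = list(states), set(states)
    while stack:
        for t in stack.pop().eps:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def match(pattern, text):
    start, accept = compile_nfa(pattern)
    current = closure({start})
    for ch in text:
        current = closure({s.out for s in current if s.char == ch})
        if not current:
            return False
    return accept in current


print(match('(ab)+c', 'ababc'))               # True
print(match('(ab)+c', 'c'))                   # False
print(match('a?' * 20 + 'a' * 20, 'a' * 20))  # True
```

Each operator adds a constant number of states, so the NFA grows linearly with the pattern, and the `seen` set in `closure` keeps nested repetitions such as `(a*)*` from looping forever.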

regex.py is the interface, parse.py holds the main implementation logic, and sample.py contains usage examples.

Run python3 testing.py -v to check that the engine passes all test cases in test_suite.dat.

Performance

On normal inputs (Glenn Fowler's test suite, see Credits below), this engine is slower than Python's re module. On pathological inputs it is significantly faster.
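The pathological side of this comparison can be reproduced with the standard library alone. The sketch below times Python's re module on the same pattern family used in the usage example, for a few small values of n; the helper name `time_re_match` and the chosen values of n are assumptions for illustration, and actual timings depend on the machine and Python version.

```python
import re
import time


def time_re_match(n: int) -> float:
    """Time Python's backtracking `re` on n optional 'a's plus n required 'a's."""
    pattern = re.compile('a?' * n + 'a' * n)
    text = 'a' * n
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is not None  # the pattern does match; only the cost varies
    return elapsed


# Keep n small: the running time is expected to grow quickly as n increases.
for n in (10, 14, 18):
    print(f"n={n:2d}  re.match took {time_re_match(n):.4f}s")
```

Running the same loop against this engine's compile and match calls gives the other half of the comparison.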

[Figures: benchmark plots for normal inputs and for pathological inputs]

Credits

  • Test suite is based on Glenn Fowler's regex test suites.

  • Russ Cox has an excellent collection of articles on regex.
