All Projects → google → Re2j

google / Re2j

Licence: other
linear time regular expression matching in Java

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to Re2j

python-hyperscan
A CPython extension for the Hyperscan regular expression matching library.
Stars: ✭ 112 (-87.94%)
Mutual labels:  regular-expressions
crystular
Crystal regex tester http://www.crystular.org/
Stars: ✭ 31 (-96.66%)
Mutual labels:  regular-expressions
Regenerate
Generate JavaScript-compatible regular expressions based on a given set of Unicode symbols or code points.
Stars: ✭ 306 (-67.06%)
Mutual labels:  regular-expressions
sylbreak
Syllable segmentation tool for Myanmar language (Burmese) by Ye.
Stars: ✭ 44 (-95.26%)
Mutual labels:  regular-expressions
Regular Expressions
🔍 Swirl course on regular expressions and the regex family of functions in R
Stars: ✭ 21 (-97.74%)
Mutual labels:  regular-expressions
Sistem-programlama
Sistem Programlama Türkçe Kaynak (KTÜ)
Stars: ✭ 30 (-96.77%)
Mutual labels:  regular-expressions
tokenquery
TokenQuery (regular expressions over tokens)
Stars: ✭ 28 (-96.99%)
Mutual labels:  regular-expressions
Py regular expressions
Learn Python Regular Expressions step by step from beginner to advanced levels
Stars: ✭ 770 (-17.12%)
Mutual labels:  regular-expressions
RgxGen
Regex: generate matching and non matching strings based on regex pattern.
Stars: ✭ 45 (-95.16%)
Mutual labels:  regular-expressions
Node Re2
node.js bindings for RE2: fast, safe alternative to backtracking regular expression engines.
Stars: ✭ 284 (-69.43%)
Mutual labels:  regular-expressions
unmatcher
Regular expressions reverser for Python
Stars: ✭ 26 (-97.2%)
Mutual labels:  regular-expressions
lc-data-intro
Library Carpentry: Introduction to Working with Data (Regular Expressions)
Stars: ✭ 16 (-98.28%)
Mutual labels:  regular-expressions
T Regx
Library for PHP regular expression, providing high-level API.
Stars: ✭ 263 (-71.69%)
Mutual labels:  regular-expressions
convey
CSV processing and web related data types mutual conversion
Stars: ✭ 16 (-98.28%)
Mutual labels:  regular-expressions
Regex
The Hoa\Regex library.
Stars: ✭ 308 (-66.85%)
Mutual labels:  regular-expressions
ostrich
An SMT Solver for string constraints
Stars: ✭ 18 (-98.06%)
Mutual labels:  regular-expressions
Pawn.Regex
🔎 Plugin that adds support for regular expressions in Pawn
Stars: ✭ 34 (-96.34%)
Mutual labels:  regular-expressions
Libphorward
C/C++ library for dynamic data structures, regular expressions, lexical analysis & more...
Stars: ✭ 18 (-98.06%)
Mutual labels:  regular-expressions
Social Media Profiles Regexs
📇 Extract social media profiles and more with regular expressions
Stars: ✭ 324 (-65.12%)
Mutual labels:  regular-expressions
Re Flex
The regex-centric, fast lexical analyzer generator for C++ with full Unicode support. Faster than Flex. Accepts Flex specifications. Generates reusable source code that is easy to understand. Introduces indent/dedent anchors, lazy quantifiers, functions for lex/syntax error reporting, and more. Seamlessly integrates with Bison and other parsers.
Stars: ✭ 274 (-70.51%)
Mutual labels:  regular-expressions

RE2/J: linear time regular expression matching in Java

Build Status Coverage Status

RE2 is a regular expression engine that runs in time linear in the size of the input. RE2/J is a port of RE2 to pure Java.

Java's standard regular expression package, java.util.regex, and many other widely used regular expression packages such as PCRE, Perl and Python use a backtracking implementation strategy: when a pattern presents two alternatives such as a|b, the engine will try to match subpattern a first, and if that yields no match, it will reset the input stream and try to match b instead.

If such choices are deeply nested, this strategy requires an exponential number of passes over the input data before it can detect whether the input matches. If the input is large, it is easy to construct a pattern whose running time would exceed the lifetime of the universe. This creates a security risk when accepting regular expression patterns from untrusted sources, such as users of a web application.

In contrast, the RE2 algorithm explores all matches simultaneously in a single pass over the input data by using a nondeterministic finite automaton.

There are certain features of PCRE or Perl regular expressions that cannot be implemented in linear time, for example, backreferences, but the vast majority of regular expressions patterns in practice avoid such features.

Why should I switch?

If you use regular expression patterns with a high degree of alternation, your code may run faster with RE2/J. In the worst case, the java.util.regex matcher may run forever, or exceed the available stack space and fail; this will never happen with RE2/J.

Caveats

This is not an official Google product (experimental or otherwise), it is just code that happens to be owned by Google.

RE2/J is not a drop-in replacement for java.util.regex. Aside from the different package name, it doesn't support the following parts of the interface:

  • the MatchResult class
  • Matcher.hasAnchoringBounds()
  • Matcher.hasTransparentBounds()
  • Matcher.hitEnd()
  • Matcher.region(int, int)
  • Matcher.regionEnd()
  • Matcher.regionStart()
  • Matcher.requireEnd()
  • Matcher.toMatchResult()
  • Matcher.useAnchoringBounds(boolean)
  • Matcher.usePattern(Pattern)
  • Matcher.useTransparentBounds(boolean)
  • CANON_EQ
  • COMMENTS
  • LITERAL
  • UNICODE_CASE
  • UNICODE_CHARACTER_CLASS
  • UNIX_LINES
  • PatternSyntaxException.getMessage()

It also doesn't have parity with the full set of Java's character classes and special regular expression constructs.

Getting RE2/J

If you're using Maven, you can use the following snippet in your pom.xml to get RE2/J:

<dependency>
  <groupId>com.google.re2j</groupId>
  <artifactId>re2j</artifactId>
  <version>1.6</version>
</dependency>

You can use the same artifact details in any build system compatible with the Maven Central repositories (e.g. Gradle, Ivy).

You can also download RE2/J the old-fashioned way: go to the RE2/J release tag, download the RE2/J JAR and add it to your CLASSPATH.

Discussion and contribution

We have set up a Google Group for discussion, please join the RE2/J discussion list if you'd like to get in touch.

If you would like to contribute patches, please see the instructions for contributors.

Who wrote this?

RE2 was designed and implemented in C++ by Russ Cox. The C++ implementation includes both NFA and DFA engines and numerous optimisations. Russ also ported a simplified version of the NFA to Go. Alan Donovan ported the NFA-based Go implementation to Java. Afroz Mohiuddin wrapped the engine in a familiar Java Matcher / Pattern API. James Ring prepared the open-source release and has been its primary maintainer since then.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].