ltrzesniewski / pcre-net

PCRE.NET

Perl Compatible Regular Expressions for .NET

PCRE.NET is a .NET wrapper for the PCRE library. The goal of this project is to make most of PCRE's features available to .NET applications with as little overhead as possible.

The following systems are supported:

  • Windows x64
  • Windows x86
  • Linux x64
  • macOS x64

API Types

The classic API

This is a friendly API that is very similar to .NET's System.Text.RegularExpressions. It works on string objects, and supports the following operations:

  • NFA matching and substring extraction:
    • Matches
    • Match
    • IsMatch
  • Matched string replacement: Replace
    • Callbacks: Func<PcreMatch, string>
    • Replacement strings with placeholders: $n ${name} $& $_ $` $' $+
  • String splitting on matches: Split
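
As a minimal sketch of the last two operations, the snippet below passes a Func<PcreMatch, string> callback to Replace and then calls Split on a second regex. The patterns and input strings are made up for illustration, and the snippet assumes the PCRE.NET package is referenced and a C# version with top-level statements.

```csharp
using System;
using System.Linq;
using PCRE;

// Upper-case every word through a replacement callback.
var words = new PcreRegex(@"\w+");
Func<PcreMatch, string> toUpper = m => m.Value.ToUpperInvariant();
var shouted = words.Replace("hello, world", toUpper);

// Split a list on commas followed by optional whitespace.
var separator = new PcreRegex(@",\s*");
var parts = separator.Split("a, b,c").ToList();

Console.WriteLine(shouted);
Console.WriteLine(string.Join("|", parts));
```

Declaring the callback with an explicit delegate type keeps overload resolution unambiguous; an inline lambda should work as well.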

The Span API

PcreRegex objects provide overloads which take a ReadOnlySpan<char> parameter for the following methods:

  • Matches
  • Match
  • IsMatch

These methods return a ref struct type, but are otherwise similar to the classic API.
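
A short sketch of the span overloads, assuming the PCRE.NET package is referenced: it matches against a slice of a larger string without copying it first. The subject text is illustrative only. Because the returned match is a ref struct, it has to stay on the stack and cannot be stored in a field or captured by a lambda.

```csharp
using System;
using PCRE;

var regex = new PcreRegex(@"\d+");

// Slice the subject: skip the 6 characters of "order ".
ReadOnlySpan<char> subject = "order 12345 shipped".AsSpan(6);
var match = regex.Match(subject);

// Value is itself a span over the subject; copy it only when a string is needed.
if (match.Success)
    Console.WriteLine(match.Value.ToString());
```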

The zero-allocation API

This is the fastest matching API the library provides.

Call the CreateMatchBuffer method on a PcreRegex instance to create the necessary data structures up-front, then use the returned match buffer for subsequent match operations. Matching through this buffer does not allocate any further managed memory, which reduces GC pressure when the same pattern is matched many times.

The downside of this approach is that the returned match buffer is not thread-safe and not reentrant: you cannot perform a match operation with a buffer which is already being used - match operations need to be sequential.

It is also counter-productive to allocate a match buffer to perform a single match operation. Use this API if you need to match a pattern against many subject strings.

PcreMatchBuffer objects are disposable (and finalizable in case they're not disposed). They provide an API for matching against ReadOnlySpan<char> subjects.

If you're looking for maximum speed, consider using the following options:

  • PcreOptions.Compiled when constructing the regex (pattern compile time): enables PCRE's JIT compiler, which improves matching speed.
  • PcreMatchOptions.NoUtfCheck at match time: skips the Unicode validity check. By default, PCRE scans the entire input string to make sure it is valid Unicode.
  • PcreOptions.MatchInvalidUtf when constructing the regex: use it if you plan to pass PcreMatchOptions.NoUtfCheck and your subject strings may contain invalid Unicode sequences.
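
The workflow described in this section could look roughly like the sketch below: compile the pattern once with PcreOptions.Compiled, create one match buffer, and reuse it for every subject. The input lines are invented for illustration, and the snippet assumes the PCRE.NET package and a runtime where int.Parse accepts a ReadOnlySpan<char>.

```csharp
using System;
using PCRE;

// Compile once with the JIT enabled, then create the buffer once up-front.
var regex = new PcreRegex(@"\d+", PcreOptions.Compiled);
using var buffer = regex.CreateMatchBuffer();

string[] lines = { "width=640", "height=480", "no digits here" };
var total = 0;

foreach (var line in lines)
{
    // Each call reuses the same buffer, so matches must run one after another.
    var match = buffer.Match(line.AsSpan());
    if (match.Success)
        total += int.Parse(match.Value); // parse straight from the span
}

Console.WriteLine(total);
```

The buffer is disposed by the using declaration once the loop is done. For concurrent matching, one buffer per thread would be needed.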

The DFA matching API

This API provides regex matching in O(subject length) time. It is accessible through the Dfa property on a PcreRegex instance:

  • Dfa.Matches
  • Dfa.Match

You can read more about its features in the PCRE documentation, where it's described as the alternative matching algorithm.
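
As a hedged sketch, a DFA match can report several matches that start at the same position. The example assumes the result object exposes Success, LongestMatch and ShortestMatch members, which is how I understand the current API; check the package if these names differ in your version.

```csharp
using System;
using PCRE;

var regex = new PcreRegex(@"a+");

// The DFA algorithm finds all matches starting at the first matching position.
var result = regex.Dfa.Match("caaat");

if (result.Success)
{
    Console.WriteLine(result.LongestMatch.Value);
    Console.WriteLine(result.ShortestMatch.Value);
}
```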

Library highlights

  • Support for compiled patterns (x86/x64 JIT)
  • Support for partial matching (when the subject is too short to match the pattern)
  • Callout support (numbered and string-based)
  • Mark retrieval support
  • Conversion from POSIX BRE, POSIX ERE and glob patterns (PcreConvert class)

Example usage

  • Extract all words except those within parentheses:
var matches = PcreRegex.Matches("(foo) bar (baz) 42", @"\(\w+\)(*SKIP)(*FAIL)|\w+")
                       .Select(m => m.Value)
                       .ToList();
// result: "bar", "42"
  • Enclose a series of punctuation characters within angle brackets:
var result = PcreRegex.Replace("hello, world!!!", @"\p{P}+", "<$&>");
// result: "hello<,> world<!!!>"
  • Partial matching:
var regex = new PcreRegex(@"(?<=abc)123");
var match = regex.Match("xyzabc12", PcreMatchOptions.PartialSoft);
// result: match.IsPartialMatch == true
  • Validate a JSON string:
const string jsonPattern = @"
    (?(DEFINE)
        # An object is an unordered set of name/value pairs.
        (?<object> \{
            (?: (?&keyvalue) (?: , (?&keyvalue) )* )?
        (?&ws) \} )
        (?<keyvalue>
            (?&ws) (?&string) (?&ws) : (?&value)
        )

        # An array is an ordered collection of values.
        (?<array> \[
            (?: (?&value) (?: , (?&value) )* )?
        (?&ws) \] )

        # A value can be a string in double quotes, or a number,
        # or true or false or null, or an object or an array.
        (?<value> (?&ws)
            (?: (?&string) | (?&number) | (?&object) | (?&array) | true | false | null )
        )

        # A string is a sequence of zero or more Unicode characters,
        # wrapped in double quotes, using backslash escapes.
        (?<string>
            "" (?: [^""\\\p{Cc}]++ | \\u[0-9A-Fa-f]{4} | \\ [""\\/bfnrt] )* ""
            # \p{Cc} matches control characters
        )

        # A number is very much like a C or Java number, except that the octal
        # and hexadecimal formats are not used.
        (?<number>
            -? (?: 0 | [1-9][0-9]* ) (?: \. [0-9]+ )? (?: [Ee] [-+]? [0-9]+ )?
        )

        # Whitespace
        (?<ws> \s*+ )
    )

    \A (?&ws) (?&object) (?&ws) \z
";

var regex = new PcreRegex(jsonPattern, PcreOptions.IgnorePatternWhitespace);

const string subject = @"{
    ""hello"": ""world"",
    ""numbers"": [4, 8, 15, 16, 23, 42],
    ""foo"": null,
    ""bar"": -2.42e+17,
    ""baz"": true
}";

var isValidJson = regex.IsMatch(subject);
// result: true