
mmottl / Pcre Ocaml

Licence: other
OCaml bindings to PCRE (Perl Compatible Regular Expressions)

PCRE-OCaml - Perl Compatible Regular Expressions for OCaml

This OCaml library interfaces with the C library PCRE (Perl Compatible Regular Expressions). It can be used for string matching with "PERL"-style regular expressions.

Features

PCRE-OCaml offers the following functionality for operating on strings:

  • Searching for patterns
  • Extracting subpatterns
  • Splitting strings according to patterns
  • Pattern substitution

Other reasons to use PCRE-OCaml:

  • The PCRE-library by Philip Hazel has been under development for many years and is fairly advanced and stable. It implements just about all of the functionality that can be found in PERL regular expressions. The higher-level functions written in OCaml (split, replace, etc.) are likewise compatible with the corresponding PERL-functions to the extent that OCaml allows. Most people find the syntax of PERL-style regular expressions more straightforward and powerful than the Emacs-style regular expressions used in the Str-module in the standard OCaml distribution.

  • PCRE-OCaml is reentrant and thus thread-safe, which is not the case for the Str-module in the OCaml standard library. Reentrant libraries are also more convenient for programmers, who do not have to reason about the states the library might be in.

  • The high-level functions for replacement and substitution, which are all implemented in OCaml, are much faster than the ones in the Str-module. In fact, when compiled to native code, they seem to be significantly faster than their PERL counterparts, even though PERL itself is written in C.

  • You can rely on the data returned being unique. In other words: if the result of a function is a string, you can safely use destructive updates on it without having to fear side effects.

  • The interface to the library makes use of labels and default arguments to give you a high degree of programming comfort.

Usage

Please consult the API for details.

A general concept the user may need to understand is that most functions allow for two different kinds of flags:

  1. "Convenience"-flags that make for readable and concise code, but which need to be translated to an internal representation on each call. Example:

    let rex = Pcre.regexp ~flags:[`ANCHORED; `CASELESS] "some pattern" in
    (* ... *)
    

    This makes it easy to pass flags on the fly. They will be translated to the internal format automatically. However, if this happens to be in a loop, this translation will occur on each iteration. If you really need to save as much performance as possible, you should use the next approach.

  2. "Internal" flags that need to be defined and translated from "convenience"-flags before function calls, but which allow for optimum performance in loops. Example:

    let iflags = Pcre.cflags [`ANCHORED; `CASELESS] in
    for i = 1 to 1000 do
      let rex = Pcre.regexp ~iflags "some pattern constructed at runtime" in
      (* ... *)
    done
    

    Factoring out the translation of flags for regular expressions may save some cycles, but don't expect too much. You can save more CPU time by lifting the creation of regular expressions out of loops. Example of what not to do:

    for i = 1 to 1000 do
      let chunks = Pcre.split ~pat:"[ \t]+" "foo bar" in
      (* ... *)
    done
    

    Better:

    let rex = Pcre.regexp "[ \t]+" in
    for i = 1 to 1000 do
      let chunks = Pcre.split ~rex "foo bar" in
      (* ... *)
    done
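
The same split between "convenience" and "internal" flags should also apply to the flags passed at matching time. The sketch below assumes Pcre.rflags is the runtime counterpart of Pcre.cflags and passes its result via ~iflags; the pattern and the helper name starts_with_bar are purely illustrative.

```ocaml
(* Sketch only: assumes Pcre.rflags translates runtime flags once,
   just as Pcre.cflags does for compile-time flags. *)

(* Compile the regular expression once, outside any loop. *)
let rex = Pcre.regexp "bar"

(* Translate the runtime "convenience" flags to the internal format once. *)
let iflags = Pcre.rflags [`ANCHORED]

(* With `ANCHORED given at matching time, the match must begin at the
   start position of the subject. The name starts_with_bar is hypothetical. *)
let starts_with_bar s = Pcre.pmatch ~iflags ~rex s
```

Here both the compiled regular expression and the translated flags are reused on every call, so a loop over many subjects pays neither cost again.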
    

The provided functions use optional arguments with intuitive defaults. For example, Pcre.split uses whitespace as its pattern when none is given. The examples-directory contains a few example applications demonstrating the functionality of PCRE-OCaml.
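
A small sketch of how these defaults play out in practice, assuming the pcre package is installed. The subject strings and the names words, with_full and groups_only are invented for this example.

```ocaml
(* Sketch only: names and sample inputs are illustrative. *)

(* No ~pat or ~rex given: Pcre.split falls back to splitting on whitespace. *)
let words = Pcre.split "foo bar\tbaz"

(* Pcre.extract includes the whole match at index 0 by default ... *)
let with_full = Pcre.extract ~pat:"(\\w+)@(\\w+)" "user@host"

(* ... while the optional ~full_match:false argument keeps only the groups. *)
let groups_only = Pcre.extract ~full_match:false ~pat:"(\\w+)@(\\w+)" "user@host"
```

Optional labeled arguments can be supplied in any order, so only the arguments that differ from the defaults need to appear at a call site.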

Restartable (partial) pattern matching

PCRE includes an "alternative" DFA match function that allows one to restart a partial match with additional input. This is exposed by pcre-ocaml via the pcre_dfa_exec function. While this cannot be used for "higher-level" operations like extracting submatches or splitting subject strings, it can be very useful in certain streaming and search use cases.

This utop interaction demonstrates the basic workflow of a partial match that is then restarted multiple times before completing successfully:

utop # open Pcre;;
utop # let rex = regexp "12+3";;
val rex : regexp = <abstr>
utop # let workspace = Array.make 40 0;;
val workspace : int array =
  [|0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
    0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0|]
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL] ~workspace "12222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "223xxxx";;
- : int array = [|0; 3; 0|]

Please refer to the documentation of pcre_dfa_exec and check out the dfa_restart example for more info.
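
The interaction above can be wrapped in a small helper that feeds chunks one at a time. This is only a sketch under a few assumptions: the function name feed_chunks is hypothetical, the workspace size of 40 is copied from the session above, and a chunk that neither matches nor partially matches is simply treated as a failure rather than restarting the search.

```ocaml
(* Sketch only: feed_chunks is a made-up helper, not part of the library. *)
let feed_chunks rex chunks =
  (* The workspace carries the DFA state between calls, so the same array
     must be reused for every chunk. *)
  let workspace = Array.make 40 0 in
  let rec loop flags = function
    | [] -> None (* input exhausted before the match completed *)
    | chunk :: rest ->
      (match Pcre.pcre_dfa_exec ~rex ~flags ~workspace chunk with
       | ovector -> Some ovector
       | exception Pcre.Error Pcre.Partial ->
         (* Partial match so far: continue with the next chunk. *)
         loop [`PARTIAL; `DFA_RESTART] rest
       | exception Not_found -> None)
  in
  (* The first call must not use `DFA_RESTART. *)
  loop [`PARTIAL] chunks

let result =
  feed_chunks (Pcre.regexp "12+3") ["12222"; "2222222"; "2222222"; "223xxxx"]
```

With the same four chunks as in the utop session, result should hold the array returned by the final call there.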

Contact Information and Contributing

Please submit bug reports, feature requests, contributions and similar to the GitHub issue tracker.

Up-to-date information is available at: https://mmottl.github.io/pcre-ocaml
