ChrisPenner / Lens Regex Pcre

Licence: bsd-3-clause
Text lenses using PCRE regexes

lens-regex-pcre

Hackage and Docs

Based on pcre-heavy, so it should support any regexes or options that pcre-heavy supports.

Performance is equal to, and sometimes better than, that of pcre-heavy alone.

Which module should you use?

If you need Unicode support, use Control.Lens.Regex.Text; if not, Control.Lens.Regex.ByteString is faster.
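For byte-oriented input, a minimal sketch of the ByteString variant might look like the following. It assumes Control.Lens.Regex.ByteString exposes the same regex quasi-quoter and match traversal as the Text module; logLine and the other top-level names are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}

import Control.Lens
import Control.Lens.Regex.ByteString (match, regex)
import Data.ByteString (ByteString)

-- Hypothetical input: an ASCII-only log line, so no Unicode handling is needed.
logLine :: ByteString
logLine = "GET /index.html 200"

-- Collect every run of digits as a raw ByteString.
numbers :: [ByteString]
numbers = logLine ^.. [regex|\d+|] . match

-- Overwrite every run of digits in place.
rewritten :: ByteString
rewritten = logLine & [regex|\d+|] . match .~ "404"
```

Switching between the two modules should mostly be a matter of changing the import and the string type.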

Working with regexes in Haskell kinda sucks: it's tough to figure out which libs to use, and even after you pick one it's tough to figure out how to use it. lens-regex-pcre hopes to replace most other solutions by being fast, easy to set up, and more adaptable, with a more consistent interface.

It helps that there are already HUNDREDS of combinators which interop with lenses 😄.

As it turns out, regexes are a very lens-like tool. Traversals allow you to select and alter zero or more matches, and they can even carry indexes so you know which match or group you're working on.
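To make the traversal idea concrete without any library, here is a small sketch using only base. It is not how lens-regex-pcre is implemented: iwordsStartingWith is a hypothetical stand-in for a regex (it "matches" space-separated words by their first letter), and toListOf and over are hand-rolled versions of the usual lens combinators.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Char (toUpper)
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A traversal visits zero or more 'a's inside an 's' with an Applicative action.
type Traversal' s a = forall f. Applicative f => (a -> f a) -> s -> f s

-- Hypothetical stand-in for a regex match: every space-separated word that
-- starts with the given character. The callback also receives the match number.
-- Assumes words are separated by single spaces, since 'unwords' rebuilds them so.
iwordsStartingWith
  :: Applicative f => Char -> (Int -> String -> f String) -> String -> f String
iwordsStartingWith c f = fmap unwords . traverse visit . number 0 . words
  where
    isMatch w = take 1 w == [c]
    number _ [] = []
    number n (w : ws)
      | isMatch w = (Just n, w) : number (n + 1) ws
      | otherwise = (Nothing, w) : number n ws
    visit (Just i, w) = f i w
    visit (Nothing, w) = pure w

-- Forgetting the index gives an ordinary traversal.
wordsStartingWith :: Char -> Traversal' String String
wordsStartingWith c f = iwordsStartingWith c (const f)

-- Read the targets by collecting them in Const.
toListOf :: Traversal' s a -> s -> [a]
toListOf t = getConst . t (\a -> Const [a])

-- Modify the targets by running the traversal in Identity.
over :: Traversal' s a -> (a -> a) -> s -> s
over t g = runIdentity . t (Identity . g)

txt :: String
txt = "raindrops on roses and whiskers on kittens"

-- ["raindrops","roses"]
selected :: [String]
selected = toListOf (wordsStartingWith 'r') txt

-- "RAINDROPS on ROSES and whiskers on kittens"
shouted :: String
shouted = over (wordsStartingWith 'r') (map toUpper) txt

-- "0:raindrops on 1:roses and whiskers on kittens"
numbered :: String
numbered = runIdentity (iwordsStartingWith 'r' (\i w -> Identity (show i ++ ":" ++ w)) txt)
```

One definition serves for both reading and editing, and the index tells the callback which match it is looking at.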

Examples

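{-# LANGUAGE OverloadedStrings, QuasiQuotes, TypeApplications #-}
import Control.Lens
import Control.Lens.Regex.Text
import Data.List (sort)
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Lens (unpacked)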

txt :: Text
txt = "raindrops on roses and whiskers on kittens"

-- Search
>>> has [regex|whisk|] txt
True

-- Get matches
>>> txt ^.. [regex|\br\w+|] . match
["raindrops","roses"]

-- Edit matches
>>> txt & [regex|\br\w+|] . match %~ T.intersperse '-' . T.toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

-- Get Groups
>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

-- Edit Groups
>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

-- Get the third match
>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

-- Match integers, 'Read' them into ints, then sort them in-place
-- dumping them back into the source text afterwards.
>>> "Monday: 29, Tuesday: 99, Wednesday: 3" 
   & partsOf ([regex|\d+|] . match . unpacked . _Show @Int) %~ sort
"Monday: 3, Tuesday: 29, Wednesday: 99"

Basically anything you want to do is possible somehow.

Performance

See the benchmarks.

Summary

Caveat: I'm by no means a benchmarking expert; if you have tips on how to do this better I'm all ears!

  • Search: lens-regex-pcre is marginally slower than pcre-heavy, but well within acceptable margins (within 0.6%)
  • Replace: lens-regex-pcre beats pcre-heavy by ~10%
  • Modify: pcre-heavy doesn't support this operation at all, so I guess lens-regex-pcre wins here :)

How can it possibly be faster if it's based on pcre-heavy? lens-regex-pcre only uses pcre-heavy for finding the matches, not for substitution or replacement. After that it splits the text into chunks and traverses over them with whichever operation you've chosen. This implementation is a lot easier to understand than imperative implementations of the same thing, which makes it pretty easy to edit and is also the reason we can support arbitrary traversals/actions. It was easy enough that I went ahead and made the whole thing use ByteString Builders, which sped it up a lot. I suspect that pcre-heavy could benefit from the same optimization if anyone feels like back-porting it; it could be done (almost) as nicely using a simple traverse without any lenses. The whole thing is only about 25 LOC.
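The chunk-and-traverse approach can be sketched roughly as follows, using a plain traverse and no lenses. This is an illustration under stated assumptions, not the library's actual code: splitDigits is a hypothetical stand-in for the pcre-heavy match-finding step (it treats every run of digits as a match), and all names are invented for the example.

```haskell
import qualified Data.ByteString.Char8 as BS
import Data.ByteString.Builder (byteString, toLazyByteString)
import qualified Data.ByteString.Lazy as BL
import Data.Char (isDigit)
import Data.Functor.Identity (Identity (..))

-- Left = text between matches, Right = a match.
type Chunk = Either BS.ByteString BS.ByteString

-- Hypothetical matcher standing in for PCRE: each maximal run of digits is a match.
splitDigits :: BS.ByteString -> [Chunk]
splitDigits bs
  | BS.null bs = []
  | otherwise =
      [Left before | not (BS.null before)]
        ++ [Right digits | not (BS.null digits)]
        ++ splitDigits after
  where
    (before, rest) = BS.break isDigit bs
    (digits, after) = BS.span isDigit rest

-- Glue the chunks back together with a Builder instead of repeated appends.
rebuild :: [Chunk] -> BS.ByteString
rebuild = BL.toStrict . toLazyByteString . foldMap (byteString . either id id)

-- Run any Applicative action over the matches only. The inner 'traverse' is the
-- Traversable instance of Either, which visits Right values and skips Left ones.
editMatches
  :: Applicative f
  => (BS.ByteString -> f BS.ByteString) -> BS.ByteString -> f BS.ByteString
editMatches f = fmap rebuild . traverse (traverse f) . splitDigits

-- Pure edit: double every number.
doubled :: BS.ByteString
doubled = runIdentity (editMatches (Identity . double) (BS.pack "Monday: 29, Tuesday: 99"))
  where
    double d = BS.pack (show (2 * (read (BS.unpack d) :: Int)))

-- Effectful edit: fail the whole rewrite if any number has more than two digits.
checked :: Maybe BS.ByteString
checked = editMatches (\d -> if BS.length d <= 2 then Just d else Nothing) (BS.pack "a 12 b 345")
```

Here doubled evaluates to "Monday: 58, Tuesday: 198" and checked to Nothing. Because editMatches is polymorphic in the Applicative, the same function covers pure replacement, validation, and any other effect.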

I'm neither a benchmarks nor stats person, so please open an issue if anything here seems fishy.

Without pcre-light and pcre-heavy this library wouldn't be possible, so huge thanks to all contributors!

Here are the benchmarks on my 2013 MacBook (2.6 GHz i5):

benchmarking static pattern search/pcre-heavy ... took 20.78 s, total 56 iterations
benchmarked static pattern search/pcre-heavy
time                 375.3 ms   (372.0 ms .. 378.5 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 378.1 ms   (376.4 ms .. 380.8 ms)
std dev              3.747 ms   (922.3 μs .. 5.609 ms)

benchmarking static pattern search/lens-regex-pcre ... took 20.79 s, total 56 iterations
benchmarked static pattern search/lens-regex-pcre
time                 379.5 ms   (376.2 ms .. 382.4 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 377.3 ms   (376.5 ms .. 378.4 ms)
std dev              1.667 ms   (1.075 ms .. 2.461 ms)

benchmarking complex pattern search/pcre-heavy ... took 95.95 s, total 56 iterations
benchmarked complex pattern search/pcre-heavy
time                 1.741 s    (1.737 s .. 1.746 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.746 s    (1.744 s .. 1.749 s)
std dev              4.499 ms   (3.186 ms .. 6.080 ms)

benchmarking complex pattern search/lens-regex-pcre ... took 97.26 s, total 56 iterations
benchmarked complex pattern search/lens-regex-pcre
time                 1.809 s    (1.736 s .. 1.908 s)
                     0.996 R²   (0.991 R² .. 1.000 R²)
mean                 1.757 s    (1.742 s .. 1.810 s)
std dev              42.83 ms   (11.51 ms .. 70.69 ms)

benchmarking simple replacement/pcre-heavy ... took 23.32 s, total 56 iterations
benchmarked simple replacement/pcre-heavy
time                 423.8 ms   (422.4 ms .. 425.3 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 424.0 ms   (422.9 ms .. 426.2 ms)
std dev              2.684 ms   (1.239 ms .. 4.270 ms)

benchmarking simple replacement/lens-regex-pcre ... took 20.84 s, total 56 iterations
benchmarked simple replacement/lens-regex-pcre
time                 382.8 ms   (374.3 ms .. 391.5 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 378.2 ms   (376.3 ms .. 381.0 ms)
std dev              3.794 ms   (2.577 ms .. 5.418 ms)

benchmarking complex replacement/pcre-heavy ... took 24.77 s, total 56 iterations
benchmarked complex replacement/pcre-heavy
time                 448.1 ms   (444.7 ms .. 450.0 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 450.8 ms   (449.5 ms .. 453.9 ms)
std dev              3.129 ms   (947.0 μs .. 4.841 ms)

benchmarking complex replacement/lens-regex-pcre ... took 21.99 s, total 56 iterations
benchmarked complex replacement/lens-regex-pcre
time                 399.9 ms   (398.4 ms .. 402.2 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 399.6 ms   (399.0 ms .. 400.4 ms)
std dev              1.135 ms   (826.2 μs .. 1.604 ms)

Benchmark lens-regex-pcre-bench: FINISH

Behaviour

Precise expected behaviour (and examples) can be found in the test suites.
