Regular Expression Decoder

A decoder that constructs objects from regular expression matches.


For more information about creating your own custom decoders, consult Chapter 7 of the Flight School Guide to Swift Codable. For more information about using regular expressions in Swift, check out Chapter 6 of the Flight School Guide to Swift Strings.

Requirements

  • Swift 5+
  • iOS 11+ or macOS 10.13+

Usage

import RegularExpressionDecoder

let ticker = """
AAPL 170.69▲0.51
GOOG 1122.57▲2.41
AMZN 1621.48▼18.52
MSFT 106.57=0.00
SWIFT 5.0.0▲1.0.0
"""

let pattern: RegularExpressionPattern<Stock, Stock.CodingKeys> = #"""
\b
(?<\#(.symbol)>[A-Z]{1,4}) \s+
(?<\#(.price)>\d{1,}\.\d{2}) \s*
(?<\#(.sign)>[▲▼=])
(?<\#(.change)>\d{1,}\.\d{2})
\b
"""#

let decoder = try RegularExpressionDecoder<Stock>(
                    pattern: pattern,
                    options: .allowCommentsAndWhitespace
                  )

try decoder.decode([Stock].self, from: ticker)
// Decodes [AAPL, GOOG, AMZN, MSFT] (but not SWIFT, which is invalid)

Explanation

Let's say that you're building an app that parses stock quotes from a text-based stream of price changes.

let ticker = """
AAPL 170.69▲0.51
GOOG 1122.57▲2.41
AMZN 1621.48▼18.52
MSFT 106.57=0.00
"""

Each stock is represented by the following structure:

  • The symbol, consisting of 1 to 4 uppercase letters, followed by a space
  • The price, formatted as a number with 2 decimal places
  • A sign, indicating a price gain (▲), loss (▼), or no change (=)
  • The magnitude of the gain or loss, formatted the same as the price

These format constraints lend themselves naturally to representation by a regular expression, such as:

/\b[A-Z]{1,4} \d{1,}\.\d{2}[▲▼=]\d{1,}\.\d{2}\b/

Note: The \b metacharacter anchors matches to word boundaries.

This regular expression can distinguish between valid and invalid stock quotes.

"AAPL 170.69▲0.51" // valid
"SWIFT 5.0.0▲1.0.0" // invalid
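
As a minimal sketch of that check, the pattern above can be handed to NSRegularExpression from Foundation. The helper name isValidQuote is hypothetical and not part of RegularExpressionDecoder; it reports whether the pattern matches anywhere in the given text.

```swift
import Foundation

// The pattern from above, written as a raw string literal.
let quotePattern = #"\b[A-Z]{1,4} \d{1,}\.\d{2}[▲▼=]\d{1,}\.\d{2}\b"#

// Hypothetical helper: true if the text contains a well-formed stock quote.
func isValidQuote(_ text: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: quotePattern) else {
        return false
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, options: [], range: fullRange) != nil
}

print(isValidQuote("AAPL 170.69▲0.51"))
print(isValidQuote("SWIFT 5.0.0▲1.0.0"))
```

The second quote is rejected because "SWIFT" has five letters, so no run of 1 to 4 uppercase letters starts at a word boundary and is followed by a space.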

However, to extract individual components from a quote (symbol, price, etc.) the regular expression must contain capture groups, of which there are two varieties: positional capture groups and named capture groups.

Positional capture groups are demarcated in the pattern by enclosing parentheses ((___)). With some slight modifications, we can make the original regular expression capture each part of the stock quote:

/\b([A-Z]{1,4}) (\d{1,}\.\d{2})([▲▼=])(\d{1,}\.\d{2})\b/

When matched, the symbol can be accessed by the first capture group, the price by the second, and so on.
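
A rough sketch of reading positional groups with NSRegularExpression follows. The function name positionalCaptures is an assumption made for illustration; group 0 is the whole match, so the captures start at index 1.

```swift
import Foundation

let positionalPattern = #"\b([A-Z]{1,4}) (\d{1,}\.\d{2})([▲▼=])(\d{1,}\.\d{2})\b"#

// Hypothetical helper: returns the captured substrings of the first match,
// in group order, or nil if the text does not contain a quote.
func positionalCaptures(in text: String) -> [String]? {
    guard let regex = try? NSRegularExpression(pattern: positionalPattern),
          let match = regex.firstMatch(in: text,
                                       options: [],
                                       range: NSRange(text.startIndex..., in: text))
    else {
        return nil
    }

    // Skip index 0, which is the range of the entire match.
    return (1..<match.numberOfRanges).compactMap { index in
        Range(match.range(at: index), in: text).map { String(text[$0]) }
    }
}

if let parts = positionalCaptures(in: "AAPL 170.69▲0.51") {
    print(parts) // symbol, price, sign, change (in that order)
}
```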

For large numbers of capture groups, especially in patterns with nested groups, one can easily lose track of which parts correspond to which positions. So another approach is to assign names to capture groups, which are denoted by the syntax (?<NAME>___).

/\b
(?<symbol>[A-Z]{1,4}) \s+
(?<price>\d{1,}\.\d{2}) \s*
(?<sign>[▲▼=])
(?<change>\d{1,}\.\d{2})
\b/

Note: Most regular expression engines, including the one used by NSRegularExpression, provide a mode to ignore whitespace; this lets you segment long patterns over multiple lines, making them easier to read and understand.

Theoretically, this approach allows you to access each group by name for each match of the regular expression. In practice, doing this in Swift can be inconvenient, as it requires you to interact with cumbersome NSRegularExpression APIs and somehow incorporate the results into your model layer.
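
To give a sense of that manual approach, here is a sketch that reads named groups directly with NSRegularExpression. The function name namedCaptures and the hard-coded list of group names are assumptions for illustration. It relies on range(withName:), which is available on iOS 11+ and macOS 10.13+; availability on other platforms is assumed rather than verified here.

```swift
import Foundation

let namedPattern = #"""
\b
(?<symbol>[A-Z]{1,4}) \s+
(?<price>\d{1,}\.\d{2}) \s*
(?<sign>[▲▼=])
(?<change>\d{1,}\.\d{2})
\b
"""#

// Hypothetical helper: one dictionary of group name to captured text per match.
func namedCaptures(in text: String) throws -> [[String: String]] {
    // Whitespace in the pattern is ignored, so it can span several lines.
    let regex = try NSRegularExpression(pattern: namedPattern,
                                        options: .allowCommentsAndWhitespace)
    let names = ["symbol", "price", "sign", "change"]
    let fullRange = NSRange(text.startIndex..., in: text)

    return regex.matches(in: text, options: [], range: fullRange).map { match -> [String: String] in
        var fields: [String: String] = [:]
        for name in names {
            // Convert each NSRange back into a Swift string range.
            if let range = Range(match.range(withName: name), in: text) {
                fields[name] = String(text[range])
            }
        }
        return fields
    }
}

let sampleTicker = """
AAPL 170.69▲0.51
GOOG 1122.57▲2.41
"""

if let quotes = try? namedCaptures(in: sampleTicker) {
    print(quotes.compactMap { $0["symbol"] })
}
```

Every value still comes back as a String keyed by a string name, so converting the captures into typed model properties is left to the caller.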

RegularExpressionDecoder provides a convenient solution to constructing Decodable objects from regular expression matches by automatically matching coding keys to capture group names. And it can do so safely, thanks to the new ExpressibleByStringInterpolation protocol in Swift 5.

To understand how, let's start by considering the following Stock model, which adopts the Decodable protocol:

struct Stock: Decodable {
    let symbol: String
    var price: Double

    enum Sign: String, Decodable {
        case gain = "▲"
        case unchanged = "="
        case loss = "▼"
    }

    private var sign: Sign
    private var change: Double = 0.0
    var movement: Double {
        switch sign {
        case .gain: return +change
        case .unchanged: return 0.0
        case .loss: return -change
        }
    }
}

So far, so good.

Now, normally, the Swift compiler automatically synthesizes conformance to Decodable, including a nested CodingKeys type. But in order to make this next part work correctly, we'll have to declare CodingKeys ourselves:

extension Stock {
    enum CodingKeys: String, CodingKey {
        case symbol
        case price
        case sign
        case change
    }
}

Here's where things get really interesting: remember our regular expression with named capture groups from before? We can replace the hard-coded names with interpolations of the Stock type's coding keys.

import RegularExpressionDecoder

let pattern: RegularExpressionPattern<Stock, Stock.CodingKeys> = #"""
\b
(?<\#(.symbol)>[A-Z]{1,4}) \s+
(?<\#(.price)>\d{1,}\.\d{2}) \s*
(?<\#(.sign)>[▲▼=])
(?<\#(.change)>\d{1,}\.\d{2})
\b
"""#

Note: This example benefits greatly from another new feature in Swift 5: raw string literals. Those octothorps (#) at the start and end tell the compiler to ignore escape characters (\) unless they also include an octothorp (\#( )). Using raw string literals, we can write regular expression metacharacters like \b, \d, and \s without double escaping them (i.e. \\b).
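
A short comparison, using only the standard library, shows the difference. The constant names are made up for this example.

```swift
// Ordinary string literal: every backslash must be doubled.
let escapedPattern = "\\b\\d{1,}\\.\\d{2}\\b"

// Raw string literal: backslashes are taken literally.
let rawPattern = #"\b\d{1,}\.\d{2}\b"#

// Interpolation in a raw string literal needs the extra octothorp.
let groupName = "price"
let namedGroup = #"(?<\#(groupName)>\d{1,}\.\d{2})"#

print(escapedPattern == rawPattern) // prints "true"
print(namedGroup)                   // prints "(?<price>\d{1,}\.\d{2})"
```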

Thanks to ExpressibleByStringInterpolation, we can restrict interpolation segments to only accept those coding keys, thereby ensuring a direct 1:1 match between capture groups and their decoded properties. And not only that: this approach lets us verify that keys have valid regex-friendly names and aren't captured more than once. It's enormously powerful, allowing code to be incredibly expressive without compromising safety or performance.
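
The general technique can be sketched as follows. KeyedPattern and Field are hypothetical names invented for this example; this is not the library's RegularExpressionPattern implementation, only an illustration of how a custom StringInterpolation type can limit interpolation segments to coding keys and record which ones were used.

```swift
// Hypothetical pattern type whose interpolations accept only keys of type Key.
struct KeyedPattern<Key: CodingKey>: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var pattern = ""
        var capturedNames: [String] = []

        init(literalCapacity: Int, interpolationCount: Int) {
            pattern.reserveCapacity(literalCapacity)
        }

        // Literal segments are copied into the pattern unchanged.
        mutating func appendLiteral(_ literal: String) {
            pattern.append(literal)
        }

        // The only interpolation overload: anything that is not a Key
        // fails to compile.
        mutating func appendInterpolation(_ key: Key) {
            pattern.append(key.stringValue)
            capturedNames.append(key.stringValue)
        }
    }

    let pattern: String
    let capturedNames: [String]

    // True if any key was interpolated more than once.
    var hasDuplicateNames: Bool {
        Set(capturedNames).count != capturedNames.count
    }

    init(stringLiteral value: String) {
        pattern = value
        capturedNames = []
    }

    init(stringInterpolation: StringInterpolation) {
        pattern = stringInterpolation.pattern
        capturedNames = stringInterpolation.capturedNames
    }
}

// Hypothetical coding keys for the usage example.
enum Field: String, CodingKey {
    case symbol
    case price
}

let keyed: KeyedPattern<Field> = #"(?<\#(.symbol)>[A-Z]{1,4}) (?<\#(.price)>\d{1,}\.\d{2})"#
print(keyed.pattern)
print(keyed.capturedNames)
```

With this shape, writing \#("symbol") or \#(42) inside the literal is a compile-time error, because no appendInterpolation overload accepts a String or an Int. The recorded names give a real implementation a place to check for duplicates or invalid group names.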

When all is said and done, RegularExpressionDecoder lets you decode types from a string according to a regular expression pattern much the same as you might from JSON or a property list using a decoder:

let decoder = try RegularExpressionDecoder<Stock>(
                    pattern: pattern,
                    options: .allowCommentsAndWhitespace
                  )

try decoder.decode([Stock].self, from: ticker)
// Decodes [AAPL, GOOG, AMZN, MSFT]

License

MIT

Contact

Mattt (@mattt)
