begin / Globbing

Introduction to "globbing" or glob matching, a programming concept that allows "filepath expansion" and matching using wildcards.

globbing

Cheatsheet and introduction to "globbing", a programming concept that involves the use of wildcards and special characters for matching and filtering.

Contributing

We love contributors!

Pull requests to add documentation, links to tools, corrections or anything else are warmly accepted and gratefully appreciated!

Too busy to contribute directly, but still want to show your support? Please consider starring this project or tweeting about it!

What is "globbing"?

The term "globbing", also referred to as "glob matching" or "filepath expansion", is a programming concept that describes the process of using wildcards, referred to as "glob patterns" or "globs", for matching file paths or other similar sets of strings.

Glob patterns are similar to regular expressions, but much simpler and more limited in scope. They are defined using a combination of special characters, or wildcards, alongside literal characters that match only themselves. For example, the glob pattern *.txt would match all files in a directory with a .txt file extension.
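
As a small illustration of the *.txt example, here is a sketch using Python's standard-library fnmatch module. The file names are made up for the example, and fnmatchcase is used so the comparison is case-sensitive on every platform.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, used only for illustration.
names = ["notes.txt", "README.md", "todo.txt", "archive.tar.gz"]

# Keep only the names that match the glob pattern *.txt.
txt_files = [name for name in names if fnmatchcase(name, "*.txt")]

print(txt_files)  # ['notes.txt', 'todo.txt']
```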

Globbing syntax

TODO: describe wildcards vs. extended globbing, and segue to following sections

  • wildcards
  • extended globbing

Wildcards

The characters most commonly supported across globbing implementations for basic wildcard matching are *, ? and a simplified version of regex brackets for matching any of a given set of characters.

Although many different globbing implementations exist across a number of different languages and environments, the following table summarizes the most commonly supported basic globbing features.

| Character | Description |
| --- | --- |
| * | Matches any character zero or more times, except for /. |
| ** | Matches any character zero or more times, including /. |
| ? | Matches any character one time. |
| [abc] | Matches any of the specified characters (in this case, a, b or c). |

Special exceptions:

  • * typically does not match dotfiles (file names starting with a .) unless explicitly enabled by the user via options
  • ? also typically does not match the leading dot
  • More than two stars in a glob path segment are typically interpreted as a single star (e.g. /***/ is the same as /*/)
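
The rules above can be sketched as a small glob-to-regex translator. This is a minimal illustration, not how any particular library works: glob_to_regex is a hypothetical helper, it leaves out negated sets such as [!abc], and it follows the table literally (so ** here needs at least the surrounding slashes to be present).

```python
import re

def glob_to_regex(pattern):
    """Translate a basic glob into a compiled regex (illustrative sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        # A wildcard at the start of a path segment must not match a leading dot.
        no_dot = r"(?!\.)" if i == 0 or pattern[i - 1] == "/" else ""
        if ch == "*":
            j = i
            while j < len(pattern) and pattern[j] == "*":
                j += 1
            if j - i == 2:
                parts.append(".*")              # "**" may cross "/" boundaries
            else:
                parts.append(no_dot + "[^/]*")  # "*" and "***" stay in one segment
            i = j
        elif ch == "?":
            parts.append(no_dot + "[^/]")       # exactly one character, never "/"
            i += 1
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            parts.append(pattern[i:end + 1])    # bracket set is carried over as-is
            i = end + 1
        else:
            parts.append(re.escape(ch))         # everything else is literal
            i += 1
    return re.compile("".join(parts))

def is_match(pattern, path):
    return glob_to_regex(pattern).fullmatch(path) is not None

print(is_match("*.js", "a.js"))                      # True
print(is_match("*.js", "src/a.js"))                  # False: "*" stops at "/"
print(is_match("src/**/*.js", "src/lib/util/a.js"))  # True: "**" crosses "/"
print(is_match("*.js", ".eslintrc.js"))              # False: leading dot
```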

Implementations

Globbing is typically enabled using third-party libraries. A notable exception, bash, provides built-in support for basic globbing.

TODO: add list of implementations

Additional reading

Extended globbing

In addition to wildcard matching, extended globbing describes the addition of more advanced matching features, such as:

  • brace expansion
  • extglobs
  • POSIX character classes
  • regular expressions

TBC...

brace expansion

In simple cases, brace expansion appears to work the same way as the logical OR operator. For example, (a|b) will achieve the same result as {a,b}.

Here are some powerful features unique to brace expansion (versus character classes):

  • range expansion: a{1..3}b/*.js expands to: ['a1b/*.js', 'a2b/*.js', 'a3b/*.js']
  • nesting: a{c,{d,e}}b/*.js expands to: ['acb/*.js', 'adb/*.js', 'aeb/*.js']
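
To make the two features above concrete, here is a minimal sketch of brace expansion that handles comma lists, nesting and integer ranges. The expand_braces function is a hypothetical helper written for this example. It is not the algorithm used by bash or by the braces library, and it ignores escapes, steps and alphabetic ranges.

```python
import re

def expand_braces(pattern):
    """Expand {a,b} lists, nested braces and {1..3} ranges (illustrative sketch)."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]
    depth = 0
    parts = []
    piece_start = start + 1
    for i in range(start, len(pattern)):
        ch = pattern[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                parts.append(pattern[piece_start:i])
                end = i
                break
        elif ch == "," and depth == 1:
            # Only commas at the top level of this group separate alternatives.
            parts.append(pattern[piece_start:i])
            piece_start = i + 1
    else:
        return [pattern]  # unbalanced braces: treat the pattern as literal
    prefix, suffix = pattern[:start], pattern[end + 1:]
    if len(parts) == 1:
        rng = re.fullmatch(r"(-?\d+)\.\.(-?\d+)", parts[0])
        if rng is None:
            # A single item is neither a list nor a range: keep the braces.
            return [prefix + "{" + inner + "}" + tail
                    for inner in expand_braces(parts[0])
                    for tail in expand_braces(suffix)]
        lo, hi = int(rng.group(1)), int(rng.group(2))
        step = 1 if lo <= hi else -1
        parts = [str(n) for n in range(lo, hi + step, step)]
    return [prefix + head + tail
            for part in parts
            for head in expand_braces(part)      # nested groups expand here
            for tail in expand_braces(suffix)]

print(expand_braces("a{1..3}b/*.js"))     # ['a1b/*.js', 'a2b/*.js', 'a3b/*.js']
print(expand_braces("a{c,{d,e}}b/*.js"))  # ['acb/*.js', 'adb/*.js', 'aeb/*.js']
```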

TBC...

Visit the braces library for more examples and information about brace expansion.

extglobs

TBC...

As described by the bash man page:

| pattern | regex equivalent | matches |
| --- | --- | --- |
| ?(foo) | (foo)? | zero or one occurrence of the given patterns |
| *(foo) | (foo)* | zero or more occurrences of the given patterns |
| +(foo) | (foo)+ | one or more occurrences of the given patterns |
| @(foo) | (foo) * | one of the given patterns |
| !(foo) | ^(?:(?!(foo)$).*?)$ | anything except one of the given patterns |

* Note that @ isn't a RegEx character.
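
The regex equivalents in the table can be tried directly with Python's re module. The sketch below assumes the whole string must match (re.fullmatch), and the candidate strings are made up for the example. It only exercises the regex column; it does not parse extglob syntax.

```python
import re

# Regex equivalents copied from the table above.
EXTGLOB_REGEX = {
    "?(foo)": r"(foo)?",
    "*(foo)": r"(foo)*",
    "+(foo)": r"(foo)+",
    "@(foo)": r"(foo)",
    "!(foo)": r"^(?:(?!(foo)$).*?)$",
}

candidates = ["", "foo", "foofoo", "bar"]

def matches(extglob):
    """Return the candidates fully matched by the pattern's regex equivalent."""
    regex = re.compile(EXTGLOB_REGEX[extglob])
    return [s for s in candidates if regex.fullmatch(s)]

print(matches("?(foo)"))  # ['', 'foo']
print(matches("*(foo)"))  # ['', 'foo', 'foofoo']
print(matches("+(foo)"))  # ['foo', 'foofoo']
print(matches("@(foo)"))  # ['foo']
print(matches("!(foo)"))  # ['', 'foofoo', 'bar']
```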

Example implementations

  • extglob - extended glob parser and matcher for node.js

POSIX character classes

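POSIX character classes, used inside "bracket expressions", provide a way of defining regular expressions using something closer to plain English.

For example, the pattern [[:alpha:]][[:digit:]] (a letter followed by a digit) would match a1, but not aa. By contrast, [[:alpha:][:digit:]] is a single bracket expression that matches exactly one character that is either a letter or a digit.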
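
Python's re module does not understand POSIX class names, so the sketch below rewrites a few of them into plain character ranges first. Both the posix_to_regex helper and the ASCII-only class table are assumptions made for this illustration, roughly matching the C locale. Note that a single bracket expression matches exactly one character.

```python
import re

# ASCII-only stand-ins for a few POSIX classes (an assumption; real
# implementations may be locale-aware and support more class names).
POSIX_CLASSES = {
    "[:alpha:]": "a-zA-Z",
    "[:digit:]": "0-9",
    "[:alnum:]": "a-zA-Z0-9",
    "[:upper:]": "A-Z",
    "[:lower:]": "a-z",
}

def posix_to_regex(pattern):
    """Replace POSIX class names with regex ranges and compile the result."""
    for name, chars in POSIX_CLASSES.items():
        pattern = pattern.replace(name, chars)
    return re.compile(pattern)

letter_then_digit = posix_to_regex("[[:alpha:]][[:digit:]]")  # [a-zA-Z][0-9]
letter_or_digit = posix_to_regex("[[:alpha:][:digit:]]")      # [a-zA-Z0-9]

print(bool(letter_then_digit.fullmatch("a1")))  # True
print(bool(letter_then_digit.fullmatch("aa")))  # False
print(bool(letter_or_digit.fullmatch("a")))     # True
print(bool(letter_or_digit.fullmatch("a1")))    # False: one character only
```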

TBC...

Example implementations

  • expand-brackets, node.js API for parsing and matching POSIX bracket expressions

Regular expressions

This section describes matching using regular expressions as it relates to globbing.

regex character classes

When regex character classes are used in glob patterns, with the exception of brace expansion ({a,b}, {1..5}, etc), most of the special characters convert directly to regex, so you can expect them to follow the same rules and produce the same results as regex.

For example, given the list: ['a.js', 'b.js', 'c.js', 'd.js', 'E.js']:

  • [ac].js: matches either a or c, returning ['a.js', 'c.js']
  • [b-d].js: matches from b to d, returning ['b.js', 'c.js', 'd.js']
  • [A-Z].js: matches any uppercase letter, returning ['E.js']

However, there is

Learn more about regex character classes.
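
Because bracket sets carry over to regex unchanged, the examples in this section can be checked with a few lines of Python. In this sketch only the literal dot is escaped; the select helper is hypothetical and deliberately ignores every other glob feature.

```python
import re

files = ["a.js", "b.js", "c.js", "d.js", "E.js"]

def select(pattern):
    """Treat the pattern as a regex, escaping only the literal dot."""
    regex = re.compile(pattern.replace(".", r"\."))
    return [name for name in files if regex.fullmatch(name)]

print(select("[ac].js"))   # ['a.js', 'c.js']
print(select("[b-d].js"))  # ['b.js', 'c.js', 'd.js']
print(select("[A-Z].js"))  # ['E.js']
```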

regex groups

Given ['a.js', 'b.js', 'c.js', 'd.js', 'E.js']:

  • (a|c).js: would match either a or c, returning ['a.js', 'c.js']
  • (b|d).js: would match either b or d, returning ['b.js', 'd.js']
  • (b|[A-Z]).js: would match either b or an uppercase letter, returning ['b.js', 'E.js']

As with regex, parentheses can be nested, so patterns like ((a|b)|c)/b will work. But it might be easier to achieve your goal using brace expansion.
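
The relationship between regex groups and brace lists can be sketched as a character-for-character rewrite. The braces_to_group helper below is hypothetical and only handles comma lists (no ranges or escapes); the select helper treats the pattern as a regex after escaping the literal dot.

```python
import re

files = ["a.js", "b.js", "c.js", "d.js", "E.js"]

def select(pattern):
    """Match file names against a pattern containing regex groups."""
    regex = re.compile(pattern.replace(".", r"\."))
    return [name for name in files if regex.fullmatch(name)]

def braces_to_group(pattern):
    """Rewrite a brace list such as {a,b} as the regex group (a|b)."""
    return pattern.translate(str.maketrans({"{": "(", "}": ")", ",": "|"}))

print(select("(a|c).js"))              # ['a.js', 'c.js']
print(select("(b|[A-Z]).js"))          # ['b.js', 'E.js']
print(braces_to_group("{{a,b},c}/b"))  # ((a|b)|c)/b
```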

Common options

The following options are commonly available on various globbing implementations.

| Option name | Description |
| --- | --- |
| extglob | Enable extended globs. In addition to the traditional globs (using wildcards: *, ? and [...]), extended globs add (almost) the expressive power of regular expressions, allowing the use of patterns like `foo/!(a |
| dotglob | Allows files beginning with . to be included in matches. This option is automatically enabled if the glob pattern begins with a dot. Aliases: dot (supported by: minimatch, micromatch) |
| failglob | Report an error when no matches are found. |
| globignore | Allows you to specify patterns a glob should not match. Aliases: ignore (supported by: minimatch, micromatch) |
| globstar | Recursively match directory paths (enabled by default in minimatch and micromatch, but not in bash). |
| nocaseglob | Perform case-insensitive pathname expansion. |
| nocasematch | Perform case-insensitive matching. Aliases: nocase (supported by: minimatch, micromatch) |
| nullglob | When enabled in micromatch, the pattern itself is returned when no matches are found. Aliases: nonull (supported by: minimatch, micromatch). Note that bash's own nullglob has the opposite effect: a pattern with no matches expands to nothing instead of being left as-is. |
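
To show how a few of these options change the result, here is a sketch built on Python's fnmatch module. The glob_match function and its keyword names (dot, nocase, nonull, failglob) are hypothetical, modeled on the option names above; they are not the API of bash, minimatch or micromatch.

```python
from fnmatch import fnmatchcase

def glob_match(names, pattern, *, dot=False, nocase=False,
               nonull=False, failglob=False):
    """Filter names by a glob, honoring a few illustrative options."""
    def is_match(name):
        # Without dot, a leading "." is only matched if the pattern starts with one.
        if name.startswith(".") and not (dot or pattern.startswith(".")):
            return False
        if nocase:
            return fnmatchcase(name.lower(), pattern.lower())
        return fnmatchcase(name, pattern)

    matched = [name for name in names if is_match(name)]
    if not matched:
        if failglob:
            raise ValueError(f"no matches found for {pattern!r}")
        if nonull:
            return [pattern]  # hand back the pattern itself
    return matched

names = ["a.js", "B.JS", ".eslintrc.js", "notes.md"]

print(glob_match(names, "*.js"))                # ['a.js']
print(glob_match(names, "*.js", dot=True))      # ['a.js', '.eslintrc.js']
print(glob_match(names, "*.js", nocase=True))   # ['a.js', 'B.JS']
print(glob_match(names, "*.txt"))               # []
print(glob_match(names, "*.txt", nonull=True))  # ['*.txt']
```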

Caveats

WIP

Like regular expressions, glob patterns describe a regular language and must be interpreted by a computer program before any actual matching takes place. This introduces two areas of risk:

  • interpreting patterns
  • performing matches

Interpreting patterns

TODO (The process of parsing and compiling a glob pattern into a regular expression...)

Performing matches

TODO (The process of matching the compiled regular expression against a set of strings)

Risks

  • Intentional denial-of-service (DoS) attacks using maliciously crafted patterns
  • Unintentional denial of service caused by aggressively greedy wildcard patterns
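
One possible mitigation is to normalize and bound a pattern before compiling it. The sketch below is only an illustration: sanitize_glob and the MAX_PATTERN_LENGTH limit are assumptions made for this example, not features of any library listed here, and they do not remove the risk entirely.

```python
import re

MAX_PATTERN_LENGTH = 256  # assumed limit; tune for the application

def sanitize_glob(pattern):
    """Reject oversized patterns and collapse redundant runs of stars."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("glob pattern is too long")
    # Three or more stars behave like a single star, so collapse them
    # before they turn into several adjacent greedy wildcards.
    return re.sub(r"\*{3,}", "*", pattern)

print(sanitize_glob("a/***/b"))      # a/*/b
print(sanitize_glob("src/**/*.js"))  # src/**/*.js
```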

Resources

Learn more about globbing.

Guides and documentation

Tools and software

| Name | Programming language |
| --- | --- |
| bash-glob | JavaScript/node.js |
| brace-expansion | JavaScript/node.js |
| braces | JavaScript/node.js |
| expand-brackets | JavaScript/node.js |
| glob | JavaScript/node.js |
| micromatch | JavaScript/node.js |
| minimatch | JavaScript/node.js |
| nanomatch | JavaScript/node.js |

Related concepts

The following concepts are similar to or include the concept of globbing:

TODO

  • [ ] cheatsheet
  • [ ] what is globbing
  • [ ] wildcards
  • [ ] extglobs
  • [ ] posix brackets
  • [ ] braces
  • [ ] options