mathiasbynens / Emoji Regex

License: MIT
A regular expression to match all Emoji-only symbols as per the Unicode Standard.

emoji-regex

emoji-regex offers a regular expression to match all emoji symbols and sequences (including textual representations of emoji) as per the Unicode Standard.

This repository contains a script that generates this regular expression from Unicode data. As a result, the regular expression can easily be updated whenever new emoji are added to the Unicode Standard.
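The general idea of generating a pattern from data can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual build script: the `sequences` array is hypothetical sample data, and `toPattern` is a helper invented for this example.

```javascript
// Hypothetical sample data. A real generator would read the full set of
// emoji sequences from Unicode data files instead of a hard-coded list.
const sequences = ['\u{1F469}', '\u{1F469}\u{1F3FF}', '\u{231A}'];

// Escape each code point as \u{...} so the pattern source stays ASCII.
const toPattern = (seq) =>
  [...seq]
    .map((ch) => `\\u{${ch.codePointAt(0).toString(16).toUpperCase()}}`)
    .join('');

// Put longer sequences first so the alternation prefers the longest match.
const source = [...sequences]
  .sort((a, b) => [...b].length - [...a].length)
  .map(toPattern)
  .join('|');

const generated = new RegExp(source, 'gu');
const matches = '\u{1F469}\u{1F3FF} \u{231A}'.match(generated);
console.log(source);
console.log(matches);
```

Sorting by length matters because a regular expression alternation takes the first alternative that matches. Without it, the base emoji would match first and the modifier would be left behind.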

Installation

Via npm:

npm install emoji-regex

In Node.js:

const emojiRegex = require('emoji-regex/RGI_Emoji.js');
// Note: because the regular expression has the global flag set, this module
// exports a function that returns the regex rather than exporting the regular
// expression itself, to make it impossible to (accidentally) mutate the
// original regular expression.

const text = `
\u{231A}: ⌚ default emoji presentation character (Emoji_Presentation)
\u{2194}\u{FE0F}: ↔️ default text presentation character rendered as emoji
\u{1F469}: 👩 emoji modifier base (Emoji_Modifier_Base)
\u{1F469}\u{1F3FF}: 👩🏿 emoji modifier base followed by a modifier
`;

const regex = emojiRegex();
let match;
while (match = regex.exec(text)) {
  const emoji = match[0];
  console.log(`Matched sequence ${ emoji } — code points: ${ [...emoji].length }`);
}

Console output:

Matched sequence ⌚ — code points: 1
Matched sequence ⌚ — code points: 1
Matched sequence ↔️ — code points: 2
Matched sequence ↔️ — code points: 2
Matched sequence 👩 — code points: 1
Matched sequence 👩 — code points: 1
Matched sequence 👩🏿 — code points: 2
Matched sequence 👩🏿 — code points: 2
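The note about the global flag can be illustrated with a small sketch. A regular expression with the `g` flag stores its position in `lastIndex`, so sharing one instance between callers can give surprising results. The `makeRegex` factory below is a hypothetical stand-in for the module's export, and its pattern is a simplification rather than the one emoji-regex generates.

```javascript
// Hypothetical stand-in for the emoji-regex export: a factory that returns
// a fresh global regex on every call. The pattern is NOT the real one.
function makeRegex() {
  return /\p{Emoji_Presentation}/gu;
}

// Reusing one global regex: state carries over between calls.
const shared = makeRegex();
const first = shared.test('\u{231A}');  // matches at index 0, lastIndex becomes 1
const second = shared.test('\u{231A}'); // search resumes at index 1 and fails

// Calling the factory each time: every call starts from lastIndex 0.
const freshA = makeRegex().test('\u{231A}');
const freshB = makeRegex().test('\u{231A}');

console.log({ first, second, freshA, freshB });
```

Exporting a function rather than the regex itself means each caller gets an independent `lastIndex`.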

Regular expression flavors

The package comes with three distinct regular expressions:

// This is the recommended regular expression to use. It matches all
// emoji recommended for general interchange, as defined via the
// `RGI_Emoji` property in the Unicode Standard.
// https://unicode.org/reports/tr51/#def_rgi_set
// When in doubt, use this!
const emojiRegexRGI = require('emoji-regex/RGI_Emoji.js');

// This is the old regular expression, prior to `RGI_Emoji` being
// standardized. In addition to all `RGI_Emoji` sequences, it matches
// some emoji you probably don’t want to match (such as emoji component
// symbols that are not meant to be used separately).
const emojiRegex = require('emoji-regex/index.js');

// This regular expression matches even more emoji than the previous
// one, including emoji that render as text instead of icons (i.e.
// emoji that are not `Emoji_Presentation` symbols and that aren’t
// forced to render as emoji by a variation selector).
const emojiRegexText = require('emoji-regex/text.js');
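To compare the flavors on the same input, a small helper that accepts any of the exported factories may be convenient. The sketch below assumes only that each module exports a function returning a global regex, as described above. `findAll` and `standIn` are hypothetical names, and `standIn` uses a simplified pattern so the example runs without the package installed.

```javascript
// Collect every match of the global regex produced by `regexFactory`.
function findAll(text, regexFactory) {
  return Array.from(text.matchAll(regexFactory()), (m) => m[0]);
}

// Stand-in factory for illustration only; not the pattern emoji-regex ships.
const standIn = () => /\p{Emoji_Presentation}/gu;

const found = findAll('a\u{231A}b\u{1F469}c', standIn);
console.log(found);
```

With the package installed, `findAll(text, require('emoji-regex/RGI_Emoji.js'))` and `findAll(text, require('emoji-regex/text.js'))` could be run side by side to see which sequences each flavor picks up.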

Additionally, in environments that support ES2015 Unicode escapes, you may require ES2015-style versions of the regexes:

const emojiRegexRGI = require('emoji-regex/es2015/RGI_Emoji.js');
const emojiRegex = require('emoji-regex/es2015/index.js');
const emojiRegexText = require('emoji-regex/es2015/text.js');
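For context, the sketch below contrasts the two escape styles in general terms. It does not reproduce the contents of the package's files. It shows only how an ES5-compatible pattern spells an astral code point as a surrogate pair, while an ES2015 pattern can use a code point escape together with the `u` flag.

```javascript
// ES5 style: U+1F469 written as its UTF-16 surrogate pair, no flag needed.
const es5Style = /\uD83D\uDC69/;

// ES2015 style: code point escape, which requires the `u` flag.
const es2015Style = /\u{1F469}/u;

const woman = '\u{1F469}';
const bothMatch = es5Style.test(woman) && es2015Style.test(woman);

// The string holds two UTF-16 code units but a single code point, which is
// why the usage example counts code points with [...emoji].length.
const codeUnits = woman.length;
const codePoints = [...woman].length;

console.log({ bothMatch, codeUnits, codePoints });
```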

For maintainers

How to update emoji-regex after new Unicode Standard releases

  1. Update the Unicode data dependency in package.json by running the following commands:

    # Example: updating from Unicode v12 to Unicode v13.
    npm uninstall @unicode/unicode-12.0.0
    npm install @unicode/unicode-13.0.0 --save-dev
    
  2. Generate the new output:

    npm run build
    
  3. Verify that tests still pass:

    npm test
    
  4. Send a pull request with the changes, and get it reviewed & merged.

  5. On the main branch, bump the emoji-regex version number in package.json:

    npm version patch -m 'Release v%s'
    

    Instead of patch, use minor or major as needed.

    Note that this produces a Git commit + tag.

  6. Push the release commit and tag:

    git push
    

    Our CI then automatically publishes the new release to npm.

Author

Mathias Bynens

License

emoji-regex is available under the MIT license.
