Interesting flag emoji replacements
@iwsfutcmd shared an interesting observation:
'🇧🇬🇭🇷'.replace('🇬🇭', '🇦🇬'); // → '🇧🇦🇬🇷'
This repository contains every possible variation of this gotcha according to Unicode 14’s RGI_Emoji_Flag_Sequence. There are 9,211,570 variations, clocking in at 718.50 MB, so the output is split across multiple files.
Examples:
'🇻🇦🇩🇲'.replace('🇦🇩', '🇬🇺'); // → '🇻🇬🇺🇲'
'🇰🇷🇪🇹'.replace('🇷🇪', '🇳🇱'); // → '🇰🇳🇱🇹'
'🇨🇵🇸🇨'.replace('🇵🇸', '🇻🇨'); // → '🇨🇻🇨🇨'
'🇧🇧🇸🇱'.replace('🇧🇸', '🇪🇬'); // → '🇧🇪🇬🇱'
'🇦🇹🇨🇺'.replace('🇹🇨', '🇸🇲'); // → '🇦🇸🇲🇺'
'🇬🇱🇦🇪'.replace('🇱🇦', '🇮🇪'); // → '🇬🇮🇪🇪'
'🇱🇸🇲🇭'.replace('🇸🇲', '🇰🇪'); // → '🇱🇰🇪🇭'
'🇹🇻🇦🇫'.replace('🇻🇦', '🇨🇬'); // → '🇹🇨🇬🇫'
'🇬🇭🇷🇺'.replace('🇭🇷', '🇸🇭'); // → '🇬🇸🇭🇺'
'🇦🇬🇫🇮'.replace('🇬🇫', '🇱🇸'); // → '🇦🇱🇸🇮'
'🇬🇸🇮🇹'.replace('🇸🇮', '🇹🇱'); // → '🇬🇹🇱🇹'
'🇬🇮🇸🇬'.replace('🇮🇸', '🇧🇧'); // → '🇬🇧🇧🇬'
'🇹🇹🇲🇦'.replace('🇹🇲', '🇬🇹'); // → '🇹🇬🇹🇦'
'🇬🇼🇫🇲'.replace('🇼🇫', '🇧🇯'); // → '🇬🇧🇯🇲'
'🇨🇰🇬🇦'.replace('🇰🇬', '🇲🇨'); // → '🇨🇲🇨🇦'
'🇲🇳🇦🇴'.replace('🇳🇦', '🇫🇮'); // → '🇲🇫🇮🇴'
'🇪🇦🇮🇨'.replace('🇦🇮', '🇭🇲'); // → '🇪🇭🇲🇨'
'🇮🇹🇹🇭'.replace('🇹🇹', '🇳🇪'); // → '🇮🇳🇪🇭'
'🇵🇬🇺🇬'.replace('🇬🇺', '🇲🇬'); // → '🇵🇲🇬🇬'
'🇱🇸🇸🇽'.replace('🇸🇸', '🇰🇲'); // → '🇱🇰🇲🇽'
'🇺🇾🇹🇻'.replace('🇾🇹', '🇲🇲'); // → '🇺🇲🇲🇻'
'🇸🇪🇹🇩'.replace('🇪🇹', '🇱🇨'); // → '🇸🇱🇨🇩'
'🇬🇪🇸🇭'.replace('🇪🇸', '🇸🇹'); // → '🇬🇸🇹🇭'
'🇸🇩🇲🇬'.replace('🇩🇲', '🇬🇪'); // → '🇸🇬🇪🇬'
'🇸🇲🇦🇮'.replace('🇲🇦', '🇧🇳'); // → '🇸🇧🇳🇮'
To get a random example:
shuf -n 1 < output/$(ls output/ | shuf -n 1)
The general pattern is the following, where A, B, C, D, E, and F are all flag emoji, and {A, B} are distinct from {C, D, E, F}:
'AB'.replace('C', 'D'); // → 'EF'
Note that {C, D} may overlap with {E, F}, as that leads to even more confusing results, e.g.:
'🇲🇦🇸🇱'.replace('🇦🇸', '🇲🇲'); // → '🇲🇲🇲🇱'
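The pattern above can be searched for mechanically. The following is only a minimal sketch of one way to do so, not necessarily how this repository generates its output. The helper names (toFlag, findGotchas) and the six-entry region list are assumptions made for illustration; a real run would use the full RGI_Emoji_Flag_Sequence list.

```javascript
// Convert a two-letter region code such as 'BG' into its flag emoji by
// mapping each letter A-Z onto the Regional Indicator range U+1F1E6..U+1F1FF.
const toFlag = (code) =>
  String.fromCodePoint(...[...code].map((ch) => 0x1F1E6 + ch.charCodeAt(0) - 65));

// Hypothetical, tiny subset of region codes, chosen for illustration only.
const codes = ['BG', 'HR', 'GH', 'AG', 'BA', 'GR'];
const known = new Set(codes);

function* findGotchas() {
  for (const a of codes) {
    for (const b of codes) {
      // C straddles the boundary: second half of A plus first half of B.
      const c = a[1] + b[0];
      if (!known.has(c)) continue;
      for (const d of codes) {
        // After the replacement, the leftover halves pair up with D's halves.
        const e = a[0] + d[0];
        const f = d[1] + b[1];
        if (!known.has(e) || !known.has(f)) continue;
        // {A, B} must be distinct from {C, D, E, F}.
        const rest = [c, d, e, f];
        if (rest.includes(a) || rest.includes(b)) continue;
        yield {
          input: toFlag(a) + toFlag(b),
          search: toFlag(c),
          replacement: toFlag(d),
          output: toFlag(e) + toFlag(f),
        };
      }
    }
  }
}

const results = [...findGotchas()];
for (const r of results) {
  console.log(`'${r.input}'.replace('${r.search}', '${r.replacement}'); // → '${r.output}'`);
}
```

Working on plain two-letter codes and converting to emoji only at the end keeps the string arithmetic readable; each letter corresponds to exactly one Regional Indicator symbol, so the result is the same as slicing the emoji strings directly.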
Explanation
Unicode defines the following 26 Regional_Indicator symbols:
🇦 🇧 🇨 🇩 🇪 🇫 🇬 🇭 🇮 🇯 🇰 🇱 🇲 🇳 🇴 🇵 🇶 🇷 🇸 🇹 🇺 🇻 🇼 🇽 🇾 🇿
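These symbols occupy the contiguous code point range U+1F1E6 through U+1F1FF, one per letter A to Z. As a small sketch (the variable names are illustrative assumptions), the list above can be reproduced like this:

```javascript
// Regional Indicator symbols run from U+1F1E6 (letter A) to U+1F1FF (letter Z).
const FIRST = 0x1F1E6;
const indicators = Array.from({ length: 26 }, (_, i) =>
  String.fromCodePoint(FIRST + i));

console.log(indicators.join(' '));

// Each symbol lies outside the Basic Multilingual Plane, so a JavaScript
// string stores it as a surrogate pair of two UTF-16 code units.
console.log(indicators[0].length); // 2
```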
Combining two of these into a country code results in the flag emoji for that country. So, @iwsfutcmd’s example:
'🇧🇬🇭🇷'.replace('🇬🇭', '🇦🇬'); // → '🇧🇦🇬🇷'
…is actually the following (with spaces added for clarity):
'🇧 🇬 🇭 🇷'.replace('🇬 🇭', '🇦 🇬'); // → '🇧 🇦 🇬 🇷'
The code replaces the second half (🇬) of the first flag emoji (