UnicodeDB

This library brings the Unicode database to Nim. The main goals are O(1) access for every API and a small footprint.

Note: this library doesn't provide Unicode Common Locale Data Repository (CLDR) data.

Install

nimble install unicodedb

Compatibility

Nim 0.18.0, +0.19.0, +0.20.0

Usage

Properties

import unicode
import unicodedb/properties

assert Rune('A'.ord).unicodeCategory() == ctgLu  # 'L'etter, 'u'ppercase
assert Rune('A'.ord).unicodeCategory() in ctgLm+ctgLo+ctgLu+ctgLl+ctgLt
assert Rune('A'.ord).unicodeCategory() in ctgL

echo Rune(0x0660).bidirectional() # 'A'rabic, 'N'umber
# "AN"

echo Rune(0x860).combining()
# 0

echo nfcQcNo in Rune(0x0374).quickCheck()
# true

docs

Names

import unicode
import unicodedb/names

echo lookupStrict("LEFT CURLY BRACKET")  # '{'
# Rune(0x007B)

echo "/".runeAt(0).name()
# "SOLIDUS"

docs

Compositions

import unicode
import unicodedb/compositions

echo composition(Rune(108), Rune(803))
# Rune(7735)

docs

Decompositions

import unicode
import unicodedb/decompositions

echo Rune(0x0F9D).decomposition()
# @[Rune(0x0F9C), Rune(0x0FB7)]

docs

Types

import unicode
import unicodedb/types

assert utmDecimal in Rune(0x0030).unicodeTypes()
assert utmDigit in Rune(0x00B2).unicodeTypes()
assert utmNumeric in Rune(0x2CFD).unicodeTypes()
assert utmLowercase in Rune(0x1E69).unicodeTypes()
assert utmUppercase in Rune(0x0041).unicodeTypes()
assert utmCased in Rune(0x0041).unicodeTypes()
assert utmWhiteSpace in Rune(0x0009).unicodeTypes()
assert utmWord in Rune(0x1E69).unicodeTypes()

const alphaNumeric = utmLowercase + utmUppercase + utmNumeric
assert alphaNumeric in Rune(0x2CFD).unicodeTypes()
assert alphaNumeric in Rune(0x1E69).unicodeTypes()
assert alphaNumeric in Rune(0x0041).unicodeTypes()

docs

Widths

import unicode
import unicodedb/widths

assert "🕺".runeAt(0).unicodeWidth() == uwdtWide

docs

Scripts

import unicode
import unicodedb/scripts

assert "諸".runeAt(0).unicodeScript() == sptHan

docs

Casing

import sequtils
import unicode
import unicodedb/casing

assert toSeq("Ⓗ".runeAt(0).lowerCase) == @["ⓗ".runeAt(0)]
assert toSeq("İ".runeAt(0).lowerCase) == @[0x0069.Rune, 0x0307.Rune]

assert toSeq("ⓗ".runeAt(0).upperCase) == @["Ⓗ".runeAt(0)]
assert toSeq("ﬃ".runeAt(0).upperCase) == @['F'.ord.Rune, 'F'.ord.Rune, 'I'.ord.Rune]

assert toSeq("ß".runeAt(0).titleCase) == @['S'.ord.Rune, 's'.ord.Rune]

assert toSeq("ᾈ".runeAt(0).caseFold) == @["ἀ".runeAt(0), "ι".runeAt(0)]

docs

Segmentation

import unicode
import unicodedb/segmentation

assert 0x000B.Rune.wordBreakProp == sgwNewline

docs

Related libraries

Storage

Storage is based on multi-stage tables and minimal perfect hashing data-structures.
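
As a rough illustration of the multi-stage idea, here is a minimal two-stage table sketch. Everything in it (the names blockSize, blocks, stage1 and lookup, the block size, and the property values) is hypothetical and chosen for readability. It is not the generated data or the layout this library actually uses, and it does not cover the minimal perfect hashing part.

```nim
# Two-stage table sketch (illustrative only, not unicodedb's real tables).
const
  blockSize = 4  # hypothetical; real tables would use much larger blocks

  # Stage 2: deduplicated blocks of property values.
  blocks = [
    [0, 0, 0, 0],  # block 0: every code point in the block has property 0
    [1, 1, 2, 2],  # block 1: mixed properties
  ]

  # Stage 1: maps (code point div blockSize) to an index into `blocks`.
  # Code points 0..3 and 8..15 share block 0; code points 4..7 use block 1.
  stage1 = [0, 1, 0, 0]

proc lookup(cp: int): int =
  ## Two array reads, no searching, so access is O(1).
  let blockIdx = stage1[cp div blockSize]
  blocks[blockIdx][cp mod blockSize]

echo lookup(5)
# 1

echo lookup(13)
# 0
```

Because identical blocks are stored once and shared through the first stage, long runs of code points with the same property cost almost nothing, which is one way a table like this can stay small while keeping constant-time access.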

Sizes

These are the current collection sizes:

properties is 40KB. Used by properties(1), category(1), bidirectional(1), combining(1) and quickCheck(1)
compositions is 12KB. Used by composition(1)
decompositions is 89KB. Used by decomposition(1) and canonicalDecomposition(1)
names is 578KB. Used by name(1) and lookupStrict(1)
names (lookup) is 241KB. Used by lookupStrict(1)

Missing APIs

New APIs will be added from time to time. If you need something that's missing, please open an issue or PR and mention the use case.

Upgrading Unicode version

Note: PRs upgrading the Unicode version won't get merged. Open an issue instead!

Run nimble gen and check that there are no changes to ./src/*_data.nim. If there are, try an older Nim version and fix the generators accordingly.
Run nimble gen_tests to update all test data to the current Unicode version. The tests for a new Unicode version run against the previous Unicode version.
Run the tests and fix any failures. This should only require temporarily commenting out all checks for missing code points.
Overwrite the ./gen/UCD data with the latest Unicode UCD.
Run nimble gen to generate the new data.
Run the tests again and add the checks for missing code points back. A handful of code points may have changed their data; check the Unicode changelog page, make sure they are correct, and skip them.

Tests

Initial tests were run against a dump of Python's unicodedata module to ensure correctness. The related libraries also have their own custom tests (some of the test data is provided by the Unicode Consortium).

nimble test

Contributing

I plan to work on most of the missing related libraries (case folding, etc). If you would like to work on one of those, please let me know and I'll add it to the list. If you find that required database data is missing, open either an issue or a PR.

LICENSE

MIT
