
blackwinter / Unicode

Unicode normalization library. (Mirror of Yoshida-san's code base to maintain the RubyGem.)


	   Unicode Library for Ruby
		Version 0.4.4

	       Yoshida Masato

Introduction

Unicode string manipulation library for Ruby. This library is based on UAX #15 Unicode Normalization Forms (*1).

*1 URL: http://www.unicode.org/unicode/reports/tr15/

Install

This library works with ruby-1.8.7 or later; ruby-1.9.3 or later is recommended.

Build and install in the usual way. For example, when Ruby supports dynamic linking on your OS:

  ruby extconf.rb
  make
  make install

To install using gem, for example:

  gem build unicode.gemspec
  gem install unicode

Usage

If you do not link this module with Ruby statically,

require "unicode"

before using.

Module Functions

All parameters of functions must be UTF-8 strings.

Unicode::strcmp(str1, str2)
Unicode::strcmp_compat(str1, str2)
  Compare Unicode strings with a normalization.
  strcmp uses the Normalization Form D, strcmp_compat uses Normalization Form KD.
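The comparison rule above can be sketched as follows. This is only an illustration under stated assumptions: it needs Ruby 2.2 or later and uses the built-in String#unicode_normalize as a stand-in, so it runs without this extension. `normalized_cmp` is a hypothetical helper, not part of the library.

```ruby
# Hypothetical helper (not part of this library): normalize both strings,
# then compare them. Ruby's built-in String#unicode_normalize stands in
# for the library's own normalization.
def normalized_cmp(str1, str2, form = :nfd)
  str1.unicode_normalize(form) <=> str2.unicode_normalize(form)
end

precomposed = "\u00e9"   # "é" as a single code point
combining   = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

p precomposed == combining                 # => false (different code points)
p normalized_cmp(precomposed, combining)   # => 0 (equal under form D)
p normalized_cmp("\ufb01", "fi")           # nonzero: form D keeps the "fi" ligature
p normalized_cmp("\ufb01", "fi", :nfkd)    # => 0 (form KD expands the ligature)
```

With the extension loaded, the corresponding calls per this README would be Unicode::strcmp and Unicode::strcmp_compat.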

Unicode::decompose(str)
Unicode::decompose_compat(str)
  Decompose Unicode string. Then the trailing characters are sorted in canonical order.
  decompose uses the canonical decomposition, decompose_compat uses the compatibility decomposition.
  The decomposition is based on the character decomposition mapping in UnicodeData.txt and the Hangul decomposition algorithm.
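A short sketch of what canonical and compatibility decomposition produce. It assumes Ruby 2.2 or later and uses the built-in String#unicode_normalize instead of this extension; the helper names are hypothetical.

```ruby
# Hypothetical helper: render a string as a list of code points.
def hex_code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

# Stand-ins built on Ruby's String#unicode_normalize (not this library).
def decompose_sketch(str)
  str.unicode_normalize(:nfd)   # canonical decomposition
end

def decompose_compat_sketch(str)
  str.unicode_normalize(:nfkd)  # compatibility decomposition
end

p hex_code_points(decompose_sketch("\u00e9"))         # => ["U+0065", "U+0301"]
# Canonical ordering: dot below (class 220) sorts before acute (class 230).
p hex_code_points(decompose_sketch("a\u0301\u0323"))  # => ["U+0061", "U+0323", "U+0301"]
# Hangul syllables are decomposed algorithmically into jamo.
p hex_code_points(decompose_sketch("\uac00"))         # => ["U+1100", "U+1161"]
# Only the compatibility form maps CIRCLED DIGIT ONE to "1".
p hex_code_points(decompose_sketch("\u2460"))         # => ["U+2460"]
p hex_code_points(decompose_compat_sketch("\u2460"))  # => ["U+0031"]
```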

Unicode::decompose_safe(str)
  Decompose Unicode string with a non-standard mapping.
  It does not decompose the characters in CompositionExclusions.txt.

Unicode::compose(str)
  Compose Unicode string. Before composing, the trailing characters are sorted in canonical order.
  The parameter must be decomposed.
  The composition is based on the reverse of the character decomposition mapping in UnicodeData.txt, CompositionExclusions.txt and the Hangul composition algorithm.
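The effect of composition, including the composition exclusions and the Hangul algorithm, can be sketched like this. The sketch assumes Ruby 2.2 or later and relies on the built-in String#unicode_normalize(:nfc) rather than this extension; `compose_sketch` and `hex_code_points` are hypothetical names.

```ruby
# Hypothetical helper: render a string as a list of code points.
def hex_code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

# Stand-in for composing an already decomposed string (not this library).
def compose_sketch(decomposed)
  decomposed.unicode_normalize(:nfc)
end

# Base letter plus combining mark composes to one code point.
p hex_code_points(compose_sketch("e\u0301"))         # => ["U+00E9"]
# Marks are put in canonical order first, so the dot below composes
# with "a" and the acute accent stays separate.
p hex_code_points(compose_sketch("a\u0301\u0323"))   # => ["U+1EA1", "U+0301"]
# Hangul jamo are composed algorithmically into a syllable.
p hex_code_points(compose_sketch("\u1100\u1161"))    # => ["U+AC00"]
# U+0958 is listed in CompositionExclusions.txt, so its decomposition
# is not recomposed.
p hex_code_points(compose_sketch("\u0915\u093c"))    # => ["U+0915", "U+093C"]
```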

Unicode::normalize_D(str) (Unicode::nfd(str))
Unicode::normalize_KD(str) (Unicode::nfkd(str))
  Normalize Unicode string in form D or form KD.
  These are aliases of decompose/decompose_compat.

Unicode::normalize_D_safe(str) (Unicode::nfd_safe(str))
  This is an alias of decompose_safe.

Unicode::normalize_C(str) (Unicode::nfc(str))
Unicode::normalize_KC(str) (Unicode::nfkc(str))
  Normalize Unicode string in form C or form KC.
  normalize_C = decompose + compose
  normalize_KC = decompose_compat + compose
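The two identities above (form C is decompose followed by compose, form KC is decompose_compat followed by compose) can be checked with a small sketch. It assumes Ruby 2.2 or later and uses the built-in String#unicode_normalize in place of this extension; the `*_sketch` names are hypothetical.

```ruby
# Form C as "canonical decomposition, then composition".
# Built on Ruby's String#unicode_normalize, not on this library.
def normalize_c_sketch(str)
  str.unicode_normalize(:nfd).unicode_normalize(:nfc)
end

# Form KC as "compatibility decomposition, then composition".
def normalize_kc_sketch(str)
  str.unicode_normalize(:nfkd).unicode_normalize(:nfc)
end

p normalize_c_sketch("e\u0301") == "\u00e9"   # => true (composed to one code point)
p normalize_c_sketch("\ufb01") == "\ufb01"    # => true (form C keeps the ligature)
p normalize_kc_sketch("\ufb01") == "fi"       # => true (form KC expands it)
p normalize_kc_sketch("\uff21") == "A"        # => true (FULLWIDTH A becomes "A")

# The two-step versions agree with the one-step built-in forms.
sample = "\u1100\u1161 a\u0301\u0323 \ufb01"
p normalize_c_sketch(sample) == sample.unicode_normalize(:nfc)    # => true
p normalize_kc_sketch(sample) == sample.unicode_normalize(:nfkc)  # => true
```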

Unicode::normalize_C_safe(str) (Unicode::nfc_safe(str))
  Normalize Unicode string with decompose_safe.
  normalize_C_safe = decompose_safe + compose

Unicode::upcase(str)
Unicode::downcase(str)
Unicode::capitalize(str)
  Case conversion functions.
  The mappings that are used by these functions are not normative in UnicodeData.txt.
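For comparison, a sketch of Unicode-aware case conversion on non-ASCII text. It assumes Ruby 2.4 or later, where the built-in String#upcase, String#downcase and String#capitalize apply Unicode case mappings; it does not call this extension, and its results may differ from the library's in edge cases. `case_forms` is a hypothetical helper.

```ruby
# Hypothetical helper: collect the three case conversions of a string,
# using Ruby's built-in methods (not this library).
def case_forms(str)
  { upcase: str.upcase, downcase: str.downcase, capitalize: str.capitalize }
end

p case_forms("\u00e9cole")
# => {:upcase=>"ÉCOLE", :downcase=>"école", :capitalize=>"École"} (exact p format varies by Ruby version)

# A special-casing example: "ß" has no single-character uppercase form.
p case_forms("stra\u00dfe")[:upcase]   # => "STRASSE"

# With the :ascii option only A-Z/a-z are converted.
p "\u00e9cole".upcase(:ascii)          # => "éCOLE"
```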

Unicode::categories(str)
Unicode::abbr_categories(str)
  Get an array of general category names of the string.
  abbr_categories returns abbreviated names.
  These can be called with a block.
```
require "unicode"
Unicode.categories("Ab1") do |category| p category end
```
Unicode::text_elements(str)
  Get an array of text elements.
  A text element is a unit that is displayed as a single character.
  This can be called with a block.
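The difference between code points and text elements can be sketched as follows. This assumes Ruby 2.5 or later and uses the built-in String#grapheme_clusters as a stand-in; the library's own segmentation may differ in detail. `text_elements_sketch` is a hypothetical name.

```ruby
# Stand-in for splitting into text elements, using Ruby's extended
# grapheme clusters (not this library).
def text_elements_sketch(str)
  str.grapheme_clusters
end

word = "cafe\u0301"   # "café" written with a combining acute accent

p word.chars.length                             # => 5 code points
p text_elements_sketch(word).length             # => 4 displayed characters
p text_elements_sketch("\u1100\u1161").length   # => 1 (two jamo, one syllable)

# Block form, analogous to calling the library function with a block.
word.each_grapheme_cluster { |element| p element }
```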

Unicode::width(str[, cjk])
  Estimate the display width on a fixed-pitch text terminal.
  It is based on Markus Kuhn's mk_wcwidth.
  If the optional argument 'cjk' is true, East Asian Ambiguous characters are treated as wide characters.
```
Unicode.width("\u03b1") #=> 1
Unicode.width("\u03b1", true) #=> 2
```
Bugs

UAX #15 suggests that, for better performance, the lookup for Normalization Form C should not be implemented with a hash of strings.

Copying

This extension module is copyrighted free software by Yoshida Masato.

You can redistribute it and/or modify it under the same terms as Ruby.

Author

Yoshida Masato [email protected]

History

Feb 7, 2013   version 0.4.4  update unidata.map for Unicode 6.2
Aug 8, 2012   version 0.4.3  add categories, text_elements and width
Feb 29, 2012  version 0.4.2  add decompose_safe
Feb 3, 2012   version 0.4.1  update unidata.map for Unicode 6.1
Oct 14, 2010  version 0.4.0  fix the composition algorithm, and support Unicode 6.0
Feb 26, 2010  version 0.3.0  fix a capitalize bug and support SpecialCasing
Dec 29, 2009  version 0.2.0  update for Ruby 1.9.1 and Unicode 5.2
Sep 10, 2005  version 0.1.2  update unidata.map for Unicode 4.1.0
Aug 26, 2004  version 0.1.1  update unidata.map for Unicode 4.0.1
Nov 23, 1999  version 0.1
