ausi / Slug Generator

Licence: MIT
Slug Generator Library for PHP, based on Unicode’s CLDR data

Projects that are alternatives of or similar to Slug Generator

Urlify
A fast PHP slug generator and transliteration library that converts non-ascii characters for use in URLs.
Stars: ✭ 633 (-14.46%)
Mutual labels:  unicode, ascii, transliteration, slug
unidecode
Elixir package to transliterate Unicode to ASCII
Stars: ✭ 18 (-97.57%)
Mutual labels:  unicode, ascii, transliteration
Transliteration
UTF-8 to ASCII transliteration / slugify module for node.js, browser, Web Worker, React Native, Electron and CLI.
Stars: ✭ 444 (-40%)
Mutual labels:  unicode, ascii, transliteration
characteristics
Character info under different encodings
Stars: ✭ 25 (-96.62%)
Mutual labels:  unicode, ascii
table2ascii
Python library for converting lists to fancy ASCII tables for displaying in the terminal and on Discord
Stars: ✭ 31 (-95.81%)
Mutual labels:  unicode, ascii
homoglyphs
Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group.
Stars: ✭ 70 (-90.54%)
Mutual labels:  unicode, ascii
unihandecode
unihandecode is a transliteration library that converts all characters/words in Unicode into the ASCII alphabet and is aware of language preference priorities
Stars: ✭ 71 (-90.41%)
Mutual labels:  unicode, transliteration
attic
A collection of personal tiny tools - mirror of https://gitlab.com/hydrargyrum/attic
Stars: ✭ 17 (-97.7%)
Mutual labels:  unicode, ascii
durdraw
Animated Unicode, ANSI and ASCII Art Editor for Linux/Unix/macOS
Stars: ✭ 55 (-92.57%)
Mutual labels:  unicode, ascii
Portable Utf8
🉑 Portable UTF-8 library - performance optimized (unicode) string functions for php.
Stars: ✭ 405 (-45.27%)
Mutual labels:  unicode, ascii
Ustring
The Hoa\Ustring library.
Stars: ✭ 403 (-45.54%)
Mutual labels:  library, unicode
Limax
Node.js module to generate URL slugs. Another one? This one cares about i18n and transliterates non-Latin scripts to conform to the RFC3986 standard. Mostly API-compatible with similar modules.
Stars: ✭ 423 (-42.84%)
Mutual labels:  transliteration, slug
Contour
Modern C++ Terminal Emulator
Stars: ✭ 191 (-74.19%)
Mutual labels:  library, unicode
Harfbuzz
HarfBuzz text shaping engine
Stars: ✭ 2,206 (+198.11%)
Mutual labels:  library, unicode
Terminaltables
Generate simple tables in terminals from a nested list of strings.
Stars: ✭ 685 (-7.43%)
Mutual labels:  library, ascii
Sheenbidi
A sophisticated implementation of Unicode Bidirectional Algorithm
Stars: ✭ 52 (-92.97%)
Mutual labels:  library, unicode
Art
🎨 ASCII art library for Python
Stars: ✭ 1,026 (+38.65%)
Mutual labels:  library, ascii
Php Confusable Homoglyphs
A PHP port of https://github.com/vhf/confusable_homoglyphs
Stars: ✭ 27 (-96.35%)
Mutual labels:  library, unicode
Unicode Bidirectional
A Javascript implementation of the Unicode 9.0.0 Bidirectional Algorithm
Stars: ✭ 35 (-95.27%)
Mutual labels:  library, unicode
Csconsoleformat
.NET C# library for advanced formatting of console output [Apache]
Stars: ✭ 296 (-60%)
Mutual labels:  library, ascii

Slug Generator Library

This library provides methods to generate slugs for URLs, filenames or any other target that has a limited character set. It’s based on PHP’s Transliterator class, which uses CLDR data to transform characters between different scripts (e.g. Cyrillic to Latin) or types (e.g. upper- to lower-case, or special characters to ASCII).
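To give a feel for the Transliterator class mentioned above, here is a minimal sketch that chains three ICU transforms and then collapses the remaining characters into a delimiter. It assumes the intl extension is installed. The function name naiveSlug and the regex clean-up step are hypothetical and are not how the library itself works; the library handles many more cases (locales, ignored characters, custom transforms).

```php
<?php
// Illustration only: a naive slug built directly on PHP's Transliterator
// class (ext-intl). naiveSlug() is a hypothetical helper, not library API.
function naiveSlug(string $text): string
{
    // Compound ICU transform: any script -> Latin -> ASCII -> lower case.
    $transliterator = Transliterator::create('Any-Latin; Latin-ASCII; Lower');
    if ($transliterator === null) {
        throw new RuntimeException('Transliterator unavailable: ' . intl_get_error_message());
    }

    $ascii = $transliterator->transliterate($text);
    if ($ascii === false) {
        throw new RuntimeException('Transliteration failed: ' . intl_get_error_message());
    }

    // Replace every run of characters outside a-z0-9 with a single "-",
    // then strip the delimiter from both ends.
    $slug = (string) preg_replace('/[^a-z0-9]+/', '-', $ascii);

    return trim($slug, '-');
}

echo naiveSlug('Hello Wörld!'), PHP_EOL;
```

Unlike this sketch, the library lets you configure the valid characters, the delimiter and the locale, as described in the sections that follow.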

Usage

<?php
use Ausi\SlugGenerator\SlugGenerator;
use Ausi\SlugGenerator\SlugOptions;

$generator = new SlugGenerator;

$generator->generate('Hello Wörld!');  // Output: hello-world
$generator->generate('Καλημέρα');      // Output: kalemera
$generator->generate('фильм');         // Output: film
$generator->generate('富士山');         // Output: fu-shi-shan
$generator->generate('國語');           // Output: guo-yu

// Different valid character set, a specified locale and a delimiter
$generator = new SlugGenerator((new SlugOptions)
    ->setValidChars('a-zA-Z0-9')
    ->setLocale('de')
    ->setDelimiter('_')
);
$generator->generate('Äpfel und Bäume');  // Output: Aepfel_und_Baeume

Installation

To install the library, use Composer or download the source files from GitHub.

composer require ausi/slug-generator

Why create another slug library, aren’t there enough already?

There are many code snippets and some good libraries out there that create slugs, but I didn’t find anything that met my requirements. Options are often very limited, which makes it hard to customize them for different use cases. Some libraries carry large rulesets that try to convert characters to ASCII, but none of them uses Unicode’s CLDR, which is the standard for transliteration rules and many other transforms.

But most importantly, no library was able to do the “correct” conversions, like Ö-Äpfel to OE-Aepfel for German or İNATÇI to inatçı for Turkish. Because the CLDR transliteration rules are context-sensitive, they know to convert to OE-Aepfel instead of Oe-Aepfel or OE-AEpfel. CLDR also takes the language into account and knows that the Turkish uppercase letter I has the lowercase form ı instead of i.
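The language-aware behaviour described above can be observed with PHP’s Transliterator class directly. The sketch below is an illustration under assumptions, not the library’s code: transformWith is a hypothetical helper, and the transform IDs tr-Lower and de-ASCII are assumed to be available in the installed ICU version (the helper returns null when they are not).

```php
<?php
// Illustration only (requires ext-intl). transformWith() is a hypothetical
// helper that applies one ICU transform ID and returns null if the ID is
// not available in the installed ICU version or the transform fails.
function transformWith(string $id, string $text): ?string
{
    $transliterator = Transliterator::create($id);
    if ($transliterator === null) {
        return null;
    }

    $result = $transliterator->transliterate($text);

    return $result === false ? null : $result;
}

// Turkish-aware lower-casing: the dotless I becomes ı, the dotted İ becomes i.
echo transformWith('tr-Lower', 'İNATÇI') ?? '(tr-Lower unavailable)', PHP_EOL;

// German-aware ASCII conversion; availability depends on the ICU version.
echo transformWith('de-ASCII', 'Ö-Äpfel') ?? '(de-ASCII unavailable)', PHP_EOL;
```

With the library, the same effect is reached through the locale option rather than by choosing transform IDs by hand.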

Options

All options can be set for the generator object itself with new SlugGenerator($options) or overridden when calling generate($text, $options). Options can be passed as an array or as a SlugOptions object.

delimiter, default "-"

The delimiter can be any string and is used to separate words. It is stripped from the beginning and the end of the slug.

$generator->generate('Hello World!');                         // Result: hello-world
$generator->generate('Hello World!', ['delimiter' => '_']);   // Result: hello_world
$generator->generate('Hello World!', ['delimiter' => '%20']); // Result: hello%20world

validChars, default "a-z0-9"

Valid characters that are allowed in the slug. The range syntax is the same as in character classes of regular expressions. For example abc, a-z0-9äöüß or \p{Ll}\-_.

$generator->generate('Hello World!');                             // Result: hello-world
$generator->generate('Hello World!', ['validChars' => 'A-Z']);    // Result: HELLO-WORLD
$generator->generate('Hello World!', ['validChars' => 'A-Za-z']); // Result: Hello-World

ignoreChars, default "\p{Mn}\p{Lm}"

Characters that should be completely removed and not replaced with a delimiter. It uses the same syntax as the validChars option.

$generator->generate("don't remove");                         // Result: don-t-remove
$generator->generate("don't remove", ['ignoreChars' => "'"]); // Result: dont-remove

locale, default ""

The locale that should be used for the Unicode transformations.

$generator->generate('Hello Wörld!');                        // Result: hello-world
$generator->generate('Hello Wörld!', ['locale' => 'de']);    // Result: hello-woerld
$generator->generate('Hello Wörld!', ['locale' => 'en_US']); // Result: hello-world

transforms, default Upper, Lower, Latn, ASCII, Upper, Lower

Internally the slug generator uses Transform Rules to convert invalid characters to valid ones. These rules can be customized by setting the transforms, preTransforms or postTransforms options. Usually setting preTransforms is desired as it applies the custom transforms prior to the default ones.

How Transform Rules (like Lower or ASCII) and rule sets (like a > b; c > d;) work is documented on the ICU website: http://userguide.icu-project.org/transforms

$generator->generate('Damn 💩!!');                                           // Result: damn
$generator->generate('Damn 💩!!', ['preTransforms' => ['💩 > Ice-Cream']]);  // Result: damn-ice-cream

$generator->generate('©');                                          // Result: c
$generator->generate('©', ['preTransforms' => ['© > Copyright']]);  // Result: copyright
$generator->generate('©', ['preTransforms' => ['Hex']]);            // Result: u00a9
$generator->generate('©', ['preTransforms' => ['Name']]);           // Result: n-copyright-sign
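
The rule-set syntax used in the preTransforms examples above (a > b; c > d;) can also be tried out on its own with PHP’s Transliterator class, which may help when debugging a custom rule before handing it to the generator. This is a minimal sketch assuming the intl extension is installed; applyRules is a hypothetical helper and not part of the library.

```php
<?php
// Illustration only (requires ext-intl). applyRules() is a hypothetical
// helper that compiles an ICU rule set and applies it to a string.
function applyRules(string $rules, string $text): string
{
    $transliterator = Transliterator::createFromRules($rules);
    if ($transliterator === null) {
        throw new InvalidArgumentException('Invalid rule set: ' . intl_get_error_message());
    }

    $result = $transliterator->transliterate($text);
    if ($result === false) {
        throw new RuntimeException('Transform failed: ' . intl_get_error_message());
    }

    return $result;
}

// Two rules in one set: each "source > target;" pair is applied to the text.
echo applyRules('© > Copyright; ß > ss;', '© Straße'), PHP_EOL;
```

Characters that no rule matches are passed through unchanged, which is why such a rule set combines well with the default transforms that run afterwards.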