byllyfish / precis_i18n

Licence: MIT license
Python3 implementation of PRECIS framework (RFC 8264, RFC 8265, RFC 8266)

PRECIS-i18n: Internationalized Usernames and Passwords

If you want your application to accept Unicode user names and passwords, you must be careful in how you validate and compare them. The PRECIS framework makes internationalized user names and passwords safer for use by applications. PRECIS profiles transform Unicode strings into a canonical form, suitable for comparison.

This module implements the PRECIS Framework as described in:

  • PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols (RFC 8264)
  • Preparation, Enforcement, and Comparison of Internationalized Strings Representing Usernames and Passwords (RFC 8265)
  • Preparation, Enforcement, and Comparison of Internationalized Strings Representing Nicknames (RFC 8266)

Requires Python 3.3 or later.

Usage

Use the get_profile function to obtain a profile object, then use its enforce method. The enforce method returns a Unicode string.

>>> from precis_i18n import get_profile
>>> username = get_profile('UsernameCaseMapped')
>>> username.enforce('Kevin')
'kevin'
>>> username.enforce('\u212Aevin')
'kevin'
>>> username.enforce('\uFF2Bevin')
'kevin'
>>> username.enforce('\U0001F17Aevin')
Traceback (most recent call last):
    ...
UnicodeEncodeError: 'UsernameCaseMapped' codec can't encode character '\U0001f17a' in position 0: DISALLOWED/symbols
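
Because enforce returns a canonical string, two user-supplied names can be compared by enforcing both and comparing the results. The following is a minimal sketch, not part of precis_i18n: usernames_match is a hypothetical helper, and profile is assumed to be any object with an enforce method, such as the one returned by get_profile.

```python
def usernames_match(profile, left, right):
    """Return True if both strings are allowed and enforce to the same value.

    `profile` is assumed to expose an `enforce(str) -> str` method that
    raises UnicodeEncodeError for disallowed input (hypothetical helper).
    """
    try:
        return profile.enforce(left) == profile.enforce(right)
    except UnicodeEncodeError:
        # A disallowed string is treated as matching nothing.
        return False
```

With the UsernameCaseMapped profile from the session above, usernames_match(username, 'Kevin', '\u212Aevin') would be expected to return True, since both inputs enforce to 'kevin'.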

Alternatively, you can use the Python str.encode API. Import the precis_i18n.codec module to register the PRECIS codec names; you can then call str.encode on any Unicode string. The result is a UTF-8 encoded byte string, and a UnicodeEncodeError is raised if the string is disallowed.

>>> import precis_i18n.codec
>>> 'Kevin'.encode('UsernameCasePreserved')
b'Kevin'
>>> '\u212Aevin'.encode('UsernameCasePreserved')
b'Kevin'
>>> '\uFF2Bevin'.encode('UsernameCasePreserved')
b'Kevin'
>>> '\u212Aevin'.encode('UsernameCaseMapped')
b'kevin'
>>> '\uFF2Bevin'.encode('OpaqueString')
b'\xef\xbc\xabevin'
>>> '\U0001F17Aevin'.encode('UsernameCasePreserved')
Traceback (most recent call last):
    ...
UnicodeEncodeError: 'UsernameCasePreserved' codec can't encode character '\U0001f17a' in position 0: DISALLOWED/symbols
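
When the encoded bytes are used as a lookup or comparison key, it can be convenient to turn the exception into a sentinel value. The sketch below assumes a hypothetical helper named comparison_key; it works with any registered codec name, and the standard 'ascii' codec is used in the demonstration only so that the snippet runs without the package installed.

```python
def comparison_key(text, codec_name):
    """Encode `text` with the named codec, or return None if it is rejected.

    Hypothetical helper. An unregistered codec name still raises
    LookupError, which is deliberately not caught here.
    """
    try:
        return text.encode(codec_name)
    except UnicodeEncodeError:
        return None


# Stand-in demonstration with a built-in codec.
print(comparison_key('Kevin', 'ascii'))        # b'Kevin'
print(comparison_key('\u212Aevin', 'ascii'))   # None
```

After import precis_i18n.codec, the same helper could be called with a PRECIS codec name such as 'UsernameCaseMapped'.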

Alternative Unicode Versions

The get_profile function uses whatever version of unicodedata is provided by the Python runtime. The Unicode version is tied to the Python feature release: for example, Python 3.7.x uses Unicode 11.0 and Python 3.6.x uses Unicode 9.0.
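
To see which Unicode version a given interpreter provides, the standard unicodedata module exposes it as unidata_version. A small sketch (runtime_unicode_version is a hypothetical helper name):

```python
import unicodedata


def runtime_unicode_version():
    """Return the runtime's Unicode version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split('.'))


# Prints a version string such as '11.0.0', depending on the interpreter.
print(unicodedata.unidata_version)
```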

To use an alternative unicodedata implementation, pass the unicodedata keyword argument to get_profile.

For example, you could separately install version 12.0 of the unicodedata2 module from PyPI. Then, pass it to get_profile to retrieve a profile that uses Unicode 12.0.

>>> import unicodedata2
>>> from precis_i18n import get_profile
>>> username = get_profile('UsernameCaseMapped', unicodedata=unicodedata2)
>>> username.enforce('Kevin')
'kevin'
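
If unicodedata2 is an optional dependency, one possible approach is to fall back to the standard module when it is missing. This is a sketch under that assumption; pick_unicodedata is a hypothetical helper, not part of precis_i18n.

```python
import importlib


def pick_unicodedata():
    """Return unicodedata2 if it is installed, else the stdlib unicodedata."""
    try:
        return importlib.import_module('unicodedata2')
    except ImportError:
        # unicodedata2 is not installed; use the runtime's own tables.
        return importlib.import_module('unicodedata')
```

The returned module could then be passed as the unicodedata keyword argument to get_profile.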

Supported Profiles and Codecs

Each PRECIS profile has a corresponding codec name. The CaseMapped variant converts the string to lower case for implementing case-insensitive comparison.

  • UsernameCasePreserved
  • UsernameCaseMapped
  • OpaqueString
  • NicknameCasePreserved
  • NicknameCaseMapped

The CaseMapped profiles use Unicode ToLower per the latest RFC. Previous versions of this package used Unicode Default Case Folding. Variants of the CaseMapped profiles that name the case transformation explicitly also exist, but these profile names are deprecated:

  • UsernameCaseMapped:ToLower
  • UsernameCaseMapped:CaseFold
  • NicknameCaseMapped:ToLower
  • NicknameCaseMapped:CaseFold
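
The practical difference between lowercasing and case folding can be illustrated with Python's built-in str.lower and str.casefold. These are used here only as stand-ins for Unicode ToLower and Default Case Folding; the profiles themselves may differ in detail, and case_variants is a hypothetical helper.

```python
def case_variants(text):
    """Return (lowercased, case-folded) forms of `text` using str methods."""
    return text.lower(), text.casefold()


lowered, folded = case_variants('Straße')
print(lowered)  # straße
print(folded)   # strasse
```

Lowercasing keeps the sharp s, while case folding expands it to 'ss', so the two transformations can produce different canonical forms for the same input.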

The PRECIS base string classes are also available as codecs:

  • IdentifierClass
  • FreeFormClass

Userparts and Space Delimited Usernames

The Username profiles in this implementation do not allow spaces. The Username profiles correspond to the definition of "userparts" in RFC 8265. If you want to allow spaces in your application's user names, you must split the string first.

import precis_i18n

def enforce_app_username(name):
    # Enforce each space-separated userpart on its own, then rejoin.
    profile = precis_i18n.get_profile('UsernameCasePreserved')
    userparts = [profile.enforce(userpart) for userpart in name.split(' ')]
    return ' '.join(userparts)

Be aware that a username constructed this way can contain bidirectional text in the separate userparts.
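
An application that wants to detect this situation could inspect the bidirectional class of each character. The sketch below is one possible check using the standard unicodedata module; the helper names and the simplified classification (only classes L, R and AL are considered) are assumptions, and this is not the PRECIS "Bidi" rule itself.

```python
import unicodedata

# Bidirectional classes treated as right-to-left in this simplified check.
RTL_CLASSES = {'R', 'AL'}


def part_direction(part):
    """Classify one userpart as 'rtl', 'ltr' or 'neutral'."""
    classes = {unicodedata.bidirectional(ch) for ch in part}
    if classes & RTL_CLASSES:
        return 'rtl'
    if 'L' in classes:
        return 'ltr'
    return 'neutral'


def mixes_directions(name):
    """Return True if the userparts of `name` include both LTR and RTL parts."""
    directions = {part_direction(part) for part in name.split(' ')}
    return {'ltr', 'rtl'} <= directions
```

For example, mixes_directions('kevin \u05d0\u05d1') returns True because the second userpart consists of Hebrew letters, while mixes_directions('kevin smith') returns False.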

Error Messages

A PRECIS profile raises a UnicodeEncodeError exception if a string is disallowed. The reason field specifies the kind of error.

Reason                                  Explanation
--------------------------------------  -----------
DISALLOWED/arabic_indic                 Arabic-Indic digits cannot be mixed with Extended Arabic-Indic Digits. (Context)
DISALLOWED/bidi_rule                    Right-to-left string cannot contain left-to-right characters due to the "Bidi" rule. (Context)
DISALLOWED/controls                     Control character is not allowed.
DISALLOWED/empty                        After applying the profile, the result cannot be empty.
DISALLOWED/exceptions                   Exception character is not allowed.
DISALLOWED/extended_arabic_indic        Extended Arabic-Indic digits cannot be mixed with Arabic-Indic Digits. (Context)
DISALLOWED/greek_keraia                 Greek keraia must be followed by a Greek character. (Context)
DISALLOWED/has_compat                   Compatibility characters are not allowed.
DISALLOWED/hebrew_punctuation           Hebrew punctuation geresh or gershayim must be preceded by Hebrew character. (Context)
DISALLOWED/katakana_middle_dot          Katakana middle dot must be accompanied by a Hiragana, Katakana, or Han character. (Context)
DISALLOWED/middle_dot                   Middle dot must be surrounded by the letter 'l'. (Context)
DISALLOWED/not_idempotent               After reapplying the profile, the result is not stable.
DISALLOWED/old_hangul_jamo              Conjoining Hangul Jamo is not allowed.
DISALLOWED/other                        Other character is not allowed.
DISALLOWED/other_letter_digits          Non-traditional letter or digit is not allowed.
DISALLOWED/precis_ignorable_properties  Default ignorable or non-character is not allowed.
DISALLOWED/punctuation                  Non-ASCII punctuation character is not allowed.
DISALLOWED/spaces                       Space character is not allowed.
DISALLOWED/symbols                      Non-ASCII symbol character is not allowed.
DISALLOWED/unassigned                   Unassigned Unicode character is not allowed.
DISALLOWED/zero_width_joiner            Zero width joiner must immediately follow a combining virama. (Context)
DISALLOWED/zero_width_nonjoiner         Zero width non-joiner must immediately follow a combining virama, or appear where it breaks a cursive connection in a formally cursive script. (Context)
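
Since the reason is an ordinary attribute of UnicodeEncodeError, an application can branch on it instead of parsing the message text. The following is a minimal sketch with hypothetical helper names; it assumes only the standard exception attributes (reason, object, start, end) and the DISALLOWED/ prefix shown in the reasons above.

```python
def precis_rule(exc):
    """Return the rule name from a reason like 'DISALLOWED/symbols', else None."""
    prefix, sep, rule = exc.reason.partition('/')
    if prefix == 'DISALLOWED' and sep:
        return rule
    return None


def offending_text(exc):
    """Return the slice of the input that the exception points at."""
    return exc.object[exc.start:exc.end]


# Stand-in exception built by hand, mirroring the traceback in the Usage section.
demo = UnicodeEncodeError('UsernameCaseMapped', '\U0001F17Aevin', 0, 1,
                          'DISALLOWED/symbols')
print(precis_rule(demo))  # symbols
```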

Unicode Version Update Procedure

When Unicode releases a new version, take the following steps to update internal tables and pass unit tests:

  • Under a version of Python that supports the new Unicode version, run the tests using python -m unittest discover and check that the test_derived_props test FAILS due to a missing file.
  • Generate a new derived-props file by running PYTHONPATH=. python test/test_derived_props.py > derived-props-VERSION.txt. Rename the file so that VERSION is replaced by the new Unicode version, then re-run the tests. The unit tests will further check that no derived properties in the new file contradict the previous values.
  • Check for changes to internal tables used for context rules by running PYTHONPATH=. python tools/check_codepoints.py. Update the corresponding tables in precis_i18n/unicode.py if necessary.