
kjd / Idna

Licence: bsd-3-clause
Internationalized Domain Names for Python (IDNA 2008 and UTS #46)


Projects that are alternatives of or similar to Idna

Ldns
LDNS is a DNS library that facilitates DNS tool programming
Stars: ✭ 127 (-7.97%)
Mutual labels:  dns
Rind
DNS server with REST interface for records management built on Golang
Stars: ✭ 132 (-4.35%)
Mutual labels:  dns
Elk Hole
elasticsearch, logstash and kibana configuration for pi-hole visualization
Stars: ✭ 136 (-1.45%)
Mutual labels:  dns
Python Whois
Python module/library for retrieving WHOIS information of domains 💻❤
Stars: ✭ 128 (-7.25%)
Mutual labels:  dns
Bedrock Unicode Characters
Minecraft:Bedrock Edition Unicode characters
Stars: ✭ 130 (-5.8%)
Mutual labels:  unicode
Blocklists
Domain-ONLY Filter Lists (for use with DNS / Domain blocking tools)
Stars: ✭ 133 (-3.62%)
Mutual labels:  dns
Php Dns
A DNS abstraction for PHP
Stars: ✭ 126 (-8.7%)
Mutual labels:  dns
Onionmx
Onion delivery, so delicious
Stars: ✭ 138 (+0%)
Mutual labels:  dns
Aliyun Ddns
Alibaba Cloud (Aliyun) dynamic DNS tool, with Docker and IPv6 support.
Stars: ✭ 131 (-5.07%)
Mutual labels:  dns
Amass
In-depth Attack Surface Mapping and Asset Discovery
Stars: ✭ 1,693 (+1126.81%)
Mutual labels:  dns
Ansiweather
Weather in terminal, with ANSI colors and Unicode symbols
Stars: ✭ 1,663 (+1105.07%)
Mutual labels:  unicode
Confusable homoglyphs
ϲοnfuѕаblе_һοmоɡlyphs
Stars: ✭ 130 (-5.8%)
Mutual labels:  unicode
Tokenizer
Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-4.35%)
Mutual labels:  unicode
Ymhttp
An IO-multiplexing HTTP framework based on libcurl for the iOS platform, supporting HTTP/HTTPS/HTTP2/DNS(SNI)
Stars: ✭ 127 (-7.97%)
Mutual labels:  dns
Guide To Swift Strings Sample Code
Xcode Playground Sample Code for the Flight School Guide to Swift Strings
Stars: ✭ 136 (-1.45%)
Mutual labels:  unicode
Prcdns
Accurate and CDN-friendly
Stars: ✭ 126 (-8.7%)
Mutual labels:  dns
Spf Tools
Shell scripts for taming the SPF (Sender Policy Framework) records in order to fight 10-maximum-DNS-look-ups limit.
Stars: ✭ 131 (-5.07%)
Mutual labels:  dns
Knot
A mirrored repository
Stars: ✭ 138 (+0%)
Mutual labels:  dns
Dnspython
a powerful DNS toolkit for python
Stars: ✭ 1,838 (+1231.88%)
Mutual labels:  dns
Punic
PHP translation and localization made easy!
Stars: ✭ 133 (-3.62%)
Mutual labels:  unicode

Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalised Domain Names in Applications (IDNA) protocol as specified in `RFC 5891 <https://tools.ietf.org/html/rfc5891>`_. This is the latest version of the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.

This acts as a suitable replacement for the ``encodings.idna`` module that comes with the Python standard library, but which only supports the old, deprecated IDNA specification (`RFC 3490 <https://tools.ietf.org/html/rfc3490>`_).

Basic functions are simply executed:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

Packages
--------

The latest tagged release version is published in the PyPI repository:

.. image:: https://badge.fury.io/py/idna.svg
   :target: https://badge.fury.io/py/idna

Installation
------------

To install this library, you can use pip:

.. code-block:: bash

    $ pip install idna

Alternatively, you can install the package using the bundled setup script:

.. code-block:: bash

    $ python setup.py install

This library works with Python 3.4 or later. Earlier versions of this library support Python 2; if you need this library for a Python 2 application, specify ``idna<3`` in your requirements file.

Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a domain name argument and perform a conversion to A-labels or U-labels respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

You may also use the codec encoding and decoding methods by importing the ``idna.codec`` module:

.. code-block:: pycon

    >>> import idna.codec
    >>> print('домена.испытание'.encode('idna'))
    b'xn--80ahd1agd.xn--80akhbyknj4f'
    >>> print(b'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna'))
    домена.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or ``alabel`` functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'
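
As a rough illustration of what an A-label contains, the sketch below reproduces the label above using only the ``punycode`` codec from the Python standard library. It is a simplification under stated assumptions: the helper names ``raw_alabel`` and ``raw_ulabel`` are hypothetical, and unlike ``idna.alabel`` and ``idna.ulabel`` they perform none of the IDNA 2008 validity checks.

```python
ACE_PREFIX = b"xn--"  # prefix that marks an ASCII-compatible (A-label) encoding


def raw_alabel(label: str) -> bytes:
    """Hypothetical helper: Punycode-encode one label, with no IDNA validation."""
    if label.isascii():
        # Pure ASCII labels are carried as-is; no Punycode step is needed.
        return label.encode("ascii")
    return ACE_PREFIX + label.encode("punycode")


def raw_ulabel(label: bytes) -> str:
    """Hypothetical helper: reverse raw_alabel for a single label."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].decode("punycode")
    return label.decode("ascii")


print(raw_alabel("测试"))          # b'xn--0zwm56d'
print(raw_ulabel(b"xn--0zwm56d"))  # 测试
```

For real domain names, the library functions remain the better choice, since they also reject labels that the specification does not permit.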

Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the IDNA specification does not normalize input from different potential ways a user may input a domain name. This functionality, known as a “mapping”, is considered by the specification to be a local user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping, which was developed by the Unicode Consortium. Known as `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_, it provides both a regular mapping for typical applications and a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as LATIN CAPITAL LETTER K is not allowed (nor are capital letters in general). UTS 46 will convert this into lower case prior to applying the IDNA conversion.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from the older 2003 standard to the current standard. For example, in the original IDNA specification, the LATIN SMALL LETTER SHARP S (ß) was converted into two LATIN SMALL LETTER S (ss), whereas in the current IDNA specification this conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    b'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, only in rare cases where conversion from legacy labels to current labels must be performed (i.e. IDNA implementations that pre-date 2008). For typical applications that just need to convert labels, transitional processing is unlikely to be beneficial and could produce unexpected incompatible results.
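
For comparison, the IDNA 2003 handling of the sharp s described above can be observed with the built-in ``idna`` codec of the Python standard library, which needs no extra installation. This is only a small sketch of the legacy behaviour and does not use this library at all; the variable names are illustrative.

```python
# The standard library's built-in "idna" codec implements IDNA 2003 (RFC 3490).
name = "Königsgäßchen"

legacy = name.encode("idna")
print(legacy)  # b'xn--knigsgsschen-lcb0w'

# Decoding shows that the sharp s did not survive the round trip:
# it was replaced by "ss" before the label was encoded.
round_trip = legacy.decode("idna")
print(round_trip)  # königsgässchen
```

The result matches the transitional output shown earlier, which is why transitional processing is mainly of interest when interoperating with such legacy implementations.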

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are mapped to their IDNA 2008 equivalents using the ``idna.compat`` module. Simply substitute the ``import`` clause in your code to refer to the new module name.
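
A minimal sketch of such a substitution is shown here. It assumes the calling code uses the ``ToASCII`` and ``ToUnicode`` functions and that ``idna.compat`` exposes them under the same names; code relying on other parts of ``encodings.idna`` may need more than an import change.

```python
# Before (standard library, IDNA 2003):
#     from encodings.idna import ToASCII, ToUnicode
# After (this library, IDNA 2008):
from idna.compat import ToASCII, ToUnicode

alabel = ToASCII("测试")
print(alabel)             # b'xn--0zwm56d'
print(ToUnicode(alabel))  # 测试
```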

Exceptions
----------

All errors raised during conversion, as required by the specification, are exceptions derived from the ``idna.IDNAError`` base class.

More specific exceptions may also be raised: ``idna.IDNABidiError`` when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied).
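
One way to use this hierarchy is to handle a specific exception first and fall back to the base class for everything else. The sketch below is illustrative only; the helper name ``to_ascii_or_none`` is hypothetical and not part of the library.

```python
from typing import Optional

import idna


def to_ascii_or_none(domain: str) -> Optional[bytes]:
    """Hypothetical helper: return the A-label form, or None if invalid."""
    try:
        return idna.encode(domain)
    except idna.InvalidCodepoint as exc:
        # A specific codepoint is not permitted in an IDN label.
        print(f"invalid codepoint: {exc}")
    except idna.IDNAError as exc:
        # Base class: covers bidi, contextual and any other conversion errors.
        print(f"cannot encode {domain!r}: {exc}")
    return None


print(to_ascii_or_none("ドメイン.テスト"))  # b'xn--eckwd4c7c.xn--zckzah'
print(to_ascii_or_none("Königsgäßchen"))   # reports the invalid codepoint, then None
```

Because the specific exception classes derive from ``idna.IDNAError``, the order of the ``except`` clauses matters: the base class must come last.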

Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for performance. These tables are derived by evaluating codepoints against the eligibility criteria in the respective standards, and are computed using the command-line script ``tools/idna-data``.

This tool will fetch relevant codepoint data from the Unicode repository and perform the required calculations to identify eligibility. There are three main modes:

  • ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``, the pre-calculated lookup tables used for IDNA and UTS 46 conversions. Implementors who wish to track this library against a different Unicode version may use this tool to manually generate a different version of the ``idnadata.py`` and ``uts46data.py`` files.

  • ``idna-data make-table``. Generates a table of the IDNA disposition (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC 5892 and the pre-computed tables published by `IANA <https://www.iana.org/>`_.

  • ``idna-data U+0061``. Prints debugging output on the various properties associated with an individual Unicode codepoint (in this case, U+0061) that are used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using ``idna-data -h``. Most notably, the ``--version`` argument specifies the version of Unicode to use in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata`` will generate library data against Unicode 9.0.0.
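
To get a feel for the kind of per-codepoint properties involved, a few of them can be inspected with the standard library's ``unicodedata`` module. This is a loose analogue only, not the tool's implementation or its output format: the helper ``describe_codepoint`` is hypothetical, the selection of properties is an assumption, and the data comes from the Unicode version bundled with the Python interpreter rather than from a version chosen with ``--version``.

```python
import unicodedata


def describe_codepoint(notation: str) -> dict:
    """Hypothetical helper: look up basic properties for a 'U+XXXX' string."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"expected U+XXXX notation, got {notation!r}")
    char = chr(int(notation[2:], 16))
    return {
        "name": unicodedata.name(char, "<unnamed>"),
        "general_category": unicodedata.category(char),
        "bidi_class": unicodedata.bidirectional(char),
        "combining_class": unicodedata.combining(char),
    }


# Unicode version bundled with this interpreter (varies by Python release).
print(unicodedata.unidata_version)

# U+0061 is LATIN SMALL LETTER A: category 'Ll', bidi class 'L', combining class 0.
print(describe_codepoint("U+0061"))
```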

Testing
-------

The library has a test suite based on each rule of the IDNA specification, as well as tests that are provided as part of the Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.
