Homoglyph/IDN homograph detection/handling - Clojure port of python confusable_homoglyphs

Τһогɴ

This is a direct port of @vhf's confusable_homoglyphs (https://github.com/vhf/confusable_homoglyphs) to Clojure.

"A homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar." (Wikipedia: Homoglyph)

Unicode homoglyphs can be a nuisance on the web. Your most popular client, AlaskaJazz, might be upset to be impersonated by a trickster who deliberately chose the username ΑlaskaJazz.

  • AlaskaJazz is single-script: only Latin characters.
  • ΑlaskaJazz is mixed-script: the first character is a Greek letter.
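
To make the single-script versus mixed-script distinction concrete, here is a minimal sketch that relies only on the JVM's built-in `Character.UnicodeScript` lookup. It is not thorn's API: the namespace `example.scripts` and the functions `scripts` and `mixed-script?` are hypothetical names chosen for illustration.

```clojure
(ns example.scripts
  (:import (java.lang Character$UnicodeScript)))

(defn scripts
  "Returns the set of Unicode script names (as strings) used by the
  letters in s. Non-letters such as digits and punctuation are ignored."
  [^String s]
  (->> (.toArray (.codePoints s))                          ; int[] of code points
       (filter #(Character/isLetter (int %)))              ; keep letters only
       (map #(str (Character$UnicodeScript/of (int %))))   ; e.g. "LATIN"
       set))

(defn mixed-script?
  "True when the letters of s come from more than one script."
  [s]
  (> (count (scripts s)) 1))

;; Illustrative usage; \u0391 is GREEK CAPITAL LETTER ALPHA.
(scripts "AlaskaJazz")             ; => #{"LATIN"}
(scripts "\u0391laskaJazz")        ; => #{"GREEK" "LATIN"}
(mixed-script? "\u0391laskaJazz")  ; => true
```

The escape `\u0391` is used instead of the literal character so that the lookalike is visible in the source.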

You might also want to avoid people being tricked into entering their password on www.microsоft.com or www.faϲebook.com instead of www.microsoft.com or www.facebook.com. Here is a utility to play with these confusable homoglyphs.

Not all mixed-script strings have to be ruled out, though: you could exclude only those mixed-script strings that contain characters which might be confused with a character from Unicode blocks of your choosing.

  • Allo and ρττ are fine: single script.
  • AlloΓ is fine when our preferred script alias is 'latin': mixed script, but Γ is not confusable.
  • Alloρ is dangerous: mixed script and ρ could be confused with p.
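
The three cases above can be sketched as follows. This is an illustration under stated assumptions, not thorn's implementation: the namespace, the function names, and the tiny `latin-lookalikes` table are hypothetical, and the table is hand-picked for this example, whereas a real check would draw on the full Unicode confusables data.

```clojure
(ns example.dangerous
  (:import (java.lang Character$UnicodeScript)))

;; Hypothetical, hand-picked subset of characters that resemble Latin
;; letters. Assumption: a real check would use the Unicode confusables data.
(def latin-lookalikes
  {0x0391 \A    ; GREEK CAPITAL LETTER ALPHA
   0x03C1 \p    ; GREEK SMALL LETTER RHO
   0x03F2 \c    ; GREEK LUNATE SIGMA SYMBOL
   0x043E \o})  ; CYRILLIC SMALL LETTER O

(defn code-points
  "Returns the code points of s as a seq of longs."
  [^String s]
  (map long (.toArray (.codePoints s))))

(defn scripts
  "Returns the set of Unicode script names used by the letters in s."
  [s]
  (->> (code-points s)
       (filter #(Character/isLetter (int %)))
       (map #(str (Character$UnicodeScript/of (int %))))
       set))

(defn dangerous?
  "True when s mixes scripts AND contains at least one character from
  the lookalike table. Single-script strings are never flagged."
  [s]
  (boolean
   (and (> (count (scripts s)) 1)
        (some #(contains? latin-lookalikes %) (code-points s)))))

;; Illustrative usage; \u0393 is GREEK CAPITAL LETTER GAMMA,
;; \u03C1 is GREEK SMALL LETTER RHO, \u03C4 is GREEK SMALL LETTER TAU.
(dangerous? "Allo")                ; => false, single script
(dangerous? "\u03C1\u03C4\u03C4")  ; => false, single script (Greek)
(dangerous? "Allo\u0393")          ; => false, mixed but not confusable
(dangerous? "Allo\u03C1")          ; => true, mixed and confusable with p
```

Requiring both conditions keeps legitimate single-script Greek or Cyrillic names acceptable while still catching a lone lookalike slipped into an otherwise Latin string.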

Documentation

Codox-generated documentation.

The tests might help you get started.

Installation

thorn is available on Clojars.

Add the dependency coordinates shown on the project's Clojars page to your dependencies.

License

Distributed under the Eclipse Public License, the same as Clojure.

Port of https://github.com/vhf/confusable_homoglyphs, which is MIT-licensed.
