
janlelis / Unibits

License: MIT
Visualize different Unicode encodings in the terminal

Programming Languages

ruby

Projects that are alternatives of or similar to Unibits

Transliteration
UTF-8 to ASCII transliteration / slugify module for node.js, browser, Web Worker, React Native, Electron and CLI.
Stars: ✭ 444 (+255.2%)
Mutual labels:  unicode, ascii, utf-8
Unicopy
Unicode command-line codepoint dumper
Stars: ✭ 16 (-87.2%)
Mutual labels:  cli-command, unicode, utf-8
Tvision
A modern port of Turbo Vision 2.0, the classical framework for text-based user interfaces. Now cross-platform and with Unicode support.
Stars: ✭ 612 (+389.6%)
Mutual labels:  terminal, ascii, utf-8
homoglyphs
Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group.
Stars: ✭ 70 (-44%)
Mutual labels:  unicode, ascii, utf-8
Uniscribe
Know your Unicode ✀
Stars: ✭ 266 (+112.8%)
Mutual labels:  cli-command, debugging-tool, unicode
characteristics
Character info under different encodings
Stars: ✭ 25 (-80%)
Mutual labels:  unicode, ascii, utf-8
Cowsay Files
A collection of additional/alternative cowsay files.
Stars: ✭ 216 (+72.8%)
Mutual labels:  terminal, unicode, ascii
Portable Utf8
🉑 Portable UTF-8 library - performance optimized (unicode) string functions for php.
Stars: ✭ 405 (+224%)
Mutual labels:  unicode, ascii, utf-8
Lehar
Visualize data using relative ordering
Stars: ✭ 81 (-35.2%)
Mutual labels:  terminal, unicode, ascii
Terminaltables
Generate simple tables in terminals from a nested list of strings.
Stars: ✭ 685 (+448%)
Mutual labels:  terminal, ascii
Awesome Unicode
😂 👌 A curated list of delightful Unicode tidbits, packages and resources.
Stars: ✭ 693 (+454.4%)
Mutual labels:  unicode, utf-8
Slug Generator
Slug Generator Library for PHP, based on Unicode’s CLDR data
Stars: ✭ 740 (+492%)
Mutual labels:  unicode, ascii
Diagram
CLI app to convert ASCII arts into hand drawn diagrams.
Stars: ✭ 642 (+413.6%)
Mutual labels:  terminal, ascii
Urlify
A fast PHP slug generator and transliteration library that converts non-ascii characters for use in URLs.
Stars: ✭ 633 (+406.4%)
Mutual labels:  unicode, ascii
Unicodeplots.jl
Unicode-based scientific plotting for working in the terminal
Stars: ✭ 724 (+479.2%)
Mutual labels:  terminal, unicode
Kibi
A text editor in ≤1024 lines of code, written in Rust
Stars: ✭ 522 (+317.6%)
Mutual labels:  terminal, utf-8
Git Praise
A nicer git blame.
Stars: ✭ 24 (-80.8%)
Mutual labels:  terminal, unicode
Box Cli Maker
Make Highly Customized Boxes for your CLI
Stars: ✭ 115 (-8%)
Mutual labels:  terminal, unicode
Video To Ascii
It is a simple python package to play videos in the terminal using characters as pixels
Stars: ✭ 960 (+668%)
Mutual labels:  terminal, ascii
Turbo
An experimental text editor based on Scintilla and Turbo Vision.
Stars: ✭ 78 (-37.6%)
Mutual labels:  terminal, utf-8

unibits | Reveal the Unicode

Ruby library and CLI command that visualizes various Unicode and ASCII/single-byte encodings in the terminal:

  • Makes analyzing encodings easier
  • Helps you with debugging strings
  • Highlights invalid/special/blank bytes/characters/codepoints
  • Supports UTF-8, UTF-16LE/UTF-16BE, UTF-32LE/UTF-32BE, ISO-8859-X, Windows-125X, IBMX, CP85X, macX, TIS-620/Windows-874, KOI8-R/KOI8-U, 7-Bit ASCII/GB1988, and arbitrary BINARY data

Color Coding

Each byte of the given string is highlighted using the following mechanism (characters -> codepoints):

  • Red for invalid bytes
  • Light blue for blanks
  • Blue for control characters
  • Pink for non-control formatting characters
  • Green for marks (Unicode only)
  • Orange for unassigned codepoints
  • Lighter orange for unassigned codepoints which are also ignorable
  • Random color for all other codepoints
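The categories above can be approximated with Ruby's Unicode property classes. The following is only a minimal sketch, not how unibits itself decides on colors: the `classify` helper is hypothetical, it assumes UTF-8 input, it treats "blank" as the space separator category (`Zs`), and it does not distinguish ignorable unassigned codepoints.

```ruby
# Illustrative only: a hypothetical helper, not part of unibits.
# Assumes each character comes from a UTF-8 string.
def classify(char)
  return :invalid unless char.valid_encoding?

  case char
  when /\A\p{Zs}\z/ then :blank       # space separators (assumed meaning of "blank")
  when /\A\p{Cc}\z/ then :control     # control characters
  when /\A\p{Cf}\z/ then :format      # non-control formatting characters
  when /\A\p{M}\z/  then :mark        # combining marks
  when /\A\p{Cn}\z/ then :unassigned  # codepoints without an assignment
  else :other
  end
end

# One sample per category; "\x80" is a lone continuation byte.
["a", " ", "\u0000", "\u200D", "\u0301", "\u0378", "\x80"].each do |char|
  puts "#{char.inspect} -> #{classify(char)}"
end
```

The invalid check comes first because matching a Unicode regular expression against a broken string would raise an error.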

The same colors are used in the higher-level companion tool uniscribe.

Setup

Make sure Ruby is installed and that installing gems works properly. Then run:

$ gem install unibits

Usage

Pass the string to debug to unibits:

From CLI

$ unibits "🌫 Idiosyncrätic ℜսᖯʏ"

From Ruby

require 'unibits/kernel_method'
unibits "🌫 Idiosyncrätic ℜսᖯʏ"

Advanced Options

unibits accepts the following options:

  • encoding (e): The encoding of the given string (uses the string's default encoding if none given)
  • convert (c): An encoding the string should be converted to before visualizing it
  • stats: Whether to show a short stats header (default: true). On the CLI, you can deactivate it with --no-stats
  • wide-ambiguous: Treat characters of ambiguous width as 2 columns wide instead of 1
  • width (w): Set a custom column width. If not set, unibits will retrieve it from the terminal or fall back to 80
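The difference between encoding and convert can be illustrated with plain Ruby string methods. This sketch assumes that encoding behaves like relabeling the bytes (`String#force_encoding`) and convert like transcoding them (`String#encode`); it does not call unibits, and `hex_bytes` is a hypothetical helper.

```ruby
# Hypothetical helper: render a string's bytes as hex.
def hex_bytes(string)
  string.bytes.map { |byte| format("%02X", byte) }.join(" ")
end

text = "\u00E4" # "ä" in UTF-8, stored as the two bytes C3 A4

# Roughly what `encoding` means: keep the bytes, change how they are read.
relabeled = text.dup.force_encoding("ISO-8859-1")

# Roughly what `convert` means: keep the character, rewrite the bytes.
converted = text.encode("UTF-16LE")

puts hex_bytes(text)       # C3 A4
puts hex_bytes(relabeled)  # C3 A4 (same bytes, now read as two characters)
puts hex_bytes(converted)  # E4 00
```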

Examples of Valid Encodings

UTF-8

CLI: $ unibits -e utf-8 -c utf-8 "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-8'

Screenshot UTF-8

UTF-16LE

CLI: $ unibits -e utf-8 -c utf-16le "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-16le'

Screenshot UTF-16LE

UTF-32BE

CLI: $ unibits -e utf-8 -c utf-32be "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-32be'

Screenshot UTF-32BE
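To see what the three conversions above do at the byte level without the screenshots, the first character of the sample string (🌫, U+1F32B) can be transcoded with Ruby's own `String#encode`. This is a sketch for orientation, not unibits output; `encoded_bytes` is a hypothetical helper.

```ruby
# Hypothetical helper: bytes of `char` after transcoding to `encoding_name`.
def encoded_bytes(char, encoding_name)
  char.encode(encoding_name).bytes.map { |byte| format("%02X", byte) }.join(" ")
end

fog = "\u{1F32B}" # 🌫, the first character of the sample string

puts encoded_bytes(fog, "UTF-8")    # F0 9F 8C AB
puts encoded_bytes(fog, "UTF-16LE") # 3C D8 2B DF (surrogate pair D83C DF2B, low byte first)
puts encoded_bytes(fog, "UTF-32BE") # 00 01 F3 2B
```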

BINARY

CLI: $ unibits -e binary "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'binary'

Screenshot BINARY

ASCII

CLI: $ unibits -e utf-8 -c ascii "ascii"

Ruby: unibits "ascii", encoding: 'utf-8', convert: 'ascii'

Screenshot ASCII

Examples of Invalid Encodings

UTF-8

Example in Ruby: unibits "unexpected \x80 | not enough \xF0\x9F\x8C | overlong \xE0\x81\x81 | surrogate \xED\xA0\x80 | too large \xF5\x8F\xBF\xBF"

Screenshot invalid UTF-8
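Each fragment in the example above is rejected by Ruby's own UTF-8 validation, which can be checked without unibits. A minimal sketch using `String#valid_encoding?` and `String#scrub`; the `SAMPLES` hash and its labels are only for illustration.

```ruby
# The byte sequences from the example, one per kind of error.
SAMPLES = {
  "unexpected continuation byte" => "\x80",
  "not enough bytes"             => "\xF0\x9F\x8C",
  "overlong encoding"            => "\xE0\x81\x81",
  "surrogate"                    => "\xED\xA0\x80",
  "too large"                    => "\xF5\x8F\xBF\xBF"
}.freeze

SAMPLES.each do |label, bytes|
  puts "#{label}: valid_encoding? => #{bytes.valid_encoding?}"
end

# scrub replaces invalid bytes, leaving the valid parts untouched.
puts "ok \x80 ok".scrub("?") # ok ? ok
```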

ASCII

Example in Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'ascii'

Screenshot invalid ASCII
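Reading the string as ASCII is invalid because every non-ASCII character is stored as bytes above 0x7F. A short sketch of how such bytes could be located in plain Ruby; `non_ascii_offsets` is a hypothetical helper and the sample word uses a precomposed ä (U+00E4), which is an assumption about the original string.

```ruby
# Hypothetical helper: byte offsets that fall outside 7-bit ASCII.
def non_ascii_offsets(string)
  string.bytes.each_with_index.select { |byte, _| byte > 0x7F }.map { |_, index| index }
end

text = "Idiosyncr\u00E4tic"
as_ascii = text.dup.force_encoding("US-ASCII") # same bytes, read as ASCII

puts as_ascii.valid_encoding?            # false
puts non_ascii_offsets(as_ascii).inspect # [9, 10]
```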

Notes

More info

Related gems

Lots of thanks to @damienklinnert for the motivation and inspiration required to build this! 🎆

Copyright (C) 2017-2020 Jan Lelis https://janlelis.com. Released under the MIT license.
