Codepoints / Unidump

Licence: MIT
hexdump(1) for Unicode data

Programming Languages

python
python3

unidump

hexdump for your Unicode data

Installation

Install via pip:

# you need Python 3 for unidump
pip3 install unidump

Usage

Without further ado, here is the usage message of unidump:

$ unidump --help
usage: unidump [-h] [-n LENGTH] [-c ENC] [-e FORMAT] [-v] [FILE [FILE ...]]

  A Unicode code point dump.

  Think of it as  hexdump(1)  for Unicode.  The command analyses  the input and
  then prints three columns: the raw byte index of the first code point in this
  row, code points in their hex notation,  and finally the raw input characters
  with control and whitespace replaced by a dot.

  Invalid byte sequences are represented with an “X” and with the hex value en-
  closed in question marks, e.g., “?F5?”.

  You can pipe in  data from stdin,  select several files at once,  or even mix
  all those input methods together.

positional arguments:
  FILE                  input files. Use `-' or keep empty for stdin.

optional arguments:
  -h, --help            show this help message and exit
  -n LENGTH, --length LENGTH
                        format output using this much input characters.
                        Default is 16 characters.
  -c ENC, --encoding ENC
                        interpret input in this encoding. Default is utf-8.
                        You can choose any encoding that Python supports, e.g.
                        “latin-1”.
  -e FORMAT, --format FORMAT
                        specify a custom format in Python’s {} notation.
                        Default is “{byte:>7} {repr} {data} ”.
  -v, --version         show program's version number and exit

Examples:

* Basic usage with stdin:

      echo -n 'ABCDEFGHIJKLMNOP' | unidump -n 4
            0    0041 0042 0043 0044    ABCD
            4    0045 0046 0047 0048    EFGH
            8    0049 004A 004B 004C    IJKL
           12    004D 004E 004F 0050    MNOP

* Dump the code points translated from another encoding:

      unidump -c latin-1 some-legacy-file

* Dump many files at the same time:

      unidump foo-*.txt

* Control characters and whitespace are safely rendered:

      echo -n -e '\x01' | unidump -n 1
           0    0001    .

* Finally learn what your favorite Emoji is composed of:

      ( echo -n -e '\xf0\x9f\xa7\x9d\xf0\x9f\x8f\xbd\xe2' ; \
        echo -n -e '\x80\x8d\xe2\x99\x82\xef\xb8\x8f' ; ) | \
      unidump -n 5
           0    1F9DD 1F3FD 200D 2642 FE0F    .🏽.♂️

  See  <http://emojipedia.org/man-elf-medium-skin-tone/> for images.  The “elf”
  emoji (the first character) is replaced with a dot here,  because the current
  version of Python’s unicodedata doesn’t know of this character yet.

* Use it like strings(1):

      unidump -e '{data}' some-file.bin

  This will replace  every unknown byte from the input file  with “X” and every
  control and whitespace character with “.”.

* Only print the code points of the input:

      unidump -e '{repr}'$'\n' -n 1 some-file.txt

  This results in a stream of code points in hex notation,  each on a new line,
  without byte counter  or rendering of actual data.  You can use this to count
  the total amount of characters  (as opposed to raw bytes)  in a file,  if you
  pipe it through `wc -l`.
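
The row model described in the usage message (byte index, hex code points, sanitized characters, and the “X” / “?F5?” markers for invalid bytes) can be approximated in a few lines of Python. This is only an illustrative sketch, not unidump's actual implementation: the names `sanitize` and `dump_rows` are hypothetical, the four-space column separator is chosen to match the sample output above, and the byte offsets are only reliable for encodings without a byte order mark, such as utf-8 or latin-1.

```python
import unicodedata


def sanitize(char):
    """Return a dot for control, format, unassigned, and separator characters.

    Assumption: any Unicode category starting with "C" or "Z" is replaced.
    """
    return "." if unicodedata.category(char)[0] in "CZ" else char


def dump_rows(raw, length=16, encoding="utf-8",
              fmt="{byte:>7}    {repr}    {data}"):
    """Yield one formatted row per `length` decoded units of `raw` (bytes)."""
    # surrogateescape maps every undecodable byte 0xNN to U+DC00 + 0xNN,
    # so invalid input survives decoding and can be spotted afterwards.
    text = raw.decode(encoding, errors="surrogateescape")
    offset = 0                      # byte offset of the current unit
    row_start, codes, shown = 0, [], []
    for char in text:
        if not codes:
            row_start = offset      # first unit of a new row
        code = ord(char)
        if 0xDC80 <= code <= 0xDCFF:
            # An escaped, undecodable byte: show it as ?XX? and "X".
            codes.append("?{:02X}?".format(code - 0xDC00))
            shown.append("X")
            offset += 1
        else:
            codes.append("{:04X}".format(code))
            shown.append(sanitize(char))
            offset += len(char.encode(encoding))
        if len(codes) == length:
            yield fmt.format(byte=row_start, repr=" ".join(codes),
                             data="".join(shown))
            codes, shown = [], []
    if codes:                       # flush a final, shorter row
        yield fmt.format(byte=row_start, repr=" ".join(codes),
                         data="".join(shown))


for row in dump_rows(b"ABCDEFGHIJKLMNOP", length=4):
    print(row)
```

Passing `fmt="{repr}"` together with `length=1` yields one code point per row, which mirrors the last example above: counting the rows gives the number of characters rather than the number of bytes.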

License

MIT-licensed. See license file.
