
dimchansky / utfbom

License: Apache-2.0
Detects the BOM and removes it as necessary

Programming Languages

go
31211 projects - #10 most used programming language

Projects that are alternatives of or similar to utfbom

characteristics
Character info under different encodings
Stars: ✭ 25 (-71.26%)
Mutual labels:  unicode, utf
unicode-lookup
The web's best unicode lookup tool!
Stars: ✭ 49 (-43.68%)
Mutual labels:  unicode, utf
UnicodeBOMInputStream
Doing things right, in the name of Sun / Oracle
Stars: ✭ 36 (-58.62%)
Mutual labels:  unicode, bom
emoji-db
A database of Apple-supported emojis in JSON format. Used by my Alfred emoji workflow.
Stars: ✭ 32 (-63.22%)
Mutual labels:  unicode
unigem-objective-c
Unicode Gems, a Mac app, an iOS app, and an iOS keyboard for letter-like unicode.
Stars: ✭ 22 (-74.71%)
Mutual labels:  unicode
android-unicode
Android unicode UTF-7 input apk
Stars: ✭ 23 (-73.56%)
Mutual labels:  unicode
log-utils
Basic logging utils: colors, symbols and timestamp.
Stars: ✭ 24 (-72.41%)
Mutual labels:  unicode
unicode display width
Displayed width of UTF-8 strings in Modern C++
Stars: ✭ 30 (-65.52%)
Mutual labels:  unicode
icu-swift
Swift APIs for ICU
Stars: ✭ 23 (-73.56%)
Mutual labels:  unicode
nepali utils
A pure dart package with collection of Nepali Utilities like Date converter, Date formatter, DateTime, Nepali Numbers, Nepali Unicode, Nepali Moments and many more.
Stars: ✭ 22 (-74.71%)
Mutual labels:  unicode
utf8-validator
UTF-8 Validator
Stars: ✭ 18 (-79.31%)
Mutual labels:  unicode
CJK-character-count
Program that counts the amount of CJK characters based on Unicode ranges and Chinese encoding standards 字体汉字计数软件
Stars: ✭ 195 (+124.14%)
Mutual labels:  unicode
unicode
A Flask-Based Web-App for Exploring Unicode
Stars: ✭ 12 (-86.21%)
Mutual labels:  unicode
table2ascii
Python library for converting lists to fancy ASCII tables for displaying in the terminal and on Discord
Stars: ✭ 31 (-64.37%)
Mutual labels:  unicode
ruby-homograph-detector
🕵️‍♀️🕵️‍♂️ Ruby gem for determining whether a given URL is considered an IDN homograph attack
Stars: ✭ 29 (-66.67%)
Mutual labels:  unicode
KC2PK
KiCad to PartKeepr BOM Tool with Octopart integration
Stars: ✭ 28 (-67.82%)
Mutual labels:  bom
glyphhanger
Your web font utility belt. It can subset web fonts. It can find unicode-ranges for you automatically. It makes julienne fries.
Stars: ✭ 422 (+385.06%)
Mutual labels:  unicode
hyphenation
Text hyphenation for Rust
Stars: ✭ 43 (-50.57%)
Mutual labels:  unicode
unicodia
Encyclopedia of Unicode characters
Stars: ✭ 17 (-80.46%)
Mutual labels:  unicode
front-end-notes
前端课程学习笔记汇总
Stars: ✭ 57 (-34.48%)
Mutual labels:  bom

utfbom

The utfbom package detects a Unicode byte order mark (BOM) at the start of a stream and removes it as necessary. It can also return the encoding indicated by the BOM.
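To make the idea of BOM detection concrete, here is a minimal standard-library sketch that matches the start of a byte slice against the well-known BOM signatures. It is not the package's implementation: the names `bomSignature`, `signatures`, and `detectBOM`, as well as the encoding labels, are hypothetical and chosen only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// bomSignature pairs an illustrative encoding label with its BOM bytes.
// The type and the labels are assumptions, not part of utfbom.
type bomSignature struct {
	name string
	bom  []byte
}

// Longer signatures are listed first: the UTF-32 LE BOM begins with the
// same two bytes as the UTF-16 LE BOM, so order matters.
var signatures = []bomSignature{
	{"UTF-32BE", []byte{0x00, 0x00, 0xFE, 0xFF}},
	{"UTF-32LE", []byte{0xFF, 0xFE, 0x00, 0x00}},
	{"UTF-8", []byte{0xEF, 0xBB, 0xBF}},
	{"UTF-16BE", []byte{0xFE, 0xFF}},
	{"UTF-16LE", []byte{0xFF, 0xFE}},
}

// detectBOM returns the label of the matching BOM and its length in bytes,
// or ("unknown", 0) when data does not start with a known BOM.
func detectBOM(data []byte) (string, int) {
	for _, s := range signatures {
		if bytes.HasPrefix(data, s.bom) {
			return s.name, len(s.bom)
		}
	}
	return "unknown", 0
}

func main() {
	inputs := [][]byte{[]byte("\xEF\xBB\xBFhello"), []byte("hello")}
	for _, in := range inputs {
		name, n := detectBOM(in)
		// in[n:] is the payload with the BOM, if any, removed.
		fmt.Printf("%s: skip %d byte(s), rest %q\n", name, n, in[n:])
	}
}
```

Checking the four-byte signatures before the two-byte ones avoids misreading UTF-32 little-endian input as UTF-16 little-endian.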

Installation

go get -u github.com/dimchansky/utfbom

Example

package main

import (
	"bytes"
	"fmt"
	"io"

	"github.com/dimchansky/utfbom"
)

func main() {
	trySkip([]byte("\xEF\xBB\xBFhello")) // input with a UTF-8 BOM
	trySkip([]byte("hello"))             // input without a BOM
}

// trySkip reads byteData twice: once only skipping the BOM, and once
// skipping the BOM while also reporting the detected encoding.
func trySkip(byteData []byte) {
	fmt.Println("Input:", byteData)

	// just skip BOM
	// (io.ReadAll replaces the deprecated ioutil.ReadAll since Go 1.16)
	output, err := io.ReadAll(utfbom.SkipOnly(bytes.NewReader(byteData)))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("ReadAll with BOM skipping", output)

	// skip BOM and detect encoding
	sr, enc := utfbom.Skip(bytes.NewReader(byteData))
	fmt.Printf("Detected encoding: %s\n", enc)
	output, err = io.ReadAll(sr)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("ReadAll with BOM detection and skipping", output)
	fmt.Println()
}

Output:

$ go run main.go
Input: [239 187 191 104 101 108 108 111]
ReadAll with BOM skipping [104 101 108 108 111]
Detected encoding: UTF8
ReadAll with BOM detection and skipping [104 101 108 108 111]

Input: [104 101 108 108 111]
ReadAll with BOM skipping [104 101 108 108 111]
Detected encoding: Unknown
ReadAll with BOM detection and skipping [104 101 108 108 111]