zlepper / encoding-html

Licence: MIT license
A golang library for decoding html into structs

Encoding-html

A library for decoding HTML into Go structs. Useful, for example, for writing crawlers that interact with pages that do not have an actual API.

Installation

go get github.com/zlepper/encoding-html

Examples

Getting the front page of Hacker News:

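package main

import (
	"log"
	"net/http"

	// The import path contains a hyphen, so the package name is spelled out.
	html "github.com/zlepper/encoding-html"
)

// Post is a single story row on the front page.
type Post struct {
	Title string `css:".title a"`
	Link  string `css:".title a" extract:"attr" attr:"href"`
}

// HN collects every row matched by the selector; each Post is decoded
// relative to its own row element.
type HN struct {
	Posts []Post `css:".itemlist .athing"`
}

func main() {
	resp, err := http.Get("https://news.ycombinator.com/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var hn HN
	if err := html.NewDecoder(resp.Body).Decode(&hn); err != nil {
		log.Fatal(err)
	}

	log.Printf("%+v", hn)
}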

At the time of writing, that printed:

{Posts:[{Title:The NetHack dev team is happy to announce the release of NetHack 3.6.1 Link:https://groups.google.com/forum/m/#!topic/rec.games.roguelike.nethack/XhcIrLlNzpA} {Title:Show HN: A fast, hopefully accurate, fuzzy matching library written in Go Link:https://github.com/sahilm/fuzzy} {Title:Larry Harvey, co-founder of Burning Man, has died Link:https://www.nytimes.com/2018/04/28/obituaries/larry-harvey-burning-man-festival-dead-at-70.html} {Title:Ask HN: My startup has basically failed. What now? Link:item?id=16949209} {Title:Kasparov versus the World Link:https://en.wikipedia.org/wiki/Kasparov_versus_the_World} {Title:Show HN: A proof-of-concept FoundationDB based network block device backend Link:https://github.com/dividuum/fdb-nbd} {Title:OpenEMR v5.0.1 Link:http://www.openhealthnews.com/content/openemr-community-releases-monumental-upgrade-their-open-source-ehr-update-ready-download} {Title:It’s Impossible to Prove Your Laptop Hasn’t Been Hacked Link:https://theintercept.com/2018/04/28/computer-malware-tampering/} {Title:HyperTools: A Python toolbox for gaining insights into high-dimensional data Link:http://hypertools.readthedocs.io/en/latest/#} {Title:Nintendo's secretive creative process Link:https://amp.theguardian.com/games/2018/apr/25/nintendo-interview-secret-innovation-lab-ideas-working} {Title:VoiceOps is hiring in SF to build AI for b2b voice data Link:https://voiceops.com/careers.html} {Title:Show HN: Generating fun Stack Exchange questions using Markov chains Link:https://se-simulator.lw1.at/} {Title:The myopia boom (2015) Link:https://www.nature.com/news/the-myopia-boom-1.17120} {Title:Seattle vacates hundreds of marijuana charges going back 30 years Link:https://www.theroot.com/seattle-vacates-hundreds-of-marijuana-possession-charge-1825622917} {Title:In theory, rocks from Oman could store hundreds of years of human CO2 emissions Link:https://www.nytimes.com/interactive/2018/04/26/climate/oman-rocks.html} {Title:The quadratic formula and low-precision arithmetic Link:https://www.johndcook.com/blog/2018/04/28/quadratic-formula/} {Title:Implementing and Understanding Type Classes (2014) Link:http://okmij.org/ftp/Computation/typeclass.html} {Title:Drawing with boids Link:https://miniatureape.github.io/boiddraw/} {Title:Lessons learned from a failing local mall Link:https://www.strongtowns.org/journal/2018/4/23/bon-ton-gone} {Title:French museum discovers half of its collection are fakes Link:https://www.telegraph.co.uk/news/2018/04/28/french-museum-discovers-half-collection-fakes/} {Title:World's oldest spider discovered in Australian outback Link:https://phys.org/news/2018-04-world-oldest-spider-australian-outback.html} {Title:Statement on Nature Machine Intelligence Link:https://openaccess.engineering.oregonstate.edu/home} {Title:The Wren Programming Language Link:https://github.com/munificent/wren} {Title:Facebook Warns Investors to Expect 'Additional Incidents' of User Data Abuse Link:https://www.siliconvalley.com/2018/04/27/facebook-got-an-earnings-boost-but-heres-the-fine-print/} {Title:Open3D: A Modern Library for 3D Data Processing Home Code Docs C++ API Link:http://www.open-3d.org/} {Title:A Layman’s Intro to Western Classical Music Link:https://quariety.com/2018/04/28/a-laymans-intro-to-western-classical-music/} {Title:EU agrees on total ban of bee-harming pesticides Link:https://www.theguardian.com/environment/2018/apr/27/eu-agrees-total-ban-on-bee-harming-pesticides} {Title:What it means to “disagree and commit” and how I do it (2016) Link:http://www.amazonianblog.com/2016/11/what-it-means-to-disagree-and-commit-and-how-i-do-it.html} {Title:Native Clojure with GraalVM Link:https://www.innoq.com/en/blog/native-clojure-and-graalvm/} {Title:Bulldoze the business school Link:https://www.theguardian.com/news/2018/apr/27/bulldoze-the-business-school}]}

Tag options

Everything in encoding-html is specified using struct tags. The currently available tags are as follows:

css

Specifies the CSS selector for finding the element. An element is always selected using the parent field's element as the root. This makes it possible to select within arrays: each item's fields are matched relative to that item's own element.

If the selector is not specified, then the field will be ignored. If a selector matches multiple elements, and the field is not an array, the first element will be used.

extract

Specifies how to get the text to work on. Valid options are text or attr. text will get all the inner text nodes of the html. attr will get the value of an attribute. What attribute to fetch is specified using the attr tag.

If extract is not specified, text will be selected. If an unknown option is specified, an error will be returned from the decode call.

The extracted values will automatically be parsed into the requested type using the strconv.ParseFloat, strconv.ParseInt, strconv.ParseBool, or strconv.ParseUint function in the standard library. If the value cannot be parsed, and no default value has been provided, the entire decode will return an error.
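
As a rough illustration of the conversion step described above, the following sketch maps extracted text onto a target type with the strconv parsers. The helper parseInto is hypothetical and written only for this example; it is not part of encoding-html, and the library's own implementation may differ.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// parseInto converts raw text into the variable dst points to, choosing a
// strconv parser from the variable's kind.
// Hypothetical helper for illustration; not part of encoding-html.
func parseInto(raw string, dst interface{}) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Ptr || v.IsNil() {
		return fmt.Errorf("dst must be a non-nil pointer, got %T", dst)
	}
	v = v.Elem()

	switch v.Kind() {
	case reflect.String:
		v.SetString(raw)
	case reflect.Bool:
		b, err := strconv.ParseBool(raw)
		if err != nil {
			return err
		}
		v.SetBool(b)
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		// Bits() makes the parser reject values that overflow the target type.
		n, err := strconv.ParseInt(raw, 10, v.Type().Bits())
		if err != nil {
			return err
		}
		v.SetInt(n)
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		n, err := strconv.ParseUint(raw, 10, v.Type().Bits())
		if err != nil {
			return err
		}
		v.SetUint(n)
	case reflect.Float32, reflect.Float64:
		f, err := strconv.ParseFloat(raw, v.Type().Bits())
		if err != nil {
			return err
		}
		v.SetFloat(f)
	default:
		return fmt.Errorf("unsupported kind %s", v.Kind())
	}
	return nil
}

func main() {
	var points int
	var ratio float64

	fmt.Println(parseInto("128", &points), points)
	fmt.Println(parseInto("0.75", &ratio), ratio)
	// Text that is not a plain number fails to parse and yields an error.
	fmt.Println(parseInto("12 points", &points))
}
```

In practice this means a numeric field only decodes cleanly when the selected text is a bare number; surrounding words such as "12 points" need either a narrower selector or a default value.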

attr

Specifies which attribute should be extracted from the matching HTML element. If extract:"attr" is specified and this tag is not, an error will be returned. If the attribute does not exist on the element, the empty string "" is used as the value of the attribute.
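
The rules for css, extract, and attr can be summarised in a small validation sketch. It only reads struct tags with the standard reflect package and does not call encoding-html; planField, fieldPlan, and the Broken and Notes fields are hypothetical names introduced for this illustration, and the error messages are not the library's.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// Post reuses the tag layout from the Hacker News example. Broken and Notes
// are hypothetical fields added to show the error and skip cases.
type Post struct {
	Title  string `css:".title a"`
	Link   string `css:".title a" extract:"attr" attr:"href"`
	Broken string `css:".title a" extract:"attr"`
	Notes  string // no css tag, so the field is ignored
}

// fieldPlan is a hypothetical summary of how one field would be decoded.
type fieldPlan struct {
	Skip     bool
	Selector string
	Extract  string
	Attr     string
}

// planField applies the documented tag rules to a single struct tag.
func planField(tag reflect.StructTag) (fieldPlan, error) {
	selector, ok := tag.Lookup("css")
	if !ok || selector == "" {
		// No selector: the field is ignored.
		return fieldPlan{Skip: true}, nil
	}

	// extract defaults to "text" when the tag is absent.
	plan := fieldPlan{Selector: selector, Extract: "text"}
	if extract, ok := tag.Lookup("extract"); ok {
		plan.Extract = extract
	}

	switch plan.Extract {
	case "text":
		return plan, nil
	case "attr":
		attr, ok := tag.Lookup("attr")
		if !ok {
			return fieldPlan{}, errors.New(`extract:"attr" requires an attr tag`)
		}
		plan.Attr = attr
		return plan, nil
	default:
		return fieldPlan{}, fmt.Errorf("unknown extract option %q", plan.Extract)
	}
}

func main() {
	t := reflect.TypeOf(Post{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		plan, err := planField(f.Tag)
		fmt.Printf("%s: %+v err=%v\n", f.Name, plan, err)
	}
}
```

Running a check like this over your own structs before decoding is one way to catch a missing attr tag early, since the library reports it only as an error from the decode call.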

default

Specifies a default value to use when the selected content is a zero value, or when the actual content could not be converted into the specified type.

If the default value itself cannot be converted, the entire decode will fail and return an error.
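
The fallback order can be sketched for an integer field as follows. This is an illustration under stated assumptions, not the library's code: intOrDefault is a hypothetical helper, and "zero value" is read here as empty extracted text, which for a numeric field coincides with a failed conversion.

```go
package main

import (
	"fmt"
	"strconv"
)

// intOrDefault returns the parsed content when it converts cleanly, falls
// back to the default when one was provided, and reports an error when
// neither the content nor the default can be converted.
// Hypothetical helper for illustration; not part of encoding-html.
func intOrDefault(raw, def string, hasDefault bool) (int64, error) {
	n, err := strconv.ParseInt(raw, 10, 64)
	if err == nil {
		return n, nil
	}
	if !hasDefault {
		// Unparsable content and no default: the decode fails.
		return 0, err
	}
	d, defErr := strconv.ParseInt(def, 10, 64)
	if defErr != nil {
		// A default that cannot be converted is itself an error.
		return 0, fmt.Errorf("invalid default %q: %w", def, defErr)
	}
	return d, nil
}

func main() {
	fmt.Println(intOrDefault("42", "7", true))     // content parses, default unused
	fmt.Println(intOrDefault("", "7", true))       // empty content, default used
	fmt.Println(intOrDefault("n/a", "7", true))    // unparsable content, default used
	fmt.Println(intOrDefault("n/a", "", false))    // no default, error
	fmt.Println(intOrDefault("n/a", "abc", true))  // unparsable default, error
}
```

Under this reading, a field tagged with something like default:"0" turns missing or malformed numbers into 0 instead of aborting the whole decode.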
