andrewstuart / Goq

Licence: MIT
A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library

goq

Example

package main

import (
	"log"
	"net/http"

	"astuart.co/goq"
)

// Structured representation for github file name table
type example struct {
	Title string `goquery:"h1"`
	Files []string `goquery:"table.files tbody tr.js-navigation-item td.content,text"`
}

func main() {
	res, err := http.Get("https://github.com/andrewstuart/goq")
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()

	var ex example
	
	err = goq.NewDecoder(res.Body).Decode(&ex)
	if err != nil {
		log.Fatal(err)
	}

	log.Println(ex.Title, ex.Files)
}

Details

goq

-- import "astuart.co/goq"

Package goq was built to allow users to declaratively unmarshal HTML into Go structs using struct tags composed of CSS selectors.

I've made a best effort to behave much like JSON and XML decoding, and to expose as much information as possible when an error occurs to help you debug unmarshaling issues.

When creating struct types to be unmarshaled into, the following general rules apply:

  • Any type that implements the Unmarshaler interface will be passed a slice of *html.Node so that manual unmarshaling may be done. This takes the highest precedence.

  • Any struct fields may be annotated with goquery metadata, which takes the form of an element selector followed by arbitrary comma-separated "value selectors."

  • A value selector may be one of html, text, or [someAttrName]. html and text will result in the methods of the same name being called on the *goquery.Selection to obtain the value. [someAttrName] will result in *goquery.Selection.Attr("someAttrName") being called for the value.

  • A primitive value type will default to the text value of the resulting nodes if no value selector is given.

  • At least one value selector is required for maps, to determine the map key. The key type must follow both the rules applicable to Go map indexing and these unmarshaling rules. The value of each key will be unmarshaled in the same way the element value is unmarshaled.

  • For maps, keys will be retrieved from the same level of the DOM. The key selector may be arbitrarily nested; however, the first level of children with any number of matching elements is the one that will be used.

  • For maps, any values must be nested below the level of the key selector. Parents or siblings of the element matched by the key selector will not be considered.

  • Once used, a "value selector" will be shifted off of the comma-separated list. This allows you to nest arbitrary levels of value selectors. For example, the type []map[string][]string would require one selector for the map key, and take an optional second selector for the values of the string slice.

  • Any struct type encountered in nested types (e.g. map[string]SomeStruct) will override any remaining "value selectors" that had not been used. For example, given:

    type S struct { F string `goquery:",[bang]"` }

    var v struct { T map[string]S `goquery:"#someId,[foo],[bar],[baz]"` }

[foo] will be used to determine the string map key, but [bar] and [baz] will be ignored, because the [bang] tag on the S struct type takes precedence.
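
The tag layout described in the rules above (an element selector, then comma-separated value selectors that are shifted off as they are used) can be illustrated with the standard library alone. This is only a sketch of the idea, not goq's actual implementation: the names page, splitTag, shift, and describe are hypothetical, and the naive comma split assumes the element selector itself contains no commas.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// page is a hypothetical type that reuses tag strings shown in this document.
type page struct {
	Title string   `goquery:"h1"`
	Files []string `goquery:"table.files tbody tr.js-navigation-item td.content,text"`
	F     string   `goquery:",[bang]"`
}

// splitTag separates a goquery tag into its element selector and the
// remaining value selectors. Simplification: a CSS selector that itself
// contains a comma is not handled here.
func splitTag(tag string) (selector string, valueSelectors []string) {
	parts := strings.Split(tag, ",")
	return parts[0], parts[1:]
}

// shift removes the first value selector from the list and returns it
// together with the selectors left for deeper nesting levels.
func shift(valueSelectors []string) (head string, rest []string) {
	if len(valueSelectors) == 0 {
		return "", nil
	}
	return valueSelectors[0], valueSelectors[1:]
}

// describe classifies a value selector as html, text, or [someAttrName].
func describe(vs string) string {
	switch {
	case vs == "":
		return "text content (default for primitive types)"
	case vs == "html":
		return "inner HTML"
	case vs == "text":
		return "text content"
	case len(vs) > 2 && strings.HasPrefix(vs, "[") && strings.HasSuffix(vs, "]"):
		return "attribute " + vs[1:len(vs)-1]
	default:
		return "unrecognized value selector"
	}
}

func main() {
	// Read each field's goquery tag through reflection and split it.
	t := reflect.TypeOf(page{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		sel, vs := splitTag(f.Tag.Get("goquery"))
		fmt.Printf("%s: selector=%q value selectors=%q\n", f.Name, sel, vs)
	}

	// The map tag from the nested-struct rule: the first value selector
	// becomes the map key, the rest are left over for nested values.
	_, vs := splitTag("#someId,[foo],[bar],[baz]")
	key, rest := shift(vs)
	fmt.Println("map key from:", describe(key), "remaining:", rest)
}
```

Running the sketch prints how each tag decomposes. In the map case, [foo] is consumed as the key selector and [bar] and [baz] remain, which are the selectors a nested struct type's own tags would override.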

Usage

func NodeSelector

func NodeSelector(nodes []*html.Node) *goquery.Selection

NodeSelector is a quick utility function to get a goquery.Selection from a slice of *html.Node. Useful for performing unmarshaling, since the decision was made to use []*html.Node for maximum flexibility.

func Unmarshal

func Unmarshal(bs []byte, v interface{}) error

Unmarshal takes a byte slice and a destination pointer to any interface{}, and unmarshals the document into the destination based on the rules above. Any error returned here will likely be of type CannotUnmarshalError, though an initial goquery error will pass through directly.

func UnmarshalSelection

func UnmarshalSelection(s *goquery.Selection, iface interface{}) error

UnmarshalSelection will unmarshal a goquery.Selection into an interface appropriately annotated with goquery tags.

type CannotUnmarshalError

type CannotUnmarshalError struct {
	Err      error
	Val      string
	FldOrIdx interface{}
}

CannotUnmarshalError represents an error returned by the goquery Unmarshaler and helps consumers in programmatically diagnosing the cause of their error.

func (*CannotUnmarshalError) Error

func (e *CannotUnmarshalError) Error() string
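
One way a caller might inspect this error programmatically is with errors.As. The sketch below is an assumption-laden illustration: it declares a local stand-in type that copies the field layout shown above so that it compiles without the library, the Error message format is invented, and diagnose is a hypothetical helper. It also assumes FldOrIdx holds a field name or a slice index, as the name suggests. With the real package, the target would be a *goq.CannotUnmarshalError.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// CannotUnmarshalError is a local stand-in mirroring the documented fields.
// It is NOT the library's type; it exists so this sketch is self-contained.
type CannotUnmarshalError struct {
	Err      error
	Val      string
	FldOrIdx interface{}
}

// Error uses an invented message format for the stand-in type.
func (e *CannotUnmarshalError) Error() string {
	return fmt.Sprintf("cannot unmarshal %q at %v: %v", e.Val, e.FldOrIdx, e.Err)
}

// diagnose (hypothetical helper) reports which field or index rejected
// which value when err is, or wraps, a *CannotUnmarshalError.
func diagnose(err error) string {
	var cue *CannotUnmarshalError
	if errors.As(err, &cue) {
		return fmt.Sprintf("field/index %v rejected value %q", cue.FldOrIdx, cue.Val)
	}
	return "other error: " + err.Error()
}

func main() {
	// Simulate a failed conversion of scraped text into an int field.
	_, convErr := strconv.Atoi("abc")
	err := &CannotUnmarshalError{Err: convErr, Val: "abc", FldOrIdx: "Count"}

	fmt.Println(diagnose(err))
	fmt.Println(diagnose(fmt.Errorf("decoding page: %w", err)))
	fmt.Println(diagnose(errors.New("network failure")))
}
```

Because errors.As walks the wrap chain, the same helper works whether the error is returned directly or wrapped with %w by the calling code.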

type Decoder

type Decoder struct {
}

Decoder implements the same API you will see in encoding/xml and encoding/json, except that proper streaming decoding is not currently supported because goquery does not support it upstream.

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a new Decoder given an io.Reader.

func (*Decoder) Decode

func (d *Decoder) Decode(dest interface{}) error

Decode will unmarshal the contents of the decoder when given an instance of an annotated type as its argument. It will return any errors encountered during either parsing the document or unmarshaling into the given object.

type Unmarshaler

type Unmarshaler interface {
	UnmarshalHTML([]*html.Node) error
}

Unmarshaler allows for custom implementations of unmarshaling logic.

TODO

  • Callable goquery methods with args, via reflection