dyatlov / go-htmlinfo

Licence: MIT license
Go HTML Info package for extracting meaningful information from an HTML page

Projects that are alternatives of or similar to go-htmlinfo

Forensic Tools
A collection of tools for forensic analysis
Stars: ✭ 204 (+518.18%)
Mutual labels:  parse
Skrape.it
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (+600%)
Mutual labels:  parse
berkeley-parser-analyser
A tool for classifying mistakes in the output of parsers
Stars: ✭ 34 (+3.03%)
Mutual labels:  parse
Tsql Parser
Library Written in C# For Parsing SQL Server T-SQL Scripts in .Net
Stars: ✭ 203 (+515.15%)
Mutual labels:  parse
Zipson
JSON parse and stringify with compression
Stars: ✭ 229 (+593.94%)
Mutual labels:  parse
Tmxlite
lightweight C++14 parser for Tiled tmx files
Stars: ✭ 248 (+651.52%)
Mutual labels:  parse
Picasso
A Sketch plugin for code generation that automatically parses Sketch design files into front-end code.
Stars: ✭ 191 (+478.79%)
Mutual labels:  parse
icecast-parser
Node.js module for getting and parsing metadata from SHOUTcast/Icecast radio streams
Stars: ✭ 66 (+100%)
Mutual labels:  parse
Termsql
Convert text from a file or from stdin into SQL table and query it instantly. Uses sqlite as backend. The idea is to make SQL into a tool on the command line or in scripts.
Stars: ✭ 230 (+596.97%)
Mutual labels:  parse
parse-github-url
Parse a Github URL into an object. Supports a wide variety of GitHub URL formats.
Stars: ✭ 114 (+245.45%)
Mutual labels:  parse
Json
A C++11 library for parsing and serializing JSON to and from a DOM container in memory.
Stars: ✭ 205 (+521.21%)
Mutual labels:  parse
Jquery.terminal
jQuery Terminal Emulator - JavaScript library for creating web-based terminals with custom commands
Stars: ✭ 2,623 (+7848.48%)
Mutual labels:  parse
Swiftsoup
SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)
Stars: ✭ 3,079 (+9230.3%)
Mutual labels:  parse
Lark
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
Stars: ✭ 2,916 (+8736.36%)
Mutual labels:  parse
sqlite-createtable-parser
A parser for sqlite create table sql statements.
Stars: ✭ 67 (+103.03%)
Mutual labels:  parse
Dd Plist
A java library providing support for ASCII, XML and binary property lists.
Stars: ✭ 201 (+509.09%)
Mutual labels:  parse
Subtitle.js
Stream-based library for parsing and manipulating subtitle files
Stars: ✭ 234 (+609.09%)
Mutual labels:  parse
pf-azure-sentinel
Parse pfSense/OPNSense logs using Logstash, GeoIP tag entities, add additional context to logs, then send to Azure Sentinel for analysis.
Stars: ✭ 24 (-27.27%)
Mutual labels:  parse
go-oembed
Golang package for parsing Oembed data from known providers by URL
Stars: ✭ 22 (-33.33%)
Mutual labels:  parse
opensource
Collection of Open Source packages by Otherwise
Stars: ✭ 21 (-36.36%)
Mutual labels:  parse

Go HTML Info

Go HTML Info provides a simple interface for extracting meaningful information from an HTML page.

source docs: http://godoc.org/github.com/dyatlov/go-htmlinfo/htmlinfo

Install: go get github.com/dyatlov/go-htmlinfo/htmlinfo

Use: import "github.com/dyatlov/go-htmlinfo/htmlinfo"

The Parse method parses all HTML content into structured data. The GenerateOembedFor method generates oEmbed info from that structured data. It builds that info from whatever data is available, even if no oEmbed information is found on the page.
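The fallback behaviour described above can be pictured with a small sketch. This is not the library's actual implementation: the helper `firstNonEmpty` and the priority order (oEmbed first, then Open Graph, then the page's own description) are assumptions made for illustration only. In the sample output further down, the description inside the page's oEmbed data is empty while the generated oEmbed description is filled in, which is the kind of result this sketch mimics.

```go
package main

import "fmt"

// firstNonEmpty is a hypothetical helper (not part of go-htmlinfo).
// It returns the first candidate that is not an empty string,
// or "" when every candidate is empty.
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

func main() {
	// Illustrative values: the oEmbed description is missing,
	// so the next available source is used instead.
	oembedDescription := ""
	opengraphDescription := "Apple unveiled its new iPad Pro today."
	pageDescription := "Fallback taken from the page itself."

	// The order of the arguments is an assumed priority, not the library's.
	fmt.Println(firstNonEmpty(oembedDescription, opengraphDescription, pageDescription))
}
```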

Example:

package main

import (
	"fmt"
	"net/http"

	"github.com/dyatlov/go-htmlinfo/htmlinfo"
)

func main() {
	u := "http://techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/"

	resp, err := http.Get(u)

	if err != nil {
		panic(err)
	}

	defer resp.Body.Close()

	info := htmlinfo.NewHTMLInfo()

	// url can be nil too, but then we won't be able to fetch (and generate) oembed information
	err = info.Parse(resp.Body, &u, nil)

	if err != nil {
		panic(err)
	}

	fmt.Printf("Info:\n%s\n", info)

	fmt.Printf("Oembed information: %s\n", info.GenerateOembedFor(u))
}

Result would be:

Info:

{
    "title": "iPad Pro Coming In November, Pricing Starts At $799  |  TechCrunch",
    "description": "Apple unveiled its new iPad Pro today. If you're wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro..",
    "author_name": "Anthony Ha",
    "canonical_url": "http://techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/",
    "oembed_json_url": "https://public-api.wordpress.com/oembed/1.0/?format=json\u0026url=http%3A%2F%2Ftechcrunch.com%2F2015%2F09%2F09%2Fipad-pro-coming-in-november-pricing-starts-at-799%2F\u0026for=wpcom-auto-discovery",
    "oembed_xml_url": "https://public-api.wordpress.com/oembed/1.0/?format=xml\u0026url=http%3A%2F%2Ftechcrunch.com%2F2015%2F09%2F09%2Fipad-pro-coming-in-november-pricing-starts-at-799%2F\u0026for=wpcom-auto-discovery",
    "favicon_url": "https://s0.wp.com/wp-content/themes/vip/techcrunch-2013/assets/images/favicon.ico",
    "touch_icons": [
        {
            "url": "https://s0.wp.com/wp-content/themes/vip/techcrunch-2013/assets/images/favicon.ico",
            "type": "icon",
            "width": 0,
            "height": 0,
            "is_scalable": false
        },
        {
            "url": "https://s0.wp.com/wp-content/themes/vip/techcrunch-2013/assets/images/homescreen_TCIcon.png",
            "type": "apple-touch-icon-precomposed",
            "width": 0,
            "height": 0,
            "is_scalable": false
        },
        // ...
    ],
    "image_src_url": "",
    "main_content": "Apple unveiled its new iPad Pro today. If you’re wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro and related accessories will be available in November.\nPricing will start at $799 with 32 gigabytes of memory and WiFi-only connectivity, with a $949 price tag for 128 GB, and $1,079 for 128 GB and a cellular connection. If you want the company’s new stylus, dubbed the Apple Pencil, that’ll cost you $99, and the Smart Keyboard will cost $169.\nThat may seem pretty pricey compared to other iPads — in fact, Apple said today that it’s dropping pricing on its iPad Mini 2, which its considers to be the entry-level iPad, to $269. What you’re paying for (among other things) is a 12.9-inch screen with resolution of 2,732 x 2,048 pixels, Apple’s A9X chip and four speakers.\nAnd, as the name and on-stage demos suggest, Apple seems to be pitching this for enterprise use and productivity, not for casual use.\n\t\t\t\n\t\t\t\t\n\t\t\t\tSay Hello To The Brand-New iPad Pro\n\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t",
    "opengraph": {
        "type": "article",
        "url": "http://social.techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/",
        "title": "iPad Pro Coming In November, Pricing Starts At $799",
        "description": "Apple unveiled its new iPad Pro today. If you're wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro..",
        "determiner": "",
        "site_name": "TechCrunch",
        "locale": "",
        "locales_alternate": null,
        "images": [{
            "url": "https://tctechcrunch2011.files.wordpress.com/2015/09/screen-shot-2015-09-09-at-1-49-10-pm.png?w=560\u0026h=292\u0026crop=1",
            "secure_url": "",
            "type": "",
            "width": 0,
            "height": 0
        }],
        "audios": null,
        "videos": null,
        "article": {
            "published_time": null,
            "modified_time": null,
            "expiration_time": null,
            "section": "",
            "tags": null,
            "authors": null
        }
    },
    "oembed": {
        "type": "link",
        "url": "http://techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/",
        "provider_url": "http://techcrunch.com",
        "provider_name": "TechCrunch",
        "title": "iPad Pro Coming In November, Pricing Starts At\u0026nbsp;$799",
        "description": "",
        "width": 0,
        "height": 0,
        "thumbnail_url": "https://i1.wp.com/tctechcrunch2011.files.wordpress.com/2015/09/screen-shot-2015-09-09-at-1-49-10-pm.png?fit=440%2C330",
        "thumbnail_width": 440,
        "thumbnail_height": 218,
        "author_name": "\u003ca href=\"/author/anthony-ha/\" title=\"Posts by Anthony Ha\" onclick=\"s_objectID='river_author';\" rel=\"author\"\u003eAnthony Ha\u003c/a\u003e",
        "author_url": "/author/anthony-ha/",
        "html": "Apple \u003ca href=\"http://techcrunch.com/2015/09/09/apple-unveils-the-ipad-pro/\"\u003eunveiled its new iPad Pro today\u003c/a\u003e. If you're wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro and related accessories will be available in November.\r\n\r\nPricing will start at $799 with 32 gigabytes of memory and WiFi-only connectivity, with a $949 price tag for 128 GB, and $1,079 for 128 GB and a cellular connection. If you want the company's new stylus, \u003ca href=\"http://techcrunch.com/2015/09/09/the-apple-pencil-is-the-ipad-pros-secret-weapon/#.91issd:LNXD\"\u003edubbed the Apple Pencil\u003c/a\u003e, that'll cost you $99, and the Smart Keyboard will cost $169.\r\n"
    }
}

Oembed information:

{
    "type": "link",
    "url": "http://techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/",
    "provider_url": "http://techcrunch.com",
    "provider_name": "TechCrunch",
    "title": "iPad Pro Coming In November, Pricing Starts At\u0026nbsp;$799",
    "description": "Apple unveiled its new iPad Pro today. If you're wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro..",
    "width": 0,
    "height": 0,
    "thumbnail_url": "https://i1.wp.com/tctechcrunch2011.files.wordpress.com/2015/09/screen-shot-2015-09-09-at-1-49-10-pm.png?fit=440%2C330",
    "thumbnail_width": 440,
    "thumbnail_height": 218,
    "author_name": "\u003ca href=\"/author/anthony-ha/\" title=\"Posts by Anthony Ha\" onclick=\"s_objectID='river_author';\" rel=\"author\"\u003eAnthony Ha\u003c/a\u003e",
    "author_url": "/author/anthony-ha/",
    "html": "Apple \u003ca href=\"http://techcrunch.com/2015/09/09/apple-unveils-the-ipad-pro/\"\u003eunveiled its new iPad Pro today\u003c/a\u003e. If you're wondering when you can get your hands on it, and how much it will cost, here you go: Apple says the iPad Pro and related accessories will be available in November.\r\n\r\nPricing will start at $799 with 32 gigabytes of memory and WiFi-only connectivity, with a $949 price tag for 128 GB, and $1,079 for 128 GB and a cellular connection. If you want the company's new stylus, \u003ca href=\"http://techcrunch.com/2015/09/09/the-apple-pencil-is-the-ipad-pros-secret-weapon/#.91issd:LNXD\"\u003edubbed the Apple Pencil\u003c/a\u003e, that'll cost you $99, and the Smart Keyboard will cost $169.\r\n"
}
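
Since the output is plain JSON, it can be read back with any JSON decoder. The following is a minimal sketch using only the Go standard library. The struct `oembedResult`, the function `decodeOembed`, and the constant `sampleOembed` are hypothetical names introduced here, not types or functions exported by go-htmlinfo; the struct covers only a subset of the keys shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oembedResult is a hypothetical struct mirroring a subset of the keys
// in the sample output; it is not a type provided by go-htmlinfo.
type oembedResult struct {
	Type            string `json:"type"`
	URL             string `json:"url"`
	ProviderName    string `json:"provider_name"`
	Title           string `json:"title"`
	ThumbnailWidth  int    `json:"thumbnail_width"`
	ThumbnailHeight int    `json:"thumbnail_height"`
}

// decodeOembed parses a JSON document shaped like the sample output.
// Keys that are not declared in oembedResult are ignored.
func decodeOembed(raw string) (oembedResult, error) {
	var r oembedResult
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

// sampleOembed is a shortened copy of the oEmbed output shown above.
const sampleOembed = `{
    "type": "link",
    "url": "http://techcrunch.com/2015/09/09/ipad-pro-coming-in-november-pricing-starts-at-799/",
    "provider_name": "TechCrunch",
    "title": "iPad Pro Coming In November, Pricing Starts At\u0026nbsp;$799",
    "thumbnail_width": 440,
    "thumbnail_height": 218
}`

func main() {
	r, err := decodeOembed(sampleOembed)
	if err != nil {
		panic(err)
	}
	// prints "TechCrunch (link), thumbnail 440x218"
	fmt.Printf("%s (%s), thumbnail %dx%d\n",
		r.ProviderName, r.Type, r.ThumbnailWidth, r.ThumbnailHeight)
}
```

The `\u0026` and `\u003c` sequences in the output are JSON escapes for `&` and `<`; decoding restores the original characters, so the title above comes back containing `&nbsp;`.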