
mathiversen / html-parser

Licence: MIT license
A simple and general purpose html/xhtml parser, using Pest.

Programming Languages

HTML
Rust

Projects that are alternatives of or similar to html-parser

AdvancedHTMLParser
Fast Indexed python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modification, and formatting. Also XPath.
Stars: ✭ 90 (+60.71%)
Mutual labels:  dom, html-parser
Skrape.it
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (+312.5%)
Mutual labels:  dom, html-parser
html5parser
A super tiny and fast html5 AST parser.
Stars: ✭ 153 (+173.21%)
Mutual labels:  dom, html-parser
Didom
Simple and fast HTML and XML parser
Stars: ✭ 1,939 (+3362.5%)
Mutual labels:  dom, html-parser
Htmlparser2
The fast & forgiving HTML and XML parser
Stars: ✭ 3,299 (+5791.07%)
Mutual labels:  dom, html-parser
Lua Gumbo
Moved to https://gitlab.com/craigbarnes/lua-gumbo
Stars: ✭ 116 (+107.14%)
Mutual labels:  dom, html-parser
Hyntax
Straightforward HTML parser for JavaScript
Stars: ✭ 84 (+50%)
Mutual labels:  dom, html-parser
Minimize
Minimize HTML
Stars: ✭ 150 (+167.86%)
Mutual labels:  dom, html-parser
Facon
Tiny utility (272B) to create DOM elements with manner.
Stars: ✭ 212 (+278.57%)
Mutual labels:  dom
Respo
A virtual DOM library built with ClojureScript, inspired by React and Reagent.
Stars: ✭ 230 (+310.71%)
Mutual labels:  dom
Bliss
Blissful JavaScript
Stars: ✭ 2,352 (+4100%)
Mutual labels:  dom
Javascript Interview Questions
500+ JavaScript Interview Questions
Stars: ✭ 208 (+271.43%)
Mutual labels:  dom
Jest Dom
🦉 Custom jest matchers to test the state of the DOM
Stars: ✭ 2,908 (+5092.86%)
Mutual labels:  dom
Server Components
🔧 A simple, lightweight tool for composable HTML rendering in Node.js, based on web components.
Stars: ✭ 212 (+278.57%)
Mutual labels:  dom
Disintegrate
A small JS library to break DOM elements into animated Canvas particles.
Stars: ✭ 251 (+348.21%)
Mutual labels:  dom
Htmlkit
An Objective-C framework for your everyday HTML needs.
Stars: ✭ 206 (+267.86%)
Mutual labels:  dom
React Scroll Sync
Synced scroll position across multiple scrollable elements
Stars: ✭ 252 (+350%)
Mutual labels:  dom
Mogwai
The minimalist, obvious, graphical, web application interface
Stars: ✭ 249 (+344.64%)
Mutual labels:  dom
Lite Virtual List
Virtual list component library supporting waterfall flow based on vue
Stars: ✭ 223 (+298.21%)
Mutual labels:  dom
Angular Ru Interview Questions
Angular interview questions (in Russian)
Stars: ✭ 224 (+300%)
Mutual labels:  dom

Html parser

A simple and general purpose html/xhtml parser lib/bin, using Pest.

Features

  • Parse html & xhtml (not xml processing instructions)
  • Parse html-documents
  • Parse html-fragments
  • Parse empty documents
  • Parse with the same api for both documents and fragments
  • Parse custom, non-standard elements such as <cat/>, <Cat/> and <C4-t/>
  • Removes comments
  • Removes dangling elements
  • Iterate over all nodes in the dom tree
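
Iterating over all nodes amounts to a depth-first walk over the parsed tree. The sketch below illustrates that idea only. It uses a hypothetical, simplified `Node` enum defined locally, not the crate's own types, and the names `walk`, `labels` and `sample_tree` are assumptions made for the example.

```rust
// Hypothetical, simplified stand-in for a parsed DOM node. It is NOT the
// html_parser crate's own type; it only illustrates the traversal.
enum Node {
    Text(String),
    Element { name: String, children: Vec<Node> },
}

// Visit every node depth-first (parents before children) and record a
// short label for each one in document order.
fn walk(node: &Node, out: &mut Vec<String>) {
    match node {
        Node::Text(text) => out.push(format!("text:{}", text)),
        Node::Element { name, children } => {
            out.push(format!("element:{}", name));
            for child in children {
                walk(child, out);
            }
        }
    }
}

// Convenience wrapper that returns the labels as a new vector.
fn labels(root: &Node) -> Vec<String> {
    let mut out = Vec::new();
    walk(root, &mut out);
    out
}

// Build a small tree equivalent to <div><h1>Hello world</h1>bye</div>.
fn sample_tree() -> Node {
    Node::Element {
        name: "div".to_string(),
        children: vec![
            Node::Element {
                name: "h1".to_string(),
                children: vec![Node::Text("Hello world".to_string())],
            },
            Node::Text("bye".to_string()),
        ],
    }
}

fn main() {
    // Print one label per node, in the order the walk visits them.
    for label in labels(&sample_tree()) {
        println!("{}", label);
    }
}
```

A recursive walk keeps the sketch short. For very deep trees, an explicit stack would avoid deep recursion.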

What is it not

  • It's not a high-performance browser-grade parser
  • It's not suitable for html validation
  • It's not a parser that includes element selection or dom manipulation

If your requirements match any of the above, then you're most likely looking for one of the crates below:

Examples bin

Parse html file

    html_parser index.html

Parse stdin with pretty output

    curl <website> | html_parser -p

Examples lib

Parse html document

    use html_parser::Dom;

    fn main() {
        let html = r#"
            <!doctype html>
            <html lang="en">
                <head>
                    <meta charset="utf-8">
                    <title>Html parser</title>
                </head>
                <body>
                    <h1 id="a" class="b c">Hello world</h1>
                    </h1> <!-- comments & dangling elements are ignored -->
                </body>
            </html>"#;

        assert!(Dom::parse(html).is_ok());
    }

Parse html fragment

    use html_parser::Dom;

    fn main() {
        let html = "<div id=cat />";
        assert!(Dom::parse(html).is_ok());
    }

Print to json

    use html_parser::{Dom, Result};

    fn main() -> Result<()> {
        let html = "<div id=cat />";
        let json = Dom::parse(html)?.to_json_pretty()?;
        println!("{}", json);
        Ok(())
    }