chwayne / rem

License: GPL-3.0-only
An HTML parsing library, written in Zig.

rem

rem is an HTML5 parser written in Zig.

About

Features

  • An HTML5 parser consisting of a tokenizer (complete) and a tree constructor (works "well enough")
  • A minimal DOM implementation
  • HTML fragment parsing
  • Tested by html5lib-tests
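To illustrate what the tokenizer stage of a parser does, here is a deliberately tiny sketch. It is not rem's tokenizer and does not follow the HTML specification: it only recognises `<name>`, `</name>` and plain text. The names `ToyTokenizer`, `Token` and `TokenKind` are hypothetical and exist only for this example. A tree constructor would consume such tokens one by one to build the DOM.

```zig
const std = @import("std");

/// Hypothetical token kinds for illustration; these are not rem's types.
const TokenKind = enum { start_tag, end_tag, text };

const Token = struct {
    kind: TokenKind,
    /// The tag name, or the raw characters for `.text` tokens.
    data: []const u8,
};

/// A toy tokenizer that recognises only `<name>`, `</name>` and plain text.
const ToyTokenizer = struct {
    input: []const u8,
    pos: usize = 0,

    /// Returns the next token, or null once the input is exhausted.
    fn next(self: *ToyTokenizer) ?Token {
        if (self.pos >= self.input.len) return null;
        const rest = self.input[self.pos..];

        if (rest[0] == '<') {
            if (std.mem.indexOfScalar(u8, rest, '>')) |end| {
                self.pos += end + 1;
                const inner = rest[1..end];
                if (inner.len > 0 and inner[0] == '/') {
                    return Token{ .kind = .end_tag, .data = inner[1..] };
                }
                return Token{ .kind = .start_tag, .data = inner };
            }
        }

        // Not a complete tag: emit text up to the next '<' or the end of input.
        // An unterminated '<' is treated as ordinary text.
        const start: usize = if (rest[0] == '<') 1 else 0;
        const end = std.mem.indexOfScalarPos(u8, rest, start, '<') orelse rest.len;
        self.pos += end;
        return Token{ .kind = .text, .data = rest[0..end] };
    }
};
```

Calling `next()` repeatedly on the input `<p>hi</p>` yields a start tag `p`, the text `hi`, an end tag `p`, and then null. A real HTML tokenizer additionally handles attributes, comments, doctypes, character references and many error cases.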

Things to be improved

  • Better DOM functionality
  • Support for more character encodings
  • Support for JavaScript

Why create this?

  • To understand what it takes to "implement" HTML, even if just a small portion of it. As I discovered, even just trying to parse an HTML file correctly can be quite challenging.
  • To learn more about web standards in general. Reading the HTML spec naturally causes (or rather, forces) one to learn about DOM (especially), SVG, CSS, and many others.
  • For use in other projects, and to be useful to others.

Lastly...

rem is still a work in progress. Not all the features of a fully capable HTML5 parser are implemented.

Get the code

Clone the repository like this:

git clone --recursive --config core.autocrlf=false https://github.com/chwayne/rem.git

There is also a GitLab mirror.

There are no dependencies other than a Zig compiler. You should use the latest version of Zig that is available.

Use the code

Here's an example of using the parser (you can also see the output of this program by running zig build example).

const std = @import("std");
const rem = @import("rem");
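// std.testing.allocator can only be used inside test blocks in recent Zig versions,
// so a general-purpose allocator from std.heap is used here instead.
const allocator = std.heap.page_allocator;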

pub fn main() !u8 {
    const string = "<!doctype html><html><body>Click here to download more RAM!";
    // The string must be decoded before it can be passed to the parser.
    const input = &rem.util.utf8DecodeStringComptime(string);

    // Create the DOM in which the parsed Document will be created.
    var dom = rem.dom.Dom{ .allocator = allocator };
    defer dom.deinit();

    var parser = try rem.Parser.init(&dom, input, allocator, .abort, false);
    defer parser.deinit();
    try parser.run();

    const errors = parser.errors();
    if (errors.len > 0) {
        std.log.err("A parsing error occurred!\n{s}\n", .{@tagName(errors[0])});
        return 1;
    }

    const writer = std.io.getStdOut().writer();
    const document = parser.getDocument();
    try rem.util.printDocument(writer, document, &dom, allocator);
    return 0;
}
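The example above decodes its input at compile time, which only works for strings known when the program is built. For input that arrives at runtime (a file, a network response), a decoding step along the following lines could be used. This is a minimal sketch that relies only on the Zig standard library and assumes the parser accepts a slice of code points (`[]const u21`), which the example suggests but does not state. The helper name `decodeUtf8` is hypothetical and is not part of rem.

```zig
const std = @import("std");

/// Decodes a UTF-8 string into a newly allocated slice of Unicode code points.
/// The caller owns the returned slice and must free it with the same allocator.
/// Returns an error if the input is not valid UTF-8.
fn decodeUtf8(allocator: std.mem.Allocator, string: []const u8) ![]u21 {
    // Count the code points first so the result can be allocated in one go.
    const count = try std.unicode.utf8CountCodepoints(string);
    const result = try allocator.alloc(u21, count);
    errdefer allocator.free(result);

    var it = (try std.unicode.Utf8View.init(string)).iterator();
    var i: usize = 0;
    while (it.nextCodepoint()) |code_point| : (i += 1) {
        result[i] = code_point;
    }
    return result;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const input = try decodeUtf8(allocator, "<!doctype html><p>caf\u{e9}");
    defer allocator.free(input);
    // The number of code points can be smaller than the number of bytes.
    std.debug.print("{d} code points\n", .{input.len});
}
```

The resulting slice would take the place of `input` in the example above, under the assumption stated in the lead-in.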

Test the code

rem uses (a fork of) html5lib-tests as a test suite. Specifically, it tests against the 'tokenizer' and 'tree-construction' tests from that suite.

Running zig build test-tokenizer executes the 'tokenizer' tests. Running zig build test-tree-construction executes the 'tree-construction' tests twice: first with scripting off, then with scripting on. The expected results are as follows:

  • tokenizer: All tests pass.
  • tree-construction with scripting off: Some tests are skipped because they rely on HTML features that aren't yet implemented in this library (namely templates and namespaced element attributes). All other tests pass.
  • tree-construction with scripting on: Similar to testing with scripting off, but in addition, some entire test files are skipped because they would cause a crash.

License

GPL-3.0-only

Copyright (C) 2021 Chadwain Holness

rem is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this library. If not, see https://www.gnu.org/licenses/.

References

HTML Parsing Specification

DOM Specification
