rgrove / Parse Xml

License: ISC
A fast, safe, compliant XML parser for Node.js and browsers.

parse-xml

A fast, safe, compliant XML parser for Node.js and browsers.

Installation

npm install @rgrove/parse-xml

Or, if you like living dangerously, you can load the minified UMD bundle in a browser via Unpkg and use the parseXml global.

Features

Not Features

This parser currently discards document type declarations (<!DOCTYPE ... >) and all their contents, because they're rarely useful and some of their features aren't safe when the XML being parsed comes from an untrusted source.

In addition, the only supported character encoding is UTF-8 because it's not feasible (or useful) to support other character encodings in JavaScript.
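
If the XML arrives as raw bytes rather than as a string, one option is to decode it as UTF-8 before parsing. The following is a minimal sketch, assuming a runtime that provides the standard TextDecoder global (modern browsers and recent Node.js versions). decodeUtf8Xml is a hypothetical helper for illustration and is not part of parse-xml.

```javascript
// Hypothetical helper (not part of parse-xml): decodes raw bytes as UTF-8.
// With `fatal: true`, TextDecoder throws a TypeError on invalid UTF-8
// instead of silently substituting replacement characters.
function decodeUtf8Xml(bytes) {
  const decoder = new TextDecoder('utf-8', { fatal: true });
  return decoder.decode(bytes);
}

// The UTF-8 bytes of the string "<a>é</a>".
const xmlBytes = new Uint8Array([0x3c, 0x61, 0x3e, 0xc3, 0xa9, 0x3c, 0x2f, 0x61, 0x3e]);

const xmlText = decodeUtf8Xml(xmlBytes);
console.log(xmlText);
```

The resulting string can then be handed to the parser like any other string.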

API

See API.md for complete API docs.

Examples

Basic Usage

const parseXml = require('@rgrove/parse-xml');
let doc = parseXml('<kittens fuzzy="yes">I like fuzzy kittens.</kittens>');

The result is an XmlDocument instance containing the parsed document, with a structure that looks like this (some properties and methods are excluded for clarity; see the API docs for details):

{
  type: 'document',
  children: [
    {
      type: 'element',
      name: 'kittens',
      attributes: {
        fuzzy: 'yes'
      },
      children: [
        {
          type: 'text',
          text: 'I like fuzzy kittens.'
        }
      ],
      parent: { ... },
      isRootNode: true
    }
  ]
}
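
A tree of this shape is straightforward to walk recursively. The sketch below is not part of parse-xml: collectText is a hypothetical helper, and sampleDoc is a hand-built plain object that mirrors only the properties shown above (a real XmlDocument has more properties and methods).

```javascript
// Hypothetical helper: concatenates the text of every text node under `node`.
// Relies only on the `type`, `text`, and `children` properties shown above.
function collectText(node) {
  if (node.type === 'text') {
    return node.text;
  }
  return (node.children || []).map(collectText).join('');
}

// Hand-built stand-in for the parsed document above.
const sampleDoc = {
  type: 'document',
  children: [
    {
      type: 'element',
      name: 'kittens',
      attributes: { fuzzy: 'yes' },
      children: [{ type: 'text', text: 'I like fuzzy kittens.' }]
    }
  ]
};

const rootElement = sampleDoc.children[0];
console.log(rootElement.name);             // kittens
console.log(rootElement.attributes.fuzzy); // yes
console.log(collectText(sampleDoc));       // I like fuzzy kittens.
```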

Friendly Errors

When something goes wrong, parse-xml throws an error that tells you exactly what happened and shows you where the problem is so you can fix it.

parseXml('<foo><bar>baz</foo>');

Output

Error: Missing end tag for element bar (line 1, column 14)
  <foo><bar>baz</foo>
               ^

In addition to a helpful message, error objects have the following properties:

  • column Number

    Column where the error occurred (1-based).

  • excerpt String

    Excerpt from the input string that contains the problem.

  • line Number

    Line where the error occurred (1-based).

  • pos Number

    Character position where the error occurred relative to the beginning of the input (0-based).
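
These properties make it possible to report problems without re-parsing the message. The sketch below shows one way to do that. tryParse is a hypothetical wrapper that accepts any parser function, and fakeParse is a stand-in that throws an error shaped like the one above, so the example runs without parse-xml installed. The exact contents of a real error's message are an assumption here.

```javascript
// Hypothetical wrapper: calls `parse(xml)` and returns either the document
// or a plain summary of the location properties described above.
function tryParse(parse, xml) {
  try {
    return { doc: parse(xml), error: null };
  } catch (err) {
    return {
      doc: null,
      error: {
        message: err.message,
        line: err.line,       // 1-based
        column: err.column,   // 1-based
        pos: err.pos,         // 0-based
        excerpt: err.excerpt
      }
    };
  }
}

// Stand-in parser used only for this illustration. It always throws an error
// carrying the same properties as the example output above.
function fakeParse(xml) {
  throw Object.assign(new Error('Missing end tag for element bar'), {
    column: 14,
    excerpt: xml,
    line: 1,
    pos: 13
  });
}

const result = tryParse(fakeParse, '<foo><bar>baz</foo>');
console.log(`${result.error.message} at ${result.error.line}:${result.error.column}`);
```

In real use, the parseXml function would be passed in place of fakeParse.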

Why another XML parser?

There are many XML parsers for Node, and some of them are good. However, most of them suffer from one or more of the following shortcomings:

  • Native dependencies.

  • Loose, non-standard parsing behavior that can lead to unexpected or even unsafe results when given input the author didn't anticipate.

  • Kitchen sink APIs that tightly couple a parser with DOM manipulation functions, a stringifier, or other tooling that isn't directly related to parsing and consuming XML.

  • Stream-based parsing. This is great in the rare case that you need to parse truly enormous documents, but can be a pain to work with when all you want is a node tree.

  • Poor error handling.

  • Too big or too Node-specific to work well in browsers.

parse-xml's goal is to be a small, fast, safe, compliant, non-streaming, non-validating, browser-friendly parser, because I think this is an under-served niche.

I think parse-xml demonstrates that it's not necessary to jettison the spec entirely or to write complex code in order to implement a small, fast XML parser.

Also, it was fun.

Benchmark

Here's how parse-xml stacks up against two comparable libraries, libxmljs2 (which is based on the native libxml library) and xmldoc (which is based on sax-js).

Node.js v14.15.4 / Darwin x64
Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz

Running "Small document (291 bytes)" suite...
Progress: 100%

  @rgrove/parse-xml 3.0.0:
    77 109 ops/s, ±0.46%   | fastest

  libxmljs2 0.26.6 (native):
    29 480 ops/s, ±4.62%   | slowest, 61.77% slower

  xmldoc 1.1.2 (sax-js):
    36 035 ops/s, ±0.62%   | 53.27% slower

Finished 3 cases!
  Fastest: @rgrove/parse-xml 3.0.0
  Slowest: libxmljs2 0.26.6 (native)

Running "Medium document (72081 bytes)" suite...
Progress: 100%

  @rgrove/parse-xml 3.0.0:
    321 ops/s, ±0.99%   | 54.34% slower

  libxmljs2 0.26.6 (native):
    703 ops/s, ±10.64%   | fastest

  xmldoc 1.1.2 (sax-js):
    235 ops/s, ±0.50%   | slowest, 66.57% slower

Finished 3 cases!
  Fastest: libxmljs2 0.26.6 (native)
  Slowest: xmldoc 1.1.2 (sax-js)

Running "Large document (1162464 bytes)" suite...
Progress: 100%

  @rgrove/parse-xml 3.0.0:
    20 ops/s, ±0.48%   | 72.97% slower

  libxmljs2 0.26.6 (native):
    74 ops/s, ±12.02%   | fastest

  xmldoc 1.1.2 (sax-js):
    19 ops/s, ±1.68%   | slowest, 74.32% slower

Finished 3 cases!
  Fastest: libxmljs2 0.26.6 (native)
  Slowest: xmldoc 1.1.2 (sax-js)

See the parse-xml-benchmark repo for instructions on running this benchmark yourself.
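
The "% slower" figures in the output above are consistent with comparing each library's ops/s against the fastest entry in the same suite. A small sketch of that arithmetic follows. percentSlower is a hypothetical helper written for this illustration, not something taken from the benchmark code.

```javascript
// Hypothetical helper: how much slower `opsPerSec` is than `fastestOpsPerSec`,
// expressed as a percentage of the fastest result.
function percentSlower(opsPerSec, fastestOpsPerSec) {
  return (1 - opsPerSec / fastestOpsPerSec) * 100;
}

// Small document suite: libxmljs2 (29 480 ops/s) vs. parse-xml (77 109 ops/s).
console.log(percentSlower(29480, 77109).toFixed(2)); // 61.77

// Medium document suite: parse-xml (321 ops/s) vs. libxmljs2 (703 ops/s).
console.log(percentSlower(321, 703).toFixed(2)); // 54.34
```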

License

ISC License
