Alexhuszagh / Rust Lexical

Licence: other
Lexical, to- and from-string conversion routines.

Projects that are alternatives of or similar to Rust Lexical

Csv
CSV Decoding and Encoding for Elixir
Stars: ✭ 398 (+107.29%)
Mutual labels:  encoding, parsing
BencodeNET
.NET library for encoding/decoding bencode and reading/writing torrent files
Stars: ✭ 133 (-30.73%)
Mutual labels:  encoding, parsing
Bitmatch
A Rust crate that allows you to match, bind, and pack the individual bits of integers.
Stars: ✭ 82 (-57.29%)
Mutual labels:  encoding, no-std
Serpent
A protocol to serialize Swift structs and classes for encoding and decoding.
Stars: ✭ 281 (+46.35%)
Mutual labels:  encoding, parsing
Cyberchef
The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
Stars: ✭ 13,674 (+7021.88%)
Mutual labels:  encoding, parsing
Go.geojson
Encoding and decoding GeoJSON <-> Go
Stars: ✭ 172 (-10.42%)
Mutual labels:  encoding
Encoding
Encoding Standard
Stars: ✭ 176 (-8.33%)
Mutual labels:  encoding
Command Line Api
Command line parsing, invocation, and rendering of terminal output.
Stars: ✭ 2,418 (+1159.38%)
Mutual labels:  parsing
Xssor2
XSS'OR - Hack with JavaScript.
Stars: ✭ 1,969 (+925.52%)
Mutual labels:  encoding
Eo Yaml
YAML for Java 8 and above. A user-friendly OOP library. Previously known as "Camel".
Stars: ✭ 189 (-1.56%)
Mutual labels:  parsing
Parse Xml
A fast, safe, compliant XML parser for Node.js and browsers.
Stars: ✭ 184 (-4.17%)
Mutual labels:  parsing
Staticjson
Fast, direct and static typed parsing of JSON with C++
Stars: ✭ 177 (-7.81%)
Mutual labels:  parsing
Wuffs
Wrangling Untrusted File Formats Safely
Stars: ✭ 2,948 (+1435.42%)
Mutual labels:  parsing
Libchef
🍀 c++ standalone header-only basic library. || c++头文件实现无第三方依赖基础库
Stars: ✭ 178 (-7.29%)
Mutual labels:  encoding
Ifme
Powerful x265 GUI Encoder
Stars: ✭ 168 (-12.5%)
Mutual labels:  encoding
Auto enums
A library to allow multiple return types by automatically generated enum.
Stars: ✭ 188 (-2.08%)
Mutual labels:  no-std
Fecha
Lightweight and simple JS date formatting and parsing
Stars: ✭ 1,955 (+918.23%)
Mutual labels:  parsing
Yamerl
YAML 1.2 and JSON parser in pure Erlang
Stars: ✭ 174 (-9.37%)
Mutual labels:  parsing
Deep Generative Models For Natural Language Processing
DGMs for NLP. A roadmap.
Stars: ✭ 185 (-3.65%)
Mutual labels:  parsing
Combine
A parser combinator library for Elixir projects
Stars: ✭ 174 (-9.37%)
Mutual labels:  parsing

lexical

Build Status Latest Version Rustc Version 1.37+

Fast lexical conversion routines for both std and no_std environments. Lexical provides routines to convert numbers to and from decimal strings. Lexical is simple to use and focuses on performance and correctness. Lexical-core is also suitable for environments without a memory allocator, since it requires no internal allocations by default. As of version 2.0, lexical uses minimal unsafe features, limiting the chance of memory-unsafe code.
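
To make the "no internal allocations" point concrete, here is a minimal sketch of allocation-free integer formatting using only the standard library. The function write_u32 is a hypothetical helper written for this illustration, not part of lexical's API; it assumes the caller supplies the output buffer, which is the general approach an allocator-free routine has to take.

```rust
/// Hypothetical helper (not lexical's API): write `value` as decimal ASCII
/// into `buf` and return the written prefix, without any heap allocation.
/// Returns `None` when the caller's buffer is too small.
pub fn write_u32(mut value: u32, buf: &mut [u8]) -> Option<&[u8]> {
    // u32::MAX has 10 decimal digits, so a 10-byte stack array is enough.
    let mut scratch = [0u8; 10];
    let mut start = scratch.len();
    // Fill the scratch array from the right, least significant digit first.
    loop {
        start -= 1;
        scratch[start] = b'0' + (value % 10) as u8;
        value /= 10;
        if value == 0 {
            break;
        }
    }
    let len = scratch.len() - start;
    if len > buf.len() {
        return None;
    }
    buf[..len].copy_from_slice(&scratch[start..]);
    Some(&buf[..len])
}
```

In this sketch, calling write_u32 with a 16-byte stack array never touches the heap, and a buffer that is too short is reported to the caller instead of being grown.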

Getting Started

Add lexical to your Cargo.toml:

[dependencies]
lexical = "^5.1"

And get started using lexical:

extern crate lexical;

// Number to string
lexical::to_string(3.0);            // "3.0", always has a fraction suffix.
lexical::to_string(3);              // "3"

// String to number.
let i: i32 = lexical::parse("3").unwrap();   // 3, auto-type deduction.
let f: f32 = lexical::parse("3.5").unwrap(); // 3.5
let d = lexical::parse::<f64, _>("3.5");     // Ok(3.5), error checking parse.
let d = lexical::parse::<f64, _>("3a");      // Err(Error(_)), failed to parse.

Lexical has both partial and complete parsers: the complete parsers ensure the entire buffer is used while parsing, without ignoring trailing characters, while the partial parsers parse as many characters as possible, returning both the parsed value and the number of parsed digits. Upon encountering an error, lexical will return an error indicating both the error type and the index at which the error occurred inside the buffer.

// This will return Err(Error(ErrorKind::InvalidDigit(3))), indicating
// the first invalid character occurred at the index 3 in the input
// string (the space character).
let x = lexical::parse::<i32, _>("123 456");
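
The difference between partial and complete parsing can be sketched with the standard library alone. The names below (ParseError, parse_partial_u32, parse_complete_u32) are hypothetical and exist only for this illustration; they are not lexical's types or functions, and the sketch handles only unsigned decimal integers.

```rust
/// Hypothetical error type for this sketch; indices refer to the input buffer.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    InvalidDigit(usize),
    Overflow(usize),
}

/// Partial parse: consume as many leading ASCII digits as possible and
/// return the value together with the number of bytes consumed.
pub fn parse_partial_u32(bytes: &[u8]) -> Result<(u32, usize), ParseError> {
    let mut value: u32 = 0;
    let mut index = 0;
    while index < bytes.len() && bytes[index].is_ascii_digit() {
        let digit = u32::from(bytes[index] - b'0');
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(digit))
            .ok_or(ParseError::Overflow(index))?;
        index += 1;
    }
    if index == 0 {
        // No digits at all: either nothing to read or a bad first byte.
        return Err(if bytes.is_empty() {
            ParseError::Empty
        } else {
            ParseError::InvalidDigit(0)
        });
    }
    Ok((value, index))
}

/// Complete parse: reject the input unless every byte was consumed,
/// reporting the index of the first byte that could not be parsed.
pub fn parse_complete_u32(bytes: &[u8]) -> Result<u32, ParseError> {
    let (value, consumed) = parse_partial_u32(bytes)?;
    if consumed == bytes.len() {
        Ok(value)
    } else {
        Err(ParseError::InvalidDigit(consumed))
    }
}
```

For the input "123 456", the partial parser in this sketch stops at the space and reports three consumed bytes, while the complete parser rejects the same input at index 3.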

For floating-point numbers, lexical also includes parse_lossy, which avoids slow algorithms that may seriously degrade performance, at the cost of minor rounding error (relative error of ~1e-16) in rare cases (see implementation details for more information).

let x: f32 = lexical::parse_lossy("3.5").unwrap();   // 3.5
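
One way to read the ~1e-16 figure: it is on the order of the spacing between adjacent f64 values. The sketch below uses only the standard library and two hypothetical helpers (relative_error and next_float_up) to measure the relative error of being off by one representable value; it does not call lexical and makes no claim about which inputs parse_lossy rounds differently.

```rust
/// Relative error of `approx` with respect to a non-zero `exact` value.
pub fn relative_error(exact: f64, approx: f64) -> f64 {
    ((approx - exact) / exact).abs()
}

/// The next representable f64 above a positive, finite `x`.
/// (Hypothetical helper; valid only for positive, finite inputs.)
pub fn next_float_up(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1)
}
```

For example, relative_error(3.5, next_float_up(3.5)) is non-zero but smaller than f64::EPSILON (about 2.2e-16), so a result that is off by a single representable value lands in the stated range.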

In order to use lexical in generics, the type may use the trait bounds FromLexical (for parse), ToLexical (for to_string), or FromLexicalLossy (for parse_lossy).

/// Multiply a value in a string by multiplier, and serialize to string.
fn mul_2<T>(value: &str, multiplier: T)
    -> Result<String, lexical::Error>
    where T: lexical::ToLexical + lexical::FromLexical + std::ops::Mul<Output = T>
{
    let value: T = lexical::parse(value)?;
    Ok(lexical::to_string(value * multiplier))
}

Benchmarks

Most of the following benchmarks measure the time it takes to convert 10,000 random values, for different types. The values were randomly generated using NumPy, and the benchmarks were run in both std (rustc 1.29.2) and no_std (rustc 1.31.0) contexts (only std is shown) on an x86-64 Intel processor. More information on these benchmarks can be found in the benches folder and in the source code for the respective algorithms. The flags "target-cpu=native" and "link-args=-s" were also used; however, they minimally affected the relative performance difference between different lexical conversion implementations.

The cross-language benchmarks measure the time it takes to convert a digit series of near-halfway decimal floating-point representations. The C++ benchmarks (RapidJSON, strtod, and double-conversion) were done using GCC 8.2.1 with glibc/libstdc++ using Google Benchmark and the -O3 flag. The Python benchmark was done using IPython on Python 3.6.6. The Go benchmark was done using go1.10.4. All benchmarks used the same data. For RapidJSON, the benchmark was done by publicly exposing the ParseNumber method with a custom handler.
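
To show what a near-halfway input looks like, the sketch below uses the standard library's parser from a recent Rust toolchain (an assumption; the benchmarks above used older compilers and lexical itself). The constants and the parse_f64 helper are hypothetical names for this illustration. The value 2^53 + 1 sits exactly between two adjacent f64 values, so the parser must inspect every digit to decide which way to round.

```rust
/// 2^53 + 1: exactly halfway between the adjacent f64 values 2^53 and 2^53 + 2.
pub const HALFWAY: &str = "9007199254740993";

/// The same digits with a tiny non-zero tail: now just above halfway.
pub const ABOVE_HALFWAY: &str = "9007199254740993.0000000000000000000000001";

/// Parse with the standard library; used only to illustrate the rounding rule.
pub fn parse_f64(s: &str) -> f64 {
    s.parse::<f64>().expect("valid float literal")
}
```

Under round-to-nearest, tie-even, the exact halfway input rounds down to 2^53 (the neighbour with the even significand), while the input with the trailing digit rounds up to 2^53 + 2. A parser that stops reading digits early cannot tell these two inputs apart, which is why such data is a demanding benchmark.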

For all the following benchmarks, lower is better.

Float to String

ftoa benchmark

Integer To String

itoa benchmark

String to Integer

atoi benchmark

String to f64 Simple, Random Data

atof64 benchmark

String to f64 Complex, Large Data Cross-Language Comparison

atof64 simple language benchmark

String to f64 Complex, Denormal Data Cross-Language Comparison

Note: Rust was unable to parse all but the 10-digit benchmark, producing an error result of ParseFloatError { kind: Invalid }. It performed ~2,000x worse than lexical for that benchmark.

atof64 simple language benchmark

Backends

For float-to-string conversions, lexical uses one of three backends: an internal Grisu2 algorithm, an external Grisu3 algorithm, or an external Ryu algorithm (~2x as fast).
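
Grisu and Ryu are algorithms for printing a float as a short decimal string that parses back to the same value. That round-trip property can be checked with the standard library alone, as in the sketch below; round_trips is a hypothetical helper, and it exercises std's formatter and parser rather than any of lexical's backends.

```rust
/// Format `x` with the standard library and parse the text back.
/// This uses std only; it does not exercise lexical's backends.
pub fn round_trips(x: f64) -> bool {
    let text = x.to_string();
    match text.parse::<f64>() {
        // Compare bit patterns so the match must be exact.
        Ok(back) => back.to_bits() == x.to_bits(),
        Err(_) => false,
    }
}
```

The same check could be pointed at any formatter and parser pair when comparing backends, by swapping out the two std calls.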

Documentation

Lexical's documentation can be found on docs.rs. For detailed background on the algorithms and features in lexical, see lexical-core. Finally, for information on how to use lexical from C, C++, or Python, see lexical-capi.

Roadmap

Ideally, Lexical's float-parsing algorithm or approach would be incorporated into libcore. Although Lexical greatly improves on Rust's float-parsing algorithm, in its current state it is unsuitable for inclusion in the standard library, because it includes numerous "anti-features":

  1. It supports non-decimal radices for float parsing, leading to significant binary bloat and increased code branching, for almost non-existent use-cases.
  2. It supports rounding schemes other than round-to-nearest, tie-even.
  3. It inlines aggressively, producing significant binary bloat.
  4. It contains effectively dead code for efficient higher-order arbitrary-precision integer algorithms, for rare use-cases requiring asymptotically faster algorithms.

Versioning and Version Support

Version Support

The currently supported versions are:

  • v5.x
  • v4.x (Maintenance)

Rustc Compatibility

v5.x is tested to work on 1.37+, including stable, beta, and nightly. v4.x is the last version to support Rustc 1.24+, including stable, beta, and nightly.

Please report any errors compiling a supported lexical version on a compatible Rustc version.

Versioning

Lexical uses semantic versioning. Removing support for older Rustc versions is considered an incompatible API change, requiring a major version change.

Changelog

All changes since 2.2.0 are documented in CHANGELOG.

License

Lexical is dual licensed under the Apache 2.0 license as well as the MIT license. See the LICENCE-MIT and the LICENCE-APACHE files for the licenses.

Contributing

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in lexical by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
