zrax / string_theory

Licence: MIT license
Flexible modern C++ string library with type-safe formatting

String Theory

Introduction

String Theory is a flexible modern C++ library for string manipulation and storage. It stores data internally as UTF-8, for ease of use with existing C/C++ APIs. It can also handle conversion to and from UTF-16, UTF-32, and Latin-1, and has a variety of methods to simplify text manipulation.
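
To make the encoding work concrete, here is a minimal sketch of UTF-8 to UTF-32 decoding with three ways of handling invalid input (throw, substitute U+FFFD, or skip). It uses only the C++ standard library. The names decode_utf8 and OnInvalid are hypothetical and exist only for this illustration; they are not String Theory's API, and the library's own conversion functions are described in its documentation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical policy names for this sketch; not String Theory's API.
enum class OnInvalid { Throw, Replace, Skip };

// Decode a UTF-8 byte string into UTF-32 code points.
std::u32string decode_utf8(const std::string &in, OnInvalid policy)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const char32_t min_cp[] = {0, 0x80, 0x800, 0x10000};

    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;   // number of continuation bytes expected
        bool ok = true;

        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; extra = 3;
        } else {
            ok = false;          // stray continuation byte or invalid lead
        }

        std::size_t used = 1;    // bytes consumed for this sequence
        for (std::size_t k = 0; ok && k < extra; ++k) {
            if (i + used >= in.size()) { ok = false; break; }   // truncated
            const unsigned char c = static_cast<unsigned char>(in[i + used]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }      // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
            ++used;
        }

        // Reject overlong encodings, surrogates and out-of-range values.
        if (ok && (cp < min_cp[extra] || cp > 0x10FFFF ||
                   (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (ok)
            out.push_back(cp);
        else if (policy == OnInvalid::Throw)
            throw std::runtime_error("invalid UTF-8 sequence");
        else if (policy == OnInvalid::Replace)
            out.push_back(0xFFFD);   // U+FFFD REPLACEMENT CHARACTER
        // OnInvalid::Skip: drop the bad bytes silently.

        i += used;
    }
    return out;
}
```

Calling decode_utf8 with OnInvalid::Replace on input containing a stray 0xFF byte yields U+FFFD in its place, while OnInvalid::Skip drops the byte and OnInvalid::Throw raises std::runtime_error.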

In addition, String Theory includes a powerful and fast type-safe string formatter (ST::format), which can be extended with custom type formatters by end-user code.
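
The general technique behind a type-safe formatter can be sketched with C++11 variadic templates. The example below is not ST::format and does not reproduce its syntax or its extension mechanism; simple_format, format_into, the "{}" placeholder, and the use of operator<< as the extension hook are all assumptions made for this illustration. It shows why such a formatter cannot mismatch an argument's type the way a printf format specifier can: each argument is written through an overload selected by the compiler.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Base case: no arguments left, so copy the rest of the format text as-is.
inline void format_into(std::ostringstream &out, const char *fmt)
{
    out << fmt;
}

// Recursive case: write text up to the next "{}", then the first argument,
// then recurse on the remaining text and arguments.
template <typename First, typename... Rest>
void format_into(std::ostringstream &out, const char *fmt,
                 const First &first, const Rest &... rest)
{
    for (; *fmt != '\0'; ++fmt) {
        if (fmt[0] == '{' && fmt[1] == '}') {
            out << first;                      // overload chosen by type
            format_into(out, fmt + 2, rest...);
            return;
        }
        out << *fmt;
    }
    // No placeholder left: surplus arguments are dropped in this sketch.
}

template <typename... Args>
std::string simple_format(const char *fmt, const Args &... args)
{
    std::ostringstream out;
    format_into(out, fmt, args...);
    return out.str();
}

// A user-defined type becomes formattable by providing operator<<.
struct Point {
    int x;
    int y;
};

inline std::ostream &operator<<(std::ostream &os, const Point &p)
{
    return os << '(' << p.x << ", " << p.y << ')';
}
```

With these definitions, simple_format("{} + {} = {}", 1, 2, 3) produces "1 + 2 = 3", and simple_format("at {}", Point{3, 4}) produces "at (3, 4)". A type with no operator<< fails to compile instead of misbehaving at run time.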

You can find the full documentation online at https://github.com/zrax/string_theory/wiki.

Why another string library?

String Theory was originally developed to replace the half-dozen or so string types and string manipulation mini-libraries in the Plasma game engine. Because of the state of that code, it was designed primarily to reduce coding errors, provide an easy-to-use set of manipulation functions with minimal surprises, handle Unicode text without a lot of overhead, and perform reasonably well. Many existing string libraries provide some subset of those features, but they were hard to integrate with Plasma or didn't meet all of our needs. Therefore, plString (and later plFormat) were born. Once the code had matured, it seemed that other projects could benefit from it as well, so it was split out into its own stand-alone library, which is String Theory.

String Theory's features

String Theory is designed to provide:

  • Minimal surprises. Strings are immutable objects, so you never have to worry whether your .replace() will create a copy or modify the original -- it will always return a copy even if the new string is identical.
  • UTF-8 by default. You don't have to remember what encoding your string data came in as; by the time ST::string is constructed, its data is assumed to already be in the UTF-8 encoding. This also allows easy re-use by other character-based APIs, since you don't have to first down-convert the string data from UTF-16 or UTF-32 in order to use it.
  • Easy conversion to Unicode formats. String Theory provides conversion between UTF-8, UTF-16, UTF-32 and Latin-1. In addition, it can check raw character input for validity using one of several mechanisms: throwing a C++ exception, replacing invalid characters, or skipping the check entirely.
  • Type-safe formatting. sprintf and friends are notoriously unsafe, and are one of the most common sources of bugs in string code. ST::format uses C++11's variadic templates to provide a type-safe way to format strings. String Theory also provides a mechanism to create custom formatters for end-user code, in order to extend ST::format's capabilities.
  • Good performance. String Theory is optimized to be reasonably fast on a variety of compilers and systems. For ST::string, this ends up being slightly slower than C++'s std::string due to the extra encoding work. However, in my tests ST::string_stream tends to be faster than, or at least on par with, std::stringstream, and ST::format is within the same order of magnitude as an equivalent snprintf.
  • Reentrancy. Another side-effect of immutable strings is that ST::string is a fully reentrant string object with no locking necessary.
  • Cross-platform. String Theory is supported on any platform that provides a reasonably modern C++ compiler. Additional features from newer compilers are detected and enabled when supported, but are not required.
  • Minimal dependencies. Currently, String Theory has no run-time dependencies aside from the C/C++ standard libraries and runtime. Additional tools may however be necessary for building String Theory or its tests.
  • Well tested. String Theory comes with an extensive suite of unit tests to ensure it works as designed.
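
The "minimal surprises" and reentrancy points both follow from immutability. As a rough sketch of the idea, the class below exposes only const operations, and its replace() always returns a new object. ImmutableText is a hypothetical stand-in written for this illustration with std::string as storage; it is not how ST::string is implemented.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical immutable string wrapper, for illustration only.
class ImmutableText {
public:
    explicit ImmutableText(std::string data) : m_data(std::move(data)) {}

    // Every "modifying" operation is const and returns a new object,
    // so the original can never change underneath another reader.
    ImmutableText replace(const std::string &from, const std::string &to) const
    {
        if (from.empty())
            return *this;                 // nothing to search for

        std::string result;
        std::size_t pos = 0;
        for (;;) {
            const std::size_t hit = m_data.find(from, pos);
            if (hit == std::string::npos)
                break;
            result.append(m_data, pos, hit - pos);   // text before the match
            result += to;
            pos = hit + from.size();
        }
        result.append(m_data, pos, std::string::npos);   // remaining tail
        return ImmutableText(std::move(result));
    }

    const std::string &str() const { return m_data; }

private:
    const std::string m_data;   // never modified after construction
};
```

Given ImmutableText a("one two one"), the call a.replace("one", "1") returns an object holding "1 two 1" while a still holds "one two one". Because no member function writes to m_data, any number of threads can read the same object without a lock.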

What String Theory is NOT

  • A full Unicode library. If you need more Unicode support than just basic UTF data conversion, you probably want to use something like ICU instead.
  • A faster version of std::string. String Theory was never designed to be faster than the STL, and because of its design goal to always use UTF-8 data internally, it may be slower for some use cases. However, practical tests have shown that ST::string performs at least on par with the STL in many use cases, and ST::format is usually significantly faster than many other type-safe alternatives such as boost::format.
  • A regular expression library. C++11 provides a regex library which should be usable with ST::string, and I don't have a compelling reason at this point to introduce another regular expression library to String Theory.
  • A library for working with theoretical physics. Just in case you got this far and were still uncertain :).
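
As a sketch of the regular expression point above, the C++11 <regex> library can search any null-terminated UTF-8 buffer directly. The helper first_number is hypothetical and takes a plain const char *; with String Theory you would presumably pass the string's C-string data, which is an assumption here and not something this example verifies. Note that std::regex matches byte by byte, so patterns should stick to ASCII or treat multi-byte UTF-8 sequences as opaque bytes.

```cpp
#include <regex>
#include <string>

// Return the first run of ASCII digits in a UTF-8 buffer, or an empty
// string if there is none. Hypothetical helper for illustration.
std::string first_number(const char *utf8_text)
{
    static const std::regex digits("[0-9]+");
    std::cmatch match;
    if (std::regex_search(utf8_text, match, digits))
        return match.str();
    return std::string();
}
```

For example, first_number applied to the UTF-8 bytes of "café 42 ok" returns "42", since the non-ASCII bytes of "é" simply fail to match the digit class.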

Platform Support

string_theory supports a variety of platforms and compilers. As of July 2020, string_theory is tested and working on:

  • GCC 10 (Arch Linux x86_64 and ARMv7)
  • GCC 9 (Ubuntu 20.04 x86_64)
  • GCC 7 (Ubuntu 18.04 x86_64)
  • GCC 5 (Ubuntu 16.04 x86_64)
  • GCC 4.8 (Ubuntu 14.04 i686)
  • Clang 10 (Arch Linux x86_64 and ARMv7)
  • Clang 10 (Ubuntu 20.04 x86_64)
  • Clang 6 (Ubuntu 18.04 x86_64)
  • Clang 3.8 (Ubuntu 16.04 x86_64)
  • AppleClang 11.0 (macOS Catalina)
  • AppleClang 10.0 (macOS Catalina)
  • MSVC 2019 (x64 and x86)
  • MSVC 2017 (x64 and x86)
  • MinGW-w64 GCC 10 (x86_64 and i686)
  • MinGW-w64 GCC 8 (x86_64)

As of string_theory 3.0, support for some older compilers has been dropped. You'll need a compiler that supports most of C++11.

Contributing to String Theory

String Theory is Open Source software, licensed under the MIT license. Contributions are welcome, and may be submitted as issues and/or pull requests on GitHub: http://github.com/zrax/string_theory.

Some ideas for areas to contribute: