Hirrolot / Datatype99

Licence: mit
C99 with sum types

Datatype99

A header-only library featuring safe, intuitive sum types with exhaustive pattern matching.

Highlights

  • Type-safe. Unlike manually written tagged unions, Datatype99 is type-safe: normally you cannot access invalid data or construct an invalid variant. Pattern matching is exhaustive too.

  • Pure C99/C++11. No external tools are required -- Datatype99 is implemented using only preprocessor macros.

  • Can be used everywhere. Literally everywhere, provided that you have a standard-conforming C99/C++11 preprocessor. Even in freestanding environments.

  • Transparent. Datatype99 comes with formal code generation semantics, meaning that if you look at the output of datatype, you will normally not see anything unexpected.

  • FFI-tolerant. Because of transparency, writing an FFI is not a challenge.

Installation

  1. Download Datatype99 and Metalang99 (minimum supported version -- 0.4.2).
  2. Add datatype99 and metalang99/include to your include paths.
  3. #include <datatype99.h> beforehand.

PLEASE, use Datatype99 only with -ftrack-macro-expansion=0 (GCC) or something similar, otherwise it will throw your compiler to the moon. Precompiled headers are also very helpful.

If you do not want the shortened versions to appear (e.g., datatype and match instead of datatype99 and match99), define DATATYPE99_NO_ALIASES before #include <datatype99.h>.

Usage

(The full example: examples/binary_tree.c.)

A sum type is created using the datatype macro: the first argument is the name of the type, and each following argument is a parenthesised variant of the form (Foo, T1, ..., TN). There is one more kind of variant: an empty variant, written simply as (Foo), which holds no data.

Pattern matching is likewise intuitive. Just three brief notes:

  • To match an empty variant, write of(Foo) { ... }.
  • To match the default case, i.e. when all other cases failed, write otherwise { ... }.
  • To ignore one or more variables inside of, write of(Foo, a, b, _, d).

Happy hacking!

Syntax and semantics

Because the semantics of the macros are well defined, you can write an FFI on top of them, which is a common need in C.

EBNF syntax

<datatype>      ::= "datatype99(" <datatype-name> { "," <variant> }+ ")" ;
<variant>       ::= "(" <variant-name> [ { "," <type> }+ ] ")" ;
<datatype-name> ::= <ident> ;
<variant-name>  ::= <ident> ;

<match>         ::= "match99(" <lvalue> ")" { <arm> }+ ;
<matches>       ::= "matches99(" <expr> "," <ident> ")" ;
<if-let>        ::= "ifLet99(" <lvalue> "," <variant-name> "," <ident> [ { "," <ident> }+ ] ")" <stmt> ;
<of>            ::= "of99(" <variant-name> [ { "," <ident> }+ ] ")" <stmt> ;
<otherwise>     ::= "otherwise99" <stmt> ;

Semantics

(It might be helpful to look at the generated code of examples/binary_tree.c's BinaryTree.)

datatype99

  1. Before everything, the following type definition is generated:
typedef struct <datatype-name> <datatype-name>;
  2. For each non-empty variant, the following type definition is generated (the metavariable <type> ranges over the corresponding variant's types):
typedef struct <datatype-name><variant-name> {
    <type>0 _0;
    ...
    <type>N _N;
} <datatype-name><variant-name>;
  3. For each non-empty variant, the following type aliases for the types of each field of <datatype-name><variant-name> are generated:
typedef <type>0 <variant-name>_0;
...
typedef <type>N <variant-name>_N;
  4. For each variant, the following type alias for the corresponding sum type is generated:
typedef struct <datatype-name> <variant-name>SumT;
  5. For each sum type, the following tagged union is generated (inside the union, only members for the structures of non-empty variants are generated):
typedef enum <datatype-name>Tag {
    <variant-name>0Tag, ..., <variant-name>NTag
} <datatype-name>Tag;

typedef union <datatype-name>Data {
    char dummy;

    <datatype-name><variant-name>0 <variant-name>0;
    ...
    <datatype-name><variant-name>N <variant-name>N;
} <datatype-name>Data;

struct <datatype-name> {
    <datatype-name>Tag tag;
    <datatype-name>Data data;
};
  6. For each variant, the following function, called a value constructor, is generated:
inline static <datatype-name> <variant-name>(...) { /* ... */ }
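
As a rough illustration of the generation steps above, here is what the output might look like for a hypothetical datatype99(Shape, (Circle, int), (Rect, int, int), (Empty)), written out by hand in plain C99. The Shape type, its variants, and the circle_radius helper are invented for this sketch, and the constructor bodies are an assumption, since the steps above leave them unspecified.

```c
#include <assert.h>

/* Forward type definition for the sum type. */
typedef struct Shape Shape;

/* One structure per non-empty variant, with fields named _0, _1, ... */
typedef struct ShapeCircle {
    int _0;
} ShapeCircle;

typedef struct ShapeRect {
    int _0;
    int _1;
} ShapeRect;

/* Aliases for the field types of each non-empty variant. */
typedef int Circle_0;
typedef int Rect_0;
typedef int Rect_1;

/* Aliases from each variant back to the sum type. */
typedef struct Shape CircleSumT;
typedef struct Shape RectSumT;
typedef struct Shape EmptySumT;

/* The tagged union itself; the empty variant has no member in the union. */
typedef enum ShapeTag { CircleTag, RectTag, EmptyTag } ShapeTag;

typedef union ShapeData {
    char dummy;
    ShapeCircle Circle;
    ShapeRect Rect;
} ShapeData;

struct Shape {
    ShapeTag tag;
    ShapeData data;
};

/* Value constructors. The bodies are assumed, not taken from the library. */
inline static Shape Circle(int radius) {
    Shape result;
    result.tag = CircleTag;
    result.data.Circle._0 = radius;
    return result;
}

inline static Shape Rect(int width, int height) {
    Shape result;
    result.tag = RectTag;
    result.data.Rect._0 = width;
    result.data.Rect._1 = height;
    return result;
}

inline static Shape Empty(void) {
    Shape result;
    result.tag = EmptyTag;
    result.data.dummy = '\0';
    return result;
}

/* Hypothetical helper: reading a field by hand means checking the tag
 * yourself, which is the step pattern matching takes care of. */
static int circle_radius(const Shape *shape) {
    assert(shape->tag == CircleTag);
    return shape->data.Circle._0;
}
```

Because the layout is this predictable, foreign code can read the tag and data members directly, for instance circle_radius(&shape) on a value built with Circle(3).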

match99

match99 has the expected semantics: it sequentially tries to match the given instance of a sum type against the given variants, and, if a match has succeeded, it executes the corresponding statement and moves down to the next instruction (match(val) { ... } next-instruction;). If all the matches have failed, it executes the statement after otherwise99 and moves down to the next instruction.

of99

of99 accepts a matched variant name as a first argument and the rest of arguments comprise a comma-separated list of bindings.

  • A binding equal to _ is ignored.
  • A binding not equal to _ stands for a pointer to the corresponding field of the variant (e.g., given (Foo, T1, T2) and of99(Foo, x, y), x has the type T1 * and y has the type T2 *).

There can be more than one _ binding; however, non-_ bindings must be distinct.

To match an empty variant, write of99(Bar).
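
The FAQ notes that pattern matching desugars to a switch statement. The sketch below shows, in plain C99 and without the library, roughly what such a switch looks like with pointer bindings. The Expr type with variants (Const, int), (Add, int, int), and (Nop) is hypothetical, and the exact expansion produced by match99 and of99 is an assumption here.

```c
#include <assert.h>

/* Hand-written stand-in for a sum type with variants
 * (Const, int), (Add, int, int) and the empty variant (Nop). */
typedef enum ExprTag { ConstTag, AddTag, NopTag } ExprTag;

typedef struct ExprConst { int _0; } ExprConst;
typedef struct ExprAdd { int _0; int _1; } ExprAdd;

typedef struct Expr {
    ExprTag tag;
    union { char dummy; ExprConst Const; ExprAdd Add; } data;
} Expr;

/* Counterpart of:
 *   match99(*expr) {
 *       of99(Const, x) return *x;
 *       of99(Add, lhs, rhs) return *lhs + *rhs;
 *       otherwise99 return 0;
 *   }
 */
static int eval(Expr *expr) {
    switch (expr->tag) {
    case ConstTag: {
        int *x = &expr->data.Const._0; /* binding x has the type int * */
        return *x;
    }
    case AddTag: {
        int *lhs = &expr->data.Add._0;
        int *rhs = &expr->data.Add._1;
        return *lhs + *rhs;
    }
    default: /* otherwise99 */
        return 0;
    }
}

/* Counterpart of:
 *   match99(*expr) {
 *       of99(Add, _, rhs) *rhs = value;
 *       otherwise99 {}
 *   }
 * The _ binding is simply not declared; rhs points into the variant,
 * so writing through it updates the sum type in place. */
static void set_rhs(Expr *expr, int value) {
    switch (expr->tag) {
    case AddTag: {
        int *rhs = &expr->data.Add._1;
        *rhs = value;
        break;
    }
    default: /* otherwise99 {} */
        break;
    }
}

/* Usage: bindings are pointers, so set_rhs changes what eval sees. */
static void eval_examples(void) {
    Expr add = { .tag = AddTag, .data = { .Add = { 1, 2 } } };
    assert(eval(&add) == 3);
    set_rhs(&add, 10);
    assert(eval(&add) == 11);
}
```

Since bindings are pointers rather than copies, an arm can both read and update the matched fields.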

matches99

matches99 just tests whether an instance of a sum type holds a given variant. If it does, matches99 expands to an expression that evaluates to true; otherwise, it expands to one that evaluates to false.

ifLet99

ifLet99 tests for only one variant. It works conceptually the same as

match99(<expr>) {
    of(<variant-name>, vars...) { /* ... */ }
    otherwise {}
}

but has a shorter syntax:

ifLet99(<expr>, <variant-name>, vars...) { /* ... */ }
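
To make the single-variant tests concrete, here is a plain C99 sketch of what matches99 and ifLet99 amount to for a hand-written tagged union. The OptInt type with variants (Some, int) and (None), the OPT_MATCHES macro, and the unwrap_or function are all hypothetical; the real macros may expand differently.

```c
#include <assert.h>

/* Hand-written stand-in for a sum type with variants
 * (Some, int) and the empty variant (None). */
typedef enum OptIntTag { SomeTag, NoneTag } OptIntTag;

typedef struct OptInt {
    OptIntTag tag;
    union { char dummy; struct { int _0; } Some; } data;
} OptInt;

/* In the spirit of matches99(val, Some): a plain tag comparison that
 * evaluates to 1 when val holds the given variant and to 0 otherwise. */
#define OPT_MATCHES(val, variant) ((val).tag == variant##Tag)

/* In the spirit of:
 *   ifLet99(*opt, Some, x) { return *x; }
 *   return fallback;
 */
static int unwrap_or(OptInt *opt, int fallback) {
    if (OPT_MATCHES(*opt, Some)) {
        int *x = &opt->data.Some._0; /* binding x has the type int * */
        return *x;
    }
    return fallback;
}

/* Usage: only the Some variant yields its payload. */
static void unwrap_examples(void) {
    OptInt some = { .tag = SomeTag, .data = { .Some = { 7 } } };
    OptInt none = { .tag = NoneTag, .data = { .dummy = 0 } };
    assert(OPT_MATCHES(some, Some));
    assert(!OPT_MATCHES(none, Some));
    assert(unwrap_or(&some, -1) == 7);
    assert(unwrap_or(&none, -1) == -1);
}
```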

Unit type

The unit type Unit99 represents a type with a single value, unit99 (an object of type Unit99 should not be assigned any other value). Unit99 and unit99 are defined as follows:

typedef char Unit99;
static const Unit99 unit99 = '\0';

Credits

Thanks to Rust and ML for their implementations of sum types.

Learning resources

FAQ

Q: Why use C instead of Rust/Zig/whatever else?

A:

  • Datatype99 can be integrated into existing code bases written in pure C.
  • Sometimes C is the only choice.

Q: Why don't you use third-party code generators?

A: See Metalang99's README.

Q: How does it work?

A: The datatype99 macro generates a tagged union accompanied by type hints and value constructors. Pattern matching desugars to a mere switch statement. All of this is generated with Metalang99, a preprocessor metaprogramming library.

Q: What about compile-time errors?

A: With -ftrack-macro-expansion=0 (GCC), compile-time errors should be no longer than usual. Some kinds of syntactic errors are detected by the library itself, for example (as seen with the -E flag):

// !"Metalang99 error" (datatype99): "Bar(int) is unparenthesised"
datatype(A, (Foo, int), Bar(int));

The others are understandable as well:

datatype(Foo, (FooA, NonExistingType));

playground.c:3:1: error: unknown type name ‘NonExistingType’
    3 | datatype(
      | ^~~~~~~~
playground.c:3:1: error: unknown type name ‘NonExistingType’
playground.c:3:1: error: unknown type name ‘NonExistingType’

If an error is not comprehensible at all, try looking at the generated code (-E). Fortunately, the code generation semantics are formally defined, so normally you will not see anything unexpected.
