CEX.C Language Documentation

Getting started with CEX.C

What is CEX

Cex.C (pronounced "cexy") is a Comprehensively EXtended C Language. CEX was born as an alternative answer to the plethora of brand-new LLVM-based languages that strive to replace old C. CEX remains C itself, with small but important tweaks that make it a completely different development experience.

I tried to bring in the best ideas from modern languages while keeping a smooth developer experience for writing C code. The main goal of CEX is to give developers tools that help them write high-quality C code.

Core features

  • Single header, cross-platform, drop-in C language extension
  • No dependencies except C compiler
  • Self contained build system: CMake/Make/Ninja no more
  • Modern memory management model
  • New error handling model
  • New strings
  • Namespaces
  • Code quality oriented tools
  • New dynamic arrays and hashmaps with seamless C compatibility

Solving old C problems

CEX is another attempt to make old C a little bit better. Unlike other new systems languages such as Rust, Zig, or C3, which tend to start from scratch, CEX focuses on evolution and leverages existing tools provided by modern compilers to make code safer and easier to write and debug.

C Problem | CEX Solution
Bug-prone memory management | CEX provides allocator-centric, scoped memory allocation. It uses ArenaAllocators and a temporary allocator in mem$scope(), which decrease the probability of memory bugs.
Unsafe arrays | Address sanitizers are enabled by default, so you’ll get your crashes as in other languages.
3rd-party build system | An integrated build system eliminates flame wars about which one is better. Now you can use Cex to run your build scripts, like in Zig.
Rudimentary error handling | CEX introduces the Exception type, and the compiler forces you to check it. The new error handling approach makes error checking easy and opens up cool possibilities like stack traces in C.
C is unsafe | Yeah, and it’s a cool feature! On the other hand, CEX provides a unit testing engine and fuzz tester support out of the box.
Bad string support | String operations in CEX are safe and resilient to NULL and buffer overflows. CEX has a dynamic string builder, slices, and C-compatible strings.
No data structures | CEX has type-safe generic dynamic array and hashmap types; they cover 80% of all use cases.
No namespaces | It’s more about LSP, developer experience, and readability. It is a much better experience to type and read str.slice.starts_with than str_slice_starts_with.

Making new CEX project

You can initialize a working boilerplate project using just a C compiler and the cex.h file.

Note

Make sure that you have a C compiler installed. We use the cc command as the default compiler; you may replace it with gcc or clang.

  1. Make a project directory
mkdir project_dir
cd project_dir
  2. Download cex.h
  3. Make a seed program

At this step we compile a special pre-seed program that will create a template project on its first run.

cc -D CEX_NEW -x c ./cex.h -o ./cex
  4. Run the cex program to initialize the project

The cex program automatically creates a project structure with a sample app and unit tests. It also recompiles itself to become a universal build system for the project. You may change its logic inside the cex.c file; this is your build script now.

./cex
  5. Now your project is ready to go

Now you can launch the sample program or run its unit tests.

./cex test run all
./cex app run myapp
  6. This is how to check your environment and build variables
> ./cex config

cexy$* variables used in build system, see `cex help 'cexy$cc'` for more info
* CEX_LOG_LVL               4
* cexy$build_dir            ./build
* cexy$src_dir              ./examples
* cexy$cc                   cc
* cexy$cc_include           "-I."
* cexy$cc_args_sanitizer    "-fsanitize-address-use-after-scope", "-fsanitize=address", "-fsanitize=undefined", "-fsanitize=leak", "-fstack-protector-strong"
* cexy$cc_args              "-Wall", "-Wextra", "-Werror", "-g3", "-fsanitize-address-use-after-scope", "-fsanitize=address", "-fsanitize=undefined", "-fsanitize=leak", "-fstack-protector-strong"
* cexy$cc_args_test         "-Wall", "-Wextra", "-Werror", "-g3", "-fsanitize-address-use-after-scope", "-fsanitize=address", "-fsanitize=undefined", "-fsanitize=leak", "-fstack-protector-strong", "-Wno-unused-function", "-Itests/"
* cexy$ld_args
* cexy$fuzzer               "clang", "-O0", "-Wall", "-Wextra", "-Werror", "-g", "-Wno-unused-function", "-fsanitize=address,fuzzer,undefined", "-fsanitize-undefined-trap-on-error"
* cexy$debug_cmd            "gdb", "-q", "--args"
* cexy$pkgconf_cmd          "pkgconf"
* cexy$pkgconf_libs
* cexy$process_ignore_kw    ""
* cexy$cex_self_args
* cexy$cex_self_cc          cc

Tools installed (optional):
* git                       OK
* cexy$pkgconf_cmd          OK ("pkgconf")
* cexy$vcpkg_root           Not set
* cexy$vcpkg_triplet        Not set

Global environment:
* Cex Version               0.14.0 (2025-06-05)
* Git Hash                  07aa036d9094bc15eac8637786df0776ca010a33
* os.platform.current()     linux
* ./cex -D<ARGS> config     ""

Meet Cexy build system

cexy$ is a build system integrated with Cex. It helps you manage your project, run tests, find symbols, and get help.

> ./cex --help
Usage:
cex  [-D] [-D<ARG1>] [-D<ARG2>] command [options] [args]

CEX language (cexy$) build and project management system

help                Search cex.h and project symbols and extract help
process             Create CEX namespaces from project source code
new                 Create new CEX project
stats               Calculate project lines of code and quality stats
config              Check project and system environment and config
libfetch            Get 3rd party source code via git or install CEX libs
test                Test running
fuzz                Generic fuzz tester
app                 Generic app build/run/debug

You may try to get help for commands as well, try `cex process --help`
Use `cex -DFOO -DBAR config` to set project config flags
Use `cex -D config` to reset all project config flags to defaults

Code example

Hello world in CEX

#define CEX_IMPLEMENTATION
#include "cex.h"

int
main(int argc, char** argv)
{
    io.printf("MOCCA - Make Old C Cexy Again!\n");
    return 0;
}

Holistic function

// CEX has a special Exception return type that forces the caller to check the return value of
//   the called function; it also supports call stack printing on errors in vanilla C
Exception
cmd_custom_test(u32 argc, char** argv, void* user_ctx)
{
    // Let's open temporary memory allocator scope (var name is `_`)
    //  it will free all allocated memory after any exit from scope (including return or goto)
    mem$scope(tmem$, _)
    { 
        e$ret(os.fs.mkpath("tests/build/")); // make directory or return error with traceback
        e$assert(os.path.exists("tests/build/")); // evergreen assertion or error with traceback

        // auto type variables
        auto search_pattern = "tests/os_test/*.c";

        // Trace with file:<line> + formatting
        log$trace("Finding/building simple os apps in %s\n", search_pattern);

        // Search all files in the directory by wildcard pattern
        //   allocate the results (strings) on temp allocator arena `_`
        //   return dynamic array items type of `char*`
        arr$(char*) test_app_src = os.fs.find(search_pattern, false, _);

        // for$each works on dynamic, static arrays, and pointer+length
        for$each(src, test_app_src)
        {
            char* tgt_ext = NULL;
            char* test_launcher[] = { cexy$debug_cmd }; // CEX macros contain $ in their names

            // arr$len() - universal array length getter 
            //  it supports dynamic CEX arrays and static C arrays (i.e. sizeof(arr)/sizeof(arr[0]))
            if (arr$len(test_launcher) > 0 && str.eq(test_launcher[0], "wine")) {
                // str.fmt() - using allocator to sprintf() format and return new char*
                tgt_ext = str.fmt(_, ".%s", "win");
            } else {
                tgt_ext = str.fmt(_, ".%s", os.platform.to_str(os.platform.current()));
            }

            // NOTE: cexy is a build system for CEX, it contains utilities for building code
            // cexy.target_make() - makes target executable name based on source
            char* target = cexy.target_make(src, cexy$build_dir, tgt_ext, _);

            // cexy.src_include_changed - parses `src` .c/.h file, finds #include "some.h",
            //   and checks also if "some.h" is modified
            if (!cexy.src_include_changed(target, src, NULL)) {
                continue; // target is up to date, source is not modified
            }

            // Launch OS command and get interactive shell
            // os.cmd. provides more capabilities for launching subprocesses and grabbing stdout
            e$ret(os$cmd(cexy$cc, "-g", "-Wall", "-Wextra", "-o", target, src));
        }
    }

    // CEX provides capabilities for generating namespaces (for user's code too!)
    // For example, cexy namespace contains
    // cexy.src_changed() - 1st level function
    // cexy.app.run() - sub-level function
    // cexy.cmd.help() - sub-level function
    // cexy.test.create() - sub-level function
    return cexy.cmd.simple_test(argc, argv, user_ctx);
}

Supported compilers/platforms

Tested compilers / Libc support

  • GCC - 10, 11, 12, 13, 14, 15
  • Clang - 13, 14, 15, 16, 17, 18, 19, 20
  • MSVC - unsupported, and probably never will be
  • LibC tested - glibc (linux), musl (linux), ucrt/mingw (windows), macos

Tested platforms / architectures

  • Linux - x32 / x64 (glibc, gcc + clang)
  • Alpine linux - (libc musl, gcc) on architectures x86_64, x86, aarch64, armhf, armv7, loongarch64, ppc64le, riscv64, and s390x (big-endian)
  • Windows (via MSYS2 build) - x64 (mingw64 + clang), libc msvcrt/ucrt
  • Macos - x64 / arm64 (clang)

CEX philosophy

Main purpose of CEX

Cex was designed as a thin base layer above the core C language, with the following goals in mind:

  • Enhancing developer experience. The most common things should be as seamless as possible, without changing core language mechanics: less boilerplate code, better readability and debuggability.
  • Eliminating dependencies. Cex is a single-header, all-in-one language, with core tools for building, testing, and debugging your project. You need only a C compiler (clang or gcc), that’s it. The build system is included, and you can write your build logic in C.
  • Cross-platform. All Cex capabilities are tested cross-platform, so you don’t need to figure out the nuances of system API behavior on different platforms.
  • Self-sufficient build system. CMake/Make/shell scripts are dependencies; they are essentially separate programming languages, and that’s a burden. Cex itself is a build system with a simple CLI that supports cross-platform builds, persistent configuration, and build logic written in C (like in Zig).
  • Scripting flavor. CEX is designed to make common code patterns easy to work with; it fights extra complexity and reduces the mental overhead of writing code. The new memory management and error handling make daily life much easier.
  • Less is more and enough is enough. Cex tries to add just enough new entities (types, namespaces, functions) to original C to make life easier, extending C functionality only when needed. Cex embraces the conservatism of C, and its goal is to be a stable base layer for projects for years ahead.
  • Code quality tools. Cex leverages existing compiler capabilities to make C code better. It includes sanitizers, fuzzers, and unit tests out of the box, letting you focus on development.
  • Long-term lifetime. When a project is built with CEX, it carries everything it needs inside its repo. After the 1.0 release, the CEX header itself will be maintained for ultimate backward compatibility. Ideally, its API should be as stable as that of the SQLite project.

Simplicity as a virtue

C has been the least common denominator for legacy and modern system software for decades. It’s a simple language, but very hard to master. Cex tries to add a thin layer of things to make life a little bit easier, without bringing too much complexity to the code.

It’s challenging to make something simple by adding more stuff, which by definition adds complexity. However, by adding stuff Cex reduces the decision-making burden, providing common code patterns and utility functions, and rethinking the C experience.

For example:

  • Cex errors make vanilla C error handling obsolete. It’s just one type with two states (error / no error) and unlimited options for error variants. No more special enums, no more -1 and errno. Cex errors are easy to throw, easy to log, and easy to handle.
  • Memory handling via allocators makes every allocating function explicit. It’s easier to reason about the code, easier to track lifetimes, and easier to clean up when using arenas and scopes. Having standard allocators also makes it possible to write reusable code or to use allocators for debugging memory leaks.
  • Namespaces ease the burden of remembering dozens of functions with the same prefix; it’s easier to type and follow LSP suggestions. Sub-namespaces reduce the mental overhead of picking the right function. For example, str.convert. and str.slice. expand to specific sub-namespaces in LSP suggestions in CEX. Using namespaces feels like a decision tree: you branch step by step from the upper namespace to a sub-namespace to the end function. It feels much easier to type str.slice.remove_prefix than to remember the full function name str_slice_remove_prefix.
  • Commonality of data collections. Cex has dynamic arrays and hashmaps, which can be handled like any other C array. Inspired by Python’s approach of applying len and for to anything iterable, Cex also has arr$len and for$each, which work with any C/Cex static or dynamic array, hashmap, or pointer+length data.
  • The Cex build system might look like overkill (just use CMake), but with it you are still practicing C/Cex, and no extra dependency is needed for building your project. Cex, in turn, does its best to provide utility tools for building code and working with files, strings, and the OS.

Making C cexy again

Good old C99 might look outdated in modern times, but nowadays we have C23 compilers with brilliant tooling like sanitizers and fuzzers included. This opens a new era of C, a safer C.

C looks like a perfect fit for unsafe low-level applications like OS kernels, drivers, and math libraries. Doing higher-level stuff in C has always been miserable for various reasons. Cex is an attempt to bring C to a slightly higher level.

Joy of C

In my opinion, the unsafety of C is really fun to work with: everything is under your control, everything is your responsibility. You can do wild stuff without any complaints from the compiler, ultimate freedom of code with ultimate responsibility.

This freedom of code is not nearly achievable in any modern language; they tend to set up endless guardrails and protect us from any possible issue. This comes hand in hand with language complexity, an endless struggle with compiler warnings, and new abstraction levels over everything that may hurt you. We end up with a sterile world of safe computer science, without any chance to touch and understand how the machine works at a low level.

The mission of CEX is to bring the joy of programming in C to a new level, and to help make C code safer and easier to write and understand.

Shooting in the foot

What if shooting yourself in the foot is not a bad idea? Before you start imagining a pistol or a shotgun, hold off, let’s start small… What if we could pick a toy gun with plastic bullets? What if we could stress test our code under different conditions and see what happens? What happens when we pass NULL to that argument? What about a buffer overflow?

Caution

CEX Principle: making by shaking.

To make C code safe we need tools, and fortunately modern compilers already have them:

  1. Address sanitizers - in clang/gcc they catch a variety of bugs (buffer overflows, use after free, memory leaks, etc.). We only need to help them trigger by shaking the code via unit tests or fuzzers.
  2. Unit tests - C is one of those languages that demands roughly 3x more testing effort than other programming languages.
  3. Fuzzers - in some cases the variety of inputs is too large to cover with unit tests. Fuzzers help with this issue, and they can also be used for Deterministic Simulation Testing or randomized testing. LibFuzzer is included in clang, or you can use AFL++ if you want.
  4. Assertions - placing asserts everywhere in your code is a big deal for code quality and serves as a long-term early warning about possible system inconsistencies. They can be used not only to check the inputs of a function, but also to validate results or even intermediate stages.

All of the tools above are available with cex.h out of the box, so adding a new test via the CLI has never been easier:

# New test boilerplate
./cex test create tests/test_my_stuff.c

# Running a test
./cex test run tests/test_my_stuff.c

# Running all tests in a project
./cex test run all

Problems and solutions of C

Every programming language has its own quirks, sharp edges, and workarounds, and C is no exception. However, in my opinion, sanitizers have revolutionized C development. With modern compilers C is much safer than it was even 10 years ago. Fuzzers took it a step further, especially if your program works a lot with user input.

Problem | Solution
Memory safety | We can use sanitizers to catch most of the cases, plus more unit tests!
Memory leaks | Memory scopes in CEX, sanitizers for checking
Manual memory management | Temp allocator in CEX, memory scopes, arenas
Name conflicts | Hard to solve; CEX mitigates it by introducing namespace generation
Error handling is inconsistent | CEX introduces new unified error handling
Type overflows | Unit testing, fuzzing
Unsafe type casting | It hurts especially when refactoring, but unit testing will catch most of it.
Undefined behavior | Use the UB sanitizer, fix all compiler warnings
Macros | People hate macros, I don’t know why; just use them in moderation
No generic types | Solvable with macros and _Generic; CEX dynamic arrays and hashmaps are fully generic
No tracebacks | Use sanitizers; Cex uassert also prints tracebacks, and Cex error handling can generate tracebacks
Poor core types (strings, dynamic arrays, hashmaps) | CEX introduces general-purpose core types

When C shines

What | Why
Simple semantics | It’s good to have fewer cryptic combinations of special characters and keywords in the language. C is simple, which doesn’t mean it’s easy.
Full control | We have full control over everything that happens in our program: how memory is aligned, how control flow is laid out in assembly, how memory is allocated.
Tooling | C has an enormous number of development tools: testers, fuzzers, debuggers, coverage, performance, etc.
Language stability | It’s cool to have a project that compiles and works after 5-10 years with minimal changes. I would call C an anti-language to the modern NodeJS world.
Knowledge base | It’s probably the most diverse and stable knowledge base of all languages.
Works everywhere | Anybody tried to run Doom on a toaster?
Performance | It feels good when you beat blazingly fast JavaScript by a factor of 100x with moderate C code.
Compatibility | C is the lowest common denominator of all languages; anything can wrap and call C code.

How to improve C

Cex was initially inspired by my Python experience, especially by how a very limited set of built-in types (str, list, dict, tuple, set + primitives) and simple semantics were able to produce the huge Python ecosystem we have today. In my opinion, we don’t need to add every hyped programming paradigm to the C language to make it better.

However, we need some things to be productive in C, and CEX tries to implement them:

  1. New error handling system - makes error handling seamless and easy to work with.
  2. New memory management model - makes memory management more transparent and object lifetimes clearer.
  3. Better strings - modern computing has become string-centric, strings are everywhere, and we need a better tool set in C.
  4. Better arrays - there are no built-in dynamic arrays in C, yet it’s the most used data structure of all time.
  5. Hashmaps / sets - the second most used data structure, with no coverage in C.
  6. Build system - the current build system situation is an endless source of dependency conflicts, cross-platform headaches, and other issues that steal our mental energy.
  7. Code quality tools - we should lower the barriers to running unit tests, fuzzers, coverage, benchmarks, etc.

Cex also includes some things for IO, OS/file system operations, and a JSON lib, fueling the cross-platform build system and configuration. However, the goal of the CEX core is to remain a thin layer above original C, adding just enough.

Why not just use R**t, Z*g or C@$?

There is something appealing in C’s simplicity; it shines when we need full control over the code and assembly. Maybe it’s not for everyone, and maybe it’s a bad idea to use C for web backends. But modern languages are often affected by the rush to add new things and new paradigms, piling up complexity of semantics and dependencies.

C brings stability to the table: if something is written in C, there is a good chance the project will still compile 5 years from now. Very few modern languages have this in mind. Most keep rushing to make changes and add new features.

Basics

Code Style Guidelines

  • dollar$means_macros. CEX style uses $ as a macro marker: if you see it anywhere in the code, you are dealing with some sort of macro. The first$ part of the name is usually linked to the namespace of the macro, so you may expect other macros, type names, or functions with that prefix.

  • functions_are_snake_case(). Lower case for functions.

  • MyStruct_c or my_struct_s. Struct types are typically expected to be PascalCase with a suffix: the _c suffix indicates there is a code namespace with the same name (i.e. _c hints it’s a container or kind of object), the _s suffix means a simple data container without special logic.

  • MyObj.method() or namespace.func(). Namespace names are typically lower case, and object-specific namespace names reflect the type name, e.g. MyObj_c has MyObj.method().

  • Enums__double_underscore. Enum types are defined as MyEnum_e and each element looks like MyEnum__foo, MyEnum__bar.

  • CONSTANTS_ARE_UPPER. Two notations for constants: UPPER_CASE_CONST or namespace$CONST_NAME.

Types

CEX provides several short aliases for primitive types and some extra types for covering blank spots in C.

Type | Description
auto | automatically inferred variable type
bool | boolean type
u8/i8 | 8-bit integer
u16/i16 | 16-bit integer
u32/i32 | 32-bit integer
u64/i64 | 64-bit integer
f32 | 32-bit floating point number (float)
f64 | 64-bit floating point number (double)
usize | maximum array size (size_t)
isize | signed array size (ptrdiff_t)
char* | core type for null-term strings
sbuf_c | dynamic string builder type
str_s | string slice (buf + len)
Exc / Exception | error type in CEX
Error.<some> | generic error collection
IAllocator | memory allocator interface type
arr$(T) | generic type dynamic array
hm$(T) | generic type hashmap

CEX Core Namespaces

Note

You can get a cheat sheet with the ./cex help <namespace>$ command (the trailing $ is important!). For example, ./cex help io$, ./cex help e$.

Namespace | Description
e$ | CEX Exception handling
for$ | CEX array looping (for$each, etc.)
log$ | Logging system
mem$* | Memory management and allocators
mem$ | Global variable for general purpose heap allocator
tmem$ | Global variable for temporary arena allocator
str | General purpose string / slice namespace
sbuf | String builder class
arr$ | Type-safe, generic, dynamic array
hm$ | Type-safe, generic hashmap
io | Cross-platform IO namespace
test$ | Unit-test namespace (see tassert_*)
fuzz$ | Fuzz-test interface
argparse | Command line argument parsing class
cg$ | Code generation namespace
cexy$ | CEX Build system config vars and interface

Utility macros

Name | Description
uassert() | General purpose assert with tracebacks
uassertf() | General purpose assert with formatting
unlikely() | Branch predictor management for unexpected conditions
likely() | Branch predictor management for expected conditions
breakpoint() | Cross-platform debugger breakpoint
fallthrough() | Explicit fallthrough to the next switch case
unreachable() | Panics in debug mode, __builtin_unreachable() when NDEBUG is defined
tassert_* | Unit-test assertions, see ./cex help tassert_

Error handling

The problem of error handling in C

C error handling has always been a mess, for historical reasons and because of ABI specifics. The main curse of C errors is mixing values with errors: for example, system calls return -1 and set the errno variable. Some functions return 0 on error, some return NULL, sometimes it’s an enum, or MAP_FAILED (which is (void*)-1).

This convention drains a lot of developer energy, forcing developers to keep searching the docs to figure out which return values of a function are considered errors.

C error handling clutters the code with an endless if (ret_code == -1) pattern.

The code below is a typical error handling pattern in C; it illustrates several specific issues:

isize read_file(char* filename, char* buf, usize buf_size) {
    if (buf == NULL || filename == NULL) {
        errno = EINVAL; // (1)
        return -1;
    }

    int fd = open(filename, O_RDONLY);
    if (fd == -1) {
        fprintf(stderr, "Cannot open '%s': %s\n", filename, strerror(errno)); // (2)
        return -1;
    }
    isize bytes_read = read(fd, buf, buf_size);
    if (bytes_read == -1) {
        perror("Error reading"); // (3)
        return -1;
    }
    return bytes_read; // (4)
}
(1) errno is set, but it’s hard to tell which API call or function argument caused the failure.
(2) The error message does not point to the place in the code where it was reported, so the developer must go through the code to find it.
(3) errno is too broad and ambiguous to describe the exact reason for the failure.
(4) The return value of read_file mixes the error value -1 with the legitimate value of bytes_read. The situation gets worse if the function needs a non-integer return type.

CEX Error handling goals

CEX made an attempt to re-think general purpose error handling in applications, with the following goals:

  • Errors should be unambiguous - errors are detached from the valid result of a function; there are only 2 states: OK or an error.
  • Error handling should be general purpose - providing generic code patterns for error handling
  • Errors should be easy to report - avoiding error-code-to-string mapping
  • Errors should bubble up - code can pass the same error to the upper caller
  • Errors should be extensible - allowing unique error identification
  • Errors should be passed as values - low-overhead error handling
  • Error handling should be natural - no special constructs required to handle errors in C code
  • Error checking should be enforced - no accidentally skipped error checks

How error handling is implemented

CEX has a special Exception type, which is essentially an alias for char*, and yes, all error handling in CEX is based on char*. Before you start laughing and rolling on the floor, let me explain the most important part of the Exception type: this little * part. An Exception in CEX is a pointer (an address, a number) to some arbitrary char array in memory.

What if the returned pointer could always point to some constant area indicating an error? With that rule, we don’t have to match the error (string) content; we can compare only the address of the error.

CEX Error in a nutshell

// NOTE: excerpt from cex.h

/// Generic CEX error is a char*, where NULL means success(no error)
typedef char* Exc;

/// Equivalent of Error.ok, execution success
#define EOK (Exc) NULL

/// Use `Exception` in function signatures, to force developer to check return value
/// of the function.
#define Exception Exc __attribute__((warn_unused_result))


/**
 * @brief Generic errors list, used as constant pointers, errors must be checked as
 * pointer comparison, not as strcmp() !!!
 */
extern const struct _CEX_Error_struct
{
    Exc ok; // Success no error
    Exc argument;
    // ... cut ....
    Exc os;
    Exc integrity;
} Error;


// NOTE: user code

Exception
remove_file(char* path)
{
    if (path == NULL || path[0] == '\0') {
        return Error.argument;  // Empty or NULL path
    }
    if (!os.path.exists(path)) {
        return "Not exists"; // literal errors are allowed, but must be compared with strcmp()
    }
    if (str.eq(path, "magic.file")) {
        // Returns an Error.integrity and logs error at current line to stdout
        return e$raise(Error.integrity, "Removing magic file is not allowed!");
    }
    if (remove(path) < 0) { 
        return strerror(errno); // using system error text (arbitrary!)
    }
    return EOK;
}

Exception
main(char* path)
{
    // Method 1: low level handling (no re-throw)
    if (remove_file(path)) { return Error.os; }
    if (remove_file(path) != EOK) { return "bad stuff"; }
    if (remove_file(path) != Error.ok) { return EOK; }

    // Method 2: handling specific errors
    Exc err = remove_file(path);
    if (err == Error.argument) { // <<< NOTE: comparing address not a string contents!
        io.printf("Some weird things happened with path: %s, error: %s\n", path, err);
        return err;
    }
    // Method 3: helper macros + handling with traceback
    e$except(err, remove_file(path)) { // NOTE: this call automatically prints a traceback
        if (err == Error.integrity) { /* TODO: do special case handling */  }
    }

    // Method 4: helper macros + handling unhandled
    e$ret(remove_file(path)); // NOTE: on error, prints traceback and returns error to the caller

    remove_file(path);  // <<< OOPS compiler error, return value of this function unchecked

    return 0;
}

Error tracebacks and custom errors in CEX

The CEX error system was designed to help with debugging. This is a simple example of deep call stack printing in CEX.


#define CEX_IMPLEMENTATION
#include "cex.h"

const struct _MyCustomError
{
    Exc why_arg_is_one;
} MyError = { .why_arg_is_one = "WhyArgIsOneError" };

Exception
baz(int argc)
{
    if (argc == 1) { return e$raise(MyError.why_arg_is_one, "Why argc is 1, argc = %d?", argc); }
    return EOK;
}

Exception
bar(int argc)
{
    e$ret(baz(argc));
    return EOK;
}

Exception
foo2(int argc)
{
    io.printf("MOCCA - Make Old C Cexy Again!\n");
    e$ret(bar(argc));
    return EOK;
}

int
main(int argc, char** argv)
{
    (void)argv;
    e$except (err, foo2(argc)) { 
        if (err == MyError.why_arg_is_one) {
            io.printf("We need moar args!\n");
        }
        return 1; 
    }
    return 0;
}
MOCCA - Make Old C Cexy Again!
[ERROR]   ( main.c:12 baz() ) [WhyArgIsOneError] Why argc is 1, argc = 1?
[^STCK]   ( main.c:19 bar() ) ^^^^^ [WhyArgIsOneError] in function call `baz(argc)`
[^STCK]   ( main.c:27 foo2() ) ^^^^^ [WhyArgIsOneError] in function call `bar(argc)`
[^STCK]   ( main.c:35 main() ) ^^^^^ [WhyArgIsOneError] in function call `foo2(argc)`
We need moar args!

Rewriting initial C example to CEX

Main benefits of using CEX error handling system:

  1. Error messages come with source_file.c:line and function() for easier debugging
  2. Easier to do quick checks with e$assert
  3. Easier to re-throw generic unhandled errors inside code
  4. Unambiguous return values: OK or error.
  5. Unlimited variants of returning different types of errors (Error.argument, "literals", strerror(errno), MyCustom.error)
  6. Easy to log - Exceptions are just char* strings
  7. Traceback support when chained via multiple functions
Exception read_file(char* filename, char* buf, isize* out_buf_size) {
    e$assert(buf != NULL); // (1)
    e$assert(filename != NULL && "invalid filename");

    int fd = 0;
    e$except_errno(fd = open(filename, O_RDONLY)) { return Error.os; } // (2)
    e$except_errno(*out_buf_size = read(fd, buf, *out_buf_size)) { return Error.io; } // (3)
    return EOK; // (4)
}
(1) Returns an error and prints the failed expression: [ASSERT] ( main.c:26 read_file() ) buf != NULL. e$assert is an Exception-returning assert; it doesn’t abort your program, and these asserts are not stripped in release builds.
(2) Handles the typical -1 + errno check and prints: [ERROR] ( main.c:27 read_file() ) fd = open("foo.txt", O_RDONLY) failed errno: 2, msg: No such file or directory
(3) The result of the function is returned by reference through the out parameter.
(4) Unambiguous return code for success.
isize read_file(char* filename, char* buf, usize buf_size) {
    if (buff == NULL || filename == NULL) {
        errno = EINVAL;
        return -1;
    }

    int fd = open(filename, O_RDONLY);
    if (fd == -1) {
        fprintf(stderr, "Cannot open '%s': %s\n", filename, strerror(errno));
        return -1;
    }
    isize bytes_read = read(fd, buf, buf_size);
    if (bytes_read == -1) {
        perror("Error reading");
        return -1;
    }
    return bytes_read;
}

Helper macros e$...

CEX has a toolbox of macros with the e$ prefix, dedicated to Exception-specific tasks. They are not mandatory, and you can stick to regular C control flow constructs.

In general, e$ macros provide location logging (source file, line, function), which is the building block of the error traceback mechanism in CEX.

e$ macros are mostly designed to work with functions that return the Exception type.

Returning the Exc[eption]

Errors in CEX are just plain string pointers. If an Exception function returns NULL, EOK, or Error.ok, the call succeeded; any other value is an error.

You may also return via the e$raise(error_to_return, format, ...) macro, which prints the location of the error in the code together with a formatted message.

Exception error_sample1(int a) {
    if (a == 0) return Error.argument; // standard set of errors in CEX
    if (a == -1) return "Negative one";   // error literal also works, but harder to handle
    if (a == -2) return UserError.neg_two; // user error
    if (a == 7) return e$raise(Error.argument, "Bad a=%d", a); // error with logging
    
    return EOK; // success
    // return Error.ok; // success
    // return NULL; // success
}

Handling errors

CEX supports two ways of handling errors:

  • Silent handling - suppresses error location logging, which might be useful for performance critical code or tight loops. This is also the general way of returning errors in the CEX standard lib.
  • Loud handling with logging - useful for one-shot complex functions which may return multiple types of errors for different reasons. This is the way to go if you want tracebacks for your errors.
Silent handling example
Note

Avoid using e$raise() in called functions if you need silent error handling; use a plain return Error.* instead.

Exception foo_silent(void) {
    // Method 1: quick and dirty checks
    if (error_sample1(0)) { return "Error"; /* Silent handling without logic */ }
    if (error_sample1(0)) { /* Discarding error of a call */ }

    // Method 2: silent error condition
    Exc err = error_sample1(0);
    if (err) {
        if (err == Error.argument) {
            /* Handling specific error here */
        }
        return err;
    }

    // Method 3: silent macro, with temp error value
    e$except_silent(err, error_sample1(0)) {
        // NOTE: nesting is allowed!
        e$except_silent(err, error_sample1(-2)) {
            return err; // err = UserError.neg_two
        }

        // err = Error.argument now
        if (err == Error.argument) {
            /* Handling specific error here */
        }
        // break; // BAD! See caveats section below
    }
    return EOK;
}
Note

e$except_silent will print an error log when the code runs under a unit test or inside the CEX build system, which helps a lot with debugging.

Loud handling with logging

If you write general purpose code with debuggability in mind, logged error handling can be a breeze. It enables traceback error logging, so deep stack errors become easier to track and reason about.

There are special error handling macros for this purpose:

  1. e$except(err, func_call()) { ... } - error handling scope which initializes a temporary variable err and logs if func_call() returned an error. func_call() must return the Exception type for this macro.
  2. e$except_errno(sys_func()) { ... } - error handling for system functions which return -1 and set errno.
  3. e$except_null(ptr_func()) { ... } - error handling for functions which return NULL on error.
  4. e$except_true(func()) { ... } - error handling for functions which return a non-zero code on error.
  5. e$ret(func_call()); - runs the Exception returning function func_call(), and on error logs the traceback and re-returns the same value. This is the main code shortcut and the driver for all CEX tracebacks. Use it if you don’t care about precise error handling and are fine with returning immediately on error.
  6. e$goto(func_call(), goto_err_label); - runs the Exception returning function, and on error does goto goto_err_label;. This macro is useful for resource deallocation logic, and is intended for the typical C goto fail error handling pattern.
  7. e$assert(condition) or e$assert(condition && "What's wrong") or e$assertf(condition, format, ...) - quick condition checking inside Exception functions; logs the error location and returns Error.assert. These asserts remain in release builds and are not affected by the NDEBUG flag.
Exception foo_loud(int a) {
    e$assert(a != 0);
    e$assert(a != 11 && "a is suspicious");
    e$assertf(a != 22, "a=%d is something bad", a);

    char* m = malloc(20);
    e$assert(m != NULL && "memory error"); // ever green assert

    e$ret(error_sample1(9)); // Re-return on error

    e$goto(error_sample1(0), fail); // goto fail and free the resource

    e$except(err, error_sample1(0)) {
        // NOTE: nesting is allowed!
        e$except(err, error_sample1(-2)) {
            return err; // err = UserError.neg_two
        }
        // err = Error.argument now
        if (err == Error.argument) {
            /* Handling specific error here */
        }

        // continue; // BAD! See caveats section below
    }

    // For these e$except_*() macros you can use assignment expression
    // e$except_errno(fd = open(..))
    // e$except_null(f = malloc(..))
    // e$except_true (sqlite3_open(db_path, &db))
    FILE* f;
    e$except_null(f = fopen("foo.txt", "r")) {
        return Error.io;
    }

    return EOK;
    
fail:
    free(m);
    return Error.runtime;
}

Caveats

Most of the e$except_* macros are backed by a for() loop, so you have to be careful when you nest them inside outer loops and try to break/continue the outer loop on error.

In my opinion, using e$except_* inside loops is generally a bad idea, and you should consider:

  1. Factoring the error emitting code into a separate function
  2. Using if(error_sample1(i)) instead of e$except
Bad example!
Exception foo_err_loop(int a) {
    for (int i = 0; i < 10; i++) {
        e$except(err, error_sample1(i)) {
            break; // OOPS: `break` stops `e$except`, not outer for loop
        }
    }
    return EOK;
}
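
The pitfall can be reproduced in plain C with any macro that expands to a for loop. The following is a minimal sketch, not the CEX implementation: ON_ERROR, check, and both counting functions are hypothetical names made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical one-shot macro: runs its body once when `call` returns non-NULL.
 * It expands to a `for` loop, which is the property that causes the pitfall. */
#define ON_ERROR(err, call) \
    for (const char* err = (call); err != NULL; err = NULL)

/* Fails for i >= 3, succeeds otherwise. */
static const char* check(int i) { return (i >= 3) ? "TooBigError" : NULL; }

/* `break` only leaves the hidden loop of ON_ERROR; the outer loop keeps going. */
int count_iterations_macro(void)
{
    int iterations = 0;
    for (int i = 0; i < 10; i++) {
        iterations++;
        ON_ERROR(err, check(i)) { break; }
    }
    return iterations;
}

/* A plain `if` has no hidden loop, so `break` stops the outer loop. */
int count_iterations_if(void)
{
    int iterations = 0;
    for (int i = 0; i < 10; i++) {
        iterations++;
        if (check(i)) { break; }
    }
    return iterations;
}
```

In this sketch the macro version walks through all 10 iterations, while the plain if version stops at the first failing item.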

Standard Error

CEX implements a standard Error namespace, which covers the most common error situations you might need to handle.

const struct _CEX_Error_struct Error = {
    .ok = EOK,                       // Success
    .memory = "MemoryError",         // memory allocation error
    .io = "IOError",                 // IO error
    .overflow = "OverflowError",     // buffer overflow
    .argument = "ArgumentError",     // function argument error
    .integrity = "IntegrityError",   // data integrity error
    .exists = "ExistsError",         // entity or key already exists
    .not_found = "NotFoundError",    // entity or key not found
    .skip = "ShouldBeSkipped",       // NOT an error, function result must be skipped
    .empty = "EmptyError",           // resource is empty
    .eof = "EOF",                    // end of file reached
    .argsparse = "ProgramArgsError", // program arguments empty or incorrect
    .runtime = "RuntimeError",       // generic runtime error
    .assert = "AssertError",         // generic runtime check
    .os = "OSError",                 // generic OS check
    .timeout = "TimeoutError",       // await interval timeout
    .permission = "PermissionError", // Permission denied
    .try_again = "TryAgainError",    // EAGAIN / EWOULDBLOCK errno analog for async operations
};

Exception foo(int a) {
    e$except(err, error_sample1(0)) {
        if (err == Error.argument) {
            return Error.runtime; // Return another error
        }
    }
    return Error.ok; // success
}

Making custom user exceptions

Extending with existing functionality

You probably need custom errors only when they require specific handling, which is a rare case. In the common case you just report the details of the error and forget about it. Before we dive into customized error structs, let’s consider which simple instruments we have for error customization without adding another entity to the code:

  1. You may return string literals as custom errors. They are a convenient option when you don’t need to handle them (e.g. for rare/weird edge cases)
Exception foo_literal(int a) {
    if (a == 777999) return "a is a duplicate of magic number";
    return EOK;
}
  2. You may return a standard error and log something with e$raise(), which supports location logging and custom formatting.
Exception foo_ret(int a) {
    if (a == 777999) return e$raise(Error.argument, "a=%d looks weird", a);
    return EOK;
}

Custom error structs

If you need custom handling, you might need to create a new dedicated structure for errors.

Here are some requirements for a custom error structure:

  1. It has to be a constant global variable
  2. All fields must be initialized. Uninitialized fields are NULL, which means they are treated as a success code.
// myerr.h
extern const struct _MyError_struct
{
    Exc foo;
    Exc bar;
    Exc baz;
} MyError;

// myerr.c
const struct _MyError_struct MyError = {
    .foo = "FooError",
    .bar = "BarError",
    // WARNING: missing .baz - which will be set to NULL => EOK
};

// other.c
#include "cex.h"
#include "myerr.h"

Exception foo(int a) {
    e$except(err, error_sample1(0)) {
        if (err == Error.argument) {
            return MyError.foo;
        }
    }
    return Error.ok; // success
}

Advanced topics

Performance

Errors are pointers

Using strings as the error value carrier may look controversial at first glance. However, let’s remember that strings in C are char*, and a pointer is essentially an integer-like value holding a memory address. Therefore the CEX approach is to have a set of pre-defined, constant memory addresses that hold the standard error values (see the Standard Error section above).

So for error handling we compare the return value with EOK|NULL|Error.ok to check whether an error was returned. Then we compare the address of the returned error with the address of a standard error.

As a result, typical error handling in CEX costs one assembly instruction that compares a register with NULL, plus one instruction that compares the error address with another constant address when handling a specific error type.

Note

CEX uses direct pointer comparison if (err == Error.argument), instead of string content comparison if(strcmp(err, "ArgumentError") == 0) /* << BAD */
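
The idea of errors as constant addresses can be sketched in plain C without cex.h. Everything named here (Err, Errors, parse_digit, classify) is a hypothetical stand-in for illustration and not part of the CEX API.

```c
#include <stddef.h>

typedef const char* Err; /* stand-in for the CEX Exc type in this sketch */

/* Each field holds the address of one string literal; that address is the error identity. */
static const struct {
    Err argument;
    Err io;
} Errors = { .argument = "ArgumentError", .io = "IOError" };

/* Converts a character to a digit, reporting failures as error pointers. */
Err parse_digit(char c, int* out)
{
    if (out == NULL) { return Errors.argument; }
    if (c < '0' || c > '9') { return Errors.io; }
    *out = c - '0';
    return NULL; /* success */
}

/* Classifies a result with pointer comparisons only; no strcmp() involved. */
int classify(Err e)
{
    if (e == NULL) { return 0; }
    if (e == Errors.argument) { return 1; }
    if (e == Errors.io) { return 2; }
    return -1; /* unknown error */
}
```

The returned pointer is still a readable string, so it can be logged as is, while the checks themselves never look at its content.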

Branch predictor control

All CEX e$ macros use unlikely, a.k.a. __builtin_expect, to shape the assembly code in favor of the happy path. For example, this is an e$assert source snippet:

#    define e$assert(A)                                                                             \
        ({                                                                                          \
            if (unlikely(!((A)))) {                                                                 \
                __cex__fprintf(stdout, "[ASSERT] ", __FILE_NAME__, __LINE__, __func__, "%s\n", #A); \
                return Error.assert;                                                                \
            }                                                                                       \
        })

The unlikely(!(A)) hints the compiler to lay out the assembly instructions in favor of the happy path of e$assert, which is a performance gain when you have multiple error handling checks and/or big error handling blocks.
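
The same hint can be applied in your own checks. This is a small sketch under the assumption of a GCC or Clang compiler, where __builtin_expect is available; my_unlikely and sum_positive are hypothetical names, and the actual unlikely definition in cex.h may differ.

```c
#include <stddef.h>

/* Assumption: GCC/Clang provide __builtin_expect; other compilers get a plain condition. */
#if defined(__GNUC__) || defined(__clang__)
#    define my_unlikely(x) __builtin_expect(!!(x), 0)
#else
#    define my_unlikely(x) (x)
#endif

/* Sums the array; the error branch is marked as cold so the loop body favors the happy path. */
const char* sum_positive(const int* items, size_t len, long* out)
{
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        if (my_unlikely(items[i] < 0)) { return "NegativeItemError"; }
        total += items[i];
    }
    *out = total;
    return NULL; /* success */
}
```

The hint changes only code layout, never behavior, so the function returns the same results with or without it.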

Compatibility

Be careful if you need to expose CEX exception returning functions in an API. When you work across different shared libraries, the addresses of the same errors might differ. If user code is intended to check and handle API errors, it may be better to stick to a C-compatible approach instead of CEX errors.

CEX Exceptions work best when you use them within the single address space of an app or a library. If you need to cross this boundary, weigh the pros and cons carefully.
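
One possible C-compatible approach is to translate error pointers into stable integer codes at the library boundary. This is only a sketch of that idea in plain C: mylib_status, mylib_do_work, LibError, and internal_work are hypothetical names, not something CEX provides.

```c
#include <stddef.h>

typedef const char* Err; /* stand-in for the CEX Exc type in this sketch */

static const struct {
    Err argument;
    Err io;
} LibError = { .argument = "ArgumentError", .io = "IOError" };

/* Integer codes stay comparable across shared library boundaries. */
enum mylib_status { MYLIB_OK = 0, MYLIB_EARGUMENT = 1, MYLIB_EIO = 2, MYLIB_EUNKNOWN = 255 };

/* Internal code keeps using pointer-based errors. */
static Err internal_work(int value)
{
    if (value < 0) { return LibError.argument; }
    if (value > 100) { return LibError.io; }
    return NULL; /* success */
}

/* Public entry point: compares addresses inside the library, exports plain integers. */
int mylib_do_work(int value)
{
    Err e = internal_work(value);
    if (e == NULL) { return MYLIB_OK; }
    if (e == LibError.argument) { return MYLIB_EARGUMENT; }
    if (e == LibError.io) { return MYLIB_EIO; }
    return MYLIB_EUNKNOWN;
}
```

All address comparisons happen inside one library here, so callers never depend on where the error strings live in memory.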

Useful code patterns

Escape main() when possible

The CEX approach is to keep the main() function separate and as short as possible. This opens up capabilities for full code unit testing, unity builds, and tracebacks. This is a typical example app:

// app_main.c file
#include "cex.h"
Exception
app_main(int argc, char** argv)
{
    bool my_flag = false;
    argparse_c args = {
        .description = "New CEX App",
        argparse$opt_list(
            argparse$opt_help(),
            argparse$opt(&my_flag, 'c', "ctf", .help = "Capture the flag"),
        ),
    };
    if (argparse.parse(&args, argc, argv)) { return Error.argsparse; }
    io.printf("MOCCA - Make Old C Cexy Again!\n");
    io.printf("%s\n", (my_flag) ? "Flag is captured" : "Pass --ctf to capture the flag");
    return EOK;
}

// main.c file
#define CEX_IMPLEMENTATION   // this only appears in main file, before #include "cex.h"
#include "cex.h"
#include "app_main.c"  // NOTE: include .c, using unity build approach

int
main(int argc, char** argv)
{
    if(app_main(argc, argv)) { return 1; }
    return 0;
}

Inversion of error checking

Instead of nesting ifs, try the opposite approach: check for an error and exit. In CEX you can also use e$assert() for a quick and dirty one-line check.

Exception
app_main(int argc, char** argv)
{
    e$assert(argc == 2);  // assert shortcut
    if (!str.eq(argv[1], "MOCCA")) { return Error.integrity; }

    io.printf("MOCCA - Make Old C Cexy Again!\n");
    return EOK;
}

For comparison, the same logic with nested ifs:

Exception
app_main(int argc, char** argv)
{
    if (argc > 1) {
        if (str.eq(argv[1], "MOCCA")) {
            io.printf("MOCCA - Make Old C Cexy Again!\n");
        } else {
            return Error.integrity;
        }
    } else {
        return Error.argument;
    }
    return EOK;
}

Resource cleanup

Sometimes you need to open resources, manage memory, and carry an error code. Or maybe you have to use a legacy API inside a function, with incompatible error code conventions. Here is a CEX flavored implementation of the common goto fail C code pattern.


Exception
print_zip(char* zip_path, char* extract_dir)
{
    Exc result = Error.runtime; // NOTE: default error code, setting to error by default

    // Open the ZIP archive
    int err;
    struct zip* archive = NULL;
    e$except_null (archive = zip_open(zip_path, 0, &err)) { goto end; }

    i32 num_files = zip_get_num_entries(archive, 0);

    for (i32 i = 0; i < num_files; i++) {
        struct zip_stat stat;
        if (zip_stat_index(archive, i, 0, &stat) != 0) {
            result = Error.integrity;  // NOTE: we can substitute error code if needed
            goto end;
        }

        // NOTE: next may return error on buffer overflow -> goto end then
        char output_path[64];
        e$goto(str.sprintf(output_path, sizeof(output_path), "%s/%s", extract_dir, stat.name), end);

        io.printf("Element: %s\n", output_path);
    }

    // NOTE: success when no `goto end` happened, only one happy outcome
    result = EOK;

end:
    // Cleanup and result
    zip_close(archive);
    return result;
}

The same pattern applied to object construction:

MyObj
MyObj_create(char* path, usize buf_size)
{
    MyObj self = {0};

    e$except_null (self.file = fopen(path, "r")) { goto fail; }

    self.buf = malloc(buf_size);
    if (self.buf == NULL) { goto fail; }

    e$goto(fetch_data(&self.data), fail);
    
    // MyObj was initialized and in consistent state
    return self;

fail:
    // On error - do a cleanup of initialized stuff
    if (self.file) { fclose(self.file); }
    if (self.buf) { free(self.buf); }
    memset(&self, 0, sizeof(MyObj));
    return self;
}

Memory management

The problem of memory management in C

C has a long-lasting history of memory management issues. Modern languages proposed multiple solutions for these issues: RAII, borrow checkers, garbage collection, allocators, etc. All of them work and solve the memory problem to some extent, but they sometimes add new sets of problems in different places.

From my perspective, the root cause of the C memory problem is hidden memory allocation. When a developer works with a function which allocates memory, it’s hard to remember its behavior without looking into the source code or documentation. The absence of an explicit indication of memory allocation leads to flaws in memory handling, for example: memory leaks, use after free, or performance issues.

While C remains a low-level system language, it’s important to have precise control over code behavior and memory allocations. So in my opinion, RAII and garbage collection are alien to the C philosophy, whereas modern languages like Zig or C3 take an allocator-centric approach, which is more explicit and suitable for C.

Modern way of memory management in CEX

Allocator-centric approach

CEX tries to adopt an allocator-centric approach to memory management, which helps to follow these principles:

  • Explicit memory allocation. Each object (class) or function that may allocate memory has to have an allocator parameter. This requirement adds explicit API signature hints and communicates the memory implications of a function without a deep dive into documentation or source code.
  • Transparent memory management. All memory operations are provided by the IAllocator interface, which can be backed by interchangeable allocator objects of different types.
  • Memory scoping. When possible, memory usage should be limited by scope, which naturally regulates the lifetime of allocated memory and automatically frees it on scope exit.
  • UnitTest Friendly. Allocators allow implementing additional levels of memory safety when run in a unit test environment. For example, CEX allocators add special poisoned areas around allocated blocks, which trigger the address sanitizer when user code accesses these regions. Allocators open the door to memory leak checks, or to extra memory error simulations for better out-of-memory error handling.
  • Standard and Temporary allocators. Sometimes it’s useful to have an initialized allocator under your belt for short-lived temporary operations. CEX provides two global allocators by default: mem$ - a standard heap allocator using malloc/realloc/free, and tmem$ - a dynamic arena allocator with small pages (about 256 KB per page).

Example

This is a small example of key memory management concepts in CEX:

mem$scope(tmem$, _)                                   // (1)
{
    arr$(char*) incl_path = arr$new(incl_path, _);    // (2)
    for$each (p, alt_include_path) {
        arr$push(incl_path, p);                       // (3)
        if (!os.path.exists(p)) { log$warn("alt_include_path not exists: %s\n", p); }
    }
}                                                     // (4)

(1) Initializes a temporary allocator (tmem$) scope in mem$scope(tmem$, _) {...} and assigns it as a variable _ (you can use any name).
(2) Initializes dynamic array with the scoped allocator variable _, allocates new memory.
(3) May allocate memory
(4) All memory will be freed at exit from this scope

Lifetimes and scopes

Using memory scopes naturally regulates the lifetime of initialized memory. In the example above you can’t use the incl_path variable outside of mem$scope. Moreover, that memory is automatically freed after exiting the scope. This design significantly reduces the surface for use after free errors in general.

Temporary memory allocator

Dealing with lots of small memory allocations has always been a pain in C, because we need to deallocate them at the end, and because of the potential overhead of each individual allocation. The temporary allocator in CEX works as a small-page (around 256 KB) memory arena, which can be dynamically resized when needed. The most important feature of the temporary arena allocator is that it does a full cleanup automatically at mem$scope exit.

The temporary allocator is always available via the tmem$ global variable and can be used at any time during the program lifetime. It may only be used inside mem$scope, with support for up to 32 levels of mem$scope nesting. At the end of the program, CEX will automatically finalize and free all allocated memory.

You can find more technical details about implementation below in this article.

Memory management in CEX

Allocators

Allocators add many benefits to codebase design and development experience:

  • All memory allocating functions or objects become explicit, because they require an IAllocator argument
  • The logic of the code becomes detached from the memory model: the same dynamic array can be backed by heap, arena, or a stack based static char buffer through the same allocator interface. The same piece of code may work on Linux or on an embedded device without changes to the memory allocation model.
  • Allocators may add testing capabilities, e.g. simulating out-of-memory errors in unit tests, or adding extra integrity checks of memory allocations
  • There are multiple memory allocation models (heap, arenas, temp allocation), so you can find the best type of allocator for your needs and use case.
  • It’s easier to trace allocations and run memory benchmarks with allocators.
  • Automatic garbage collection with mem$scope and arena allocators - you’ll get everything freed on scope exit

Allocator interface

The allocator interface is represented by the IAllocator type, which is an interface structure of function pointers for generic operations. Allocators in CEX support malloc/realloc/calloc/free functions similar to their analogs in C; the only addition is an optional alignment parameter for the requested memory region.

#define IAllocator const struct Allocator_i* 

typedef struct Allocator_i
{
    // >>> cacheline
    alignas(64) void* (*const malloc)(IAllocator self, usize size, usize alignment);
    void* (*const calloc)(IAllocator self, usize nmemb, usize size, usize alignment);
    void* (*const realloc)(IAllocator self, void* ptr, usize new_size, usize alignment);
    void* (*const free)(IAllocator self, void* ptr);
    const struct Allocator_i* (*const scope_enter)(IAllocator self);   /* Only for arenas/temp alloc! */
    void (*const scope_exit)(IAllocator self);    /* Only for arenas/temp alloc! */
    u32 (*const scope_depth)(IAllocator self);  /* Current mem$scope depth */
    struct {
        u32 magic_id;
        bool is_arena;
        bool is_temp;
    } meta;
    //<<< 64 byte cacheline
} Allocator_i;
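
To illustrate why a function-pointer interface makes allocators interchangeable, here is a heavily simplified sketch in plain C11. It is not the CEX interface: Alloc, FixedAlloc, heap_allocator, fixed_allocator_init, and make_range are hypothetical names, and alignment parameters, calloc/realloc, and scopes are left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified allocator interface: a struct of function pointers. */
typedef struct Alloc Alloc;
struct Alloc {
    void* (*alloc)(Alloc* self, size_t size);
    void (*release)(Alloc* self, void* ptr);
};

/* Heap backed implementation. */
static void* heap_alloc(Alloc* self, size_t size) { (void)self; return malloc(size); }
static void heap_release(Alloc* self, void* ptr) { (void)self; free(ptr); }
Alloc heap_allocator(void) { return (Alloc){ heap_alloc, heap_release }; }

/* Fixed buffer implementation; the interface is the first member, so Alloc* casts back. */
typedef struct {
    Alloc base;
    size_t used;
    _Alignas(max_align_t) unsigned char buf[256];
} FixedAlloc;

static void* fixed_alloc(Alloc* self, size_t size)
{
    FixedAlloc* fa = (FixedAlloc*)self;
    size_t align = _Alignof(max_align_t);
    size_t start = (fa->used + align - 1) / align * align; /* keep chunks aligned */
    if (start > sizeof(fa->buf) || size > sizeof(fa->buf) - start) { return NULL; }
    fa->used = start + size;
    return fa->buf + start;
}
static void fixed_release(Alloc* self, void* ptr) { (void)self; (void)ptr; /* no-op */ }
void fixed_allocator_init(FixedAlloc* fa)
{
    fa->base = (Alloc){ fixed_alloc, fixed_release };
    fa->used = 0;
}

/* The allocator parameter makes the allocation visible in the signature. */
int* make_range(Alloc* a, int n)
{
    if (n <= 0) { return NULL; }
    int* r = a->alloc(a, (size_t)n * sizeof(int));
    if (r == NULL) { return NULL; }
    for (int i = 0; i < n; i++) { r[i] = i; }
    return r;
}
```

make_range works unchanged with either allocator; the caller decides whether the memory comes from the heap or from a fixed buffer.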

mem$ API

You shouldn’t use the allocator interface directly (it’s less convenient); it’s better to use the memory specific macros:

  • mem$malloc(allocator, size, [alignment]) - allocates uninitialized memory with allocator, size in bytes. The alignment parameter is optional; by default it’s the system specific alignment (up to 64 byte alignment is supported)
  • mem$calloc(allocator, nmemb, size, [alignment]) - allocates zero-initialized memory with allocator, nmemb elements of size bytes each. The alignment parameter is optional; by default it’s the system specific alignment (up to 64 byte alignment is supported)
  • mem$realloc(allocator, old_ptr, size, [alignment]) - reallocates previously allocated old_ptr with allocator. The alignment parameter is optional and must match the initial alignment of old_ptr
  • mem$free(allocator, old_ptr) - frees old_ptr and implicitly sets it to NULL to avoid use-after-free issues.
  • mem$new(allocator, T) - generic allocation of a new instance of T (type), respecting its size and alignment.
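
The "free and reset to NULL" behavior of mem$free can be imitated in plain C with a small macro. This is a sketch of the idea only; FREE_AND_NULL and demo_free_and_null are hypothetical names, and the real mem$free also takes an allocator.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the pointer and resets the caller's variable to NULL. */
#define FREE_AND_NULL(ptr) \
    do {                   \
        free(ptr);         \
        (ptr) = NULL;      \
    } while (0)

/* Returns 1 when the pointer was reset, -1 when the allocation failed. */
int demo_free_and_null(void)
{
    char* buf = malloc(16);
    if (buf == NULL) { return -1; }
    FREE_AND_NULL(buf);
    FREE_AND_NULL(buf); /* second call is harmless: free(NULL) is a no-op */
    return buf == NULL;
}
```

Because the variable is NULL afterwards, an accidental later access fails at a NULL address instead of silently touching freed memory.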

Allocator scoping:

  • mem$arena(page_size, arena_var) { ... } - enters a new arena allocator instance with the given page_size, available inside the block as arena_var.
  • mem$scope(arena_or_tmem$, scope_var) { ... } - opens a new memory scope (works only with arena allocators or the temp allocator)

Dynamic arenas

Dynamic arenas use an array of dynamically allocated pages; each page has a fixed size and is allocated on the heap. When you allocate memory on an arena and there is enough room on the current page, the arena places the chunk inside that page (simply moving a pointer, without a real allocation). If your memory request does not fit, the arena creates a new page, keeps all old pages untouched, and serves the allocation from the new page.

Arenas are designed to work with mem$scope(), which lets you create temporary memory allocations without worrying about cleanup. Once the scope is left, the arena deallocates everything allocated inside it and returns to the state it had before the scope was entered. Up to 32 levels of mem$scope() nesting are supported. Essentially, this is the exact mechanism that fuels tmem$ - the temporary allocator in CEX.
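
The page and scope mechanics described above can be sketched in plain C11. This is a simplified model and not the CEX implementation: it keeps pages in a linked list instead of an array, and Arena, ArenaMark, and the arena_* functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Page header; the union pads it so the payload after it stays max-aligned. */
typedef union Page {
    struct { union Page* prev; size_t cap; size_t used; } h;
    max_align_t align_;
} Page;

typedef struct { Page* top; size_t page_size; size_t pages; } Arena;
typedef struct { Page* top; size_t used; } ArenaMark; /* state saved at scope entry */

Arena arena_create(size_t page_size) { return (Arena){ NULL, page_size, 0 }; }

void* arena_alloc(Arena* a, size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t need = (size + align - 1) / align * align;
    if (need < size) { return NULL; } /* size overflow */
    if (a->top == NULL || a->top->h.cap - a->top->h.used < need) {
        /* Not enough room: add a page, older pages stay untouched. */
        size_t cap = need > a->page_size ? need : a->page_size;
        Page* p = malloc(sizeof(Page) + cap);
        if (p == NULL) { return NULL; }
        p->h.prev = a->top;
        p->h.cap = cap;
        p->h.used = 0;
        a->top = p;
        a->pages++;
    }
    void* out = (unsigned char*)(a->top + 1) + a->top->h.used;
    a->top->h.used += need; /* the "allocation" is just a pointer bump */
    return out;
}

ArenaMark arena_scope_enter(const Arena* a)
{
    return (ArenaMark){ a->top, a->top ? a->top->h.used : 0 };
}

/* Frees pages created inside the scope and rewinds the page that was current at entry. */
void arena_scope_exit(Arena* a, ArenaMark m)
{
    while (a->top != m.top) {
        Page* dead = a->top;
        a->top = dead->h.prev;
        free(dead);
        a->pages--;
    }
    if (a->top != NULL) { a->top->h.used = m.used; }
}

void arena_destroy(Arena* a) { arena_scope_exit(a, (ArenaMark){ NULL, 0 }); }
```

Leaving a scope costs one rewind plus freeing the extra pages, regardless of how many small allocations were made inside it.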

Working with arenas:

IAllocator arena = AllocatorArena.create(4096);   // (1)
u8* p = mem$malloc(arena, 100);                   // (2)

mem$scope(arena, tal)                             // (3)
{
    u8* p2 = mem$malloc(tal, 100000);             // (4)

    mem$scope(arena, tal)
    {
        u8* p3 = mem$malloc(tal, 100);
    }                                             // (5)
}                                                 // (6)

AllocatorArena.destroy(arena);                    // (7)

(1) New arena with 4096 byte page
(2) Allocating some memory from arena
(3) Entering new memory scope
(4) Allocation size exceeds page size, so a new page will be allocated. The address of p remains the same!
(5) At scope exit p3 will be freed, p2 and p remain
(6) At scope exit p2 will be freed, excess pages will be freed, p remains
(7) Arena destruction, all pages are freed, p is invalid now.

Using mem$arena:

mem$arena(4096, arena)
{
    // This needs extra page
    u8* p2 = mem$malloc(arena, 10040);
    mem$scope(arena, tal)
    {
        u8* p3 = mem$malloc(tal, 100);
    }
}

Using the temporary allocator tmem$:

mem$scope(tmem$, _)                       // (1)
{
    u8* p2 = mem$malloc(_, 110241024);    // (2)

    mem$scope(tmem$, _)                   // (3)
    {
        u8* p3 = mem$malloc(_, 100);
    }                                     // (4)
}                                         // (5)

(1) Initializes a temporary allocator (tmem$) scope in mem$scope(tmem$, _) {...} and assigns it as a variable _ (you can use any name). _ is a pattern for temp allocator in CEX.
(2) New page for temp allocator created, because size exceeds existing page size
(3) Nested scope is allowed
(4) Scope exit: p3 is automatically cleaned up
(5) Scope exit: p2 is cleaned up and the extra page is freed.

Standard allocators

There are two general purpose allocators globally available out of the box for CEX:

  • mem$ - a heap allocator, the same old malloc/free type of allocation, with extra alignment support. In unit tests this allocator provides simple memory leak checks even without the address sanitizer enabled.
  • tmem$ - a dynamic arena with 256 KB page size, used for short-lived temporary operations; it cleans up pages automatically at program exit. It allocates pages only on first allocation, otherwise it remains a global static struct instance (about 128 bytes in size). Thread safe, uses thread_local.

Caveats

Do cross-scope memory access carefully

Never reallocate memory that belongs to one scope inside a nested scope; this inevitably leads to a use-after-free issue. This is a bad example:

// BAD!
mem$scope(tmem$, _)
{
    u8* p2 = mem$malloc(_, 100);                // (1)

    mem$scope(tmem$, _)
    {
        p2 = mem$realloc(_, p2, 110241024);     // (2)
    }                                           // (3)

    if(p2[128] == '0') { /* OOPS */}            // (4)
}

(1) Initial allocation in the first scope
(2) realloc uses a different scope depth, this might lead to an assertion in CEX unit tests
(3) p2 is automatically freed, because now it belongs to a different scope
(4) You’ll face use-after-free, which is typically reported as use-after-poison in the temp allocator.
Tip

CEX does its best to catch these cases in unit test mode: it raises an assertion at the mem$realloc line with a meaningful error message. Standard CEX collections like dynamic arrays arr$ and hashmaps hm$ also trigger it when they need to resize at a different mem$scope level.

Be careful with reallocations on arenas

CEX arenas are designed to be always growing. If your code pattern relies on heavy memory reallocation, arena growth may lead to performance issues, because each reallocation may trigger a memory copy and a new page creation. Consider pre-allocating some reasonable capacity for your data when working with arenas (including the temp allocator). However, if you’re reallocating the very last allocated pointer, the arena might do it in place on the same page.
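
The difference between reallocating the last pointer and an older one can be shown with a single-page bump allocator. This is a plain C11 sketch with hypothetical names (Bump, bump_alloc, bump_realloc); the real CEX arena handles pages, alignment, and scopes differently.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    _Alignas(max_align_t) unsigned char buf[1024];
    size_t used; /* offset of the first free byte */
    size_t last; /* offset of the most recent allocation */
} Bump;

static size_t round_up(size_t n)
{
    size_t align = _Alignof(max_align_t);
    return (n + align - 1) / align * align;
}

void* bump_alloc(Bump* b, size_t size)
{
    size_t need = round_up(size);
    if (need < size || need > sizeof(b->buf) - b->used) { return NULL; }
    b->last = b->used;
    b->used += need;
    return b->buf + b->last;
}

void* bump_realloc(Bump* b, void* ptr, size_t old_size, size_t new_size)
{
    if (ptr == NULL) { return bump_alloc(b, new_size); }
    if (ptr == b->buf + b->last) {
        /* Last allocation: resize in place by moving the bump pointer. */
        size_t need = round_up(new_size);
        if (need < new_size || need > sizeof(b->buf) - b->last) { return NULL; }
        b->used = b->last + need;
        return ptr;
    }
    /* Older allocation: take a new chunk and copy; the old chunk is not reused. */
    void* fresh = bump_alloc(b, new_size);
    if (fresh != NULL) { memcpy(fresh, ptr, old_size < new_size ? old_size : new_size); }
    return fresh;
}
```

In this sketch, growing a container that is not the most recent allocation consumes the old and the new size together, which is the growth pattern the caveat warns about.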

Unit Test specific behavior

When run in test mode (specifically, when CEX_TEST is defined) the memory allocation model in CEX includes some extra safety capabilities:

  1. The heap based allocator (mem$) starts tracking memory leaks by comparing the number of allocations and frees.
  2. mem$malloc() returns uninitialized memory filled with the 0xf7 byte pattern
  3. If Address Sanitizer is available, all arena and heap allocations are surrounded by poisoned areas. If you see use-after-poison errors, it’s likely a sign of use-after-free or out of bounds access in tmem$. Try switching your code to the mem$ allocator if possible to triage the exact reason for the error.
  4. Allocators do sanity checks at the end of each unit test case
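
The first two points can be approximated in plain C with a thin wrapper around malloc/free. This is a sketch of the technique, not the CEX test-mode code; LeakStats, test_malloc, test_free, and leaked_blocks are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Counters compared at the end of a test to detect leaks. */
typedef struct { size_t allocs; size_t frees; } LeakStats;

/* Counts the allocation and fills it with 0xf7 so uninitialized reads stand out. */
void* test_malloc(LeakStats* s, size_t size)
{
    void* p = malloc(size);
    if (p != NULL) {
        memset(p, 0xf7, size);
        s->allocs++;
    }
    return p;
}

void test_free(LeakStats* s, void* p)
{
    if (p != NULL) {
        s->frees++;
        free(p);
    }
}

/* Number of blocks that were allocated but not freed yet. */
size_t leaked_blocks(const LeakStats* s) { return s->allocs - s->frees; }
```

A test runner could check leaked_blocks() after each test case and report a leak when it is not zero.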
Be careful with break/continue

mem$scope/mem$arena are macros backed by a for loop, so be careful when you use them inside loops and try to break/continue the outer loop.

// BAD!
for(u32 i = 0; i < 10; i++){
    mem$scope(tmem$, _)
    {
        u8* p2 = mem$malloc(_, 100);
        if(p2[1] == '0') { 
            break; // OOPS, this will break mem$scope not a outer for loop
        }
    }
} 
Never return pointers from scope

Exiting the function leads to cleanup of the memory scope, so the p2 address is invalid from then on. You might get a use-after-poison or use-after-free ASAN crash, or see the 0xf7 data pattern when running in a test environment without ASAN.

// BAD!
mem$scope(tmem$, _)
{
    u8* p2 = mem$malloc(_, 100);
    return p2; // BAD! This address will be freed at scope exit
}

Advanced topics

Performance tips

TempAllocator makes CPU cache hot

If we use mem$scope(tmem$) a lot, the ArenaAllocator re-uses the same memory pages, so these memory areas are likely to stay in the CPU cache, which is beneficial for performance. In general the ArenaAllocator works like a stack, with automatic memory cleanup at scope exit.

Arena allocation is cheap

ArenaAllocator implements memory allocation by moving a memory pointer back and forth. Allocating small chunks is cheap as long as there is no need to request memory from the OS for a new arena page.

Be careful with ArenaAllocator when you need to reallocate a lot

AllocatorArena and the temporary allocator do not reuse blank chunks of freed memory inside pages; they simply allocate new memory. This might be a problem when you dynamically resize some container (e.g. dynamic array arr$), which could lead to uncontrolled growth of arena pages and therefore to performance degradation.

On the other hand, it’s totally fine to pre-allocate some capacity for your needs upfront. Just try to be mindful of your memory allocation and usage patterns.

When to use arena or heap allocator

ArenaAllocator and tmem$ use cases

ArenaAllocator works great when you need disposable memory for temporary needs, or when a program operation has limited boundaries in time and space. For example: read a file, process it, calculate stuff, close it, done.

Another great benefit of arenas is the stability of memory pointers: once memory is allocated it stays at the same place.

Arenas simplify managing memory for small objects, so you don’t need to write extra memory management logic for each small allocation. Everything is cleared at scope exit.

HeapAllocator (mem$) use cases

HeapAllocator is simply the system allocator, backed by malloc/free. You can use it for long-lived or frequently resized objects (e.g. dynamic arrays or hashmaps). It works best for bigger allocations with longer lifetimes.

Unit testing and memory leaks

When you run CEX allocators in unit test code, they apply extra sanity checks to help you debug memory related issues:

  • New mem$malloc allocations for mem$/tmem$ are filled with the 0xf7 byte pattern, which indicates uninitialized memory.
  • The mem$ allocator tracks the number of allocations and deallocations and emits a unit test [LEAK] warning (when ASAN is disabled)
  • There are small poisoned areas around allocations made by mem$/tmem$, which trigger an ASAN use-after-poison crash (read/write), or are checked for a valid poison pattern when ASAN is disabled.
  • After each test CEX automatically performs tmem$ sanity checks in order to find memory corruption
Tip

If you need to debug memory leaks in your code, consider using mem$ (heap based) allocation, which utilizes ASAN memory leak tracking mechanisms.

Out-of-bounds access and poisoning

CEX encourages using ASAN everywhere for debugging. ASAN works great for catching out-of-bounds access to heap allocated memory. It’s a little bit more difficult for arenas, because they use big pages of memory that the program owns, so ASAN has nothing to complain about. To fix this, tmem$ and AllocatorArena add poison areas around each allocation, which trigger a use-after-poison crash. If you face it, make sure that your program doesn’t read or write out of bounds, and try to temporarily substitute tmem$ with mem$ to get more precise error information.
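When ASAN is not available, the same idea can be approximated by filling guard areas with a known pattern and verifying it later. The following plain C sketch illustrates that fallback. The guard width, the reuse of 0xf7 as the guard pattern and all function names are assumptions made for illustration, not the CEX internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GUARD_SIZE = 8 }; // hypothetical guard width
#define GUARD_BYTE 0xf7  // assumed pattern, chosen for illustration only

// Lays out [guard][payload][guard] inside `buf` and returns the payload.
static uint8_t* guarded_place(uint8_t* buf, size_t buf_size, size_t n)
{
    if (buf_size < n + 2 * GUARD_SIZE) { return NULL; }
    memset(buf, GUARD_BYTE, GUARD_SIZE);
    memset(buf + GUARD_SIZE + n, GUARD_BYTE, GUARD_SIZE);
    return buf + GUARD_SIZE;
}

// Returns 1 when both guard areas still hold the pattern, 0 otherwise.
static int guarded_check(const uint8_t* payload, size_t n)
{
    const uint8_t* before = payload - GUARD_SIZE;
    const uint8_t* after = payload + n;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (before[i] != GUARD_BYTE || after[i] != GUARD_BYTE) { return 0; }
    }
    return 1;
}

// Usage sketch: writes 10 bytes (in bounds) or 11 bytes (one byte too many).
static int guard_demo(int overflow)
{
    uint8_t buf[64];
    uint8_t* p = guarded_place(buf, sizeof(buf), 10);
    if (p == NULL) { return -1; }
    memset(p, 0, overflow ? 11 : 10); // the 11th byte clobbers the rear guard
    return guarded_check(p, 10);
}
```

A pattern check like this only detects writes, and only at the moment the check runs, which is why ASAN poisoning remains the more precise tool.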

Code patterns

Using temporary memory scope

mem$scope(tmem$, _) 
{
    u8* p2 = mem$malloc(_, 100);
} 
Note

By CEX convention, the _ variable is used as the name of the temp allocator.

Using heap allocator

u8* p2 = mem$malloc(mem$, 100); // mem$ is a global variable for HeapAllocator
mem$free(mem$, p2); // we must manually free it
uassert(p2 == NULL); // p2 set to NULL by mem$free()

Opening new ArenaAllocator scope

mem$arena(4096, arena)
{
    u8* p2 = mem$malloc(arena, 10040);
}

Mixing ArenaAllocator and temp allocator

mem$arena(4096, arena)
{
    // We will store result in the arena
    u8* result = mem$malloc(arena, 10040);

    mem$scope(tmem$, _) 
    {
        // Do temporary calculations with tmem$
        u8* p2 = mem$malloc(_, 100);

        // Copy persistent results here
        result[0] = p2[0];
    }  // NOTE: p2 and all temp data freed
    
    // result remains
} 
// result freed

Strings

Problems with strings in C

Strings in C are historically an endless source of problems, bugs and vulnerabilities. String manipulation in the C standard library is very low level and sometimes confusing. But in my opinion, most of the problems with strings in C are a result of poor code practices, rather than issues of the language itself.

With modern tooling like Address Sanitizer it’s much easier to catch these bugs, so we are starting to face developer experience issues rather than security complications.

Problems with C char* strings:

  • No length information included, which leads to performance issues with overuse of strlen
  • Null terminator is critical for security, but not all libc functions handle strings securely
  • String slicing is impossible without copy and setting null-terminator at the end of slice
  • libc string functions behavior sometimes is implementation specific and insecure

Strings in CEX

CEX has three key groups of string manipulation routines:

  1. General purpose string manipulation - uses the vanilla char* type, with null-terminator, and the dedicated str namespace. The main purpose is to make strings easy to work with while keeping them C compatible. The str namespace uses allocators for all memory allocating operations, which allows us to use temporary allocations with tmem$.
  2. String slicing - sometimes we need to obtain and work with a part of an existing string, so CEX uses the str_s type for defining slices. There is a dedicated sub-namespace str.slice which is specially designed for working with slices. Slices may or may not be null-terminated, they carry a pointer and a length. Typically this is a quick and non-allocating way of working with a string view representation.
  3. String builder - when we need to build a string dynamically we may use the sbuf_c type and the sbuf namespace. This type is dedicated to dynamically growing strings backed by an allocator, which are always null-terminated and compatible with char* without casting.

Cex strings follow these principles:

  • Security first - all strings are null-terminated, and all buffer related operations always check bounds.
  • NULL-tolerant - all string functions accept NULL pointers and return a NULL result on error. This significantly reduces the number of if(s == NULL) error checks after each function, allowing you to chain string operations and check for NULL at the last step.
  • Memory allocations are explicit - if a string function accepts IAllocator, this is an indication of allocating behavior.
  • Developer convenience - sometimes it’s easier to allocate and make a new formatted string on tmem$, for example str.fmt(_, "Hello: %s", "CEX"), or use the builtin pattern matching engine str.match(arg[1], "command_*_(insert|delete|update)"), or work with a read-only slice representation of constant strings.
Tip

To get a brief cheat sheet of the functions via the Cex CLI, type ./cex help str$ or ./cex help sbuf$

General purpose strings

Use str for general purpose string manipulation. This namespace typically returns char* or NULL on error, and all functions are tolerant to NULL arguments of char* type and return NULL in this case. Each allocating function must have an IAllocator argument, and also returns NULL on memory errors.

    char*           str.clone(char* s, IAllocator allc);
    Exception       str.copy(char* dest, char* src, usize destlen);
    bool            str.ends_with(char* s, char* suffix);
    bool            str.eq(char* a, char* b);
    bool            str.eqi(char* a, char* b);
    char*           str.find(char* haystack, char* needle);
    char*           str.findr(char* haystack, char* needle);
    char*           str.fmt(IAllocator allc, char* format,...);
    char*           str.join(char** str_arr, usize str_arr_len, char* join_by, IAllocator allc);
    usize           str.len(char* s);
    char*           str.lower(char* s, IAllocator allc);
    bool            str.match(char* s, char* pattern);
    int             str.qscmp(const void* a, const void* b);
    int             str.qscmpi(const void* a, const void* b);
    char*           str.replace(char* s, char* old_sub, char* new_sub, IAllocator allc);
    str_s           str.sbuf(char* s, usize length);
    arr$(char*)     str.split(char* s, char* split_by, IAllocator allc);
    arr$(char*)     str.split_lines(char* s, IAllocator allc);
    Exc             str.sprintf(char* dest, usize dest_len, char* format,...);
    str_s           str.sstr(char* ccharptr);
    bool            str.starts_with(char* s, char* prefix);
    str_s           str.sub(char* s, isize start, isize end);
    char*           str.upper(char* s, IAllocator allc);
    Exception       str.vsprintf(char* dest, usize dest_len, char* format, va_list va);

String slices

CEX has a special type and namespace for slices. A slice is a dedicated struct of (len, char*) fields, intended for working with parts of other strings, and it can also represent a null-terminated string of full length.

Creating string slices

char* my_cstring = "Hello CEX";

// Getting a sub-string of a C string
str_s my_cstring_sub = str.sub(my_cstring, -3, 0); // Value: CEX, -3 means from end of my_cstring

// Created from any other null-terminated C string
str_s my_slice = str.sstr(my_cstring);

// Statically initialized slice with compile time known length
str_s compile_time_slice = str$s("Length of this slice created compile time"); 

// Making slice from a buffer (may not be null-terminated)
char buf[100] = {"foo bar"}; 
str_s my_slice_buf = str.sbuf(buf, arr$len(buf));
Note

str_s types are always passed by value, it’s a 16-byte struct, which fits 2 CPU registers on x64
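The indexing convention used by str.sub() above (negative values count from the end, an end of 0 means "up to the end") can be sketched in plain C. The slice_t type and the function names below are hypothetical stand-ins, and the handling of out-of-range arguments is an assumption rather than the documented str_s behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for a (len, char*) string view; not the real str_s.
typedef struct {
    size_t      len;
    const char* buf;
} slice_t;

static slice_t slice_from_cstr(const char* s)
{
    if (s == NULL) { return (slice_t){ 0 }; }
    return (slice_t){ .len = strlen(s), .buf = s };
}

// Negative start/end count from the end, end == 0 means "up to the end".
// Assumption: any invalid range yields the empty (NULL, 0) slice.
static slice_t slice_sub(slice_t s, ptrdiff_t start, ptrdiff_t end)
{
    if (s.buf == NULL) { return (slice_t){ 0 }; }
    ptrdiff_t len = (ptrdiff_t)s.len;
    if (start < 0) { start += len; }
    if (end <= 0) { end += len; }
    if (start < 0 || end > len || start > end) { return (slice_t){ 0 }; }
    // No copy and no terminator: the view points into the original string
    return (slice_t){ .len = (size_t)(end - start), .buf = s.buf + start };
}

// Helper for comparing a view with a C string (views may lack a terminator).
static int slice_eq_cstr(slice_t s, const char* cstr)
{
    return s.buf != NULL && s.len == strlen(cstr) && memcmp(s.buf, cstr, s.len) == 0;
}
```

With these definitions, slice_sub(slice_from_cstr("Hello CEX"), -3, 0) is a 3-byte view of "CEX" that shares memory with the original literal.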

Using slices

Once a slice is created and you see the str_s type, it’s only safe to use the special functions that work with slices, because null-termination is not guaranteed anymore.

There are plenty of operations which can be made on a string view alone, without touching the underlying string data.


char*           str.slice.clone(str_s s, IAllocator allc);
Exception       str.slice.copy(char* dest, str_s src, usize destlen);
bool            str.slice.ends_with(str_s s, str_s suffix);
bool            str.slice.eq(str_s a, str_s b);
bool            str.slice.eqi(str_s a, str_s b);
isize           str.slice.index_of(str_s s, str_s needle);
str_s           str.slice.iter_split(str_s s, char* split_by, cex_iterator_s* iterator);
str_s           str.slice.lstrip(str_s s);
bool            str.slice.match(str_s s, char* pattern);
int             str.slice.qscmp(const void* a, const void* b);
int             str.slice.qscmpi(const void* a, const void* b);
str_s           str.slice.remove_prefix(str_s s, str_s prefix);
str_s           str.slice.remove_suffix(str_s s, str_s suffix);
str_s           str.slice.rstrip(str_s s);
bool            str.slice.starts_with(str_s s, str_s prefix);
str_s           str.slice.strip(str_s s);
str_s           str.slice.sub(str_s s, isize start, isize end);
Note

All Cex formatting functions (e.g. io.printf(), str.fmt()) support special format %S dedicated for string slices, allowing to work with slices naturally.

char* my_cstring = "Hello CEX";
str_s my_slice = str.sstr(my_cstring);
str_s my_sub = str.slice.sub(my_slice, -3, 0);

io.printf("%S - Making Old C Cexy Again\n", my_sub);
io.printf("buf: %c %c %c len: %zu", my_sub.buf[0], my_sub.buf[1], my_sub.buf[2], my_sub.len);

Error handling

On error all slice related routines return empty (str_s){.buf = NULL, .len = 0}, all routines check if .buf == NULL therefore it’s safe to pass empty/error slice multiple times without need for checking errors after each call. This allows operations chaining like this:

str_s my_sub = str.slice.sub(my_slice, -3, 0);
my_sub = str.slice.remove_prefix(my_sub, str$s("pref"));
my_sub = str.slice.strip(my_sub);
if (!my_sub.buf) {/* OOPS error */}
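The reason this chaining is safe is that every function checks for the empty/error value first and passes it through unchanged. A minimal plain C sketch of the pattern follows. The view_t type and the view_* functions are hypothetical and only mimic the behavior described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct { size_t len; const char* buf; } view_t; // hypothetical view type

static view_t view_from_cstr(const char* s)
{
    if (s == NULL) { return (view_t){ 0 }; }
    return (view_t){ .len = strlen(s), .buf = s };
}

static view_t view_remove_prefix(view_t s, view_t prefix)
{
    // Error propagation: a NULL buffer in means a NULL buffer out
    if (s.buf == NULL || prefix.buf == NULL) { return (view_t){ 0 }; }
    if (prefix.len <= s.len && memcmp(s.buf, prefix.buf, prefix.len) == 0) {
        s.buf += prefix.len;
        s.len -= prefix.len;
    }
    return s;
}

static view_t view_strip(view_t s)
{
    if (s.buf == NULL) { return (view_t){ 0 }; } // same rule in every function
    while (s.len > 0 && isspace((unsigned char)s.buf[0])) { s.buf++; s.len--; }
    while (s.len > 0 && isspace((unsigned char)s.buf[s.len - 1])) { s.len--; }
    return s;
}

static int view_eq_cstr(view_t s, const char* cstr)
{
    return s.buf != NULL && s.len == strlen(cstr) && memcmp(s.buf, cstr, s.len) == 0;
}
```

Because each step returns the error value untouched, a single check of .buf after the last call covers the whole chain.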

String conversions

When working with strings, conversion from strings into numerical types becomes very useful. Libc conversion functions are messy and error prone, so CEX uses its own implementation, with support for both char* and str_s slices.

You may use one of the functions below or pick the type-safe/generic macro str$convert(str_or_slice, out_var_pointer)

Exception       str.convert.to_f32(char* s, f32* num);
Exception       str.convert.to_f32s(str_s s, f32* num);
Exception       str.convert.to_f64(char* s, f64* num);
Exception       str.convert.to_f64s(str_s s, f64* num);
Exception       str.convert.to_i16(char* s, i16* num);
Exception       str.convert.to_i16s(str_s s, i16* num);
Exception       str.convert.to_i32(char* s, i32* num);
Exception       str.convert.to_i32s(str_s s, i32* num);
Exception       str.convert.to_i64(char* s, i64* num);
Exception       str.convert.to_i64s(str_s s, i64* num);
Exception       str.convert.to_i8(char* s, i8* num);
Exception       str.convert.to_i8s(str_s s, i8* num);
Exception       str.convert.to_u16(char* s, u16* num);
Exception       str.convert.to_u16s(str_s s, u16* num);
Exception       str.convert.to_u32(char* s, u32* num);
Exception       str.convert.to_u32s(str_s s, u32* num);
Exception       str.convert.to_u64(char* s, u64* num);
Exception       str.convert.to_u64s(str_s s, u64* num);
Exception       str.convert.to_u8(char* s, u8* num);
Exception       str.convert.to_u8s(str_s s, u8* num);

For example:

i32 num = 0;
char* s = "-2147483648";

// Both are equivalent
e$ret(str.convert.to_i32(s, &num));
e$ret(str$convert(s, &num));
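For comparison, this is roughly what a careful conversion looks like with plain libc. The sketch shows the checks that strtol leaves to the caller. It is not the CEX implementation, and parse_i32 / parse_i32_or are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

// Returns 0 on success, -1 on any error; *out is written only on success.
// Note: strtol itself skips leading whitespace, which is kept as-is here.
static int parse_i32(const char* s, int32_t* out)
{
    if (s == NULL || out == NULL || *s == '\0') { return -1; }
    char* end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0') { return -1; } // no digits, or trailing junk
    if (errno == ERANGE || v < INT32_MIN || v > INT32_MAX) { return -1; } // overflow
    *out = (int32_t)v;
    return 0;
}

// Convenience wrapper for the usage example: falls back on error.
static int32_t parse_i32_or(const char* s, int32_t fallback)
{
    int32_t v = 0;
    return parse_i32(s, &v) == 0 ? v : fallback;
}
```

Three separate checks (consumed characters, errno, and target range) are needed for one conversion, which is the kind of boilerplate a single Exception-returning call removes.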

Dynamic strings / string builder

If you need to build a string dynamically you can use the sbuf_c type, which is a simple alias for char*, but with special logic attached. This type implements dynamic growing / shrinking, and formatting of strings with null-terminator.

Example

sbuf_c s = sbuf.create(5, mem$); // (1)

char* cex = "CEX";
e$ret(sbuf.appendf(&s, "Hello %s", cex)); // (2)
e$assert(str.ends_with(s, "CEX")); // (3)

sbuf.destroy(&s);

(1) Creates new dynamic string on heap, with 5 bytes initial capacity
(2) Appends text to string with automatic resize (memory reallocation)
(3) s variable of type sbuf_c is compatible with any char* routines, because it’s an alias of char*
Tip

If you need one-shot format for string try to use str.fmt(allocator, format, ...) inside temporary allocator mem$scope(tmem$, _)

sbuf namespace

    /// Append string to the builder
    Exc             sbuf.append(sbuf_c* self, char* s);
    /// Append format (using CEX formatting engine)
    Exc             sbuf.appendf(sbuf_c* self, char* format,...);
    /// Append format va (using CEX formatting engine), always null-terminating
    Exc             sbuf.appendfva(sbuf_c* self, char* format, va_list va);
    /// Returns string capacity from its metadata
    u32             sbuf.capacity(sbuf_c* self);
    /// Clears string
    void            sbuf.clear(sbuf_c* self);
    /// Creates new dynamic string builder backed by allocator
    sbuf_c          sbuf.create(usize capacity, IAllocator allocator);
    /// Creates dynamic string backed by static array
    sbuf_c          sbuf.create_static(char* buf, usize buf_size);
    /// Destroys the string, deallocates the memory, or nullify static buffer.
    sbuf_c          sbuf.destroy(sbuf_c* self);
    /// Returns false if string invalid
    bool            sbuf.isvalid(sbuf_c* self);
    /// Returns string length from its metadata
    u32             sbuf.len(sbuf_c* self);
    /// Shrinks string length to new_length
    Exc             sbuf.shrink(sbuf_c* self, usize new_length);
    /// Validate dynamic string state, with detailed Exception
    Exception       sbuf.validate(sbuf_c* self);

String formatting in CEX

All CEX routines with format strings (e.g. io.printf()/log$error()/str.fmt()) use CEX special formatting engine with extended features:

  • %S format specifier is used for printing string slices of str_s type
  • The %S format has sanity checks: if a simple string is passed in its place, it will print (%S-bad/overflow) in the text. However, this behavior is not guaranteed and depends on the platform.
  • %lu/%ld - formats are dedicated for printing 64-bit integers, they are not platform specific
  • %u/%d - formats are dedicated for printing 32-bit integers, they are not platform specific
  • Other formats should be compatible with vanilla libC.

Data structures and arrays

Data structures in CEX

C lacks built-in support for data structures, so typically it’s up to the developer to decide what to do. However, I noticed that many C projects tend to reimplement two core data structures over and over again, which are used in 90% of cases: dynamic arrays and hashmaps.

Key requirements of the CEX data structures:

  • Allocator based memory management - allowing you to decide memory model and tweak it anytime.
  • Type safety and LSP support - each DS must have a specific type and support LSP suggestions.
  • Generic types - DS must be generic.
  • Seamless C compatibility - allowing accessing CEX DS as plain C arrays and pass them as pointers.
  • Support any item type including overaligned.

Dynamic arrays

Dynamic arrays (a.k.a vectors or lists) are designed specifically for developer convenience and based on ideas of Sean Barrett’s STB DS.

What is dynamic array in CEX

Technically speaking it’s a simple C pointer T*, where T is any generic type. The memory for that pointer is allocated by allocator, and its length is stored at some byte offset before the address of the dynamic array head.

With this type representation we can get some useful benefits:

  • Array access with simple indexing, i.e. arr[i] instead of dynamic_arr_get_at(arr, i)
  • Passing by pointer into vanilla C code. For example, a function signature my_func(int* arr, usize arr_len) is compatible with arr$(int), so we can call it as my_func(arr, arr$len(arr))
  • Passing length information integrated into single pointer, arr$len(arr) extracts length from dynamic array pointer
  • Type safety out of the box and full LSP support without dealing with void*
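The "length stored before the pointer" layout can be sketched in plain C for a fixed int element type. The header fields, the growth factor and the vec_* names are assumptions made for illustration. The real arr$ metadata is not shown in this document, and a generic version would also have to handle over-aligned item types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical header that lives immediately before the first element.
typedef struct {
    size_t len;
    size_t cap;
} vec_header;

static vec_header* vec_hdr(int* v) { return (vec_header*)v - 1; }
static size_t vec_len(int* v) { return v ? vec_hdr(v)->len : 0; }

// Appends an item; takes int** because growth may move the array.
// Returns 0 on success, -1 on allocation failure.
static int vec_push(int** v, int item)
{
    size_t len = vec_len(*v);
    size_t cap = *v ? vec_hdr(*v)->cap : 0;
    if (len == cap) {
        size_t new_cap = cap ? cap * 2 : 4;
        vec_header* h = realloc(*v ? vec_hdr(*v) : NULL,
                                sizeof(vec_header) + new_cap * sizeof(int));
        if (h == NULL) { return -1; }
        h->len = len;
        h->cap = new_cap;
        *v = (int*)(h + 1); // user pointer starts right after the header
    }
    (*v)[vec_hdr(*v)->len++] = item;
    return 0;
}

static void vec_free(int** v)
{
    if (*v) { free(vec_hdr(*v)); *v = NULL; }
}

// A vanilla C function that knows nothing about the header.
static int sum_ints(const int* arr, size_t arr_len)
{
    int total = 0;
    for (size_t i = 0; i < arr_len; i++) { total += arr[i]; }
    return total;
}

// Usage sketch: push 1..10, hand the array to plain C code, clean up.
static int vec_demo(void)
{
    int* v = NULL;
    for (int i = 1; i <= 10; i++) {
        if (vec_push(&v, i) != 0) { vec_free(&v); return -1; }
    }
    int total = sum_ints(v, vec_len(v)); // v is passed as a plain int*
    vec_free(&v);
    return total;
}
```

The caller only ever holds an int*, so indexing and passing it to existing C functions need no adapter, while the length travels with the pointer.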

arr$ namespace

arr$ is completely macro-driven namespace, with generic type support and safety checks.

arr$ API:

Macro                                       Description
arr$(T)                                     Macro type definition, just for indication that it’s a dynamic array
arr$new(arr, allocator, kwargs…)            Initialization of the new instance of dynamic array
arr$free(arr)                               Dynamic array cleanup (if HeapAllocator was used)
arr$clear(arr)                              Clearing dynamic array contents
arr$push(arr, item)                         Adding new item to the end of array
arr$pushm(arr, item, item1, itemN)          Adding many new items to the end of array
arr$pusha(arr, other_arr, [other_arr_len])  Adding all items of another array to the end of array
arr$pop(arr)                                Returns last element and removes it
arr$at(arr, i)                              Returns element at index with boundary checks for i
arr$last(arr)                               Returns last element
arr$del(arr, i)                             Removes element at index (following data is moved to the i-th position)
arr$delswap(arr, i)                         Removes element at index, the removed element is replaced by the last one
arr$ins(arr, i, value)                      Inserts element at index
arr$grow_check(arr, add_len)                Grows array by add_len if needed
arr$sort(arr, qsort_cmp)                    Sorting array with qsort function
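The difference between arr$del and arr$delswap is easiest to see on a plain C array. The functions below are hypothetical illustrations of the two strategies described in the table, not the macros themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Ordered delete: shifts the tail left by one position, O(n), order is kept.
static void ints_del(int* arr, size_t* len, size_t i)
{
    if (i >= *len) { return; }
    memmove(&arr[i], &arr[i + 1], (*len - i - 1) * sizeof(arr[0]));
    (*len)--;
}

// Swap delete: overwrites slot i with the last item, O(1), order is not kept.
static void ints_delswap(int* arr, size_t* len, size_t i)
{
    if (i >= *len) { return; }
    arr[i] = arr[*len - 1];
    (*len)--;
}

// Usage sketch: delete index 1 from {1, 2, 3, 4} and report what lands there.
static int del_demo(int use_swap)
{
    int arr[] = { 1, 2, 3, 4 };
    size_t len = 4;
    if (use_swap) {
        ints_delswap(arr, &len, 1); // array becomes {1, 4, 3}
    } else {
        ints_del(arr, &len, 1);     // array becomes {1, 3, 4}
    }
    return arr[1];
}
```

Swap deletion is the better fit when element order does not matter, since it avoids moving the tail of the array.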

Examples

arr$(int) arr = arr$new(arr, mem$); // (1)
arr$push(arr, 1); // (2)
arr$pushm(arr, 2, 3, 4); // (3)
int static_arr[] = { 5, 6 };
arr$pusha(arr, static_arr /*, array_len (optional) */); // (4)

io.printf("arr[0]=%d\n", arr[0]); // prints arr[0]=1

// Iterate over array: prints lines 1 ... 6
for$each (v, arr) { // (5)
    io.printf("%d\n", v);
}

arr$free(arr); // (6)

(1) Initialization and allocator
(2) Adding single element
(3) Adding multiple elements via vargs.
(4) Adding arbitrary array, supports static arrays, dynamic CEX arrays or int*+arr_len
(5) Array iteration via for$each is common and compatible with all arrays in Cex (dynamic, static, pointer+len)
(6) Deallocating memory (only needed when HeapAllocator is used)

test$case(test_overaligned_struct)
{
    struct test32_s
    {
        alignas(32) usize s;
    };

    arr$(struct test32_s) arr = arr$new(arr, mem$);
    struct test32_s f = { .s = 100 };
    tassert(mem$aligned_pointer(arr, 32) == arr);

    for (u32 i = 0; i < 1000; i++) {
        f.s = i;
        arr$push(arr, f);

        tassert_eq(arr$len(arr), i + 1);
    }
    tassert_eq(arr$len(arr), 1000);

    for (u32 i = 0; i < 1000; i++) {
        tassert_eq(arr[i].s, i);
        tassert(mem$aligned_pointer(&arr[i], 32) == &arr[i]);
    }

    arr$free(arr);
    return EOK;
}

test$case(test_array_char_ptr)
{
    arr$(char*) array = arr$new(array, mem$);
    arr$push(array, "foo");
    arr$push(array, "bar");
    arr$pushm(array, "baz", "CEX", "is", "cool");
    for (usize i = 0; i < arr$len(array); ++i) { io.printf("%s \n", array[i]); }
    arr$free(array);

    return EOK;
}

mem$scope(tmem$, _) // (1)
{
    arr$(char*) incl_path = arr$new(incl_path, _, .capacity = 128); // (2)
    for$each (p, alt_include_path) {
        arr$push(incl_path, p); // (3)
        if (!os.path.exists(p)) { log$warn("alt_include_path not exists: %s\n", p); }
    }
} // (4)

(1) Initializes a temporary allocator (tmem$) scope in mem$scope(tmem$, _) {...} and assigns it as a variable _ (you can use any name).
(2) Initializes dynamic array with the scoped allocator variable _, allocates with specific capacity argument.
(3) May allocate memory
(4) All memory will be freed at exit from this scope

Hashmaps

Hashmaps (hm$) in CEX are backed by structs with key and value fields. Essentially, they are plain dynamic arrays of structs (iterable values) with a hash table part that implements key hashing.

Hashmaps in CEX are also generic, so you may use any type of keys or values. However, there is special handling for string keys (char*, or str_s CEX slices). By default, string keys are not copied by the hashmap and are stored by reference, so you’ll have to keep their allocation stable.

Hashmap initialization is similar to dynamic arrays: you define the type and call hm$new.

Array compatibility

Hashmaps in CEX are backed by dynamic arrays, which leads to the following developer experience enhancements:

  • arr$len can be applied to hashmaps for checking number of available elements
  • for$each/for$eachp can be used for iteration over hashmap key/values pairs
  • Hashmap items can be accessed as arrays with index

Initialization

There are several ways for declaring hashmap types:

  1. Local function hashmap variables
    hm$(char*, int) intmap = hm$new(intmap, mem$);
    hm$(const char*, int) cmap = hm$new(cmap, mem$);
    hm$(struct my_struct, int) map = hm$new(map, mem$);
  2. Global hashmaps with special types

// NOTE: struct must have .key and .value fields
typedef struct
{
    int key;
    float my_val;
    char* my_string;
    int value;
} my_hm_struct;

void foo(void) {
    // NOTE: this is equivalent of my_hm_struct* map = ...
    hm$s(my_hm_struct) map = hm$new(map, mem$);
}

void my_func(hm$s(my_hm_struct)* map) {
    // NOTE: passing hashmap type, parameter
    int v = hm$get(*map, 1);

    // NOTE: hm$set() may resize map, because of this we use `* map` argument, for keeping pointer valid!
    hm$set(*map, 3, 4);

    // Setting entire structure
    hm$sets(*map, (my_hm_struct){ .key = 5, .my_val = 3.14, .my_string = "cexy", .value = 98 });
}
  3. Declaring hashmap as type
typedef hm$(char*, int) MyHashMap;

struct my_hm_struct {
    MyHashMap hm;
};

void foo(void) {
    // Initializing a new variable
    MyHashMap map = hm$new(map, mem$);
    
    // Initializing hashmap as a member of struct
    struct my_hm_struct hs = {0};
    hm$new(hs.hm, mem$);

}

Hashmap API

Macro                           Description
hm$new(hm, allocator, kwargs…)  Initialization of hashmap
hm$set(hm, key, value)          Set element
hm$setp(hm, key, value)         Set element and return pointer to the newly added item inside hashmap
hm$sets(hm, struct_value…)      Set entire element as backing struct
hm$get(hm, key)                 Get a value by key (as a copy)
hm$getp(hm, key)                Get a value by key as a pointer to hashmap value
hm$gets(hm, key)                Get a value by key as a pointer to a backing struct
hm$clear(hm)                    Clears contents of hashmap
hm$del(hm, key)                 Delete element by key
hm$len(hm)                      Number of elements in hashmap / arr$len() also works

Initialization params

hm$new accepts optional params which may help you to adjust hashmap key behavior:

  • .capacity=16 - initial capacity of the hashmap, will be rounded to the closest power of 2
  • .seed= - initial seed for hashing algorithm
  • .copy_keys=false - enabling copy of char* keys and storing them specifically in hashmap
  • .copy_keys_arena_pgsize=0 - enabling using arena for copy_keys mode

Example:

test$case(test_hashmap_string_copy_arena)
{
    hm$(char*, int) smap = hm$new(smap, mem$, .copy_keys = true, .copy_keys_arena_pgsize = 1024);

    char key2[10] = "foo";

    hm$set(smap, key2, 3);
    tassert_eq(hm$len(smap), 1);
    tassert_eq(hm$get(smap, "foo"), 3);
    tassert_eq(hm$get(smap, key2), 3);
    tassert_eq(smap[0].key, "foo");

    // Initial buffer gets destroyed, but hashmap keys remain the same
    memset(key2, 0, sizeof(key2));

    tassert_eq(smap[0].key, "foo");
    tassert_eq(hm$get(smap, "foo"), 3);

    hm$free(smap);
    return EOK;
}

Examples

hm$(char*, int) smap = hm$new(smap, mem$);
hm$set(smap, "foo", 3);
hm$get(smap, "foo");
hm$len(smap);
hm$del(smap, "foo");
hm$free(smap);

test$case(test_hashmap_string)
{
    char key_buf[10] = "foobar";

    hm$(char*, int) smap = hm$new(smap, mem$);

    char* k = "foo";
    char* k2 = "baz";

    char key_buf2[10] = "foo";
    char* k3 = key_buf2;
    hm$set(smap, "foo", 3);

    tassert_eq(hm$len(smap), 1);
    tassert_eq(hm$get(smap, "foo"), 3);
    tassert_eq(hm$get(smap, k), 3);
    tassert_eq(hm$get(smap, key_buf2), 3);
    tassert_eq(hm$get(smap, k3), 3);

    tassert_eq(hm$get(smap, "bar"), 0);
    tassert_eq(hm$get(smap, k2), 0);
    tassert_eq(hm$get(smap, key_buf), 0);

    tassert_eq(hm$del(smap, key_buf2), 1);
    tassert_eq(hm$len(smap), 0);

    hm$free(smap);
    return EOK;
}

test$case(test_hashmap_basic_iteration)
{
    hm$(int, int) intmap = hm$new(intmap, mem$);
    hm$set(intmap, 1, 10);
    hm$set(intmap, 2, 20);
    hm$set(intmap, 3, 30);

    tassert_eq(hm$len(intmap), 3);  // special len
    tassert_eq(arr$len(intmap), 3); // NOTE: arr$len is compatible

    // Iterating by value (data is copied)
    u32 nit = 1;
    for$each (it, intmap) {
        tassert_eq(it.key, nit);
        tassert_eq(it.value, nit * 10);
        nit++;
    }

    // Iterating by pointers (data by reference)
    for$eachp(it, intmap)
    {
        isize _nit = (it - intmap) + 1; // deriving the 1-based key from the pointer offset
        tassert_eq(it->key, _nit);
        tassert_eq(it->value, _nit * 10);
    }

    hm$free(intmap);

    return EOK;
}

Working with arrays

Arrays are probably the most used concept in any language, and in C arrays may take many different forms. Unfortunately, the main problem of working with arrays in C is the specialization of methods and operations: each type of array may require a special iteration macro, or a function for getting the array length or an element.

Collection types in C:

  • Static arrays i32 arr[10]
  • Dynamic arrays as pointers (i32* arr, usize arr_len)
  • Custom dynamic arrays dynamic_array_push_back(&int_array, &i);
  • Char buffers char buf[1024]
  • Null-terminated strings and slices
  • Hashmaps

Cex tries to solve this by unifying all array operations around standard design principles, without getting too far away from standard C.

arr$len unified length

The arr$len(array) macro is the ultimate tool for getting lengths of arrays in CEX. It supports: static arrays, char buffers, string literals, dynamic arrays of CEX arr$ and hashmaps of CEX hm$. It is also a NULL resilient macro, which returns 0 if the array argument is NULL.

Note

Not all array pointers are supported by arr$len (only dynamic arrays or hashmaps are valid). However, in debug mode arr$len will raise an assertion/ASAN crash if you pass a wrong pointer type there.

Example:

test$case(test_array_len)
{
    arr$(int) array = arr$new(array, mem$);
    arr$pushm(array, 1, 2, 3);

    // Works with CEX dynamic arrays
    tassert_eq(arr$len(array), 3);

    // NULL is supported, and emits 0 length
    arr$free(array);
    tassert(array == NULL); 
    tassert_eq(arr$len(array), 0); // NOTE: NULL array - len = 0

    // Works with static arrays
    char buf[] = {"hello"}; 
    tassert_eq(arr$len(buf), 6); // NOTE: includes null term

    // Works with arrays of given capacity
    char buf2[10] = {0};
    tassert_eq(arr$len(buf2), 10);

    // Type doesn't matter
    i32 a[7] = {0};
    tassert_eq(arr$len(a), 7);

    // Works with string literals
    tassert_eq(arr$len("CEX"), 4); // NOTE: includes null term

    // Works with CEX hashmap
    hm$(int, int) intmap = hm$new(intmap, mem$);
    hm$set(intmap, 1, 3);
    tassert_eq(arr$len(intmap), 1);

    hm$free(intmap);

    return EOK;
}
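The static-array part of this behavior can be reproduced in plain C with the classic sizeof idiom, which also explains why the null terminator is counted. STATIC_LEN and the helper functions are hypothetical names, and this sketch does not cover the dynamic array, hashmap or NULL handling of arr$len.

```c
#include <assert.h>
#include <stddef.h>

// Element count of a true array (not a pointer), resolved at compile time.
#define STATIC_LEN(a) (sizeof(a) / sizeof((a)[0]))

// "hello" occupies 5 characters plus the null terminator.
static size_t len_char_buf(void)
{
    char buf[] = { "hello" };
    return STATIC_LEN(buf);
}

// The element type does not matter, only the declared size.
static size_t len_int_array(void)
{
    int a[7] = { 0 };
    return STATIC_LEN(a);
}
```

Applied to a plain pointer, the same idiom silently divides the pointer size by the element size, which is why a pointer that is not a CEX dynamic array or hashmap needs an explicit length.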

Accessing elements of array is unified

test$case(test_array_access)
{
    arr$(int) array = arr$new(array, mem$);
    arr$pushm(array, 1, 2, 3);

    // Dynamic array access is natural C index
    tassert_eq(array[2], 3);
    // tassert_eq(arr$at(array, 3), 3); // NOTE: this is bounds checking access, with assertion 
    arr$free(array);

    // Works with static arrays
    char buf[] = {"hello"}; 
    tassert_eq(buf[1], 'e'); 

    // Works with CEX hashmap
    hm$(int, int) intmap = hm$new(intmap, mem$);
    hm$set(intmap, 1, 3);
    hm$set(intmap, 2, 5);
    tassert_eq(arr$len(intmap), 2);

    // Accessing hashmap as array
    // NOTE: hashmap elements are ordered until first deletion
    tassert_eq(intmap[0].key, 1);
    tassert_eq(intmap[0].value, 3);

    tassert_eq(intmap[1].key, 2);
    tassert_eq(intmap[1].value, 5);

    hm$free(intmap);

    return EOK;
}

CEX way of iteration over arrays

CEX introduces unified for$* macros which help with looping. These are typical patterns for iteration:

  • for$each(it, array, [array_len]) - iterates over array, it represents value of array item. array_len is optional and uses arr$len(array) by default, or you might explicitly set it for iterating over arbitrary C pointer+len arrays.
  • for$eachp(it, array, [array_len]) - iterates over array, it represents a pointer to array item. array_len is inferred by default.
  • for$iter(it_val_type, it, iter_funct) - a special iterator for non-indexable collections or function based iteration, tailored for customized iteration of unknown length.
  • for(usize i = 0; i < arr$len(array); i++) - classic also works :)

test$case(test_array_iteration)
{
    arr$(int) array = arr$new(array, mem$);
    arr$pushm(array, 1, 2, 3);

    i32 nit = 0; // it's only for testing
    for$each(it, array) {
        tassert_eq(it, ++nit);
        io.printf("el=%d\n", it);
    }
    // Prints: 
    // el=1
    // el=2
    // el=3

    nit = 0;
    // NOTE: prefer this when you work with bigger structs to avoid extra memory copying
    for$eachp(it, array) {
        // TIP: making array index out of `it`
        usize i = it - array;
        tassert_eq(i, nit);

        // NOTE: it now is a pointer
        tassert_eq(*it, ++nit);
        io.printf("el[%zu]=%d\n", i, *it);
    }
    // Prints: 
    // el[0]=1
    // el[1]=2
    // el[2]=3

    // Static arrays work as well (arr$len inferred)
    i32 arr_int[] = {1, 2, 3, 4, 5};
    for$each(it, arr_int) {
        io.printf("static=%d\n", it);
    }
    // Prints:
    // static=1
    // static=2
    // static=3
    // static=4
    // static=5


    // Simple pointer+length also works (let's do a slice)
    i32* slice = &arr_int[2];
    for$each(it, slice, 2) {
        io.printf("slice=%d\n", it);
    }
    // Prints:
    // slice=3
    // slice=4

    arr$free(array);
    return EOK;
}

Making custom collection iterators

It’s possible to make a custom iterator, specifically for unbounded collections or sparse data structures. This kind of iteration has higher overhead than a simple for$each loop, but sometimes it’s necessary.

Note

By convention, consider using the iter_ prefix in the function name, it’s a good indicator that the function is meant for for$iter()

An example of how str.slice.iter_split() is implemented:



typedef struct
{
    struct
    {
        union
        {
            usize i;
            char* skey;
            void* pkey;
        };
    } idx;
    char _ctx[47]; // <<< use this buffer to store iterator state, it's usize aligned
    u8 stopped;
    u8 initialized;
} cex_iterator_s;
static_assert(sizeof(cex_iterator_s) <= 64, "cex size");

static str_s
cex_str__slice__iter_split(str_s s, char* split_by, cex_iterator_s* iterator)
{
    uassert(iterator != NULL && "null iterator");
    uassert(split_by != NULL && "null split_by");

    // temporary struct based on the _ctx buffer
    struct iter_ctx
    {
        usize cursor;
        usize split_by_len;
        usize str_len;
    }* ctx = (struct iter_ctx*)iterator->_ctx;
    static_assert(sizeof(*ctx) <= sizeof(iterator->_ctx), "ctx size overflow");
    static_assert(alignof(struct iter_ctx) <= alignof(usize), "cex_iterator_s _ctx misalign");

    if (unlikely(!iterator->initialized)) {
        // First run handling

        iterator->initialized = 1;
        if (unlikely(!_cex_str__isvalid(&s) || s.len == 0)) {
            iterator->stopped = 1;
            return (str_s){ 0 };
        }
        ctx->split_by_len = strlen(split_by);
        uassert(ctx->split_by_len < UINT8_MAX && "split_by is suspiciously long!");

        if (ctx->split_by_len == 0) {
            iterator->stopped = 1;
            return (str_s){ 0 };
        }

        isize idx = _cex_str__index(&s, split_by, ctx->split_by_len);
        if (idx < 0) { idx = s.len; }
        ctx->cursor = idx;
        ctx->str_len = s.len; // this prevents s being changed in a loop
        iterator->idx.i = 0;
        if (idx == 0) {
            // string starts with the separator, so the first token is empty
            return (str_s){ .buf = "", .len = 0 };
        } else {
            return str.slice.sub(s, 0, idx);
        }
    } else {
        if (unlikely(ctx->cursor >= ctx->str_len)) {
            iterator->stopped = 1;
            return (str_s){ 0 };
        }
        ctx->cursor++;
        if (unlikely(ctx->cursor == ctx->str_len)) {
            // edge case, we have separator at last col
            // it's not an error, return empty split token
            iterator->idx.i++;
            return (str_s){ .buf = "", .len = 0 };
        }

        // Get remaining string after prev split_by char
        str_s tok = str.slice.sub(s, ctx->cursor, 0);
        isize idx = _cex_str__index(&tok, split_by, ctx->split_by_len);

        iterator->idx.i++;

        if (idx < 0) {
            // No more splits, return remaining part
            ctx->cursor = s.len;
            // iterator->stopped = 1;
            return tok;
        } else if (idx == 0) {
            return (str_s){ .buf = "", .len = 0 };
        } else {
            // Sub from prev cursor to idx (excluding split char)
            ctx->cursor += idx;
            return str.slice.sub(tok, 0, idx);
        }
    }
}
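
The state-in-a-buffer pattern used above can be reduced to a few lines of plain C. The sketch below is not CEX code: demo_iter_s, demo_slice_s, demo_iter_split() and demo_count_tokens() are hypothetical names, it splits on a single character only, and it copies the state in and out of the buffer with memcpy() instead of casting the buffer. Unlike the CEX implementation, it yields one empty token for an empty input string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A non-owning view into a string (hypothetical stand-in for str_s). */
typedef struct { const char* buf; size_t len; } demo_slice_s;

/* Fixed-size iterator: public index, opaque state, and two flags. */
typedef struct {
    size_t index;              /* zero-based number of the current token */
    char ctx[47];              /* opaque state owned by the iterator function */
    unsigned char stopped;     /* set to 1 when iteration is over */
    unsigned char initialized; /* set to 1 after the first call */
} demo_iter_s;

/* State kept between calls; it must fit into demo_iter_s.ctx. */
struct demo_split_ctx { size_t cursor; size_t text_len; };
_Static_assert(sizeof(struct demo_split_ctx) <= sizeof(((demo_iter_s*)0)->ctx),
               "ctx size overflow");

static demo_slice_s
demo_iter_split(const char* text, char sep, demo_iter_s* it)
{
    assert(text != NULL && it != NULL);
    struct demo_split_ctx c = { 0, 0 };

    if (!it->initialized) {
        /* First call: set up the state */
        it->initialized = 1;
        it->index = 0;
        c.text_len = strlen(text);
    } else {
        /* Later calls: restore the state saved by the previous call */
        memcpy(&c, it->ctx, sizeof(c));
        if (c.cursor > c.text_len) {
            it->stopped = 1;
            return (demo_slice_s){ NULL, 0 };
        }
        it->index++;
    }

    /* The token runs from the cursor to the next separator or end of text */
    size_t start = c.cursor;
    size_t end = start;
    while (end < c.text_len && text[end] != sep) { end++; }

    c.cursor = end + 1; /* skip the separator; exceeds text_len after the last token */
    memcpy(it->ctx, &c, sizeof(c));
    return (demo_slice_s){ text + start, end - start };
}

/* Usage: drive the iterator until it reports `stopped`. */
static size_t
demo_count_tokens(const char* text, char sep)
{
    demo_iter_s it = { 0 };
    size_t n = 0;
    for (demo_slice_s tok = demo_iter_split(text, sep, &it); !it.stopped;
         tok = demo_iter_split(text, sep, &it)) {
        (void)tok;
        n++;
    }
    return n;
}
```

The caller only zero-initializes the iterator and checks the stopped flag; everything else lives in the opaque buffer, which is why the iterator function can change its internal state layout without touching call sites.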

Namespaces

Naming collisions will always remain a problem in C. However, we can reduce the surface for conflicts by aggregating prefixed functions into nice-looking namespace symbols. The primary role of the CEX namespacing approach, though, is to keep the project structure organized and easier to navigate and understand. Another benefit is reduced cognitive load when recalling a function name while typing with LSP; we’ll see this effect below. Finally, namespaces add an OOP-ish flavor to our structures, which can behave like classes.

Key features of namespaces

  • They can be generated automatically from a .c file, so there is no need to update the .h file for every function signature change.
  • They help maintain naming conventions: the name of the .c file must be the same as the namespace.
  • They reduce the surface for name collisions: only the global namespace name is exposed.
  • They support sub-namespaces, which are easier to remember and type with LSP.
  • Fewer symbols in LSP suggestions.
  • Better readability with the . separator, and color highlighting of different parts of the function call.
  • The namespace structure keeps function signatures together in one place, so it’s easier to figure out which functions are available.

CEX Namespaces in a nutshell

A CEX namespace is a global const struct with function pointers in it.


#define CEX_NAMESPACE __attribute__((visibility("hidden"))) extern const

typedef struct  {...} KeyMap_c;

struct __cex_namespace__KeyMap {
    // Autogenerated by CEX
    // clang-format off

    /// NOTE: Cex may generate brief doc string here, if was added prior function implementation in .c file
    Exception       (*create)(KeyMap_c* self, char* input_dev_or_name);
    // Destroys KeyMap instance
    void            (*destroy)(KeyMap_c* self);
    Exception       (*find_mapped_keyboard)(KeyMap_c* self, char* keyboard_name);
    Exception       (*handle_events)(KeyMap_c* self);
    Exception       (*handle_key)(KeyMap_c* self, struct input_event* ev);
    Exception       (*handle_mouse_move)(KeyMap_c* self);

    // clang-format on
};

CEX_NAMESPACE struct __cex_namespace__KeyMap KeyMap;

So the KeyMap namespace allows the following usage:


KeyMap_c keymap = { 0 };                       // (1)
e$goto(KeyMap.create(&keymap, file), end);     // (2)
e$goto(KeyMap.handle_events(&keymap), end);    // (3)

  1. The _c suffix of KeyMap_c indicates a type with a namespace attached; conceptually it can be read as “class” or “has code”.
  2. Functions of the KeyMap namespace are separated by dots. This is easier to read and to type with LSP, because completion shows only relevant entries. See the figures below.
  3. Dotted notation may get distinct color highlighting, which helps to distinguish the namespace from its function.
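
The same idea can be tried without cex.h. The sketch below is a hand-written approximation of what a namespace amounts to in plain C; counter_c, counter and struct demo_namespace_counter are hypothetical names, and the real generated code uses the CEX_NAMESPACE macro and naming shown above.

```c
#include <assert.h>
#include <stddef.h>

/* The "class" data: a plain struct with the _c suffix. */
typedef struct { int value; } counter_c;

/* Implementation functions stay static, so only `counter` is a global symbol. */
static void
counter_reset(counter_c* self)
{
    assert(self != NULL);
    self->value = 0;
}

static int
counter_add(counter_c* self, int delta)
{
    assert(self != NULL);
    self->value += delta;
    return self->value;
}

/* The namespace type: one function pointer per exposed function. */
struct demo_namespace_counter {
    void (*reset)(counter_c* self);
    int (*add)(counter_c* self, int delta);
};

/* The namespace itself: a global const struct filled with the static functions. */
const struct demo_namespace_counter counter = {
    .reset = counter_reset,
    .add = counter_add,
};
```

With this in place, calls read as counter.reset(&c) and counter.add(&c, 3), and the only exported name is counter.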

LSP Suggestions are much better

  1. If you start typing a conventional KeyMap_create() function name, the LSP suggestions get cluttered, and fuzzy matching may not return what you want.
  2. With a CEX namespace you get only the list of KeyMap functions, and fuzzy matching works much better because the options are limited.

Figure 1: LSP suggestions: (a) vanilla, (b) KeyMap

Sub-namespaces

Sometimes libraries or namespaces have dozens of functions, so it’s more convenient to add an extra level of namespacing. For example, the CEX str namespace has many functions, which are grouped by functionality: str.slice. works with str_s types, str.convert. deals with conversions, and some functions live in the root namespace, for example str.find().

Sub-namespaces help to build a mental model of the code and let you write function names as a decision tree. For example: I need a string function, so I type str.; I’m dealing with a slice, so I type slice.; then I pick the exact function I need.

Figure 2: LSP for sub-namespaces: (a) start, (b) slice, (c) function

Note

Check the full list of str namespace options with ./cex help str$

How to make a namespace

Making code

For example, suppose you need to add a new foo namespace:

  1. Create a pair of files, src/foo.c and src/foo.h.
  2. Create static functions foo_fun1(), foo_fun2(), foo__bar__fun3(), foo__bar__fun4(). These functions will be processed and wrapped into a foo namespace, so you can access them via foo.fun1(), foo.fun2(), foo.bar.fun3(), foo.bar.fun4().
  3. Run ./cex process src/foo.c

Requirements / caveats:

  1. foo.c and foo.h must be in the same folder.
  2. The file name must start with foo, the namespace prefix.
  3. Each function in foo.c that you’d like to add to the namespace must start with foo_.
  4. To add a sub-namespace, use the foo__subname__ prefix.
  5. Only one level of sub-namespaces is allowed.
  6. You don’t have to declare function signatures in the header; static functions in the .c file are enough.
  7. static inline functions are not included in the namespace.
  8. Functions with the foo__some prefix are considered internal and are not included.
  9. A new namespace is created only when you pass the exact src/foo.c argument; all only updates existing namespaces.
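
To make the naming rules concrete, here is a plain-C sketch of how prefixed static functions could map onto a namespace with one sub-namespace. It is a hand-written approximation, not the output of ./cex process: struct demo_namespace_foo and demo_foo_selfcheck() are hypothetical, and the real generated struct and its name may differ.

```c
#include <assert.h>

/* --- what you would write in src/foo.c --- */

/* foo_ prefix: becomes foo.fun1() */
static int
foo_fun1(int x)
{
    return x + 1;
}

/* foo__bar__ prefix: becomes foo.bar.fun3() */
static int
foo__bar__fun3(int x)
{
    return x * 3;
}

/* --- approximation of the namespace declared in src/foo.h --- */
struct demo_namespace_foo {
    int (*fun1)(int x);
    struct {
        int (*fun3)(int x);
    } bar; /* one level of sub-namespace */
};

const struct demo_namespace_foo foo = {
    .fun1 = foo_fun1,
    .bar = { .fun3 = foo__bar__fun3 },
};

/* Usage: callers only ever spell the dotted names. */
static int
demo_foo_selfcheck(void)
{
    assert(foo.fun1(1) == 2);
    assert(foo.bar.fun3(2) == 6);
    return foo.fun1(1) + foo.bar.fun3(2);
}
```

The sub-namespace is just a nested anonymous struct, which is consistent with the rule that only one extra level is supported.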

Style conventions

In my experience, it’s helpful to distinguish between the kinds of code (namespaces) we are dealing with. Nothing strict, just guidelines:

  1. Sometimes we need OOP-ish / object / class behavior, which wraps a typedef struct MyClass_c with a constructor MyClass.create() and a destructor MyClass.destroy(MyClass_c* self). This type of code should be placed into MyClass.c/MyClass.h files.
  2. Sometimes we need just a bunch of functions that are logically grouped together and deal with different sets of types. Then we should use a lower-case name such as foo or my_namespace; the CEX str namespace is an example. You may also want to add the _c suffix to a typedef struct my_type_c to indicate that it has namespace code attached, e.g. my_type.some_func().

CLI Commands

# More help
 ./cex process --help

# Creates new `foo` namespace or update existing one 
 ./cex process src/foo.c 

# Update all existing namespaces in the project
#   use it after you change signatures of your functions
 ./cex process all

Special notes

Performance questions

Because namespaces are static structures of function pointers, calling through them may cost some performance when compiler optimizations are disabled (similar to the cost of C++ virtual function calls). However, modern compilers are smart enough to replace the function pointer call with a direct function call when -O1 optimization is enabled.

Getting LSP help / goto definition

CEX namespaces work pretty well with the clangd LSP server. However, LSP help functionality is limited: we only get the list of parameters on completion.

Go to definition works, but with some caveats. If you place the cursor (|) at KeyMap.cre|ate and invoke go to definition, it jumps to the KeyMap structure type. If you need to go to the implementation, place the cursor like this: KeyM|ap.create, and go to definition will jump to the KeyMap struct implementation inside the KeyMap.c file. Then find the KeyMap_create record and jump to it once again.

Sharing namespaces as library

If you are going to make a shared library (.so or .dll), CEX namespaces are probably not the best fit. They should work, but you will probably take the performance hit of indirect function calls. In this particular case, it’s better to use vanilla C functions.

Build system

CEX has an integrated build system, cexy$, inspired by Zig build and Tsoding’s nob.h. It allows you to build your project without depending on CMake/Make/Ninja/Meson. For small projects cexy has a simplified, config-driven mode. For complex or cross-platform projects cexy provides low-level tools for running the compiler and implementing project-specific build logic.

How it works

  1. Create a cex.c file, which is the entry point for the whole build process and cexy tools. For a new project, if cex.c is not there, run the bootstrapping routine:
cc -D CEX_NEW -x c ./cex.h -o ./cex
  2. Then compile the cex CLI with the following command:
cc ./cex.c -o ./cex
  3. Afterwards you should have the ./cex executable in the project directory. It’s your main entry point for CEX project management, and your project is ready to go.

Now you can launch a sample program or run its unit tests.

./cex test run all
./cex app run myapp

Key features of the cexy$ CLI tool

  • Main project management CLI: building, running unit tests, fuzzing, stats, etc.
  • Generates new apps or projects
  • Generates CEX namespaces for user code
  • Fuzzy search for help in the user code base
  • Supports custom command runners
  • Supports build-mode configuration
  • Provides OS-related operations with files, paths, command launching, etc.
  • Adds support for external dependencies via pkg-config and vcpkg
  • Unit test and fuzzer runner
  • Fetches 3rd party code, updates cex.h itself or cex lib via git

Simple-mode

cexy$ has a built-in build routine for building, running and debugging apps, and for running unit tests and fuzzers. It can be configured using #define cexy$<config-constant-here> in your cex.c file.

Getting cexy$ help about config variables
# full list of cexy API namespace and cexy$ variables
./cex help cexy$
# list of actual values for cexy$ vars in current project
./cex config

When you run ./cex app run|test|fuzz myapp, it uses cexy$ config vars internally and runs a build routine which may cover about 80% of generic project needs.

Simple mode adds several project structure constraints:

  1. Source code should be in the src/ directory.
  2. If you have a myapp application, its main() function should be located at src/myapp.c or src/myapp/main.c.
  3. Simple mode uses a unity build approach, so all your sources have to be included, e.g. #include "src/foo.c", in src/myapp.c.
  4. Simple mode does not produce object files and has no separate linking stage. This is intentional, and in my opinion it works better for small and medium (<100k LOC) projects.

Project configuration

cexy$ is configured by setting constants in header files, which can also be compiled directly as C code in your project. Use ./cex config to check the current project configuration. The configuration can optionally be included as cex_config.h (or any other name), or set directly in the cex.c file.

You can change the pre-defined cexy config with ./cex -D<YOUR_VAR> config. It recompiles the cex CLI with the new settings, and all subsequent ./cex calls will use them. You may reset to defaults with ./cex -D config.

// file: cex.c

#if __has_include("cex_config.h")
// Custom config file
#    include "cex_config.h"
#else
// Overriding config values
#    if defined(CEX_DEBUG)
#        define CEX_LOG_LVL 4 /* 0 (mute all) - 5 (log$trace) */
#    else
#        define cexy$cc_args "-Wall", "-Wextra", "-Werror", "-g", "-O3", "-fwhole-program"
#    endif
#endif

# Check current config (CEX_DEBUG not set, using -O3 gcc/clang argument)
./cex config
>>>
* cexy$cc_args              "-Wall", "-Wextra", "-Werror", "-g", "-O3", "-fwhole-program"
* ./cex -D<ARGS> config     ""
<<<

# Using CEX_DEBUG from `cex.c` (step 1 tab), you may use any name
./cex -DCEX_DEBUG config

# Check what's changed
./cex config
>>>
* cexy$cc_args              "-Wall", "-Wextra", "-Werror", "-g3", "-fsanitize-address-use-after-scope", "-fsanitize=address", "-fsanitize=undefined", "-fsanitize=leak", "-fstack-protector-strong"
* ./cex -D<ARGS> config     "-DCEX_DEBUG "
<<<

# Revert previous config back
./cex -D config

Minimalist cexy build system

If you wish, you can build using your own logic. Let’s make a simple custom build command without using the cexy machinery.

// file: cex.c

#define CEX_IMPLEMENTATION
#define CEX_BUILD
#include "cex.h"

Exception cmd_mybuild(int argc, char** argv, void* user_ctx);

int
main(int argc, char** argv)
{

    cexy$initialize(); // cex self rebuild and init
    argparse_c args = {
        .description = cexy$description,
        .epilog = cexy$epilog,
        .usage = cexy$usage,
        argparse$cmd_list(
            cexy$cmd_all,
            // cexy$cmd_fuzz, /* disable built-in commands */
            // cexy$cmd_test, /* disable built-in commands */
            // cexy$cmd_app,  /* disable built-in commands */
            { .name = "my-build", .func = cmd_mybuild, .help = "My Custom build" },
        ),
    };
    if (argparse.parse(&args, argc, argv)) { return 1; }
    void* my_user_ctx = NULL; // passed as `user_ctx` to command
    if (argparse.run_command(&args, my_user_ctx)) { return 1; }
    return 0;
}

Exception cmd_mybuild(int argc, char** argv, void* user_ctx) {
    log$info("Launching my-build command\n");
    e$ret(os$cmd("gcc", "-Wall", "-Wextra", "hello.c", "-o", "hello"));
    return EOK;
}

// file: hello.c

#define CEX_IMPLEMENTATION
#include "cex.h"

int
main(int argc, char** argv)
{
    (void)argc;
    (void)argv;
    io.printf("Hello from CEX\n");
    return 0;
}

~ ➜ ./cex my-build
[INFO]    ( cex.c:50 cmd_mybuild() ) Launching my-build command
[DEBUG]   ( cex.c:51 cmd_mybuild() ) CMD: gcc -Wall -Wextra hello.c -o hello
~ ➜ ./hello
Hello from CEX

Getting cexy logic

You can use the cexy build source directly and adjust it if needed. To extract the source code, run ./cex help --source cexy.cmd.simple_app

Dependency management

Dependencies are always a pain point. They go against the CEX philosophy, but sometimes they are a necessary evil. CEX can use pkgconf-compatible utilities and the vcpkg framework. Check the examples/ folder in the cex git repo, which contains a couple of sample projects with dependencies. Dependencies on Windows are hell; try to use MSYS2 or vcpkg.

Currently pkgconf/vcpkg dependencies are supported in simple mode; for a custom build you have to figure out how to integrate the cexy$pkgconf() macro yourself.

Here is an excerpt of a libcurl+libzip build for Linux, macOS and Windows:

// file: cex.c
#    define cexy$pkgconf_libs "libcurl", "libzip"
#    define CEX_LOG_LVL 4 /* 0 (mute all) - 5 (log$trace) */

#    if _WIN32
// using mingw libs .a
#        define cexy$build_ext_lib_stat ".a"
// NOTE: windows is a special case, the best way to manage dependencies to have vcpkg
//       you have to manually install vcpkg and configure paths. Currently it uses static
//       environment and mingw because it was tested under MSYS2
//
//  Also install the following in `classic` mode:
//  > vcpkg install --triplet=x64-mingw-static curl
//  > vcpkg install --triplet=x64-mingw-static libzip

#        define cexy$vcpkg_triplet "x64-mingw-static"
#        define cexy$vcpkg_root "c:/vcpkg/"
#    else
// NOTE: linux / macos will use system wide libs
//       make sure you installed libcurl-dev libzip-dev via package manager
//       names of packages will depend on linux distro and macos home brew.
#    endif
#endif

Cross-platform builds

At compile time you may use platform-specific constants, for example #ifdef _WIN32, or you can set an arbitrary config define that switches to platform-specific logic. CEX also has the os.platform. sub-namespace for runtime platform checks:

#    if _WIN32
// using mingw libs .a
#        define cexy$build_ext_lib_stat ".a"
#        define cexy$vcpkg_triplet "x64-mingw-static"
#    elif defined(__APPLE__) || defined(__MACH__)
#        define cexy$vcpkg_triplet "arm64-osx"
#    else
#        define cexy$vcpkg_triplet "x64-linux"
#    endif
#endif

// NOTE: activate with the following command
// ./cex -DCEX_WIN config

// file: cex.c
#ifdef CEX_WIN
#    define cexy$cc "x86_64-w64-mingw32-gcc"
#    define cexy$cc_args_sanitizer "-g3"
#    define cexy$debug_cmd "wine"
#    define cexy$build_ext_exe ".exe"
#endif

// platform-dependent compilation flags (runtime)
// file: cex.c (as a part of custom build command)

arr$(char*) args = arr$new(args, _);
arr$pushm(args, cexy$cc, "shell.c", "../sqlite3.o", "-o", "../sqlite3");
if (os.platform.current() == OSPlatform__win) {
    arr$pushm(args, "-lpthread", "-lm");
} else {

    arr$pushm(args, "-lpthread", "-ldl", "-lm");
}
arr$push(args, NULL);
e$ret(os$cmda(args));

Getting an example of arbitrary function use in CEX

If a function is used anywhere in the project, you can get highlighted example source code for it with the shell command: ./cex help --example os.platform.current
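
The difference between the compile-time and the runtime approach can be shown in plain C without cex.h. In this sketch demo_platform_name() and demo_link_flags() are hypothetical helpers; the flag lists simply mirror the custom build command shown earlier in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compile-time selection: only one branch survives preprocessing. */
static const char*
demo_platform_name(void)
{
#if defined(_WIN32)
    return "win";
#elif defined(__APPLE__)
    return "macos";
#else
    return "other";
#endif
}

/* Runtime selection: both branches are compiled, one is taken when called.
 * Writes the flags into `out` and returns how many were written. */
static size_t
demo_link_flags(const char* platform, const char** out, size_t out_cap)
{
    assert(platform != NULL && out != NULL);

    static const char* const win_flags[] = { "-lpthread", "-lm" };
    static const char* const unix_flags[] = { "-lpthread", "-ldl", "-lm" };

    int is_win = strcmp(platform, "win") == 0;
    const char* const* flags = is_win ? win_flags : unix_flags;
    size_t n = is_win ? 2 : 3;

    assert(n <= out_cap);
    for (size_t i = 0; i < n; i++) { out[i] = flags[i]; }
    return n;
}
```

A runtime check keeps a single build script usable on every host, while a compile-time check is the only option when the code in one branch does not even compile on the other platform.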

Developer Tools

The CEX language is designed to improve the developer experience with C. The ./cex CLI contains key tools for managing the project, running apps, debugging, unit testing and fuzzing.

Sanitizers

CEX enables sanitizers by default if they are supported by your OS and compiler. ASAN/UBSAN are extremely useful for catching bugs. CEX also uses sanitizers to print call stacks for uassert(). clang has the best sanitizer support across many platforms; gcc sanitizers are supported on Linux.

Default sanitizer arguments:

// file cex.c
#define cexy$cc_args_sanitizer    "-fsanitize-address-use-after-scope", "-fsanitize=address", "-fsanitize=undefined", "-fsanitize=leak", "-fstack-protector-strong"

Asserts

I’m a big fan of the “asserts everywhere” code style, also known as design by contract or TigerBeetle style; it has many names. Standard C asserts kind of work, but they are a huge pain to debug without a live debugger session.

So cex.h has 2 types of asserts:

  • The uassert*() family works like a vanilla assertion and aborts on failure (but prints a traceback with the call stack and line numbers). These asserts are stripped when NDEBUG is defined.
  • e$assert() returns Error.assert and is only intended for use in functions with the Exception return type. These asserts remain in place even when NDEBUG is defined.

// Raises abort
uassert(a == 4); // vanilla
uassert(b == a && "Oops it's a message"); // with static message
uassertf(b == 2, "b[%d] != 2", b); // with formatting

// Disabling uassert() - only for unit test mode
uassert_disable();
run_bad_stuff(NULL);
uassert_enable();

// Returns Error.assert on failure + prints [ASSERT] file:line in the stdout
Exception read_file(char* filename, char* buf, isize* out_buf_size) {
    e$assert(buf != NULL); // vanilla
    e$assert(filename != NULL && "invalid filename"); // with static message
    e$assertf(filename[0] != '\0', "filename: '%s'", filename); // with formatting
    return EOK;
}
Note

uassert() tracebacks are only available if the program was compiled with ASAN flags.
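
The two assertion styles can be contrasted in plain C. This is a minimal sketch under stated assumptions, not how cex.h implements them: demo_error_t, DEMO_ERR_ASSERT, demo_assert_ret() and both functions are hypothetical, and the message goes to stderr here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Error type for the sketch: NULL means success, otherwise a static string. */
typedef const char* demo_error_t;
static const char DEMO_ERR_ASSERT[] = "AssertError";
#define DEMO_OK ((demo_error_t)NULL)

/* Returning assert: reports the failed condition and returns an error
 * instead of aborting. It does not depend on NDEBUG. */
#define demo_assert_ret(cond)                                                  \
    do {                                                                       \
        if (!(cond)) {                                                         \
            fprintf(stderr, "[ASSERT] %s:%d %s\n", __FILE__, __LINE__, #cond); \
            return DEMO_ERR_ASSERT;                                            \
        }                                                                      \
    } while (0)

/* Aborting style: a broken contract terminates the program
 * (and the check disappears when NDEBUG is defined). */
static int
demo_checked_div(int a, int b)
{
    assert(b != 0 && "division by zero");
    return a / b;
}

/* Returning style: the caller decides what to do with the error. */
static demo_error_t
demo_read_file(const char* filename, char* buf)
{
    demo_assert_ret(filename != NULL);
    demo_assert_ret(buf != NULL);
    buf[0] = '\0'; /* placeholder for real reading logic */
    return DEMO_OK;
}
```

The aborting style suits internal invariants that should never fail, while the returning style suits checks on inputs that a caller can reasonably get wrong in production.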

Unit Testing Tool

Each CEX test file is compiled as a stand-alone executable. This allows you to write specialized tests with mocks, experiment with parts of a bigger project without fixing a plethora of compiler errors, and do test-driven development and debugging.

Create a new test with ./cex test create tests/test_file.c, then run it with ./cex test run tests/test_file.c or ./cex test run all.

# Getting built-in help

 ./cex test

Usage:
cex test [options] {run,build,create,clean,debug} all|tests/test_file.c [--test-options]
CEX built-in simple test runner

Each cexy test is a self-sufficient unity build, which allows you to test
static functions, apply mocks to selected modules and functions, and have more
control over your code. See `cex config --help` for customization/config info.

The CEX test runner checks include modification times to track changes in the
source files. It expects that each #include "myfile.c" has "myfile.h" in
the same folder. The test runner uses cexy$cc_include for searching.

CEX is a test-centric language: it enables additional sanity checks in the
test suite, and all warnings are enabled (-Wall -Wextra). Sanitizers are enabled by
default.

Code requirements:
1. You should include all your source files directly using #include "path/src.c"
2. If needed provide linker options via cexy$ld_libs / cexy$ld_args
3. If needed provide compiler options via cexy$cc_args_test
4. All tests have to be in tests/ folder, and start with `test_` prefix
5. Only #include with "" checked for modification

Test suite setup/teardown:
// setup before every case
test$setup_case() {return EOK;}
// teardown after every case
test$teardown_case() {return EOK;}
// setup before suite (only once)
test$setup_suite() {return EOK;}
// teardown after suite (only once)
test$teardown_suite() {return EOK;}

Test case:

test$case(my_test_case_name) {
    // run `cex help tassert_` / `cex help tassert_eq` to get more info
    tassert(0 == 1);
    tassertf(0 == 1, "this is a failure msg: %d", 3);
    tassert_eq(buf, "foo");
    tassert_eq(1, true);
    tassert_eq(str.sstr("bar"), str$s("bar"));
    tassert_ne(1, 0);
    tassert_le(0, 1);
    tassert_lt(0, 1);
    return EOK;
}

If you need more control you can build your own test runner. Just use cex help
and get source code `./cex help --source cexy.cmd.simple_test`


    -h, --help    show this help message and exit

Test running examples:
cex test create tests/test_file.c        - creates new test file from template
cex test build all                       - build all tests
cex test run all                         - build and run all tests
cex test run tests/test_file.c           - run test by path
cex test debug tests/test_file.c         - run test via `cexy$debug_cmd` program
cex test clean all                       - delete all test executables in `cexy$build_dir`
cex test clean tests/test_file.c         - delete specific test executable
cex test run tests/test_file.c [--help]  - run test with passing arguments to the test runner program

Fuzzers

CEX has a fuzzer back-end. Currently libFuzzer, which is built into clang, is preferable, but AFL++ also works. CEX fuzzers are designed to hit directly at the heart of the code, so it’s easier to use clang; however, the CEX fuzzer API remains compatible with AFL as well.

Note

Try to split functionality across many small fuzz files for different aspects of your program. This makes it easier to hit specific pain points. Look at the fuzz examples in the fuzz/ folder of the CEX git repo.

Making new fuzzer test

# Placing into fuzz/ directory is mandatory
./cex fuzz create fuzz/myapp/fuzz_bar.c
./cex fuzz create fuzz/mymodule/fuzz_foo.c

Sample fuzz file

// file: fuzz/myapp/fuzz_bar.c

#define CEX_IMPLEMENTATION
#include "cex.h"

/*
// setup is not mandatory, but useful for establishing corpus
fuzz$setup(void){
    // This function allows programmatically seed new corpus for fuzzer
    io.printf("CORPUS: %s\n", fuzz$corpus_dir);
    mem$scope(tmem$, _){
        char* fn = str.fmt(_, "%s/my_case", fuzz$corpus_dir);
        (void)fn;
        // io.file.save(fn, "my seed data");
    }
}
*/

int
fuzz$case(const u8* data, usize size){
    // TODO: do your stuff based on input data and size
    if (size > 2 && data[0] == 'C' && data[1] == 'E' && data[2] == 'X') {
        __builtin_trap();
    }
    return 0;
}

fuzz$main();
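
For comparison, this is roughly what the same check looks like against the plain libFuzzer entry point, without cex.h. LLVMFuzzerTestOneInput() is the function libFuzzer itself calls; how fuzz$case and fuzz$main() map onto it internally is not covered here, and demo_has_magic() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The property under test: does the input start with the bytes "CEX"? */
static int
demo_has_magic(const uint8_t* data, size_t size)
{
    assert(data != NULL || size == 0);
    return size > 2 && data[0] == 'C' && data[1] == 'E' && data[2] == 'X';
}

/* libFuzzer calls this function repeatedly with mutated inputs.
 * Returning 0 means the input was processed; a trap is reported as a crash. */
int
LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    if (demo_has_magic(data, size)) {
        __builtin_trap(); /* simulated bug for the fuzzer to find */
    }
    return 0;
}
```

Outside of CEX such a file is typically built with clang -fsanitize=fuzzer,address, which links the libFuzzer driver that supplies main().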

Running fuzzer case

# Run specific test (infinite timeout)
./cex fuzz run fuzz/myapp/fuzz_bar.c

# Run all with time limit per test
./cex fuzz run all

>> (output of fuzzer)

SUMMARY: libFuzzer: deadly signal
MS: 4 PersAutoDict-ChangeBit-ShuffleBytes-CMP- DE: "E\000"-"X\000"-; base unit: a04ab19fbcf9e6dd3b7f1b71cb156335556f3507
0x43,0x45,0x58,0x0,0x3e,
CEX\000>
artifact_prefix='fuzz_file.'; Test unit written to fuzz_file.crash-88777
Base64: Q0VYAD4=

>> cat fuzz_file.crash-88777
CEX>

Running / debugging crash file

# Run a single artifact file that caused the crash
# NOTE: fuzz_file.crash-88777 must be located at fuzz/myapp/ 
./cex fuzz run fuzz/myapp/fuzz_bar.c fuzz_file.crash-88777

# run in gdb (see cexy$debug_cmd )
./cex fuzz debug fuzz/myapp/fuzz_bar.c fuzz_file.crash-88777

Lines Of Code stats

./cex stats calculates .c/.h lines of code and estimates assertion percentage as a code quality metric.


~ ➜ ./cex stats 'src/*.c' 'tests/*.c'
Project stats (parsed in 0.020sec)
--------------------------------------------------------
Metric                   |     Code     |    Tests     |
--------------------------------------------------------
Files                    |          27  |          30  |
Asserts                  |         361  |        4230  |
Lines of code            |       11494  |       12261  |
Lines of comments        |         725  |         606  |
Asserts per LOC          |        3.14% |       34.50% |
Total asserts per LOC    |       39.94% |         <<<  |
--------------------------------------------------------
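
The percentages in the table can be reproduced by hand. The sketch below uses a hypothetical helper, demo_pct(), and the numbers from the output above. The "Total asserts per LOC" row appears to be code asserts plus test asserts divided by code lines; that is an inference from the numbers, not documented behavior.

```c
#include <assert.h>

/* Share of `part` in `total`, as a percentage. */
static double
demo_pct(unsigned long part, unsigned long total)
{
    assert(total > 0);
    return 100.0 * (double)part / (double)total;
}

/* Figures taken from the sample `./cex stats` output:
 *   demo_pct(361, 11494)         ~  3.14  (asserts per LOC, code)
 *   demo_pct(4230, 12261)        ~ 34.50  (asserts per LOC, tests)
 *   demo_pct(361 + 4230, 11494)  ~ 39.94  (total asserts per code LOC, inferred)
 */
```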

Fetching libraries and CEX updates

The ./cex libfetch command is a simple git wrapper for retrieving or updating cex lib/ files, or for updating cex.h itself. It can be used with any git repo to fetch single-header files.


~ ➜ ./cex libfetch --help
Usage:
cex libfetch [options]
Fetching 3rd party libraries via git (by default it uses cex git repo as source)

    -h, --help            show this help message and exit
    -u, --git-url         Git URL of the repository (default: 'https://github.com/alexveden/cex.git')
    -l, --git-label       Git label (default: 'HEAD')
    -o, --out-dir         Output directory relative to project root (default: './')
    -U, --update          Force replacing existing code with repository files (default: N)
    -p, --preserve-dirs   Preserve directory structure as in repo (default: Y)

Command examples:
cex libfetch lib/test/fff.h                            - fetch single header lib from CEX repo
cex libfetch -U cex.h                                  - update cex.h to most recent version
cex libfetch lib/random/                               - fetch whole directory recursively from CEX lib
cex libfetch --git-label=v2.0 file.h                   - fetch using specific label or commit
cex libfetch -u https://github.com/m/lib.git file.h    - fetch from arbitrary repo
cex help --example cexy.utils.git_lib_fetch            - you can call it from your cex.c (see example)

Getting help for project

./cex help is a CLI command for getting help on your project; it works for CEX itself as well as for your own code. You can use it as a symbol search for types, functions, files, examples and source code. It also supports CEX namespaces defined as struct interfaces, as well as macro$ namespaces.

Help command

~ ➜ ./cex help --help
Usage:
cex help [options] [query]
Symbol / documentation search tool for C projects


Options
    -h, --help        show this help message and exit
    -f, --filter      file pattern for searching (default: './*.[hc]')
    -s, --source      show full source on match (default: N)
    -e, --example     finds random example in source base (default: N)
    -o, --out         write output of command to file (default: '')

Query examples:
cex help                     - list all namespaces in project directory
cex help foo                 - find any symbol containing 'foo' (case sensitive)
cex help foo.                - find namespace prefix: foo$, Foo_func(), FOO_CONST, etc
cex help os$                 - find CEX namespace help (docs, macros, functions, types)
cex help 'foo_*_bar'         - find using pattern search for symbols (see 'cex help str.match')
cex help '*_(bar|foo)'       - find any symbol ending with '_bar' or '_foo'
cex help str.find            - display function documentation if exactly matched
cex help 'os$PATH_SEP'       - display macro constant value if exactly matched
cex help str_s               - display type info and documentation if exactly matched
cex help --source str.find   - display function source if exactly matched
cex help --example str.find  - display random function use in codebase if exactly matched

Getting language cheat-sheets

All core namespaces in CEX have cheat-sheets with full members reference and examples.

Just type:

# Cheat-sheet for os namespace
./cex help os$
# Help for CEX errors
./cex help e$
# Help for some parts of the language for$ / arr$ / hm$
./cex help arr$ 
# Make your own cheat-sheet!
./cex help myproj_namespace$

Tip

The core namespace reference was fully generated from CEX built-in cheat-sheets and docstrings!

Note

To make your own cheat-sheet, place a doxygen comment (/** multiline help */) right before the struct definition in mynamespace.h (assuming a namespace called mynamespace). Or add one before #define __mynamespace$ if you use a macro-only namespace.

Example roulette

If you need a usage example for some function or symbol in your project, you can try your luck and find a random use of that function with the following:

// NOTE: in example mode you must provide full symbol name without wildcards
~ ➜ ./cex help --example str.find

Found at ./cex.h:13704
Exception cexy__test__create(char* target, bool include_sample)
{
    if (os.path.exists(target)) {
        return e$raise(Error.exists, "Test file already exists: %s", target);
    }
    if (str.eq(target, "all") || str.find(target, "*")) {
        return e$raise(
            Error.argument,
            "You must pass exact file path, not pattern, got: %s",
            target
        );
    }

    ...
}

Project management

Using the ./cex CLI you can seed a new project or add/run/debug/clean apps, fuzzers and tests in an existing project.

Note

Try ./cex app --help, ./cex test --help, ./cex fuzz --help for more info. Also, this functionality may not work properly if you use custom build routines.

Creating

# Create new project from scratch + bootstraps ./cex cli + sample hello world project structure
./cex new new_dir_path

# Create new app for existing project as src/myapp/main.c
./cex app create myapp

# Create new unit test for existing project
./cex test create tests/test_file.c  

# Create new fuzz test
./cex fuzz create fuzz/myapp/fuzz_bar.c

Running

./cex app run myapp --app-opt=1 app_arg1 app_arg2

./cex test run tests/test_file.c  

./cex fuzz run fuzz/myapp/fuzz_bar.c

Debugging

./cex app debug myapp

./cex test debug tests/test_file.c  

./cex fuzz debug fuzz/myapp/fuzz_bar.c

Core Namespace Reference

Getting help

All core namespaces in CEX have cheat-sheets with full members reference and examples.

Just type:

# Cheat-sheet for os namespace
./cex help os$
# Help for CEX errors
./cex help e$
# Help for some parts of the language for$ / arr$ / hm$
./cex help arr$ 
# Make your own cheat-sheet!
./cex help myproj_namespace$

argparse

  • Command line args parsing
// NOTE: Command example 

Exception cmd_build_docs(int argc, char** argv, void* user_ctx);

int
main(int argc, char** argv)
{
    // clang-format off
    argparse_c args = {
        .description = "My description",
        .usage = "Usage help",
        .epilog = "Epilog text",
        argparse$cmd_list(
            { .name = "build-docs", .func = cmd_build_docs, .help = "Build CEX documentation" },
        ),
    };
    if (argparse.parse(&args, argc, argv)) { return 1; }
    if (argparse.run_command(&args, NULL)) { return 1; }
    return 0;
}

Exception
cmd_build_docs(int argc, char** argv, void* user_ctx)
{
    // Command handling func
    return EOK;
}
  • Parsing custom arguments
// Simple options example

int
main(int argc, char** argv)
{
    bool force = 0;
    bool test = 0;
    int int_num = 0;
    float flt_num = 0.f;
    char* path = NULL;

    char* usage = "basic [options] [[--] args]\n"
                  "basic [options]\n";

    argparse_c args = {
        argparse$opt_list(
            argparse$opt_help(),
            argparse$opt_group("Basic options"),
            argparse$opt(&force, 'f', "force", "force to do"),
            argparse$opt(&test, 't', "test", .help = "test only"),
            argparse$opt(&path, 'p', "path", "path to read", .required = true),
            argparse$opt_group("Another group"),
            argparse$opt(&int_num, 'i', "int", "selected integer"),
            argparse$opt(&flt_num, 's', "float", "selected float"),
        ),
        // NOTE: usage/description are optional 

        .usage = usage,
        .description = "\nA brief description of what the program does and how it works.",
        .epilog = "\nAdditional description of the program after the description of the arguments.",
    };
    if (argparse.parse(&args, argc, argv)) { return 1; }

    // NOTE: all args are filled and parsed after this line

    return 0;
}
/// holder for list of commands
#define argparse$cmd_list(...)

/// command line option record (generic type of arguments)
#define argparse$opt(value, ...)

/// options group separator
#define argparse$opt_group(h)

/// built-in option for -h,--help
#define argparse$opt_help()

/// holder for list of  argparse$opt()
#define argparse$opt_list(...)

/// main argparse struct (used as options config)
typedef argparse_c

/// command settings type (prefer macros)
typedef argparse_cmd_s

/// command line options type (prefer macros)
typedef argparse_opt_s



argparse {
    // Autogenerated by CEX
    // clang-format off

    char*           (*next)(argparse_c* self);
    Exception       (*parse)(argparse_c* self, int argc, char** argv);
    Exception       (*run_command)(argparse_c* self, void* user_ctx);
    void            (*usage)(argparse_c* self);

    // clang-format on
};

arr$

  • Creating array
    // Using heap allocator (need to free later!)
    arr$(i32) array = arr$new(array, mem$);

    // adding elements
    arr$pushm(array, 1, 2, 3); // multiple at once
    arr$push(array, 4); // single element

    // length of array
    arr$len(array);

    // getting i-th elements
    array[1];

    // iterating array (by value)
    for$each(it, array) {
        io.printf("el=%d\n", it);
    }

    // iterating array (by pointer - prefer for bigger structs to avoid copying)
    for$eachp(it, array) {
        // TIP: making array index out of `it`
        usize i = it - array;

        // NOTE: 'it' now is a pointer
        io.printf("el[%zu]=%d\n", i, *it);
    }

    // free resources
    arr$free(array);
  • Array of structs

typedef struct
{
    int key;
    float my_val;
    char* my_string;
    int value;
} my_struct;

Exception somefunc(void)
{
    arr$(my_struct) array = arr$new(array, mem$, .capacity = 128);
    uassert(arr$cap(array) == 128);

    my_struct s;
    s = (my_struct){ 20, 5.0, "hello ", 0 };
    arr$push(array, s);
    s = (my_struct){ 40, 2.5, "failure", 0 };
    arr$push(array, s);
    s = (my_struct){ 40, 1.1, "world!", 0 };
    arr$push(array, s);

    for (usize i = 0; i < arr$len(array); ++i) {
        io.printf("key: %d str: %s\n", array[i].key, array[i].my_string);
    }
    arr$free(array);

    return EOK;
}
/// Generic array type definition. Use arr$(int) myarr - defines new myarr variable, as int array
#define arr$(T)

/// Get element at index (bounds checking with uassert())
#define arr$at(a, i)

/// Returns current array capacity
#define arr$cap(a)

/// Clear array contents
#define arr$clear(a)

/// Delete array elements by index (memory will be shifted, order preserved)
#define arr$del(a, i)

/// Delete element by swapping with last one (no memory overhead, element order changes)
#define arr$delswap(a, i)

/// Free resources for dynamic array (only needed if mem$ allocator was used)
#define arr$free(a)

/// Grows array capacity
#define arr$grow(a, add_len, min_cap)

/// Check array capacity and return false on memory error
#define arr$grow_check(a, add_extra)

/// Inserts element into array at index `i`
#define arr$ins(a, i, value...)

/// Return last element of array
#define arr$last(a)

/// Versatile array length, works with dynamic (arr$) and static compile time arrays
#define arr$len(arr)

/// Array initialization: use arr$(int) arr = arr$new(arr, mem$, .capacity = , ...)
#define arr$new(a, allocator, kwargs...)

/// Pop element from the end
#define arr$pop(a)

/// Push element to the end
#define arr$push(a, value...)

/// Push another array into a. array can be dynamic or static or pointer+len
#define arr$pusha(a, array, array_len...)

/// Push many elements to the end
#define arr$pushm(a, items...)

/// Set array capacity and resize if needed
#define arr$setcap(a, n)

/// Sort array with qsort() libc function
#define arr$sort(a, qsort_cmp)
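
The difference between arr$del (order preserved, memory shifted) and arr$delswap (swap with the last element, order changes) can be sketched in plain C. This is only an illustration on a fixed int array, not the CEX implementation; sketch_del and sketch_delswap are hypothetical names that do not exist in cex.h.

```c
#include <stddef.h>
#include <string.h>

/* Order-preserving delete: shifts the tail left by one element. Returns new length. */
static size_t
sketch_del(int* a, size_t len, size_t i)
{
    if (i >= len) { return len; }
    memmove(&a[i], &a[i + 1], (len - i - 1) * sizeof(a[0]));
    return len - 1;
}

/* Swap delete: overwrites element i with the last one. O(1), but order changes. */
static size_t
sketch_delswap(int* a, size_t len, size_t i)
{
    if (i >= len) { return len; }
    a[i] = a[len - 1];
    return len - 1;
}
```

Deleting index 1 from {1, 2, 3, 4, 5} leaves {1, 3, 4, 5} with the shifting variant and {1, 5, 3, 4} with the swapping variant, so the swap form is a reasonable choice only when element order does not matter.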


cexy

/// Build dir for project executables and tests (may be overridden by user)
#define cexy$build_dir

/// Extension for executables (e.g. '.exe' for win32)
#define cexy$build_ext_exe

/// Extension for dynamic linked libs (".dll" win, ".so" linux)
#define cexy$build_ext_lib_dyn

/// Extension for static libs (".lib" win, ".a" linux)
#define cexy$build_ext_lib_stat

/// Default compiler for building tests/apps (by default inferred from ./cex tool compiler)
#define cexy$cc

/// Common compiler flags (may be overridden by user)
#define cexy$cc_args

/// Debug mode and tests sanitizer flags (may be overridden by user)
#define cexy$cc_args_sanitizer

/// Test runner compiler flags (may be overridden by user)
#define cexy$cc_args_test

/// Include path for the #include "some.h" (may be overridden by user)
#define cexy$cc_include

/// Compiler flags used for building ./cex.c -> ./cex (may be overridden by user)
#define cexy$cex_self_args

/// Macro constant derived from the compiler type used to initially build ./cex app
#define cexy$cex_self_cc

/// All built-in commands for ./cex tool
#define cexy$cmd_all

/// Simple app build command (unity build, simple linking, runner, debugger launch, etc)
#define cexy$cmd_app

/// Simple fuzz tests runner command
#define cexy$cmd_fuzz

/// Simple test runner command (test runner, debugger launch, etc)
#define cexy$cmd_test

/// Command for launching debugger for cex test/app debug (may be overridden)
#define cexy$debug_cmd

/// ./cex --help description
#define cexy$description

/// ./cex --help epilog
#define cexy$epilog

/// Fuzzer compilation command (supports clang libfuzzer and afl++)
#define cexy$fuzzer

/// Initialize CEX build system (build itself)
#define cexy$initialize()

/// Linker flags (e.g. -L./lib/path/ -lmylib -lm) (may be overridden)
#define cexy$ld_args

/// Helper macro for running cexy.utils.pkgconf() a dependency resolver for libs
#define cexy$pkgconf(allocator, out_cc_args, pkgconf_args...)

/// Dependency resolver command: pkg-config, pkgconf, etc. May be used in cross-platform
/// compilation; multiple command arguments are allowed here
#define cexy$pkgconf_cmd

/// list of standard system project libs (for example: "lua5.3", "libz")
#define cexy$pkgconf_libs

/// Pattern for ignoring extra macro keywords in function signatures (for cex process).
#define cexy$process_ignore_kw

/// Directory for applications and code (may be overridden by user)
#define cexy$src_dir

/// ./cex --help usage
#define cexy$usage

/// Current vcpkg root path (where ./vcpkg tool is located)
#define cexy$vcpkg_root

/// Current build triplet (empty, NULL, or string like "x64-linux")
///   if you are using `vcpkg install mydep`; ignored if blank or NULL.
///   List of all supported triplets: `vcpkg help triplet`
#define cexy$vcpkg_triplet



cexy {
    // Autogenerated by CEX
    // clang-format off

    void            (*build_self)(int argc, char** argv, char* cex_source);
    bool            (*src_changed)(char* target_path, char** src_array, usize src_array_len);
    bool            (*src_include_changed)(char* target_path, char* src_path, arr$(char*) alt_include_path);
    char*           (*target_make)(char* src_path, char* build_dir, char* name_or_extension, IAllocator allocator);

    struct {
        Exception       (*clean)(char* target);
        Exception       (*create)(char* target);
        Exception       (*find_app_target_src)(IAllocator allc, char* target, char** out_result);
        Exception       (*run)(char* target, bool is_debug, int argc, char** argv);
    } app;

    struct {
        Exception       (*config)(int argc, char** argv, void* user_ctx);
        Exception       (*help)(int argc, char** argv, void* user_ctx);
        Exception       (*libfetch)(int argc, char** argv, void* user_ctx);
        Exception       (*new)(int argc, char** argv, void* user_ctx);
        Exception       (*process)(int argc, char** argv, void* user_ctx);
        Exception       (*simple_app)(int argc, char** argv, void* user_ctx);
        Exception       (*simple_fuzz)(int argc, char** argv, void* user_ctx);
        Exception       (*simple_test)(int argc, char** argv, void* user_ctx);
        Exception       (*stats)(int argc, char** argv, void* user_ctx);
    } cmd;

    struct {
        Exception       (*create)(char* target);
    } fuzz;

    struct {
        Exception       (*clean)(char* target);
        Exception       (*create)(char* target, bool include_sample);
        Exception       (*make_target_pattern)(char** target);
        Exception       (*run)(char* target, bool is_debug, int argc, char** argv);
    } test;

    struct {
        char*           (*git_hash)(IAllocator allc);
        Exception       (*git_lib_fetch)(char* git_url, char* git_label, char* out_dir, bool update_existing, bool preserve_dirs, char** repo_paths, usize repo_paths_len);
        Exception       (*make_compile_flags)(char* flags_file, bool include_cexy_flags, arr$(char*) cc_flags_or_null);
        Exception       (*make_new_project)(char* proj_dir);
        Exception       (*pkgconf)(IAllocator allc, arr$(char*)* out_cc_args, char** pkgconf_args, usize pkgconf_args_len);
    } utils;

    // clang-format on
};

cg$

  • Code generation module

test$case(test_codegen_test)
{
    sbuf_c b = sbuf.create(1024, mem$);
    // NOTE: cg$ macros should be working within cg$init() scope or make sure cg$var is available
    cg$init(&b);

    tassert(cg$var->buf == &b);
    tassert(cg$var->indent == 0);

    cg$pn("printf(\"hello world\");");
    cg$pn("#define GOO");
    cg$pn("// this is empty scope");
    cg$scope("", "")
    {
        cg$pf("printf(\"hello world: %d\");", 2);
    }

    cg$func("void my_func(int arg_%d)", 2)
    {
        cg$scope("var my_var = (mytype)", "")
        {
            cg$pf(".arg1 = %d,", 1);
            cg$pf(".arg2 = %d,", 2);
        }
        cg$pa(";\n", "");

        cg$if("foo == %d", 312)
        {
            cg$pn("printf(\"Hello: %d\", foo);");
        }
        cg$elseif("bar == foo + %d", 7)
        {
            cg$pn("// else if scope");
        }
        cg$else()
        {
            cg$pn("// else scope");
        }

        cg$while("foo == %d", 312)
        {
            cg$pn("printf(\"Hello: %d\", foo);");
        }

        cg$for("u32 i = 0; i < %d; i++", 312)
        {
            cg$pn("printf(\"Hello: %d\", foo);");
            cg$foreach("it, my_var", "")
            {
                cg$pn("printf(\"Hello: %d\", foo);");
            }
        }

        cg$scope("do ", "")
        {
            cg$pf("// do while", 1);
        }
        cg$pa(" while(0);\n", "");
    }

    cg$switch("foo", "")
    {
        cg$case("'%c'", 'a')
        {
            cg$pn("// case scope");
        }
        cg$scope("case '%c': ", 'b')
        {
            cg$pn("fallthrough();");
        }
        cg$default()
        {
            cg$pn("// default scope");
        }
    }

    tassert(cg$is_valid());

    printf("result: \n%s\n", b);


    sbuf.destroy(&b);
    return EOK;
}
/// add case in switch() statement
#define cg$case(format, ...)

/// decrease code indent by 4
#define cg$dedent()

/// add default in switch() statement
#define cg$default()

/// add else
#define cg$else()

/// add else if
#define cg$elseif(format, ...)

/// add for loop
#define cg$for(format, ...)

/// add CEX for$each loop
#define cg$foreach(format, ...)

/// add new function     cg$func("void my_func(int arg_%d)", 2)
#define cg$func(format, ...)

/// add if statement
#define cg$if(format, ...)

/// increase code indent by 4
#define cg$indent()

/// Initializes new code generator (uses sbuf instance as backing buffer)
#define cg$init(out_sbuf)

/// false if any cg$ operation failed, use cg$var->error to get Exception type of error
#define cg$is_valid()

/// append code at the current line without "\n"
#define cg$pa(format, ...)

/// add new line of code with formatting
#define cg$pf(format, ...)

/// add new line of code
#define cg$pn(text)

#define cg$printva(cg)

/// add new code scope with indent (use for low-level stuff)
#define cg$scope(format, ...)

/// add switch() statement
#define cg$switch(format, ...)

/// Common code gen buffer variable (all cg$ macros use it under the hood)
#define cg$var

/// add while loop
#define cg$while(format, ...)


e$

CEX Error handling cheat sheet:

  1. Errors can be any char*, or string literals.
  2. EOK / Error.ok - is NULL, means no error
  3. Exception return type must be checked by the caller (enforced by the compiler)
  4. Error is built-in generic error type
  5. Errors should be checked by pointer comparison, not string contents.
  6. e$ are helper macros for error handling
  7. DO NOT USE break/continue inside e$except/e$except_* scopes (these macros are for loops too)!

Generic errors:

Error.ok = EOK;                       // Success
Error.memory = "MemoryError";         // memory allocation error
Error.io = "IOError";                 // IO error
Error.overflow = "OverflowError";     // buffer overflow
Error.argument = "ArgumentError";     // function argument error
Error.integrity = "IntegrityError";   // data integrity error
Error.exists = "ExistsError";         // entity or key already exists
Error.not_found = "NotFoundError";    // entity or key not found
Error.skip = "ShouldBeSkipped";       // NOT an error, function result must be skipped
Error.empty = "EmptyError";           // resource is empty
Error.eof = "EOF";                    // end of file reached
Error.argsparse = "ProgramArgsError"; // program arguments empty or incorrect
Error.runtime = "RuntimeError";       // generic runtime error
Error.assert = "AssertError";         // generic runtime check
Error.os = "OSError";                 // generic OS check
Error.timeout = "TimeoutError";       // await interval timeout
Error.permission = "PermissionError"; // Permission denied
Error.try_again = "TryAgainError";    // EAGAIN / EWOULDBLOCK errno analog for async operations

Exception
remove_file(char* path)
{
    if (path == NULL || path[0] == '\0') { 
        return Error.argument;  // Empty or NULL path
    }
    if (!os.path.exists(path)) {
        return "Not exists"; // literal errors are allowed, but must be compared with strcmp()
    }
    if (str.eq(path, "magic.file")) {
        // Returns an Error.integrity and logs error at current line to stdout
        return e$raise(Error.integrity, "Removing magic file is not allowed!");
    }
    if (remove(path) < 0) { 
        return strerror(errno); // using system error text (arbitrary!)
    }
    return EOK;
}

Exception read_file(char* filename) {
    e$assert(filename != NULL);

    int fd = 0;
    e$except_errno(fd = open(filename, O_RDONLY)) { return Error.os; }
    return EOK;
}

Exception do_stuff(char* filename) {
    // return immediately with error + prints traceback
    e$ret(read_file("foo.txt"));

    // jumps to label if read_file() fails + prints traceback
    e$goto(read_file(NULL), fail);

    // silent error handling without tracebacks
    e$except_silent (err, foo(0)) {

        // Nesting of error handlers is allowed
        e$except_silent (err, foo(2)) {
            return err;
        }

        // NOTE: `err` is address of char* compared with address Error.os (not by string contents!)
        if (err == Error.os) {
            // Special handling
            io.print("Ooops OS problem\n");
        } else {
            // propagate
            return err;
        }
    }
    return EOK;

fail:
    // TODO: cleanup here
    return Error.io;
}
/// Non disposable assert, returns Error.assert CEX exception when failed
#define e$assert(A)

/// Non disposable assert, returns Error.assert CEX exception when failed (supports formatting)
#define e$assertf(A, format, ...)

/// catches the error of function inside scope + prints traceback
#define e$except(_var_name, _func)

/// catches the error of system function (if negative value + errno), prints errno error
#define e$except_errno(_expression)

/// catches the error if expression returned null
#define e$except_null(_expression)

/// catches the error of function inside scope (without traceback)
#define e$except_silent(_var_name, _func)

/// catches the error if expression returned true
#define e$except_true(_expression)

/// `goto _label` when _func returned error + prints traceback
#define e$goto(_func, _label)

/// raises an error, code: `return e$raise(Error.integrity, "ooops: %d", i);`
#define e$raise(return_uerr, error_msg, ...)

/// immediately returns from function with _func error + prints traceback
#define e$ret(_func)


for$

  • using for$ as unified array iterator

test$case(test_array_iteration)
{
    arr$(int) array = arr$new(array, mem$);
    arr$pushm(array, 1, 2, 3);

    for$each(it, array) {
        io.printf("el=%d\n", it);
    }
    // Prints:
    // el=1
    // el=2
    // el=3

    // NOTE: prefer this when you work with bigger structs to avoid extra memory copying
    for$eachp(it, array) {
        // TIP: making array index out of `it`
        usize i = it - array;

        // NOTE: it now is a pointer
        io.printf("el[%zu]=%d\n", i, *it);
    }
    // Prints:
    // el[0]=1
    // el[1]=2
    // el[2]=3

    // Static arrays work as well (arr$len inferred)
    i32 arr_int[] = {1, 2, 3, 4, 5};
    for$each(it, arr_int) {
        io.printf("static=%d\n", it);
    }
    // Prints:
    // static=1
    // static=2
    // static=3
    // static=4
    // static=5


    // Simple pointer+length also works (let's do a slice)
    i32* slice = &arr_int[2];
    for$each(it, slice, 2) {
        io.printf("slice=%d\n", it);
    }
    // Prints:
    // slice=3
    // slice=4


    // `it` is of type cex_iterator_s
    // NOTE: run in shell: ➜ ./cex help cex_iterator_s
    str_s s = str.sstr("123,456");
    for$iter (str_s, it, str.slice.iter_split(s, ",", &it.iterator)) {
        io.printf("it.value = %S\n", it.val);
    }
    // Prints:
    // it.value = 123
    // it.value = 456

    arr$free(array);
    return EOK;
}
/// Iterates over arrays `it` is iterated **value**, array may be arr$/or static / or pointer,
/// array_len is only required for pointer+len use case
#define for$each(it, array, array_len...)

/// Iterates over arrays `it` is iterated by **pointer**, array may be arr$/or static / or pointer,
/// array_len is only required for pointer+len use case
#define for$eachp(it, array, array_len...)

/// Iterates via iterator function (see the for$iter usage in the example above)
#define for$iter(it_val_type, it, iter_func)
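
The by-pointer iteration, the `it - array` index trick, and the pointer+length slice case can be mimicked in plain C. The sketch below is only an approximation for illustration: SKETCH_EACHP, sketch_index_of and sketch_sum are hypothetical names, and unlike for$eachp the macro needs the element type and length passed explicitly.

```c
#include <stddef.h>

/* Hypothetical macro: `it` walks `len` elements of `arr` as a pointer of type T*. */
#define SKETCH_EACHP(T, it, arr, len) for (T* it = (arr); it < (arr) + (len); it++)

/* Returns the index of the first element equal to `needle`, or `len` if absent. */
static size_t
sketch_index_of(int* arr, size_t len, int needle)
{
    SKETCH_EACHP(int, it, arr, len)
    {
        if (*it == needle) {
            return (size_t)(it - arr); /* pointer difference is the element index */
        }
    }
    return len;
}

/* Sums a pointer+length slice; the slice does not need to start at element 0. */
static int
sketch_sum(int* slice, size_t len)
{
    int total = 0;
    SKETCH_EACHP(int, it, slice, len) { total += *it; }
    return total;
}
```

With `int arr_int[] = {1, 2, 3, 4, 5}`, calling sketch_sum(&arr_int[2], 2) visits the same two elements (3 and 4) as the slice example above.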


fuzz$

  • Fuzz runner commands

./cex fuzz create fuzz/myapp/fuzz_bar.c

./cex fuzz run fuzz/myapp/fuzz_bar.c

  • Fuzz testing tools using fuzz namespace

int
fuzz$case(const u8* data, usize size)
{
    cex_fuzz_s fz = fuzz.create(data, size);
    u16 random_val = 0;
    some_struct random_struct = {0}; 

    while(fuzz.dget(&fz, &random_val, sizeof(random_val))) {
        // testing function with random data
        my_func(random_val);

        // checking probability based on fuzz data
        if (fuzz.dprob(&fz, 0.2)) {
            my_func(random_val * 10);
        }

        if (fuzz.dget(&fz, &random_struct, sizeof(random_struct))){
            my_func_struct(&random_struct);
        }
    }
    return 0;
}
  • Fuzz testing tools using fuzz$ macros (shortcuts)

int
fuzz$case(const u8* data, usize size)
{
    fuzz$dnew(data, size);

    u16 random_val = 0;
    some_struct random_struct = {0}; 

    while(fuzz$dget(&random_val)) {
        // testing function with random data
        my_func(random_val);

        // checking probability based on fuzz data
        if (fuzz$dprob(0.2)) {
            my_func(random_val * 10);
        }

        // it's possible to fill whole structs with data
        if (fuzz$dget(&random_struct)){
            my_func_struct(&random_struct);
        }
    }
    return 0;
}
  • Fuzz corpus priming (optional, but useful)

typedef struct fuzz_match_s
{
    char pattern[100];
    char null_term;
    char text[300];
    char null_term2;
} fuzz_match_s;

Exception
match_make(char* out_file, char* text, char* pattern)
{
    fuzz_match_s f = { 0 };
    e$ret(str.copy(f.text, text, sizeof(f.text)));
    e$ret(str.copy(f.pattern, pattern, sizeof(f.pattern)));

    FILE* fh;
    e$ret(io.fopen(&fh, out_file, "wb"));
    e$ret(io.fwrite(fh, &f, sizeof(f)));
    io.fclose(&fh);

    return EOK;
}

fuzz$setup()
{
    if (os.fs.mkdir(fuzz$corpus_dir)) {}

    struct
    {
        char* text;
        char* pattern;
    } match_tuple[] = {
        { "test", "*" },
        { "", "*" },
        { ".txt", "*.txt" },
        { "test.txt", "" },
        { "test.txt", "*txt" },
    };
    mem$scope(tmem$, _)
    {
        for (u32 i = 0; i < arr$len(match_tuple); i++) {
            char* fn = str.fmt(_, "%s/%05d", fuzz$corpus_dir, i);
            e$except (err, match_make(fn, match_tuple[i].text, match_tuple[i].pattern)) {
                uassertf(false, "Error writing file: %s", fn);
            }
        }
    }
}
/// Fuzz case: `int fuzz$case(const u8* data, usize size) { return 0; }`
#define fuzz$case

/// Current fuzz_ file corpus directory relative to calling source file
#define fuzz$corpus_dir

/// Load random data into variable by pointer from random fuzz data
#define fuzz$dget(out_result_ptr)

/// Initialize fuzz$ helper macros
#define fuzz$dnew(data, size)

/// Get deterministic probability based on fuzz data
#define fuzz$dprob(prob_threshold)

/// Special fuzz variable used by all fuzz$ macros
#define fuzz$dvar

/// Fuzz main function
#define fuzz$main()

/// Fuzz test constructor (for building corpus seeds programmatically)
#define fuzz$setup



fuzz {
    // Autogenerated by CEX
    // clang-format off

    /// Get current corpus dir relative to the `this_file_name`
    char*           (*corpus_dir)(char* this_file_name);
    /// Creates new fuzz data generator, for fuzz-driven randomization
    cex_fuzz_s      (*create)(const u8* data, usize size);
    /// Get result from random data into buffer (returns false if not enough data)
    bool            (*dget)(cex_fuzz_s* fz, void* out_result, usize result_size);
    /// Get deterministic probability using fuzz data, based on threshold
    bool            (*dprob)(cex_fuzz_s* fz, double threshold);

    // clang-format on
};
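
To make the dget/dprob contract more concrete, here is a plain C sketch of a cursor that consumes fuzzer input bytes. It is not the CEX implementation: all sketch_ names are hypothetical, and the way sketch_fuzz_dprob maps one input byte to a probability is purely an assumption made for this example (the documentation above only states that the result is deterministic and threshold based).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over the fuzzer-provided input bytes. */
typedef struct
{
    const uint8_t* data;
    size_t size;
    size_t pos;
} sketch_fuzz_s;

static sketch_fuzz_s
sketch_fuzz_create(const uint8_t* data, size_t size)
{
    return (sketch_fuzz_s){ .data = data, .size = size, .pos = 0 };
}

/* Copies result_size bytes into out_result; false when the input is exhausted. */
static bool
sketch_fuzz_dget(sketch_fuzz_s* fz, void* out_result, size_t result_size)
{
    if (fz->size - fz->pos < result_size) { return false; }
    memcpy(out_result, fz->data + fz->pos, result_size);
    fz->pos += result_size;
    return true;
}

/* ASSUMPTION: one input byte is mapped to [0, 1) and compared with the threshold. */
static bool
sketch_fuzz_dprob(sketch_fuzz_s* fz, double threshold)
{
    uint8_t b = 0;
    if (!sketch_fuzz_dget(fz, &b, sizeof(b))) { return false; }
    return (b / 256.0) < threshold;
}
```

Because every decision is derived from the input bytes, replaying the same corpus file takes the same branches, which is what keeps crashes found by the fuzzer reproducible.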

hm$

Generic type-safe hashmap

Principles:

  1. Data is backed by engine similar to arr$
  2. arr$len() works with hashmap too
  3. Array indexing works with hashmap
  4. for$each/for$eachp is applicable
  5. hm$ generic type is essentially a struct with key and value fields
  6. hm$ supports the following key types: numeric (by default, just the binary representation), char*, char[N], str_s (CEX string slice).
  7. hm$ string keys are stored without copying by default; use hm$new(hm, mem$, .copy_keys = true) for copy mode.
  8. hm$ can store copied string keys inside an Arena allocator with hm$new(hm, mem$, .copy_keys = true, .copy_keys_arena_pgsize = NNN)
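
Point 7 is easy to trip over, so here is a plain C sketch of why copying keys matters. It deliberately uses a one-slot stand-in instead of a real hashmap; sketch_slot_s and the sketch_slot_* functions are hypothetical names and say nothing about how hm$ is implemented internally.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-slot "map": enough to show key ownership, not hashing. */
typedef struct
{
    char* key;
    int value;
    bool owns_key; /* true when the key was duplicated on insert */
} sketch_slot_s;

/* Stores key by reference, or duplicates it when copy_key is true. */
static bool
sketch_slot_set(sketch_slot_s* slot, char* key, int value, bool copy_key)
{
    char* stored = key;
    if (copy_key) {
        size_t n = strlen(key) + 1;
        stored = malloc(n);
        if (stored == NULL) { return false; }
        memcpy(stored, key, n);
    }
    *slot = (sketch_slot_s){ .key = stored, .value = value, .owns_key = copy_key };
    return true;
}

/* Looks the key up by string contents. */
static bool
sketch_slot_has(const sketch_slot_s* slot, const char* key)
{
    return slot->key != NULL && strcmp(slot->key, key) == 0;
}

/* Releases the key only if the slot owns a private copy of it. */
static void
sketch_slot_free(sketch_slot_s* slot)
{
    if (slot->owns_key) { free(slot->key); }
    *slot = (sketch_slot_s){ 0 };
}
```

If the caller fills a `char key[10]` buffer with "foo", inserts it both ways, and then overwrites the buffer, only the copying slot can still find "foo". The by-reference slot now points at whatever the caller wrote there.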

test$case(test_simple_hashmap)
{
    hm$(int, int) intmap = hm$new(intmap, mem$);

    // Setting items
    hm$set(intmap, 15, 7);
    hm$set(intmap, 11, 3);
    hm$set(intmap, 9, 5);

    // Length
    tassert_eq(hm$len(intmap), 3);
    tassert_eq(arr$len(intmap), 3);

    // Getting items **by value**
    tassert(hm$get(intmap, 9) == 5);
    tassert(hm$get(intmap, 11) == 3);
    tassert(hm$get(intmap, 15) == 7);

    // Getting items **pointer** - NULL on missing
    tassert(hm$getp(intmap, 1) == NULL);

    // Getting with default if not found
    tassert_eq(hm$get(intmap, -1, 999), 999);

    // Accessing hashmap as array by i-th index
    // NOTE: hashmap elements are ordered until first deletion
    tassert_eq(intmap[0].key, 15);
    tassert_eq(intmap[0].value, 7);

    // removing items
    hm$del(intmap, 100);

    // cleanup
    hm$clear(intmap);

    // basic iteration **by value**
    for$each (it, intmap) {
        io.printf("key=%d, value=%d\n", it.key, it.value);
    }

    // basic iteration **by pointer**
    for$eachp (it, intmap) {
        io.printf("key=%d, value=%d\n", it->key, it->value);
    }

    hm$free(intmap);
    return EOK;
}
  • Using hashmap as field of other struct

typedef hm$(char* , int) MyHashmap;

struct my_hm_struct {
    MyHashmap hm;
};


test$case(test_hashmap_string_copy_clear_cleanup)
{
    struct my_hm_struct hs = {0};
    // NOTE: .copy_keys - makes sure that key string was copied
    hm$new(hs.hm, mem$, .copy_keys = true);
    hm$set(hs.hm, "foo", 3);
    hm$free(hs.hm);
    return EOK;
}
  • Storing string keys in the arena

test$case(test_hashmap_string_copy_arena)
{
    hm$(char*, int) smap = hm$new(smap, mem$, .copy_keys = true, .copy_keys_arena_pgsize = 1024);

    char key2[10] = "foo";

    hm$set(smap, key2, 3);
    tassert_eq(hm$len(smap), 1);
    tassert_eq(hm$get(smap, "foo"), 3);
    tassert_eq(hm$get(smap, key2), 3);
    tassert_eq(smap[0].key, "foo");

    memset(key2, 0, sizeof(key2));
    tassert_eq(smap[0].key, "foo");
    tassert_eq(hm$get(smap, "foo"), 3);

    hm$free(smap);
    return EOK;
}
  • Checking errors + custom struct backing

test$case(test_hashmap_basic)
{
    hm$(int, int) intmap;
    if(hm$new(intmap, mem$) == NULL) {
        // initialization error
    }

    // struct as a value
    struct test_val_s
    {
        usize foo;
        usize bar;
    };
    hm$(int, struct test_val_s) valmap = hm$new(valmap, mem$);

    // custom struct as hashmap backend
    struct test64_s
    {
        usize fooa;
        usize key; // this field `key` is mandatory
    };

    hm$s(struct test64_s) smap = hm$new(smap, mem$);
    tassert(smap != NULL);

    // Setting hashmap as a whole struct key/value record
    tassert(hm$sets(smap, (struct test64_s){ .key = 1, .fooa = 10 }));
    tassert_eq(hm$len(smap), 1);
    tassert_eq(smap[0].key, 1);
    tassert_eq(smap[0].fooa, 10);

    // Getting full struct by .key value
    struct test64_s* r = hm$gets(smap, 1);
    tassert(r != NULL);
    tassert(r == &smap[0]);
    tassert_eq(r->key, 1);
    tassert_eq(r->fooa, 10);

    // free all maps created above
    hm$free(intmap);
    hm$free(valmap);
    hm$free(smap);
    return EOK;
}
/// Defines hashmap generic type
#define hm$(_KeyType, _ValType)

/// Clears hashmap contents
#define hm$clear(t)

/// Deletes items, IMPORTANT hashmap array may be reordered after this call
#define hm$del(t, k)

/// Frees hashmap resources
#define hm$free(t)

/// Get item by value, def - default value (zeroed by default), can be any type
#define hm$get(t, k, def...)

/// Get item by pointer (no copy, direct pointer inside hashmap)
#define hm$getp(t, k)

/// Get a pointer to full hashmap record, NULL if not found
#define hm$gets(t, k)

/// Returns hashmap length, also you can use arr$len()
#define hm$len(t)

/// Creates new hashmap of hm$(KType, VType) using allocator, kwargs: .capacity, .seed,
/// .copy_keys_arena_pgsize, .copy_keys
#define hm$new(t, allocator, kwargs...)

/// Defines hashmap type based on _StructType, must have `key` field
#define hm$s(_StructType)

/// Set hashmap key/value, replaces if exists
#define hm$set(t, k, v...)

/// Add new item and returns pointer of hashmap record for `k`, for further editing
#define hm$setp(t, k)

/// Set full record, must be initialized by user
#define hm$sets(t, v...)


io

Cross-platform IO namespace

  • Read all file content (low level api)

test$case(test_readall)
{
    // Open new file
    FILE* file;
    e$ret(io.fopen(&file, "tests/data/text_file_50b.txt", "r"));


    // get file size 
    tassert_eq(50, io.file.size(file));

    // Read all content
    str_s content;
    e$ret(io.fread_all(file, &content, mem$));
    mem$free(mem$, content.buf); // content.buf is allocated by mem$ !

    // Cleanup
    io.fclose(&file); // file will be set to NULL
    tassert(file == NULL);

    return EOK;
}
  • File load/save (easy api)

test$case(test_fload_save)
{
    tassert_eq(Error.ok, io.file.save("tests/data/text_file_write.txt", "Hello from CEX!\n"));
    char* content = io.file.load("tests/data/text_file_write.txt", mem$);
    tassert(content);
    tassert_eq(content, "Hello from CEX!\n");
    mem$free(mem$, content);
    return EOK;
}
  • File read/write lines
test$case(test_write_line)
{
    FILE* file;
    tassert_eq(Error.ok, io.fopen(&file, "tests/data/text_file_write.txt", "w+"));

    str_s content;
    mem$scope(tmem$, _)
    {
        // Writing line by line
        tassert_eq(EOK, io.file.writeln(file, "hello"));
        tassert_eq(EOK, io.file.writeln(file, "world"));

        // Reading line by line
        io.rewind(file);

        // easy api - backed by temp allocator
        tassert_eq("hello", io.file.readln(file, _));

        // low-level api (using heap allocator, needs free!)
        tassert_er(EOK, io.fread_line(file, &content, mem$));
        tassert(str.slice.eq(content, str$s("world")));
        mem$free(mem$, content.buf);
    }

    io.fclose(&file);
    return EOK;
}
  • File low-level write/read

test$case(test_read_loop)
{
    FILE* file;
    tassert_eq(Error.ok, io.fopen(&file, "tests/data/text_file_50b.txt", "r+"));

    char buf[128] = {0};

    // Read bytes
    isize nread = 0;
    while((nread = io.fread(file, buf, 10))) {
        if (nread < 0) {
            // TODO: io.fread() error occurred, you should handle it here
            // NOTE: you can use os.get_last_error() for Exception representation of io.fread() err
            break;
        }
        
        tassert_eq(nread, 10);
        buf[10] = '\0';
        io.printf("%s", buf);
    }

    // Write bytes
    char buf2[] = "foobar";
    tassert_eq(EOK, io.fwrite(file, buf2, arr$len(buf2)));

    io.fclose(&file);
    return EOK;
}
/// Makes string literal with ansi colored text
#define io$ansi(text, ansi_col)



io {
    // Autogenerated by CEX
    // clang-format off

    /// Closes file and set it to NULL.
    void            (*fclose)(FILE** file);
    /// Flush changes to file
    Exception       (*fflush)(FILE* file);
    /// Obtain file descriptor from FILE*
    int             (*fileno)(FILE* file);
    /// Opens new file: io.fopen(&file, "file.txt", "r+")
    Exception       (*fopen)(FILE** file, char* filename, char* mode);
    /// Prints formatted string to the file. Uses CEX printf() engine with special formatting.
    Exc             (*fprintf)(FILE* stream, char* format,...);
    /// Read file contents into the buf, return nbytes read (can be < buff_len), 0 on EOF, negative on
    /// error (you may use os.get_last_error() for getting Exception for error, cross-platform )
    isize           (*fread)(FILE* file, void* buff, usize buff_len);
    /// Read all contents of the file, using allocator. You should free `s.buf` after.
    Exception       (*fread_all)(FILE* file, str_s* s, IAllocator allc);
    /// Reads line from a file into str_s buffer, allocates memory. You should free `s.buf` after.
    Exception       (*fread_line)(FILE* file, str_s* s, IAllocator allc);
    /// Seek file position
    Exception       (*fseek)(FILE* file, long offset, int whence);
    /// Returns current cursor position into `size` pointer
    Exception       (*ftell)(FILE* file, usize* size);
    /// Writes bytes to the file
    Exception       (*fwrite)(FILE* file, void* buff, usize buff_len);
    /// Check if current file supports ANSI colors and in interactive terminal mode
    bool            (*isatty)(FILE* file);
    /// Prints formatted string to stdout. Uses CEX printf() engine with special formatting.
    int             (*printf)(char* format,...);
    /// Rewind file cursor at the beginning
    void            (*rewind)(FILE* file);

    struct {
        /// Load full contents of the file at `path`, using text mode. Returns NULL on error.
        char*           (*load)(char* path, IAllocator allc);
        /// Reads line from file, allocates result. Returns NULL on error.
        char*           (*readln)(FILE* file, IAllocator allc);
        /// Saves full `contents` in the file at `path`, using text mode.
        Exception       (*save)(char* path, char* contents);
        /// Return full file size, always 0 for NULL file or atty
        usize           (*size)(FILE* file);
        /// Writes new line to the file
        Exception       (*writeln)(FILE* file, char* line);
    } file;

    // clang-format on
};
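
The `io.fread` contract described above (byte count, 0 on EOF, negative on error) can be illustrated with plain stdio. This is only a sketch of the idea, not the CEX implementation; `sketch_read_chunk` and `sketch_count_bytes` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of CEX): reads up to `buf_len` bytes.
 * Returns the number of bytes read, 0 on EOF, or -1 on a stream error. */
ptrdiff_t sketch_read_chunk(FILE* file, void* buf, size_t buf_len)
{
    assert(file != NULL && buf != NULL);
    size_t n = fread(buf, 1, buf_len, file);
    if (n == 0 && ferror(file)) {
        return -1; /* real I/O error, as opposed to a plain EOF */
    }
    return (ptrdiff_t)n;
}

/* Counts the bytes left in `file` by reading `chunk`-sized pieces (chunk <= 128).
 * Returns -1 if any read fails. */
long sketch_count_bytes(FILE* file, size_t chunk)
{
    char buf[128];
    assert(chunk > 0 && chunk <= sizeof(buf));
    long total = 0;
    ptrdiff_t nread;
    while ((nread = sketch_read_chunk(file, buf, chunk)) != 0) {
        if (nread < 0) {
            return -1; /* propagate the error to the caller */
        }
        total += (long)nread; /* the last chunk may be shorter than `chunk` */
    }
    return total;
}
```

The loop has the same shape as the `io.fread` example: a zero result ends it, a negative result is handled separately, and a short final chunk is expected.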

log$

Simple console logging engine:

  • Prints file:line + log type: [INFO] ( file.c:14 cexy_fun() ) Message format: ./cex
  • Supports CEX formatting engine
  • Can be regulated using compile time level, e.g. #define CEX_LOG_LVL 4

Log levels (CEX_LOG_LVL value):

  • 0 - mute all including assert messages, tracebacks, errors
  • 1 - allow log$error + assert messages, tracebacks
  • 2 - allow log$warn
  • 3 - allow log$info
  • 4 - allow log$debug (default level if CEX_LOG_LVL is not set)
  • 5 - allow log$trace
/// Log debug (when CEX_LOG_LVL > 3)
#define log$debug(format, ...)

/// Log error (when CEX_LOG_LVL > 0)
#define log$error(format, ...)

/// Log info  (when CEX_LOG_LVL > 2)
#define log$info(format, ...)

/// Log trace (when CEX_LOG_LVL > 4)
#define log$trace(format, ...)

/// Log warning  (when CEX_LOG_LVL > 1)
#define log$warn(format, ...)
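
Compile-time level filtering of this kind can be sketched in plain C. The example below assumes a hypothetical `SKETCH_LOG_LVL` macro standing in for `CEX_LOG_LVL`, and hypothetical `sketch_log_*` macros. It only illustrates the gating idea and is not how CEX implements `log$`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for CEX_LOG_LVL; defaults to 4 (debug) when not set. */
#ifndef SKETCH_LOG_LVL
#define SKETCH_LOG_LVL 4
#endif

/* Prints "[TAG] ( file:line func() ) message" to stderr.
 * Returns the number of characters written. */
int sketch_log_emit(
    const char* tag, const char* file, int line, const char* func, const char* format, ...)
{
    assert(tag != NULL && file != NULL && func != NULL && format != NULL);
    int n = fprintf(stderr, "[%s] ( %s:%d %s() ) ", tag, file, line, func);
    va_list va;
    va_start(va, format);
    n += vfprintf(stderr, format, va);
    va_end(va);
    return n;
}

/* The level test is a constant expression, so a disabled call evaluates to 0
 * and the compiler is free to drop it entirely. */
#define sketch_log(level, tag, ...)                                                    \
    ((SKETCH_LOG_LVL >= (level))                                                       \
         ? sketch_log_emit(tag, __FILE__, __LINE__, __func__, __VA_ARGS__)             \
         : 0)

#define sketch_log_error(...) sketch_log(1, "ERROR", __VA_ARGS__)
#define sketch_log_warn(...) sketch_log(2, "WARN", __VA_ARGS__)
#define sketch_log_info(...) sketch_log(3, "INFO", __VA_ARGS__)
#define sketch_log_debug(...) sketch_log(4, "DEBUG", __VA_ARGS__)
#define sketch_log_trace(...) sketch_log(5, "TRACE", __VA_ARGS__)
```

With the default level of 4, `sketch_log_info("x=%d\n", 1)` prints a line, while `sketch_log_trace(...)` evaluates to 0 without printing.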


mem$

Mem cheat-sheet

Global allocators:

  • mem$ - heap based allocator, typically used for long-living data, requires explicit mem$free
  • tmem$ - temporary allocator, backed by ArenaAllocator with a 256 KB page, requires mem$scope

Memory management hints:

  • If a function accepts IAllocator as an argument, it allocates memory
  • If a class/object accepts IAllocator in its constructor, it should keep track of the allocator instance
  • mem$scope() - automatically frees memory at scope exit for any reason (return, goto out, break)
  • consider mem$malloc/mem$calloc/mem$realloc/mem$free/mem$new
  • You can init arena scope with mem$arena(page_size, arena_var_name)
  • AllocatorArena grows dynamically if there is no room in the existing page, but be careful with many realloc() calls: they can grow arenas unexpectedly large.
  • Use the temp allocator as mem$scope(tmem$, _) {}. It’s a common CEX pattern; _ is a short alias for tmem$
  • Nested mem$scope blocks are allowed, but memory is freed when the nested scope exits. NOTE: don’t share pointers across scopes.
  • Use address sanitizers as often as possible
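
Freeing at scope exit "for any reason" is possible in C with the `cleanup` variable attribute, a GCC/Clang extension. The sketch below shows that general technique. Whether `mem$scope` is built this way is an assumption, and `sketch_scoped_alloc` and the other `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts releases so the behaviour can be observed; for illustration only. */
int sketch_scope_releases = 0;

/* Cleanup handler: the compiler passes a pointer to the annotated variable
 * when that variable goes out of scope. */
void sketch_scope_release(void** ptr)
{
    assert(ptr != NULL);
    free(*ptr);
    *ptr = NULL;
    sketch_scope_releases++;
}

/* Hypothetical macro: declares `name` and frees it when the enclosing block exits. */
#define sketch_scoped_alloc(name, size)                                                \
    __attribute__((cleanup(sketch_scope_release))) void* name = malloc(size)

/* Both return paths release the buffer without an explicit free(). */
int sketch_early_return(int bail_out)
{
    sketch_scoped_alloc(buf, 64);
    if (buf == NULL) {
        return -1;
    }
    if (bail_out) {
        return 1; /* cleanup handler runs on this path */
    }
    return 0; /* and on this one */
}
```

The same caveat as for nested scopes applies: a pointer obtained inside the block must not be used after the block exits.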

Examples:

  • Vanilla heap allocator
test$case(test_allocator_api)
{
    u8* p = mem$malloc(mem$, 100);
    tassert(p != NULL);

    // mem$free always nullifies pointer
    mem$free(mem$, p);
    tassert(p == NULL);

    p = mem$calloc(mem$, 100, 100, 32); // zero-initialized, with 32-byte alignment
    tassert(p != NULL);

    // Allocates new ZII struct based on given type
    auto my_item = mem$new(mem$, struct my_type_s);

    return EOK;
}
  • Temporary memory scope
mem$scope(tmem$, _)
{
    arr$(char*) incl_path = arr$new(incl_path, _);
    for$each (p, alt_include_path) {
        arr$push(incl_path, p);
        if (!os.path.exists(p)) { log$warn("alt_include_path not exists: %s\n", p); }
    }
}
  • Arena Scope
mem$arena(4096, arena)
{
    // This needs extra page
    u8* p2 = mem$malloc(arena, 10040);
    mem$scope(arena, tal)
    {
        u8* p3 = mem$malloc(tal, 100);
    }
}
  • Arena Instance
IAllocator arena = AllocatorArena.create(4096);

u8* p = mem$malloc(arena, 100); // direct use allowed

mem$scope(arena, tal)
{
    // NOTE: this scope will be freed after exit
    u8* p2 = mem$malloc(tal, 100000);

    mem$scope(arena, tal)
    {
        u8* p3 = mem$malloc(tal, 100);
    }
}

AllocatorArena.destroy(arena);
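
To make the page growth and scope behaviour above more concrete, here is a minimal bump-allocator sketch in plain C. It assumes a simple design (linked pages, mark/rewind for scopes). All `sketch_*` names are hypothetical and the real AllocatorArena may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct sketch_page {
    struct sketch_page* prev; /* page that was filled before this one */
    size_t cap;               /* usable bytes in `data` */
    size_t used;              /* bytes handed out so far */
    max_align_t data[];       /* flexible array keeps the payload aligned */
} sketch_page;

typedef struct {
    sketch_page* top; /* page currently being filled */
    size_t page_size; /* default capacity of a new page */
    size_t npages;    /* number of live pages */
} sketch_arena;

typedef struct {
    sketch_page* page;
    size_t used;
} sketch_mark;

/* Bump allocation; adds a page when the current one has no room. */
void* sketch_arena_alloc(sketch_arena* a, size_t size)
{
    assert(a != NULL && a->page_size > 0);
    size_t align = _Alignof(max_align_t);
    size = (size + align - 1) & ~(align - 1); /* keep every block aligned */
    if (a->top == NULL || a->top->cap - a->top->used < size) {
        /* Oversized requests get a page of their own size. */
        size_t cap = size > a->page_size ? size : a->page_size;
        sketch_page* p = malloc(sizeof(*p) + cap);
        if (p == NULL) {
            return NULL;
        }
        p->prev = a->top;
        p->cap = cap;
        p->used = 0;
        a->top = p;
        a->npages++;
    }
    void* out = (char*)a->top->data + a->top->used;
    a->top->used += size;
    return out;
}

/* Scope entry: remember the current position. */
sketch_mark sketch_arena_mark(sketch_arena* a)
{
    sketch_mark m = { a->top, a->top ? a->top->used : 0 };
    return m;
}

/* Scope exit: free pages created since the mark and restore the position.
 * Rewinding to an empty mark { NULL, 0 } destroys the whole arena. */
void sketch_arena_rewind(sketch_arena* a, sketch_mark m)
{
    while (a->top != m.page) {
        sketch_page* p = a->top;
        a->top = p->prev;
        a->npages--;
        free(p);
    }
    if (a->top != NULL) {
        a->top->used = m.used;
    }
}
```

A request larger than the page size forces an extra page, which is why heavy reallocation inside an arena can make it grow more than expected.
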
/// General purpose heap allocator
#define mem$

/// Gets address of struct member
#define mem$addressof(typevar, value)

/// Checks if pointer address of `p` is aligned to `alignment`
#define mem$aligned_pointer(p, alignment)

/// Rounds `size` to the closest alignment
#define mem$aligned_round(size, alignment)

/// Creates new ArenaAllocator instance in scope, frees it at scope exit
#define mem$arena(page_size, allc_var)

/// true - if program was compiled with address sanitizer support
#define mem$asan_enabled()

/// Poisons memory region with ASAN, or fill it with 0xf7 byte pattern (no ASAN)
#define mem$asan_poison(addr, size)

/// Check if previously poisoned address is consistent, and 0xf7 pattern not overwritten (no ASAN)
#define mem$asan_poison_check(addr, size)

/// Unpoisons memory region with ASAN, or fill it with 0x00 byte pattern (no ASAN)
#define mem$asan_unpoison(addr, size)

/// Allocate zero initialized chunk of memory using `allocator`
#define mem$calloc(allocator, nmemb, size, alignment...)

/// Free previously allocated chunk of memory, `ptr` implicitly set to NULL
#define mem$free(allocator, ptr)

/// Checks if `s` value is power of 2
#define mem$is_power_of2(s)

/// Allocate uninitialized chunk of memory using `allocator`
#define mem$malloc(allocator, size, alignment...)

/// Allocates generic type instance using `allocator`, result is zero filled, size and alignment
/// derived from type T
#define mem$new(allocator, T)

/// Gets offset in bytes of struct member
#define mem$offsetof(var, field)

/// Returns 32 for 32-bit platform, or 64 for 64-bit platform
#define mem$platform()

/// Reallocate chunk of memory using `allocator`
#define mem$realloc(allocator, old_ptr, size, alignment...)

/// Opens new memory scope using Arena-like allocator, frees all memory after scope exit
#define mem$scope(allocator, allc_var)
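
The alignment helpers listed above (`mem$is_power_of2`, `mem$aligned_round`, `mem$aligned_pointer`) correspond to standard bit tricks. The sketch below assumes that "rounds to the closest alignment" means rounding up and that the alignment is a power of 2. The function names are hypothetical and this is not the CEX source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when exactly one bit is set. */
bool sketch_is_power_of2(size_t s)
{
    return s != 0 && (s & (s - 1)) == 0;
}

/* Rounds `size` up to the next multiple of `alignment` (a power of 2). */
size_t sketch_aligned_round(size_t size, size_t alignment)
{
    assert(sketch_is_power_of2(alignment));
    return (size + alignment - 1) & ~(alignment - 1);
}

/* True when the address of `p` is a multiple of `alignment` (a power of 2). */
bool sketch_aligned_pointer(const void* p, size_t alignment)
{
    assert(sketch_is_power_of2(alignment));
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

For example, a 100-byte request with 32-byte alignment rounds up to 128 bytes, while a 64-byte request stays at 64.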


os

Cross-platform OS related operations:

  • os.cmd. - for running commands and interacting with them
  • os.fs. - file-system related tasks
  • os.env. - getting and setting environment variables
  • os.path. - file path operations
  • os.platform. - information about current platform

Examples:

  • Running simple commands
// NOTE: the cexy build system contains many examples of OS-related operations
// try to play in example roulette:  ./cex help --example os.cmd.run

// Easy macro, run fixed number of arguments

e$ret(os$cmd(
    cexy$cc,
    "src/main.c",
    cexy$build_dir "/sqlite3.o",
    cexy$cc_include,
    "-lpthread",
    "-lm",
    "-o",
    cexy$build_dir "/hello_sqlite"
));
  • Running dynamic arguments
mem$scope(tmem$, _)
{
    arr$(char*) args = arr$new(args, _);
    arr$pushm(
        args,
        cexy$cc,
        "-Wall",
        "-Werror",
    );

    if (os.platform.current() == OSPlatform__win) {
        arr$pushm(args, "-lbcrypt");
    }

    arr$pushm(args, NULL); // NOTE: last element must be NULL
    e$ret(os$cmda(args));
}
  • Getting command output (low level api)

test$case(os_cmd_create)
{
    os_cmd_c c = { 0 };
    mem$scope(tmem$, _)
    {
        char* args[] = { "./cex", NULL };
        tassert_er(EOK, os.cmd.create(&c, args, arr$len(args), NULL));

        char* output = os.cmd.read_all(&c, _);
        tassert(output != NULL);
        io.printf("%s\n", output);

        int err_code = 0;
        tassert_er(Error.runtime, os.cmd.join(&c, 0, &err_code));
        tassert_eq(err_code, 1);
    }
    return EOK;
}
  • Working with files

test$case(test_os_find_all_c_files)
{
    mem$scope(tmem$, _)
    {
        // Check if exists and remove
        if (os.path.exists("./cex")) { e$ret(os.fs.remove("./cex")); }

        // illustration of path combining
        char* pattern = os$path_join(_, "./", "*.c");

        // find all matching *.c files
        for$each (it, os.fs.find(pattern, false, _)) {
            log$debug("found file: %s\n", it);
        }
    }

    return EOK;
}
/// OS path separator, generally '\' for Windows, '/' otherwise
#define os$PATH_SEP

/// Run command by arbitrary set of arguments (returns Exc, but error check is not mandatory). Pipes
/// all IO to stdout/err/in into current terminal, feels totally interactive. 
/// Example: e$ret(os$cmd("cat", "./cex.c"))
#define os$cmd(args...)

/// Run command by dynamic or static array (returns Exc, but error check is not mandatory). Pipes
/// all IO to stdout/err/in into current terminal, feels totally interactive.
#define os$cmda(args, args_len...)

/// Path parts join by variable set of args: os$path_join(mem$, "foo", "bar", "cex.c")
#define os$path_join(allocator, path_parts...)
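
Joining path parts with a separator can be sketched in plain C as a two-pass variadic function. This is an illustration only: `sketch_path_join`, the explicit part count, and the fixed '/' separator are assumptions made for this example, while `os$path_join` takes an allocator and uses the platform separator.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed separator; os$PATH_SEP plays this role in CEX. */
#define SKETCH_PATH_SEP '/'

/* Joins `nparts` C strings with SKETCH_PATH_SEP.
 * Returns a malloc()-ed string the caller must free, or NULL on error. */
char* sketch_path_join(size_t nparts, ...)
{
    va_list va;
    size_t total = 0;

    /* First pass: measure the result. */
    va_start(va, nparts);
    for (size_t i = 0; i < nparts; i++) {
        const char* part = va_arg(va, const char*);
        if (part == NULL) {
            va_end(va);
            return NULL;
        }
        total += strlen(part) + 1; /* +1 for a separator or the final '\0' */
    }
    va_end(va);
    if (total == 0) {
        return NULL; /* nothing to join */
    }

    char* out = malloc(total);
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: copy parts and separators. */
    char* cur = out;
    va_start(va, nparts);
    for (size_t i = 0; i < nparts; i++) {
        const char* part = va_arg(va, const char*);
        size_t len = strlen(part);
        memcpy(cur, part, len);
        cur += len;
        *cur++ = (i + 1 < nparts) ? SKETCH_PATH_SEP : '\0';
    }
    va_end(va);
    assert(cur == out + total); /* every measured byte was written */
    return out;
}
```

Calling `sketch_path_join(3, "foo", "bar", "cex.c")` yields `foo/bar/cex.c`.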

/// Command container (current state of subprocess)
typedef os_cmd_c

/// Additional flags for os.cmd.create()
typedef os_cmd_flags_s

/// File stats metadata (cross-platform), returned by os.fs.stat
typedef os_fs_stat_s



os {
    // Autogenerated by CEX
    // clang-format off

    /// Get last system API error as string representation (Exception compatible). Result content may be
    /// affected by OS locale settings.
    Exc             (*get_last_error)(void);
    /// Sleep for `period_millisec` duration
    void            (*sleep)(u32 period_millisec);
    /// Get high performance monotonic timer value in seconds
    f64             (*timer)(void);

    struct {
        /// Creates new os command (use os$cmd() and os$cmda() for easy cases)
        Exception       (*create)(os_cmd_c* self, char** args, usize args_len, os_cmd_flags_s* flags);
        /// Check if `cmd_exe` program name exists in PATH. cmd_exe can be absolute, or simple command name,
        /// e.g. `cat`
        bool            (*exists)(char* cmd_exe);
        /// Get running command stderr stream
        FILE*           (*fstderr)(os_cmd_c* self);
        /// Get running command stdin stream
        FILE*           (*fstdin)(os_cmd_c* self);
        /// Get running command stdout stream
        FILE*           (*fstdout)(os_cmd_c* self);
        /// Checks if process is running
        bool            (*is_alive)(os_cmd_c* self);
        /// Waits process to end, and get `out_ret_code`, if timeout_sec=0 - infinite wait, raises
        /// Error.runtime if out_ret_code != 0
        Exception       (*join)(os_cmd_c* self, u32 timeout_sec, i32* out_ret_code);
        /// Terminates the running process
        Exception       (*kill)(os_cmd_c* self);
        /// Read all output from process stdout, NULL if stdout is not available
        char*           (*read_all)(os_cmd_c* self, IAllocator allc);
        /// Read line from process stdout, NULL if stdout is not available
        char*           (*read_line)(os_cmd_c* self, IAllocator allc);
        /// Run command using arguments array and resulting os_cmd_c
        Exception       (*run)(char** args, usize args_len, os_cmd_c* out_cmd);
        /// Writes line to the process stdin
        Exception       (*write_line)(os_cmd_c* self, char* line);
    } cmd;

    struct {
        /// Get environment variable, with `deflt` if not found
        char*           (*get)(char* name, char* deflt);
        /// Set environment variable
        Exception       (*set)(char* name, char* value);
    } env;

    struct {
        /// Change current working directory
        Exception       (*chdir)(char* path);
        /// Copy file
        Exception       (*copy)(char* src_path, char* dst_path);
        /// Copy directory recursively
        Exception       (*copy_tree)(char* src_dir, char* dst_dir);
        /// Iterates over directory (can be recursive) using callback function
        Exception       (*dir_walk)(char* path, bool is_recursive, os_fs_dir_walk_f callback_fn, void* user_ctx);
        /// Finds files in `dir/pattern`, for example "./mydir/*.c" (all c files), if is_recursive=true, all
        /// *.c files found in sub-directories.
        arr$(char*)     (*find)(char* path_pattern, bool is_recursive, IAllocator allc);
        /// Get current working directory
        char*           (*getcwd)(IAllocator allc);
        /// Makes directory (no error if exists)
        Exception       (*mkdir)(char* path);
        /// Makes all directories in a path
        Exception       (*mkpath)(char* path);
        /// Removes file or empty directory (also see os.fs.remove_tree)
        Exception       (*remove)(char* path);
        /// Removes directory and all its contents recursively
        Exception       (*remove_tree)(char* path);
        /// Renames file or directory
        Exception       (*rename)(char* old_path, char* new_path);
        /// Returns cross-platform path stats information (see os_fs_stat_s)
        os_fs_stat_s    (*stat)(char* path);
    } fs;

    struct {
        /// Returns absolute path from relative
        char*           (*abs)(char* path, IAllocator allc);
        /// Get file name of a path
        char*           (*basename)(char* path, IAllocator allc);
        /// Get directory name of a path
        char*           (*dirname)(char* path, IAllocator allc);
        /// Check if file/directory path exists
        bool            (*exists)(char* file_path);
        /// Join path with OS specific path separator
        char*           (*join)(char** parts, u32 parts_len, IAllocator allc);
        /// Splits path by `dir` and `file` parts, when return_dir=true - returns `dir` part, otherwise
        /// `file` part
        str_s           (*split)(char* path, bool return_dir);
    } path;

    struct {
        /// Returns OSArch from string
        OSArch_e        (*arch_from_str)(char* name);
        /// Converts arch to string
        char*           (*arch_to_str)(OSArch_e platform);
        /// Returns current OS platform, returns enum of OSPlatform__*, e.g. OSPlatform__win,
        /// OSPlatform__linux, OSPlatform__macos, etc..
        OSPlatform_e    (*current)(void);
        /// Returns string name of current platform
        char*           (*current_str)(void);
        /// Converts platform name to enum
        OSPlatform_e    (*from_str)(char* name);
        /// Converts platform enum to name
        char*           (*to_str)(OSPlatform_e platform);
    } platform;

    // clang-format on
};
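
The `os.path.split` description above (return either the `dir` or the `file` part as a slice) can be sketched without any allocation. The example assumes '/' as the only separator and ignores edge cases such as a root-only path. `sketch_path_part` and `sketch_path_split` are hypothetical names, not the CEX implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Non-owning view into the original path; not null-terminated at `len`. */
typedef struct {
    const char* buf;
    size_t len;
} sketch_path_part;

/* Returns the directory part when `return_dir` is true, otherwise the file part.
 * Returns { NULL, 0 } for a NULL path. */
sketch_path_part sketch_path_split(const char* path, bool return_dir)
{
    sketch_path_part out = { NULL, 0 };
    if (path == NULL) {
        return out;
    }

    size_t len = strlen(path);
    const char* sep = strrchr(path, '/'); /* last separator, if any */
    size_t dir_len = (sep != NULL) ? (size_t)(sep - path) : 0;
    size_t file_start = (sep != NULL) ? dir_len + 1 : 0;
    assert(file_start <= len);

    if (return_dir) {
        out.buf = path;
        out.len = dir_len;
    } else {
        out.buf = path + file_start;
        out.len = len - file_start;
    }
    return out;
}
```

For `src/lib/cex.c` the directory part is the 7-byte view `src/lib` and the file part is the 5-byte view `cex.c`. Both point into the original string.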

sbuf

Dynamic string builder class

Key features:

  • Dynamically grown strings

  • Supports CEX specific formats

  • Can be backed by allocator or static buffer

  • Error resilient - allows self as NULL

  • sbuf_c - is an alias of char*, always null terminated, compatible with any C strings

  • Allocator driven dynamic string

sbuf_c s = sbuf.create(20, mem$);

// These may fail (you may use them with e$* checks or add final check)
sbuf.appendf(&s, "%s, CEX slice: %S\n", "456", str$s("slice"));
sbuf.append(&s, "some string");

e$except(err, sbuf.validate(&s)) {
    // Error handling
}

if (!sbuf.isvalid(&s)) {
    // Error, just a boolean flag
}

// Some other stuff
s[i]   // getting i-th character of string
strlen(s); // C strings work, because sbuf_c is vanilla char*
sbuf.len(&s); // faster way of getting length (uses metadata)
sbuf.grow(&s, new_capacity); // increase capacity
sbuf.capacity(&s); // current capacity, 0 if error occurred
sbuf.clear(&s); // reset dynamic string + null term




// Frees the memory and sets s to NULL
sbuf.destroy(&s);
  • Static buffer backed string

// NOTE: `s` address is different, because `buf` will contain header and metadata, use only `s`
char buf[64];
sbuf_c s = sbuf.create_static(buf, arr$len(buf));

// You may check every operation if needed, but this is more verbose
e$ret(sbuf.appendf(&s, "%s, CEX slice: %S\n", "456", str$s("slice")));
e$ret(sbuf.append(&s, "some string"));

// It's not mandatory, but will clean up buffer data at the end
sbuf.destroy(&s);
typedef char* sbuf_c

typedef struct sbuf_head_s



sbuf {
    // Autogenerated by CEX
    // clang-format off

    /// Append string to the builder
    Exc             (*append)(sbuf_c* self, char* s);
    /// Append format (using CEX formatting engine)
    Exc             (*appendf)(sbuf_c* self, char* format,...);
    /// Append format va (using CEX formatting engine), always null-terminating
    Exc             (*appendfva)(sbuf_c* self, char* format, va_list va);
    /// Returns string capacity from its metadata
    u32             (*capacity)(sbuf_c* self);
    /// Clears string
    void            (*clear)(sbuf_c* self);
    /// Creates new dynamic string builder backed by allocator
    sbuf_c          (*create)(usize capacity, IAllocator allocator);
    /// Creates dynamic string backed by static array
    sbuf_c          (*create_static)(char* buf, usize buf_size);
    /// Destroys the string, deallocates the memory, or nullify static buffer.
    sbuf_c          (*destroy)(sbuf_c* self);
    /// Returns false if string invalid
    bool            (*isvalid)(sbuf_c* self);
    /// Returns string length from its metadata
    u32             (*len)(sbuf_c* self);
    /// Shrinks string length to new_length
    Exc             (*shrink)(sbuf_c* self, usize new_length);
    /// Validate dynamic string state, with detailed Exception
    Exception       (*validate)(sbuf_c* self);

    // clang-format on
};
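
The statement that `sbuf_c` is a plain `char*` with metadata suggests a header stored just before the characters. The sketch below illustrates that general layout in plain C. The header fields, the growth policy, and all `sketch_sb_*` names are assumptions for illustration and do not reproduce `sbuf_head_s`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Metadata placed directly in front of the character data. */
typedef struct {
    size_t len; /* string length, without '\0' */
    size_t cap; /* bytes available for characters, including '\0' */
} sketch_head;

static sketch_head* sketch_head_of(char* s)
{
    return (sketch_head*)(void*)s - 1;
}

/* Returns an empty, null-terminated string, or NULL on allocation failure. */
char* sketch_sb_create(size_t cap)
{
    if (cap == 0) {
        cap = 1; /* room for '\0' */
    }
    sketch_head* h = malloc(sizeof(*h) + cap);
    if (h == NULL) {
        return NULL;
    }
    h->len = 0;
    h->cap = cap;
    char* s = (char*)(h + 1);
    s[0] = '\0';
    return s;
}

/* Appends `text`, growing as needed. Returns 0 on success, -1 on error.
 * NULL tolerant: a NULL builder or NULL text is reported as an error. */
int sketch_sb_append(char** self, const char* text)
{
    if (self == NULL || *self == NULL || text == NULL) {
        return -1;
    }
    sketch_head* h = sketch_head_of(*self);
    size_t add = strlen(text);
    size_t need = h->len + add + 1;
    if (need > h->cap) {
        size_t cap = (h->cap * 2 > need) ? h->cap * 2 : need;
        sketch_head* grown = realloc(h, sizeof(*h) + cap);
        if (grown == NULL) {
            return -1; /* old buffer is still valid */
        }
        h = grown;
        h->cap = cap;
        *self = (char*)(h + 1); /* realloc may have moved the block */
    }
    memcpy(*self + h->len, text, add + 1); /* copies the terminator too */
    h->len += add;
    assert(strlen(*self) == h->len);
    return 0;
}

/* Length from metadata, no scan needed; 0 for NULL. */
size_t sketch_sb_len(char* s)
{
    return (s != NULL) ? sketch_head_of(s)->len : 0;
}

/* Frees the builder and sets the handle to NULL. */
void sketch_sb_destroy(char** self)
{
    if (self != NULL && *self != NULL) {
        free(sketch_head_of(*self));
        *self = NULL;
    }
}
```

Because the handle points at the characters, it can be passed to `strlen()` or `printf("%s")` directly, while the append function takes `char**` so it can update the handle after a reallocation.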

str

CEX string principles:

  • str namespace is built for compatibility with C strings
  • all string functions are NULL resilient
  • all string functions can return NULL on error
  • you don’t have to check every operation for NULL every time, just at the end
  • all string format operations support CEX specific format specifiers (see below)

String slices:

  • Slices are backed by (str_s){.buf = s, .len = NNN} struct

  • Slices are passed by value and allocated on stack

  • Slices can be made from null-terminated strings, or buffers, or literals

  • str$s("hello") - use this for compile time defined slices/constants

  • Slices are not guaranteed to be null-terminated

  • Slices support the operations allowed by a read-only string view representation

  • CEX formatting uses %S for slices: io.printf("Hello %S\n", str$s("world"))

  • Working with slices:


test$case(test_cstr)
{
    char* cstr = "hello";
    str_s s = str.sstr(cstr);
    tassert_eq(s.buf, cstr);
    tassert_eq(s.len, 5);
    tassert(s.buf == cstr);
    tassert_eq(str.len(s.buf), 5);
    return EOK;
}
  • Getting substring as slices

str.sub("123456", 0, 0); // slice: 123456
str.sub("123456", 1, 0); // slice: 23456
str.sub("123456", 1, -1); // slice: 2345
str.sub("123456", -3, -1); // slice: 45
str.sub("123456", -30, 2000); // slice: (str_s){.buf = NULL, .len = 0} (error, but no crash)

// works with slices too
str_s s = str.sstr("123456");
str_s sub = str.slice.sub(s, 1, 2);
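
The index rules above (negative values count from the end, `end == 0` means "to the end", out-of-range yields an empty error slice) can be sketched in plain C. This assumes half-open `[start, end)` ranges as in the `1, -1` example. `sketch_str_s` and `sketch_sub` are hypothetical names and the real `str.sub` may treat edge cases differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Non-owning string view, analogous in spirit to a slice. */
typedef struct {
    const char* buf;
    size_t len;
} sketch_str_s;

/* Returns the view [start, end) of `s`, or { NULL, 0 } on error. */
sketch_str_s sketch_sub(const char* s, ptrdiff_t start, ptrdiff_t end)
{
    sketch_str_s err = { NULL, 0 };
    if (s == NULL) {
        return err;
    }
    ptrdiff_t len = (ptrdiff_t)strlen(s);
    if (start < 0) {
        start += len; /* negative start counts from the end */
    }
    if (end <= 0) {
        end += len; /* 0 means full length, negative counts from the end */
    }
    if (start < 0 || end > len || start > end) {
        return err; /* out of range: report an error instead of crashing */
    }
    assert(end - start <= len);
    sketch_str_s out = { s + start, (size_t)(end - start) };
    return out;
}
```

With this sketch, `sketch_sub("123456", 1, -1)` views `2345`, and `sketch_sub("123456", -30, 2000)` returns the empty error slice.
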
  • Splitting / iterating via tokens

// Working without mem allocation
s = str.sstr("123,456");
for$iter (str_s, it, str.slice.iter_split(s, ",", &it.iterator)) {
    io.printf("%S\n", it.val); // NOTE: it.val is non null-terminated slice
}


// Mem allocating split
mem$scope(tmem$, _)
{

    // NOTE: each `res` item will be allocated C-string, use tmem$ or deallocate independently
    arr$(char*) res = str.split("123,456,789", ",", _);
    tassert(res != NULL); // NULL on error

    for$each (v, res) {
        io.printf("%s\n", v); // NOTE: strings now cloned and null-terminated
    }
}
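
Splitting without allocation works because each token is just a view into the original buffer. The plain C sketch below shows one way to build such an iterator. `sketch_split_it`, `sketch_tok_s`, and the function names are hypothetical, and this is not how `str.slice.iter_split` is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Token view into the source buffer; not null-terminated at `len`. */
typedef struct {
    const char* buf;
    size_t len;
} sketch_tok_s;

typedef struct {
    const char* cur; /* start of the next token */
    const char* end; /* one past the last byte of the input */
    bool done;       /* set after the last token was returned */
} sketch_split_it;

sketch_split_it sketch_split_begin(const char* buf, size_t len)
{
    sketch_split_it it = { buf, (buf != NULL) ? buf + len : NULL, buf == NULL };
    return it;
}

/* Writes the next token into `out`; returns false when no tokens are left.
 * Any character of `split_by` acts as a separator. */
bool sketch_split_next(sketch_split_it* it, const char* split_by, sketch_tok_s* out)
{
    assert(it != NULL && split_by != NULL && out != NULL);
    if (it->done) {
        return false;
    }
    const char* p = it->cur;
    /* Advance to the next separator; '\0' bytes are never separators. */
    while (p < it->end && (*p == '\0' || strchr(split_by, *p) == NULL)) {
        p++;
    }
    out->buf = it->cur;
    out->len = (size_t)(p - it->cur);
    if (p == it->end) {
        it->done = true; /* this was the last token */
    } else {
        it->cur = p + 1; /* skip the separator */
    }
    return true;
}
```

Since tokens are not null-terminated, they have to be printed with an explicit length, for example `printf("%.*s", (int)tok.len, tok.buf)` in standard C.
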
  • Chaining string operations

mem$scope(tmem$, _)
{
    char* s = str.fmt(_, "hi there"); // NULL on error
    s = str.replace(s, "hi", "hello", _); // NULL tolerant, NULL on error
    s = str.fmt(_, "result is: %s", s); // NULL tolerant, NULL on error
    if (s == NULL) {
        // TODO: oops error occurred, in one of three operations, but we don't need to check each one
    }

    tassert_eq(s, "result is: hello there");
}
  • Pattern matching
// Pattern matching 101
// * - one or more characters
// ? - one character
// [abc] - one character a or b or c
// [!abc] - one character, but not a or b or c
// [abc+] - one or more characters a or b or c
// [a-cA-C0-9] - one character in a range of characters
// \\* - escaping literal '*'
// (abc|def|xyz) - matching combination of words abc or def or xyz

tassert(str.match("test.txt", "*?txt"));
tassert(str.match("image.png", "image.[jp][pn]g"));
tassert(str.match("backup.txt", "[!a]*.txt"));
tassert(!str.match("D", "[a-cA-C0-9]"));
tassert(str.match("1234567890abcdefABCDEF", "[0-9a-fA-F+]"));
tassert(str.match("create", "(run|build|create|clean)"));


// Works with slices
str_s src = str$s("my_test __String.txt");
tassert(str.slice.match(src, "*"));
tassert(str.slice.match(src, "*.txt*"));
tassert(str.slice.match(src, "my_test*.txt"));
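
For readers new to wildcard matching, the sketch below implements only `*` and `?` with the classic backtracking loop. It assumes conventional glob semantics where `*` may also match an empty run, which may differ from CEX's matcher. Character classes and word groups are left out, and `sketch_glob` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the whole of `s` matches `pattern` ('*' and '?' only).
 * NULL tolerant: returns false for NULL arguments. */
bool sketch_glob(const char* s, const char* pattern)
{
    if (s == NULL || pattern == NULL) {
        return false;
    }
    const char* star = NULL;   /* last '*' seen in the pattern */
    const char* resume = NULL; /* position in `s` to retry from after that '*' */

    while (*s != '\0') {
        if (*pattern == '*') {
            star = pattern++; /* remember it, first try matching zero characters */
            resume = s;
        } else if (*pattern == '?' || *pattern == *s) {
            pattern++; /* single character matched */
            s++;
        } else if (star != NULL) {
            assert(resume != NULL);
            pattern = star + 1; /* backtrack: let '*' absorb one more character */
            s = ++resume;
        } else {
            return false;
        }
    }
    while (*pattern == '*') {
        pattern++; /* trailing stars match the empty remainder */
    }
    return *pattern == '\0';
}
```

With this sketch, `sketch_glob("test.txt", "*?txt")` is true and `sketch_glob("abc", "a?")` is false.
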
/// Parses string contents as value type based on generic numeric type of out_var_ptr
#define str$convert(str_or_slice, out_var_ptr)

/// Joins parts of strings using a separator str$join(allc, ",", "a", "b", "c") -> "a,b,c"
#define str$join(allocator, str_join_by, str_parts...)

/// Creates str_s instance from string literals/constants: str$s("my string")
#define str$s(string)

/// Represents char* slice (string view) + may not be null-term at len!
typedef struct str_s



str {
    // Autogenerated by CEX
    // clang-format off

    /// Clones string using allocator, null tolerant, returns NULL on error.
    char*           (*clone)(char* s, IAllocator allc);
    /// Makes a copy of initial `src`, into `dest` buffer constrained by `destlen`. NULL tolerant,
    /// always null-terminated, overflow checked.
    Exception       (*copy)(char* dest, char* src, usize destlen);
    /// Checks if string ends with suffix, returns false on error, NULL tolerant
    bool            (*ends_with)(char* s, char* suffix);
    /// Compares two null-terminated strings (null tolerant)
    bool            (*eq)(char* a, char* b);
    /// Compares two strings, case insensitive, null tolerant
    bool            (*eqi)(char* a, char* b);
    /// Find a substring in a string, returns pointer to first element. NULL tolerant, and NULL on err.
    char*           (*find)(char* haystack, char* needle);
    /// Find substring from the end , NULL tolerant, returns NULL on error.
    char*           (*findr)(char* haystack, char* needle);
    /// Formats string and allocates it dynamically using allocator, supports CEX format engine
    char*           (*fmt)(IAllocator allc, char* format,...);
    /// Joins string using a separator (join_by), NULL tolerant, returns NULL on error.
    char*           (*join)(char** str_arr, usize str_arr_len, char* join_by, IAllocator allc);
    /// Calculates string length, NULL tolerant.
    usize           (*len)(char* s);
    /// Returns new lower case string, returns NULL on error, null tolerant
    char*           (*lower)(char* s, IAllocator allc);
    /// String pattern matching check (see ./cex help str$ for examples)
    bool            (*match)(char* s, char* pattern);
    /// libc `qsort()` comparator functions, for arrays of `char*`, sorting alphabetical
    int             (*qscmp)(const void* a, const void* b);
    /// libc `qsort()` comparator functions, for arrays of `char*`, sorting alphabetical case insensitive
    int             (*qscmpi)(const void* a, const void* b);
    /// Replaces substring occurrence in a string
    char*           (*replace)(char* s, char* old_sub, char* new_sub, IAllocator allc);
    /// Creates string slice from a buf+len
    str_s           (*sbuf)(char* s, usize length);
    /// Splits string using split_by (allows many) chars, returns new dynamic array of split char*
    /// tokens, allocates memory with allc, returns NULL on error. NULL tolerant. Items of array are
    /// cloned, so you need free them independently or better use arena or tmem$.
    arr$(char*)     (*split)(char* s, char* split_by, IAllocator allc);
    /// Splits string by lines, result allocated by allc, as dynamic array of cloned lines, Returns NULL
    /// on error, NULL tolerant. Items of array are cloned, so you need free them independently or
    /// better use arena or tmem$. Supports \n or \r\n.
    arr$(char*)     (*split_lines)(char* s, IAllocator allc);
    /// Analog of sprintf() uses CEX sprintf engine. NULL tolerant, overflow safe.
    Exc             (*sprintf)(char* dest, usize dest_len, char* format,...);
    /// Creates string slice of input C string (NULL tolerant, (str_s){0} on error)
    str_s           (*sstr)(char* ccharptr);
    /// Checks if string starts with prefix, returns false on error, NULL tolerant
    bool            (*starts_with)(char* s, char* prefix);
    /// Makes slices of `s` char* string, start/end are indexes, can be negative from the end, if end=0
    /// mean full length of the string. `s` may be not null-terminated. function is NULL tolerant,
    /// return (str_s){0} on error
    str_s           (*sub)(char* s, isize start, isize end);
    /// Returns new upper case string, returns NULL on error, null tolerant
    char*           (*upper)(char* s, IAllocator allc);
    /// Analog of vsprintf() uses CEX sprintf engine. NULL tolerant, overflow safe.
    Exception       (*vsprintf)(char* dest, usize dest_len, char* format, va_list va);

    struct {
        Exception       (*to_f32)(char* s, f32* num);
        Exception       (*to_f32s)(str_s s, f32* num);
        Exception       (*to_f64)(char* s, f64* num);
        Exception       (*to_f64s)(str_s s, f64* num);
        Exception       (*to_i16)(char* s, i16* num);
        Exception       (*to_i16s)(str_s s, i16* num);
        Exception       (*to_i32)(char* s, i32* num);
        Exception       (*to_i32s)(str_s s, i32* num);
        Exception       (*to_i64)(char* s, i64* num);
        Exception       (*to_i64s)(str_s s, i64* num);
        Exception       (*to_i8)(char* s, i8* num);
        Exception       (*to_i8s)(str_s s, i8* num);
        Exception       (*to_u16)(char* s, u16* num);
        Exception       (*to_u16s)(str_s s, u16* num);
        Exception       (*to_u32)(char* s, u32* num);
        Exception       (*to_u32s)(str_s s, u32* num);
        Exception       (*to_u64)(char* s, u64* num);
        Exception       (*to_u64s)(str_s s, u64* num);
        Exception       (*to_u8)(char* s, u8* num);
        Exception       (*to_u8s)(str_s s, u8* num);
    } convert;

    struct {
        /// Clone slice into new char* allocated by `allc`, null tolerant, returns NULL on error.
        char*           (*clone)(str_s s, IAllocator allc);
        /// Makes a copy of initial `src` slice, into `dest` buffer constrained by `destlen`. NULL tolerant,
        /// always null-terminated, overflow checked.
        Exception       (*copy)(char* dest, str_s src, usize destlen);
        /// Checks if slice ends with suffix, returns false on error, NULL tolerant
        bool            (*ends_with)(str_s s, str_s suffix);
        /// Compares two string slices, null tolerant
        bool            (*eq)(str_s a, str_s b);
        /// Compares two string slices, null tolerant, case insensitive
        bool            (*eqi)(str_s a, str_s b);
        /// Get index of first occurrence of `needle`, returns -1 on error.
        isize           (*index_of)(str_s s, str_s needle);
        /// iterator over slice splits:  for$iter (str_s, it, str.slice.iter_split(s, ",", &it.iterator)) {}
        str_s           (*iter_split)(str_s s, char* split_by, cex_iterator_s* iterator);
        /// Removes white spaces from the beginning of slice
        str_s           (*lstrip)(str_s s);
        /// Slice pattern matching check (see ./cex help str$ for examples)
        bool            (*match)(str_s s, char* pattern);
        /// libc `qsort()` comparator function for alphabetical sorting of str_s arrays
        int             (*qscmp)(const void* a, const void* b);
        /// libc `qsort()` comparator function for alphabetical case insensitive sorting of str_s arrays
        int             (*qscmpi)(const void* a, const void* b);
        /// Removes slice prefix (start part), or returns the same slice if it's not found
        str_s           (*remove_prefix)(str_s s, str_s prefix);
        /// Removes slice suffix (end part), or returns the same slice if it's not found
        str_s           (*remove_suffix)(str_s s, str_s suffix);
        /// Removes white spaces from the end of slice
        str_s           (*rstrip)(str_s s);
        /// Checks if slice starts with prefix, returns false on error, NULL tolerant
        bool            (*starts_with)(str_s s, str_s prefix);
        /// Removes white spaces from both ends of slice
        str_s           (*strip)(str_s s);
        /// Makes slices of `s` slice, start/end are indexes, can be negative from the end, if end=0 mean
        /// full length of the string. `s` may be not null-terminated. function is NULL tolerant, return
        /// (str_s){0} on error
        str_s           (*sub)(str_s s, isize start, isize end);
    } slice;

    // clang-format on
};
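
The `str.convert` functions report failures as a return value and write the result through a pointer. A plain C sketch of that pattern for a 32-bit integer follows. The error strings and the name `sketch_to_i32` are invented for this example, and the real conversion rules (whitespace, prefixes, bases) are not specified here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses a base-10 integer from `s` into `*num`.
 * Returns NULL on success or a static error string; `*num` is left
 * untouched on error. */
const char* sketch_to_i32(const char* s, int32_t* num)
{
    assert(num != NULL);
    if (s == NULL || *s == '\0') {
        return "ArgumentError: empty input";
    }
    char* end = NULL;
    errno = 0;
    long long v = strtoll(s, &end, 10);
    if (end == s || *end != '\0') {
        return "ArgumentError: not a number"; /* no digits or trailing garbage */
    }
    if (errno == ERANGE || v < INT32_MIN || v > INT32_MAX) {
        return "OverflowError: value out of i32 range";
    }
    *num = (int32_t)v;
    return NULL;
}
```

The range check is what separates this from a bare `atoi()` call: input such as `99999999999` is rejected instead of silently wrapping.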

test$

Unit Testing engine:

  • Running/building tests
./cex test create tests/test_mytest.c
./cex test run tests/test_mytest.c
./cex test run all
./cex test debug tests/test_mytest.c
./cex test clean all
./cex test --help
  • Unit Test structure
test$setup_case() {
    // Optional: runs before each test case
    return EOK;
}
test$teardown_case() {
    // Optional: runs after each test case
    return EOK;
}
test$setup_suite() {
    // Optional: runs once before full test suite initialized
    return EOK;
}
test$teardown_suite() {
    // Optional: runs once after full test suite ended
    return EOK;
}

test$case(my_test_case){
    e$ret(foo("raise")); // this test will fail if `foo()` raises Exception 
    return EOK; // Must return EOK for passed
}

test$case(my_test_another_case){
    tassert_eq(1, 0); // tassert_ fails the test, but does not abort the program
    return EOK; // Must return EOK for passed
}

test$main(); // mandatory at the end of each test file
  • Test checks

test$case(my_test_case){
    // Generic type assertions; on failure they print the values of both arguments

    tassert_eq(1, 1);
    tassert_eq(str, "foo");
    tassert_eq(num, 3.14);
    tassert_eq(str_slice, str$s("expected") );

    tassert(condition && "oops");
    tassertf(condition, "oops: %s", s);

    tassert_er(EOK, raising_exc_foo(0));
    tassert_er(Error.argument, raising_exc_foo(-1));

    tassert_eq_almost(PI, 3.14, 0.01); // 0.01 is float tolerance
    tassert_eq(3.4 * NAN, NAN); // NAN equality also works

    tassert_eq_ptr(a, b); // raw pointer comparison
    tassert_eq_mem(a, b); // raw buffer content comparison (a and b expected to be same size)

    tassert_eq_arr(a, b); // compare two arrays (static or dynamic)


    tassert_ne(1, 0); // not equal
    tassert_le(a, b); // a <= b
    tassert_lt(a, b); // a < b
    tassert_gt(a, b); // a > b
    tassert_ge(a, b); // a >= b

    return EOK;
}
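
A tolerance-based float comparison that also treats NaN as equal to NaN, as the `tassert_eq_almost` and NaN examples above suggest, could look like the plain C sketch below. The exact rules used by the CEX assertions are an assumption, and `sketch_eq_almost` is a hypothetical name.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Returns true when `a` and `b` differ by at most `tolerance`.
 * Two NaN values compare equal; infinities must match exactly. */
bool sketch_eq_almost(double a, double b, double tolerance)
{
    assert(tolerance >= 0.0);
    if (isnan(a) || isnan(b)) {
        return isnan(a) && isnan(b); /* plain == is always false for NaN */
    }
    if (isinf(a) || isinf(b)) {
        return a == b; /* avoids inf - inf, which is NaN */
    }
    double diff = a - b;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff <= tolerance;
}
```

The explicit NaN branch matters because `NAN == NAN` is false in C, so a plain equality check could never pass for NaN results.
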
/// Unit-test test case
#define test$case(NAME)

/// main() function for test suite, you must place it into test file at the end
#define test$main()

/// Attribute for function which disables optimization for test cases or other functions
#define test$noopt

/// Optional: called before each test$case() starts
#define test$setup_case()

/// Optional: called once before the test suite starts
#define test$setup_suite()

/// Optional: called after each test$case() ends
#define test$teardown_case()

/// Optional: shut down test suite once at the end
#define test$teardown_suite()


CEX lib

Role of CEX lib

CEX lib (see lib/ folder in repo) is designed to be a collection of random tools and libraries that are not so frequently used. Currently it’s in early alpha stage, API stability is not guaranteed, backward compatibility is not guaranteed. Feel free to contribute your ideas, if you think it could be useful.

Installing libraries

Installing and updating libs from the main CEX repo is straightforward with the cex libfetch command:

cex libfetch lib/test/fff.h                            - fetch single header lib from CEX repo
cex libfetch -U cex.h                                  - update cex.h to most recent version
cex libfetch lib/random/                               - fetch whole directory recursively from CEX lib
cex libfetch lib/                                      - fetch everything available in CEX lib
cex libfetch --git-label=v2.0 file.h                   - fetch using specific label or commit
cex libfetch -u https://github.com/m/lib.git file.h    - fetch from arbitrary repo
cex help --example cexy.utils.git_lib_fetch            - you can call it from your cex.c
cex libfetch --help                                    - more help

Credits

CEX contains some code and ideas from the following projects, all of them licensed under the MIT license (or Public Domain):

  1. nob.h - by Tsoding / Alexey Kutepov, MIT/Public domain, the great idea of making a self-contained build system, great YouTube channel btw
  2. stb_ds.h - MIT/Public domain, by Sean Barrett, CEX arr/hm are refactored versions of STB data structures, great idea
  3. stb_sprintf.h - MIT/Public domain, by Sean Barrett, I refactored it, fixed all UB warnings from UBSAN, added CEX specific formatting
  4. minirent.h - Alexey Kutepov, MIT license, WIN32 compatibility lib
  5. subprocess.h - by Neil Henning, public domain, used in CEX as a daily driver for os$cmd and process communication
  6. utest.h - by Neil Henning, public domain, CEX test$ runner borrowed some ideas of macro magic for making declarative test cases
  7. c3-lang - I borrowed some ideas about language features from C3, especially mem$scope, the mem$/tmem$ global allocators, and scoped macros.

License


MIT License

Copyright (c) 2024-2025 Aleksandr Vedeneev

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.