cwal: Scripting Engine Without A Language

cwal (pronounced "sea wall") is a scripting engine/garbage-collection library implemented in C (which is where the "c" comes from: its original name was "sewal"). cwal itself does not provide a scripting language, only the pieces a custom script parser needs in order to manage its memory, scoping, variables, a value type system with which to communicate values between script code and native code, etc. (That said, see s2 and whcl for the scripting languages currently built on top of cwal.) It could also be used as a form of garbage collector for client apps, independent of any scripting-related functionality (there's a big overlap). cwal uses a hybrid of reference counting with C++-like scoping for deterministic finalization and supports the proper tracking and cleanup of cyclic structures.
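To make the "reference counting plus scoping" idea above more concrete, here is a minimal, self-contained sketch of such a hybrid ownership model. It is an illustration only: every name in it (Value, Scope, value_new(), value_rescope(), scope_pop(), and so on) is hypothetical and is not cwal's API, and it does not model containers or cycles. It only shows how a value can be finalized either when its last reference is dropped or, at the latest, when its owning scope ends.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of refcount + scope ownership. NOT cwal's API. */
typedef struct Value Value;
typedef struct Scope Scope;
typedef void (*Finalizer)(void *data);

struct Scope { Value *values; /* values currently owned by this scope */ };

struct Value {
  int refcount;   /* new values start with no references */
  Scope *owner;   /* the scope which cleans this value up, at the latest */
  Value *next;    /* next value in the owner's list */
  void *data;     /* client data handed to the finalizer */
  Finalizer fin;  /* may be NULL */
};

/* Example finalizer: counts how often it was called. */
void count_fin(void *data) { ++*(int *)data; }

static void value_link(Value *v, Scope *s) {
  v->owner = s;
  v->next = s->values;
  s->values = v;
}

static void value_unlink(Value *v) {
  Value **pp = &v->owner->values;
  while (*pp && *pp != v) pp = &(*pp)->next;
  assert(*pp && "value must be in its owner's list");
  *pp = v->next;
}

static void value_destroy(Value *v) {
  value_unlink(v);
  if (v->fin) v->fin(v->data);
  free(v);
}

/* Creates a value owned by scope s. Returns NULL on allocation error. */
Value *value_new(Scope *s, void *data, Finalizer fin) {
  Value *v = malloc(sizeof *v);
  if (!v) return NULL;
  v->refcount = 0;
  v->data = data;
  v->fin = fin;
  value_link(v, s);
  return v;
}

void value_ref(Value *v) { ++v->refcount; }

/* Drops one reference and finalizes immediately if none remain. */
void value_unref(Value *v) {
  if (--v->refcount <= 0) value_destroy(v);
}

/* Hands v to another (typically older) scope so that it outlives
   the scope which created it. */
void value_rescope(Value *v, Scope *to) {
  value_unlink(v);
  value_link(v, to);
}

/* End of scope: finalize everything the scope still owns, even values
   whose reference count never dropped to zero. */
void scope_pop(Scope *s) {
  while (s->values) value_destroy(s->values);
}
```

In this model a value which is referenced and then unreferenced is finalized on the spot, a value which is never released is finalized when its scope is popped, and a value moved to an older scope via value_rescope() survives until that older scope is popped.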

cwal was started in late July, 2012 as a fork of cson, a JSON library for C, from which cwal initially derived its data type system and much of the related code. It was largely inspired by the TH1 scripting engine originally found in the Fossil SCM, and one of the goals of cwal is to be used as the basis for such app/library "glue languages," primarily with the goal of scripting unit tests (TH1's original purpose). cwal represents, without a doubt, my most intricate C code to date (and is amongst the cleanest, as well).

License: Dual Public Domain/MIT. The underlying JSON parser code has a BSD-ish "keep this copyright in place" license with a "do no evil" subclause. (Otherwise evildoers must be sure to replace the underlying parser before using this code. Alternately, evildoers may disable those bits at build-time: see cwal_config.h for details.)

Author: Stephan Beal

Code status: cwal has been in heavy use since 2013. Though it makes no particular API stability guarantees, its core APIs are in widespread use in my client-side code so are exceedingly unlikely to be changed in incompatible ways. The core engine works quite well when used as advertised. Work on the now-defunct th1ish scripting engine proved that cwal's general lifetime model works. More recently, s2 and whcl have superseded th1ish in every way. The jury is still out as to whether cwal can really scale to be useful in any "really interesting" scripting work, but it has been shown to at least be viable for such tasks as creating web sites. For example, wanderinghorse.net's static HTML pages are created, in large part, with s2, pushing cwal well beyond any initially envisioned uses.

The primary properties of cwal include:

Designed to be embedded in applications or libraries.
Portable C99 code (it was C89 prior to 2021-07-08, but certain potential additions really called for C99).
Has no (mutable) global state - all memory is managed within the context of an "engine" instance.
Fairly small. The core is currently (March 2019) 19k SLOC (11k LLOC) and compiles in well under 5 seconds with gcc on any reasonably modern machine. s2 is, by comparison, 28k SLOC (13k LLOC). s2's loadable modules weigh in at 18k SLOC (9k LLOC). (SLOC/LLOC are as reported by loccount.)
Your RAM is holy, and cwal treats it like a rare commodity. The engine itself is under 2kb and may be stack allocated. Internally it needs less than 1kb of dynamic memory for the core engine (minus any function-level APIs) with the caveat that (optional) auto-interning of strings can increase this "significantly" (anywhere from +2kb to +20kb in my tests, depending on the number and hash values of unique strings). Similarly, the optional inclusion of high-level class APIs adds considerable memory (maybe as much as 30-50kb).
Provides the basis for scripting app behaviours, but not a language. See s2 and whcl for the current scripting languages.
Can also be used as a sort of garbage collector, independent of any scripting-related features. It does not require a scripting language to be useful, though that is the primary target client type.
Allows one to replace the memory allocator (it must be realloc-capable).
Uses C++-like scoping and finalization. Finalizers are always called, provided the library is used properly, but destruction order is unspecified.
Provides several ECMAScript-like types and a few custom types: integers (16-, 32-, or 64-bit, depending on build options), doubles, booleans, null, undefined, strings, arrays, objects, functions, exceptions, memory buffers, hash tables, and "natives" (a generic wrapper for binding client-specific types). Non-container values (e.g. numbers and strings) are always immutable. Functions are first-class object-type values.
All values are passed around "by reference" (well, pointer), using a combination of reference counting and scope-level tracking to manage their lifetimes.
Container values (arrays, objects) can participate in graphs/cycles without introducing leaks. Making use of this requires strict adherence to the API's memory/ownership rules, but is pretty easy to do.
Can optionally automatically internalize string values, such that all strings created with the same bytes are automatically shared and cleaned up using the reference-counting mechanism. This support has a memory cost (a couple of kb per page in the interning table, with 1 page per string hash collision depth) but has amortized O(1) speed and can save tons (as in many megabytes) of memory when re-using the same strings many times (e.g. as property keys or identifiers in a loop). Its usefulness and cost vary greatly across scripts, but in all cases it speeds up operations like string-using loops (by roughly 2x in most of my tests).
Optionally supports memory capping based on various criteria: max allocation size, total/concurrent allocation count, or total/concurrent bytes allocated. When enabled, all allocations cost sizeof(void*) more bytes but this allows cwal to track all memory with byte precision, which allows one of its recyclers to work better.
The source tree includes s2 bindings to several 3rd-party libraries, so there are plentiful examples of how to use it.
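Two of the properties above, the replaceable realloc-capable allocator and the optional memory capping, can be sketched together in a few lines. The following is a hypothetical illustration and not cwal's allocator interface: the names CappedAlloc, BlockHeader, and capped_realloc() are invented for this example. It shows the general technique of prefixing each block with its size so that concurrent memory use can be tracked with byte precision and capped. For alignment safety this sketch pads its header to the size of the largest common scalar types, so it costs more per allocation than the sizeof(void*) quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout. This is NOT cwal's allocator API. */
typedef struct {
  size_t cap;    /* max bytes in use at once; 0 means "no limit" */
  size_t inUse;  /* bytes currently handed out to the client */
} CappedAlloc;

/* Per-block header which remembers the block's size. The extra union
   members exist only to keep the client's memory suitably aligned. */
typedef union {
  size_t size;
  long double ld;
  long long ll;
  void *p;
} BlockHeader;

/* realloc()-style interface: mem==NULL allocates, n==0 frees, anything
   else resizes. Returns NULL on error, if the cap would be exceeded,
   or after a free. On failure the original block remains valid. */
void *capped_realloc(CappedAlloc *st, void *mem, size_t n) {
  BlockHeader *h = mem ? (BlockHeader *)mem - 1 : NULL;
  BlockHeader *nh;
  size_t oldSize = h ? h->size : 0;
  assert(st != NULL);
  if (n == 0) {
    if (h) {
      st->inUse -= oldSize;
      free(h);
    }
    return NULL;
  }
  /* Guard against overflow when adding the header size. */
  if (n > (size_t)-1 - sizeof(BlockHeader)) return NULL;
  /* Enforce the cap on concurrently allocated bytes. */
  if (st->cap && (n > st->cap || st->inUse - oldSize > st->cap - n)) {
    return NULL;
  }
  nh = realloc(h, sizeof(BlockHeader) + n);
  if (!nh) return NULL;
  nh->size = n;
  st->inUse = st->inUse - oldSize + n;
  return nh + 1;
}
```

With a cap of 64 bytes, for example, a 40-byte allocation succeeds, a second 40-byte allocation is refused while the first is still live, and freeing the first block returns the usage counter to zero.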

Primary misfeatures:

Each engine instance is strictly single-threaded. Use by multiple threads will corrupt its state.
Similarly, each engine instance is unaware of any other engines. No value created by one engine may ever be made visible to another engine (by passing it to the APIs with a different engine instance), or corruption of each engine's state will ensue via cross-wiring of the memory's ownership. Any communication between two engine instances must be implemented using intermediary memory/channels not managed by either engine, e.g. using JSON data via some sort of message queue or raw memory buffer managed outside of cwal.
Values created via cwal (e.g. numbers and strings) do cost notably more than their raw native equivalents, especially on 64-bit platforms, where pointers cost twice as much (and we use lots of pointers). Half of that weight is related to ownership/garbage collection, so the cost is shared amongst the framework's various pieces (and would be paid elsewhere if not directly in each value). That said, its recycling mechanism tends to make them exceptionally cheap, in the aggregate, in longer-running scripts: under 1 byte per value instance (aggregated) in many cases for numeric types, and under 4 bytes per string or higher-level construct is not uncommon.
This engine is intended to stay simple and memory-light, and there are no plans at all to develop any advanced features. My rough target goal for memory use is around 2-3kb for 32-bit builds (it's currently far less!), not including purely optional features (e.g. string interning) and memory the client allocates. Currently the only part which uses significant memory out of the box is the string-interning table (currently about 3kb/page, with pages being added as newly-interned strings have hash collisions). That part is off by default and can be enabled/disabled at runtime.
cwal knowingly sacrifices performance in places for deterministic destruction behaviour. Some of its algorithms (in particular container cleanup) will (possibly) slow down when dealing with hundreds or thousands of values per scope if those values form complex cycles. (It uses a "slow but sure" method for cleaning up cycles.) Optimizations will be made where feasible as experience reveals what they should be. That said, its core-most algorithms are all O(1) where possible, and some others have a small O(N) or similarly fast computational complexity.
Reference counting in conjunction with cycles is of course historically problematic, and introduces all sorts of wrinkles. The hybrid refcount/scope-tracking ownership mechanism handles cycles quite well. Throughout the development of th1ish and s2, cwal's overall mechanism has been proven to be memory-frugal and reliable, provided it is used properly. Its GC mechanism does have a couple of "death trap" cases which clients should avoid to keep memory costs down, but in practice it performs remarkably well. (In practice, via s2, th1ish, and whcl, such cases don't come up "organically," but are easy to willfully construct.)
Deterministic finalization is a key requirement for me, and cwal makes sure that if the client uses it properly, finalization methods will be called. It cannot guarantee a "proper" order of the cleanup, but the cases where the destruction order can make a big difference are limited to client-wrapped Native types which reference each other (and "manual" finalizers can be attached to those). In my experience with other scripting engines, in such cases client code needs to add safeguards against potentially incorrect destruction order. The example which comes to mind is a database driver and the prepared statements it creates (which need to be destroyed before the driver is). The weak reference abstraction can be of some assistance here but is not a magic bullet.
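One possible shape for such a safeguard, using the database driver example, is sketched below. It is not taken from cwal or from any of its bindings: Driver, Stmt, and all of the functions are hypothetical, and the "database" is reduced to a counter so that the order of close events can be observed. The idea is that the driver keeps a list of its open statements, the driver's finalizer closes any which are still open before closing itself, and each statement's finalizer becomes a no-op if the driver got there first.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical. None of this is cwal's API. */
typedef struct Stmt Stmt;

typedef struct Driver {
  Stmt *stmts;   /* statements which are still open */
  int clock;     /* counts close events so their order is observable */
  int closedAt;  /* clock value when the driver closed; 0 = still open */
} Driver;

struct Stmt {
  Driver *db;    /* NULL once this statement has been closed */
  Stmt *next;    /* next open statement of the same driver */
  int closedAt;  /* clock value when this statement closed; 0 = open */
};

void driver_open(Driver *db) {
  db->stmts = NULL;
  db->clock = 0;
  db->closedAt = 0;
}

void stmt_open(Stmt *s, Driver *db) {
  assert(s && db && !db->closedAt);
  s->db = db;
  s->closedAt = 0;
  s->next = db->stmts;
  db->stmts = s;
}

/* Finalizer for a statement. Safe to call before or after the
   driver's finalizer, and safe to call more than once. */
void stmt_finalize(void *p) {
  Stmt *s = p;
  Stmt **pp;
  if (!s->db) return; /* already closed, possibly by the driver */
  /* Remove this statement from the driver's list of open statements. */
  for (pp = &s->db->stmts; *pp && *pp != s; pp = &(*pp)->next) {}
  if (*pp) *pp = s->next;
  s->closedAt = ++s->db->clock;
  s->db = NULL; /* never touch the driver again */
  s->next = NULL;
}

/* Finalizer for the driver: closes any statements which are still
   open first, then the driver itself. */
void driver_finalize(void *p) {
  Driver *db = p;
  if (db->closedAt) return;
  while (db->stmts) stmt_finalize(db->stmts);
  db->closedAt = ++db->clock;
}
```

Whichever finalizer the engine happens to call first, every statement ends up closed exactly once and before its driver, and a late-running statement finalizer never dereferences the already-finalized driver.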