cwal

MaybeSomeday

Maybe Someday...

cwal has, as of this writing (2019-10-17), been in development for just over 7 years and has seen considerable use in my own source trees. Given the benefit of hindsight, there are certain aspects i would eventually like to add or modify but, frankly, am unlikely to because of physical limits imposed by chronic RSI. This page covers those "maybe someday" features...

(Note that this is specifically about the underlying cwal library, not s2 or whcl.)

Scopes vs GC Roots

The cwal_scope class has always performed two closely-related duties:

  1. Acting as a "GC root" - it is used to keep track of which values are "reachable" and which should be reaped.

  2. Acting as a property store for scope-local variables. These properties are "script-visible", in that they can always be reached by client code (so long as the scope lives), whereas the first feature tracks both script-visible and errant/orphaned values owned by this level of the scope stack.

In hindsight, those two concepts should be separated into two classes. The 2nd feature is actually optional, as far as client-side code goes - client-side code is free to track its concept of scope-level variables in its own data structures, provided it keeps their lifetimes in sync with cwal's scope stack for lifetime/garbage collection purposes. cwal's core does not use that feature - it was initially provided as a convenience for client-side code, and can be/is still used that way. More advanced clients, namely s2, may reach a point where they want/need to use their own variable-tracking mechanisms, independent of this particular feature (e.g. to implement conventional variable lookup rules, as opposed to cwal's "straight up the scope stack" lookup rules).
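To make the difference between the two lookup styles concrete, here is a minimal sketch of a client-side variable store. Everything in it is hypothetical: the demo_scope and demo_var types, the isCallBoundary flag, and all function names are illustration-only and are not part of cwal. It also assumes one particular reading of "conventional" lookup rules, namely that a lookup stops at the nearest function-call scope instead of continuing into the caller's scopes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical client-side types: none of these names exist in cwal. */
enum { DEMO_MAX_VARS = 8 };

typedef struct demo_var {
  char const *name;
  int value; /* stand-in for a pointer to a Value */
} demo_var;

typedef struct demo_scope {
  struct demo_scope *parent; /* next scope up the stack, or NULL */
  int isCallBoundary;        /* assumption: marks a function-call scope */
  demo_var vars[DEMO_MAX_VARS];
  size_t varCount;
} demo_scope;

/* Declares a variable in s. Returns 0 on success, non-0 if s is full. */
static int demo_declare(demo_scope *s, char const *name, int value){
  if(s->varCount >= DEMO_MAX_VARS) return 1;
  s->vars[s->varCount].name = name;
  s->vars[s->varCount].value = value;
  ++s->varCount;
  return 0;
}

/* Searches only s itself, not its parents. */
static demo_var *demo_find_local(demo_scope *s, char const *name){
  for(size_t i = 0; i < s->varCount; ++i){
    if(0 == strcmp(s->vars[i].name, name)) return &s->vars[i];
  }
  return NULL;
}

/* "Straight up the scope stack": every parent scope is searched. */
static demo_var *demo_lookup_stack(demo_scope *s, char const *name){
  for( ; s; s = s->parent ){
    demo_var *v = demo_find_local(s, name);
    if(v) return v;
  }
  return NULL;
}

/* Bounded lookup: stops after searching the nearest call-boundary
   scope, so a function body cannot see its caller's variables. */
static demo_var *demo_lookup_bounded(demo_scope *s, char const *name){
  for( ; s; s = s->parent ){
    demo_var *v = demo_find_local(s, name);
    if(v) return v;
    if(s->isCallBoundary) break;
  }
  return NULL;
}
```

A client using something like this would still need to create and destroy its demo_scope instances in lockstep with cwal's scope stack so that the values they refer to remain alive for exactly as long as they are visible.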

Splitting those up in the current source tree would cause Massive Grief in terms of the effort needed to bring s2 up to date, so it's extremely unlikely to happen. If cwal ever gets a "version 2", though, this would be one of the first notable architectural changes. With that change, roots could be added without having to push additional scopes. Roots would be strictly hierarchical, like cwal_scope is now, but scopes could potentially become first-class Values, which might have interesting uses vis-a-vis closures.

Scopes, as currently implemented in the core library, cannot be first-class Values because each one maps 1-to-1 to a GC root. That said, it might be feasible for higher-level code (i.e., s2) to use a cwal_native to create first-class scope Values. That would be a more heavy-weight solution, though, than splitting cwal_scope into two distinct types. Having scopes as first-class Values would allow multiple disparate scopes to live at the same level of the GC root stack, which is something the API cannot currently do (because only one scope can live at any given level of the stack). Whether or not that would really be a useful feature is not yet clear, but it sounds like it may allow more client-side flexibility without appreciable new memory costs (the main cost would be that "mobile" scopes would need to be heap-allocated).

It might (might) make sense, and be relatively painless, to internally split them up into the current cwal_scope class and a new cwal_gcroot class, and embed a cwal_gcroot instance into each scope. That would allow the separation without outright breaking client-side APIs/code and simplify an eventual separation into two disparate uses. It raises the question, though: once they are split, does cwal_scope even make sense except as a convenience form of scope-local variable storage? Only further consideration and experimentation will tell. (Sidebar: the separation into roots/scopes would probably require additional client-side API hooks to notify the client when GC roots are pushed/popped.)
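A rough sketch of what that embedding could look like follows. It is not cwal's actual layout: the sk_ types, their members, and the push/pop hook are all invented for illustration, and "reaping" is reduced to unlinking values from the root's ownership list. The point is only that a scope becomes "a root plus a property store", while a bare root can be pushed on its own.

```c
#include <assert.h>
#include <stddef.h>

/* All sk_* names are hypothetical; none of them are cwal APIs. */
typedef struct sk_value {
  struct sk_value *nextOwned; /* link in the owning root's list */
} sk_value;

typedef struct sk_gcroot {
  struct sk_gcroot *parent; /* roots are strictly hierarchical */
  sk_value *owned;          /* values reaped when this root pops */
  unsigned level;           /* 1-based depth in the root stack */
} sk_gcroot;

typedef struct sk_scope {
  sk_gcroot root; /* embedded root: every scope is still a root */
  void *props;    /* placeholder for scope-local variable storage */
} sk_scope;

typedef struct sk_engine sk_engine;
struct sk_engine {
  sk_gcroot *top;
  /* Optional client hook, called after each push (pushed != 0)
     and after each pop (pushed == 0). */
  void (*onRootChange)(sk_engine *e, sk_gcroot *r, int pushed);
  void *hookState;
};

/* Example hook: counts notifications in an int the client supplies. */
static void sk_counting_hook(sk_engine *e, sk_gcroot *r, int pushed){
  (void)r; (void)pushed;
  ++*(int *)e->hookState;
}

/* Pushes a bare root, with no scope attached. */
static void sk_root_push(sk_engine *e, sk_gcroot *r){
  r->parent = e->top;
  r->level = e->top ? e->top->level + 1 : 1;
  r->owned = NULL;
  e->top = r;
  if(e->onRootChange) e->onRootChange(e, r, 1);
}

/* Pushing a scope is just pushing its embedded root. */
static void sk_scope_push(sk_engine *e, sk_scope *s){
  s->props = NULL;
  sk_root_push(e, &s->root);
}

/* Makes r the owner of v. */
static void sk_root_adopt(sk_gcroot *r, sk_value *v){
  v->nextOwned = r->owned;
  r->owned = v;
}

/* Pops the top root and returns how many values it "reaped". A real
   implementation would unref or free each value at this point. */
static size_t sk_root_pop(sk_engine *e){
  sk_gcroot *r = e->top;
  size_t reaped = 0;
  if(!r) return 0;
  while(r->owned){
    sk_value *v = r->owned;
    r->owned = v->nextOwned;
    v->nextOwned = NULL;
    ++reaped;
  }
  e->top = r->parent;
  r->parent = NULL;
  if(e->onRootChange) e->onRootChange(e, r, 0);
  return reaped;
}
```

In this arrangement existing code that pushes scopes keeps working unchanged, and the hook gives the client the push/pop notifications it would need in order to keep its own variable stores in sync with the root stack.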

There are lots of details left to consider here, in particular the numerous side-effects this would have on client-side code (again, s2).

Sidebar: "gc root" might not be the proper term. Maybe "generation", as in "generational", would be a better term. Each generation would correspond to what is currently a scope level.

2022-03-21: some experimentation with this approach has been done in whcl. In short, that branch uses cwal_scope for the GC roots and a whcl-specific scope type to manage script-level visibility and resolution of variables.

See also: https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)

Page-based Allocator for Value Types

The various cwal_value subclasses have fixed memory sizes (not including any "variable" parts like list/hash table memory). Though the current recycling mechanism does a fantastic job of recycling those and keeping allocations to a minimum, we could possibly improve upon that (i.e., reduce the total allocation count, though probably not the total memory) by using a page-based allocator for Value types. The possible exception would be strings, as those are currently allocated in a single chunk holding both their cwal_value part and their string bytes, and that approach is not page-friendly. (OTOH, that approach is an optimization tuned for the current allocation technique - it need not be retained for page-based allocation.) Because disparate Value types may have the same sizeof, a single pager could be shared by all types of an identical size. (We already perform that calculation for grouping recycling bins of same-sized types.)

Changing/adding this would require fairly notable internal surgery but would not affect client-side APIs. It would almost certainly reduce overall allocation counts, but it may well require more total memory and it might impose a slight performance hit due to the management of the individual cells of the pager.
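For reference, a page-based allocator for fixed-size cells can be sketched roughly as follows. This is a generic free-list pager, not code from cwal: the pg_ names, the page layout, and the choice to thread the free list through unused cells are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All pg_* names are hypothetical; none of them are cwal APIs. */

/* Used only to get a size which is safe as an alignment unit. */
typedef union pg_align { void *p; long long ll; double d; long double ld; } pg_align;

/* A free cell stores the link to the next free cell in its own memory. */
typedef struct pg_cell { struct pg_cell *nextFree; } pg_cell;

typedef struct pg_page {
  struct pg_page *next;
  pg_align cells[]; /* cell memory follows the header, suitably aligned */
} pg_page;

typedef struct pg_pager {
  size_t cellSize;     /* rounded up to a multiple of sizeof(pg_align) */
  size_t cellsPerPage;
  pg_page *pages;      /* all pages, for cleanup */
  pg_cell *freeList;   /* currently unused cells across all pages */
  size_t pageAllocs;   /* number of malloc() calls made so far */
} pg_pager;

static void pg_init(pg_pager *p, size_t cellSize, size_t cellsPerPage){
  size_t const a = sizeof(pg_align);
  if(cellSize < sizeof(pg_cell)) cellSize = sizeof(pg_cell);
  p->cellSize = (cellSize + a - 1) / a * a;
  p->cellsPerPage = cellsPerPage ? cellsPerPage : 1;
  p->pages = NULL;
  p->freeList = NULL;
  p->pageAllocs = 0;
}

/* Returns one cell, allocating a new page only if no cell is free. */
static void *pg_alloc(pg_pager *p){
  pg_cell *c;
  if(!p->freeList){
    pg_page *pg = malloc(sizeof(pg_page) + p->cellSize * p->cellsPerPage);
    unsigned char *base;
    if(!pg) return NULL;
    ++p->pageAllocs;
    pg->next = p->pages;
    p->pages = pg;
    base = (unsigned char *)pg->cells;
    for(size_t i = 0; i < p->cellsPerPage; ++i){
      c = (pg_cell *)(base + i * p->cellSize);
      c->nextFree = p->freeList;
      p->freeList = c;
    }
  }
  c = p->freeList;
  p->freeList = c->nextFree;
  return c;
}

/* Returns a cell to the pager. The page itself is never released here. */
static void pg_free(pg_pager *p, void *mem){
  pg_cell *c = mem;
  c->nextFree = p->freeList;
  p->freeList = c;
}

/* Frees all pages. Any cells still in use become invalid. */
static void pg_destroy(pg_pager *p){
  while(p->pages){
    pg_page *pg = p->pages;
    p->pages = pg->next;
    free(pg);
  }
  p->freeList = NULL;
}
```

One such pager per distinct sizeof could serve every Value type of that size. The sketch also shows both trade-offs mentioned above: a page reserves memory for cells which may never be used, and every alloc/free pays for a little free-list bookkeeping.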