libfossil
Code Conventions

Project and Code Conventions...

Foreword: all of this more or less evolved organically or was inherited from fossil(1) (where it evolved organically, or was inherited from sqlite (where it evol...)), and is written up here more or less as a formality. Historically i've not been a fan of coding conventions, but as someone else put it to me, "the code should look like it comes from a single source," and the purpose of this section is to help orient those looking to hack on the sources. Note that most of what is said below becomes obvious within a few minutes of looking at the sources - there's nothing earth-shatteringly new nor terribly controversial here.

The Rules/Suggestions/Guidelines/etc. are as follows...

  • C89 wherever possible, with the exception that we optionally use the C99-specified fixed-size integer types and their standard formatting strings when possible (if the platform has them, or if the configuration header is configured for them). We also use/tolerate 'long long' (via sqlite3), which is not strictly C89 but is supported on all modern compilers even when compiling in C89 mode. For gcc and workalike compilers, the -Wno-long-long flag can be used to suppress warnings about that type being non-standard. (Whether or not those warnings appear depends on other warning levels.) Apropos warning levels...
  • The canonical build environment uses the most restrictive set of warning/error levels possible, with the exception of tolerating 'long long', as mentioned above. It is highly recommended that non-canonical build environments do the same. Adding -Wall -Werror -pedantic does _not_ guarantee that all C compliance/portability problems will be caught by the compiler, but it goes a long way in helping us to write clean code. The clang compiler is particularly good at catching minor foo-foo's such as uninitialized variables.
  • API docs (as you have probably already noticed) do not (any longer) follow Fossil's comment style, but instead use Doxygen-friendly formatting. Each comment block MUST start with two or more asterisks, or '*!', or doxygen apparently doesn't understand it (http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html). When adding code snippets and whatnot to docs, please use doxygen conventions if it is not too much of an inconvenience. All public APIs must be documented with a useful amount of detail. If you hate documenting, let me know and i'll document it (it's what i do for fun).
  • Public API members have a fsl_ or FSL_ prefix (fossil_ seems too long?). For private/static members, anything goes. Optional or "add-on" APIs (e.g. fcli) may use other prefixes, but are encouraged to use an "f-word" (as it were), simply out of deference to long-standing software naming conventions.
  • Structs and functions use lower_underscore_style()
  • Overall style, especially scope blocks and indentation, should follow Fossil v1.x. We are not at all picky about whether or not there is a space after/before parens in if( foo ), and similar small details, just the overall code pattern.
  • Structs and enums all get the optional typedef so that they do not need to be qualified with 'struct' or 'enum', respectively, when used.
  • Function typedefs are named fsl_XXX_f. Implementations of such typedefs/interfaces are typically named fsl_XXX_f_SUFFIX(), where SUFFIX describes the implementation's specialization. e.g. fsl_output_f() is a callback typedef/interface and fsl_output_f_FILE() is a concrete implementation for FILE handles.
  • Typedefs for non-struct types (numerics and enums) tend to be named fsl_XXX_t.
  • Functions follow the naming pattern prefix_NOUN_VERB(), rather than the more C-conventional prefix_VERB_NOUN(), e.g. fsl_foo_get() and fsl_foo_set() rather than fsl_get_foo() and fsl_set_foo(). The primary reasons are (A) sortability for document processors and (B) they more naturally match with OO API conventions, e.g. noun.verb(). A few cases knowingly violate this convention for the sake of readability or sorting of several related functions (e.g. fsl_db_get_XXX() instead of fsl_db_XXX_get()).
  • Structs intended to be creatable on the stack are accompanied by a const instance named fsl_STRUCT_NAME_empty, and possibly by a macro named fsl_STRUCT_NAME_empty_m, both of which are "default-initialized" instances of that struct. This is superior to using memset() for struct initialization because we can define (and document) arbitrary default values and all clients who copy-construct them are unaffected by many types of changes to the struct's signature (though they may need a recompile). The intention of the fsl_STRUCT_NAME_empty_m macro is to provide a struct-embeddable form for use in other structs or copy-initialization of const structs, and the _m macro is always used to initialize its const struct counterpart. e.g. the library guarantees that fsl_cx_empty_m (a macro representing an empty fsl_cx instance) holds the same default values as fsl_cx_empty (a const fsl_cx value).
  • Returning int vs fsl_int_t vs fsl_size_t: int is used as a conventional result code. fsl_int_t is used as a signed length-style result code (e.g. printf() semantics). Unsigned ranges use fsl_size_t. char is used to indicate a boolean. ints are (also) used as a "triplean" (3 potential values, e.g. <0, 0, >0). fsl_int_t also guarantees that it will be 64-bit if available, so can be used for places where large values are needed but a negative value is legal (or handy), e.g. fsl_strndup()'s second argument. The use of the fsl_xxx_t typedefs, rather than (unsigned) int, is primarily for readability/documentation, e.g. so that readers can know immediately that the function does not use integer argument or result-code return semantics. It also allows us to better define platform-portable printf/scanf-style format modifiers for them (analogous to C99's PRIi32 and friends), which often come in handy.
  • Signed vs. unsigned types for size/length arguments: use the fsl_int_t (signed) argument type when the client may legally pass in a negative value as a hint that the API should use fsl_strlen() (or similar) to determine a byte array's length. Use fsl_size_t when no automatic length determination is possible (or desired), to "force" the client to pass the proper length. Internally fsl_int_t is used in some places where fsl_size_t "should" be used because some ported-in logic relies on loop control vars being able to go negative. Additionally, fossil internally uses negative blob lengths to mark phantom blobs, and care must be taken when using fsl_size_t with those.
  • Functions taking an ellipsis (...) are accompanied by a va_list counterpart named the same as the (...) form plus a trailing 'v'. e.g. fsl_appendf() and fsl_appendfv(). We do not use the printf()/vprintf() convention because that hoses sorting of the functions in generated/filtered API documentation.
  • Error handling/reporting: please keep in mind that the core code is a library, not an application. The main implication is that all lib-level code needs to check for errors wherever they can happen (e.g. on every single memory allocation, of which there are many) and propagate errors to the caller, to be handled at their discretion. The app-level code (fcli) is not particularly strict in this regard, and installs its own allocator which abort()s on allocation error, which simplifies app-side code somewhat vis-a-vis lib-level code.
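
The function-typedef naming rule above (fsl_XXX_f for the interface, fsl_XXX_f_SUFFIX() for implementations) might look roughly like the following. This is only a sketch: the demo_ prefix, the callback signature, and all function names are hypothetical stand-ins, not the library's actual fsl_output_f declaration.

```c
#include <stdio.h>

/* Hypothetical callback interface (illustration only, not a libfossil
   API): sends n bytes from src to the destination described by state.
   Returns 0 on success, non-0 on error. */
typedef int (*demo_output_f)(void * state, void const * src, size_t n);

/* Implementation specialized for FILE handles: state must be a FILE*. */
int demo_output_f_FILE(void * state, void const * src, size_t n){
  FILE * f = (FILE *)state;
  return (n == fwrite(src, 1, n, f)) ? 0 : -1;
}

/* Implementation which only counts bytes: state must be a size_t*. */
int demo_output_f_count(void * state, void const * src, size_t n){
  (void)src; /* unused */
  *((size_t *)state) += n;
  return 0;
}

/* Client code depends only on the interface, not on any one
   implementation. */
int demo_emit_greeting(demo_output_f out, void * state){
  return out(state, "hello\n", 6);
}
```

Passing demo_output_f_FILE with stdout as the state would print the greeting, while demo_output_f_count with a size_t pointer would only tally its length.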
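
The _empty / _empty_m convention described above can be sketched as follows. The demo_buf and demo_pair structs, their fields, and their default values are invented for this illustration and do not correspond to any real libfossil type.

```c
/* Hypothetical struct, for illustration only. */
struct demo_buf {
  char const * name; /* default: "unnamed" */
  int capacity;      /* default: 16 */
  int used;          /* default: 0 */
};
typedef struct demo_buf demo_buf;

/* Struct-embeddable initializer form. */
#define demo_buf_empty_m { "unnamed", 16, 0 }

/* Const instance, always initialized from its _m counterpart so that
   the two can never hold different defaults. */
const demo_buf demo_buf_empty = demo_buf_empty_m;

/* A struct which embeds demo_buf uses the macro form in its own
   initializer. */
struct demo_pair {
  demo_buf left;
  demo_buf right;
};
typedef struct demo_pair demo_pair;
#define demo_pair_empty_m { demo_buf_empty_m, demo_buf_empty_m }
const demo_pair demo_pair_empty = demo_pair_empty_m;

/* Client-side usage: copy-construct a stack instance from the const
   default instead of using memset(). */
int demo_buf_default_capacity(void){
  demo_buf b = demo_buf_empty;
  return b.capacity;
}
```

Unlike memset(), this lets a default be something other than all-zero bytes (here the name and capacity fields), and clients who copy from demo_buf_empty pick up changed defaults with nothing more than a recompile.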
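
The signed-length convention and the "check every allocation and propagate the error" rule above can be illustrated together. This is a sketch under assumptions: demo_int_t and demo_strndup() are hypothetical stand-ins (the real fsl_int_t is described above as 64-bit where available, whereas plain long is used here), and this is not the library's fsl_strndup() implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a signed length type. */
typedef long demo_int_t;

/* Returns a NUL-terminated copy of the first n bytes of src, or NULL
   if src is NULL or allocation fails. A negative n is a hint that
   strlen() should be used to determine the length. If n is
   non-negative, the caller must guarantee that src has at least n
   readable bytes. The caller owns the result and must free() it. */
char * demo_strndup(char const * src, demo_int_t n){
  char * rv;
  size_t len;
  if(!src) return NULL;
  len = (n < 0) ? strlen(src) : (size_t)n;
  rv = (char *)malloc(len + 1);
  if(!rv) return NULL; /* propagate the allocation error to the caller */
  memcpy(rv, src, len);
  rv[len] = 0;
  return rv;
}
```

Library-level callers would check the result for NULL and pass an out-of-memory result code up the call chain rather than aborting.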
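
The ellipsis/va_list pairing above (same name plus a trailing 'v') can be sketched like this. The demo_formatf() and demo_formatfv() names and signatures are hypothetical, and vsnprintf() is a C99 function, used here only to keep the example short and memory-safe.

```c
#include <stdarg.h>
#include <stdio.h>

/* va_list form: named the same as the (...) form plus a trailing 'v'.
   Hypothetical function, for illustration only. Formats into dest
   (at most n bytes, including the NUL) and returns vsnprintf()'s
   result. */
int demo_formatfv(char * dest, size_t n, char const * fmt, va_list args){
  return vsnprintf(dest, n, fmt, args);
}

/* Ellipsis form: a thin wrapper around the va_list form, so that the
   formatting logic lives in exactly one place. */
int demo_formatf(char * dest, size_t n, char const * fmt, ...){
  int rc;
  va_list args;
  va_start(args, fmt);
  rc = demo_formatfv(dest, n, fmt, args);
  va_end(args);
  return rc;
}
```

With this naming, demo_formatf and demo_formatfv sort next to each other in generated API docs, which the printf()/vprintf() style would not achieve.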