libfossil
Fossil Terminology

See also: http://fossil-scm.org/index.html/doc/trunk/www/concepts.wiki

The libfossil API docs normally assume one is familiar with Fossil-internal terminology, which is of course a silly assumption to make. Indeed, one of libfossil's goals is to make Fossil more accessible, partly by demystifying it. To that end, here is a collection of terms one may come across in the API, along with their meanings in the context of Fossil...

  • REPOSITORY (a.k.a. "repo") is an SQLite database file which contains all content for a given "source tree." (We will use the term "source tree" to mean any tree of "source" (documents, whatever) a client has put under Fossil's supervision.)
  • CHECKOUT (a.k.a. "local source tree" or "working copy") refers to (A) the action of pulling a specific version of a repository's state from that repo into the local filesystem, and (B) a local copy "checked out" of a repo. e.g. "he checked out the repo," and "the changes are in his [local] checkout."
  • ARTIFACT is the generic term for anything stored in a repo. More specifically, ARTIFACT refers to "control structures" Fossil uses to internally track changes. These artifacts are stored as blobs in the database, just like any other content. For complete details and examples, see: http://fossil-scm.org/index.html/doc/tip/www/fileformat.wiki
  • A MANIFEST is a specific type of ARTIFACT - the type which records all metadata for a COMMIT operation (which files, which user, the timestamp, checkin comment, lineage, etc.). For historical reasons, MANIFEST is sometimes used as a generic term for ARTIFACT because what the fossil(1)-internal APIs originally called a Manifest eventually grew into other types of artifacts but kept the Manifest naming convention. In Fossil developer discussion, "manifest" most often means what this page calls ARTIFACT (probably because that is how the C code is modelled). The libfossil API uses the term "deck" instead of "manifest" to avoid ambiguity/confusion (or to move the confusion somewhere else, at least).
  • CHECKIN is the term libfossil prefers to use for COMMIT MANIFESTS. It is also the action of "checking in" (a.k.a. "committing") file changes to a repository. A CHECKIN ARTIFACT can be one of two types. A BASELINE MANIFEST (or BASELINE CHECKIN) contains a list of all files in that version of the repository, including their file permissions and the UUIDs of their content. A DELTA MANIFEST is a checkin record which derives from a BASELINE MANIFEST and lists only the file-level changes which happened between the baseline and the delta: changes in content, permissions, or name, plus deletions. Note that this inheritance of deltas from baselines is an internal optimization which has nothing to do with checkin version inheritance - the baseline of any given delta is normally _not_ its direct checkin version parent.
  • BRANCH, FORK, and TAG are all closely related in Fossil and are explained in detail (with pictures!) at: http://fossil-scm.org/index.html/doc/trunk/www/concepts.wiki In short: BRANCHes and FORKs are two names for the same thing, and both are just a special-case usage of TAGs.
  • MERGE or MERGING: the process of integrating one version of source code into another version of that source code, using a common parent version as the basis for comparison. This is normally fully automated, but occasionally human (and sometimes Divine) intervention is required to resolve so-called "merge conflicts," where two versions of a file change the same parts of a common parent version.
  • RID (Record ID) is a reference to the blob.rid field in a repository DB. RIDs are used extensively throughout the API for referencing content records, but they are transient values local to a given copy of a given repository at a given point in time. They _can_ change, even for the same content (e.g. a rebuild can hypothetically change them, though it might not, and re-cloning a repo may very well change some RIDs). Clients must never rely on them for long-term reference to SCM'd data - always use the full UUID of such data. Even though they normally appear to be static, they are most explicitly NOT guaranteed to be. Nor are their values guaranteed to imply any meaning, e.g. "higher is newer" is not necessarily true because synchronization can import new remote content in an arbitrary order and a rebuild might import it in random order. The API uses RIDs basically as handles to arbitrary blob content and, like most C-side handles, they must be considered transient in nature. That said, within the db, records are linked to each other exclusively using RIDs, so they do have some persistence guarantees for a given db instance.
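
To make the RID/UUID distinction concrete, here is a minimal Python sketch. It uses neither libfossil nor a real repository: it builds a toy in-memory SQLite table named blob with only rid and uuid columns (a drastic simplification, assumed purely for illustration), derives each toy UUID from a hash of the content, and uses a hypothetical helper named rid_for_uuid. It shows two copies of the "same" repo assigning different RIDs to identical content, which is why only the full UUID is safe for long-term reference.

```python
import hashlib
import sqlite3


def make_toy_repo(contents):
    """Build a toy in-memory 'repo' holding the given byte strings.

    ASSUMPTION: a drastically simplified stand-in for a repository's
    blob table; only the rid and uuid columns are modelled here.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE NOT NULL)"
    )
    for data in contents:
        # A content hash gives this toy a stable, content-derived name.
        uuid = hashlib.sha1(data).hexdigest()
        db.execute("INSERT INTO blob(uuid) VALUES (?)", (uuid,))
    return db


def rid_for_uuid(db, uuid):
    """Hypothetical helper: resolve a full UUID to this db's current RID."""
    row = db.execute("SELECT rid FROM blob WHERE uuid = ?", (uuid,)).fetchone()
    return row[0] if row else None


# Two copies of the "same" repo which received their content in a
# different order, as might happen after a sync or a re-clone.
copy_a = make_toy_repo([b"first file", b"second file"])
copy_b = make_toy_repo([b"second file", b"first file"])

uuid = hashlib.sha1(b"first file").hexdigest()
rid_a = rid_for_uuid(copy_a, uuid)  # RID of this content in copy A
rid_b = rid_for_uuid(copy_b, uuid)  # RID of the same content in copy B
print(uuid, rid_a, rid_b)
```

The UUID is identical in both copies while the RIDs differ, so a client which stored rid_a and later used it against copy_b would silently fetch the wrong record. Re-resolving the UUID against whichever db is currently open avoids that.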

More to come...
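
As a rough illustration of the BASELINE/DELTA relationship described under CHECKIN above, the following Python sketch reconstructs the complete file list of a delta checkin by applying its recorded changes on top of its baseline. The data layout (plain dicts mapping a file name to a (uuid, permissions) pair, with None marking a deletion), the placeholder UUID strings, and the function name effective_files are all inventions for this example; they are not the Fossil manifest format or the libfossil API. Renames are left out for brevity.

```python
def effective_files(baseline, delta):
    """Return the complete file list for a delta checkin.

    baseline: dict of name -> (uuid, perm) listing every file.
    delta:    dict of name -> (uuid, perm) for added/changed files,
              or name -> None for files deleted relative to the baseline.
    Both layouts are hypothetical, chosen only for this illustration.
    """
    files = dict(baseline)  # start from the baseline's full list
    for name, entry in delta.items():
        if entry is None:
            files.pop(name, None)  # deletion recorded by the delta
        else:
            files[name] = entry  # new file, or changed content/permissions
    return files


# Placeholder UUIDs; real ones are full content hashes.
baseline = {
    "README.txt": ("uuid-readme-v1", ""),
    "build.sh": ("uuid-build-v1", "x"),
    "old.c": ("uuid-old-v1", ""),
}
delta = {
    "README.txt": ("uuid-readme-v2", ""),  # content changed
    "new.c": ("uuid-new-v1", ""),          # file added
    "old.c": None,                         # file deleted
}

result = effective_files(baseline, delta)
for name in sorted(result):
    print(name, result[name])
```

Files the delta does not mention (build.sh here) are taken unchanged from the baseline, which is what makes a delta smaller than a full listing.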