libfossil
fossil-pages.h
1 /* -*- Mode: C; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
2 /* vim: set ts=2 et sw=2 tw=80: */
3 #if !defined(NET_FOSSIL_SCM_PAGES_H_INCLUDED)
4 #define NET_FOSSIL_SCM_PAGES_H_INCLUDED
5 /*
6  Copyright (c) 2013 D. Richard Hipp
7 
8  This program is free software; you can redistribute it and/or
9  modify it under the terms of the Simplified BSD License (also
10  known as the "2-Clause License" or "FreeBSD License".)
11 
12  This program is distributed in the hope that it will be useful,
13  but without any warranty; without even the implied warranty of
14  merchantability or fitness for a particular purpose.
15 
16  Author contact information:
17  drh@hwaci.com
18  http://www.hwaci.com/drh/
19 
20  *****************************************************************************
21  This file contains only Doxygen-format documentation, split up into
22  Doxygen "pages", each covering some topic at a high level. This is
23  not the place for general code examples - those belong with their
24  APIs.
25 */
26 
27 /** @mainpage libfossil
28 
29  Forewarning: this API assumes one is familiar with the Fossil SCM,
30  ideally in detail. The Fossil SCM can be found at:
31 
32  http://fossil-scm.org
33 
34  libfossil is an experimental/prototype library API for the Fossil
35  SCM. This API concerns itself only with the components of fossil
36  which do not need user interaction or the display of UI components
37  (including HTML and CLI output). It is intended only to model the
38  core internals of fossil, off of which user-level applications
39  could be built.
40 
41  The project's repository and additional information can be found at:
42 
43  http://fossil.wanderinghorse.net/repos/libfossil/
44 
45  This code is 100% hypothetical/potential, and does not represent
46  any Official effort of the Fossil project. It is up for any amount
47  of change at any time and does not yet have a stable API.
48 
49  All Fossil users are encouraged to participate in its development,
50  but if you are reading this then you probably already knew that
51  :).
52 
53  This effort does not represent "Fossil Version 2", but provides an
54  alternate method of accessing and manipulating fossil(1)
55  repositories. Whereas fossil(1) is a monolithic binary, this API
56  provides library-level access to (some level of) the fossil(1)
57  feature set (that level of support grows approximately linearly
58  with each new commit).
59 
60  Current status: alpha. Some bits are basically finished but there
61  is a lot of work left to do. The scope is pretty much all
62  Fossil-related functionality which does not require a user
63  interface or direct user interaction, plus some range of utilities
64  to support those which require a UI/user.
65 */
66 
67 /** @page page_terminology Fossil Terminology
68 
69  See also: http://fossil-scm.org/index.html/doc/trunk/www/concepts.wiki
70 
71  The libfossil API docs normally assume one is familiar with
72  Fossil-internal terminology, which is of course a silly assumption
73  to make. Indeed, one of libfossil's goals is to make Fossil more
74  accessible, partly by demystifying it. To that end, here is a
75  collection of terms one may come across in the API, along with
76  their meanings in the context of Fossil...
77 
78 
79  - REPOSITORY (a.k.a. "repo") is an sqlite database file which
80  contains all content for a given "source tree." (We will use the
81  term "source tree" to mean any tree of "source" (documents,
82  whatever) a client has put under Fossil's supervision.)
83 
84  - CHECKOUT (a.k.a. "local source tree" or "working copy") refers
85  to (A) the action of pulling a specific version of a repository's
86  state from that repo into the local filesystem, and (B) a local
87  copy "checked out" of a repo. e.g. "he checked out the repo," and
88  "the changes are in his [local] checkout."
89 
90  - ARTIFACT is the generic term for anything stored in a repo. More
91  specifically, ARTIFACT refers to "control structures" Fossil uses
92  to internally track changes. These artifacts are stored as blobs
93  in the database, just like any other content. For complete details
94  and examples, see:
95  http://fossil-scm.org/index.html/doc/tip/www/fileformat.wiki
96 
97  - A MANIFEST is a specific type of ARTIFACT - the type which
98  records all metadata for a COMMIT operation (which files, which
99  user, the timestamp, checkin comment, lineage, etc.). For
100  historical reasons, MANIFEST is sometimes used as a generic term
101  for ARTIFACT because what the fossil(1)-internal APIs originally
102  called a Manifest eventually grew into other types of artifacts
103  but kept the Manifest naming convention. In Fossil developer
104  discussion, "manifest" most often means what this page calls
105  ARTIFACT (probably because that is how the C code is modelled). The
106  libfossil API uses the term "deck" instead of "manifest" to
107  avoid ambiguity/confusion (or to move the confusion somewhere
108  else, at least).
109 
110  - CHECKIN is the term libfossil prefers to use for COMMIT
111  MANIFESTS. It is also the action of "checking in"
112  (a.k.a. "committing") file changes to a repository. A CHECKIN
113  ARTIFACT can be one of two types: a BASELINE MANIFEST (or BASELINE
114  CHECKIN) contains a list of all files in that version of the
115  repository, including their file permissions and the UUIDs of
116  their content. A DELTA MANIFEST is a checkin record which derives
117  from a BASELINE MANIFEST and it lists only the file-level changes
118  which happened between the baseline and the delta, recording any
119  changes in content, permissions, or name, and recording
120  deletions. Note that this inheritance of deltas from baselines is
121  an internal optimization which has nothing to do with checkin
122  version inheritance - the baseline of any given delta is normally
123  _not_ its direct checkin version parent.
124 
125  - BRANCH, FORK, and TAG are all closely related in Fossil and are
126  explained in detail (with pictures!) at:
127  http://fossil-scm.org/index.html/doc/trunk/www/concepts.wiki
128  In short: BRANCHes and FORKs are two names for the same thing, and
129  both are just a special-case usage of TAGs.
130 
131  - MERGE or MERGING: the process of integrating one version of
132  source code into another version of that source code, using a
133  common parent version as the basis for comparison. This is
134  normally fully automated, but occasionally human (and sometimes
135  Divine) intervention is required to resolve so-called "merge
136  conflicts," where two versions of a file change the same parts of
137  a common parent version.
138 
139  - RID (Record ID) is a reference to the blob.rid field in a
140  repository DB. RIDs are used extensively throughout the API for
141  referencing content records, but they are transient values local
142  to a given copy of a given repository at a given point in
143  time. They _can_ change, even for the same content, (e.g. a
144  rebuild can hypothetically change them, though it might not, and
145  re-cloning a repo may very well change some RIDs). Clients must
146  never rely on them for long-term reference to SCM'd data - always use
147  the full UUID of such data. Even though they normally appear to be
148  static, they are most explicitly NOT guaranteed to be. Nor are
149  their values guaranteed to imply any meaning, e.g. "higher is
150  newer" is not necessarily true because synchronization can import
151  new remote content in an arbitrary order and a rebuild might
152  import it in random order. The API uses RIDs basically as handles
153  to arbitrary blob content and, like most C-side handles, must be
154  considered transient in nature. That said, within the db, records
155  are linked to each other exclusively using RIDs, so they do have
156  some persistence guarantees for a given db instance.
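
 To illustrate why UUIDs, not RIDs, are the long-term reference, here
 is a minimal sketch. Everything in it (demo_blob, demo_uuid_to_rid())
 is a hypothetical stand-in made up for this page and is not part of
 the library; the real lookup routine is fsl_uuid_to_rid(), whose
 signature is not shown here.

```c
#include <assert.h>
#include <string.h>

// Hypothetical, tiny stand-in for the blob table: each record pairs a
// db-local RID with the permanent UUID (hash) of its content.
typedef struct demo_blob {
  int rid;
  char const * uuid;
} demo_blob;

// Returns the RID currently assigned to the given UUID in this copy of
// the repository, or 0 if there is no such record.
int demo_uuid_to_rid(demo_blob const * blobs, size_t n, char const * uuid){
  size_t i;
  assert(blobs && uuid);
  for(i = 0; i < n; ++i){
    if(0 == strcmp(blobs[i].uuid, uuid)) return blobs[i].rid;
  }
  return 0;
}
```

 If two copies of a repo hold the same two records but numbered in a
 different order, the same UUID resolves to a different RID in each
 copy. A client which saved the RID would then be pointing at the wrong
 content, while one which saved the UUID simply looks the RID up again.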
157 
158  More to come...
159 
160 */
161 
162 
163 /** @page page_APIs High-level API Overview
164 
165  The primary end goals of this project are to eventually cover the
166  following feature areas:
167 
168  - Provide embeddable SCM to local apps using sqlite storage.
169  - Provide a network layer on top of that for synchronization.
170  - Provide apps on top of those to allow administration of repos.
171 
172  To those ends, the fossil APIs cover the following categories of
173  features:
174 
175  Filesystem:
176 
177  - Conversions of strings from OS-native encodings to UTF.
178  fsl_utf8_to_unicode(), fsl_filename_to_utf8(), etc. These are
179  primarily used internally but may also be useful for applications
180  working with files (as most clients will). Actually... most of
181  these bits are only needed for portability across Windows
182  platforms.
183 
184  - Locating a user's home directory: fsl_find_home_dir()
185 
186  - Normalizing filenames/paths. fsl_file_canonical_name() and friends.
187 
188  - Checking for existence, size, and type (file vs directory) with
189  fsl_is_file() and fsl_dir_check(), or the more general-purpose
190  fsl_stat().
191 
192 
193  Databases (sqlite):
194 
195  - Opening/closing sqlite databases and running queries on them,
196  independent of version control features. See fsl_db_open() and
197  friends. The actual sqlite-level DB handle type is abstracted out
198  of the public API, largely to simplify an eventual port from
199  sqlite3 to sqlite4 or (hypothetically) to other storage back-ends
200  (not gonna happen - too much work).
201 
202  - There are lots of utility functions for oft-used operations,
203  e.g. fsl_config_get_int32() and friends to fetch settings from one
204  of the three different configuration areas (global, repository,
205  and checkout).
206 
207  - Pseudo-recursive transactions: fsl_db_transaction_begin() and
208  fsl_db_transaction_end().
209 
210  - Cached statements (an optimization for oft-used queries):
211  fsl_db_prepare_cached() and friends.
212 
213 
214  The DB API is (as Brad put so well) "very present" in the public
215  API. While the core API provides access to the underlying
216  repository data, it cannot begin to cover even a small portion of
217  potential use cases. To that end, it exposes the DB API so that
218  clients who want to construct their own data can do so. It does
219  require research into the underlying schemas, but gives
220  applications the ability to do _anything_ with their repositories
221  which the core API does not account for. Historically, the ability
222  to create ad-hoc data structures as needed, in the form of SQL
223  queries, has accounted for much of Fossil's feature flexibility.
224 
225 
226  Deltas:
227 
228  - Creation and application of raw deltas, using Fossil's delta
229  format, independent of version control features. See
230  fsl_delta_create() and friends. These are normally used only at
231  the deepest internal levels of fossil, but the APIs are exposed so
232  that clients can, if they wish, use them to deltify their own
233  content independently of fossil's internally-applied
234  deltification. Doing so is remarkably easy, but completely
235  unnecessary for content which will be stored in a repo, as Fossil
236  creates deltas as needed.
237 
238 
239  SCM:
240 
241  - A "context" type (fsl_cx) which manages a repository db and,
242  optionally, a checkout db. Read-only operations on the DB are
243  working and write functionality (adding repo content) is
244  ongoing. See fsl_cx, fsl_cx_init(), and friends.
245 
246  - The fsl_deck class assists in parsing, creating, and outputting
247  "artifacts" (manifests, control (tags), events, etc.). It gets its
248  name from it being a container for "a collection of cards" (which is
249  what a Fossil artifact is).
250 
251  - fsl_content_get() expands a (possibly) deltified blob into its
252  full form, and fsl_content_blob() can be used to fetch a raw blob
253  (possibly a raw delta).
254 
255  - A number of routines exist for converting symbol names to RIDs
256  (fsl_sym_to_rid()), UUIDs to RIDs (fsl_uuid_to_rid()),
257  and similar commonly-needed lookups.
258 
259 
260  Input/Output:
261 
262  - The API defines several abstractions for i/o interfaces, e.g.
263  fsl_input_f() and fsl_output_f(), which allow us to accept/emit
264  data from/to arbitrary sources/destinations. A fsl_cx instance is
265  configured with an output channel, the intention being that all
266  clients of that context should generate any output through that
267  channel, so that all compatible apps can cooperate more easily in
268  terms of i/o. For example, the th1ish script binding for libfossil
269  routes fsl_output() through the script's i/o channels, so that any
270  output generated by libfossil-using code it links to can take
271  advantage of the script-side output features (such as output
272  buffering, which is needed for any non-trivial CGI output).
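
 The callback-plus-state idea described above can be sketched as
 follows. All names here (demo_output_f, demo_output_f_FILE(),
 demo_output_f_buffer(), demo_buf, demo_emit_greeting()) are
 hypothetical and exist only for this illustration; the library's real
 declarations live in its headers and may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical output-callback interface: "state" is implementation-defined,
// "src" and "n" describe the bytes to emit. Returns 0 on success.
typedef int (*demo_output_f)(void * state, void const * src, size_t n);

// Implementation which writes to a FILE handle passed as the state.
int demo_output_f_FILE(void * state, void const * src, size_t n){
  FILE * f = (FILE *)state;
  return (n == fwrite(src, 1, n, f)) ? 0 : 1;
}

// Fixed-size buffer used as the state for demo_output_f_buffer().
typedef struct demo_buf {
  char mem[64];
  size_t used;
} demo_buf;

// Implementation which appends to a demo_buf, failing if it would overflow.
int demo_output_f_buffer(void * state, void const * src, size_t n){
  demo_buf * b = (demo_buf *)state;
  if(n >= sizeof(b->mem) - b->used) return 1; // keep room for the NUL
  memcpy(b->mem + b->used, src, n);
  b->used += n;
  b->mem[b->used] = 0;
  return 0;
}

// Code which generates output only sees the callback and its state, so
// the same routine can feed a file, a buffer, or a script engine.
int demo_emit_greeting(demo_output_f out, void * outState){
  char const * msg = "hello, fossil\n";
  assert(out);
  return out(outState, msg, strlen(msg));
}
```

 Calling demo_emit_greeting(demo_output_f_FILE, stdout) sends the text
 to the console, while passing demo_output_f_buffer and a demo_buf
 collects it in memory instead, without the emitting code changing.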
273 
274 
275  Utilities:
276 
277  - fsl_buffer, a generic buffer class, is used heavily by the
278  library. See fsl_buffer and friends.
279 
280  - fsl_appendf() provides printf()-like functionality, but sends
281  its output to a callback function (optionally stateful), making it
282  the one-stop-shop for string formatting within the library.
283 
284  - The fsl_error class is used to propagate error information
285  between the library's various levels and the client.
286 
287  - The fsl_list class acts as a generic container-of-pointers, and
288  the API provides several convenience routines for managing them,
289  traversing them, and cleaning them up.
290 
291  - Hashing: there are a number of routines for calculating SHA1 and
292  MD5 hashes. See fsl_sha1_cx, fsl_md5_cx, and friends. We haven't yet
293  had need of an actual hash table class.
294 
295  - zlib compression is used for storing artifacts. See
296  fsl_data_is_compressed(), fsl_buffer_compress(), and friends.
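
 As a rough illustration of the "printf() which sends its output to a
 callback" approach mentioned for fsl_appendf() in the list above, here
 is a minimal sketch. It is not the library's implementation: the names
 demo_appendf(), demo_appendf_f, demo_appendf_f_count(), and
 demo_counter are hypothetical, and it leans on C99's vsnprintf() with a
 fixed-size buffer purely to keep the example short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

// Hypothetical callback type: receives the formatted bytes plus client state.
typedef int (*demo_appendf_f)(void * state, char const * data, size_t n);

// Formats like printf() but hands the result to a callback instead of
// writing it anywhere itself. Returns the callback's result, or -1 if
// formatting fails or the output would not fit in the local buffer.
int demo_appendf(demo_appendf_f out, void * state, char const * fmt, ...){
  char buf[256];
  int n;
  va_list args;
  assert(out && fmt);
  va_start(args, fmt);
  n = vsnprintf(buf, sizeof(buf), fmt, args);
  va_end(args);
  if(n < 0 || (size_t)n >= sizeof(buf)) return -1;
  return out(state, buf, (size_t)n);
}

// Example stateful callback: counts the bytes it is given and keeps a
// copy of the most recent chunk.
typedef struct demo_counter {
  size_t total;
  char last[256];
} demo_counter;

int demo_appendf_f_count(void * state, char const * data, size_t n){
  demo_counter * c = (demo_counter *)state;
  if(n >= sizeof(c->last)) return 1;
  memcpy(c->last, data, n);
  c->last[n] = 0;
  c->total += n;
  return 0;
}
```

 The formatting routine never needs to know where the text ends up; the
 state pointer carries whatever the callback needs (here, a counter).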
297 */
298 
299 /** @page page_porting_checklist Porting Checklist
300 
301  An overview of what library-level features are implemented and
302  what's left to do...
303 
304  - Db abstraction layer: complete and more or less stable.
305 
306  - Infrastructure for opening/closing checkouts/repos
307  works. Infrastructure for a config db is in place.
308 
309  - Fetching blob content (raw or delta-applied) and low-level
310  content saving is working.
311 
312  - Artifact (e.g. manifest) parsing, generating, and delta manifest
313  baseline traversal works. Most artifacts can be exported from a
314  canonical Fossil repo then parsed and exported by this API with
315  100% fidelity, with the minor exception that _some_ timestamps
316  (D-cards) differ by a millisecond (round-trip precision change),
317  which changes their hash. So far i have only seen the imprecision
318  affect "artificially generated" artifacts, not "real" ones. Artifacts
319  are never "round-tripped" like that in real use, anyway - it's only
320  for testing the parser and generator.
321 
322  - Adding new control artifacts (tag changes) is basically working.
323 
324  - Low-level delta generation and application is working, as well
325  as the (incidentally unrelated) diff-generation code (context- and
326  side-by-side).
327 
328  - Manifest crosslinking. This is a large part of what goes on
329  during any changes to a repository. Most of the work is finished
330  here but there are still some cases to handle (namely tickets) and
331  obscene amounts of testing to be done. And a testing
332  infrastructure needs to be architected and put into place.
333 
334  - Schema initialization/creation is complete. The rebuild process
335  (closely related but far more intricate) is far down the list of
336  TODOs.
337 
338  - Wiki features are basically working: loading/saving, but
339  it needs APIs for working with wiki history.
340 
341 
342  Actively in progress (today==March 14, 2014):
343 
344  - Event bits
345 
346  - Application-level bits (::fcli).
347 
348  - "vfile" (checkout-related) infrastructure is mostly ported
349  in. This includes checkin support.
350 
351  - Tickets APIs have been started but have a low priority. The v1
352  impl requires a good deal of application-level infrastructure
353  (namely TH1), and there are no plans to port TH1 in at the library
354  level.
355 
356  - All of the bits needed for performing a checkout are in place
357  with the exception of UNDO support and the actual creation of
358  the checkout db (but we have all the pieces needed for that).
359 
360 
361  Areas which have not yet been started or where no notable
362  progress has yet been made, in no particular order:
363 
364  - Handling of symlinks in a repo.
365 
366  - The 'rebuild' operation, i think, will essentially be the
367  ultimate test of the core library components. If it can do that,
368  it can "probably" do anything else.
369 
370  - UI. The library has no UI, of course, but as it is fleshed out
371  one may eventually be needed, even if it's only a CLI shell.
372 
373  - Synchronization. There are lots of underlying bits to finish
374  before this can be implemented.
375 
376  - Networking. Far down the list of TODOs. The core library needs to
377  know nothing about networking.
378 
379  - "Received from" (rcvid field) info on artifacts. In v1 this is
380  tied closely to the network layer.
381 
382  - Versionable config settings.
383 
384  - Application/honoring of certain config
385  settings. e.g. ignore-glob and friends are currently not honored,
386  and case-insensitivity support is completely untested.
387 
388 */
389 
390 /** @page page_is_isnot Fossil is/is not...
391 
392  Through porting the main fossil application into library form,
393  the following things have become very clear (or been reinforced)...
394 
395  Fossil is...
396 
397  - _Exceedingly_ robust. Not only is sqlite literally the single
398  most robust application-agnostic container file format on the
399  planet, but Fossil goes way out of its way to ensure that what
400  gets put in is what gets pulled out. It cuts zero corners on data
401  integrity, even adding in checks which seem superfluous but
402  provide another layer of data integrity (i'm primarily talking
403  about the R-card here, but there are other validation checks). It
404  does this at the cost of memory and performance (that said, it's
405  still easily fast enough for its intended uses). "Robust" doesn't
406  mean that it never crashes nor fails, but that it does so with
407  (insofar as is technically possible) essentially zero chance of
408  data loss/corruption.
409 
410  - Long-lived: the underlying data format is independent of its
411  storage format. It is, in principle, usable by systems as yet
412  unconceived by the next generation of programmers. This
413  implementation is based on sqlite, but the model can work with
414  arbitrary underlying storage.
415 
416  - Amazingly space-efficient. The size of a repository database
417  necessarily grows as content is modified. However, Fossil's use of
418  zlib-compressed deltas, using a very space-efficient delta format,
419  leads to tremendous compression ratios. As of this writing
420  (September, 2013), the main Fossil repo contains approximately
421  1.3GB of content, were we to check out every single version in its
422  history. Its repository database is only 42MB, however, equating
423  to a 32:1 compression ratio. Ratios in the range of 20:1 to 40:1
424  are common, and more active repositories tend to have higher
425  ratios. The TCL core repository, with just over 15 years of code
426  history (imported, of course, as Fossil was introduced in 2007),
427  is only 187MB, with 6.2GB of content and a 33:1 compression ratio.
428 
429 
430 
431  Fossil is not...
432 
433  - Memory-light. Even very small uses can easily suck up 1MB of RAM
434  and many operations (verification of the R card, for example) can
435  quickly allocate and free up hundreds of MB because they have to
436  compose various versions of content on their way to a specific
437  version. To be clear, that is total RAM usage, not _peak_ RAM
438  usage. Peak usage is normally a function of the content it works
439  with at a given time. For any given delta application operation,
440  Fossil needs the original content, the new content, and the delta
441  all in memory at once, and may go through several such iterations
442  while resolving deltified content. Verification of its 'R-card'
443  alone can require a thousand or more underlying DB operations and
444  hundreds of delta applications. The internals use caching where it
445  would save us a significant amount of db work relative to the
446  operation in question, but relatively high memory costs are
447  unavoidable. That's not to say we can't optimize a bit, but first
448  make it work, then optimize it. The library takes care to re-use
449  memory buffers where it is feasible (and not too intrusive) to do
450  so, but there is yet more RAM to be optimized away in this regard.
451 */
452 
453 /** @page page_threading Threads and Fossil
454 
455  It is strictly illegal to use a given fsl_cx instance from more
456  than one thread. Period.
457 
458  It is legal for multiple contexts to be running in multiple
459  threads, but only if those contexts use different
460  repository/checkout databases. Though access to the storage is,
461  through sqlite, protected via a mutex/lock, this library does not
462  have a higher-level mutex to protect multiple contexts from
463  colliding during operations. So... don't do that. One context, one
464  repo/checkout.
465 
466  Multiple application instances may each use one fsl_cx instance to
467  share repo/checkout db files, but must be prepared to handle
468  locking-related errors in such cases. e.g. db operations which
469  normally "always work" may suddenly pause for a few seconds before
470  giving up while waiting on a lock when multiple applications use
471  the same database files. sqlite's locking behaviours are
472  documented in great detail at http://sqlite.org.
473  */
474 
475 /** @page page_artifacts Creating Artifacts
476 
477  A brief overview of artifact creating using this API. This is targeted
478  at those who are familiar with how artifacts are modelled and generated
479  in fossil(1).
480 
481  Primary artifact reference:
482 
483  http://fossil-scm.org/index.html/doc/trunk/www/fileformat.wiki
484 
485  In fossil(1), artifacts are generated via the careful crafting of
486  a memory buffer (large string) in the format described in the
487  document above. While it's relatively straightforward to do, there
488  are lots of potential gotchas, and a bug can potentially inject
489  "bad data" into the repo (though the verify-before-commit process
490  will likely catch any problems before the commit is allowed to go
491  through). The libfossil API uses a higher-level (OO) approach,
492  where the user describes a "deck" of cards and then tells the
493  library to save it in the repo (fsl_deck_save()) or output it to
494  some other channel (fsl_deck_output()). The API ensures that the
495  deck's cards get output in the proper order and that any cards
496  which require special treatment get that treatment (e.g. the
497  "fossilize" encoding of certain text fields). The "deck" concept
498  is equivalent to Artifact in fossil(1), but we use the word deck
499  because (A) Artifact is highly ambiguous in this context and (B)
500  deck is arguably the most obvious choice for the name of a type
501  which acts as a "container of cards."
502 
503  Ideally, client-level code will never have to create an artifact
504  via the fsl_deck API (because doing so requires a fairly good
505  understanding of what the deck is for in the first place,
506  including the individual Cards). The public API strives to hide
507  those levels of details, where feasible, or at least provide
508  simpler/safer alternatives for basic operations. Some operations
509  may require some level of direct work with a fsl_deck
510  instance. Likewise, much read-only functionality directly exposes
511  fsl_deck to clients, so some familiarity with the type and its
512  APIs will be necessary for most clients.
513 
514  The process of creating an artifact looks a lot like the following
515  code example. We have elided error checking for readability
516  purposes, but in fact this code has undefined behaviour if error
517  codes are not checked and appropriately reacted to.
518 
519  @code
520  fsl_deck deck = fsl_deck_empty;
521  fsl_deck * d = &deck; // for typing convenience
522  fsl_deck_init( fslCtx, d, FSL_CATYPE_CONTROL ); // must come first
523  fsl_deck_D_set( d, fsl_julian_now() );
524  fsl_deck_U_set( d, "your-fossil-name", -1 );
525  fsl_deck_T_add( d, FSL_TAGTYPE_ADD, "...uuid being tagged...",
526  "tag-name", "optional tag value");
527  ...
528  // unshuffle is necessary when using multi-cards which may
529  // need sorting (tags, filenames, etc.):
530  fsl_deck_unshuffle(d, 0);
531  // Unshuffling is done by the client because the deck is const
532  // when we output it:
533  fsl_deck_output( f, d, fsl_output_f_FILE, stdout );
534  // note that fsl_deck_save() does the unshuffle itself.
535  fsl_deck_finalize(d);
536  @endcode
537 
538  The order the cards are added to the deck is irrelevant - they
539  will be output in the order specified by the Fossil specs
540  regardless of their insertion order. Each setter/adder function
541  knows, based on the deck's type (set via fsl_deck_init()), whether
542  the given card type is legal, and will return an error (probably
543  FSL_RC_TYPE) if an attempt is made to add a card which is illegal
544  for that deck type. Likewise, fsl_deck_output() and
545  fsl_deck_save() confirm that the decks they are given contain (A)
546  only allowed cards and (B) have all required
547  cards. fsl_deck_save() also sorts any "multi-cards" which need it
548  (e.g. T- and F-cards).
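
 The "insertion order is irrelevant" behaviour can be sketched in a few
 lines. This is only an illustration under the assumption that cards
 are ordered by plain lexicographical comparison of their lines, as the
 file format document describes. The demo_deck type and its functions
 are hypothetical and do none of the validation or encoding which
 fsl_deck performs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for a deck: just an array of card lines.
typedef struct demo_deck {
  char const * cards[8];
  size_t count;
} demo_deck;

// Adds a card in whatever order the caller likes. Returns 0 on success,
// non-0 if the deck is full.
int demo_deck_add(demo_deck * d, char const * card){
  assert(d && card);
  if(d->count >= sizeof(d->cards) / sizeof(d->cards[0])) return 1;
  d->cards[d->count++] = card;
  return 0;
}

// qsort() comparator for an array of (char const *).
static int demo_card_cmp(void const * lhs, void const * rhs){
  char const * const * l = (char const * const *)lhs;
  char const * const * r = (char const * const *)rhs;
  return strcmp(*l, *r);
}

// Puts the cards into the order in which they must be output.
void demo_deck_unshuffle(demo_deck * d){
  qsort(d->cards, d->count, sizeof(d->cards[0]), demo_card_cmp);
}
```

 Adding a U-card, then a D-card, then a T-card and calling
 demo_deck_unshuffle() leaves them in D, T, U order, which is the order
 an output routine would then emit them in.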
549 
550 */
551 
552 /** @page page_transactions DB Transactions
553 
554  The fsl_db_transaction_begin() and fsl_db_transaction_end()
555  functions implement a basic form of recursive transaction,
556  allowing the library to start and end transactions at any level
557  without having to know whether a transaction is already in
558  progress (sqlite3 does not natively support nested
559  transactions). A rollback triggered in a lower-level transaction
560  will propagate the error back through the transaction stack and
561  roll back the whole transaction, providing us with excellent error
562  recovery capabilities (meaning we can always leave the db in a
563  well-defined state).
564 
565  It is STRICTLY ILLEGAL to EVER begin a transaction using "BEGIN"
566  or end a transaction by executing "COMMIT" or "ROLLBACK" directly
567  on a db handle which is associated with a fsl_cx instance. Doing so
568  bypasses internal state which needs to be kept abreast of things
569  and will cause Grief and Suffering (on the client's part, not
570  mine).
571 
572  Tip: implementing a "dry-run" mode for most fossil operations is
573  trivial by starting a transaction before performing the
574  operations. Many operations run in a transaction, but if the
575  client starts one of his own he can "dry-run" any op by simply
576  rolling back the transaction he started. Abstractly, that
577  looks like this pseudocode:
578 
579  @code
580  db.begin();
581  fsl.something();
582  fsl.somethingElse();
583  if( dryRun ) db.rollback();
584  else db.commit();
585  @endcode
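
 For the curious, the general "counted nesting" technique behind this
 kind of pseudo-recursive transaction can be sketched as below. This is
 a self-contained mock and not the library's implementation: demo_db,
 demo_transaction_begin(), and demo_transaction_end() are hypothetical
 names, and the counters merely mark where a real db layer would run
 its COMMIT or ROLLBACK.

```c
#include <assert.h>

// Hypothetical stand-in for a db handle. A real one would execute
// BEGIN/COMMIT/ROLLBACK at the points where the counters change below.
typedef struct demo_db {
  int depth;       // current nesting level; 0 means no transaction
  int doRollback;  // set if any level asked for a rollback
  int commits;     // number of real COMMITs performed
  int rollbacks;   // number of real ROLLBACKs performed
} demo_db;

// Starts a (possibly nested) transaction. Only the outermost call
// would issue a real BEGIN.
int demo_transaction_begin(demo_db * db){
  ++db->depth;
  return 0;
}

// Ends one level. A rollback request is remembered until the outermost
// level ends, at which point the whole transaction is rolled back.
int demo_transaction_end(demo_db * db, int rollback){
  assert(db->depth > 0 && "end() without matching begin()");
  if(rollback) db->doRollback = 1;
  if(--db->depth > 0) return 0; // still nested: defer the decision
  if(db->doRollback){
    ++db->rollbacks;
    db->doRollback = 0;
  }else{
    ++db->commits;
  }
  return 0;
}
```

 Because the decision is deferred to the outermost level, a dry-run
 wrapper only has to pass a true rollback flag to its own end call; any
 nested begin/end pairs inside the wrapped operations never reach the
 db on their own.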
586 
587 */
588 
589 /** @page page_code_conventions Code Conventions
590 
591  Project and Code Conventions...
592 
593  Foreword: all of this more or less evolved organically or was
594  inherited from fossil(1) (where it evolved organically, or was
595  inherited from sqlite (where it evol...)), and is written up here
596  more or less as a formality. Historically i've not been a fan of
597  coding conventions, but as someone else put it to me, "the code
598  should look like it comes from a single source," and the purpose
599  of this section is to help orient those looking to hack in the
600  sources. Note that most of what is said below becomes obvious
601  within a few minutes of looking at the sources - there's nothing
602  earth-shatteringly new nor terribly controversial here.
603 
604  The Rules/Suggestions/Guidelines/etc. are as follows...
605 
606 
607  - C89 wherever possible, with the exception that we optionally
608  use the C99-specified fixed integer types and their standard
609  formatting strings when possible (if the platform has them
610  resp. if the configuration header is configured for them). We also
611  use/tolerate 'long long' (via sqlite3), which is not strictly C89
612  but is supported on all modern compilers even when compiling in
613  C89 mode. For gcc and workalike compilers, the -Wno-long-long flag
614  can be used to suppress warnings regarding non-standardization of
615  that type. (Whether or not those warnings appear depends on other
616  warning levels.) Apropos warning levels...
617 
618  - The canonical build environment uses the most restrictive set of
619  warning/error levels possible, with the exception of tolerating
620  'long long', as mentioned above. It is highly recommended that
621  non-canonical build environments do the same. Adding -Wall -Werror
622  -pedantic does _not_ guarantee that all C compliance/portability
623  problems can be caught by the compiler, but it goes a long way in
624  helping us to write clean code. The clang compiler is particularly
625  good at catching minor foo-foo's such as uninitialized variables.
626 
627  - API docs (as you have probably already noticed) do not (any
628  longer) follow Fossil's comment style, but instead use
629  Doxygen-friendly formatting. Each comment block MUST start with
630  two or more asterisks, or '*!', or doxygen apparently doesn't
631  understand it
632  (http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html). When
633  adding code snippets and whatnot to docs, please use doxygen
634  conventions if it is not too much of an inconvenience. All public
635  APIs must be documented with a useful amount of detail. If you
636  hate documenting, let me know and i'll document it (it's what i do
637  for fun).
638 
639  - Public API members have a fsl_ or FSL_ prefix (fossil_ seems too
640  long?). For private/static members, anything goes. Optional or
641  "add-on" APIs (e.g. ::fcli) may use other prefixes, but are
642  encouraged to use an "f-word" (as it were), simply out of deference
643  to long-standing software naming conventions.
644 
645  - Structs and functions use lower_underscore_style()
646 
647  - Overall style, especially scope blocks and indentation, should
648  follow Fossil v1.x. We are not at all picky about whether or not
649  there is a space after/before parens in if( foo ), and similar
650  small details, just the overall code pattern.
651 
652  - Structs and enums all get the optional typedef so that they do
653  not need to be qualified with 'struct' resp. 'enum' when used.
654 
655  - Function typedefs are named fsl_XXX_f. Implementations of such
656  typedefs/interfaces are typically named fsl_XXX_f_SUFFIX(), where
657  SUFFIX describes the implementation's
658  specialization. e.g. fsl_output_f() is a callback
659  typedef/interface and fsl_output_f_FILE() is a concrete
660  implementation for FILE handles.
661 
662  - Typedefs for non-struct types (numerics and enums) tend to be
663  named fsl_XXX_t.
664 
665  - Functions follow the naming pattern prefix_NOUN_VERB(), rather
666  than the more C-conventional prefix_VERB_NOUN(),
667  e.g. fsl_foo_get() and fsl_foo_set() rather than fsl_get_foo() and
668  fsl_set_foo(). The primary reasons are (A) sortability for
669  document processors and (B) they more naturally match with OO API
670  conventions, e.g. noun.verb(). A few cases knowingly violate this
671  convention for the sake of readability or sorting of several related
672  functions (e.g. fsl_db_get_XXX() instead of fsl_db_XXX_get()).
673 
674  - Structs intended to be creatable on the stack are accompanied by
675  a const instance named fsl_STRUCT_NAME_empty, and possibly by a
676  macro named fsl_STRUCT_NAME_empty_m, both of which are
677  "default-initialized" instances of that struct. This is superior
678  to using memset() for struct initialization because we can define
679  (and document) arbitrary default values and all clients who
680  copy-construct them are unaffected by many types of changes to the
681  struct's signature (though they may need a recompile). The
682  intention of the fsl_STRUCT_NAME_empty_m macro is to provide a
683  struct-embeddable form for use in other structs or
684  copy-initialization of const structs, and the _m macro is always
685  used to initialize its const struct counterpart. e.g. the library
686  guarantees that fsl_cx_empty_m (a macro representing an empty
687  fsl_cx instance) holds the same default values as fsl_cx_empty (a
688  const fsl_cx value).
689 
690  - Returning int vs fsl_int_t vs fsl_size_t: int is used as a
691  conventional result code. fsl_int_t is used as a signed
692  length-style result code (e.g. printf() semantics). Unsigned
693  ranges use fsl_size_t. char is used to indicate a boolean. ints
694  are (also) used as a "triplean" (3 potential values, e.g. <0, 0,
695  >0). fsl_int_t also guarantees that it will be 64-bit if
696  available, so can be used for places where large values are needed
697  but a negative value is legal (or handy), e.g. fsl_strndup()'s
698  second argument. The use of the fsl_xxx_t typedefs, rather than
699  (unsigned) int, is primarily for readability/documentation,
700  e.g. so that readers can know immediately that the function does
701  not use integer argument or result-code return semantics. It also
702  allows us to better define platform-portable printf/scanf-style
703  format modifiers for them (analogous to C99's PRIi32 and friends),
704  which often come in handy.
705 
706  - Signed vs. unsigned types for size/length arguments: use the
707  fsl_int_t (signed) argument type when the client may legally pass
708  in a negative value as a hint that the API should use fsl_strlen()
709  (or similar) to determine a byte array's length. Use fsl_size_t
710  when no automatic length determination is possible (or desired),
711  to "force" the client to pass the proper length. Internally
712  fsl_int_t is used in some places where fsl_size_t "should" be used
713  because some ported-in logic relies on loop control vars being
714  able to go negative. Additionally, fossil internally uses negative
715  blob lengths to mark phantom blobs, and care must be taken when
716  using fsl_size_t with those.
717 
718  - Functions taking ellipses (...) are accompanied by a va_list
719  counterpart named the same as the (...) form plus a trailing
720  'v'. e.g. fsl_appendf() and fsl_appendfv(). We do not use the
721  printf()/vprintf() convention because that hoses sorting of the
722  functions in generated/filtered API documentation.
723 
724  - Error handling/reporting: please keep in mind that the core code
725  is a library, not an application. The main implication is that
726  all lib-level code needs to check for errors wherever they can
727  happen (e.g. on every single memory allocation, of which there are
728  many) and propagate errors to the caller, to be handled at its
729  discretion. The app-level code (::fcli) is not particularly strict
730  in this regard, and installs its own allocator which abort()s on
731  allocation error, which simplifies app-side code somewhat
732  vis-a-vis lib-level code.
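
Three of the conventions above (the _empty/_empty_m initializers, a
negative length meaning "use strlen()", and the trailing-'v' va_list
counterpart) can be sketched as follows. This is only an illustration:
demo_buf and every demo_-prefixed name are hypothetical and are not
part of libfossil, and plain long stands in for fsl_int_t.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

// Hypothetical struct, NOT part of libfossil: a small fixed-size text buffer.
typedef struct demo_buf demo_buf;
struct demo_buf {
  char mem[64];  // storage, always NUL-terminated
  size_t used;   // bytes in use, not counting the NUL
};

// Macro form: embeddable in other struct initializers.
#define demo_buf_empty_m { {0}, 0 }
// Const form, initialized from the macro so the two cannot drift apart.
static const demo_buf demo_buf_empty = demo_buf_empty_m;

// Appends n bytes of src. A negative n is a hint to use strlen(src).
// Returns 0 on success, non-0 if the data would not fit.
static int demo_buf_append(demo_buf *b, const char *src, long n) {
  size_t len;
  assert(b && src);
  len = (n < 0) ? strlen(src) : (size_t)n;
  if (len >= sizeof(b->mem) - b->used) return 1;
  memcpy(b->mem + b->used, src, len);
  b->used += len;
  b->mem[b->used] = 0;
  return 0;
}

// va_list form: named like the (...) form plus a trailing 'v'.
static int demo_buf_appendfv(demo_buf *b, const char *fmt, va_list args) {
  size_t avail = sizeof(b->mem) - b->used;
  int n = vsnprintf(b->mem + b->used, avail, fmt, args);
  if (n < 0 || (size_t)n >= avail) {
    b->mem[b->used] = 0;  // discard any partial write
    return 1;
  }
  b->used += (size_t)n;
  return 0;
}

// (...) form, implemented in terms of the 'v' form.
static int demo_buf_appendf(demo_buf *b, const char *fmt, ...) {
  va_list args;
  int rc;
  va_start(args, fmt);
  rc = demo_buf_appendfv(b, fmt, args);
  va_end(args);
  return rc;
}

// Usage: returns 0 if the buffer ends up holding the expected text.
int demo_buf_usage(void) {
  demo_buf b = demo_buf_empty;               // copy-construct, no memset()
  int rc = demo_buf_append(&b, "rid=", -1);  // negative length: strlen()
  if (!rc) rc = demo_buf_appendf(&b, "%d", 42);
  if (rc) return rc;                         // propagate errors to the caller
  return strcmp(b.mem, "rid=42");
}
```

Note that every function reports failure through its int result code
and leaves the decision about what to do with it to the caller, in
line with the error-handling guideline above.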
733 */
734 
735 
736 /** @page page_fossil_arch Fossil Architecture Overview
737 
738  An introduction to the Fossil architecture. These docs
739  are basically just a reformulation of other, more detailed,
740  docs which can be found via the main Fossil site, e.g.:
741 
742  - http://fossil-scm.org/index.html/doc/trunk/www/concepts.wiki
743 
744  - http://fossil-scm.org/index.html/doc/trunk/www/fileformat.wiki
745 
746 
747  Fossil's internals are fundamentally broken down into two basic
748  parts. The first is a "collection of blobs." The simplest way to
749  think of this (and it's not far from the full truth) is a
750  directory containing lots of files, each one named after the SHA1
751  hash of its contents. This pool contains ALL content required for
752  a repository - all other data can be generated from data contained
753  here. Included in the blob pool are so-called Artifacts. Artifacts
754  are simple text files with a very strict format, which hold
755  information regarding the identities of, relationships involving,
756  and other metadata for each type of blob in the pool. The most
757  basic Artifact type is called a Manifest, and a Manifest tells us,
758  amongst other things, which of the SHA1-based file names has which
759  "real" file name, which version the parent (or parents!) is (or
760  are), and other data required for a "commit" operation.
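
The "collection of blobs" idea can be sketched as a tiny in-memory,
content-addressed pool. This is a toy under loud assumptions: all
demo_ names are hypothetical and not part of libfossil, content is
treated as NUL-terminated text, and a 64-bit FNV-1a hash is swapped in
for SHA1 purely to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { DEMO_POOL_MAX = 16, DEMO_NAME_LEN = 16 };

// Hypothetical types, NOT part of libfossil.
typedef struct demo_blob demo_blob;
struct demo_blob {
  char name[DEMO_NAME_LEN + 1];  // hex hash of the content
  const char *content;           // borrowed, not owned by the pool
};

typedef struct demo_pool demo_pool;
struct demo_pool {
  demo_blob blobs[DEMO_POOL_MAX];
  int count;
};

static const demo_pool demo_pool_empty = { { { {0}, NULL } }, 0 };

// Names a blob after its content. FNV-1a is a stand-in for SHA1 here.
static void demo_hash_name(const char *content, char *out) {
  uint64_t h = 14695981039346656037ULL;
  const unsigned char *p = (const unsigned char *)content;
  for (; *p; ++p) {
    h ^= *p;
    h *= 1099511628211ULL;
  }
  snprintf(out, DEMO_NAME_LEN + 1, "%016llx", (unsigned long long)h);
}

// Looks up a blob by name. Returns NULL if the pool does not hold it.
static const demo_blob *demo_pool_get(const demo_pool *p, const char *name) {
  int i;
  assert(p && name);
  for (i = 0; i < p->count; ++i) {
    if (0 == strcmp(p->blobs[i].name, name)) return &p->blobs[i];
  }
  return NULL;
}

// Stores content and returns its name, or NULL if the pool is full.
// Identical content maps to the same name and is stored only once.
static const char *demo_pool_put(demo_pool *p, const char *content) {
  char name[DEMO_NAME_LEN + 1];
  const demo_blob *b;
  assert(p && content);
  demo_hash_name(content, name);
  b = demo_pool_get(p, name);
  if (b) return b->name;
  if (p->count >= DEMO_POOL_MAX) return NULL;
  memcpy(p->blobs[p->count].name, name, sizeof(name));
  p->blobs[p->count].content = content;
  return p->blobs[p->count++].name;
}

// Usage: returns 0 if the pool behaves as a content-addressed store.
int demo_pool_usage(void) {
  demo_pool pool = demo_pool_empty;
  const char *n1 = demo_pool_put(&pool, "hello fossil");
  const char *n2 = demo_pool_put(&pool, "hello fossil");  // duplicate
  const char *n3 = demo_pool_put(&pool, "other content");
  if (!n1 || !n2 || !n3) return 1;
  if (n1 != n2) return 2;             // same content, same stored name
  if (0 == strcmp(n1, n3)) return 3;  // different content, different name
  if (pool.count != 2) return 4;      // the duplicate was not stored twice
  return strcmp(demo_pool_get(&pool, n3)->content, "other content");
}
```

In this picture a Manifest would simply be one more blob in the pool
whose text maps such hash-based names to "real" file names.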
761 
762  The blob pool and the Manifests are all a Fossil repository really
763  needs in order to function. On top of that basis, other forms of
764  Artifacts provide features such as tagging (which is the basis of
765  branching and merging), wiki pages, and tickets. From those
766  Artifacts, Fossil can create/calculate all sorts of
767  information. For example, as new Artifacts are inserted it
768  transforms the Artifact's metadata into a relational model which
769  sqlite can work with. That leads us to what is conceptually the
770  next-higher-up level, but is in practice a core-most component...
771 
772  Storage. Fossil's core model is agnostic about how its blobs are
773  stored, but libfossil and fossil(1) both make heavy use of sqlite
774  to implement many of their features. These include:
775 
776  - Transaction-capable storage. It's almost impossible to corrupt a
777  Fossil db in normal use. sqlite3 offers literally the most robust
778  general-purpose file format on the planet.
779 
780  - The storage of the raw blobs.
781 
782  - Artifact metadata is transformed into various DB structures
783  which allow libfossil to traverse historical data much more
784  efficiently than would be possible without a db-like
785  infrastructure (and everything that implies). These structures are
786  kept up to date incrementally as new Artifacts are stored in a
787  repository, either via local edits or synching in remote
788  content.
789 
790  - A tremendous amount of the "leg-work" in processing the
791  repository state is handled by SQL queries, without which the
792  library would easily require 5-10x more code in the form of
793  equivalent hard-coded data structures and corresponding
794  functionality. The db approach allows us to create ad-hoc structures as
795  we need them, providing us a great deal of flexibility.
796 
797  All content in a Fossil repository is in fact stored in a single
798  database file. Fossil additionally uses another database (a
799  "checkout" db) to keep track of local changes, but the repo
800  contains all "fossilized" content. Each copy of a repo is a
801  full-fledged repo, each capable of acting as a central copy for
802  any number of clones or checkouts.
803 
804  That's really all there is to understand about Fossil. How it does
805  its magic, keeping everything aligned properly, merging in
806  content, how it stores content, etc., are all internal details
807  which most clients will not need to know anything about in order
808  to make use of fossil(1). Using libfossil effectively, though,
809  does require learning _some_ amount of how Fossil works. That will
810  require taking some time with _other_ docs, however: see the
811  links at the top of this section for some starting points.
812 
813 
814  Sidebar:
815 
816  - The only file-level permission Fossil tracks is the "executable"
817  (a.k.a. "+x") bit. It internally marks symlinks as a permission
818  attribute, but that is applied much differently than the
819  executable bit and only does anything useful on platforms which
820  support symlinks.
821 
822 */
823 
824 #endif
825 /* NET_FOSSIL_SCM_PAGES_H_INCLUDED */