whefs

ACHTUNG: THIS SITE HAS MOVED

As of 16 June 2009, the whefs project has moved to Google Code:

http://code.google.com/p/whefs

The content of this site will only be current until development resumes under the new site (first the wiki needs to be ported). After that, this copy will not be maintained.

whefs: WanderingHorse.net Embedded Filesystem

whefs is a free C library implementing an embedded virtual filesystem. It works by creating a "filesystem" inside a so-called container file (or in memory). This API can then treat that container similarly to a filesystem. In essence this is similar to conventional archives (e.g. zip files), except that this library provides random-access read/write support to the filesystem via an API similar to the conventional fopen(), fread(), fwrite() and friends.
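To make the container idea concrete, here is a minimal sketch in plain standard C. It is not the whefs API and not the whefs file format: the `toy_*` names and the fixed 64-byte slot layout are assumptions made up for illustration. It only shows how random-access reads and writes can be addressed to a "pseudofile" region inside one ordinary file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical layout, NOT the whefs format: the container is a flat
   array of fixed-size slots, and slot N starts at byte N * TOY_SLOT_SIZE. */
enum { TOY_SLOT_SIZE = 64 };

/* Positions the container at (slot, off) after checking that n bytes
   starting there stay inside the slot. Returns 0 on success. */
static int toy_seek(FILE *container, unsigned slot, unsigned off, size_t n)
{
    assert(container != NULL);
    if (off > (unsigned)TOY_SLOT_SIZE) return -1;
    if (n > (size_t)((unsigned)TOY_SLOT_SIZE - off)) return -1;
    return fseek(container, (long)slot * TOY_SLOT_SIZE + (long)off,
                 SEEK_SET) == 0 ? 0 : -2;
}

/* Writes n bytes from src into the given slot at offset off. */
static int toy_write(FILE *container, unsigned slot, unsigned off,
                     const void *src, size_t n)
{
    int rc = toy_seek(container, slot, off, n);
    if (rc != 0) return rc;
    return fwrite(src, 1, n, container) == n ? 0 : -3;
}

/* Reads n bytes from the given slot at offset off into dest. */
static int toy_read(FILE *container, unsigned slot, unsigned off,
                    void *dest, size_t n)
{
    int rc = toy_seek(container, slot, off, n);
    if (rc != 0) return rc;
    return fread(dest, 1, n, container) == n ? 0 : -3;
}
```

A caller would open the container once (e.g. with `tmpfile()` or `fopen(name, "r+b")`) and then read or write any slot in any order, which is the property that distinguishes this approach from a sequential archive.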

Author: Stephan Beal (http://wanderinghorse.net/home/stephan/)

License: Public Domain.

(A summary of the more useful links on this site can be found here.)

Features

  • Provides features to create and open embedded/virtual filesystems, each of which is basically a filesystem living inside a single file, and to access "pseudofiles" within those filesystems.
  • i'm a documentation maniac - whefs comes with over 100k of API documentation and another 60k+ of docs in this site's wiki.
  • Optimized for low memory consumption: small use cases can get away with less than 2kb of malloc()'d memory, and "normal" cases need less than 10k (as measured by Valgrind).
  • The i/o support is provided by libwhio, meaning it can in principle use a wide range of back-end storage devices. Implementations are provided for FILE handles, in-memory buffers, and mapping a user-supplied memory range as storage. It can also host an EFS which is statically compiled and linked into the application (see whefs2c).
  • The VFS file format is independent of the device or platform bitness/endianness.
  • Provides two different approaches for "pseudofiles" inside a VFS. The lowest level implements a whio wrapper on top of the VFS, so that pseudofiles can act as full-fledged i/o devices. The higher-level API closely resembles the standard C file APIs (e.g. fopen(), fread()/read(), fwrite()/write(), etc.). The approaches are not mutually exclusive, and any given pseudofile can be accessed via either of the APIs.
  • A side-effect of the i/o model is that it is possible to embed one VFS within another (to a near arbitrary depth). Being a side-effect, this support requires no special-case handling in the vfs kernel or i/o layer.
  • The source tree comes with several tools for working with VFSes, e.g. for creating VFSes, listing their contents, and copying files into and out of a VFS.
  • Supports read-only as well as read-write operation, at the VFS and pseudofile levels.
  • Released into the Public Domain, completely unhindered by licensing restrictions (or warranties, for that matter!). Googling has revealed little open-source work on embedded filesystems (but lots of commercial products), and i have been unable to find a comparable library released under non-restrictive (or non-viral) licensing terms.
  • Designed to be easy to copy directly into a client source tree. See the amalgamation page for details.
  • Fairly rigorous consistency and bounds checking - it bails out if it finds the slightest hint of foul play. (Just remember to check the error codes!)
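The claim above that the file format is independent of platform bitness and endianness is usually achieved by writing integers one byte at a time in a fixed order, rather than dumping in-memory structs. The following is a minimal sketch of that general technique, not code taken from whefs: the `toy_*` names are hypothetical, and big-endian is an assumed byte order chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores v into dest[0..3], most significant byte first, regardless of
   the host's native byte order. (Byte order is an assumption here.) */
static void toy_encode_u32(unsigned char *dest, uint32_t v)
{
    assert(dest != NULL);
    dest[0] = (unsigned char)((v >> 24) & 0xFF);
    dest[1] = (unsigned char)((v >> 16) & 0xFF);
    dest[2] = (unsigned char)((v >> 8) & 0xFF);
    dest[3] = (unsigned char)(v & 0xFF);
}

/* Rebuilds the value from src[0..3]; the inverse of toy_encode_u32(). */
static uint32_t toy_decode_u32(const unsigned char *src)
{
    assert(src != NULL);
    return ((uint32_t)src[0] << 24)
         | ((uint32_t)src[1] << 16)
         | ((uint32_t)src[2] << 8)
         |  (uint32_t)src[3];
}
```

Because both functions use only shifts and masks on fixed-width types, the bytes on disk come out the same on a 32-bit little-endian x86 box and a 64-bit big-endian SPARC, so a container written on one can be read on the other.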

Misfeatures

  • Some parts of the public API need fleshing out.
  • Still needs lots more testing before i'm comfortable calling it "ready for use."
  • Very little support for concurrency - see ConcurrencyInWHEFS for details.
  • Optimised for ease of use/maintenance and memory consumption, not speed. Typical use requires only a few KB of dynamically allocated memory (for the minimal caching it does). Despite the minimalism, however, it is surprisingly performant. Versions as of 20090613 will use significantly more memory for inode name caching, but if the EFS is kept within "reasonable bounds" the memory consumption stays small.
  • Does not currently support directory hierarchies. See ticket #[10857664fa] for info on that.
  • Doesn't abstract enough away to support standard filesystems (e.g. VFAT) inside the container file. It uses its own custom filesystem implementation (my very first attempt at such) which may be suitable for small use cases but will certainly not scale well (performance-wise) into the thousands of files range.
  • A signal, crash, propagated C++ exception, or similar interruption while the FS is writing or holding unflushed file information can leave the pseudofile contents in an inconsistent state.
  • Until the software stabilizes, the file format may change from version to version. It is however always possible to export the data from an old VFS (using an older version of whefs) and re-import it using a newer version.

Significant TODOs

See whefs-TODOs for the list of more pressing TODOs.

Current status

"It works for me!"

Very Beta. It "seems to work", but the nature of the problem means there is lots of room for errors and bugs. Do not make the mistake of using it for data which you can't afford to lose.

That said, most of the basics are in place and working. There is plenty of cleaning up and refactoring to do, however.

The code has been shown to build, run, and pass basic sanity checks on:

  • gcc on Linux ix86/32: gcc 3.2.3, 4.0.2, 4.3.x
  • gcc on Linux ix86/64: gcc 4.2.x
  • tcc on ix86/32 Linux: tcc 0.9.24/25 (and tcc is FAST!)
  • gcc on Nexenta OpenSolaris 2.x in an x86/32 virtual machine.
  • gcc on a Sun V240 (sparcv9) under Solaris 10: gcc 3.4
  • Sun Studio 12 on a Sun V240 (sparcv9) under Solaris 10

The code compiles cleanly, even with gcc's pedantic mode enabled.

Most compilers require C99 mode to be enabled explicitly (in gcc this is the -std=c99 flag, on SunCC it's -xc99=all). Note that compilers which don't support C99 variable-length arrays (e.g. tcc) will need to malloc() in some places where other compilers do not, so the overall memory costs may go up.
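The trade-off between variable-length arrays and malloc() can be sketched as follows. This is an illustration of the general pattern, not an excerpt from the whefs sources: `TOY_HAVE_VLA` and `toy_reverse()` are hypothetical names, and a build for a compiler without VLA support would define `TOY_HAVE_VLA` to 0 by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* TOY_HAVE_VLA is a made-up configuration macro. By default it is on
   when the compiler claims C99 and does not opt out of VLAs (C11's
   __STDC_NO_VLA__); pass -DTOY_HAVE_VLA=0 to force the malloc() path. */
#if !defined(TOY_HAVE_VLA)
#  if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) \
      && !defined(__STDC_NO_VLA__)
#    define TOY_HAVE_VLA 1
#  else
#    define TOY_HAVE_VLA 0
#  endif
#endif

/* Reverses the first n bytes of s using an n-byte scratch buffer.
   Returns 0 on success, -1 if the fallback allocation fails. */
static int toy_reverse(char *s, size_t n)
{
    assert(s != NULL);
    if (n == 0) return 0;
#if TOY_HAVE_VLA
    char scratch[n];                      /* on the stack, no malloc() */
#else
    char *scratch = (char *)malloc(n);    /* fallback costs heap memory */
    if (!scratch) return -1;
#endif
    for (size_t i = 0; i < n; ++i) scratch[i] = s[n - 1 - i];
    memcpy(s, scratch, n);
#if !TOY_HAVE_VLA
    free(scratch);
#endif
    return 0;
}
```

Both paths produce the same result; the only difference is where the scratch buffer lives, which is why the malloc()'d memory figures can differ between compilers.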

Reports of success/failure for other platforms are always appreciated.

Requirements

The library is standalone C code conforming to the ISO C99 standard (don't even think about asking me to back-port to C89 - not gonna happen). The storage handler for on-disk VFSes (as opposed to in-memory VFSes) requires certain functions defined in the POSIX.1-2001 standard (e.g. ftruncate() and fileno()). Most or all Unix-like systems will have the few required POSIX routines. Windows... i don't know. Without those storage handlers it can only be used for in-memory filesystems.
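For readers unfamiliar with why a FILE-based storage handler needs POSIX at all: standard C has no way to shrink or pre-size an open file, so the usual approach is to fetch the descriptor with fileno() and resize it with ftruncate(). The sketch below shows that combination on its own. The `toy_*` helpers are hypothetical and are not part of whefs or libwhio.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for the POSIX.1-2001 declarations */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Resizes the file behind f to exactly new_size bytes, growing (with
   zero bytes) or shrinking as needed. Returns 0 on success. */
static int toy_resize(FILE *f, long new_size)
{
    if (!f || new_size < 0) return -1;
    if (fflush(f) != 0) return -2;   /* push buffered writes out first */
    return ftruncate(fileno(f), (off_t)new_size) == 0 ? 0 : -3;
}

/* Reports the current size of the file behind f, or -1 on error. */
static long toy_size(FILE *f)
{
    assert(f != NULL);
    if (fseek(f, 0L, SEEK_END) != 0) return -1;
    return ftell(f);
}
```

On a platform without these two routines, a storage back end would have to fall back to something else (or to memory only), which matches the limitation described above.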

No third-party libraries are needed except the system's standard C libraries.

The i/o layer has some optional support for compression using zlib, but zlib has become a core system-level component and is available preinstalled on any sane system. See the Makefile for how to enable it.

Download

See the downloads page.

Documentation

See the docs page.

News

See the news page.

Want to help?

whefs has some notable limitations and could be improved in some significant ways. Any feedback or assistance is always appreciated. You can reach me via http://wanderinghorse.net/home/stephan/. Anyone who shows an interest and submits a patch or two will gladly be given write access to the code repository.

Some areas of improvement which specifically come to mind (and in which i could definitely use a hand) are: