See also: TableOfContents
libwhio
Welcome to the Fossil source code repository for libwhio, the WanderingHorse.net I/O library for C. whio is a C library encapsulating an i/o device and stream API. It was originally developed as parts of two other libraries, but was found to be generic enough to fork out into its own library.
This site is a Fossil source repository, containing the source code, a wiki, bug tracker, etc., for this project. To be able to download the code or use most of the hyperlinks on this site, you must click the /login link and log in as "anonymous" with the password shown on the screen. This keeps bots from downloading every version of every file in the repository or traversing the whole history of every source file.
Author: Stephan Beal (http://wanderinghorse.net/home/stephan/)
License: Public Domain
Code status: Works for me! whio is a core component of libwhefs, is developed in close conjunction with that library, and seems to work well. It is also the basis for a set of JavaScript I/O classes based on the Google v8 JavaScript engine, and (unsurprisingly) has caused no Grief there, either.
Download: downloads page
What is whio?
This library provides an object oriented C API for interacting with abstract data stores, either via random or sequential access. On top of this interface, concrete implementations are provided for FILE and in-memory data stores (via dynamic memory or a client-supplied memory range). All implementations have the same, fairly small, public interface.
It can essentially be used to wrap any random-access data stream, and adding wrappers for custom stream types is easy to do. For example libwhefs, an embedded filesystem library, uses a custom whio_dev implementation to provide access to "virtual" files (inside an embedded filesystem) using the same API as one can use for FILE and memory buffer access.
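As a rough illustration of the "same interface, different store" idea described above, the following sketch shows a device object that carries a table of function pointers, plus one implementation over a client-supplied memory range. All names here (demo_dev, demo_dev_api, demo_membuf_open) are hypothetical and are not the whio API; the whio_dev page documents the real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT the whio_dev API,
   only a sketch of the function-table style described above. */
typedef struct demo_dev demo_dev;

typedef struct demo_dev_api {
    size_t (*read)(demo_dev *self, void *dest, size_t n);
    size_t (*write)(demo_dev *self, const void *src, size_t n);
    int (*seek)(demo_dev *self, size_t pos);
    void (*finalize)(demo_dev *self);
} demo_dev_api;

struct demo_dev {
    const demo_dev_api *api; /* operations for this concrete device type */
    void *impl;              /* implementation-private state */
};

/* Private state for a device over a client-supplied memory range. */
typedef struct demo_membuf {
    unsigned char *mem;
    size_t size;
    size_t pos;
} demo_membuf;

static size_t membuf_read(demo_dev *self, void *dest, size_t n) {
    demo_membuf *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail; /* short read at the end of the range */
    memcpy(dest, m->mem + m->pos, n);
    m->pos += n;
    return n;
}

static size_t membuf_write(demo_dev *self, const void *src, size_t n) {
    demo_membuf *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail; /* fixed-size range: never writes out of bounds */
    memcpy(m->mem + m->pos, src, n);
    m->pos += n;
    return n;
}

static int membuf_seek(demo_dev *self, size_t pos) {
    demo_membuf *m = self->impl;
    if (pos > m->size) return -1;
    m->pos = pos;
    return 0;
}

static void membuf_finalize(demo_dev *self) {
    free(self->impl); /* the client still owns the memory range itself */
    free(self);
}

static const demo_dev_api membuf_api = {
    membuf_read, membuf_write, membuf_seek, membuf_finalize
};

/* Wraps [mem, mem+size) in a device. Returns NULL on allocation failure. */
demo_dev *demo_membuf_open(void *mem, size_t size) {
    assert(mem != NULL);
    demo_dev *d = malloc(sizeof *d);
    demo_membuf *m = malloc(sizeof *m);
    if (!d || !m) { free(d); free(m); return NULL; }
    m->mem = mem; m->size = size; m->pos = 0;
    d->api = &membuf_api;
    d->impl = m;
    return d;
}
```

Client code in this sketch only ever calls through d->api, so a FILE-backed implementation with the same function table could be swapped in without touching the callers.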
This code was originally developed as part of libc11n, a serialization framework, but was eventually forked out for inclusion into the more generic whio library.
Features:
- Pedantically thorough API documentation litters the header files. No public API member is undocumented (and most are documented very well).
- A simple object oriented interface, modeled after the standard C i/o API, for interacting with sequential- or random-access data stores. Includes an ioctl()-like interface for implementing customizations beyond what the public API provides for.
- Provides device implementations for FILE handles (using either FILE objects or file descriptors) and in-memory buffers. That is, read/write from/to memory or a file using the same interface.
- The in-memory i/o devices can be configured to use a client-specified memory range or dynamic memory. Dynamic buffers can be configured to stay at a fixed size or expand as needed when write() takes them out of bounds.
- "Subdevices" allow any i/o device to be partitioned into several logical i/o devices, each of which can only read/write from/to a specified range within the parent device. These can be embedded to further partition an address range.
- So-called "block devices" simplify uses where a device is partitioned into a number of records of a fixed size.
- The core i/o classes do no copying or transformation of data between the client and the data store, so the abstraction penalty is quite low.
- Very frugal memory usage. Some of my apps make heavy use of the whio_dev API, and those apps typically need to allocate under 200 bytes from the i/o API. (But note that fopen() allocates more under the hood - somewhere around 300 bytes per file on my box.)
- Allows client code to redefine the memory allocator used by the library, so they can specify a custom memory source.
- Uniform cleanup of device objects, regardless of the underlying storage. This makes avoiding leaks very easy.
- Allows tying client-side data to a device, along with an optional destructor function to clean up the data when the device is closed.
- Devices can, with a little work, be "stacked". That is, one implementation can be used to add features to another (e.g. buffering, device range fencing, or multiplexing).
- Compiles quickly and cleanly in 32- and 64-bit environments. (That said, it is untested on storage which cannot fit in 32 bits.)
- Can be compounded into two files to simplify redistribution and re-use (one .h and one .c file). See the AmalgamationBuild page for details.
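To make the "subdevice" item above more concrete, here is a minimal sketch of range fencing: each window translates its own offsets into the parent's address space and clips every request to its own bounds, so nested windows can never reach outside their parent. The demo_range type and its functions are hypothetical names for illustration and do not reflect how whio implements subdevices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not whio code: a window onto a parent range. */
typedef struct demo_range {
    struct demo_range *parent; /* NULL for the root, which holds raw memory */
    unsigned char *mem;        /* used only by the root */
    size_t begin;              /* offset of this window within the parent */
    size_t size;               /* bytes visible through this window */
} demo_range;

/* Sets up the root window over a client-supplied memory range. */
void demo_range_root(demo_range *out, void *mem, size_t size) {
    out->parent = NULL;
    out->mem = mem;
    out->begin = 0;
    out->size = size;
}

/* Fences [begin, begin+size) of parent. Returns 0 on success,
   -1 (leaving out untouched) if the window does not fit. */
int demo_range_sub(demo_range *out, demo_range *parent,
                   size_t begin, size_t size) {
    assert(out != parent);
    if (begin > parent->size || size > parent->size - begin) return -1;
    out->parent = parent;
    out->mem = NULL;
    out->begin = begin;
    out->size = size;
    return 0;
}

/* Writes up to n bytes at window-relative offset off, clipped to the
   window. Returns the number of bytes actually written. */
size_t demo_range_write(demo_range *r, size_t off, const void *src, size_t n) {
    if (off >= r->size) return 0;
    if (n > r->size - off) n = r->size - off;
    if (r->parent) /* translate into the parent's coordinates */
        return demo_range_write(r->parent, r->begin + off, src, n);
    memcpy(r->mem + off, src, n);
    return n;
}

/* Reads up to n bytes at window-relative offset off, clipped to the
   window. Returns the number of bytes actually read. */
size_t demo_range_read(demo_range *r, size_t off, void *dest, size_t n) {
    if (off >= r->size) return 0;
    if (n > r->size - off) n = r->size - off;
    if (r->parent)
        return demo_range_read(r->parent, r->begin + off, dest, n);
    memcpy(dest, r->mem + off, n);
    return n;
}
```

Because clipping happens at every level before the request is forwarded, a write through a nested window is bounded by the innermost fence.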
Misfeatures
- Does not attempt to be the end-all/be-all of i/o interfaces. It does what i need it to do.
- Has no specific features to support asynchronous i/o, and the API does not account for its requirements.
- Its interfaces assume that i/o devices (the whio_dev API) support read-only or read-write modes, but write-only is not accounted for in the random-access APIs. (Sequential streams (the whio_stream API) can of course be write-only.)
- Does not directly support thread locking. Multi-threaded access to any given i/o device must be carefully serialized by the client. It would appear to be feasible to add locking support to the stream classes (which have relatively few operations), but consistent un/locking of random-access classes requires a higher-level API in order to make certain combinations of operations (e.g. write-at-pos() (seek-then-write)) atomic vis-a-vis a given mutex.
- Because it is device-generic, it does not directly support device-level locking. That said, the whio_dev API does provide a locking abstraction API and the file-based device handlers support it by translating the requests to fcntl() locks (see whio_locking).
- Um... there are probably more, but none come to mind at the moment.
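For readers unfamiliar with the fcntl() locks mentioned above, the following sketch shows a plain POSIX advisory byte-range lock. It uses only the standard fcntl() call and struct flock; the helper name demo_lock_range is hypothetical, and this is not how whio's locking abstraction is implemented (see whio_locking for that).

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of whio: applies an advisory lock of
   the given type (F_RDLCK, F_WRLCK or F_UNLCK) to len bytes starting
   at offset start. A len of 0 means "through the end of the file".
   Returns 0 on success, -1 on error (errno is set by fcntl()). */
int demo_lock_range(int fd, short type, off_t start, off_t len) {
    assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET; /* start is measured from the file's beginning */
    fl.l_start = start;
    fl.l_len = len;
    /* F_SETLKW blocks until the lock is granted; F_SETLK would instead
       fail immediately if another process holds a conflicting lock. */
    return fcntl(fd, F_SETLKW, &fl);
}
```

These locks are advisory and per-process, so they coordinate cooperating processes but do nothing for threads sharing one descriptor, which is why the thread-serialization caveat above still applies.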
Requirements
The code is ANSI C, using some C99 features. The file-based i/o handlers, as opposed to the memory-based handlers, require some functions defined in POSIX.1-2001. All (or most) Unix systems will have the few POSIX functions the API requires.
It has been shown to compile using gcc 4.2.x on Linux x86/32, gcc 4.1 on Linux x86/64, tcc on Linux/x86, and gcc 3.4 on Solaris/Sparc, but older gcc versions explicitly require using the -std=c99 flag to enable the C99 features which whio uses.
There are some optional features which use zlib to provide gzip compression to/from whio_stream objects, and these features require zlib (which is installed by default on nearly every system on the planet). See whio_zlib.h for the routines and the macro which needs to be set to enable these functions.
Code Status
whio is used as the i/o platform in several projects and seems to be quite stable (functionally speaking):
Potential uses
- Memory-based devices act nicely as dynamic string buffers, in particular when lots of output is being generated (e.g. buffering application output, like a generated web page).
- In-memory compression of arbitrary data (using the optional whio_stream/gzip support).
- An embedded filesystem, e.g. whefs uses whio_dev as its back-end and provides a higher-level whio_dev specialization which allows its embedded files to be manipulated using the whio_dev interface. (The lower-level whio_epfs API now exists to replace most of the libwhefs internals.)
- Many places where abstract access to a random-access data store is needed can benefit from being able to swap out storage back-ends. For any given data store, if a whio_dev implementation exists (or can be written) then the whio_dev and whio_stream APIs allow one to easily swap out storage for any given application. e.g. libc11n uses this to provide object de/serialization over arbitrary i/o channels. (In fact, this code is a generalized form of the i/o API from libc11n.) whio_epfs uses this to host an "embedded filesystem" in arbitrary storage.
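The first use listed above, a memory device acting as a dynamic string buffer, boils down to a buffer that expands when a write would go out of bounds. The sketch below shows that growth strategy in isolation. The demo_strbuf type and functions are hypothetical and are not whio's memory device; the doubling policy and the initial capacity of 64 bytes are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not whio code: an append-only growable buffer.
   Zero-initialize before first use, e.g. demo_strbuf b = {0}; */
typedef struct demo_strbuf {
    char *mem;   /* NUL-terminated contents, or NULL while empty */
    size_t used; /* bytes stored, not counting the terminator */
    size_t cap;  /* bytes allocated */
} demo_strbuf;

/* Appends n bytes from src, growing the allocation when the write
   would go out of bounds. Returns 0 on success, -1 on overflow or
   allocation failure (the buffer is left unchanged in that case). */
int demo_strbuf_append(demo_strbuf *b, const char *src, size_t n) {
    assert(b != NULL && src != NULL);
    if (n > SIZE_MAX - b->used - 1) return -1; /* size_t overflow guard */
    size_t need = b->used + n + 1;             /* +1 for the terminator */
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 64;     /* assumed starting size */
        while (cap < need) {
            if (cap > SIZE_MAX / 2) { cap = need; break; }
            cap *= 2; /* doubling keeps repeated appends amortized O(1) */
        }
        char *p = realloc(b->mem, cap);
        if (!p) return -1;
        b->mem = p;
        b->cap = cap;
    }
    memcpy(b->mem + b->used, src, n);
    b->used += n;
    b->mem[b->used] = '\0';
    return 0;
}

/* Releases the allocation and resets the buffer to its empty state. */
void demo_strbuf_free(demo_strbuf *b) {
    free(b->mem);
    b->mem = NULL;
    b->used = 0;
    b->cap = 0;
}
```

An application generating a web page could append each fragment as it is produced and send b.mem in one go at the end.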
See also...
- wiki table of contents
- whio_dev, the random-access i/o API.
- whio_stream, the sequential-access i/o API.
- whio_epfs, the embedded pseudo-filesystem API.
News
10 March 2010:
- After nearly 18 months of waiting for a solution to this problem to surface, the search times for the next-free inode and data block objects in whio_epfs are now O(1). (There's a small amount of new i/o overhead to possibly read/modify/flush one or two neighboring records.) Formerly these ops were O(1) average case but worst-case O(N) (N=block/inode count). The changes required no new memory allocations and remove the only two remaining notable performance bottlenecks in the library :).
2 March 2010:
- Added an interface which allows client applications to replace the memory allocator used by the whio internals. This obsoletes the older (and un-safer) "static allocators".
19 Feb 2010:
- whio_epfs is functional and almost feature-complete.
- Added a device locking abstraction to the whio_dev API.
13 Dec 2009:
- Work has started on refactoring libwhefs (which is based on whio), to move the most basic embedded filesystem features into libwhio. The core functionality is working, meaning clients can now embed "pseudofilesystems" into their apps using an optional component of libwhio.
8 June 2009:
- Now compiles with gcc's -pedantic and -fstrict-aliasing flags. Seems to work, too.
9 March 2009:
- whio is now used in a "real" project to implement I/O classes for JavaScript bindings: http://code.google.com/p/v8-juice/wiki/PluginWhio.
30 Dec 2008: