whio

See also: TableOfContents

libwhio

Welcome to the Fossil source code repository for libwhio, the WanderingHorse.net I/O library for C. whio is a C library encapsulating i/o device and stream APIs, plus many utilities built on top of them. It was originally developed as part of two other libraries, but was found to be generic enough to fork out into its own library. Since that time, it has grown to include a number of device implementations and utility classes, such as hashtables (whio_udb and whio_ht), the storage-side equivalent of malloc() and free() (whio_vlbm), and an embedded filesystem (whio_epfs).

This site is a Fossil source repository containing the source code, a bug tracker, etc., for this project. To download the code or use most of the hyperlinks on this site, you must click the /login link and log in as "anonymous" with the password shown on the screen. This keeps bots from downloading every version of every file in the repository or traversing the whole history of every source file.

This site's wiki is maintained in a separate dedicated wiki:

The old pages will be kept around (because they're already there) but will not be maintained on this site.

Author: Stephan Beal (http://wanderinghorse.net/home/stephan/)

License: Public Domain

Code status: Works for me! whio is a core component of libwhefs, is developed in close conjunction with that library, and seems to work well. It is also the basis for a set of JavaScript I/O classes based on the Google v8 JavaScript engine, and (unsurprisingly) has caused no Grief there, either.

Download: downloads page

What is whio?

This library provides an object oriented C API for interacting with abstract data stores, either via random or sequential access. On top of this interface, concrete implementations are provided for FILE and in-memory data stores (via dynamic memory or a client-supplied memory range), and clients can provide their own. All implementations have the same, fairly small, public interface.

It can essentially be used to wrap any random-access data stream, and adding wrappers for custom stream types is easy to do. For example, whio_epfs, an "embedded filesystem" API, uses a custom whio_dev implementation to provide access to "virtual" files (inside an embedded filesystem) using the same API as one can use for accessing FILEs and memory buffers.
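The "same small interface for every store" idea can be illustrated with a struct of function pointers. The following is only a minimal sketch of that general pattern: every name in it (demo_dev, demo_dev_api, demo_mem, demo_dev_init_mem) is hypothetical and is not taken from the whio headers, which should be consulted for the real types and signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT the whio API, only a sketch
   of the "struct of function pointers" pattern. */
typedef struct demo_dev demo_dev;

typedef struct demo_dev_api {
    size_t (*read)(demo_dev *self, void *dest, size_t n);
    size_t (*write)(demo_dev *self, const void *src, size_t n);
    int    (*seek)(demo_dev *self, size_t pos);
} demo_dev_api;

struct demo_dev {
    const demo_dev_api *api; /* shared by all devices of one implementation */
    void *impl;              /* implementation-specific state */
};

/* State for a device backed by a client-supplied memory range. */
typedef struct demo_mem {
    unsigned char *base;
    size_t size;
    size_t pos;
} demo_mem;

static size_t demo_mem_read(demo_dev *self, void *dest, size_t n) {
    demo_mem *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail; /* short read at the end of the range */
    memcpy(dest, m->base + m->pos, n);
    m->pos += n;
    return n;
}

static size_t demo_mem_write(demo_dev *self, const void *src, size_t n) {
    demo_mem *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail; /* fixed-size range: short write at the end */
    memcpy(m->base + m->pos, src, n);
    m->pos += n;
    return n;
}

static int demo_mem_seek(demo_dev *self, size_t pos) {
    demo_mem *m = self->impl;
    if (pos > m->size) return -1; /* out of range, cursor unchanged */
    m->pos = pos;
    return 0;
}

static const demo_dev_api demo_mem_api = {
    demo_mem_read, demo_mem_write, demo_mem_seek
};

/* Binds dev to the caller-owned range [base, base+size). */
void demo_dev_init_mem(demo_dev *dev, demo_mem *state, void *base, size_t size) {
    assert(dev && state && base);
    state->base = base;
    state->size = size;
    state->pos = 0;
    dev->api = &demo_mem_api;
    dev->impl = state;
}
```

Client code only ever calls through dev->api, so a FILE-backed or "virtual file" implementation could be swapped in by supplying a different function table and state object.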

This code was originally developed as part of libc11n, a generic serialization framework for C. It was eventually forked to become a more generic i/o framework because i needed such a creature in order to develop libwhefs. libwhefs and libwhio are developed in close conjunction, but whio also sees use outside of whefs, and is generic enough to use in many contexts. whio has become my own personal favourite library for general-purpose I/O in C.

Features:

  • Pedantically thorough API documentation litters the header files. No public API member is undocumented (and most are documented very well).
  • A simple object oriented interface, modeled after the standard C i/o API, for interacting with sequential- or random-access data stores. Includes an ioctl()-like interface for implementing customizations beyond what the public API provides for.
  • Provides device implementations for FILE handles (using either FILE objects or file descriptors) and in-memory buffers. That is, read/write from/to memory or a file using the same interface.
  • The in-memory i/o devices can be configured to use a client-specified memory range or dynamic memory. Dynamic buffers can be configured to stay at a fixed size or expand as needed when a write takes them out of bounds.
  • "Subdevices" allow any i/o device to be partitioned into several logical i/o devices, each of which can only read/write from/to a specified range within the parent device. These can be embedded to further partition an address range.
  • So-called "block devices" simplify uses where a device is partitioned into a number of records of a fixed size.
  • The core i/o classes do no copying or transformation of data between the client and the data store, so the abstraction penalty is quite low.
  • Very frugal memory usage. Some of my apps make heavy use of the whio_dev API, and those apps typically need to allocate under 200 bytes from the i/o API. (But note that fopen() allocates more under the hood - somewhere around 300 bytes per file on a 32-bit Linux system.)
  • Allows client code to redefine the memory allocator used by the library, so they can specify a custom memory source for device allocation.
  • Uniform cleanup of device objects, regardless of the underlying storage. This makes avoiding leaks very easy.
  • Allows tying client-side data to a device, along with an optional destructor function to clean up the data when the device is closed.
  • Devices can, with a little work, be "stacked". That is, one implementation can be used to add features to another (e.g. buffering, device range fencing, or multiplexing).
  • Compiles quickly and cleanly in 32- and 64-bit environments. (That said, it is untested on storage which cannot fit in 32 bits.) As of 20110419 the library code all compiles in C89 and C99 modes. The app-level code requires C99.
  • Can be amalgamated into two files (one .h and one .c file) to simplify redistribution and re-use. See the AmalgamationBuild page for details.
  • Provides C++ wrappers for proxying whio_stream objects via the STL i/ostream interfaces. Since any whio_dev can be wrapped as a whio_stream, i/o devices can also be proxied via C++.
  • Comes with several classes for managing specialized storage requirements, e.g. whio_vlbm manages storage blocks in a manner similar to how malloc() and free() manage memory blocks, whio_udb and whio_ht provide storage-based hashtables, and whio_epfs provides an embedded filesystem.
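To make the "subdevice" feature from the list above more concrete, here is a rough sketch of range fencing: a window maps its own offsets onto a slice of a parent store and clamps any access that would leave the slice. The types and functions (demo_store, demo_subdev, demo_subdev_init, demo_subdev_nest, and so on) are invented for this illustration and assume a plain byte array as the parent; whio's actual subdevice API lives in its headers and may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; whio's real subdevice API is not shown here. */
typedef struct demo_store {
    unsigned char *bytes;
    size_t size;
} demo_store;

typedef struct demo_subdev {
    demo_store *parent;
    size_t lower; /* first parent offset inside the window */
    size_t upper; /* one past the last parent offset inside the window */
} demo_subdev;

/* Creates a window over parent offsets [lower, upper). */
int demo_subdev_init(demo_subdev *sub, demo_store *parent,
                     size_t lower, size_t upper) {
    assert(sub && parent);
    if (lower > upper || upper > parent->size) return -1;
    sub->parent = parent;
    sub->lower = lower;
    sub->upper = upper;
    return 0;
}

/* Creates a window inside another window; bounds are relative to outer. */
int demo_subdev_nest(demo_subdev *inner, const demo_subdev *outer,
                     size_t lower, size_t upper) {
    size_t span;
    assert(inner && outer);
    span = outer->upper - outer->lower;
    if (lower > upper || upper > span) return -1;
    inner->parent = outer->parent;
    inner->lower = outer->lower + lower;
    inner->upper = outer->lower + upper;
    return 0;
}

/* Writes up to n bytes at window-relative pos; never leaves the window. */
size_t demo_subdev_write(demo_subdev *sub, size_t pos,
                         const void *src, size_t n) {
    size_t span = sub->upper - sub->lower;
    if (pos >= span) return 0;
    if (n > span - pos) n = span - pos;
    memcpy(sub->parent->bytes + sub->lower + pos, src, n);
    return n;
}

/* Reads up to n bytes at window-relative pos; never leaves the window. */
size_t demo_subdev_read(const demo_subdev *sub, size_t pos,
                        void *dest, size_t n) {
    size_t span = sub->upper - sub->lower;
    if (pos >= span) return 0;
    if (n > span - pos) n = span - pos;
    memcpy(dest, sub->parent->bytes + sub->lower + pos, n);
    return n;
}
```

A fixed-size-record "block device" can be viewed as a special case of the same arithmetic, with the window for record i starting at i times the record size.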

Misfeatures

  • Does not attempt to be the end-all/be-all of i/o interfaces. It does what i need it to do.
  • Does not have specific features to support, and the API does not account for the requirements of, asynchronous i/o.
  • Its interfaces assume that i/o devices (the whio_dev API) support read-only or read-write modes, but write-only is not accounted for in the random-access APIs. (Sequential streams (the whio_stream API) can of course be write-only.)
  • Does not directly support thread locking. Multi-threaded access to any given i/o device must be carefully serialized by the client. It would appear to be feasible to add locking support to the stream classes (which have relatively few operations), but consistent un/locking of random-access classes requires a higher-level API in order to make certain combinations of operations (e.g. write-at-pos() (seek-then-write)) atomic vis-a-vis a given mutex.
  • Because it is device-generic, it does not directly support device-level locking. That said, the whio_dev API does provide a locking abstraction API and the file-based device handlers support it by translating the requests to fcntl() locks (see whio_locking).
  • Um... there are probably more, but none come to mind at the moment.
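For readers unfamiliar with fcntl() locks, the sketch below shows what a byte-range advisory lock around a seek-then-write looks like at the POSIX level. It is not whio's locking code: demo_lock_range and demo_locked_write_at are hypothetical helpers written for this page, working directly on a file descriptor.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of whio. type is F_RDLCK, F_WRLCK or
   F_UNLCK. Blocks until the lock is granted. A len of 0 means "from
   start to the end of the file, however far it grows". */
int demo_lock_range(int fd, short type, off_t start, off_t len) {
    struct flock fl;
    assert(fd >= 0);
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLKW, &fl);
}

/* Seek-then-write performed while holding a write lock on the target
   range, so cooperating processes cannot interleave with it. */
ssize_t demo_locked_write_at(int fd, off_t pos, const void *src, size_t n) {
    ssize_t rc;
    if (n == 0) return 0; /* avoid len==0, which would lock through EOF */
    if (demo_lock_range(fd, F_WRLCK, pos, (off_t)n) != 0) return -1;
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {
        rc = -1;
    } else {
        rc = write(fd, src, n);
    }
    demo_lock_range(fd, F_UNLCK, pos, (off_t)n);
    return rc;
}
```

Note that fcntl() record locks are advisory and are owned by the process, not by a thread, so they coordinate separate processes only. Threads sharing one descriptor would still need a mutex around the seek-then-write pair, which is the gap described in the thread-locking item above.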

Requirements

The code is ANSI C, using some C99 features. The file-based i/o handlers, as opposed to the memory-based handlers, require some functions defined in POSIX.1-2001. All (or most) Unix systems will have the few POSIX functions the API requires.

It has been shown to compile using gcc 4.2.x on Linux x86/32, gcc 4.1 on Linux x86/64, tcc on Linux/x86, and gcc 3.4 on Solaris/Sparc, but older gcc versions explicitly require using the -std=c99 flag to enable the C99 features which whio uses.

There are some optional features which use zlib to provide gzip compression to/from whio_stream objects, and these features require zlib (which is installed by default on nearly every system on the planet). See whio_zlib.h for the routines and the macro which needs to be set to enable these functions.

Code Status

whio is used as the i/o platform in several projects and seems to be quite stable (functionally speaking):

Potential uses

  • Memory-based devices act nicely as dynamic string buffers, in particular when lots of output is being generated (e.g. buffering application output, like a generated web page).
  • In-memory compression of arbitrary data (using the optional whio_stream/gzip support).
  • An embedded filesystem, e.g. whefs uses whio_dev as its back-end and provides a higher-level whio_dev specialization which allows its embedded files to be manipulated using the whio_dev interface. (The lower-level whio_epfs API now exists to replace most of the libwhefs internals.)
  • Many places where abstract access to a random-access data store is needed can benefit from being able to swap out storage back-ends. For any given data store, if a whio_dev implementation exists (or can be written) then the whio_dev and whio_stream APIs allow one to easily swap out storage for any given application. For example, libc11n uses this to provide object de/serialization over arbitrary i/o channels. (In fact, this code is a generalized form of the i/o API from libc11n.) whio_epfs uses this to host an "embedded filesystem" in arbitrary storage.
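The first use case above, a memory device acting as a dynamic string buffer, boils down to a buffer that either stays at a fixed size or grows when a write would overflow it. The sketch below illustrates that behaviour under stated assumptions: demo_buf and its functions are hypothetical, the growth policy (doubling) is an arbitrary choice made for the example, and none of it is taken from whio's memory-device implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only buffer; not whio's memory device. */
typedef struct demo_buf {
    char *mem;
    size_t used;
    size_t cap;
    int expandable; /* non-zero: grow on overflow; zero: fixed size */
} demo_buf;

/* Allocates an initial capacity of cap bytes (which may be 0). */
int demo_buf_init(demo_buf *b, size_t cap, int expandable) {
    assert(b);
    b->mem = cap ? malloc(cap) : NULL;
    if (cap && !b->mem) return -1;
    b->used = 0;
    b->cap = cap;
    b->expandable = expandable;
    return 0;
}

/* Appends up to n bytes and returns the number actually stored.
   Expandable buffers grow by doubling; fixed buffers do a short write. */
size_t demo_buf_write(demo_buf *b, const void *src, size_t n) {
    assert(b && src);
    if (b->used + n > b->cap) {
        if (!b->expandable) {
            n = b->cap - b->used;
        } else {
            size_t ncap = b->cap ? b->cap : 16;
            char *p;
            while (ncap < b->used + n) ncap *= 2;
            p = realloc(b->mem, ncap);
            if (!p) return 0; /* old buffer is still valid */
            b->mem = p;
            b->cap = ncap;
        }
    }
    if (n) memcpy(b->mem + b->used, src, n);
    b->used += n;
    return n;
}

/* Releases the buffer's memory and resets it to an empty state. */
void demo_buf_free(demo_buf *b) {
    assert(b);
    free(b->mem);
    b->mem = NULL;
    b->used = b->cap = 0;
}
```

An application generating a web page could append each fragment with demo_buf_write() and send b.mem (b.used bytes) once at the end, then call demo_buf_free().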

See also...

News

See the NewsPage.