ACHTUNG: AS OF 20110523, THIS PAGE IS NOW MAINTAINED IN THE NEW WIKI: http://whiki.wanderinghorse.net/wikis/whio/?page=whio_epfs_locking
Record locking in whio_epfs
This is under construction...
EPFS currently has only skeletal support for any sort of file locking. That support needs to be significantly improved before multiple applications may safely access an EFS container at the same time. Currently it is harmless to have multiple reader applications, but multiple concurrent writers will eventually cause corruption.
The currently-implemented locking behaviours include:
- When an EFS is opened, the engine tries to determine whether locking can be used at all (by querying the storage device via whio_dev_ioctl()). If locking is not available, EFS locking is disabled. i'd like to add a "lock or die" option for clients which absolutely require locking for proper behaviour, but be aware that not all back-end devices support locking (in-memory devices, for example).
- whio_epfs_mkfs() and friends fail with whio_rc.LockingError if another process has locked the device they want to re-format. They do not wait for the lock because re-formatting would hose an EFS which is presumably still useful (since another application has it locked).
- whio_epfs_openfs() and friends lock the whole EFS file with either a read or write lock, depending on their open mode. Thus applications which open an EFS may block while waiting for another process to let go of the EFS. i will eventually add the option to return with a locking error (like mkfs does) rather than waiting.
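The difference between the waiting (openfs) and non-waiting (mkfs) behaviours can be sketched with plain POSIX advisory locks. This is only an illustration which assumes an fcntl()-capable file descriptor as the storage back-end. The demo_xxx() functions are hypothetical names, not part of the whio API, and the real code goes through whio_dev_ioctl() rather than calling fcntl() directly.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper (not a whio function): sets or clears an advisory
   POSIX lock covering the whole file.
   type is F_RDLCK, F_WRLCK or F_UNLCK. If wait is true the call blocks
   until the lock is granted, otherwise it fails immediately on conflict.
   Returns 0 on success, else an errno value. */
static int demo_lock_whole_file(int fd, short type, bool wait)
{
    struct flock lk = {0};
    int rc;
    lk.l_type = type;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0; /* 0 means "through the end of the file" */
    do {
        rc = fcntl(fd, wait ? F_SETLKW : F_SETLK, &lk);
    } while (rc == -1 && errno == EINTR); /* retry if a signal interrupted us */
    return (rc == -1) ? errno : 0;
}

/* openfs-style: read or write lock depending on the open mode,
   blocking until any conflicting lock is released. */
static int demo_lock_for_open(int fd, bool writable)
{
    return demo_lock_whole_file(fd, writable ? F_WRLCK : F_RDLCK, true);
}

/* mkfs-style: write lock, but never wait. On conflict this returns
   EACCES or EAGAIN (POSIX allows either). */
static int demo_lock_for_mkfs(int fd)
{
    return demo_lock_whole_file(fd, F_WRLCK, false);
}

/* Releases whatever lock this process holds on the file. */
static int demo_unlock(int fd)
{
    return demo_lock_whole_file(fd, F_UNLCK, false);
}
```

A caller in the mkfs role would treat EACCES/EAGAIN from demo_lock_for_mkfs() as "somebody else has it locked" and report a locking error instead of re-formatting. The planned non-waiting option for openfs would amount to passing wait=false in demo_lock_for_open().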
Some of the major TODOs:
- Locking individual records (inodes and blocks) as they are opened, as opposed to locking the whole file. The skeleton code is in place, but proper (un)locking becomes quite tricky once a given inode has been opened multiple times using multiple i/o modes. If we limited each inode to one open instance it would simplify a bit of other internals and would eliminate some of the trickier unlocking problems altogether. Hmmm. Another alternative involves storing a bit array (2 bits per inode) to mark the records which we have read- or write-locked. That is doable, but i'm not really keen on the allocation it would need (though 4k inodes would only need 1k of memory). The block list would be a tiny bit trickier because the number of blocks can grow at runtime (and the block count is typically much higher than the inode count).
- The routines which find the next free inode/block need some seemingly tricky locking, e.g. to avoid un-doing or promoting/demoting a lock from an opened record (remember that a process does not see its own locks, only those from other processes). Still thinking that through.
- The way the FS-internal hints are read/flushed needs to be changed to do something like sqlite3 does: when the FS changes, a change number is written to the storage. The next time the FS wants to make a change it reads the change number. If it isn't the number it last wrote then it re-reads the hints, invalidates internal caches, and handles any other internal bits which may need it. This all gets quite complex, though.
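The bit-array idea from the first TODO item (2 bits per record) could look something like the following. This is a minimal sketch, not existing EPFS code: the demo_lock_map type, the state names and all function names are made up for illustration, and it only tracks which records this process believes it has locked (it does not do any actual locking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical lock states. Each one fits in 2 bits. */
enum demo_lock_state {
    DEMO_LOCK_NONE = 0,
    DEMO_LOCK_READ = 1,
    DEMO_LOCK_WRITE = 2
};

/* Hypothetical map: 2 bits per record, 4 records per byte. */
typedef struct {
    uint8_t *bits;
    size_t count; /* number of records tracked */
} demo_lock_map;

/* Bytes needed to track count records. */
static size_t demo_lock_map_bytes(size_t count)
{
    return (count + 3) / 4;
}

/* Allocates a zeroed map (all records DEMO_LOCK_NONE).
   Returns 0 on success, -1 on allocation error. */
static int demo_lock_map_init(demo_lock_map *m, size_t count)
{
    size_t n = demo_lock_map_bytes(count);
    m->bits = calloc(n ? n : 1, 1);
    m->count = m->bits ? count : 0;
    return m->bits ? 0 : -1;
}

static void demo_lock_map_free(demo_lock_map *m)
{
    free(m->bits);
    m->bits = NULL;
    m->count = 0;
}

/* Returns the recorded state of record number i (0-based). */
static enum demo_lock_state demo_lock_map_get(const demo_lock_map *m, size_t i)
{
    unsigned shift = (unsigned)(i % 4) * 2;
    assert(i < m->count);
    return (enum demo_lock_state)((m->bits[i / 4] >> shift) & 3u);
}

/* Records state s for record number i, leaving its neighbours untouched. */
static void demo_lock_map_set(demo_lock_map *m, size_t i, enum demo_lock_state s)
{
    unsigned shift = (unsigned)(i % 4) * 2;
    unsigned byte = m->bits[i / 4];
    assert(i < m->count);
    byte &= ~(3u << shift);         /* clear the record's 2 bits */
    byte |= ((unsigned)s << shift); /* store the new state */
    m->bits[i / 4] = (uint8_t)byte;
}
```

With this layout demo_lock_map_bytes(4096) comes to 1024 bytes, matching the "4k inodes need 1k of memory" estimate. A map for blocks would additionally need a realloc()-based grow step because the block count can change at runtime.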