whio  Artifact Content

Artifact 31d5d4c61c8af1c54e2d71d292c7d58eeb9037ef:

Wiki page [whio_epfs_mempool] by stephan 2011-05-23 18:50:55.
D 2011-05-23T18:50:55.999
L whio_epfs_mempool
P 9dff48d7de0bdce130f966b142e7445eccd86cb9
U stephan
W 5373
<strong>ACHTUNG: AS OF 20110523, THIS PAGE IS NOW MAINTAINED IN THE NEW WIKI:</strong> [http://whiki.wanderinghorse.net/wikis/whio/?page=whio_epfs_mempool]

See also: [whio_epfs], [whio_epfs_tools]

<h2>whio_epfs internal memory allocator</h2>

<b>ACHTUNG: the memory pool support is highly experimental.</b> It is known to cause memory corruption in some test code, and is not yet recommended for client use.

The following only applies if the library is compiled with its internal allocator support enabled (the macro <tt>WHIO_EPFS_CONFIG_ENABLE_MEMPOOL</tt>, defined in <tt>whio_epfs_config.h</tt>, is set to true at library build time).

The EPFS engine supports a per-fs-instance allocator object (a memory pool) which can be used for allocating fs-internal data such as inode handles and block chains. If the allocator feature is disabled, the standard allocators (<tt>malloc()</tt>, <tt>realloc()</tt>, and <tt>free()</tt>) are used for EPFS-internal data instead.

The pool/allocator can be set up using any of these approaches:

  *  The [whio_epfs_tools] set up a memory pool of a few kilobytes by default, which will suffice for all but the weirdest cases (e.g. large pseudofiles with very small block sizes).
  *  <tt>whio_epfs_mempool_setup()</tt> (but read the docs to understand the important limitations!).
  *  The <tt>whio_epfs_setup_opt</tt> parameter to functions like <tt>whio_epfs_openfs2()</tt> and <tt>whio_epfs_mkfs2()</tt>. This is the preferred way to set up the pool in client applications.

The allocator can be configured to either fail or fall back to <tt>malloc()/free()</tt> if its own pool fills up.
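To make the pool-plus-fallback idea concrete, here is a minimal sketch of a bump allocator over a client-supplied buffer with an optional <tt>malloc()</tt> fallback. Every name in it (<tt>demo_pool</tt> and friends) is hypothetical and invented for illustration. It is not the whio_epfs allocator or its API, it never reclaims pool memory, and it assumes a C11 compiler and a suitably aligned buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical demo type; NOT part of the whio_epfs API. */
typedef struct demo_pool {
    unsigned char *mem; /* client-supplied buffer (assumed suitably aligned) */
    size_t size;        /* total bytes in mem */
    size_t used;        /* bytes handed out so far */
    bool fallback;      /* if true, use malloc() once the pool is full */
} demo_pool;

static void demo_pool_init(demo_pool *p, void *mem, size_t size, bool fallback) {
    assert(p && mem);
    p->mem = mem;
    p->size = size;
    p->used = 0;
    p->fallback = fallback;
}

/* True if ptr points into the pool's own buffer. */
static bool demo_pool_owns(const demo_pool *p, const void *ptr) {
    uintptr_t lo = (uintptr_t)p->mem;
    uintptr_t at = (uintptr_t)ptr;
    return at >= lo && at < lo + p->size;
}

/* Allocates n bytes from the pool, or from malloc() if allowed and needed. */
static void *demo_pool_alloc(demo_pool *p, size_t n) {
    const size_t align = _Alignof(max_align_t);
    if (n == 0) return NULL;
    if (n <= p->size) {
        /* Round up so that later allocations stay aligned. */
        size_t need = (n + align - 1) / align * align;
        if (need <= p->size - p->used) {
            void *out = p->mem + p->used;
            p->used += need;
            return out;
        }
    }
    return p->fallback ? malloc(n) : NULL;
}

/* Frees fallback memory; pool-owned memory is not reclaimed in this sketch. */
static void demo_pool_free(demo_pool *p, void *ptr) {
    if (ptr && !demo_pool_owns(p, ptr)) free(ptr);
}
```

With the fallback disabled, a full pool reports out-of-memory, which is handy for testing. With it enabled, a too-small pool costs only some extra <tt>malloc()</tt> calls.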

A portion of the client-supplied memory pool size is used for storing the allocator itself as well as its book-keeping data. Most cases won't need more than approximately 200 bytes of storage for the allocator <em>and</em> its book-keeping data.

When an inode is opened or grows, its block chain has to be loaded (or expanded). When growing block chains via re-allocation, the allocator may temporarily need two copies (if it cannot expand the memory in-place). Thus the pool always needs some slack space, and cannot be <em>perfectly</em> pre-sized for any given usage.
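The "two copies" problem can be sketched with a toy byte arena. The names below (<tt>demo_arena</tt>, <tt>demo_arena_grow()</tt>) are hypothetical and are not the library's internals. The sketch only shows why growing an allocation that is not the last one in the pool needs room for the old and the new copy at the same time. It hands out unaligned byte ranges and never reclaims the old copy, which a real pool would do after the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte arena; NOT the whio_epfs allocator. */
typedef struct demo_arena {
    unsigned char *mem; /* backing buffer */
    size_t size;        /* total bytes */
    size_t used;        /* bytes handed out so far */
} demo_arena;

static void *demo_arena_alloc(demo_arena *a, size_t n) {
    assert(a && a->mem);
    if (n == 0 || n > a->size - a->used) return NULL;
    void *out = a->mem + a->used;
    a->used += n;
    return out;
}

/* Grows ptr from old_n to new_n bytes; returns NULL if the arena lacks room. */
static void *demo_arena_grow(demo_arena *a, void *ptr, size_t old_n, size_t new_n) {
    assert(a && ptr && new_n >= old_n);
    unsigned char *p = ptr;
    if (p + old_n == a->mem + a->used) {
        /* ptr is the most recent allocation: extend it in place. */
        if (new_n - old_n > a->size - a->used) return NULL;
        a->used += new_n - old_n;
        return ptr;
    }
    /* Otherwise the old and new copies must coexist while data is copied. */
    void *fresh = demo_arena_alloc(a, new_n);
    if (fresh) memcpy(fresh, ptr, old_n);
    return fresh;
}
```

For example, in a 100-byte arena holding a 40-byte chain followed by an 8-byte object, growing the chain to 60 bytes fails: only 52 bytes are free while the old copy is still live, even though 60 + 8 bytes would fit once the old copy is gone.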

The memory pool is divided into blocks, and it is possible to use up all of the blocks without actually using up all of the memory: lots of objects smaller than the (unspecified) block size can cause this.
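As a rough illustration of that effect, the following sketch implements a tiny first-fit block pool. The block size, block count, and all names are made up for the example (the real block size is unspecified), and this is not how the library's allocator is implemented. Each allocation occupies whole blocks, so small objects waste the remainder of their block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for the demonstration. */
enum { DEMO_BLOCK_SIZE = 16, DEMO_BLOCK_COUNT = 8 };

typedef struct demo_block_pool {
    unsigned char mem[DEMO_BLOCK_SIZE * DEMO_BLOCK_COUNT];
    unsigned char in_use[DEMO_BLOCK_COUNT]; /* 1 if the block is taken */
    size_t bytes_requested;                 /* sum of all granted sizes */
} demo_block_pool;

/* First-fit allocation of enough contiguous blocks to hold n bytes. */
static void *demo_block_alloc(demo_block_pool *p, size_t n) {
    assert(p);
    if (n == 0 || n > sizeof p->mem) return NULL;
    size_t k = (n + DEMO_BLOCK_SIZE - 1) / DEMO_BLOCK_SIZE; /* blocks needed */
    for (size_t i = 0; i + k <= DEMO_BLOCK_COUNT; ++i) {
        size_t run = 0;
        while (run < k && !p->in_use[i + run]) ++run;
        if (run == k) {
            for (size_t j = 0; j < k; ++j) p->in_use[i + j] = 1;
            p->bytes_requested += n;
            return p->mem + i * DEMO_BLOCK_SIZE;
        }
    }
    return NULL; /* no run of k free blocks is left */
}
```

Eight 4-byte allocations exhaust this 128-byte pool after only 32 bytes have actually been requested; the ninth allocation fails, however small it is.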

An out-of-memory error in the engine will, in my experience, most often show up as an error via the <tt>truncate()</tt> operation on pseudofiles. That routine is used for expanding block chains as files grow (normally during a <tt>write()</tt>). Block chains of <em>opened</em> inodes have to be cached in memory for various reasons (namely performance and multi-handle consistency). With a small memory pool and small EFS block sizes, it is easy to get OOM errors. (Which is nice for testing, but not for most real-life use.) I understand that performing allocation during i/o is a cardinal sin, but it is a side-effect the library has to live with for now. (And an OOM almost certainly won't happen if you're using the standard allocators, anyway.)

The internals of the allocator are implementation details, and the allocator is not available to EPFS clients via the public interface. That said, the allocator has its own source code repository at:


<h2>Example of custom allocator savings...</h2>

Most or all of the [whio_epfs_tools] try to use the memory pool support if the library is configured with it. As an example of how little dynamic memory is needed if the pool is configured, consider this command:

<pre>whio-epfs-ls my.epfs</pre>

That will [whio_epfs_ls|list the contents] of <tt>my.epfs</tt>. It must, abstractly speaking, do the following:

  *  Open the EFS container.
  *  Iterate over each EFS entry, and possibly each block, and output the information specified via various command-line options.
  *  Close the EFS container and free its resources.

How many calls to <tt>malloc()</tt> must it make?

The short answer is "none" and the long answer is "two". The whio/whio_epfs layer doesn't have to call <tt>malloc()</tt> a single time in this case, assuming the custom allocators can provide the memory. However, let's explain the long answer...

After running it through [http://valgrind.org/|valgrind, callgrind], and [http://kcachegrind.sourceforge.net|KCacheGrind], it was clear (for the configuration of my particular tests) that only two calls to <tt>malloc()</tt> were made for a total of 640 bytes. One was from the internals of <tt>fopen()</tt>, which cannot be avoided if we're using files for storage. The second one was a side-effect of the internal command-line-arguments parser used by the various [whio_epfs_tools|EPFS tools].

We can't get any fewer calls to <tt>malloc()</tt> unless we use storage which doesn't need allocation! Which... actually can be done via careful use of custom allocators and statically-allocated memory as an EPFS storage device.

That said, the "ls" tool doesn't need to <em>open</em> any inodes, and therefore it doesn't really allocate anything. The [whio_epfs_cp|cp tool], on the other hand, has to open both files and pseudofiles, and must do a relatively high number of allocations.

Z 8a7c9c8584b7da274ece8e4dd277108b