iozone and caching – the bane of benchmarking

iozone is a common tool used by companies and researchers when benchmarking storage systems. What most iozone users don’t seem to realize is that unless care is taken, the test may exercise only the storage system’s cache and not the underlying system.

iozone supports several test types. The most commonly used are:

  • -i0 – sequential write
  • -i1 – sequential read
  • -i2 – random write / random read

These are often used together in a single iozone execution, like:

iozone -i0 -i1 -i2 [...]

This runs a sequential write test, followed by a sequential read test, a random write test, and a random read test. The problem with this approach is that it exercises both the client’s VM cache and whatever front-end cache the storage system uses. Depending on the storage system, the initial sequential write may land in the storage system’s front-end cache as well as on nonvolatile storage, so the subsequent sequential read may never touch nonvolatile storage at all, making the read test a purely cached test. Ditto the random read test. Similarly, the client may cache some or all of the file, making the results even less useful.

The client cache is why the iozone docs recommend using a file size 2x the size of the client’s memory. At Isilon we find this unwieldy, particularly when you have 24GB clients. Moreover, it doesn’t solve the problem of the storage system cache at all.
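For reference, the 2x guideline can be turned into a concrete -s value with a few lines of shell. This is only a sketch, assuming a Linux client where /proc/meminfo reports MemTotal in kB; the helper name double_size_mb is made up for illustration.

```shell
#!/bin/sh
# Sketch: derive an iozone file size of twice physical RAM (the 2x guideline).
# Assumes a Linux client; double_size_mb is a hypothetical helper name.

# Print twice the given size (in kB), rounded up to whole MB, in iozone's "-s #m" form.
double_size_mb() {
    echo "$(( ($1 * 2 + 1023) / 1024 ))m"
}

if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB.
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "suggested iozone file size: -s $(double_size_mb "$mem_kb")"
fi
```

On a 24GB client this works out to a file of roughly 48GB per client, which is why the approach gets unwieldy.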

Instead, we have a wrapper script that runs each test in isolation and, before each run, flushes both the client’s cache (assuming the client is Linux: sysctl vm.drop_caches=1, or just unmount and remount the storage) and the Isilon cluster’s cache (isi_for_array isi_flush). This allows us to use smaller file sizes while still getting results from the underlying storage rather than from caches.

The above “-i0 -i1 -i2” test gets broken down and executed like this (but in a loop, of course):

Isilon cluster cache flush
client cache flush
iozone -i0
Isilon cluster cache flush
client cache flush
iozone -i1
Isilon cluster cache flush
client cache flush
iozone -i2
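
The sequence above might be scripted roughly as follows. This is not the actual Isilon wrapper, just a minimal sketch under stated assumptions: the cluster flush is issued over ssh, CLUSTER and TESTFILE are placeholder names, and the -s and -r sizes are arbitrary. It also passes -w so that iozone keeps the test file between runs; otherwise the separate read runs would have nothing to read. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch of a per-test wrapper: flush both caches, then run one iozone test.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
# Hypothetical placeholders: CLUSTER, TESTFILE, and the -s / -r sizes below.
DRY_RUN=${DRY_RUN:-1}
CLUSTER=${CLUSTER:-isilon-node}
TESTFILE=${TESTFILE:-/mnt/isilon/iozone.tmp}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

flush_caches() {
    # Cluster-side flush, as named in the text; reaching it via ssh is an assumption.
    run ssh "root@$CLUSTER" isi_for_array isi_flush
    # Client-side flush: write back dirty pages first, because drop_caches
    # only frees clean pages. Needs root on the client.
    run sync
    run sysctl vm.drop_caches=1
}

# Run each requested iozone test number in its own execution.
run_tests() {
    for t in "$@"; do
        flush_caches
        # -w keeps the test file so the following runs can reuse it.
        run iozone -i"$t" -w -s 1g -r 128k -f "$TESTFILE"
    done
}

run_tests 0 1 2
```

Running it as-is prints the twelve commands in order, which makes it easy to check the sequence before executing anything with DRY_RUN=0.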

There’s one remaining problem with this approach: random writes and reads are performed by the same test operation (-i2). This prevents iozone from providing cache-free random results, as the random read immediately follows the random write. At Isilon we’ve modified our iozone binary to split the random write and random read operations into separately runnable tests to work around this limitation.

If your intent is to test a system’s cache, then by all means run all the iozone tests in the same execution. But if you want to test the underlying storage, you need to run them separately and flush caches between executions.