March 27th, 2009


Reducing disk use

UNIX filesystems generally store three timestamps for each file - ctime (when the file's contents or metadata were last changed), mtime (when the file's contents, as opposed to its metadata, were last changed) and atime (when the file was last accessed by any process). This is a usefully flexible system, but the semantics of atime can be troublesome. atime must be updated every time a file is read, turning a read operation into a read/write operation. This generates a surprising amount of I/O in normal filesystem use, slowing the more relevant I/O and causing disks to spin up, since the atime update is required even if the file was read out of cache. It also results in a lot of unnecessary writes to flash media, which may reduce its lifetime.
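As a quick illustration of the three timestamps, here is a minimal sketch using Python's os.stat. The helper name file_times is made up for this example. One assumption to be aware of: st_ctime is the inode change time on UNIX-like systems, but means something different on Windows.

```python
import os
import tempfile

def file_times(path):
    """Return (atime, mtime, ctime) for path, in seconds since the epoch."""
    st = os.stat(path)
    # On UNIX-like systems st_ctime is the last inode change, not creation time.
    return st.st_atime, st.st_mtime, st.st_ctime

# Create a scratch file and show its three timestamps.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    atime, mtime, ctime = file_times(tmp.name)
    print("atime:", atime)
    print("mtime:", mtime)
    print("ctime:", ctime)
```

Whether reading the file afterwards changes the reported atime depends on how the filesystem is mounted, which is the subject of the rest of this post.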

One option is to disable atime updates entirely. The problem with this approach is that certain applications depend on atime. This is especially common in mail clients, which compare atime to mtime in order to determine whether a mailbox has been read since it was last modified. So, unfortunately, disabling atime entirely is impractical as a default. Back in 2006, Valerie Aurora submitted a patch that worked around this issue. The new relatime option meant that atime would only be updated if it would otherwise be older than ctime or mtime. Mail clients became happy and the world rejoiced.
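To make the mail client case concrete, here is a toy model of the three policies, with timestamps as plain integers. The function names and the policy label "always" are invented for this sketch, and treating an atime equal to mtime or ctime as stale is an assumption of the sketch rather than something stated above.

```python
def atime_after_read(atime, mtime, ctime, now, policy):
    """Return the file's atime after a read at time `now` under `policy`."""
    if policy == "always":      # classic behaviour: every read updates atime
        return now
    if policy == "noatime":     # atime is never touched by reads
        return atime
    if policy == "relatime":    # 2006 rule: update only if atime is stale
        stale = atime <= mtime or atime <= ctime
        return now if stale else atime
    raise ValueError("unknown policy: %s" % policy)

def mailbox_looks_unread(atime, mtime):
    """The mail client check: modified more recently than it was read?"""
    return mtime > atime

# Mailbox last read at t=50, new mail delivered at t=100, user reads at t=200.
for policy in ("always", "noatime", "relatime"):
    new_atime = atime_after_read(50, 100, 100, 200, policy)
    print(policy, mailbox_looks_unread(new_atime, 100))
# always False
# noatime True
# relatime False
```

Under noatime the mailbox keeps looking unread after it has been read, which is the breakage described above. relatime gives the same answer as updating on every read.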

Unfortunately, it turned out that there was one other common case of atime being used. Applications like tmpwatch monitor files in /tmp and delete them if they appear unused. In this case, "unused" means "has an atime older than a certain date". Since merely reading files doesn't update the ctime or mtime, relatime wouldn't cause the atime on these files to be updated and tmpwatch would happily delete them - even if users were reading them on a daily basis.
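The failure can be reproduced in a small simulation. This is only a sketch: looks_unused stands in for the kind of check tmpwatch makes (it is not tmpwatch's actual code), the seven day limit is an arbitrary choice, and ties between atime and mtime are treated as stale by assumption.

```python
DAY = 24 * 60 * 60

def relatime_2006_read(atime, mtime, ctime, now):
    """Return the atime after a read at `now` under the original relatime rule."""
    if atime <= mtime or atime <= ctime:
        return now
    return atime

def looks_unused(atime, now, max_age):
    """A tmpwatch-style test: has the file gone unread for longer than max_age?"""
    return now - atime > max_age

# File created at t=0, then read once a day for ten days.
atime = mtime = ctime = 0
for day in range(1, 11):
    atime = relatime_2006_read(atime, mtime, ctime, now=day * DAY)

now = 10 * DAY
print(atime // DAY)                               # 1
print(looks_unused(atime, now, max_age=7 * DAY))  # True
```

Only the first read moves atime. Every later read is skipped because atime is already newer than mtime and ctime, so after ten days of daily use the file still looks nine days stale.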

Ingo Molnar submitted a patch to add a further heuristic to the relatime behaviour. With it, the atime of a file will be updated if it's older than mtime, older than ctime or (and this is the important one) more than 24 hours in the past. This deals with the tmpwatch case nicely, while still providing a significant reduction in the quantity of atime updates.
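The three conditions can be written out as a small predicate. This is a Python sketch of the rule as described here, not the kernel code. The names are invented, timestamps are whole seconds, and counting an atime equal to mtime or ctime as "older" is an assumption.

```python
HOUR = 60 * 60
DAY = 24 * HOUR

def relatime_needs_update(atime, mtime, ctime, now):
    """Decide whether a read at `now` should update atime."""
    if mtime >= atime:        # contents changed since the last recorded access
        return True
    if ctime >= atime:        # metadata changed since the last recorded access
        return True
    if now - atime >= DAY:    # recorded access is at least 24 hours old
        return True
    return False

def count_updates(read_times, atime=0, mtime=0, ctime=0):
    """Replay a series of reads; return (number of atime writes, final atime)."""
    updates = 0
    for now in read_times:
        if relatime_needs_update(atime, mtime, ctime, now):
            atime = now
            updates += 1
    return updates, atime

# File created at t=0 and read once an hour for 48 hours.
hourly_reads = [h * HOUR for h in range(1, 49)]
updates, final_atime = count_updates(hourly_reads)
print(updates, final_atime // HOUR)  # 2 25
```

In this run 48 reads cause only two atime writes, yet the recorded atime is never more than about a day out of date, which is enough for a tmpwatch-style check working in days.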

Fedora shipped this patch for several releases, and Ubuntu have used it by default since 8.04. Unfortunately there were some concerns over certain aspects of its behaviour (with respect to its interface as opposed to the relatime functionality itself) and it never got merged. I pushed a trimmed-down version that purely implements the change to the relatime behaviour, and earlier today Linus merged it and a further patch that makes relatime the default behaviour on Linux.

Most users won't notice this change in behaviour at all, other than as a small improvement in I/O performance and a reduction in the number of drive spin-ups. For users that do have issues, a new strictatime mount option has been added - using this will require an updated mount command, but it's a trivial patch. I'd be surprised if there are any real world use cases that are negatively affected by this, especially since it's been the default behaviour in several distributions for a while, but there's always the potential that someone will be tripped up by it. We'll see.
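To see which atime-related options a filesystem is mounted with, one approach is to look at the options field of a mount table line. The sketch below assumes the usual /proc/mounts layout (device, mount point, type, options, dump, pass). The helper name and the sample line are made up, and whether strictatime is listed explicitly may vary, so an empty result only means none of the three options was shown.

```python
ATIME_OPTIONS = ("noatime", "relatime", "strictatime")

def atime_options(mount_line):
    """Return the atime-related options found in one mount table line."""
    fields = mount_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least four fields: %r" % mount_line)
    options = fields[3].split(",")   # the fourth field holds the mount options
    return [opt for opt in options if opt in ATIME_OPTIONS]

# Hypothetical line in /proc/mounts format.
sample = "/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0"
print(atime_options(sample))  # ['relatime']
```

On a live system the same function could be applied to each line read from /proc/mounts.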