Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files

Proceedings of the USENIX Technical Conference, January 1997, pp. 1-17.

Gregory R. Ganger and M. Frans Kaashoek 
MIT Lab for Computer Science, Parallel and Distributed Operating Systems Group

Small file performance in most file systems is limited by slowly improving disk access times, even though current file systems improve on-disk locality by allocating related data objects in the same general region. The key insight for why current file systems perform poorly is that locality is insufficient: exploiting disk bandwidth for small data objects requires that they be placed adjacently. We describe C-FFS (Co-locating Fast File System), which introduces two techniques, embedded inodes and explicit grouping, for exploiting what disks do well (bulk data movement) to avoid what they do poorly (repositioning to new locations). With embedded inodes, the inodes for most files are stored in the directory with the corresponding name, removing a physical level of indirection without sacrificing the logical level of indirection. With explicit grouping, the data blocks of multiple small files named by a given directory are allocated adjacently and moved to and from the disk as a unit in most cases. Measurements of our C-FFS implementation show that embedded inodes and explicit grouping have the potential to increase small file throughput (for both reads and writes) by a factor of 5-7 compared to the same file system without these techniques. The improvement comes directly from reducing the number of disk accesses required by an order of magnitude. Preliminary experience with software-development applications shows performance improvements ranging from 10 to 300 percent.
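As a rough illustration of why the two techniques reduce disk accesses, the following toy model counts the accesses needed to read every small file in one directory. It is a sketch under stated assumptions, not the C-FFS implementation: every file fits in one block, the directory fits in one block, nothing is cached, each conventional inode sits in its own block (a worst case), and the function names and the `files_per_group` parameter are hypothetical. It counts accesses only and does not model the measured throughput.

```python
from math import ceil


def accesses_conventional(num_files: int) -> int:
    """One directory read, then a separate inode read and data read per file."""
    return 1 + 2 * num_files


def accesses_embedded(num_files: int) -> int:
    """Inodes live in the directory block, so only data reads remain per file."""
    return 1 + num_files


def accesses_embedded_grouped(num_files: int, files_per_group: int) -> int:
    """Inodes live in the directory block and data blocks move one group at a time."""
    if files_per_group < 1:
        raise ValueError("files_per_group must be at least 1")
    return 1 + ceil(num_files / files_per_group)


# Assumed workload: 32 one-block files, 16 files per group (both hypothetical).
conventional = accesses_conventional(32)        # 1 + 2 * 32 = 65
embedded = accesses_embedded(32)                # 1 + 32 = 33
grouped = accesses_embedded_grouped(32, 16)     # 1 + ceil(32 / 16) = 3
```

Under these assumptions, embedding alone roughly halves the access count, while adding grouping cuts it by more than an order of magnitude. Real workloads with caching and better inode placement would see smaller differences.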

Status: Presented at the 1997 USENIX Technical Conference (Anaheim, CA, January 6-10, 1997).
