You are expected to understand this.
CS 111 Operating Systems Principles, Spring 2005

Paper Report 4: Soft Updates

Due at 11:59pm Thursday 5/19

Soft updates is probably the most elegant idea in file systems in the last ten years.

File systems are, by definition, supposed to be persistent and safe. When a program writes data to disk, that data should stick around until it's deleted later. But all of us have probably experienced at least one occasion when a file system became corrupt and lost our data. Maybe the electricity went off in the middle of writing a file to disk, and when the machine rebooted, that partially-written information confused the operating system, rendering much of the file system unusable: a total nightmare scenario.

To reduce the chance of this kind of confusion, most machines check each file system on boot. If the machine shut down uncleanly, a file system checker program (fsck) walks over the entire file system, finding and fixing problems with on-disk metadata structures (such as directories, inodes, and block allocation bitmaps). This can take literally hours, and if any problems are found, random parts of the file system might be deleted!
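To make this concrete, here's a rough sketch, in C, of the kind of cross-checking fsck does: walk every inode, record which blocks are actually referenced, and compare against the allocation bitmap. (The on-disk format here is a toy, invented just for illustration; no real file system looks like this.)

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Toy on-disk format, invented for illustration only. */
    #define NINODES 16
    #define NBLOCKS 64
    #define NDIRECT 4

    struct inode {
        uint16_t nlink;            /* 0 means this inode is free */
        uint32_t blocks[NDIRECT];  /* direct block pointers; 0 = unused slot */
    };

    struct image {
        uint8_t block_bitmap[NBLOCKS];  /* 1 = marked allocated on disk */
        struct inode inodes[NINODES];
    };

    /* Walk all inodes, recompute which blocks are really referenced,
     * and flag disagreements with the on-disk allocation bitmap. */
    static int check(struct image *img)
    {
        uint8_t referenced[NBLOCKS];
        int errors = 0;
        memset(referenced, 0, sizeof referenced);

        for (int i = 0; i < NINODES; i++) {
            struct inode *ip = &img->inodes[i];
            if (ip->nlink == 0)
                continue;              /* free inode: should own nothing */
            for (int j = 0; j < NDIRECT; j++) {
                uint32_t b = ip->blocks[j];
                if (b == 0)
                    continue;
                if (b >= NBLOCKS) {
                    printf("inode %d: bad block pointer %u\n", i, b);
                    errors++;
                } else if (referenced[b]) {
                    printf("block %u claimed twice\n", b);
                    errors++;
                } else {
                    referenced[b] = 1;
                }
            }
        }
        for (uint32_t b = 0; b < NBLOCKS; b++) {
            if (referenced[b] && !img->block_bitmap[b]) {
                printf("block %u in use but marked free\n", b);
                errors++;
            } else if (!referenced[b] && img->block_bitmap[b]) {
                printf("block %u marked allocated but unreferenced\n", b);
                errors++;              /* a real fsck would reclaim it */
            }
        }
        return errors;
    }

    int main(void)
    {
        struct image img;
        memset(&img, 0, sizeof img);
        img.inodes[1].nlink = 1;
        img.inodes[1].blocks[0] = 7;   /* inode 1 references block 7... */
        img.block_bitmap[9] = 1;       /* ...but only block 9 is marked used */
        printf("%d inconsistencies found\n", check(&img));
        return 0;
    }

A real fsck does this for every inode, directory, and bitmap on the disk, which is why it can take hours on a large file system.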

So is there any way to make this scenario less likely, and to speed up reboot? One way is to keep an on-disk log or journal of all important disk changes, allowing fsck to simply check the journal for half-completed operations. But this can reduce performance, since data is effectively written twice, in sequence (once to the journal, and again to the actual on-disk data structures). People didn't think there was any other way. Then soft updates came along. It's a great example of deep systems thinking, and will help you really understand file system structures.
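Here's a minimal sketch of that write-twice pattern, using POSIX I/O; the journal record format and function names are invented for illustration. Note the two ordered writes, each forced to disk with fsync(), before the update is durable.

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    /* Invented journal record: enough to redo one metadata block write
     * during crash recovery. */
    struct jrec {
        uint32_t magic;       /* marks a complete record during recovery */
        uint64_t dst_offset;  /* where the block belongs in the main image */
        uint8_t  data[512];   /* the new contents of that block */
    };
    #define JMAGIC 0x4a524543u

    /* Update one metadata block safely: journal first, then in place.
     * Returns 0 on success, -1 on error. */
    static int journaled_write(int journal_fd, int disk_fd,
                               uint64_t dst_offset, const uint8_t block[512])
    {
        struct jrec r;
        r.magic = JMAGIC;
        r.dst_offset = dst_offset;
        memcpy(r.data, block, sizeof r.data);

        /* Write 1: append the record to the journal and force it to disk.
         * After this point a crash is recoverable by replaying the journal. */
        if (write(journal_fd, &r, sizeof r) != (ssize_t)sizeof r)
            return -1;
        if (fsync(journal_fd) != 0)
            return -1;

        /* Write 2: only now is it safe to overwrite the real on-disk
         * structure in place. */
        if (pwrite(disk_fd, block, sizeof r.data, (off_t)dst_offset)
                != (ssize_t)sizeof r.data)
            return -1;
        if (fsync(disk_fd) != 0)
            return -1;

        /* A real journal would now mark the record checkpointed so its
         * space can be reused; omitted here. */
        return 0;
    }

Soft updates takes a different approach: no journal at all. It tracks dependencies among dirty in-memory blocks and orders the in-place writes so that the on-disk structures are always safe to use after a crash -- that's the idea the paper develops.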

This paper is much easier to read than Disco, but it does refer to concepts we may not cover in class until next week -- particularly to inodes. Read the file system chapter in Silberschatz to get a leg up.
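In case you haven't seen inodes before: an inode is the on-disk structure that describes one file. Here's a simplified sketch in C; the field names and sizes are illustrative, loosely modeled on the BSD fast file system inode (real ones have more fields).

    #include <stdint.h>

    /* Simplified inode. One of these exists on disk for every file and
     * directory; directory entries map names to inode numbers, not to
     * the file data itself. */
    struct inode {
        uint16_t mode;        /* file type and permission bits */
        uint16_t nlink;       /* number of directory entries naming this inode */
        uint64_t size;        /* file length in bytes */
        uint32_t direct[12];  /* block numbers of the first 12 data blocks */
        uint32_t indirect;    /* block holding more block numbers */
        uint32_t dindirect;   /* block holding blocks of block numbers */
    };

The paper's "metadata update problem" is about keeping inodes, directories, and allocation bitmaps consistent with one another when a crash can strike between writes.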

Read "Soft Updates: A Solution to the Metadata Update Problem in File Systems", by Gregory R. Ganger, Marshall Kirk McKusick, Craig A. N. Soules, and Yale N. Patt.

By midnight on Thursday 5/19, turn in a one-page response to the following question by email (PDF format please).

It is commonly believed that safety in computer systems can only be achieved at the expense of performance. Soft updates appears to disprove this belief! First, discuss a performance penalty that any safe file system must incur. Then, what do you think is the most important technique soft updates uses, or the most important insight behind its design, that recovers good performance? How well does it work? (Be specific! Refer to particular graphs and/or tables when appropriate.)