For some class meetings, readings will be assigned. Usually, these
readings will consist of relevant technical papers, articles or
instructor-prepared notes. Assigned readings and notes will be provided
on paper in class and posted online. Note, however, that the online
versions are only available when accessed from a 128.2.* (CMU) IP
address.
The readings listed should be read BEFORE class on the assigned
day.
April 25th: Guest lecture by Dan Lovinger (no readings)
April 11th (L16): Storage for data-intensive computing
- Google FS paper from April 6th
- Bigtable: A Distributed Storage System for Structured Data. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber. Appears in USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006.
(pdf)
- Recommended optional reading:
- MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. Appears in USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2004.
(pdf)
April 6th: Guest lecture by Marc Unangst
- GPFS: A Shared-Disk File System for Large Computing Clusters. Frank Schmuck and Roger Haskin. Appears in FAST, January 2002. (pdf)
- Scalable performance of the Panasas Parallel File System. Brent Welch, Marc Unangst, et al. Appears in FAST, February 2008. (pdf)
- The Google File System. Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. Appears in ACM Symposium on Operating Systems Principles (SOSP), 2003.
(pdf)
April 4th: Guest lecture by Jiri Schindler (no readings)
March 30th: Guest speaker (no readings)
March 28th (L15): Backups and disaster recovery
- Designing for disasters. Kimberly Keeton, Cipriano Santos, Dirk Beyer, Jeffrey Chase, John Wilkes. Appears in the Proceedings of the Third USENIX Conference on File and Storage Technologies (FAST'04).
(pdf)
March 23rd: Scalable FS directories
- Scale and Concurrency of GIGA+: File System Directories with Millions of Files. Swapnil Patil and Garth Gibson. Appears in FAST 2011. (pdf)
- A Directory Index for Ext2. Daniel R. Phillips. Appears in Ottawa Linux Symposium 2002. (pdf)
- Optional reading:
- "Distributed directory service in the FARSITE file system" (John R. Douceur and Jon Howell, OSDI 2006) (pdf)
March 21st (L14): Virtualization
- Recommended optional reading:
Serverless Network File Systems (xFS). Thomas
E. Anderson, Michael D. Dahlin, Jeanna M. Neefe, David A. Patterson,
Drew S. Roselli, Randolph Y. Wang. Appears in the ACM Transactions on
Computer Systems, Vol. 14, No. 1. February 1996. (pdf)
March 16th (L13): NAS interfaces (a.k.a. DFS II)
March 14th (L12): Distributed file systems
- The Design and Implementation of the 4.4BSD Operating System (Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman, 1996)
- Chapter 9 (The Network Filesystem) (pdf)
- Recommended optional reading:
- Scale and Performance in a Distributed File System. John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, Michael J. West. Appears in the ACM Transactions on Computer Systems, Vol. 6, No. 1, Pages 51-81. February 1988. (pdf)
- Optional reading:
- "NFS Version 3 Protocol Specification" (RFC 1813:
B. Callaghan, B. Pawlowski, P. Staubach, June 1995) (txt)
February 28th (L11): Flash Storage
February 21st (L10): Disk array systems
- RAID: High-Performance, Reliable Secondary Storage (Peter M. Chen,
Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David
A. Patterson, 1994) (same as L9)
February 14th (L9): Disk array organization
- RAID: High-Performance, Reliable Secondary Storage (Peter M. Chen,
Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David
A. Patterson, 1994) (pdf)
February 9th (L8): Disk request optimization
- Scheduling Algorithms for Modern Disk Drives. Bruce L. Worthington, Gregory R. Ganger, and Yale N. Patt. Appears in the Proceedings of the ACM Sigmetrics Conference. May, 1994. (ps)
February 7th (L7): Caching and FS integrity
- Soft Updates: A Solution to the Metadata Update Problem. Gregory R. Ganger, Marshall Kirk McKusick, Craig A.N. Soules, Yale N. Patt. ACM Transactions on Computer Systems. May 2000. (pdf)
- Practical File System Design with the Be File System (Dominic Giampaolo, 1999)
- Chapter 7 (Journaling) (pdf)
February 2nd (L6): On-disk data layout
- The Design and Implementation of the 4.4BSD Operating System (Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman, 1996)
- Chapter 8 (Local filestores) (pdf)
- Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files. Gregory R. Ganger and M. Frans Kaashoek. Appears in the Proceedings of the Usenix Technical Conference. January, 1997. (pdf) (ps)
January 31st (L5): File system organization
- Practical File System Design with the Be File System. Dominic Giampaolo. 1999.
- Chapter 2 (What is a file system?) (pdf)
- UNIX Internals: The New Frontiers. Uresh Vahalia. 1996.
- Chapter 8 (File system interface and framework) (pdf)
January 26th (L4): Disk drive firmware
- An Introduction to Disk Drive Modeling. Chris Ruemmler and John Wilkes, 1994. (pdf)
- Microbenchmark-based Extraction of Local and Global Disk Characteristics. Nisha Talagala, Remzi H. Arpaci-Dusseau, and David Patterson (pdf)
January 19th and 24th (L3): Disk drive operation
- Digital Large System Mass Storage Handbook. Paul Massiglia. 1986.
- Probability Refresher. Mor Harchol-Balter. 2000. (pdf)
January 12th (L2): Metrics of I/O system quality
- Computer Architecture: A Quantitative Approach, 3rd ed. John L. Hennessy and David A. Patterson. 2002. (pdf)
- Section 7.7: "I/O performance measures"
- Section 7.8: "A Little queuing theory"
- Section 7.9: "Benchmarks of storage performance and availability"
- MTBF Description, Kevin Dally. 1995. (txt)
January 10th (L1): Introduction and overview of storage systems
This first meeting will be more than just organizational in nature.
We will discuss how the class is going to work and what will (and won't)
be covered.
See the 18-746 overview for a recap of the
general information.
We will also dive into the course with an overview of the area of storage
systems.