For some class meetings, readings will be assigned. Usually, these
readings will consist of relevant technical papers, articles or
instructor-prepared notes. Paper copies of assigned readings and
notes will be provided in class and online. However, please note that
online versions of the readings are only available when accessed from
a 128.2.* (CMU) IP address or a local-only CMU IP address.
The readings listed should be read BEFORE class on the assigned
day.
April 27 (L19): Multi-core issues for FS/storage
- An Analysis of Linux Scalability to Many Cores. Silas Boyd-Wickizer, Austin T. Clements, Yandong Mao, Aleksey Pesterev, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. Appears in OSDI 2010. (pdf)
April 22: System Implications of Storage Class Memory (aka Persistent Memory or NVM)
Guest Speaker: Scott Hahn, Intel Labs
- System Software for Persistent Memory. Subramanya R. Dulloor, Sanjay Kumar, Anil Keshavamurthy, et al. Eurosys 2014. (pdf)
- Recommended optional reading:
- A Protected Block Device for Persistent Memory. Feng Chen, Michael P. Mesnier, Scott Hahn. Appears in Conference on Massive Storage Systems and Technology (MSST) 2014. (pdf)
April 20: Evolution of Google FS
Guest Speaker: Larry Greenfield, Google
- The Tail at Scale. Jeffrey Dean, Luiz André Barroso. Communications of the ACM, 56(2), February 2013. (pdf)
April 13: Design and Evolution of WAFL
Guest Speaker: Ram Kesavan, NetApp
- File System Design for an NFS File Server Appliance. David Hitz, James Lau, Michael Malcolm. 1994 USENIX Winter Conference. (pdf)
- Recommended optional reading:
- FlexVol: Flexible, Efficient File Volume Virtualization in WAFL. John K. Edwards, Daniel Ellard, Craig Everhart, et al. Appears in USENIX Annual Technical Conference 2008. (pdf)
- SnapMirror: File System Based Asynchronous Mirroring for Disaster Recovery. Hugo Patterson, Stephen Manley, Mike Federwisch, Dave Hitz, Steve Kleiman, Shane Owara. Appears in FAST 2002. (pdf)
April 8 (L18): Flash File Systems
- DFS: A File System for Virtualized Flash Storage. William K. Josephson, Lars A. Bongo, David Flynn, Kai Li. Appears in FAST 2010. (pdf, recommended in lecture 5)
- Recommended optional reading:
- F2FS: A New File System for Flash Storage. Changman Lee, Dongho Sim, Jooyoung Hwang, Sangyeon Cho. Appears in FAST 2015. (pdf)
April 6: Deduplicating Storage Systems
Guest Speaker: R. Hugo Patterson II, Distinguished PDL Alumni
Co-founder and CTO, Datrium; former CTO, Data Domain (acquired by EMC)
- Venti: a New Approach to Archival Storage. Sean Quinlan, Sean Dorward. 2002 USENIX Conference on File and Storage Technologies (FAST). (pdf)
- Avoiding the Disk Bottleneck in the Data Domain Deduplication File System. Benjamin Zhu, Kai Li, Hugo Patterson. 2008 USENIX Conference on File and Storage Technologies (FAST). (pdf)
March 30 (L17): More reliability techniques
- Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays. Mark Holland, Garth A. Gibson, and Daniel P. Siewiorek. Appears in the Journal of Distributed and Parallel Databases, Vol. 2, No. 3, July 1994.
(available here)
- Scalable performance of the Panasas Parallel File System. (from Lecture 15 below).
March 25 (L16): Storage for data-intensive computing
- The Google File System. Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. Appears in ACM Symposium on Operating Systems Principles (SOSP), 2003.
(pdf)
- Bigtable: A Distributed Storage System for Structured Data. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber. Appears in USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006.
(pdf)
- Recommended optional reading:
- MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. Appears in USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2004.
(pdf)
March 23 (L15): Parallel File Systems
- GPFS: A Shared-Disk File System for Large Computing Clusters. Frank Schmuck and Roger Haskin. Appears in FAST, January 2002. (pdf)
- Scalable performance of the Panasas Parallel File System. Brent Welch, Marc Unangst, et al. Appears in FAST, February 2008. (pdf)
March 18 (L14): Multi-server Distributed file systems
- Same readings as L13.
- Recommended optional reading:
- Serverless Network File Systems (xFS). Thomas E. Anderson, Michael D. Dahlin, Jeanna M. Neefe, David A. Patterson, Drew S. Roselli, Randolph Y. Wang. Appears in the ACM Transactions on Computer Systems, Vol. 14, No. 1. February 1996. (pdf)
March 16 (L13): Distributed file systems and NAS Interfaces
- The Design and Implementation of the 4.4BSD Operating System (Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman, 1996)
- Chapter 9 (The Network Filesystem) (pdf)
- Recommended optional reading:
- Scale and Performance in a Distributed File System. John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, Michael J. West. Appears in the ACM Transactions on Computer Systems, Vol. 6, No. 1, Pages 51-81. February 1988. (pdf)
- Optional reading:
- "NFS Version 3 Protocol Specification" (RFC 1813: B. Callaghan, B. Pawlowski, P. Staubach, June 1995) (txt)
March 4 (L12): Backup and disaster recovery
- Designing for disasters. Kimberly Keeton, Cipriano Santos, Dirk Beyer, Jeffrey Chase, John Wilkes. Appears in the Proceedings of the Third USENIX Conference on File and Storage Technologies (FAST'04).
(pdf)
February 25 (L11): Disk array systems
- RAID: High-Performance, Reliable Secondary Storage (Peter M. Chen,
Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David
A. Patterson, 1994) (same as L10)
February 23 (L10): Disk array organization
- RAID: High-Performance, Reliable Secondary Storage (Peter M. Chen,
Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David
A. Patterson, 1994) (pdf)
February 18 (L9, Guest lecture, Professor David Andersen): Using Flash
- Challenges and opportunities for efficient computing with FAWN.
Vijay Vasudevan, David G. Andersen, Michael Kaminsky, Jason Franklin, Michael A. Kozuch, Iulian Moraru, Padmanabhan Pillai, Lawrence Tan. Published in ACM SIGOPS OSR v45 n1, Jan 2011.
- Recommended: SILT: A Memory-Efficient, High-Performance Key-Value Store,
Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky.
Published in 2011 ACM Symposium on Operating Systems Principles.
February 11 (L8): Caching and FS integrity
- Soft Updates: A Solution to the Metadata Update Problem. Gregory R. Ganger, Marshall Kirk McKusick, Craig A.N. Soules, Yale N. Patt. ACM Transactions on Computer Systems. May 2000. (pdf)
- Practical File System Design with the Be File System (Dominic Giampaolo, 1999)
- Chapter 7 (Journaling) (pdf)
- Verifying File System Consistency at Runtime. Daniel Fryer, Kuei Sun, Rahat Mahmood, TingHao Cheng, Shaun Benjamin, Ashvin Goel, Angela Demke Brown. Conference on File and Storage Technologies (FAST), 2012. (pdf)
February 9 (L7): On-disk data layout
- The Design and Implementation of the 4.4BSD Operating System (Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman, 1996)
- Chapter 8 (Local filestores) (pdf)
- TableFS: Enhancing Metadata Efficiency in the Local File System. Kai Ren and Garth Gibson. Published as Technical Report CMU-PDL-13-102, January 2013. (pdf)
February 2 (L6): File system organization
- Practical File System Design with the Be File System. Dominic Giampaolo. 1999.
- Chapter 2 (What is a file system?) (pdf)
- UNIX Internals: The New Frontiers. Uresh Vahalia. 1996.
- Chapter 8 (File system interface and framework) (pdf)
January 28 (L5): Flash Storage
- Design Tradeoffs for SSD Performance.
Nitin Agrawal, Vijayan Prabhakaran, Ted Wobber, John D. Davis, Mark Manasse, and Rina Panigrahy. Published in 2008 USENIX Annual Technical Conference.
- Recommended: DFS: A File System for Virtualized Flash Storage,
William K. Josephson, Lars A. Bongo, David Flynn, Kai Li.
Published in FAST 2010.
- Recommended: Operating System Support for NVM+DRAM Hybrid Main Memory,
Jeffrey C. Mogul, Eduardo Argollo, Mehul Shah, Paolo Faraboschi.
Published in HotOS 2009.
January 26 (L4): Disk drive firmware and request optimization
- Microbenchmark-based Extraction of Local and Global Disk Characteristics. Nisha Talagala, Remzi H. Arpaci-Dusseau, and David Patterson (pdf)
- Scheduling Algorithms for Modern Disk Drives. Bruce L. Worthington, Gregory R. Ganger, and Yale N. Patt. Appears in the Proceedings of the ACM Sigmetrics Conference. May, 1994. (ps)
January 21 (L3): Disk drive operation
- Digital Large System Mass Storage Handbook. Paul Massiglia. 1986.
- Chapter 2 (Magnetic disk technologies): read pages 2-1 to 2-20 (pdf) and 2-38 to 2-52 (pdf).
- Chapter 12 (System considerations): read pages 12-1 to 12-11. (pdf)
- An Introduction to Disk Drive Modeling. Chris Ruemmler and John Wilkes, 1994. (pdf)
January 14 (L2): Metrics of I/O system quality
- Computer Architecture: A Quantitative Approach, 3rd ed. John L. Hennessy and David A. Patterson. 2002. (pdf)
- Section 7.7: "I/O performance measures"
- Section 7.8: "A little queuing theory"
- Section 7.9: "Benchmarks of storage performance and availability"
- MTBF Description, Kevin Dally. 1995. (txt)
- Probability Refresher. Mor Harchol-Balter. 2000. (pdf)
January 12 (L1): Introduction and overview of storage systems
This first meeting will be more than just organizational in nature.
We will discuss how the class is going to work and what will (and won't)
be covered.
See the 15-746/18-746 overview for a recap of the
general information.
We will also dive into the course with an overview of the area of storage
systems.