18-847 Readings (Fall 2010)
Each class meeting will have required readings, to be discussed in class.
Usually, these readings will consist of relevant technical papers or articles.
Many of these readings are available on the web at other locations, but
some will be provided directly from this site.
However, please note that online versions provided by this site are only
available when accessed from a 128.2.* (CMU) IP address.
The readings listed should be read BEFORE class on the assigned
day. Otherwise, how could you participate in the discussion?
December 1st: Class project presentations
- #1: Vijay V.
- #2: Lin X.
- #3: William W.
- #4: Alexey T., Lianghong X., Ilari S.
November 29th: Class project presentations
- #1: Jim C., Nitin G.
- #2: Yi Z., Luis B., Justin M.
- #3: Bin F.
- #4: Reinhard M., Iulian M.
November 24th: No Class (Thanksgiving)
November 22nd: Two more papers (#1 Yi Z.; #2 Bin F.)
- #1: Volley: Automated Data Placement for Geo-Distributed Cloud Services. S. Agarwal, J. Dunagan, et al. NSDI 2010. (PDF available here)
- #2: Q-Clouds: Managing Performance Interference Effects for QoS-Aware Clouds. R. Nathuji, A. Kansal, A. Ghaffarkhah. EuroSys 2010. (available here)
November 17th: Incremental processing (#1 Vijay V.; #2 Iulian M.)
- #1: Large-scale Incremental Processing Using Distributed Transactions and Notifications. D. Peng, F. Dabek. OSDI 2010. (PDF available here)
- #2: MapReduce Online. T. Condie, N. Conway, et al. NSDI 2010. (PDF available here)
November 15th: Cloud storage (#1 Nitin G.; #2 Luis B.)
- #1: Lithium: Virtual Machine Storage for the Cloud. J. Hansen, E. Jul. SOCC 2010. (PDF available here)
- #2: Availability in Globally Distributed Storage Systems. D. Ford, F. Labelle, et al. OSDI 2010. (available here) (slides used by presenter)
November 10th: VM image storage (#1 Justin M.; #2 Richard M.)
- #1: The Collective: A Cache-Based System Management Architecture. R. Chandra, N. Zeldovich, et al. NSDI 2005. (available here)
- #2: Parallax: Virtual Disks for Virtual Machines. D. Meyer, G. Aggarwal, et al. EuroSys 2008. (PDF available here)
November 8th: VMware vCloud Director and cloud marketplaces (no CMU lead)
- Two of the authors (Orran and Arkady) of our one reading will join us for class. They will give a presentation about the vCloud vision and experience, though we will all have read the paper ;), and then discussion will ensue.
- Enabling a Marketplace of Clouds: VMware's vCloud Director. O. Krieger, P. McGachey, A. Kanevsky, The VCD Team. OSR 2010 (I think). (PDF available here)
November 3rd: More cluster/scalable OSs/schedulers (#1 Alexey T.; #2 Jim C.)
- #1: An Operating System for Multicore Clouds: Mechanisms and Implementation. D. Wentzlaff, C. Gruenwald, et al. SOCC 2010. (PDF available here)
- #2: Quincy: fair scheduling for distributed computing clusters. M. Isard, V. Prabhakaran, et al. SOSP 2009. (ACM Digital Library page and PDF linked here for class)
- Recommended optional reading/materials:
- More papers on the fos project available here.
- Hive: Fault Containment for Shared-Memory Multiprocessors. J. Chapin, M. Rosenblum, et al. SOSP 1995. (PDF linked here)
- Platform Enterprise Grid Orchestrator. A two page marketing description: PDF linked here
November 1st: Berkeley cluster scheduling papers (#1 Ilari S.; #2 Lianghong X.)
- #1: Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. B. Hindman, A. Konwinski, et al. UCB Tech Report Sept 30, 2010. (PDF available here)
- #2: Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling. M. Zaharia, D. Borthakur, et al. EuroSys 2010. (PDF available here)
- Recommended optional reading/materials:
- More papers on the Mesos (previously called Nexus) project available here.
October 25th and 27th: No class meeting
October 20th: Google Apps (Bin F., Lin X., Luis B., Justin M.)
October 18th: Microsoft Azure (Jim C., Nitin G., Richard M., Iulian M.)
October 13th: Amazon AWS (Lianghong X., Ilari S., Alexey T., William W.)
October 11th: Cloud provider building blocks (#1 Yi Z.; #2 Vijay V.)
- #1: The eucalyptus open-source cloud-computing system. Daniel Nurmi, Rich Wolski, et al. CCGRID 2009. (PDF available here)
- #2: Beyond Virtual Data Centers: Toward an Open Resource Control Architecture. Jeff Chase, Laura Grit, et al. ICVCI 2007. (PDF available here)
October 6th: Doing up virtual machines (#1 Bin F.; #2 Lin X.)
- #1: Memory Resource Management in VMware ESX Server. Carl A. Waldspurger. OSDI 2002. (PDF available here)
- #2: Differential Virtual Time (DVT): Rethinking I/O Service Differentiation for Virtual Machines. Mukil Kesavan, Ada Gavrilovska, Karsten Schwan. SOCC 2010. (PDF available here)
October 4th: Cloud Computing and some folks' thoughts (#1 Ilari S.; #2 Alexey T.)
- #1: Beyond Server Consolidation. Werner Vogels. ACM Queue, Jan/Feb 2008. (PDF available here)
- #2: Above the Clouds: A Berkeley View of Cloud Computing. Michael Armbrust, Armando Fox, et al. University of California at Berkeley Technical Report No. UCB/EECS-2009-28, Feb 2009. (PDF available here)
- Recommended optional reading/materials:
- Berkeley cloud report gets mixed reviews. James Urquhart. CNET News, Feb 2009. (available here)
- A conversation with Werner Vogels. ACM Queue. June 2006. (available here)
- All Things Distributed. Werner Vogels. A Weblog. (available here)
September 29th: No class meeting due to Open Cirrus summit at CMU
September 27th: Large tables (#1 Justin M.; #2 Reinhard M.)
- #1: BigTable: A Distributed Storage System for Structured Data. Fay Chang, Jeffrey Dean, et al. OSDI 2006. (available here)
- #2: Benchmarking Cloud Serving Systems with YCSB. B.F. Cooper, A. Silberstein, et al. SOCC 2010. (available here)
September 22nd: Higher-level languages (#1 Iulian M.; #2 William W.)
- #1: Pig latin: a not-so-foreign language for data processing. Christopher Olston, Benjamin Reed, et al. SIGMOD 2008. (PDF available here and ACM DL page here)
- #2: DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. Yuan Yu, Michael Isard, et al. OSDI 2008. (PDF available here)
September 20th: Parallel DBMSs (#1 Nitin G.; #2 Lianghong X.)
- #1: MapReduce and Parallel DBMSs: Friends or Foes? Michael Stonebraker, Daniel Abadi, et al. Communications of the ACM, January 2010. (PDF available here)
- #2: MapReduce: A Flexible Data Processing Tool. Jeffrey Dean and Sanjay Ghemawat. Communications of the ACM, January 2010. (available here)
- Recommended optional reading/materials:
- A Comparison of Approaches to Large-Scale Data Analysis. Andrew Pavlo, Erik Paulson, et al. SIGMOD 2009. (PDF available here)
September 15th: Others (#1 Yi Z.; #2 Luis B.)
- #1: Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. EuroSys 2007. (available here)
- #2: Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations. Yuan Yu, Pradeep Kumar Gunda, and Michael Isard. SOSP 2009. (available here)
September 13th: MapReduce (#1 Vijay V.; #2 Jim C.)
- #1: MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI 2004. (available here)
- #2: Improving MapReduce Performance in Heterogeneous Environments. Matei Zaharia, Andy Konwinski, et al. OSDI 2008. (available here)
- Recommended optional reading/materials:
- Applying performance models to understand data-intensive computing efficiency. Elie Krevat, Tomer Shiran, et al. Carnegie Mellon Technical Report #CMU-PDL-10-108. (available here)
September 8th: First day of class (overview papers)
- Data-Intensive Supercomputing: The case for DISC. Randal E. Bryant. Carnegie Mellon University technical report CMU-CS-07-128. (pdf)
- NIST definition of cloud computing. Peter Mell and Tim Grance. (linked here (with other information))
- State of Public Sector Cloud Computing. Vivek Kundra. Report at CIO.gov. (executive summary and pdf)