If you are seeking a "knowledge verification" letter from me, such as for supporting a green card application, please follow the process described here.

Recent Events

  • hmm, updated every few years, whether needed or not... kids will do that to you, I guess.
  • Everyone who thought "Greg never takes a vacation" is now proven wrong... family vacation to Hawaii, in July 2012, complete with surfing (really!). Check out a few action shots of Gangers getting it done.
  • We launched a new research center focused on cloud computing in 2012. Check out the Intel Science and Technology Center for Cloud Computing (ISTC-CC).
  • Our storage systems course is being offered regularly, with the current offering in the Spring term 2015 and the next one in Fall 2015.
  • In Fall 2010, I taught a PhD-level special topics class (it'd been a while) on Data-intensive and Cloud computing and storage.
  • In July 2010, I testified before Congress about cloud computing in a hearing on the benefits and risks of moving federal IT to the cloud. It was a very interesting experience.
  • In 2010, I was named Jatras Professor of ECE. An overwhelming honor.
  • We've established several clouds at CMU, including the CMU vCloud, an OpenCirrus site, and an OpenCloud site.
  • Tim and Will (my sons) make it tough to keep web pages up to date, but lots of research progress is happening too. Check out my publications page for recent and previous papers.

Research Interests

I have broad research interests in computer systems, including storage/file systems, cloud computing, software systems for large-scale machine learning, distributed systems, and operating systems. Most of my current research explores these topics in the broad context of storage systems and large-scale infrastructures within CMU's Parallel Data Lab (PDL), for which I serve as Director.

See the PDL pages for more current information, since this page gets updated rarely, but... Together with industry collaborators and other members of CMU's Parallel Data Lab (PDL) and CMU's CyLab, my students and I are currently pursuing several topics of research:

  • Big Learning (Systems for ML) - The BigLearning project aims at scaling machine learning to large and sophisticated models and huge data for average machine learning practitioners by developing programmable, distributed computing frameworks.
  • Caching to Improve Latency & Efficiency at Scale (CILES) - new designs that make both flash caching systems and content delivery caches more efficient, in terms of wear, tail latency, resiliency, cost-effectiveness, and of course hit rates.
  • Cost-efficient Computing in the Cloud - explores opportunities to exploit the various CSP VM instance offerings (e.g., best-fit VM sizes or low-cost but unreliable VMs rented on transient contracts) to reduce the costs to run user applications in the cloud.
  • Data Center Observatory (DCO) - a working data center and a research vehicle for the study of data center automation and efficiency.
  • Data Lake Scheduling - unearthing, analyzing, and exploiting hidden inter-job dependencies in data lakes (data analytics infrastructures) to better schedule jobs and manage resources.
  • DeltaFS - a new distributed file system service created to efficiently handle small chunk read/write data at exascale.
  • HeART - adaptive redundancy tuning to observed device failure rates in large distributed storage systems.
  • Mimir: Navigating cloud storage - helping users to make optimal decisions when composing distributed storage systems in the public cloud.
  • NVM Redundancy - hardware and software approaches to providing memory-speed, storage-quality checksums and cross-chip parity.
  • Zoned Storage - exploring the system and software impacts of this new interface, which redefines the division of responsibilities between storage software and device firmware.

Other recent projects include:


Teaching

I currently teach on a cycle consisting primarily of 15-213/18-243 in Fall semesters and 15-746/18-746 in Spring semesters.

In Spring 2001, Dave Nagle and I created a new storage systems course (18-546). It has, of course, evolved over the years. We continue to teach it annually, now as 15-746/18-746.

In Fall 2000, I was drafted at the last minute to teach 15-712 (graduate OS and Distributed Systems). It was a lot of fun, and I taught it again in Fall 2001 and Fall 2002.

Brief Bio

(In reverse order) In addition to research and teaching, I enjoy spending time with my wife, Jenny, playing various sports (basketball, in particular), and occasionally sleeping. Of course, with the arrivals of Tim and Will, all of that became irrelevant.

I spent 8/95 through 11/97 as a postdoc in Frans Kaashoek's research group (PDOS) at the MIT Lab for Computer Science. During that time, I participated in the design and development of the exokernel operating system. I also pursued (with others, of course) a number of specific projects, most of which are mentioned above.

In a previous life (i.e., prior to August 1995), I was a graduate student in Yale Patt's research group in the University of Michigan's Advanced Computer Architecture Lab. Although the group's main focus is on compilation for and implementation of high-performance processors, my research was focused on file systems and storage subsystem architecture.

I earned all of my collegiate degrees (B.S., 1991; M.S., 1993; Ph.D., 1995) from the University of Michigan. I grew up in Michigan (Troy, post-1980) and Northern Ohio (Sylvania, pre-1980).