If you are seeking a "knowledge verification" letter from me, such as for supporting a green card application, please follow the process described here.

Recent Events

Research Interests

I have broad research interests in computer systems, including storage/file systems, cloud computing, software systems for large-scale machine learning, distributed systems, and operating systems. Most of my current research explores these topics in the context of storage systems and large-scale infrastructures within CMU's Parallel Data Lab (PDL), for which I serve as Director.

See the PDL pages for more current information, since this page is rarely updated. Together with industry collaborators and other members of CMU's Parallel Data Lab (PDL) and CMU's CyLab, my students and I are currently pursuing several research topics:

  • Big Learning (Systems for ML) - The BigLearning project aims at scaling machine learning to large and sophisticated models and huge data for average machine learning practitioners by developing programmable, distributed computing frameworks.
  • Caching to Improve Latency & Efficiency at Scale (CILES) - new designs that make both flash caching systems and content delivery caches more efficient, in terms of wear, tail latency, resiliency, cost-effectiveness, and of course hit rates.
  • Cost-efficient Computing in the Cloud - explores opportunities to exploit the various VM instance offerings of cloud service providers (e.g., best-fit VM sizes or low-cost but unreliable VMs rented on transient contracts) to reduce the cost of running user applications in the cloud.
  • Data Lake Scheduling - unearthing, analyzing, and exploiting hidden inter-job dependencies in data lakes (data analytics infrastructures) to better schedule jobs and manage resources.
  • HeART (Disk-adaptive redundancy in distributed storage) - tuning redundancy adaptively to observed device failure rates in large distributed storage systems.
  • Mimir: Navigating cloud storage - helping users make optimal decisions when composing distributed storage systems in the public cloud.
  • Zoned Storage - exploring the system and software impacts of this new interface, which redefines the division of responsibilities between storage software and device firmware.

Other recent projects include:


My current teaching cycle is Storage Systems in Fall semesters and Advanced Cloud Computing in Spring semesters.

Brief Pre-CMU Bio

I spent August 1995 through November 1997 as a postdoc in Frans Kaashoek's research group (PDOS) at the MIT Lab for Computer Science. During that time, I participated in the design and development of the exokernel operating system. I also pursued (with others, of course) a number of specific projects, most of which are mentioned above.

In a previous life (i.e., prior to August 1995), I was a graduate student in Yale Patt's research group in the University of Michigan's Advanced Computer Architecture Lab. Although the group's main focus is on compilation for and implementation of high-performance processors, my research was focused on file systems and storage subsystem architecture.

I earned all of my collegiate degrees (B.S., 1991; M.S., 1993; Ph.D., 1995) from the University of Michigan.