I have broad research interests
in computer systems, including storage/file systems, cloud computing,
software systems for large-scale machine learning, distributed systems,
and operating systems. Most of my current research
explores these topics in the broad context of storage systems and
large-scale infrastructures within
CMU's Parallel Data Lab (PDL),
for which I serve as Director.
See the PDL pages for more current information, since this page
is updated rarely. Together with industry collaborators
and other members of the PDL and CMU's
CyLab, my students and I are currently pursuing several
research topics:
- Big Learning (Systems for ML) - The BigLearning project aims to scale machine learning to large, sophisticated models and huge datasets, and to make that scale accessible to everyday practitioners, by developing programmable, distributed computing frameworks.
- Caching to Improve Latency & Efficiency at Scale (CILES) - new designs that make both flash caching systems and content delivery caches more efficient in terms of wear, tail latency, resiliency, cost-effectiveness, and, of course, hit rates.
- Cost-efficient Computing in the Cloud - explores opportunities to exploit cloud service providers' various VM instance offerings (e.g., best-fit VM sizes or low-cost but unreliable VMs rented on transient contracts) to reduce the cost of running user applications in the cloud.
- Data Lake Scheduling - unearthing, analyzing, and exploiting hidden inter-job dependencies in data lakes (data analytics infrastructures) to better schedule jobs and manage resources.
- HeART (Disk-adaptive redundancy in distributed storage) - adapting redundancy schemes to observed device failure rates in large distributed storage systems.
- Mimir: Navigating cloud storage - helping users make optimal decisions when composing distributed storage systems in the public cloud.
- Zoned Storage - exploring the system and software impacts of this new interface, which redefines the division of responsibilities between storage software and device firmware.
Other recent projects include:
- NVM Redundancy - hardware and software approaches to providing memory-speed, storage-quality checksums and cross-chip parity.
- Storage systems that automate and simplify storage administration in clusters of self-configuring, self-organizing, self-tuning, self-healing, self-managing, etc. storage servers.
- Survivable distributed storage (PASIS), which provides data availability and security even after some successful break-ins.
- Security via Smarter Devices, which, among other things, includes devices that protect their contents from compromised OSes.
- Automated database storage management (Fates), which matches data layouts and access patterns to device-specific characteristics to increase performance and predictability.
- Characterization and simulation of storage systems. (Two useful side-effects of this work have been the publicly-available DiskSim storage system simulator and the DIXtrac-enabled database of validated disk parameters for it.)
- Design, implementation, and use of systems of active components, including "intelligent" NICs and Active Storage Networks.
- Extracting and using free bandwidth from busy disks.
- MEMS-based storage devices in computer systems.
I spent August 1995 through November 1997
as a postdoc in Frans
Kaashoek's research group
(PDOS) at the MIT Lab for Computer
Science. During that time, I participated in the design and development
of the exokernel operating system.
I also pursued (with others, of course) a number of specific projects,
most of which are mentioned above.
In a previous life (i.e.,
prior to August 1995), I was a graduate student in Yale Patt's
research group in the University of Michigan's
Advanced Computer Architecture Lab. Although the group's main focus
was compilation for and implementation of high-performance processors,
my research focused on file systems and storage subsystem architecture.
I earned all of my collegiate
degrees (B.S., 1991; M.S., 1993; Ph.D., 1995) from the University of Michigan.