Doctoral Dissertation

Dependable, Online Upgrades in Distributed Systems

Software upgrades are inevitable in distributed enterprise systems, and they can cause downtime, data loss or latent errors. My doctoral dissertation identifies and addresses the leading causes of both unplanned failures (breaking hidden dependencies) and planned downtime (migrating persistent data) when upgrading large-scale enterprise systems. Previous research has focused either on upgrading individual components of distributed systems (e.g., the application program or the database schema) or on incorporating online-upgrade mechanisms into existing middleware frameworks. Building on empirically derived insights into current upgrade practices and problems, I leverage the opportunities provided by emerging technologies, such as cloud computing, to improve the dependability of end-to-end upgrades in distributed systems.

Other Research

I am broadly interested in fault-tolerant distributed systems, with an emphasis on the emergent behavior of complex distributed systems. I have worked on both dependability-in-the-small (fault-tolerant networks-on-chip) and dependability-in-the-large (fault-tolerant middleware and zero-downtime software upgrades).