μTune: Auto-Tuned Threading for OLDI Microservices

Abstract

Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems into collections of numerous distributed microservices that interact via Remote Procedure Calls (RPCs). Microservices face sub-millisecond (sub-ms) RPC latency goals, much tighter than those of their monolithic counterparts, which must meet ≥ 100 ms latency targets. Sub-ms–scale threading and concurrency design effects that were once insignificant for monolithic services can now dominate in the sub-ms–scale microservice regime. We investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models: a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads. We develop μTune, a system that has two features: (1) a novel framework that abstracts threading model implementation from application code, and (2) an automatic load adaptation system that curtails microservice tail latency by exploiting inherent latency trade-offs revealed in our taxonomy to transition among threading models. We study μTune in the context of four OLDI applications to demonstrate up to 1.9x tail latency improvement over static threading choices and state-of-the-art adaptation techniques.

Publication
In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI) (Acceptance rate: 47/264 = 17.8%)
Akshitha Sriraman
Assistant Professor

I am an Assistant Professor in the Department of Electrical and Computer Engineering at Carnegie Mellon University. My research bridges computer architecture and software systems, with a focus on making datacenter-scale web systems more efficient, sustainable, and equitable (via solutions that span the systems stack).