Publications
Diagnosis Survey Papers
- Failure Diagnosis of Complex Systems.
S. P. Kavulya, K. Joshi, F. Di Giandomenico, P. Narasimhan.
Book on Resilience Assessment and Evaluation. Wolter, K.; Avritzer, A.; Vieira, M.; van Moorsel, A. (Eds.). Springer Verlag, 2012.
- Causes of Failure in Web Applications.
S. Pertet and P. Narasimhan.
Carnegie Mellon University Parallel Data Lab
Technical Report CMU-PDL-05-109, December 2005.
Diagnosis in Voice-over-IP Systems
- Draco: Statistical Diagnosis of Chronic Problems in Large Distributed Systems.
S. P. Kavulya, S. Daniels, K. Joshi, M. Hiltunen, R. Gandhi, P. Narasimhan.
IEEE/IFIP Conference on Dependable Systems and Networks (DSN), June 2012.
- Practical Experiences with Chronics Discovery in Large Telecommunications Systems.
S. P. Kavulya, K. Joshi, M. Hiltunen, S. Daniels, R. Gandhi, P. Narasimhan.
Best Papers from SLAML 2011 in Operating Systems Review, Volume 45, Number 3, December 2011.
Diagnosis in MapReduce Systems
- Theia: Visual Signatures for Problem Diagnosis in Large Hadoop Clusters.
E. Garduno, S. P. Kavulya, J. Tan, R. Gandhi, P. Narasimhan.
USENIX Large Installation System Administration (LISA) Conference, December 2012. (Best Student Paper)
- Understanding and improving the diagnostic workflow of MapReduce users.
J. D. Campbell, A. B. Ganesan, B. Gotow, S. P. Kavulya, J. Mulholland, P. Narasimhan, S. Ramasubramanian, M. Shuster, J. Tan.
ACM Symposium on Computer Human Interaction for Management of Information Technology (CHIMIT), December 2011.
- An analysis of traces from a production MapReduce cluster.
S. Kavulya, J. Tan, R. Gandhi, P. Narasimhan.
IEEE/ACM Conference on Cluster, Cloud and Grid Computing (CCGrid), May 2010.
- Visual, log-based causal tracing for performance debugging of MapReduce systems.
J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan.
IEEE Conference on Distributed Computing Systems (ICDCS), June 2010.
- ASDF: An Automated, Online Framework for Diagnosing Performance Problems.
K. Bare, S. Kavulya, J. Tan, X. Pan, E. Marinelli, M. Kasick, R. Gandhi, P. Narasimhan.
Architecting Dependable Systems, in Lecture Notes in Computer Science, Volume 6420/2010, No. 7, 2010.
- Kahuna: Problem Diagnosis for MapReduce-Based Cloud Computing Environments.
J. Tan, X. Pan, S. Kavulya, R. Gandhi, P. Narasimhan.
IEEE/IFIP Network Operations and Management Symposium (NOMS), April 2010.
- Blind Men and the Elephant (BLIMEy): Piecing together Hadoop for Diagnosis.
X. Pan, J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan.
International Symposium on Software Reliability Engineering (ISSRE), December 2009.
- Ganesha: Black-Box Fault Diagnosis for MapReduce Systems.
X. Pan, J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan.
Workshop on Hot Topics in Measurement and Modeling of Computer Systems (HotMetrics), June 2009.
- Mochi: Visual Log-Analysis Based Tools for Debugging Hadoop.
J. Tan, X. Pan, S. Kavulya, R. Gandhi, P. Narasimhan.
USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), June 2009.
- SALSA: Analyzing Logs as StAte Machines.
J. Tan, X. Pan, S. Kavulya, R. Gandhi, P. Narasimhan.
USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SysML), December 2008.
Diagnosis in Replicated Systems