|   BRIEF BIO
   
             I joined Carnegie Mellon University in 2003 as a
            Systems faculty member with the Electrical and Computer
            Engineering Department and the Information Networking
            Institute. I was previously a Research Staff Member with
            Motorola's Broadband Communications Division in San Diego,
            CA, where I was involved in the H.264 video-compression
            standardization activity. I received a Motorola
            Outstanding Performance award in 2002 in recognition of my
            contributions to global standardization activities. Prior
            to this, I received my Ph.D. in March 2000 from the
            University of California, Santa Barbara and my
            B.Tech. degree from
            IIT Bombay in 1994.
		      
             RESEARCH

             My research interests are in the area of problem diagnosis or 
	    fingerpointing in large-scale distributed systems. Problem 
	    diagnosis involves instrumenting a given system to gather meaningful 
	    data, and analyzing the collected data to detect the source or even the root 
	    cause of the problems in the system. Fingerpointing is challenging
	    because the distributed nature of the computation can cause a single
	    problem to affect the behavior of all the nodes in the system. We are currently
	    working on identifying performance problems in MapReduce systems such as Hadoop,
	    and in file systems such as PVFS, Lustre, BFS and CoreFS. Our current fingerpointing algorithms
	    use black-box data and/or white-box data to fingerpoint a faulty node in Hadoop and
	    in these file systems.
My current research projects include
            the following:
          
             
             Problem Diagnosis in PVFS/Lustre: Automatically diagnosing performance
 	    problems in parallel file systems by identifying, gathering and analyzing either OS-level 
	    black-box performance metrics or system call attributes across parallel file systems.  
             Kahuna: Diagnosing performance problems in Hadoop by comparing OS-level performance
	    metrics and Hadoop's log statistics across all the nodes of a cluster to fingerpoint a faulty node. 
	     SALSA: Analyzing Logs as StAte machines. SALSA examines Hadoop logs to derive a state-machine
	    view of the system's execution, along with control-flow and data-flow models and related statistics. The state-machine
	    view of Hadoop is then used for failure diagnosis and for visualizing Hadoop's distributed behavior.  
        
            Gumshoe: Failure diagnosis in distributed systems through the application
            of statistical anomaly-detection algorithms and machine-learning
            techniques such as clustering.
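
The peer-comparison idea mentioned in the projects above (comparing a metric across all nodes to single out the one that behaves differently) can be illustrated with a minimal sketch. This is not the algorithm used in Kahuna or the PVFS/Lustre work; it is a generic median-based outlier check, and the function name `fingerpoint`, the `threshold` value, and the sample CPU numbers are all hypothetical.

```python
from statistics import median


def fingerpoint(metrics, threshold=3.0):
    """Return the nodes whose metric deviates strongly from the peer median.

    metrics:   dict mapping node name -> one numeric sample (e.g. CPU utilization).
    threshold: hypothetical cutoff, in units of median absolute deviation (MAD).
    """
    values = list(metrics.values())
    med = median(values)
    # MAD is a robust spread estimate that one faulty node cannot inflate much.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Most peers sit exactly on the median: any node off the median is suspect.
        return sorted(n for n, v in metrics.items() if v != med)
    return sorted(n for n, v in metrics.items() if abs(v - med) / mad > threshold)


# Made-up CPU utilization samples (percent) for a five-node cluster.
cpu = {"node1": 41.0, "node2": 39.5, "node3": 40.2, "node4": 93.0, "node5": 40.8}
suspects = fingerpoint(cpu)
print(suspects)
```

The sketch relies on the assumption that, in a healthy cluster, peer nodes doing similar work show similar metric values, so the node that departs from the majority is the likely culprit.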

            I am fortunate to work with talented students such as
            Jiaqi Tan, Soila Kavulya, Michael Kasick and Xinghao Pan.
            I am also affiliated with the Center for Sensed
            Critical Infrastructure Research (CenSCIR) and the
            Parallel Data Lab (PDL) at CMU.

	 RECENT PUBLICATIONS
	                
	     Visual, Log-based Causal Tracing for Performance Debugging of MapReduce Systems. Jiaqi Tan, Soila Kavulya, Rajeev Gandhi and Priya Narasimhan. To be presented at IEEE International Conference on Distributed Computing Systems (ICDCS), Genoa, Italy, June 2010.
	     An Analysis of Traces from a Production MapReduce Cluster. Soila Kavulya, Jiaqi Tan, Rajeev Gandhi and Priya Narasimhan. To be presented at IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Melbourne, Australia, May 2010.
	     Kahuna: Problem Diagnosis for MapReduce-Based Cloud Computing Environments. Jiaqi Tan, Xinghao Pan, Soila Kavulya, Rajeev Gandhi and Priya Narasimhan. To be presented at IEEE/IFIP Network Operations and Management Symposium (NOMS), Osaka, Japan, April 2010.
	     Black-Box Diagnosis in Parallel File Systems. Michael P. Kasick, Jiaqi Tan, Rajeev Gandhi and Priya Narasimhan. To be presented at USENIX Conference on File and Storage Technologies (FAST), San Jose, CA, February 2010.
	     System-Call Based Problem Diagnosis for PVFS. Michael Kasick, Keith Bare, Eugene Marinelli, Rajeev Gandhi and Priya Narasimhan. Fifth Workshop on Hot Topics in System Dependability (HotDep), Lisbon, Portugal, June 2009.
 The list of all my publications can be found here.
	      
             TEACHING

             I teach the Fundamentals of Embedded Systems
             (18-342/14-642) course at Carnegie Mellon University.
	      This practical, hands-on course introduces students to
	      the basic building-blocks and the underlying
	        scientific principles of embedded systems. The course
	        covers both the hardware and software aspects of
	        embedded processor architectures, along with operating
	        system fundamentals, such as virtual memory,
	        concurrency, task scheduling and
	        synchronization. Through a series of laboratory
	        projects involving state-of-the-art processors,
	        students learn to understand implementation
	        details and to write assembly-language and C programs
	        that implement core embedded OS functionality, and
	        that control/debug features such as timers,
	        interrupts, serial communications, flash memory,
	        device drivers and other components used in typical
	        embedded applications. Relevant topics, such as
	        optimization, profiling,
	        and real-time operating systems are also covered.
		
             PATENTS
	     
	    	Co-inventor, Frequency coefficient scanning paths for coding digital video content. 
		United States Patent: 7088867. August 2006.  
		Co-inventor, Macroblock level adaptive frame/field coding for digital video content. 
		United States Patent: 6980596. December 2005.  
	    |