Tze Meng Low
Assistant Research Professor
Department of Electrical and Computer Engineering
Carnegie Mellon University
Hammerschlag Hall A303
5000 Forbes Ave
Pittsburgh, PA 15213
Contact: lowt (at) cmu (dot) edu
The goal of my research is to systematically derive and develop efficient algorithms from new and emerging domains for current and future architectures. My work centers on the derivation of efficient algorithms, analytical models for mapping software to hardware, and abstractions for software implementations. The eventual aim is a theory of high performance that identifies the key hardware and software factors affecting performance, so that high-performance implementations can be generated automatically for current and future architectures.
Selected Publications
These papers are selected to provide an overview of my research interests.
For a more up-to-date and complete list, please see Google Scholar.
Modeling Matrix Engines for Portability and Performance
36th International Parallel and Distributed Processing Symposium (IPDPS)
qLD: High-performance Computation of Linkage Disequilibrium on CPU and GPU
20th IEEE International Conference on Bioinformatics and BioEngineering
Portable GPU Framework for SNP Comparisons
18th IEEE International Workshop on High Performance Computational Biology (HiCOMB)
High Performance Zero-memory Overhead Direct Convolutions
International Conference on Machine Learning (ICML)
Large bandwidth-efficient FFTs on multicore and multi-socket systems
International Parallel and Distributed Processing Symposium (IPDPS)
A Family of Provably Correct Algorithms for Exact Triangle Counting
Workshop on Software Correctness for HPC
Analytical Modeling is Enough for High Performance BLIS
ACM Transactions on Mathematical Software 43(2)
Teaching and Education
I have taught 18-645 "How to Write Fast Code", a graduate
course in the ECE Department at Carnegie Mellon University on the
principles of high-performance computing, every Fall since 2017.
Graduate Students (PhD)
- Mark Blanco
- Elliott Binder
- Upasana Sridhar
- Nicholai Tukanov
- Yuttapichai "Guide" Kerdcharoen
- Navya Chandra, "Optimizing CNNs for TinyML devices", Spring 2020-Fall 2021
- Yurun Tian, "Applying ML approaches to Real-life Sale Data", Fall 2021
- Susan Quan, "Applying ML approaches to Real-life Sale Data", Fall 2021
- Jenya Singh, "Optimizing Direct Convolution for GPUs", Spring 2020
- Nicholas Kiesei, "Modeling collective communication in a multi-GPU system", Fall 2019
- Upasana Sridhar, "Formal Derivation of High Performance Graph Algorithms", Spring 2019
- Nidhi Bhatia, "Evaluation of Graph Clustering Implementations of Louvain Method", Spring 2019
- Rajnish Aggarwal, "Analytical Model for Deep Neural Nets", Spring 2019
- Elliott Binder, "High Performance Implementation of FastID on CPU and GPU", Fall 2017 - Spring 2018
- Liu Yuan, "High Performance Code Generator for Matrix Multiplication-Like Operations", Summer 2018
- Claudia Kho, "High Performance Matrix Multiplication on GPUs", Summer 2018.
- Anurag Kutuluru, "High Performance K-truss computation", Summer 2018.
- Rahul Mayuranat, "Task Parallelism of Graph Algorithms", Summer 2018.
- Krzysztof Drewniak, "Automating Loop Invariant Fusion", Spring 2018.
- Matthew Lee, "A case for HPC on a microcontroller", Summer 2017.
- Varun Nagaraj Rao, "First look: Linear Algebra-based Exact Triangle Counting without Matrix Multiplication", Summer 2017.
- Peter Oostema, "Method of Four Russians on Finite Field Matrix Multiplication", Summer 2017.
- Co-advised with other ECE faculty
- Maithreyi Deshpande, "Effects of Cache Replacement policies on Performance", Spring 2017
- Jennifer Xiao, "Creatures of Habit: Detecting Anomalies in Human Behavior", Summer 2017.
- Cody Yang, "Interpolation-based twiddle factors for FPGA-based FFT implementations", Summer 2016.
- Bill Kim, "UAV onboard data processing for state estimation", Summer 2016.
- Kashish Garg, "Data collection on UAV for sound based state estimation", Summer 2016.
- Jonathan Li, "UAV sound analysis for state estimation", Summer 2016.
- Siying Jin, "Memory Bandwidth and Vectorization on x86 Multicores", Spring 2016.
Awards, Prizes, Honors
SIAM Activity Group on Supercomputing (SIAG/SC) Best Paper Prize, "The BLIS Framework: Experiments in Portability"
SIAM Conference on Parallel Processing 2020.
Champion, "Exploration of Fine-Grained Parallelism for Load Balancing Eager K-truss on GPU and CPU,"
Graph Challenge 2019
Finalist, "Linear Algebraic Formulation of Edge-centric K-truss Algorithms with Adjacency Matrices,"
Graph Challenge 2018
Honorable Mention, "First look: Linear algebra-based triangle counting without matrix multiplication,"
Graph Challenge 2017