Signal Processing under Geometric Constraints

One major challenge in the data deluge is identifying patterns of activity from the flow of information in a network, examples including sensor networks, social networks and biological networks, by the prospect of systematically learning and predicting the network behavior. Network inference is hard because of its massive scale and time dynamics of the flow of information. How can we reduce network dimensionality without compromising the potential for learning and control within limited processing capabilities?

What saves the day is the geometry of information flow, the fact that it can be viewed as a low-dimensional manifold in a very high dimensional space. Leveraging low-dimensional data representations through geometric constraints such as low rank, sparsity, and graphs, we hope to reduce the sampling cost of information acquisition and improve the performance of decision making. However there is still the problem of tapping into that manifold. When the manifold is linear, we developed online algorithms with low-complexity that can estimate and track the evolution of a low-rank subspace from partial observations even with Poisson noise, and demonstrated superior performance on various applications. What is truly exciting is the recognition that the geometry of information flow, expressed in terms of the top principal components of the data covariance matrix, is preserved even from highly incomplete observations.

In order to further overcome the barrier between the high data generation rate and limited processing capabilities facing many modern data-intensive applications, we realized that by leveraging structural assumptions of covariance structures, a single sketch per sample suffices for accurately reconstructing the covariance matrix rather than the original data set. This offers a radical breakthrough for learning covariance structures requiring substantially fewer storage of the data than conventional methods. The key approach is to develop efficient sketching strategies of streaming data tailored to specific inference tasks with a recognition of the embedded low-dimensional structures. This idea can be extended to sketch and analyze graph-structured data, as well as detect anomalies and change-points. Our new framework has the potential to advance the paradigm of high-dimensional data analysis from simple post-processing toward an integration of data priors, agile acquisition, and adaptive processing.

Overview

Covariance Sketching and Low-Rank Covariance Estimation

Subspace Tracking and Statistical Inference for Streaming Data

Leveraging Graph Structures for Better Inference

Sparsity for Signal Processing and Machine Learning