Artificial Intelligence

Tensor models for large, complex and high dimensional data

Friday, December 07, 2018, 11:00am - 12:00pm PDT
10th floor conference room (1016)
This event is open to the public.
AI Seminar
Shuheng Zhou, UC Riverside

Building models and methods for large spatio-temporal data is important for many scientific and application areas that affect our lives. In this talk, I will discuss several interrelated yet distinct models and methods on graph and mean recovery problems with applications in neuroscience, spatio-temporal modeling, and genomics.

I discuss the Gemini methods for estimating the graphical structures and underlying parameters, namely, the row and column covariance and inverse covariance matrices, from matrix variate data. Under sparsity conditions, we show that one is able to recover the graphs and covariance matrices with a single random matrix from the matrix variate normal distribution. Our method extends, with suitable adaptation, to the general setting where replicates are available. We establish consistency and obtain the rates of convergence in the operator and the Frobenius norm. We show that having replicates will allow one to estimate more complicated graphical structures and achieve faster rates of convergence. We provide simulation evidence showing that we can recover graphical structures and estimate the precision matrices, as predicted by theory.
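As a rough illustration of the setting described above, the following Python sketch draws a single matrix from a matrix variate normal distribution with Kronecker-structured covariance and forms Gram-matrix-based correlation estimates for its rows and columns. It is not the Gemini estimator itself: the function names, the AR(1) covariance choices, and the dimensions are assumptions made for this example, and the penalized (sparse) inverse covariance step is omitted.

```python
import numpy as np

def ar1_cov(dim, rho):
    """AR(1) covariance with unit diagonal (an illustrative choice only)."""
    idx = np.arange(dim)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sample_matrix_normal(row_cov, col_cov, rng):
    """Draw one mean-zero matrix X with Cov(X[i, j], X[k, l]) = row_cov[i, k] * col_cov[j, l]."""
    L_row = np.linalg.cholesky(row_cov)
    L_col = np.linalg.cholesky(col_cov)
    Z = rng.standard_normal((row_cov.shape[0], col_cov.shape[0]))
    return L_row @ Z @ L_col.T

def gram_correlation(G):
    """Rescale a Gram matrix so that its diagonal is exactly one."""
    d = np.sqrt(np.diag(G))
    return G / np.outer(d, d)

rng = np.random.default_rng(0)
row_cov = ar1_cov(200, 0.3)   # dependence among the 200 rows
col_cov = ar1_cov(20, 0.5)    # dependence among the 20 columns
X = sample_matrix_normal(row_cov, col_cov, rng)

# E[X.T @ X] = trace(row_cov) * col_cov and E[X @ X.T] = trace(col_cov) * row_cov,
# so the rescaled Gram matrices serve as crude correlation estimates from one matrix.
col_corr_hat = gram_correlation(X.T @ X)   # 20 x 20
row_corr_hat = gram_correlation(X @ X.T)   # 200 x 200
```

In a sparse setting, one would typically pass such correlation estimates to a penalized inverse covariance estimator to recover the graphs; that step is left out here to keep the sketch dependency-free.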

It has been proposed that complex populations, such as those that arise in genomics studies, may exhibit dependencies among observations as well as among variables. This gives rise to the challenging problem of analyzing high-dimensional data with unknown mean and dependence structures. In the second part of the talk, I present a practical method utilizing generalized least squares and penalized (inverse) covariance estimation to address this challenge. We establish consistency and obtain rates of convergence for estimating the mean parameters and covariance matrices iteratively. We use simulation studies and analysis of genomic data from a twin study of ulcerative colitis to illustrate the statistical convergence and the performance of our methods in practical settings.
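The generalized least squares step mentioned above can be sketched as follows. This is a minimal illustration, assuming a known design matrix and a given estimate of the covariance among observations; the names `gls_mean`, `D`, and `obs_cov`, the two-group design, and the AR(1)-style covariance are all hypothetical, and the alternation with penalized covariance estimation described in the talk is not shown.

```python
import numpy as np

def gls_mean(X, D, obs_cov):
    """Generalized least squares estimate of the mean parameters.

    X       : (n, m) data matrix, one row per observation
    D       : (n, p) design matrix (assumed known)
    obs_cov : (n, n) positive-definite covariance among observations
    Returns the (p, m) matrix (D' W D)^{-1} D' W X with W = obs_cov^{-1}.
    """
    WD = np.linalg.solve(obs_cov, D)   # obs_cov^{-1} D
    WX = np.linalg.solve(obs_cov, X)   # obs_cov^{-1} X
    return np.linalg.solve(D.T @ WD, D.T @ WX)

# Toy example: 6 observations in two groups, 3 variables.
rng = np.random.default_rng(1)
D = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
beta = np.array([[1.0, 2.0, 3.0], [-1.0, 0.0, 4.0]])   # group means (made up)
idx = np.arange(6)
obs_cov = 0.5 ** np.abs(idx[:, None] - idx[None, :])   # dependence among observations

# Noise that is correlated across observations and independent across variables.
noise = np.linalg.cholesky(obs_cov) @ rng.standard_normal((6, 3))
beta_hat = gls_mean(D @ beta + 0.1 * noise, D, obs_cov)
```

In practice the covariance among observations is unknown, which is why the abstract describes estimating the mean parameters and the covariance matrices iteratively rather than in a single pass.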

This talk is based on joint work with Michael Hornstein, Roger Fan, and Kerby Shedden.


Shuheng Zhou is an associate professor of statistics at the University of California, Riverside. She received her Ph.D. degree from Carnegie Mellon University in Electrical and Computer Engineering in 2006, with her thesis work focusing on theoretical computer science. Her current research interests are in high-dimensional statistics, graphical models, errors-in-variables, privacy, statistical machine learning theory, algorithms and optimization. She enjoys designing statistical models and methods for handling large, complex, and incomplete matrix and tensor data, arising from biological and neuroscience applications.

(LIVE ONLY; WILL NOT BE RECORDED)

