Greg Ver Steeg



Ph.D., Physics, Caltech


Powerful (unsupervised) learning requires abstraction. My main research focus is to formalize this idea in a practical and principled way using information theory.

  • An information-theoretic foundation for decomposing the information in complex systems modularly and hierarchically, with prototype and sample applications (NIPS-14) and (code), and further theoretical developments (AISTATS-15)
  • The linear version of CorEx exhibits a unique "blessing of dimensionality" for recovering latent factor structure (paper) and excellent performance for estimating covariance matrices with high-dimensional, under-sampled data (code)
  • Information in complex systems can be extracted incrementally using the "information sieve" method (ICML-16) and (code). An implementation for continuous variables is more practical (code), and we show that it can be used to extract common information (IJCAI-17).
  • Historically, the impact of information theory on machine learning has been limited for two reasons. (1) A preoccupation with (pairwise) mutual information leads to its frequent misuse; see (ICML-14) for one example. (2) Information measures are hard to estimate; see (UAI-15), (AISTATS-15), and (NIPS-16).
  • Applications: gene expression (interesting podcast and article about this work); brain imaging 1, 2; text analysis (code) 1, 2; psychometrics; finance
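
To illustrate why pairwise mutual information alone can mislead, here is a minimal sketch on a toy XOR system. It is not taken from any of the papers or code linked above. The helper names (`entropy`, `mutual_information`, `total_correlation`) are hypothetical, and total correlation is used here as one standard multivariate measure, an assumption made for this example only.

```python
import itertools
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Pairwise mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def total_correlation(columns):
    """Total correlation TC(X_1..X_n) = sum_i H(X_i) - H(X_1..X_n)."""
    joint = list(zip(*columns))
    return sum(entropy(col) for col in columns) - entropy(joint)


# Toy system: x and y are independent fair bits, z = x XOR y.
# Enumerating the four equally likely (x, y) pairs gives the exact distribution.
rows = [(x, y, x ^ y) for x, y in itertools.product([0, 1], repeat=2)]
x, y, z = (list(col) for col in zip(*rows))

pairwise = [mutual_information(a, b) for a, b in [(x, y), (x, z), (y, z)]]
tc = total_correlation([x, y, z])

print("pairwise mutual information (bits):", pairwise)
print("total correlation (bits):", tc)
```

Every pair of variables looks independent, yet the three variables jointly share one bit of information. A purely pairwise analysis would miss this structure entirely.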

As intelligent beings in a complex world, we are constantly faced with the challenge of making sense of it all through simplification and abstraction. Our abstract representations of the world seem to be most useful when they "carve nature at its joints" or "perceive and bring together in one idea the scattered particulars" (as Socrates puts it). The motivation of this research program is to formalize this idea in a practical and general way. While information theory contains tempting concepts to quantify "information", traditional measures don't capture the notion of abstraction. Information theory was developed to describe lossless communication between two parties. From this point of view, the main focus has been on perfect reconstruction, or memorization, of input data. I propose to focus on a different characterization of multivariate information about the world that can be decomposed modularly and hierarchically. Besides introducing a notion of abstraction, we would like our characterization of information to allow us to efficiently optimize representations to be as informative as possible. We have been using these ideas to search for abstract representations that help us understand complex data like neurophysiology of Alzheimer's patients, gene expression of cancer patients, and human behavior. Ultimately, I hope these ideas will lead to a better understanding of intelligence.
Other ongoing research efforts