Greg Ver Steeg

Practical methods for discovering meaningful structure in complex systems

"If you do not expect the unexpected, you will not find it." -Heraclitus

"If your mind is empty, it is always ready for anything, it is open to everything. In the beginner's mind there are many possibilities, but in the expert's mind there are few." -Shunryu Suzuki

More and more, scientists find themselves studying rich data from complex, but poorly understood, systems like human behavior and biology. My goal is to find model-independent ways to automatically discover the important structure in these types of complex systems. The ultimate goal of this research is to discover information-theoretic principles that can help us understand intelligence. 

I received my PhD in physics from Caltech in June 2009 for research in quantum information theory. My research draws on a diverse set of connections between information theory, machine learning, causal inference, and physics. My recent work applies these methods to uncovering structure in complex systems like human behavior, language, biology, and social networks. I am also involved in a quantum computing initiative at ISI. High-level overviews of some papers are on my blog. The main thrust of my current research is a new information-theoretic foundation for unsupervised learning of representations, described here. Other research directions are briefly described below.

  • Information-theoretic methods for learning (the main page for this project is here):
    • We have introduced a principled and practical new approach to deep learning based on information theory. The intuition behind the method is to use information theory to find the best answer to the following question: "What are the simplest explanations that explain most of the relationships in the data?" 

Small portion of a deep model learned from Twitter data using "CorEx".

  • Information theory's alluring nomenclature often invites misuse. Recent work (ICML-14) explored how "mutual information" has been incorrectly used for clustering. In the future, I hope to expand on this work to describe some of the major problems with the venerable "InfoMax" principle.

  • Influence in time series. Transfer entropy provides a powerful, general framework for discovering connections between variables. Unfortunately, it is rarely used for this purpose because of the difficulty of estimating probability densities. Recent advances in nonparametric entropy estimators allow us to sidestep the density estimation problem and estimate entropies directly. This lets us estimate useful higher-order quantities such as transfer entropy, which measures how well one system helps us predict the future of another system. We have already applied this perspective successfully to discover influential relationships in social networks using the timing of activity and the content of tweets.
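
To make the definition of transfer entropy concrete, here is a minimal sketch in Python. It is an illustration only and makes several assumptions not taken from the work described above: the series are discrete symbols, the history length is one step, and the probabilities are estimated with simple plug-in counts rather than the nonparametric estimators mentioned in the text. The function name `transfer_entropy` and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of transfer entropy source -> target.

    Assumes discrete symbols and a history length of 1:
    TE = sum p(y_next, y_prev, x_prev)
             * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    # Each triple is (next target value, previous target value, previous source value).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    joint = Counter(triples)
    prev_pair = Counter((y_prev, x_prev) for _, y_prev, x_prev in triples)
    target_pair = Counter((y_next, y_prev) for y_next, y_prev, _ in triples)
    prev_target = Counter(y_prev for _, y_prev, _ in triples)

    te = 0.0
    for (y_next, y_prev, x_prev), count in joint.items():
        p_given_both = count / prev_pair[(y_prev, x_prev)]
        p_given_self = target_pair[(y_next, y_prev)] / prev_target[y_prev]
        te += (count / n) * math.log2(p_given_both / p_given_self)
    return te


# Toy data (hypothetical): y copies x with a one-step delay,
# so x should help predict y, but y should not help predict x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # expected to be near 1 bit
te_yx = transfer_entropy(y, x)  # expected to be near 0 bits
print(f"TE(x -> y) = {te_xy:.3f} bits, TE(y -> x) = {te_yx:.3f} bits")
```

The asymmetry between the two directions is what makes transfer entropy useful as a measure of directed influence. Plug-in counting becomes unreliable for continuous or high-dimensional data, which is the situation the nonparametric estimators are meant to address.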

  • Bell inequalities for complex networks, or Statistical tests for hidden variables
    • Generally, I would like to extend the notion in quantum physics of a test for hidden variables (the Bell inequalities, e.g.) to various hidden variable models in machine learning contexts. What are the minimal assumptions that allow you to infer causal effects even in the presence of confounding factors? 
    • As a concrete application, many papers have differentiated influence and homophily on social networks, but Shalizi and Thomas have pointed out that latent homophily cannot be distinguished from influence. We have shown how to lower bound the strength of causal effects in social networks. Work in progress considers ways to make these tests simpler, more powerful, and more general.

  • Mapping graph clustering to an Ising problem gives us a framework for asking several questions. When are clusters detectable? Is clustering always stable? Which prior information would affect these properties? 
  • A new generation of quantum chips made by D-Wave solves Ising problems using a quantum annealing process. Our work on clusters shows that not all Ising problems are created equal: there is a rich phase structure. How does this structure affect the quality of solutions achievable through quantum annealing?
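
As a rough illustration of what mapping graph clustering to an Ising problem can look like, the sketch below uses one standard formulation: splitting a graph into two clusters by maximizing modularity, which is equivalent to minimizing an Ising energy with couplings J_ij = A_ij - k_i k_j / (2m). This is an assumption chosen for illustration and is not necessarily the mapping used in the work described above. The toy graph and all names are hypothetical, and the ground state is found by brute force rather than by annealing.

```python
import itertools

# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
m = len(edges)

adj = [[0] * n for _ in range(n)]
degree = [0] * n
for i, j in edges:
    adj[i][j] = adj[j][i] = 1
    degree[i] += 1
    degree[j] += 1


def coupling(i, j):
    """Ising coupling from the modularity matrix: A_ij - k_i * k_j / (2m)."""
    return adj[i][j] - degree[i] * degree[j] / (2 * m)


def energy(spins):
    """Ising energy H = -sum_{i<j} J_ij s_i s_j; lower energy means higher modularity."""
    return -sum(
        coupling(i, j) * spins[i] * spins[j]
        for i in range(n)
        for j in range(i + 1, n)
    )


# Brute-force search over all 2^n spin assignments (feasible only for tiny graphs).
ground_state = min(itertools.product([-1, 1], repeat=n), key=energy)

# Spins aligned with node 0 form one cluster; the rest form the other.
cluster_a = {i for i in range(n) if ground_state[i] == ground_state[0]}
cluster_b = set(range(n)) - cluster_a
print(sorted(cluster_a), sorted(cluster_b))
```

Each spin value labels the cluster a node belongs to, so flipping every spin gives the same partition with the same energy. On larger graphs the exhaustive search is infeasible, which is where heuristic or annealing-based solvers come in.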

 

Teaching

Aram Galstyan and I designed and taught CS 599, "Computation and physics" in Spring 2012. 

Groups: