What's Going On: Unsupervised Learning

November 17, 2016

Greg Ver Steeg thinks there's less difference between supervised and unsupervised machine learning than might appear to be the case.

At his early November seminar on unsupervised learning, Ver Steeg made that point through a series of creative analogies - and described the concrete ways in which his work appears to have contributed to a California Institute of Technology researcher's cancer remission.

At the seminar, part of ISI's "What's Going On" research breakfast series, Ver Steeg walked an audience of about 30 ISI colleagues through the distinction between supervised and unsupervised learning, beginning with photos that could qualify for adorable-cat YouTube videos.

Supervised learning, said Ver Steeg, involves "cute" data that is recognizable and has training labels, meaning each example is designated as either "cat" or "not cat." While models with millions of parameters can be trained on such data, it's difficult to learn anything new, since each example already is known to be or not be a cat.

Conversely, "Unsupervised learning is the dark matter of artificial intelligence," he said. As in the universe, which is known to be mostly dark matter about which little is understood, most learning by humans and animals actually is unsupervised. In other words, how we learn is rarely as simple as dividing the world into cats and not-cats. We instead work with a vast range of information from which we unconsciously manage to identify elements, predict outcomes and discern vital relationships.

Ver Steeg invoked the example of Google's AlphaGo software, which Facebook AI director Yann LeCun has asserted was simply reinforcement learning, not true AI. LeCun suggested imagining unsupervised learning as a cake, supervised learning as the frosting, and reinforcement learning as the cherry on top - something that works only once millions of examples have been collected and used for training.

Given the close relationship of the cake and frosting, Ver Steeg suggested any distinction is largely artificial. That's because everything is really cake (unsupervised learning) and what we choose to view as the "frosting" is just a matter of perspective (from a learning point of view).

He went on to describe his hierarchical information theory, CorEx, as a kind of sieve in which a main ingredient is extracted from data "soup", with further ingredients extracted at each subsequent layer. When the main ingredient is "information about relationships," compression, prediction and generative models become possible. In essence, CorEx decides what the most important relationships are in a given dataset, then searches for the factors that best explain them.
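For readers who want a concrete feel for "information about relationships," the sketch below may help. It is not CorEx itself and is not taken from the seminar. It assumes the quantity of interest is total correlation, the measure used in CorEx's published formulation, and it uses a hypothetical toy dataset in which the explanatory factor is handed over rather than searched for. All function names are illustrative.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def total_correlation(rows):
    """TC(X) = sum of per-variable entropies minus the joint entropy."""
    columns = list(zip(*rows))
    return sum(entropy(col) for col in columns) - entropy(rows)


def conditional_total_correlation(rows, factor):
    """TC(X | Y): weighted average of TC within each group sharing a factor value."""
    n = len(rows)
    groups = {}
    for row, y in zip(rows, factor):
        groups.setdefault(y, []).append(row)
    return sum(len(g) / n * total_correlation(g) for g in groups.values())


# Hypothetical toy data: three binary variables that all copy one hidden factor.
hidden = [0, 1, 0, 1, 1, 0, 0, 1]
rows = [(h, h, h) for h in hidden]

tc = total_correlation(rows)                              # dependence among the variables
tc_given = conditional_total_correlation(rows, hidden)    # dependence left once the factor is known
explained = tc - tc_given                                 # dependence the factor accounts for

print(f"TC = {tc:.2f} bits, TC given factor = {tc_given:.2f} bits, explained = {explained:.2f} bits")
```

In this toy case the hidden factor accounts for all of the dependence among the three variables. A method in the spirit of CorEx would have to discover such a factor from the data alone, which this sketch does not attempt.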

In practice, Ver Steeg has worked closely with computational biologist Shirley Pepke, who recently left Caltech for industry, on ovarian cancer gene expression. The work took on new urgency when Pepke herself was diagnosed with the relatively rare disease. Using CorEx, the pair created gene ontology database annotations that showed specific cell groups reflect strong, specific, diverse functions.

When Pepke's cancer stopped responding to chemotherapy, her oncologists recommended she continue on that path. Pepke instead chose to bolster facets of her immune system that Ver Steeg's work had shown to be deficient, and opted for an immune therapy being used for melanoma patients. Pepke since has gone into remission in what appears to be a validation of their discoveries.

Among other potential biomedical applications, said Ver Steeg: neuroscience, dynamic treatment modeling, other gene expression applications, and patient record phenotyping. He's also working with dating service eHarmony, as well as on social networks, stock trading forecasts and human language.

"What's Going On" research breakfasts will continue with Pedro Szekely, Jose Luis Ambite and Craig Knoblock of Intelligent Systems; John Heidemann of Internet and Networked Systems; and Andrew Schmidt of Computational Sciences and Technology. Gully Burns is coordinating the series.