Seminars and Events

Artificial Intelligence Seminar

Who Needs Training Data: Towards Generalizable, Zero-shot Commonsense Services with Consolidated Knowledge

Event Details

Valuable lessons from top-down resources like Cyc, the creation of large-scale bottom-up resources like ConceptNet and ATOMIC, as well as the impressive performance and flexibility of language models, bring us closer to the fundamental goals of acquiring, representing, and reasoning with commonsense knowledge. Yet, the key aspect of generalizability is mostly neglected, as models typically rely on within-distribution training data. In this talk, I will argue that training data is obsolete for commonsense reasoning, and consider the following question: can we build generalizable, zero-shot commonsense services by informing language models with consolidated knowledge?

In the first part, I will discuss our efforts to consolidate existing commonsense knowledge sources into a sound representation while sufficiently preserving the richness of their linguistic expression. I will describe our ongoing construction of CSKG, a commonsense knowledge graph that consolidates seven popular, previously disjoint sources, and our principles for harmonizing and modeling nodes and relations, with a focus on our abstraction of CSKG's knowledge types into 13 dimensions, such as temporal or utility knowledge.
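The consolidation idea above can be illustrated with a minimal sketch (hypothetical, not the actual CSKG pipeline): edges from two sources are merged into one graph by normalizing source-specific node identifiers and mapping source-specific relations onto a shared relation vocabulary. The toy edges, the `normalize` heuristic, and the `relation_map` are all illustrative assumptions.

```python
# Toy edges in (head, relation, tail) form, as two disjoint sources might provide them.
conceptnet_edges = [("/c/en/oven", "/r/UsedFor", "/c/en/baking")]
atomic_edges = [("PersonX bakes bread", "xNeed", "to preheat the oven")]

# Source-specific relations mapped onto a shared vocabulary (illustrative only).
relation_map = {"/r/UsedFor": "used_for", "xNeed": "precondition"}

def normalize(node: str) -> str:
    """Strip source-specific URI prefixes and lowercase, so equivalent nodes can merge."""
    return node.split("/")[-1].replace("_", " ").lower()

def consolidate(*sources):
    """Merge edge lists from several sources into one set of harmonized triples."""
    graph = set()
    for edges in sources:
        for head, rel, tail in edges:
            graph.add((normalize(head), relation_map.get(rel, rel), normalize(tail)))
    return graph

cskg_like = consolidate(conceptnet_edges, atomic_edges)
```

A real consolidation must of course handle far harder cases (lexical variation, near-duplicate nodes, relation granularity), which is exactly what the harmonization principles in the talk address.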

In the second part, I will show that pretraining language models on CSKG knowledge leads to more accurate, generalizable, and explainable QA reasoning without relying on (much) training data. In addition, our experiments demonstrate that CSKG's dimensions help align background knowledge with the task and can guide the selection of data for pretraining. Inspired by these findings, we are investigating the next steps towards building generalizable commonsense services that require no training data: 1) making CSKG more comprehensive; 2) aligning CSKG to downstream tasks dynamically; and 3) filling the evaluation gaps revealed by our consolidation.
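The zero-shot setup described here follows a common answer-scoring recipe: each (question, candidate answer) pair is scored by a language model that has been informed with knowledge-graph-derived statements, and the highest-scoring candidate wins, with no task-specific training. A minimal sketch, where the `lm_score` function is a stand-in (a real system would use LM log-likelihoods) and the keyword weights are purely illustrative:

```python
def lm_score(text: str) -> float:
    """Stand-in for a language-model plausibility score (higher = more plausible).
    Faked here with a tiny keyword heuristic purely for illustration."""
    plausible = {"oven": 2.0, "baking": 2.0, "freezer": -1.0}
    return sum(plausible.get(tok, 0.0) for tok in text.lower().split())

def zero_shot_answer(question: str, choices: list) -> str:
    """Score each candidate answer in the context of the question; return the best."""
    return max(choices, key=lambda c: lm_score(f"{question} {c}"))

pred = zero_shot_answer("What do you use to bake bread?", ["an oven", "a freezer"])
```

Because no candidate-specific parameters are trained, the same scorer transfers across QA benchmarks, which is what makes the setup zero-shot and generalizable.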

Speaker Bio

Dr. Filip Ilievski is a Computer Scientist in the Center on Knowledge Graphs within the Information Sciences Institute (ISI) at the USC Viterbi School of Engineering. Filip holds a Ph.D. in Natural Language Processing from the Vrije Universiteit (VU) in Amsterdam, where his dissertation investigated the identity of long-tail entities in text. During a research visit at CMU in 2017, he worked under the supervision of Prof. Ed Hovy on inducing entity type profiles from knowledge graphs. His principal research interest is the role of background knowledge in filling gaps in human communication. Filip leads the development of the Commonsense Knowledge Graph (CSKG), which consolidates existing sources into a harmonized representation, making it possible to enhance language models for zero-shot Question Answering with more knowledge, to precisely select relevant subsets, and to perform explainable inference. He is also heavily involved in the development of the Knowledge Graph ToolKit (KGTK), a comprehensive set of operations for working with modern, hyper-relational graphs like Wikidata. In the past year, he has worked with nearly a dozen Master's students and collaborated with researchers at CMU, Bosch Research, RPI, and elsewhere. This work has been published in top-tier venues such as AAAI, EMNLP, and ISWC, and has led to commonsense workshops (AAAI'21) and tutorials (AAAI'21 and ISWC'20).