Publications

Modeling Concept Dependencies in a Scientific Corpus

Abstract

Our goal is to generate reading lists for students that help them optimally learn technical material. Existing retrieval algorithms return items directly relevant to a query but do not return results to help users read about the concepts supporting their query. This is because the dependency structure of concepts that must be understood before reading material pertaining to a given query is never considered. Here we formulate an information-theoretic view of concept dependency and present methods to construct a “concept graph” automatically from a text corpus. We perform the first human evaluation of concept dependency edges (to be published as open data), and the results verify the feasibility of automatic approaches for inferring concepts and their dependency relations. This result can support search capabilities that may be tuned to help users learn a subject rather than retrieve documents based on a single query.

Date
January 12, 2026
Authors
Jonathan Gordon, Linhong Zhu, Aram Galstyan, Prem Natarajan, Gully Burns
Journal
Proc. of the Association for Computational Linguistics (ACL)