Publications

Efficient graph-based document similarity

Abstract

Assessing the relatedness of documents is at the core of many applications such as document retrieval and recommendation. Most similarity approaches operate on word-distribution-based document representations, which are fast to compute but problematic when documents differ in language, vocabulary or type, and which neglect the rich relational knowledge available in Knowledge Graphs. In contrast, graph-based document models can leverage valuable knowledge about relations between entities; however, due to expensive graph operations, similarity assessments tend to become infeasible in many applications. This paper presents an efficient semantic similarity approach exploiting explicit hierarchical and transversal relations. We show in our experiments that (i) our similarity measure provides a significantly higher correlation with human notions of document similarity than comparable measures, (ii) this …

Date
September 22, 2025
Authors
Christian Paul, Achim Rettinger, Aditya Mogadala, Craig A Knoblock, Pedro Szekely
Conference
The Semantic Web. Latest Advances and New Domains: 13th International Conference, ESWC 2016, Heraklion, Crete, Greece, May 29–June 2, 2016, Proceedings 13
Pages
334-349
Publisher
Springer International Publishing