Incremental Formalization of Document Annotations through Ontology-based Paraphrasing

Jim Blythe and Yolanda Gil
Thirteenth International World Wide Web Conference (WWW), 2004.


For the manual semantic markup of documents to become widespread, users must be able to express annotations that conform to ontologies (or schemas) with shared meaning. However, a typical user is unlikely to be familiar with the details of the terms as defined by the ontology authors. In addition, the idea to be expressed may not fit perfectly within a pre-defined ontology. The ideal tool should help users find a partial formalization that follows the ontology closely where possible but deviates from the formal representation where needed. We describe an implemented approach that helps users create semi-structured semantic annotations for a document according to an extensible OWL ontology. In our approach, users enter a short free-text sentence describing all or part of a document. The system then presents a set of candidate paraphrases of the sentence, generated from valid expressions in the ontology, and the user chooses the closest match. We combine off-the-shelf parsing tools with a breadth-first search of expressions in the ontology to help users create valid annotations starting from free text. Users can also define new terms to augment the ontology, so the potential matches can improve over time.
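
The paraphrasing idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `ONTOLOGY` dictionary, the function names `generate_paraphrases` and `rank_paraphrases`, and the word-overlap score are all assumptions made for illustration. The system described above works over an OWL ontology and uses off-the-shelf parsing tools rather than simple word matching. The sketch only shows how a breadth-first search can enumerate valid expressions, render each one as text, and order the results against a user's sentence.

```python
from collections import deque

# Hypothetical toy ontology (an assumption, not from the paper):
# each class maps to a list of (property, range class) pairs.
ONTOLOGY = {
    "Document": [("describes", "Event"), ("written by", "Person")],
    "Event": [("located in", "Place"), ("involves", "Person")],
    "Person": [("works for", "Organization")],
    "Place": [],
    "Organization": [],
}


def generate_paraphrases(root, max_depth):
    """Enumerate property chains from `root` breadth-first.

    Each chain is rendered as a plain-text paraphrase; shorter chains
    are produced before longer ones.
    """
    paraphrases = []
    queue = deque([(root, [root.lower()], 0)])
    while queue:
        cls, words, depth = queue.popleft()
        if depth > 0:
            paraphrases.append(" ".join(words))
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for prop, range_cls in ONTOLOGY.get(cls, []):
            queue.append((range_cls, words + [prop, range_cls.lower()], depth + 1))
    return paraphrases


def rank_paraphrases(sentence, paraphrases):
    """Order paraphrases by word overlap with the user's sentence.

    Word overlap is a crude stand-in for the parsing step; it is an
    assumption of this sketch.
    """
    query = set(sentence.lower().split())

    def overlap(paraphrase):
        return len(query & set(paraphrase.split()))

    return sorted(paraphrases, key=overlap, reverse=True)


if __name__ == "__main__":
    candidates = generate_paraphrases("Document", max_depth=2)
    sentence = "this document describes an event located in a place"
    for paraphrase in rank_paraphrases(sentence, candidates):
        print(paraphrase)
```

In this sketch the user would pick the closest paraphrase from the ranked list. Adding a new entry to `ONTOLOGY` corresponds loosely to defining a new term, which enlarges the set of candidate paraphrases on later searches.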

