Publications
Discovering and learning semantic models of online sources for information integration
Abstract
Much work in Information Integration and the Semantic Web assumes that rich semantic models of sources exist. In practice, there is a tremendous amount of data on the Web, but it is typically hard to find, has little or no explicit structure, and rarely comes with any semantic description. We describe an integrated end-to-end system that can automatically discover web sources, invoke and extract the data from them, and build their semantic models. We describe the challenges in integrating the component technologies into a unified approach to discovering, extracting and modeling new online sources. We evaluate the integrated system in three different domains and demonstrate that it can automatically discover and model new data sources.
- Date: September 10, 2025
- Authors: José Luis Ambite, Bora Gazen, Craig A Knoblock, Kristina Lerman, Thomas Russ
- Journal: IJCAI Workshop on Information Integration on the Web, Pasadena, CA