Publications
D-REPR: a language for describing and mapping diversely-structured data sources to RDF
Abstract
Publishing data sources to knowledge graphs is a complicated and laborious process because data sources are often heterogeneous, hierarchical, and interlinked. For example, food price datasets may contain product prices in various units at different markets and times, and different providers may choose among many formats, such as CSV, JSON, or spreadsheets. Beyond data formats, these datasets may differ in layout: one dataset may be organized as a row-based or relational table (prices are in one column), while another may use a matrix table (prices are in one matrix). To address these problems, we present a novel data description language for mapping datasets to RDF. In particular, our language supports specifying the locations of source attributes in the sources, the mapping of the attributes to ontologies, and simple rules to join the data of these attributes to output the final RDF triples. Unlike …
- Date: September 23, 2019
- Authors: Binh Vu, Jay Pujara, Craig A Knoblock
- Book: Proceedings of the 10th International Conference on Knowledge Capture
- Pages: 189-196
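The layout problem described in the abstract can be illustrated with a small Python sketch. This is not D-REPR syntax and not the authors' implementation; the sample prices, the `ex:` predicate names, and every function name below are hypothetical, chosen only to show how a relational table and a matrix table holding the same food prices can be reduced to the same (market, date, price) records and then emitted as RDF-style triples.

```python
# Illustrative sketch only: hypothetical data and names, not D-REPR itself.

# Relational layout: one row per observation, prices in a single column.
relational_rows = [
    {"market": "A", "date": "2019-01", "price": 1.2},
    {"market": "A", "date": "2019-02", "price": 1.3},
    {"market": "B", "date": "2019-01", "price": 2.0},
    {"market": "B", "date": "2019-02", "price": 2.1},
]

# Matrix layout: markets index the rows, dates index the columns.
matrix_table = {
    "markets": ["A", "B"],
    "dates": ["2019-01", "2019-02"],
    "prices": [
        [1.2, 1.3],  # market A
        [2.0, 2.1],  # market B
    ],
}


def records_from_relational(rows):
    """Read (market, date, price) records from a row-based table."""
    return [(r["market"], r["date"], r["price"]) for r in rows]


def records_from_matrix(table):
    """Read (market, date, price) records from a matrix table.

    The price at position [i][j] is joined with market i and date j.
    """
    return [
        (market, date, table["prices"][i][j])
        for i, market in enumerate(table["markets"])
        for j, date in enumerate(table["dates"])
    ]


def to_triples(records):
    """Emit simple (subject, predicate, object) triples for each record."""
    triples = []
    for n, (market, date, price) in enumerate(records):
        subject = f"ex:observation{n}"  # hypothetical identifier scheme
        triples.append((subject, "ex:market", market))
        triples.append((subject, "ex:date", date))
        triples.append((subject, "ex:price", price))
    return triples


if __name__ == "__main__":
    for triple in to_triples(records_from_matrix(matrix_table)):
        print(triple)
```

In this sketch each layout needs its own hand-written reader. The abstract describes replacing that kind of per-layout code with declarative descriptions of attribute locations, ontology mappings, and join rules.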