Publications
Building Information Servers
Abstract
This research addressed the problem of determining the relationships among multiple, diverse information sources in order to support the integration of data from these sources. In general, integrating data from multiple sources requires a model of the precise relationships between the sources. Constructing such a model by hand is a difficult and time-consuming process. The relationships captured in a model describe the type of overlap between data instances in different sources. In this work, data mining techniques were used to determine these relationships by comparing the data instances between sources. A related problem is that data instances can exist in different formats across several sources, e.g., IBM may be abbreviated as IBM in one source and appear as International Business Machines in another source. This work addressed this problem by developing techniques for automatically determining the …
- Date: September 22, 1997
- Authors: Craig A Knoblock, William Swartout, Sheila Tejada
- Journal: NASA
- Issue: 19980202709
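
As a rough illustration of the idea in the abstract, the sketch below classifies the type of overlap between the data instances of two sources and shows how a format mismatch such as IBM versus International Business Machines changes the result. It is not the method from the report: the function names, the hand-written `ALIASES` table, and the sample data are all hypothetical, and the report describes determining such correspondences automatically rather than from a fixed table.

```python
# Hypothetical alias table, written by hand for illustration only.
ALIASES = {"international business machines": "ibm"}


def normalize(name):
    """Lowercase, collapse whitespace, and map known aliases to one form."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, key)


def overlap_relationship(a, b):
    """Classify how the instance sets of source a and source b overlap."""
    a, b = set(a), set(b)
    if a == b:
        return "equal"
    if a < b:
        return "subset"      # every instance of a also appears in b
    if a > b:
        return "superset"    # every instance of b also appears in a
    if a & b:
        return "overlap"     # some shared instances, some unique to each
    return "disjoint"


# Made-up sample data for two sources.
source_a = ["IBM", "Intel"]
source_b = ["International Business Machines", "Intel", "Apple"]

raw = overlap_relationship(source_a, source_b)
cleaned = overlap_relationship(
    [normalize(x) for x in source_a],
    [normalize(x) for x in source_b],
)
print(raw, cleaned)
```

Compared on the raw strings, the two sample sources only partially overlap; once the names are normalized, the first source is a subset of the second, which is the kind of relationship an integration model would need to record.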