Source modeling automatically builds rich semantic descriptions of online sources, including the ontological types of the data a source provides and the functional relationships between its inputs and outputs. We use machine learning methods to learn from known sources and then build semantic models of new online sources. These techniques can be used to automatically discover and model new sources of information, which can then be integrated with other sources of data.
The proliferation of data and data sources has compounded one of the Information Age's major ironies: incompatible data sources and inadequate methods for integrating them. We have developed data integration tools that integrate data at both the schema level and the data level. Our work includes the development of Prometheus, an information mediation system that facilitates uniform access to data sources, and EntityBases, a scalable record linkage approach to integrating entities across heterogeneous sources.
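To make the idea of record linkage concrete, here is a minimal Python sketch. It is not the EntityBases method; it only illustrates the general pattern of grouping candidate records (blocking) and then comparing them by similarity. The record fields (`id`, `name`, `city`), the sample data, and the `0.6` threshold are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", text.lower()).split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def link_records(source_a, source_b, threshold=0.6):
    """Pair records from two sources that likely describe the same entity.

    Blocking on the normalized city keeps the number of comparisons small;
    within a block, names are compared by token-set Jaccard similarity.
    """
    blocks = defaultdict(list)
    for rec in source_b:
        blocks[normalize(rec["city"])].append(rec)

    links = []
    for rec_a in source_a:
        for rec_b in blocks.get(normalize(rec_a["city"]), []):
            score = jaccard(normalize(rec_a["name"]), normalize(rec_b["name"]))
            if score >= threshold:
                links.append((rec_a["id"], rec_b["id"], round(score, 2)))
    return links


# Hypothetical records from two heterogeneous sources.
source_a = [{"id": "a1", "name": "Joe's Pizza", "city": "Los Angeles"}]
source_b = [
    {"id": "b1", "name": "Joes Pizza Inc.", "city": "los angeles"},
    {"id": "b2", "name": "Joes Pizza", "city": "San Diego"},
]
print(link_records(source_a, source_b))
```

In this toy example only `a1` and `b1` are linked: `b2` has a closer name but falls in a different city block, so it is never compared. A scalable system would need far more robust blocking and matching than this sketch shows.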
The rapid expansion of geospatial data sources on the Internet has opened up tremendous possibilities for integrating those sources to provide new information. We have developed tools to automatically discover maps, extract the various layers from those maps, and align them with current satellite or aerial imagery of a region. Through geospatial information fusion, we are integrating nontraditional sources, such as phone books, with existing sources such as satellite images, maps and vector data to automatically identify roads and structures in imagery. Such techniques have significant applicability to problems such as earthquake disaster intervention and recovery.
Mashups such as Zillow and Wikimapia offer an integrated, effective means to extract, combine and view diverse information. But creating a mashup often requires programming expertise, which puts it out of reach of many otherwise capable Web users. We have developed a mashup building tool, called Karma, that enables users to create a mashup in a seamless, interactive process. We are currently refining Karma and applying it to the problem of integrating data to help develop effective cancer treatments.
Vast amounts of Web data are unstructured, ungrammatical and inconsistent in format from one source to the next. By exploiting other sources of data within a given domain, collectively called a reference set, we can extract and query this large body of unstructured data much more effectively. These techniques can be used to better organize and query data from sources such as Craigslist or eBay.
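As a rough illustration of how a reference set can guide extraction, the Python sketch below matches a noisy listing against a small table of clean records and returns the best-matching record as the structured interpretation of the post. The reference records, the sample post, the token-overlap score, and the `min_score` cutoff are all assumptions chosen for illustration, not the matching method used in this work.

```python
import re

# Hypothetical reference set: clean, structured records for the car domain.
REFERENCE_SET = [
    {"make": "Honda", "model": "Civic", "trim": "EX"},
    {"make": "Honda", "model": "Accord", "trim": "LX"},
    {"make": "Toyota", "model": "Camry", "trim": "LE"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_reference_match(post, reference_set, min_score=0.5):
    """Return the reference record whose values best overlap the post's tokens.

    The score is the fraction of a record's tokens that appear in the post.
    Returns None when no record reaches min_score.
    """
    post_tokens = tokenize(post)
    best_record, best_score = None, 0.0
    for record in reference_set:
        record_tokens = tokenize(" ".join(record.values()))
        score = len(record_tokens & post_tokens) / len(record_tokens)
        if score > best_score:
            best_record, best_score = record, score
    return best_record if best_score >= min_score else None


# An ungrammatical, unstructured listing of the kind found in classified ads.
post = "02 honda civic ex - runs grate!! $3500 obo"
print(best_reference_match(post, REFERENCE_SET))
```

Once a post is aligned with a reference record, its attributes (make, model, trim) can be queried as structured fields even though the original text never labels them. Exact token overlap is the simplest possible score; it would miss misspelled or abbreviated attribute values, which a practical matcher would have to handle.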