Wrapper methods provide extraction techniques for semi-structured sources, such as similar-looking Web pages, but much of the data on the World Wide Web exists in an unstructured and ungrammatical form.
Raster maps are widely available for areas around the globe and are an important source of geospatial data. Compared with other geospatial data, raster maps are easily accessible and provide geographic features that are difficult to find elsewhere, such as landmarks in historical maps.
Mashups provide an integrated and effective approach to extracting, integrating, and viewing diverse information. Some interesting examples of Mashups on the Internet are Zillow and WikiMapia. However, creating a Mashup with existing technologies such as Yahoo Pipes and Intel Mashmaker often requires programming knowledge and background knowledge of widgets.
The ability to reason over geospatial entities using publicly available information is greatly enhanced by the abundance of geospatial data sources on the Internet. Traditional data sources such as satellite imagery, maps, gazetteers and vector data have long been used in geographic information systems (GIS).
A huge number of high-quality maps on the Internet can be used to extract useful geospatial information about the regions they describe. For example, by aligning satellite images with these maps, we can label the streets automatically.
Only a very small portion of data on the Web is semantically annotated and available for use within Information Integration applications. Semantically annotating existing Web sources requires significant manual effort that must be repeated for each new data source.
We can utilize various extraction techniques to extract data from a wide variety of sources. However, different sources often have different schemas, access methods, and coverage. To address this issue, we have developed a data integration framework called Prometheus that facilitates uniform access to the sources.
Current approaches to linking information across sources, often called record linkage, require finding common attributes between the sources and comparing the records using those attributes. This often leads to unsatisfactory results because the sources may be missing information or may contain incorrect or outdated information.
People use search engines today to find information, but in many cases what people actually want is an application that allows them to access a set of related sources, extract the information they need, and integrate the data in ways that allow them to solve their problems.
We utilized a wide variety of geospatial and textual data available on the Internet in order to efficiently and accurately identify objects in the satellite imagery. To demonstrate the utility of our technique, we built an application that utilizes the satellite imagery from online sources to annotate buildings on the imagery.
Theseus is an execution platform for information agents. Its goals are to allow complex information management plans to be easily specified and to provide an infrastructure that optimizes the execution of such plans.
The task of object identification arises when integrating information from multiple websites. The same data objects can exist in inconsistent text formats across sites, making it difficult to identify matching objects using exact text matching.
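
To illustrate why exact matching falls short, here is a minimal sketch of approximate name matching. It is not the method described in the paper: the normalization rule, the `0.8` threshold, and the sample names are all assumptions made for illustration, and the similarity score comes from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_objects(site_a: List[str], site_b: List[str],
                  threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Pair each name from site_a with its best match in site_b.

    A pair is kept only if its similarity reaches the (assumed) threshold.
    """
    if not site_b:
        return []
    matches = []
    for a in site_a:
        best = max(site_b, key=lambda b: similarity(a, b))
        if similarity(a, best) >= threshold:
            matches.append((a, best))
    return matches


# Made-up names: the two spellings differ as raw strings but match
# once case and punctuation are normalized.
print(match_objects(["Art's Deli"], ["Joe's Pizza", "ARTS DELI"]))
```

A fixed threshold on a single string score is the simplest possible design. A real system would typically compare several attributes and tune or learn how to weight them.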
With the expansion of the Web, computer users have gained access to a large variety of comprehensive information repositories. However, the Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sources.
Wrappers facilitate access to Web-based information sources by providing a uniform querying and data extraction capability. For example, a wrapper for a yellow pages source can take a query for a Mexican restaurant near Marina del Rey, CA, retrieve the Web page containing the results of the query, and extract each restaurant's name, address, and phone number.
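
The extraction step of such a wrapper can be sketched as follows. This is only an illustration under stated assumptions: the page markup, the `extract_listings` helper, and the listings themselves are invented, and a hand-written regular expression stands in for whatever extraction rules an actual wrapper would use.

```python
import re
from typing import Dict, List

# Hypothetical results page. A real yellow pages site would use different
# markup, and these listings are placeholders, not real businesses.
SAMPLE_PAGE = """
<div class="listing"><b>Casa Ejemplo</b><span class="addr">123 Example Way, Marina del Rey, CA</span><span class="phone">(310) 555-0100</span></div>
<div class="listing"><b>Cantina Muestra</b><span class="addr">456 Sample Ave, Marina del Rey, CA</span><span class="phone">(310) 555-0101</span></div>
"""

# One extraction rule for the assumed layout: name, address, phone per listing.
LISTING = re.compile(
    r'<div class="listing"><b>(?P<name>.*?)</b>'
    r'<span class="addr">(?P<address>.*?)</span>'
    r'<span class="phone">(?P<phone>.*?)</span></div>'
)


def extract_listings(page: str) -> List[Dict[str, str]]:
    """Return one dict (name, address, phone) per listing found in the page."""
    return [m.groupdict() for m in LISTING.finditer(page)]


for record in extract_listings(SAMPLE_PAGE):
    print(record["name"], "|", record["address"], "|", record["phone"])
```

The querying half of the wrapper (building the request and fetching the page) is omitted, since it depends entirely on the source being wrapped.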