Chen Li


"Integrating Information from Heterogeneous Data Sources"

1/24/2003: 10:30 AM - 12:00 PM
11th Floor Small Conference Room

Abstract: The goal of information integration is to support seamless access to heterogeneous, autonomous data sources. Many data-integration systems use a mediation architecture, in which a mediator accepts a user query and answers it by accessing relevant sources through wrappers. In this talk I will focus on two research problems in information integration. The first is how to do query processing and optimization in the presence of limited query capabilities, i.e., when data sources do not allow simple scans of their data. I will discuss several challenges, such as how to describe source restrictions, how to compute mediator capabilities, and how to answer queries efficiently. The second problem is efficient record linkage: given two lists of records from two different sources, we want to determine all record pairs that are similar to each other, where the overall similarity between two records is defined in terms of domain-specific similarities over the individual attributes that make up the record. I will report some of the initial results of our research conducted in the Flamingo Project on data cleansing.
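
To make the record linkage problem in the abstract concrete, here is a minimal sketch in Python. It is not the method presented in the talk or used in the Flamingo Project. The attribute names, the two similarity functions, the weights, and the threshold are all assumptions chosen for illustration, and the all-pairs loop is the naive baseline that an efficient approach would try to avoid.

```python
from difflib import SequenceMatcher


def string_sim(a, b):
    """Similarity in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def year_sim(a, b, max_gap=5):
    """Similarity in [0, 1] that decays linearly with the gap between two years."""
    return max(0.0, 1.0 - abs(a - b) / max_gap)


# Hypothetical domain-specific similarities: attribute -> (function, weight).
# The weights are assumed values and sum to 1.
ATTRIBUTE_SIMS = {
    "name": (string_sim, 0.7),
    "year": (year_sim, 0.3),
}


def record_sim(r, s):
    """Overall similarity: weighted sum of the per-attribute similarities."""
    return sum(w * f(r[attr], s[attr]) for attr, (f, w) in ATTRIBUTE_SIMS.items())


def link_records(left, right, threshold=0.8):
    """Naive all-pairs linkage: return index pairs whose similarity meets the threshold."""
    return [
        (i, j)
        for i, r in enumerate(left)
        for j, s in enumerate(right)
        if record_sim(r, s) >= threshold
    ]


if __name__ == "__main__":
    source_a = [{"name": "Chen Li", "year": 2001}]
    source_b = [
        {"name": "chen li", "year": 2001},
        {"name": "Jane Roe", "year": 1980},
    ]
    print(link_records(source_a, source_b))
```

The naive loop compares every record in one list against every record in the other, so its cost grows with the product of the two list sizes. That quadratic cost is what makes efficiency a research question for large sources.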

About Chen Li: Dr. Chen Li is an assistant professor in the Department of Information and Computer Science at the University of California, Irvine. He received his Ph.D. degree in Computer Science from Stanford University in 2001, and his B.S. degree in Computer Science from Tsinghua University, China, in 1994. His research interests are in the fields of database and information systems, including data integration, data warehouses, data cleansing, multimedia databases, and XML.


Last updated: Mon Jun 19 17:44:06 2006

 

 

 

 

 