Publications
Selective Sampling with Co-Testing: Preliminary Results
Abstract
We present a novel approach to selective sampling, co-testing, which can be applied to problems with redundant views (i.e., problems with multiple disjoint sets of attributes that can be used for learning). The main idea behind co-testing is to select queries from among the unlabeled examples on which the existing views disagree.
Selective sampling (Seung, Opper, & Sompolinsky 1992), a form of active learning, reduces the number of training examples that need to be labeled by examining unlabeled examples and selecting the most informative ones for the human to label. We introduce co-testing, a novel approach to selective sampling for domains with redundant views. A domain has redundant views if there are at least two mutually exclusive sets of features that can be used to learn the target concept. Our work was inspired by Blum & Mitchell (1998), who noted that there are many real-world domains with multiple views. For example, in Web page classification, one can identify faculty home pages based either on the words on the page or on the words in HTML anchors pointing to the page.
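The query-selection idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the 1-nearest-neighbor base learner, the names `nearest_neighbor`, `contention_points`, and `co_test`, and the rule of querying the first disagreement found are all assumptions made for the example. Each example is represented as a pair of feature vectors, one per view.

```python
from typing import Callable, List, Sequence, Tuple

# An example is a pair of feature vectors: (view 1 features, view 2 features).
Example = Tuple[Sequence[float], Sequence[float]]
Classifier = Callable[[Sequence[float]], int]


def nearest_neighbor(train: List[Tuple[Sequence[float], int]]) -> Classifier:
    """Hypothetical base learner: 1-nearest-neighbor on a single view."""
    def predict(x: Sequence[float]) -> int:
        def sq_dist(pair: Tuple[Sequence[float], int]) -> float:
            return sum((a - b) ** 2 for a, b in zip(pair[0], x))
        return min(train, key=sq_dist)[1]
    return predict


def contention_points(h1: Classifier, h2: Classifier,
                      unlabeled: List[Example]) -> List[Example]:
    """Return the unlabeled examples on which the two views disagree."""
    return [ex for ex in unlabeled if h1(ex[0]) != h2(ex[1])]


def co_test(labeled: List[Tuple[Example, int]], unlabeled: List[Example],
            oracle: Callable[[Example], int],
            n_queries: int) -> List[Tuple[Example, int]]:
    """Ask the oracle to label up to n_queries examples the views disagree on."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(n_queries):
        # Train one classifier per view on the current labeled set.
        h1 = nearest_neighbor([(ex[0], y) for ex, y in labeled])
        h2 = nearest_neighbor([(ex[1], y) for ex, y in labeled])
        candidates = contention_points(h1, h2, pool)
        if not candidates:
            break  # the views agree on every remaining example
        query = candidates[0]  # assumption: take the first disagreement
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return labeled
```

Here `oracle` stands in for the human labeler. The loop stops early once the two views agree on every example left in the pool, since no query would then count as a disagreement.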
- Date: 2000
- Authors: Ion Muslea, Steven Minton, Craig A Knoblock
- Conference: AAAI/IAAI
- Pages: 1107