Publications

Selective Sampling with Co-Testing: Preliminary Results

Abstract

We present a novel approach to selective sampling, co-testing, which can be applied to problems with redundant views (i.e., problems with multiple disjoint sets of attributes that can be used for learning). The main idea behind co-testing is to select queries from among the unlabeled examples on which the existing views disagree.
Selective sampling (Seung, Opper, & Sompolinsky 1992), a form of active learning, reduces the number of training examples that need to be labeled by examining unlabeled examples and selecting the most informative ones for a human to label. We introduce co-testing, a novel approach to selective sampling for domains with redundant views. A domain has redundant views if there are at least two mutually exclusive sets of features, each of which can be used to learn the target concept. Our work was inspired by Blum & Mitchell (1998), who noted that many real-world domains have multiple views. For example, in Web page classification, one can identify faculty home pages based either on the words on the page or on the words in the HTML anchors pointing to the page.
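
The query-selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the random choice among contention points, the function names (train_centroid, co_test), and the toy data are all assumptions made for illustration. The sketch trains one hypothesis per view on the labeled examples, collects the unlabeled examples on which the two hypotheses disagree, and asks an oracle to label one of them.

```python
import random


def train_centroid(points, labels):
    """Hypothetical base learner: nearest-centroid classifier.

    Stands in for any learner that can be trained on a single view.
    Returns a predict(point) function.
    """
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(p):
        # Return the label whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(p, centroids[y])),
        )

    return predict


def co_test(view1, view2, labeled, oracle, n_queries, rng):
    """Sketch of a co-testing loop over two views.

    view1, view2: per-example feature lists (one disjoint feature set each).
    labeled: dict mapping example index -> known label.
    oracle: function index -> label (the human labeler in the abstract).
    Returns the enlarged labeled dict and the list of queried indices.
    """
    labeled = dict(labeled)
    queried = []
    for _ in range(n_queries):
        idx = sorted(labeled)
        ys = [labeled[i] for i in idx]
        h1 = train_centroid([view1[i] for i in idx], ys)
        h2 = train_centroid([view2[i] for i in idx], ys)
        # Contention points: unlabeled examples on which the views disagree.
        contention = [
            i for i in range(len(view1))
            if i not in labeled and h1(view1[i]) != h2(view2[i])
        ]
        if not contention:
            break  # the views agree everywhere; nothing left to query
        q = rng.choice(contention)  # assumption: pick a contention point at random
        labeled[q] = oracle(q)
        queried.append(q)
    return labeled, queried


# Toy data (invented for illustration): one numeric feature per view.
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]
view1 = [[0.0], [0.1], [0.2], [0.6], [0.4], [0.8], [0.9], [1.0]]
view2 = [[0.0], [0.2], [0.1], [0.3], [0.7], [0.9], [0.8], [1.0]]

labeled, queried = co_test(
    view1, view2, {0: 0, 7: 1}, lambda i: true_labels[i], 3, random.Random(0)
)
print("queried indices:", queried)
```

In this toy setup the two views initially disagree only on the examples whose single-view features are misleading, so the queries concentrate there rather than on examples both views already classify the same way.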

Date: 2000
Authors: Ion Muslea, Steven Minton, Craig A Knoblock
Conference: AAAI/IAAI
Pages: 1107