AI Seminars
Omid Madani
Yahoo! Research Labs
donotspam.Omid.Madani@overture.com
http://research.yahoo.com/staff/algorithms/madani.xml


"Co-Validation: Using Model Disagreement to Validate Classification Algorithms"

10/22/04: 10:30 AM
11th Floor Large Conference Room
Host: Patrick Pantel

Abstract: In many applications of machine learning, labeled data is scarce while unlabeled data is plentiful. In these settings, unlabeled data can be utilized in a number of ways to help address the shortage of labeled data. For example, in active learning, unlabeled instances are selectively sampled and labeled in order to quickly improve the accuracy of the learning algorithm while lowering labeling costs. Other related methods include techniques for transduction and semi-supervised induction.

In this talk, I describe a new way of utilizing unlabeled data. In the context of binary classification, we define disagreement as a measure of how often two independently-trained models differ in their classification of unlabeled data. The disagreement rate is a reflection of learning algorithm stability, model complexity, and problem difficulty, and enjoys a number of properties. For example, we show that disagreement yields a lower bound on the prediction (generalization) error, and an upper bound on the "variance of prediction error", where variance is measured across training sets. I will report on our empirical experiments on a number of datasets using disagreement for error estimation and model selection. We call the general procedure co-validation, since the two independently-trained models are effectively used to validate one another. The procedure is especially effective in active learning settings, where training sets are not drawn at random and cross validation often greatly overestimates error. We believe that variants of co-validation may be of great practical use when unlabeled data is plentiful.

Joint work with David Pennock and Gary Flake. To appear in NIPS04.
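To make the disagreement measure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure from the talk: the synthetic one-dimensional data, the nearest-centroid classifier, and every function name are hypothetical. It trains two models on independent labeled samples and measures how often they differ on a pool of unlabeled points. One simple reason disagreement can bound error from below: in binary classification, whenever two models disagree on a point, at least one of them is wrong, so the disagreement rate cannot exceed the sum of the two error rates on the same points. The bounds presented in the talk may be stated differently.

```python
import random


def sample_labeled(rng, n):
    """Draw n (x, y) pairs; labels alternate 0/1 so both classes always appear."""
    data = []
    for i in range(n):
        y = i % 2
        # Assumed toy distribution: class 1 centered at +1, class 0 at -1.
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data


def train_centroid_classifier(labeled):
    """Fit a 1-D nearest-centroid classifier and return its predict function."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)

    def predict(x):
        return 1 if abs(x - mu_pos) < abs(x - mu_neg) else 0

    return predict


def disagreement_rate(model_a, model_b, unlabeled):
    """Fraction of unlabeled points on which the two models predict differently."""
    differing = sum(1 for x in unlabeled if model_a(x) != model_b(x))
    return differing / len(unlabeled)


def error_rate(model, labeled):
    """Fraction of labeled points the model misclassifies."""
    wrong = sum(1 for x, y in labeled if model(x) != y)
    return wrong / len(labeled)


rng = random.Random(0)

# Two models trained on independent small labeled samples.
model_a = train_centroid_classifier(sample_labeled(rng, 20))
model_b = train_centroid_classifier(sample_labeled(rng, 20))

# A large pool; labels are kept only to check the bound afterwards.
pool = sample_labeled(rng, 2000)
unlabeled = [x for x, _ in pool]

d = disagreement_rate(model_a, model_b, unlabeled)
e_a = error_rate(model_a, pool)
e_b = error_rate(model_b, pool)

print(f"disagreement rate: {d:.3f}")
print(f"error rates:       {e_a:.3f}, {e_b:.3f}")
print(f"d / 2 as a lower bound on the mean error: {d / 2:.3f}")
```

Only `disagreement_rate` needs no labels. The error rates are computed here solely to check the inequality on synthetic data, where the true labels happen to be known.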

About Omid Madani: Omid Madani earned a doctorate in computer science at the University of Washington, and attended the University of Alberta as a post-doctoral fellow, where he won the Alberta Ingenuity Associateship. Omid is interested in many areas of artificial intelligence, including machine learning (utilizing unlabeled data, incorporating prior knowledge, ...) and dynamic decision making under uncertainty (algorithm design/analysis for MDPs). He is a senior research scientist at Yahoo! Research Labs, applying his research to challenging and exciting problems in domains such as information retrieval and personalization.


Last updated: Mon Jun 19 17:44:06 2006

 

 

 

 

 