David G. Stork
donotspam.stork@OpenMind.org
www.OpenMind.org
"The Open Mind Initiative: Large-scale knowledge acquisition from non-experts via the web"
1/30/2004: [time not recorded]
11th Floor Large Conference Room
Abstract: The Open Mind Initiative is a web-based collaborative framework for
collecting large knowledge bases from non-expert contributors. Such
knowledge bases are vital for a wide range of 'intelligent' software
such as speech and handwriting recognizers, commonsense reasoners, and
natural language understanding systems. This talk begins by examining
several important trends that underlie Open Mind:
- the rise in open source software
- the expansion of opportunities for less-skilled users to contribute
knowledge
- the increase in scientific collaboration over the internet
- the growing need for large sets of 'informal' data from non-experts
Next we contrast the Open Mind approach with traditional data mining,
and then describe ongoing projects collecting common sense, natural
language and handwriting recognition knowledge bases. Our largest
project, Open Mind Common Sense, has collected nearly a million simple
assertions from tens of thousands of non-expert contributors.
Two important considerations are speeding the collection of data (by
interactive learning techniques) and ensuring data quality (by
identifying and filtering unreliable or even 'hostile'
contributions). We derive information-theoretic algorithms and perform
simple simulations that justify our approach to these two problems.
The talk concludes with a vision of future directions and
opportunities.
[including work by P. Singh, T. Chklovski, C. Lam, W. Lu, R. Gupta,
R. Mihalcea, and N. Aron]
About David G. Stork: David G. Stork is Chief Scientist of Ricoh Innovations as well as
Consulting Professor of Electrical Engineering and Visiting Lecturer
in Art and Art History at Stanford University. His primary interests
lie in pattern recognition, machine learning, neural networks and
novel uses of the internet; he is the creator and leader of the Open
Mind Initiative. He sits on the editorial boards of four international
journals and his five books include HAL's Legacy: 2001's computer as
dream and reality (MIT Press) for general audiences and the second
edition of Pattern Classification with R. Duda and P. Hart (Wiley).
Last updated: Mon Jun 19 17:44:06 2006