Marti Hearst
SIMS, UC Berkeley
donotspam.hearst@sims.berkeley.edu
http://www.sims.berkeley.edu/~hearst/


"Towards Semi-Supervised Algorithms for Semantic Relation Detection in BioScience Text"

11/19/04: 10:30 AM
11th Floor Large Conference Room
Host: Patrick Pantel

Abstract: A crucial step toward the goal of automatic extraction of propositional information from natural language text is the identification of semantic relations between constituents in sentences. In the bioscience text domain, we have developed a simple ontology-based algorithm for determining which semantic relation holds between terms in noun compounds, and a supervised learning algorithm for discovering relations between entities. In this talk, I will first briefly describe these results. A major bottleneck for semantic labeling work is the development of labeled training data. To remedy this, we propose a new approach for creating semantically labeled data that makes use of what we call *citances*: the text of the sentences surrounding citations to research articles. Citances provide us with differently worded statements of approximately the same semantic information; by looking at the way that different authors talk about the same facts, we obtain paraphrases nearly for free. We have just begun to assess how well citances work for the creation of labeled training data for the problem of detecting protein-protein interaction relations. We also hypothesize that citances will be useful for synonym creation, document summarization, and database curation.

About Marti Hearst: Dr. Marti Hearst is an associate professor in SIMS, the School of Information Management and Systems at UC Berkeley, with an affiliate appointment in the Computer Science Division. Her primary research interests are user interfaces and visualization for information retrieval, empirical computational linguistics, and text data mining. She received BA, MS, and PhD degrees in Computer Science from the University of California at Berkeley, and she was a Member of the Research Staff at Xerox PARC from 1994 to 1997. Prof. Hearst is on the editorial boards of ACM Transactions on Information Systems and ACM Transactions on Computer-Human Interaction, and was formerly on the boards of Computational Linguistics and IEEE Intelligent Systems. She was the program co-chair of HLT-NAACL '03 and SIGIR '99. She has received an NSF CAREER award, an IBM Faculty Award, an Okawa Foundation Fellowship, and two student-initiated Excellence in Teaching awards.


Last updated: Mon Jun 19 17:44:06 2006