Marti Hearst
SIMS, UC Berkeley
donotspam.hearst@sims.berkeley.edu
http://www.sims.berkeley.edu/~hearst/
"Towards Semi-Supervised Algorithms for Semantic Relation
Detection in BioScience Text"
11/19/04: 10:30 AM
11th Floor Large Conference Room
Host: Patrick Pantel
Abstract: A crucial step toward the goal of automatic extraction of propositional
information from natural language text is the identification of semantic
relations between constituents in sentences. In the bioscience text
domain, we have developed a simple ontology-based algorithm for
determining which semantic relation holds between terms in noun
compounds, and a supervised learning algorithm for discovering relations
between entities. In this talk, I will first briefly describe these
results.
A major bottleneck for semantic labeling work is the development of
labeled training data. To remedy this, we propose a new approach for
creating semantically-labeled data that makes use of what we call
*citances*: the text of the sentences surrounding citations to research
articles. Citances provide us with differently-worded statements of
approximately the same semantic information; by looking at the way that
different authors talk about the same facts, we obtain paraphrases
nearly for free. We have just begun to assess how well citances work
for the creation of labeled training data for the problem of detecting
protein-protein interaction relations. We also hypothesize that
citances will be useful for synonym creation, document summarization,
and database curation.
About Marti Hearst: Dr. Marti Hearst is an associate professor in SIMS,
the School of Information Management and Systems at UC Berkeley, with an
affiliate appointment in the Computer Science Division. Her primary
research interests are user interfaces and visualization for information
retrieval, empirical computational linguistics, and text data mining.
She received BA, MS, and PhD degrees in Computer Science from the
University of California at Berkeley, and she was a Member of the
Research Staff at Xerox PARC from 1994 to 1997. Prof. Hearst is on the
editorial boards of ACM Transactions on Information Systems and ACM
Transactions on Computer-Human Interaction, and was formerly on the
boards of Computational Linguistics and IEEE Intelligent Systems. She
was the program co-chair of HLT-NAACL '03 and SIGIR '99. She has
received an NSF CAREER award, an IBM Faculty Award, an Okawa Foundation
Fellowship, and two student-initiated Excellence in Teaching awards.
Last updated: Mon Jun 19 17:44:06 2006