Eduard
Hovy
Since 2012, Prof. Hovy has been working at the
Language Technologies Institute of Carnegie Mellon University.
Please visit his homepage there.
Positions while at ISI
Until September 2012, I was working at the University of Southern
California, as
a Fellow of its Information Sciences
Institute (ISI),
as Director of the Human Language
Group, as Research Associate Professor in USC's
Computer Science Department, and as Director of Research for
ISI's
Digital Government Research Center (DGRC).
Co-Director for Research of the Command, Control,
and Interoperability Center for Advanced Data Analysis
(CCICADA). Rutgers
University heads a consortium of about 20 universities in this
Center of Excellence
funded by the Department of Homeland Security University Affiliates
Center program.
Our research focuses on a broad range of topics of interest to data
analysis and threat identification.
Advisory Professor, Department of Computer Science,
Beijing
University of
Posts and Telecommunications, Beijing, China (Sep 2005 -- Aug 2015)
Regular High-Level Visiting Scientist, International Guest Academic
Talents (IGAT)
Program for the Development of University Disciplines in China (111
Program), China
(Jan 2008 -- Dec 2015)
Adjunct Associate Professor,
School of Computer Science,
University of
Waterloo, Waterloo, Canada (May 2010 -- Apr 2013)
Research at ISI until 2012
Research can be organized into three principal overlapping
directions:
(1) Natural Language Processing / Computational Linguistics /
Human Language
Technology
Development of sophisticated machine reading, information
extraction, parsing, and
text analysis technology (relevant
publications).
DARPA's DEFT Program (2012--16) and previous Machine Reading Program
(2009--14) have
the goal to develop NLP and knowledge representation and reasoning
techniques for
deeper semantic analysis of text and resultant automated learning of
domain information
by reading texts in given domains. Prof. Hovy leads our current DEFT
project SAFT
(Semantic Analysis and Filtering of Text), which includes researchers
at Carnegie
Mellon University and the University of Southern California's
Information Sciences
Institute. From 2008--12, Prof. Hovy's groups participated in two of
DARPA's MRP
teams: RACR (headed by IBM, the team that developed the Watson
QA game-playing
engine) and ERUDITE (headed by BBN; the OntoNotes corpus
was developed
as part of this project).
The SASO (2004--11) and MRE (2001--04) projects at the
Institute for Creative Technologies of the University of Southern
California
developed virtual
humans in virtual reality simulations, employing text-to-semantics
parsers and
opposite-direction generators developed by Prof. Hovy and students.
Development of automated question answering systems
(relevant publications).
Associated with the above are several QA systems developed at ISI,
such as
Textmap
and
Webclopedia
(with Dr. Daniel Marcu, Dr. Ulf Hermjakob, Dr. Chin-Yew Lin, and
others). This work
employed information retrieval, clustering, text summarization,
parsing, and text
harvesting methods described elsewhere.
Development of automated text summarization systems and
automated
summarization evaluation theory and technology
(relevant publications).
Summarization engines developed by Prof. Hovy, Dr. Chin-Yew
Lin, and
others at ISI
include
SUMMARIST
(single documents),
NeATS (multiple documents), and
GOSP
(producing headlines). Summarization was used in multilingual text
access and
management systems such as
C*ST*RD
and MuST.
Summarization evaluation systems include
ROUGE (2003--04), developed
by Dr. Chin-Yew Lin of ISI with Prof. Hovy, and the
BE package (2005--08), developed by Dr.
Stephen Tratz and Prof. Hovy.
Research on various aspects of machine translation and
automated MT
evaluation systems and technology
(relevant publications).
For MT evaluation, work in 2002--04 includes a systematization of all
major machine translation evaluation measures (the FEMTI survey) with
Prof. Maghi King and
Dr. Andre Popescu-Belis
at the University of Geneva, as well as students and researchers at
other
universities and commercial MT companies.
Work on machine translation included development of the
Pangloss MT system
(1990--94) together with researchers at CMU and New Mexico State
University, which
helped establish ISI's
Gazelle
system headed by Dr. Kevin Knight.
The NSF-sponsored IL-Annotation project
IAMTC (2003--04), joint with
researchers at CMU, University of Maryland, MITRE, Columbia
University, and New
Mexico State University, focused on Interlingua design and text
annotation; see
under lexical semantics below.
Development of sophisticated social media analysis and opinion
identification
technology (relevant
publications).
Prof. Hovy is working with researchers at ISI in the Social
Media project to
develop techniques for identifying the persuasiveness and
persuadability of participants
in online discussions (currently, Wikipedia discussions). This work
follows work with
Dr. Don Metzler and others at ISI in 2010--11 to recognize important
events from
analyzing the Twitter stream, and prior research in the
MKIDS-ISI
project (2002--05) that developed methods to analyze emails for
expertise (of people
and groups) and relative social status, using topic signature and
speech act
recognition. Earlier, the Psyop project (2004--08) employed
information
extraction and sentiment analysis technology to extract from online
texts entities,
events, beliefs, goals, opinions, and other information of interest,
and to compose
the results into psychologically informative descriptions of people.
Development of theories and systems to perform automated text
generation,
including multi-sentence text and sentence planners
(relevant publications) and
single-sentence
generators (relevant
publications).
Prof. Hovy worked with researchers at USC's Institute for
Creative Technologies to develop a parser and generator for the
software agents in virtual reality simulations called SASO
(2004--09) and
Mission
Rehearsal Exercise
(2001--04) (this work in collaboration with Dr. David
Traum, Dr. Anton
Leuski, Dr. David
DeVault, and others).
The project
Quick!Help
focuses on the
generation of tailored recipes for poor people (this work in
collaboration with
Prof. Peter
Clarke
and Dr. Susan Evans from USC and Andrew Philpot from ISI). This work
relates to
language tailoring done earlier in the
HealthDoc
project with Prof. Chrysanne
DiMarco
from the University
of Waterloo, Canada and Prof. Graeme Hirst
from the University of Toronto.
Earlier work focused on the development of discourse relations and
planners that
employ them to ensure the production of coherent multisentential
text. This includes
a taxonomization of all available discourse relations collected from
various sources
(1992) and the RST Text Structurer (1987--92).
Prof. Hovy's work in 1987--92 included participation on the Penman
sentence
generator with researchers in various countries, to develop the
then-largest
sentence generator in the world.
Prof. Hovy's Ph.D. work focused on the development of a text
generation program
PAULINE that took into account the pragmatic aspects of
communication, since the
absence of sensitivity toward hearer and context has been a serious
shortcoming of
generator programs written to date. In general, he is interested in
all facets of
communication, especially language, as situated in the wider context
of intelligent
behavior. Related areas include Artificial Intelligence (work on
planning and learning),
Linguistics (semantics and pragmatics), Psychology, Philosophy
(ontologies), and Theory
of Computation.
Development of theory to address problems in multimedia
human-computer
communication
(relevant publications).
This work (1989--2002), conducted with Dr. Yigal Arens of ISI and
students, focused
specifically on the question of dynamic planning and allocation of
information to media during presentation design.
(2) Ontologies, Text Mining/Harvesting, and Lexical Semantics
Development of shallow semantic representation notations and
tools that
support manual annotation of large amounts of text with shallow
semantic
information (relevant
publications).
The DARPA-funded OntoNotes project (2008--2012), joint with
Dr. Ralph Weischedel
and Dr. Lance Ramshaw of BBN, Prof. Mitch
Marcus of the
University of Pennsylvania,
and Prof. Martha Palmer of the
University of Colorado, focused on the creation of a large corpus of
texts in English,
Chinese, and Arabic that was annotated with shallow semantic
information (word senses
and some coreference). The word sense information was incorporated
into the Omega
ontology (see below).
The NSF-funded IL-Annot project IAMTC
(2003--04), joint with researchers at CMU, University of Maryland,
MITRE, Columbia
University, and New Mexico State University, focused on stepwise
Interlingua design
and verification by annotation of texts in 7 languages. In both these
projects, the
Omega ontology (see below) provided the symbol set for semantic
annotation.
Development of large concept taxonomies/ontologies through a
combination of
merging together existing ontologies, adding to the knowledge by
extracting information
from online text (see below), and enriching the interdependency
relations by extracting
information from dictionaries (relevant publications).
The Omega ontology, built at ISI
since 2003, contains
over 120,000 concept terms and several million instances, in addition
to various other
information, acquired from a variety of sources, including Princeton's
WordNet, NMSU's
Mikrokosmos, and
ISI's earlier ontology SENSUS
(1996--2000). During 2008--2011, in the OntoNotes project (see above),
a new Upper
Model was built for Omega, and its Entities were thoroughly
re-organized. Work on
Omega has been performed by Prof. Hovy in collaboration with
Mr. Andrew Philpot, Dr.
Patrick Pantel, Mr. Michael Fleischman, and Dr. Jerry
Hobbs from ISI.
Development of techniques to extract large amounts of instance- and
concept-level
information from online text
(relevant publications).
At ISI, Dr. Zornitsa Kozareva and
Prof. Hovy
developed the Double-Anchored Pattern (DAP) text harvesting
technique and
demonstrated its effectiveness for collecting terms and relations, and
for organizing
them hierarchically, over large amounts of domain texts. (This work
was partially
done in collaboration with Prof. Ellen Riloff from the University of
Utah.)
In several earlier projects since 1996, Prof. Hovy, students, and
collaborators
developed a series of text mining and information extraction engines,
and built
collections comprising several million facts (about people,
locations, objects, etc.).
This information, stored in a database, was in many cases connected to
the Omega ontology
(see above). The Learning by Reading and Mobius
(2005--08) experiments
attempted to combine tagging, parsing, semantic analysis, and
inference techniques to
create a knowledge base automatically from a high school textbook of
Chemistry and from
texts about the heart and engines, and to answer high school-level
test questions about
this.
(3) Digital Government
Analysis of the nature of information processing in government to
recognize opportunities
to deploy IT for effective Digital Government
(relevant publications).
The development of systems to automatically find alignments or
aliases across and
within databases (2003--06). The SiFT system used
mutual information technology to detect patterns in the distribution
of data values.
The government partner in this NSF-funded project was the
Environmental Protection
Agency (EPA), which provided databases with air quality measurement
data. (This work was
done at ISI with Mr. Andrew Philpot and Dr. Patrick Pantel.)
Development of sophisticated text analysis of public
commentary, such as email, letters, and reports, delivered to the
government
(relevant publications).
Several projects from 2000--07 addressed the problem faced by
government regulation
writers, who regularly receive tens to hundreds of thousands of
emails and other
comments from the public about proposed regulations.
Funded by the NSF, the eRule projects were a
collaboration between Prof. Stuart Shulman (a political
scientist then at the
University of Pittsburgh and the University of Massachusetts Amherst),
Prof. Jamie Callan (a computer [...]
program to develop ICT for city-to-citizen
interaction. Work at ISI focused on the
development of a system to classify emails and extract speech acts,
opinions, and
stakeholders in German.
Development of systems to access multiple heterogeneous
databases
(relevant publications).
Funded by the NSF (1999--2003), a series of projects addressed the
problem that many
government agencies face: their data is distributed in various formats
over different
databases, and evolves to include slightly different variations over
the years. Our
EDC and
AskCal systems
provided access
to over 50,000 tables of information about gasoline, produced by
various Federal
Statistics agencies, including the Census Bureau, the Bureau of Labor
Statistics, and
the Energy Information Administration. The system included a large
ontology and a
natural language question interpreter. This work was done at ISI in
collaboration
with Mr. Andrew Philpot and Dr. Jose-Luis Ambite. External partners
in this project
were the DGRC team at Columbia University, New York, headed by
Dr. Judith Klavans.
Work Experience
Associate Research Professor (Sep 2012 --) at the
Language Technologies Institute of Carnegie Mellon
University, Pittsburgh, PA
Division Director (2011 -- 2012), ISI Fellow (Aug 2000 -- 2012),
Deputy Division Director (Oct 2002 -- 2011), Senior Project Leader
(May 1997 -- 2002),
Project Leader (Jul 1989 -- Apr 1997), and Computer Scientist
(Mar 1987 -- Jun 1989),
Information Sciences Institute of
the
University of Southern California,
Los Angeles, CA
Research Associate Professor (Nov 1999 --) and Research Assistant
Professor (Dec 1989 -- Oct 1999), Department of Computer Science,
University of Southern California,
Los Angeles, CA
Co-Director (May 1999 -- Jun 2005), Master's Degree Program in
Computational
Linguistics, University of Southern
California, Los
Angeles, CA
Advisory Professor (Oct 2005 -- Aug 2015),
Beijing
University of
Posts and Telecommunications, Beijing, China
Adjunct Associate Professor (Feb 1997 -- Jan 2003) and
(May 2010 -- Apr 2013),
School of Computer Science,
University of
Waterloo, Waterloo, Canada
Concurrent Professor (Oct 2008 -- Sep 2011), Department of Computer
Science,
Northeastern University,
Shenyang, China
Adjunct Professor (Sep 2005 -- Aug 2012), Department of Computer
Science,
KAIST, Daejeon, Korea
Honors
Ph.D. honoris causa, National University of Distance
Education (UNED),
Madrid, Spain. Jan 2013
ACL Fellow, one of the original 17 Fellows of the Association for
Computational
Linguistics. Dec 2011
Regular High-Level Visiting Scientist, International Guest Academic
Talents (IGAT)
Program for the Development of University Disciplines in China (111
Program).
(China's Ministry of Education launched the so-called "111" program
in September
2006, aiming to invite 1,000 world class academics from the world's
top 100
universities to establish 100 innovative research bases in China.)
Jan 2008 -- Dec 2012, renewed to Dec 2015
Best Paper Award, IEEE International Conference on Semantic
Computing (IEEE-ICSC-07).
Nov 2007
Mellon Award for Excellence in Mentoring, awarded by the USC Center
for Excellence
in Teaching (Office of the Provost). University of Southern
California. Apr 2006
Program Committee Honorary Chair, IEEE International Conference on
Natural Language
Processing and Knowledge Engineering (IEEE NLP-KE). Wuhan, China
(2005); Beijing,
China (2007); Beijing, China (2010)
ISI Fellow. USC Information Sciences Institute. Aug 2000
Scientia prize for best science graduate. Rand Afrikaans
University,
Johannesburg, South Africa. Dec 1977
De Beers Undergraduate Scholarship. De Beers Consolidated
Mines. Johannesburg, South
Africa. Jan 1975 -- Dec 1978