Eduard Hovy


Since 2012, Prof. Hovy has been working at the
Language Technologies Institute
Carnegie Mellon University


Please visit his homepage there.




Information Sciences Institute
of the University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695
U.S.A.

tel: +1-310-448-8731
fax: +1-310-823-6714
email: hovy@isi.edu
Projects: http://www.isi.edu/natural-language/nlp-at-isi.html


Positions while at ISI

  • Until September 2012, I was working at the University of Southern California, as a Fellow of its Information Sciences Institute (ISI), as Director of the Human Language Group, as Research Associate Professor in USC's Computer Science Department, and as Director of Research for ISI's Digital Government Research Center (DGRC).
  • Co-Director for Research of the Command, Control, and Interoperability Center for Advanced Data Analysis (CCICADA). Rutgers University heads a consortium of about 20 universities in this Center of Excellence funded by the Department of Homeland Security University Affiliates Center program. Our research focuses on a broad range of topics of interest to data analysis and threat identification
  • Advisory Professor, Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China (Sep 2005 -- Aug 2015)
  • Regular High-Level Visiting Scientist, International Guest Academic Talents (IGAT) Program for the Development of University Disciplines in China (111 Program), China (Jan 2008 -- Dec 2015)
  • Adjunct Associate Professor, School of Computer Science, University of Waterloo, Waterloo, Canada (May 2010 -- Apr 2013)

    Research at ISI until 2012

    Research can be organized into three principal overlapping directions:

    (1) Natural Language Processing / Computational Linguistics / Human Language Technology

  • Development of sophisticated machine reading, information extraction, parsing, and text analysis technology (relevant publications).
    DARPA's DEFT Program (2012--16) and previous Machine Reading Program (2009--14) aim to develop NLP and knowledge representation and reasoning techniques for deeper semantic analysis of text and resultant automated learning of domain information by reading texts in given domains. Prof. Hovy leads our current DEFT project SAFT (Semantic Analysis and Filtering of Text), which includes researchers at Carnegie Mellon University and the University of Southern California's Information Sciences Institute. From 2008--12, Prof. Hovy's groups participated in two of DARPA's MRP teams: RACR (headed by IBM, the team that developed the Watson QA game-playing engine) and ERUDITE (headed by BBN; the OntoNotes corpus was developed as part of this project). The SASO (2004--11) and MRE (2001--04) projects at the Institute for Creative Technologies of the University of Southern California developed virtual humans in virtual reality simulations, employing text-to-semantics parsers and opposite-direction generators developed by Prof. Hovy and students.
  • Development of automated question answering systems (relevant publications).
    Associated with the above are several QA systems developed at ISI, such as Textmap and Webclopedia (with Dr. Daniel Marcu, Dr. Ulf Hermjakob, Dr. Chin-Yew Lin, and others). This work employed information retrieval, clustering, text summarization, parsing, and text harvesting methods described elsewhere.
  • Development of automated text summarization systems and automated summarization evaluation theory and technology (relevant publications).
    Summarization engines developed by Prof. Hovy, Dr. Chin-Yew Lin, and others at ISI include SUMMARIST (single documents), NeATS (multiple documents), and GOSP (producing headlines). Summarization was used in multilingual text access and management systems such as C*ST*RD and MuST. Summarization evaluation systems include ROUGE (2003--04), developed by Dr. Chin-Yew Lin of ISI with Prof. Hovy, and the BE package (2005--08), developed by Dr. Stephen Tratz and Prof. Hovy.
  • Research on various aspects of machine translation and automated MT evaluation systems and technology (relevant publications).
    For MT evaluation, work in 2002--04 includes a systematization of all major machine translation evaluation measures (the FEMTI survey) with Prof. Maghi King and Dr. Andre Popescu-Belis at the University of Geneva, as well as students and researchers at other universities and commercial MT companies. Work on machine translation included development of the Pangloss MT system (1990--94) together with researchers at CMU and New Mexico State University, which helped establish ISI's Gazelle system headed by Dr. Kevin Knight. The NSF-sponsored IL-Annotation project IAMTC (2003--04), joint with researchers at CMU, University of Maryland, MITRE, Columbia University, and New Mexico State University, focused on Interlingua design and text annotation; see under lexical semantics below.
  • Development of sophisticated social media analysis and opinion identification technology (relevant publications).
    Prof. Hovy is working with researchers at ISI in the Social Media project to develop techniques for identifying the persuasiveness and persuadability of participants in online discussions (currently, Wikipedia discussions). This work follows work with Dr. Don Metzler and others at ISI in 2010--11 to recognize important events from analyzing the Twitter stream, and prior research in the MKIDS-ISI project (2002--05) that developed methods to analyze emails for expertise (of people and groups) and relative social status, using topic signature and speech act recognition. Earlier, the Psyop project (2004--08) employed information extraction and sentiment analysis technology to extract from online texts entities, events, beliefs, goals, opinions, and other information of interest, and to compose the results into psychologically informative descriptions of people.
  • Development of theories and systems to perform automated text generation, including multi-sentence text and sentence planners (relevant publications) and single-sentence generators (relevant publications).
    Prof. Hovy worked with researchers at USC's Institute for Creative Technologies to develop a parser and generator for the software agents in the virtual reality simulations SASO (2004--09) and Mission Rehearsal Exercise (2001--04) (this work in collaboration with Dr. David Traum, Dr. Anton Leuski, Dr. David DeVault, and others).
    The project Quick!Help focuses on the generation of tailored recipes for poor people (this work in collaboration with Prof. Peter Clarke and Dr. Susan Evans from USC and Andrew Philpot from ISI). This work relates to language tailoring done earlier in the HealthDoc project with Prof. Chrysanne DiMarco from the University of Waterloo, Canada, and Prof. Graeme Hirst from the University of Toronto. Earlier work focused on the development of discourse relations and planners that employ them to ensure the production of coherent multisentential text. This includes a taxonomization of all available discourse relations collected from various sources (1992) and the RST Text Structurer (1987--92). Prof. Hovy's work in 1987--92 included participation in the Penman sentence generator project with researchers in various countries, to develop the then-largest sentence generator in the world.
    Prof. Hovy's Ph.D. work focused on the development of a text generation program PAULINE that took into account the pragmatic aspects of communication, since the absence of sensitivity toward hearer and context has been a serious shortcoming of generator programs written to date. In general, he is interested in all facets of communication, especially language, as situated in the wider context of intelligent behavior. Related areas include Artificial Intelligence (work on planning and learning), Linguistics (semantics and pragmatics), Psychology, Philosophy (ontologies), and Theory of Computation.
  • Development of theory to address problems in multimedia human-computer communication (relevant publications).
    This work (1989--2002), conducted with Dr. Yigal Arens of ISI and students, focused specifically on the question of dynamic planning and allocation of information to media during presentation design.

    (2) Ontologies, Text Mining/Harvesting, and Lexical Semantics

  • Development of shallow semantic representation notations and tools that support manual annotation of large amounts of text with shallow semantic information (relevant publications).
    The DARPA-funded OntoNotes project (2008--2012), joint with Dr. Ralph Weischedel and Dr. Lance Ramshaw of BBN, Prof. Mitch Marcus of the University of Pennsylvania, and Prof. Martha Palmer of the University of Colorado, focused on the creation of a large corpus of texts in English, Chinese, and Arabic that was annotated with shallow semantic information (word senses and some coreference). The wordsense information was incorporated into the Omega ontology (see below). The NSF-funded IL-Annot project IAMTC (2003--04), joint with researchers at CMU, University of Maryland, MITRE, Columbia University, and New Mexico State University, focused on stepwise Interlingua design and verification by annotation of texts in 7 languages. In both these projects, the Omega ontology (see below) provided the symbol set for semantic annotation.
  • Development of large concept taxonomies/ontologies through a combination of merging together existing ontologies, adding to the knowledge by extracting information from online text (see below), and enriching the interdependency relations by extracting information from dictionaries (relevant publications).
    The Omega ontology, built at ISI since 2003, contains over 120,000 concept terms and several million instances, in addition to various other information, acquired from a variety of sources, including Princeton's WordNet, NMSU's Mikrokosmos, and ISI's earlier ontology SENSUS (1996--2000). During 2008--2011, in the OntoNotes project (see above), a new Upper Model was built for Omega, and its Entities were thoroughly re-organized. Work on Omega has been performed by Prof. Hovy in collaboration with Mr. Andrew Philpot, Dr. Patrick Pantel, Mr. Michael Fleischman, and Dr. Jerry Hobbs from ISI.
  • Development of techniques to extract large amounts of instance- and concept-level information from online text (relevant publications).
    At ISI, Dr. Zornitsa Kozareva and Prof. Hovy developed the Double-Anchored Pattern (DAP) text harvesting technique and demonstrated its effectiveness for collecting terms and relations, and for organizing them hierarchically, over large amounts of domain texts. (This work was partially done in collaboration with Prof. Ellen Riloff from the University of Utah.)
    In several earlier projects since 1996, Prof. Hovy, students, and collaborators developed a series of text mining and information extraction engines, and built collections comprising several million facts (about people, locations, objects, etc.). This information, stored in a database, was in many cases connected to the Omega ontology (see above). The Learning by Reading and Mobius (2005--08) experiments attempted to combine tagging, parsing, semantic analysis, and inference techniques to create a knowledge base automatically from a high school textbook of Chemistry and from texts about the heart and engines, and to answer high school-level test questions about them.

    (3) Digital Government

  • Analysis of the nature of information processing in government to recognize opportunities to deploy IT for effective Digital Government (relevant publications).
    The development of systems to automatically find alignments or aliases across and within databases (2003--06). The SiFT system used mutual information technology to detect patterns in the distribution of data values. The government partner in this NSF-funded project was the Environmental Protection Agency (EPA), which provided databases with air quality measurement data. (This work was done at ISI with Mr. Andrew Philpot and Dr. Patrick Pantel.)
  • Development of sophisticated text analysis of public commentary, such as email, letters, and reports, delivered to the government (relevant publications).
    Several projects from 2000--07 addressed the problem faced by government regulation writers, who regularly receive tens to hundreds of thousands of emails and other comments from the public about proposed regulations. Funded by the NSF, the eRule projects were a collaboration between Prof. Stuart Shulman (a political scientist then at the University of Pittsburgh and the University of Massachusetts Amherst), Prof. Jamie Callan (a computer [...]. [...] program to develop ICT for city-to-citizen interaction. Work at ISI focused on the development of a system to classify emails and extract speech acts, opinions, and stakeholders in German.
  • Development of systems to access multiple heterogeneous databases (relevant publications).
    Funded by the NSF (1999--2003), a series of projects addressed the problem that many government agencies face: their data is distributed in various formats over different databases, and evolves to include slightly different variations over the years. Our EDC and AskCal systems provided access to over 50,000 tables of information about gasoline, produced by various Federal Statistics agencies, including the Census Bureau, the Bureau of Labor Statistics, and the Energy Information Administration. The system included a large ontology and a natural language question interpreter. This work was done at ISI in collaboration with Mr. Andrew Philpot and Dr. Jose-Luis Ambite. External partners in this project were the DGRC team at Columbia University, New York, headed by Dr. Judith Klavans.

    Work Experience

  • Associate Research Professor (Sep 2012 --) at the Language Technologies Institute of Carnegie Mellon University, Pittsburgh, PA
  • Division Director (2011 -- 2012), ISI Fellow (Aug 2000 -- 2012), Deputy Division Director (Oct 2002 -- 2011), Senior Project Leader (May 1997 -- 2002), Project Leader (Jul 1989 -- Apr 1997), and Computer Scientist (Mar 1987 -- Jun 1989), Information Sciences Institute of the University of Southern California, Los Angeles, CA
  • Research Associate Professor (Nov 1999 --) and Research Assistant Professor (Dec 1989 -- Oct 1999), Department of Computer Science, University of Southern California, Los Angeles, CA
  • Co-Director (May 1999 -- Jun 2005), Master's Degree Program in Computational Linguistics, University of Southern California, Los Angeles, CA
  • Advisory Professor (Oct 2005 -- Aug 2015), Beijing University of Posts and Telecommunications, Beijing, China
  • Adjunct Associate Professor (Feb 1997 -- Jan 2003) and (May 2010 -- Apr 2013), School of Computer Science, University of Waterloo, Waterloo, Canada
  • Concurrent Professor (Oct 2008 -- Sep 2011), Department of Computer Science, Northeastern University, Shenyang, China
  • Adjunct Professor (Sep 2005 -- Aug 2012), Department of Computer Science, KAIST, Daejeon, Korea

    Honors

  • Ph.D. honoris causa, National University of Distance Education (UNED), Madrid, Spain. Jan 2013
  • ACL Fellow, one of the original 17 Fellows of the Association for Computational Linguistics. Dec 2011
  • Regular High-Level Visiting Scientist, International Guest Academic Talents (IGAT) Program for the Development of University Disciplines in China (111 Program). (China's Ministry of Education launched the so-called "111" program in September 2006, aiming to invite 1,000 world class academics from the world's top 100 universities to establish 100 innovative research bases in China.) Jan 2008 -- Dec 2012, renewed to Dec 2015
  • Best Paper Award, IEEE International Conference on Semantic Computing (IEEE-ICSC-07). Nov 2007
  • Mellon Award for Excellence in Mentoring, awarded by the USC Center for Excellence in Teaching (Office of the Provost). University of Southern California. Apr 2006
  • Program Committee Honorary Chair, IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE). Wuhan, China (2005); Beijing, China (2007); Beijing, China (2010)
  • ISI Fellow. USC Information Sciences Institute. Aug 2000
  • Scientia prize for best science graduate. Rand Afrikaans University, Johannesburg, South Africa. Dec 1977
  • De Beers Undergraduate Scholarship. De Beers Consolidated Mines. Johannesburg, South Africa. Jan 1975 -- Dec 1978
