BEGIN:VCALENDAR
CALSCALE:GREGORIAN
PRODID:-//NL//Seminar Calendar//EN
VERSION:2.0
X-WR-CALNAME:NL
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030801T160000
DTSTART:20030801T150000
LOCATION:11 Large
SUMMARY:Toward deciphering the 2-dimensional ancient Luwian script by discovering its writing order [Shou-de Lin]
UID:20030801T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The presentation will give an overview of the SMT activities at the
 Language Technologies Institute, Carnegie Mellon University, in large
 vocabulary text translation tasks, esp. the Chinese-English and
 Arabic-English, as well as in limited domain speech-to-speech translation
 tasks.  The CMU SMT system is, like most modern statistical MT systems,
 based on phrase translation.  Several approaches have been developed to
 extract the phrase pairs from parallel corpora and current research
 investigates different scoring approaches for these translation pairs.
 Details of the decoder, esp. on hypothesis recombination, pruning, and
 efficient n-best list generation will be given.  Recently, the SMT system
 has been extended to use partial translations generated from example-based
 and grammar-based translation systems, thereby performing multi-engine
 machine translation.
 
 Bio:
 
 Stephan Vogel is a researcher at the Language Technologies Institute,
 Carnegie Mellon University, where he heads the statistical machine
 translation team.  He received a Diploma in Physics from Philipps
 University Marburg, Germany, and a Master of Philosophy from the
 University of Cambridge, England.  After working for a number of years on
 the history of science, he turned to computer science, especially natural
 language processing.  Before coming to CMU, he worked for several years at
 the Technical University of Aachen on statistical machine translation, and
 also in the Interactive Systems Lab at the University of Karlsruhe.
 

DTEND:20040402T160000
DTSTART:20040402T150000
LOCATION:11 Large
SUMMARY:The CMU Statistical Machine Translation System [Stephan Vogel]
UID:20040402T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present work that extends the standard hidden Markov model to a
 version that can emit multiple symbols in a single time step.  Using this
 model, we are able to automatically create phrase-to-phrase mappings in an
 alignment process.  I've applied this model to the task of creating
 alignments between documents and their human-written abstracts, yielding
 an overall alignment F-score of 0.548, a significant improvement on the
 best results to date of 0.363.  These results are published in an EMNLP
 paper this year, but the talk will be an extended version of the talk I
 will give there (namely, I will discuss the mechanics of the extended HMM
 in more detail in this seminar).
 

DTEND:20040702T150000
DTSTART:20040702T133000
LOCATION:11 Large
SUMMARY:A Phrase-Based HMM Approach to Document/Abstract Alignment [Hal Daume III]
UID:20040702T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present an approach to automatically extracting paraphrase templates
 from document/abstract pairs. This methodology relies on word-based
 alignments created by off-the-shelf software. Our paraphrases are
 evaluated by human evaluators for precision and automatically for
 applicability. We find that 77% of the extracted paraphrases are judged
 to be always correct, that 60% of the generalized templates are judged
 to be applicable most of the time, and that 87% are judged to be
 applicable sometimes.
 

DTEND:20030502T160000
DTSTART:20030502T150000
LOCATION:11 Large
SUMMARY:Acquiring Paraphrase Templates from Document/Abstract Pairs [Hal Daumé III]
UID:20030502T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20031002T170000
DTSTART:20031002T160000
LOCATION:11 Large
SUMMARY:TBA [Ana-Maria Popescu]
UID:20031002T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic word alignment plays a critical role in statistical machine
 translation. Unfortunately the relationship between alignment quality and
 statistical machine translation performance has not been well understood.
 In the recent literature the alignment task has frequently been decoupled
 from the translation task, and assumptions have been made about measuring
 alignment quality for machine translation which, it turns out, are not
 justified. In particular, none of the tens of papers published over the
 last five years has shown that significant decreases in Alignment Error
 Rate (AER) result in significant increases in translation quality. I will
 explain this state of affairs and present steps towards measuring
 alignment quality in a way which is predictive of statistical machine
 translation quality.
 
 I will also provide a brief overview of some of my other work on training
 and search for word alignment.
 

DTEND:20060203T163000
DTSTART:20060203T150000
LOCATION:11 Large
SUMMARY:Measuring Word Alignment Quality for Statistical Machine Translation [Alex Fraser]
UID:20060203T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (note: this is a very tentative title -- comments welcome!)
 
 We present a novel extension of syntax-directed translation for
 statistical MT. Formally speaking, our model is based on tree-to-string
 transducers that recursively convert a parse-tree in the source-language
 into a string in the target-language. These transduction rules have
 multi-level trees on the source-side, giving this system more
 transformational power due to the extended domain of locality. We also
 present efficient algorithms for decoding based on dynamic programming.
 Initial experiments on English-to-Chinese translation show promising
 results in both speed and translation quality.
 
 Joint work with Kevin Knight and Aravind Joshi.
 
 Bio:
 
 Liang Huang is a 3rd-year PhD student from the University of Pennsylvania.
 He is mainly interested in algorithms and formalisms for parsing and
 syntax-based machine translation. His recent work has been on k-best
 parsing algorithms (with David Chiang) and synchronous binarization for MT
 (with Hao Zhang, Dan Gildea, and Kevin Knight).

DTEND:20060303T163000
DTSTART:20060303T150000
LOCATION:11th Floor (Large)
SUMMARY:Syntax-Directed Translation with Extended Domain of Locality [Liang Huang (Penn)]
UID:20060303T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I would like to talk about some of the things I did during the last 
 year. I will discuss and demonstrate CuSTaRD, a cross-lingual 
 information retrieval, organization, summarization, and visualization 
 system that was built for the Surprise Language exercise. I will focus 
 in more detail on iNeATS, the interactive multi-document summarization 
 part of CuSTaRD. The other project I plan to present is eArchivarius, a 
 system for accessing collections of electronic mail.
 

DTEND:20031003T160000
DTSTART:20031003T150000
LOCATION:11 Large
SUMMARY:A Year in Paradise [Anton Leuski]
UID:20031003T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We will present the results of the 2003 Johns Hopkins University
 Summer Workshop on "Syntax for Statistical Machine Translation".
 
 We will describe a large effort to extend a high-performing
 phrase-based MT system as baseline by adding new features representing
 syntactic knowledge that deal with specific problems of the underlying
 baseline. We investigate a broad range of possible feature functions,
 from very simple binary features to sophisticated tree-to-tree
 translation models. Simple feature functions test if a certain
 constituent occurs in the source and the target language parse
 tree. More sophisticated features will be derived from an alignment
 model where whole sub-trees in source and target can be aligned node
 by node. We present results on the Chinese-English large data track of
 the recent TIDES MT evaluations.
 
 This is joint work with the other workshop team members: Daniel
 Gildea, Anoop Sarkar, Sanjeev Khudanpur, Kenji Yamada, Libin Shen,
 Shankar Kumar, David Smith, Viran Jain, Katherine Eng, Jin Zhen and
 Dragomir Radev.
 
 See http://www.clsp.jhu.edu/ws03/groups/translate/ for more.
 

DTEND:20030903T160000
DTSTART:20030903T150000
LOCATION:11 Large
SUMMARY:JHU MT Workshop [Alex Fraser and Franz Och]
UID:20030903T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will present my current work on language understanding
 in the Mission Rehearsal Exercise (MRE) project. One of the challenges
 in a dialogue system is to provide a robust understanding/parsing
 component. We applied both a Finite State Model and a Statistical Learning
 Model to the parsing of individual sentences in dialogue utterances.
 Their performance is evaluated and compared using a new blind set,
 and we hope to combine them into a better solution for this
 specific application.
 

DTEND:20030404T160000
DTSTART:20030404T150000
LOCATION:11 Large
SUMMARY:Natural Language Understanding in MRE [Donghui Feng]
UID:20030404T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Test collections for information retrieval tasks have traditionally
 assumed that what we are searching for are documents (e.g., Web pages,
 news stories, or academic documents).  Most information that is generated
 is, however, not originally generated as part of a document, but rather
 as what we might refer to as "conversational media" (e.g., email, speech,
 or instant messaging).  In this talk, I'll describe the creation of two
 test collections for conversational media, an email collection being
 created in the TREC Enterprise Search track and a spoken word test
 collection for the Cross-Language Evaluation Forum (CLEF).  I'll spend
 most of the talk describing the details of the CLEF test collection,
 illustrating the issues with some of the results that we have obtained
 from our experiments with that collection.  I'll conclude with a few
 remarks about the implications of what we are learning for DARPA's new
 GALE program.  This is joint work with Charles University, the IBM TJ
 Watson Research Center, the Johns Hopkins University, the Survivors of the
 Shoah Visual History Foundation, and the University of West Bohemia.
 
 
 About the speaker:
 
 Douglas Oard is an Associate Professor at the University of Maryland,
 College Park, with a joint appointment in the College of Information
 Studies and the Institute for Advanced Computer Studies.  He holds a Ph.D.
 in Electrical Engineering from the University of Maryland, and his
 research interests center around the use of emerging technologies to
 support information seeking by end users.  In 2002 and 2003, Doug spent a
 year in paradise here at USC-ISI.  His recent work has focused on
 interactive techniques for cross-language information retrieval and on
 searching conversational text and speech.  Additional information is
 available at http://www.glue.umd.edu/~oard/.

DTEND:20050805T163000
DTSTART:20050805T150000
LOCATION:11 Large
SUMMARY:The CLEF Cross-Language Speech Retrieval Test Collection [Doug Oard (Maryland)]
UID:20050805T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This talk occurs in the morning on the same day as the Bayesian tutorial.)
 
 The goal of our research is to support cooperative work performed by
 stakeholders sitting around a table. To support such cooperation, various
 table-based systems with a shared electronic display on the tabletop have
 been developed. These systems, however, share a common problem: not all
 stakeholders can read shared information such as text and images equally
 well, because its orientation is unfavorable from some viewing angles. To
 solve this problem, we propose the Lumisight Table, a system that can
 simultaneously display personalized information in each required direction
 on a single horizontal screen by multiplexing the views, and that can
 capture stakeholders' gestures for manipulating the information.
 
 About the Speaker:
 
 Mitsunori Matsushita is a research scientist at NTT Communication Science
 Labs., Nippon Telegraph and Telephone Corporation (NTT). He received B.E.,
 M.E., and Dr.E. degrees from Osaka University, in 1993, 1995 and 2003
 respectively. In 1995, he joined NTT, and has been engaged in research
 on natural language understanding, information visualization, and
 interaction design.
 

DTEND:20050622T240000
DTSTART:20050622T110000
LOCATION:11 Large
SUMMARY:Lumisight Table: A Face-to-face Collaboration Support System That Optimizes Direction of Projected Information to Each Stakeholder [Mitsunori Matsushita]
UID:20050622T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present our approach to identifying an argument structure, defined as a
 simple hierarchical structure of claim and reasons.  The claim is also
 classified into "in favor of" or "against" the topic. The experiment is
 performed on the comments from the general public sent to government
 officials in response to proposed regulations.
 
 

DTEND:20060505T163000
DTSTART:20060505T150000
LOCATION:11 Large
SUMMARY:Recognizing Argument Structures in Texts [Namhee Kwon]
UID:20060505T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The ABC (Assess by Computer) system has been developed and used in the
 School of Computer Science at the University of Manchester for formative
 and (principally) summative assessment at undergraduate and postgraduate
 level. We believe that fully automatic marking of constructed answers -
 especially free text answers - is not a sensible aim. Instead - drawing on
 parallels in the history of machine translation - we take a
 "human-computer collaborative" approach, in which the system does what it
 can to support the efficiency and consistency of the human marker, who
 keeps the final judgement.
 
 Our current work focuses on what are generally referred to as "short text
 answers" as contrasted to "essays". However we prefer to contrast
 "factual" with "discursive" answers, and speculate that the former may be
 amenable to simple statistical techniques, while the latter require more
 sophisticated natural language analysis. I will show some examples of real
 exam data and the techniques we are using and developing to handle them.
 

DTEND:20041105T163000
DTSTART:20041105T150000
LOCATION:11 Large
SUMMARY:A Human-Computer Collaborative Approach to Computer Aided Assessment [Mary Wood (Manchester)]
UID:20041105T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A major hurdle in building automated information retrieval systems for
 Hindi text is the lack of a uniform encoding for text representation.
 Standards do exist, but no one seems interested. Every web content
 publisher seems to have their own encoding system, making information
 extraction a nightmare. We explore an unsupervised approach to
 convert any given "unknown" encoding to UTF-8, by treating it as a
 decipherment problem. We also study how a small amount of supervision
 can improve decoding accuracy.
 

DTEND:20030905T160000
DTSTART:20030905T150000
LOCATION:11 Large
SUMMARY:Deciphering Hindi Scripts [Nishit Rathod and Anish Nair]
UID:20030905T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Information retrieval using word senses is emerging as a good research
 challenge on semantic information retrieval. In this presentation, I am
 going to propose a new method using word senses in information retrieval:
 root sense tagging method. This method assigns coarse-grained word senses
 defined in WordNet to query terms and document terms in an unsupervised way
 using co-occurrence information constructed automatically. The sense
 tagger is crude, but performs consistent disambiguation by considering
 only the single most informative word as evidence to disambiguate the
 target word. We also allow multiple-sense assignment to alleviate the
 problem caused by incorrect disambiguation.
 
 Experimental results on a large-scale TREC collection show that the
 proposed approach successfully improves retrieval effectiveness, while
 most previous work failed to improve performance even on small
 text collections. The proposed method also shows promising results when it
 is combined with pseudo relevance feedback and a state-of-the-art
 retrieval function such as BM25.
 

DTEND:20040806T163000
DTSTART:20040806T150000
LOCATION:11 Large
SUMMARY:Information Retrieval using Word Senses: Root Sense Tagging Approach [Hae-Chang Rim]
UID:20040806T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We propose a theory that gives formal semantics to word-level
 alignments defined over parallel corpora. We use our theory to
 introduce a linear algorithm that can be used to derive from
 word-aligned, parallel corpora the minimal set of syntactically
 motivated transformation rules that explain human translation data.
 
 (joint work with Michel Galley, Kevin Knight, and Daniel Marcu)
 

DTEND:20040206T160000
DTSTART:20040206T150000
LOCATION:11 Large
SUMMARY:What's in a Translation Rule? [Mark Hopkins]
UID:20040206T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic Natural Language applications often require the processing of
 structured data. Traditional machine learning approaches attempt to
 represent structured syntactic/semantic objects by means of flat feature
 representations, i.e. attribute-value vectors. This raises two problems:
 
 1. There is no well defined theoretical motivation for such a feature model.
 Structural properties may not fit in any flat feature representation.
 
 2. To define effective flat features, a deep knowledge about the
 linguistic phenomenon is required.
 
 Kernel methods for Natural Language Processing aim to solve both the above
 problems as kernel functions can be used to define similarities between
 linguistic objects without explicitly defining the target feature space.  
 In this way, a linguistic phenomenon can be modeled at a more abstract
 level where the modeling is easier. Such a property is extremely useful when
 the representation of linguistic phenomena is still not well understood.
 For example, the feature design of semantic role labeling appears to be
 quite complex, since several non-definitive feature sets have been
 proposed.
 
 As a viable alternative to manual feature design, kernel methods propose
 two steps: (1) they generate all substructures of the target
 syntactic/semantic structures and (2) they let the learning algorithm
 (e.g. Support Vector Machines) select the most relevant substructures.
 In this talk, we (1) introduce the PropBank and FrameNet predicate
 argument structures, (2) present the standard approaches to the automatic
 labeling of semantic roles and (3) show advanced semantic role labeling
 models based on kernel methods.
 
 About the speaker:
 
 Alessandro Moschitti is a researcher at the Computer Science Department of
 the University of Rome "Tor Vergata". In 1998 he received his master's degree
 in Computer Science at the University of Rome "La Sapienza". In 2003 he
 finished his PhD in Computer Science at "Tor Vergata" University.  
 Between 2002 and 2004 he worked as an associate researcher at the
 University of Texas at Dallas. His research interests concern machine
 learning approaches for Natural Language Processing and Information
 Retrieval. His deep expertise relates to automated text categorization and
 semantic role labeling.  Recently, he has devised new kernels which enable
 Support Vector and other kernel-based machines to carry out advanced
 semantic processing.
 
 

DTEND:20050706T153000
DTSTART:20050706T140000
LOCATION:11 Large
SUMMARY:Kernel Methods for Semantic Role Labeling [Alessandro Moschitti (Rome)]
UID:20050706T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will give a status report on my work on information extraction during
 the last 10 months. The motivation of this work is to learn extraction
 patterns automatically using a seed template and a web search engine. My
 approach is to generate linguistic patterns and surface patterns and
 combine them to compensate for the respective weaknesses of the two
 pattern types. On DUC01-test-disasters (67 documents) and
 DUC01-training-disasters (54 documents), I obtained F-measures of 0.34
 and 0.26, respectively. In this talk, I will give a status report on the
 ReAD project (with Dr. Chin-Yew Lin).
 

DTEND:20030207T160000
DTSTART:20030207T150000
LOCATION:11 Large
SUMMARY:Automatic Pattern Learning for Information Extraction using Web Data [Jeongwon Cha]
UID:20030207T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Text-to-text applications -- Machine Translation, Summarization, Question
 Answering -- do not usually involve generic Natural Language Generation
 (NLG) systems in their generation components, but rather use
 application-specific algorithms. The main reason for this state of affairs
 is that virtually all the formalisms used by current generic NLG systems
 require information that cannot be reliably extracted from unrestricted
 text.
 
 This thesis proposal is about meeting the demand for natural language
 generation in the context of text-to-text applications. I introduce a new
 representation formalism (WIDL-expressions), propose generation algorithms
 that operate on representations specific to this formalism, and discuss a
 generic sentence realization framework for text-to-text applications. The
 generation mechanism is based on algorithms for intersecting
 WIDL-expressions with probabilistic language models. I present both
 theoretical and empirical results concerning the correctness and
 efficiency of these algorithms. I also discuss the practical aspects
 arising from implementing this generation mechanism.
 
 In a concrete application of the proposed generation mechanisms, I present
 an end-to-end Machine Translation application. I also discuss another
 possible application for Automated Summarization, namely automated
 headline generation.
 

DTEND:20050707T163000
DTSTART:20050707T150000
LOCATION:11 Small
SUMMARY:Natural Language Generation for Text-to-Text Applications Using an Information-Slim Representation [Radu Soricut]
UID:20050707T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Our contextual inquiry into the practices of oral historians
 unearthed a curious incongruity. While oral historians consider interview
 recordings a central historical artifact, these recordings sit unused
 after a written transcript is produced. We hypothesized that this is
 largely because books are more usable than recordings. Therefore, we
 created Books with Voices: bar-code augmented paper transcripts enabling
 fast, random access to digital video interviews on a PDA. We present
 quantitative results of an evaluation of this tangible interface with 13
 participants. They found this lightweight, structured access to original
 recordings to offer substantial benefits with minimal overhead. Oral
 historians found a level of emotion in the video not available in the
 printed transcript. The video also helped readers clarify the text and
 observe nonverbal cues.
 
 http://guir.berkeley.edu/oral-history/
 

DTEND:20030307T160000
DTSTART:20030307T150000
LOCATION:11 Large
SUMMARY:Books with Voices: Paper Transcripts as a Tangible Interface to Oral Histories [Scott Klemmer]
UID:20030307T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 

DTEND:20050408T163000
DTSTART:20050408T150000
LOCATION:11 Large
SUMMARY:Search Engines for HLT Applications [Jamie Callan (CMU)]
UID:20050408T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Inversion Transduction Grammar (ITG) of Wu (1997) generates a
 synchronous parse tree for a given pair of sentences in two languages. By
 allowing inversion of the order of children at any level of the
 synchronous parse tree, ITG can do recursive, systematic word reordering.
 We made a version of ITG where the nonterminals are lexicalized by word
 pairs and the inversions are dependent on the so-lexicalized nonterminals.  
 We found that after lexicalization, the Alignment Error Rate (AER)
 against the gold standard is reduced for short sentences. ITG parsing
 complexity is a high-degree polynomial. We proposed a pruning technique that
 utilizes IBM Model 1 to estimate the inside and outside probability of a
 bitext cell. Taking a step further, we applied A* parsing, previously
 used for monolingual parsing, to ITG.  I will talk about the heuristic
 estimates we used for A* parsing for Viterbi alignment selection and
 decoding.
 

DTEND:20050608T163000
DTSTART:20050608T150000
LOCATION:4th floor
SUMMARY:Lexicalization and A* Searching for Inversion Transduction Grammar [Hao Zhang (Rochester)]
UID:20050608T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: An interesting (disturbing?) new trend is beginning to manifest itself in
 NLP, one that is focused on performance and hence very attractive in the
 context of inter-system competitive evaluations such as TREC and DUC, but
 one that does not provide much insight about language or NLP methods to
 the researcher interested in these topics.  This addition of a new
 paradigm to NLP has implications for all of us.
 

DTEND:20040409T163000
DTSTART:20040409T150000
LOCATION:11 Large
SUMMARY:Three (and a half?) Trends: The Future of NLP [Eduard Hovy]
UID:20040409T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Justin Busch:
 Weight and Semantic Class Issues in Japanese Noun Phrase Ordering
 
 Many current designs for automatic parsers learn probabilities for the
 relative frequencies of parts-of-speech and syntactic rules, and this has
 proven to be generally reliable. In spite of the ubiquity of probabilistic
 techniques for parsing, however, little attention has been given to the
 linguistic significance of the probabilistic data and what it might say
 about human performance.
 
 Hawkins proposes a general theory of grammaticalization based on the
 minimization of syntactic domains. Given that a sentence of any language
 will contain at least one noun phrase, one verb, and possibly additional
 noun phrases and prepositional phrases, "minimize domains" suggests that
 these phrases will order themselves according to whichever pattern
 requires the least effort to recognize the higher syntactic structure of
 the sentence. These effects are directly measurable through corpus
 statistics, and can be interpreted as potential heuristics for
 probabilistic parsers.  In this study, we examine Japanese data from the
 Kyoto Treebank and test Hawkins' predictions for noun phrase ordering by
 noun phrase weight as well as by generic semantic types. The discussion
 will focus primarily on how accurately Hawkins' predictions are reflected
 in the corpus statistics, and will conclude with observations about how
 they might be applied to the decision mechanisms of probabilistic parsers.
 
 --------------------------------------------------------------------------
 
 Hai Huang:
 TBA
 
 --------------------------------------------------------------------------
 
 Jens Stephan:
 Evaluation and Visualization of a Dialogue System
 
 Evaluations have become a necessary standard for almost any type of
 research. However, there are many areas where there is no common agreement
 on how to evaluate, which is the case for the complex problem of evaluating
 dialogue systems. The evaluation of the multi-party, multimodal dialogue
 system MRE provides a good example of what questions are important for
 such an evaluation, how to actually do the evaluation, and finally how to
 make specific problems of the system visible so that the evaluation
 results can be used to improve the system's performance.
 
 After a brief introduction of the MRE domain and architecture, I will
 break the task down into a set of general evaluation questions. From there I
 will explain what kinds of metrics and visualizations are suited to answer
 those questions and what kind of data is needed, as well as how that data
 was obtained. Along the road, examples of actual system problems and
 performances will be presented. The topics of data formatting and
 visualization will receive some special attention by introducing the MRE
 Evaluation Toolkit as well as the corpus it operates on.
 
 --------------------------------------------------------------------------
 
 Chen-kang Yang:
 Using the Omega Ontology to Determine Selectional Restrictions for Word Sense Disambiguation
 
 Word sense disambiguation is fundamental for language processing. Though
 purely statistical methods are effective for this task, they neglect the
 syntactic and semantic aspects. In this study, we adopt a hybrid approach
 by applying an unsupervised machine learning method to learn verbs'
 selectional restrictions on their subjects/objects. The system then uses
 these learned selectional restrictions for word sense disambiguation of
 the subjects/objects. Instead of words, the training data contains
 ontological taxonomy hierarchies that are retrieved from the Omega
 ontology. Unlike other similar systems, we are able to automatically find
 the best match among classes from different levels of the ontology. This
 provides us with more flexibility and is closer to human instinct. Our system
 performs better than other similar systems, though it still needs
 cooperating methods for better results.

DTEND:20040809T163000
DTSTART:20040809T150000
LOCATION:11 Large
SUMMARY:CL Student Presentations [Justin Busch, Hai Huang, Jens Stephan & Chen-kang Yang]
UID:20040809T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll give a survey of trees and grammars, at least the parts that seem
 most relevant to ongoing work at ISI.  This will be a theory talk.  I'll
 start with context-free grammars, which were developed in the 1950s, and
 cover other tree-generating systems.  I'll also talk about
 tree-transforming systems.

DTEND:20040709T163000
DTSTART:20040709T150000
LOCATION:11 Large
SUMMARY:Survey of Trees and Grammars [Kevin Knight]
UID:20040709T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: For ten days in March, nine research teams worked together to build
 Cebuano language resources and systems for a "dry run" of the TIDES Surprise
 Language experiment. Cebuano is spoken widely in the southern
 Philippines, but there had previously been little work on computational
 linguistics for that language. As we prepare for the actual Surprise
 Language experiment this June, we will use this talk to look back on what
 worked, what didn't, and what lessons there are to be learned from our
 experience in March. Come prepared to share the excitement, offer your
 ideas, and understand why we have tried to ask Ed to cancel all vacations
 during the month of June (just kidding...).
 

DTEND:20030509T160000
DTSTART:20030509T150000
LOCATION:11 Large
SUMMARY:Coping with Surprise: The Case of Cebuano [Doug Oard]
UID:20030509T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: ISI's Tactical Language Project is a system designed to teach Americans
 how to speak Arabic through a video game environment. We've taken an FPS
 engine (Unreal 2003) and redone the graphics so it looks like you're in a
 typical Lebanese village. We took away the guns, added speech recognition,
 and set the player in the middle of it all. The theory is that if you
 learn well in a classroom, you'll perform well in a classroom, but if you
 learn well in a pseudo-naturalistic environment, you'll perform better in
 real life.
 
 In a pedagogical context, speech recognition is hard: we're trying to
 recover signal from noisy language-learner speech--with all of its
 mispronunciations, disfluencies, and grammatical errors. Language
 understanding is hopeless unless you have a good approximation of what
 kinds of mistakes learners make, and you can build a system to anticipate
 them.
 
 Suppose an English language learner says "Water". Is he asking you for
 water? Is he telling you there's a puddle in front of you? Is he saying
 his name is "Walter", but with horrible pronunciation? There's a lot of
 ambiguity involved. In order to disambiguate, we need to look at the
 speech signal itself, the utterance's context, the learner's past language
 performance, and details about the learner's mother language as it relates
 to English, etc., etc... Only then can we hope to guess what the learner
 is actually trying to say.
 
 And then, of course, once we've made a good guess at the learner's speech
 intentions, what do we do about it? How do we correct him? How do we
 balance the consideration of inherent qualities of learner motivation,
 language errors, learning objectives, and possibly low-confidence speech
 recognition, as we generate good pedagogical feedback?
 
 This is NLP (primarily statistical) with a bit of pedagogy theory and
 linguistic (SLA and phonology) theory sprinkled in.
 

DTEND:20041210T163000
DTSTART:20041210T150000
LOCATION:11 Large
SUMMARY:Developing a Language Model for Second Language Learner Speech [Nick Mote]
UID:20041210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Arabic language exhibits diglossia, i.e., the coexistence of two forms
 of language, a variety with standard orthography and sociopolitical clout
 which is not natively spoken by anyone (Modern Standard Arabic, MSA) and
 varieties that are primarily spoken and lack writing standards (Arabic
 dialects). There are important resources currently available for MSA with
 much on-going NLP work; for example, there is an Arabic Treebank and
 several syntactic parsers for MSA.  However, Arabic dialect resources and
 NLP research are still in their infancy. I will present work done at
 the Johns Hopkins CLSP Summer Workshop on parsing of Arabic dialects, in
 particular, Levantine Arabic.  We have experimented with three approaches
 to leveraging MSA resources to create a parser for Levantine Arabic, as
 well as methods for induction of MSA-Levantine translation lexicons and a
 Levantine part-of-speech tagger. Using these methods we obtain error
 reductions of up to 15% compared with applying an MSA parser directly to
 Levantine text.
 
 Rambow et al. Parsing Arabic Dialects: Final Report. Johns Hopkins
 University Center for Language and Speech Processing Workshop 2005.  
 http://www.clsp.jhu.edu/ws2005/groups/arabic/documents/finalreport.pdf
 
 Chiang et al. Parsing Arabic Dialects. To appear in Proc. EACL 2006.
 
 This is joint work with O. Rambow, M. Diab, N. Habash, R. Hwa, K. Sima'an,
 V. Lacey, R. Levy, C. Nichols and S. Shareef.

DTEND:20060210T160000
DTSTART:20060210T150000
LOCATION:11 Large
SUMMARY:Parsing Arabic Dialects [David Chiang]
UID:20060210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We discuss the relevance of k-best parsing to recent applications in
 natural language parsing, and develop algorithms that substantially
 improve on previously-used algorithms with respect to efficiency,
 scalability, and accuracy. We demonstrate these algorithms in experiments
 on Bikel's implementation of Collins' lexicalized PCFG model, and on a
 synchronous CFG based decoder for statistical machine translation. We show
 in particular how the improved output of our algorithms has the potential
 to improve results from parse reranking systems and other applications.
 
 In this talk, I will demonstrate the convergence of several popular
 parsing formalisms (weighted deduction, shared forest, semiring) under the
 powerful hypergraph formalism. If time permits, I will also show how
 generic Dynamic Programming can be formalised as hypergraph searching.
 
 Joint work with David Chiang (University of Maryland)
 
 
 

DTEND:20050610T163000
DTSTART:20050610T150000
LOCATION:11 Large
SUMMARY:Better k-best Parsing, Hypergraphs and Dynamic Programming [Liang Huang (Penn)]
UID:20050610T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We revisit the idea of history-based parsing, and present a history-based
 parsing framework that strives to be simple, general, and flexible.  We
 also provide a decoder for this probability model that is linear-space,
 optimal, and anytime.  A parser based on this framework, when evaluated on
 Section 23 of the Penn Treebank, compares favorably with other
 state-of-the-art approaches, in terms of both accuracy and speed.
 
 

DTEND:20060310T163000
DTSTART:20060310T150000
LOCATION:10th Floor
SUMMARY:Exploring the Potential of Intractable Parsers [Mark Hopkins]
UID:20060310T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This is a practice run for a talk I will give a few times over the next
 weeks when interviewing for job positions.)
 
 I will review the state of the art in statistical machine translation
 (SMT), present my dissertation work, and sketch out the research
 challenges of syntactically structured statistical machine translation.
 
 The currently best methods in SMT build on the translation of phrases (any
 sequences of words) instead of single words. Phrase translation pairs are
 automatically learned from parallel corpora. While SMT systems generate
 translation output that often conveys a lot of the meaning of the original
 text, it is frequently ungrammatical and incoherent.
 
 The research challenge at this point is to introduce syntactic knowledge
 to the state of the art in order to improve translation quality. My
 approach breaks up the translation process along linguistic lines. I will
 present my thesis work on noun phrase translation and ideas about clause
 structure.
 

DTEND:20031010T160000
DTSTART:20031010T150000
LOCATION:11 Large
SUMMARY:Advances in Statistical MT: Phrases, Noun Phrases and Beyond [Philipp Koehn]
UID:20031010T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This summer we held a three-month workshop on syntax-driven machine
 translation, in which we learned syntactic transformations automatically
 from Chinese/English translated corpora and applied them to translate new
 text.  We'll give a progress report!
 

DTEND:20040910T163000
DTSTART:20040910T150000
LOCATION:11 Large
SUMMARY:About Syntax Fest 2004 (Part I) [Various]
UID:20040910T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Bilingual term lists have proven to be a useful basis for
 dictionary-based Cross-Language Information Retrieval (CLIR), but
 there is ample anecdotal evidence that differences in vocabulary
 coverage can have a substantial impact on retrieval effectiveness.
 This issue has recently been explored using ablation studies in which
 progressively smaller term lists were synthesized using sampling
 techniques. The ablation techniques used in those studies have not,
 however, been validated using real term lists. In this talk I will
 report the results of what we believe is the first large coverage
 study to use naturally occurring term lists. Thirty-five bilingual term
 lists were obtained from a variety of sources, each with English as
 one of the two paired languages. From these, we created 35
 English-to-English term lists by taking each term that was present in
 the English side of the list as its own translation. When used with
 an English information retrieval test collection, this allowed us to
 measure the reduction in retrieval effectiveness that could be
 attributed to deficiencies in the coverage of English terms. Eight
 types of untranslatable terms were identified in a collection of news
 stories, of which named entities were found to have the greatest
 impact on retrieval effectiveness. Differences in named entity
 coverage were found to produce large differences in retrieval
 effectiveness for term lists of similar sizes. Controlling for named
 entity effects yielded a clear relationship between retrieval
 effectiveness and the size of the translatable English vocabulary.
 The functional dependence that we observed is consistent with one
 previously applied ablation technique and inconsistent with another.
 Our results indicate that the outcome of a widely cited landmark study
 of query expansion effects for CLIR was likely affected by a flawed
 ablation model. We conclude our talk with a suggestion for further
 work on that topic, and a simple prescription for avoiding such
 problems in the future.
 

DTEND:20030612T240000
DTSTART:20030612T110000
LOCATION:11 Large
SUMMARY:Measuring the Effect of Dictionary Coverage on Cross-Language Retrieval [Dina Demner-Fushman]
UID:20030612T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 

DTEND:20040312T163000
DTSTART:20040312T150000
LOCATION:11 Large
SUMMARY:About My Thesis Proposal [Deepak Ravichandran]
UID:20040312T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is two practice talks.
 
 -----------------------------------------------------------------------------
 FIRST TALK:
 
 The traditional approach to diagnosing learner speech errors in Computer
 Aided Language Learning is to create a linguistic profile of the
 learner/user. We, however, propose that work must also be done to model
 the linguistic profile of a typical native listener.
 
 Not all errors in second language learner speech are created equal.
 Different errors sound more "severe" or "harsh" to native speaker ears and
 should therefore be treated with more emphasis in pedagogical interaction.
 
 The Tactical Language Training System (TLTS) is a speech-enabled
 virtual-reality based computer learning environment designed to teach
 Arabic spoken communication to American English speakers. This talk
 addresses the ways the TLTS contextualizes non-native speech errors, and
 how this contextualization fits in the corrective exchanges between a
 non-native learner and a pedagogical agent built to model a native
 listener.
 
 The pedagogical system used in TLTS includes:
 
   * Automatic Speech Recognition (ASR) models which are built on a
     combination of both annotated and unannotated non-native speech with
     native speech data.
 
   * A stochastic generative model for errors in learner speech that
     creates mispronunciation grammars for the ASR.
 
   * Reweighting of system-perceived mispronunciation severity based on
     aggregate native speaker judgements of quality pronunciation and
     intelligibility.
 
   * Contextualization of feedback based on lexical and phonetic
     inventories of the native and non-native languages.
 
 
 -----------------------------------------------------------------------------
 SECOND TALK:
 
 We present a novel feature-enriched approach that learns to detect the
 conversation focus of threaded discussions by combining NLP analysis and
 IR techniques. Using the graph-based algorithm HITS, we integrate
 different features such as lexical similarity, poster trustworthiness, and
 speech act analysis of human conversations with feature-oriented link
 generation functions. It is the first quantitative study to analyze human
 conversation focus in the context of online discussions that takes into
 account heterogeneous sources of evidence. Experimental results using a
 threaded discussion corpus from an undergraduate class show that it
 achieves significant performance improvements compared with the baseline
 system.
 

DTEND:20060512T163000
DTSTART:20060512T150000
LOCATION:11 Large
SUMMARY:Pedagogical Contextualization of Language Learner Speech Errors AND Learning to Detect Conversation Focus of Threaded Discussions [Nick Mote and Donghui Feng]
UID:20060512T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Textual data is everywhere, in email and scientific papers, in online
 newspapers and e-commerce sites. The Web contains more than 200 terabytes
 of text not even counting the contents of dynamic textual databases. This
 enormous source of knowledge is seriously underexploited. Textual
 documents on the Web are very hard to model computationally: they are
 mostly unstructured, time-dependent, collectively authored, multilingual,
 and of uneven importance.  Traditional grammar-based techniques don't
 scale up to address such problems. Novel representations and analytical
 tools are needed.
 
 I will discuss several current projects at Michigan related to text mining
 from a variety of genres. Depending on the amount of time, I will talk
 about (a) lexical centrality for multidocument summarization, (b)
 syntax-based sentence alignment, (c) graph-based classification, (d)
 lexical models of Web growth, and (e) mining protein interactions from
 scientific papers. As it turns out, the right representations, when
 complemented with traditional NLP and IR techniques, turn many of these
 into instances of better studied problems in areas such as social
 networks, statistical mechanics, sequence analysis, and computational
 phylogenetics.
 
 
 
 About the Speaker:
 
 Dragomir R. Radev is Assistant Professor of Information, Electrical
 Engineering and Computer Science, and Linguistics at the University of
 Michigan, Ann Arbor.  He leads the CLAIR (Computational Linguistics
 And Information Retrieval) group which currently includes 12
 undergraduate and graduate students.  Dragomir holds a Ph.D. in
 Computer Science from Columbia University.  Before joining Michigan,
 he was a Research Staff Member at IBM's TJ Watson Research Center in
 Hawthorne, NY.  He is the author of more than 45 papers on information
 retrieval, text summarization, graph models of the Web, question
 answering, machine translation, text generation, and information
 extraction.  Dr. Radev's current research on probabilistic and
 link-based methods for exploiting very large textual repositories,
 representing and acquiring knowledge of genome regulation, and
 semantic entity and relation extraction from Web-scale text document
 collections is supported by NSF and NIH.  Dragomir serves on the
 HLT-NAACL advisory committee, was recently reelected as treasurer of
 NAACL, is a member of the editorial boards of JAIR and Information
 Retrieval, and is a four-time finalist at the ACM international
 programming finals (as contestant in 1993 and as coach in
 1995-1997). Dragomir received a graduate teaching award at Columbia
 and recently, the U. of Michigan award for Outstanding Research
 Mentorship (UROP).
 

DTEND:20041112T163000
DTSTART:20041112T150000
LOCATION:11 Large
SUMMARY:Words, links, and patterns: novel representations for Web-scale text mining [Dragomir Radev]
UID:20041112T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I look at how the notion of discourse coherence can be
 modeled computationally. I begin with the following idea: if you take
 a text and shuffle its sentences into a random order, that text will
 no longer make sense. In other words, the text will be "incoherent".
 Our task is to learn how to reassemble a shuffled text into an order
 that humans would consider to be coherent.
 
 I discuss practical and theoretical motivations for the task,
 evaluations of our model, increases in performance achieved over the
 summer, and directions for future research.
 
 This work was done in collaboration with Kevin Knight, Daniel Marcu,
 Jonathan Graehl and Nick Mote.
 

DTEND:20030912T160000
DTSTART:20030912T143000
LOCATION:11 Large
SUMMARY:Discourse Coherence for Ordering Information [Lara Taylor]
UID:20030912T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automated essay scoring was initially motivated by its potential cost
 savings for large-scale writing assessments.  However, as automated essay
 scoring became more widely available and accepted, teachers and assessment
 experts realized that the potential of the technology could go way beyond
 just essay scoring.  Over the past five years or so, there has been rapid
 development and commercial deployment of automated essay evaluation for
 both large-scale assessment and classroom instruction.  A number of
 factors contribute to an essay score, including varying sentence
 structure, grammatical correctness, appropriate word choice, errors in
 spelling and punctuation, use of transitional words/phrases, and
 organization and development. Instructional software capabilities exist
 that provide essay scores and evaluations of student essay writing in all
 of these domains.  The foundation of automated essay evaluation software
 is rooted in NLP research.  This talk will walk through the development of
 Criterion(SM), e-rater, and Critique writing analysis tools, automated essay
 evaluation software developed at Educational Testing Service - from NLP
 research through deployment as a business.
 
 (Preview of an HLT/NAACL-2004 Invited Speaker Presentation)
 
 Jill Burstein
 Educational Testing Service
 Princeton, NJ
 

DTEND:20040413T163000
DTSTART:20040413T150000
LOCATION:4 Large
SUMMARY:Automated Essay Evaluation: From NLP research through deployment as a business [Jill Burstein (ETS)]
UID:20040413T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The last decade has seen a plethora of papers in NLP devoted to Machine
 Learning algorithms. However, most of these papers have devoted their
 effort exclusively to improving the system performance on the accuracy
 axis. Most of the sophisticated NLP algorithms are extremely slow and do
 not scale up easily when applied to large amounts of data.
 
 I will talk about the importance of randomized algorithms and their
 potential in speeding up some NLP algorithms. This talk will be a survey
 of some recent advances in Theoretical Computer Science/Math seen from an
 NLP point of view. I am not going to present any results. But I am hoping
 that this talk will clarify my thinking process, get feedback from people
 and help me collaborate with others.
 

DTEND:20040813T163000
DTSTART:20040813T150000
LOCATION:11 Large
SUMMARY:Randomized algorithms and its application to NLP [Deepak Ravichandran]
UID:20040813T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'm going to talk about what I've been working on recently.  My thesis
 proposal is something having to do with the interaction of search,
 learning and features in supervised natural language problems.  I will be
 focusing on the task of coreference, since it is a well-studied problem,
 yet nevertheless not really solved and quite difficult.  It is also a 
 great pedagogical example for why we should care about something *other* 
 than standard Markov random fields for structured prediction, since, for 
 the coreference problem (and pretty much every other "real" natural 
 language problem) inference in such models is intractable.
 
 The contents of this talk will be roughly 40% from a paper I have at ICML
 this year on efficient, accurate supervised learning techniques for
 structured prediction (and why I feel inclined to make the very
 controversial statement that supervised learning for NLP problems is
 solved); it will be roughly 40% about an application of this technique to
 the coreference resolution problem and an exploration of the feature space
 for solving this problem (submitted to HLT); and it will be roughly 20%
 about looking forward to what I want to accomplish in the remainder of my
 thesis, not covered by the first 80%.

DTEND:20050613T240000
DTSTART:20050613T103000
LOCATION:11 Small
SUMMARY:Search, Learning and Features (my thesis proposal proposal) [Hal Daume III]
UID:20050613T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will describe some recent work on "natural logics", logics for languages
 that are more similar to human languages than traditional first order
 predicate logic, giving particular attention to questions about what the
 syntax encodes about semantic relations among sentences. On everyone's
 view, some but not all entailments are syntactically encoded (in a sense
 that I will define precisely), but, beyond this starting point,
 controversy starts almost immediately. Considering some particular
 examples, I will sketch methods for addressing some of the basic
 questions.
 
 

DTEND:20050513T163000
DTSTART:20050513T150000
LOCATION:11 Large
SUMMARY:Natural Logic [Ed Stabler (UCLA)]
UID:20050513T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Although a considerable number of generic Natural Language Generation
 (NLG) systems have been produced over the years, none of them is usually
 employed in end-to-end, text-to-text applications such as Machine
 Translation, Summarization, Question Answering, etc. In this talk, we
 identify the likely reasons for this state of affairs, and propose
 WIDL-expressions as a flexible formalism that facilitates the integration
 of a generic NLG engine within end-to-end language processing
 applications.
  
 WIDL-expressions compactly represent probability distributions over finite
 sets of candidate realizations, and have optimal algorithms for text
 realization via interpolation with language model probability
 distributions. We show the effectiveness of our WIDL-based NLG engine for
 both sentence realization and document realization tasks. By employing
 language models that capture sentence-level properties, we perform Machine
 Translation and Headline Generation at state-of-the-art levels or better.
 By employing language models that capture document-level properties such
 as text coherence, we synthesize output for Multi-document Summarization
 that displays both high content selection performance and increased
 coherence.
 
 

DTEND:20060414T163000
DTSTART:20060414T150000
LOCATION:11 Large
SUMMARY:Natural Language Generation for Text-to-Text Applications using an Information-Slim Representation [Radu Soricut]
UID:20060414T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: One of the key challenges in retrieval is what to do when a query term
 needs to be replaced with more than one term. This problem arises in
 applications such as cross language information retrieval and
 thesaurus expansion. One solution is to use structured query methods,
 which treat all the possible replacements as if they were one query
 term by computing a joint document frequency and a joint term
 frequency. This presentation will review prior work on structured
 query techniques and then introduce three new variants that aim to
 improve computational efficiency and to leverage estimates of
 replacement probabilities to improve retrieval effectiveness. The
 methods have now been tested in cross-language retrieval and
 OCR-degraded text retrieval applications in which replacement
 probabilities could be estimated. In both applications, the
 new structured query methods showed statistically significant
 improvements in retrieval effectiveness over previously known
 structured query methods.
 

DTEND:20030314T160000
DTSTART:20030314T150000
LOCATION:11 Large
SUMMARY:Improving the Efficiency and Effectiveness of Structured Query Methods [Kareem Darwish]
UID:20030314T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030815T160000
DTSTART:20030815T150000
LOCATION:11 Large
SUMMARY:On Her Masters Research [Beata Klebanov]
UID:20030815T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Yarowsky et al., 2001) present an algorithm for bootstrapping a POS
 tagger for an arbitrary target language, using an existing POS tagger for
 a source language and a parallel corpus in the source and target
 languages.  The source text is annotated with the POS tagger; the parallel
 corpus is word-aligned; the POS tags are "projected" from source to target
 language; and finally smoothing is performed before training a POS tagger
 for the target language on the projected annotations.
 
 I will talk about my work (jointly with my advisor, Steve Abney, at U. of
 Michigan) in which we extend this algorithm by projecting from multiple
 source languages onto a target language, then combining the outputs to
 compute a consensus POS tagger.  Our hypothesis is that systematic
 transfer errors from different source-target pairs can be reduced by using
 multiple source languages.  I will present experimental results for three
 different source languages (English, German, and Spanish), and two
 different target languages (French and Czech).  Our results indicate that
 using multiple source languages improves performance.

DTEND:20050715T163000
DTSTART:20050715T150000
LOCATION:11 Large
SUMMARY:Inducing POS Taggers by Projecting from Multiple Source Languages [Victoria Li Fossum (Michigan)]
UID:20050715T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I'll present the investigation I've been carrying out at ISI
 lately under Daniel Marcu's supervision.  Following the noisy-channel
 framework, we propose a statistical model for learning the argument
 structures of verbs automatically.  We show that we are able to learn both
 lexicalized and generalized structures and achieve good results, relying
 only on basic NLP tools like a POS tagger and named-entity recognizer. We
 also present a comparison of the structures we learn with the predicted
 ones in PropBank.
 

DTEND:20041115T163000
DTSTART:20041115T150000
LOCATION:8th floor multipurpose room (#849) -- NOT the conference room
SUMMARY:Unsupervised learning of verb argument structures [Thiago Pardo]
UID:20041115T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present my summer project - writing rule-based software for
 simplifying texts. Task definition and motivations will be
 discussed, as well as human and automatic evaluation, the
 latter using a question answering system.
 
 This is joint work with Daniel Marcu and Kevin Knight.
 

DTEND:20030915T160000
DTSTART:20030915T143000
LOCATION:11 Large
SUMMARY:Analyzing Sentences into Facts: Simple is Beautiful [Beata Klebanov]
UID:20030915T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Although we live in a predominantly statistical world, there are still
 many language processing applications that long for accurate
 representations of text meaning. Even applications that found partial
 solutions in statistical modeling, including information retrieval,
 machine translation, or automatic summarization, are likely to get a
 significant boost from deeper text understanding.
 
 In this talk, I will present an innovative method for automatic extraction
 of conceptual graphs as a means to represent text meaning.  The method
 relies on a novel adaptation of graph-based ranking algorithms -
 traditionally (and successfully) used in citation analysis, Web page
 ranking, and social networks. I will show how such algorithms can be
 adapted to semantic networks, resulting in an efficient unsupervised
 method for resolving the semantic ambiguity of all words in open text, and
 identifying relations between entities in the text. I will also outline a
 number of applications that are enabled by this representation, including
 keyphrase extraction, domain classification, and extractive summarization.
 
 BIO: Rada Mihalcea is an Assistant Professor of Computer Science at
 University of North Texas. Her research interests are in lexical
 semantics, minimally supervised natural language learning, and
 multilingual natural language processing. She is currently involved in a
 number of research projects, including word sense disambiguation, shallow
 semantic parsing, (non-traditional) methods for building annotated corpora
 with volunteer contributions over the Web, word alignment for language
 pairs with scarce resources, and graph-based ranking algorithms for
 language processing. Her research is supported by NSF and the state of
 Texas.

DTEND:20040416T240000
DTSTART:20040416T103000
LOCATION:11 Large
SUMMARY:Graph-based Ranking Algorithms for Language Processing [Rada Mihalcea (UNT)]
UID:20040416T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Broad-coverage repositories of semantic relations between verbs could
 benefit many NLP tasks. We present a semi-automatic method for extracting
 fine-grained semantic relations between verbs. We detect similarity,
 strength, antonymy, enablement, and temporal happens-before relations
 between pairs of strongly associated verbs using lexico-syntactic patterns
 over the Web. On a set of 29,165 strongly associated verb pairs, our
 extraction algorithm yielded 65.5% accuracy. We provide the resource,
 called VerbOcean, for download at http://semantics.isi.edu/ocean/. We will
 also discuss current work on disambiguating the verbs in the network as
 well as refining the semantic relations using path analysis.
 
 

DTEND:20040816T153000
DTSTART:20040816T140000
LOCATION:11 Large
SUMMARY:VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations [Patrick Pantel & Tim Chklovski]
UID:20040816T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Ranked lists of output trees from syntactic statistical NLP applications
 frequently contain multiple repeated entries. This redundancy leads to
 misrepresentation of tree weight and reduced information for debugging and
 tuning purposes. It is chiefly due to nondeterminism in the weighted
 automata that produce the results. I will introduce an algorithm that
 determinizes such automata while preserving proper weights, returning the
 sum of the weight of all multiply derived trees. I will also report
 results of the application of the algorithm to machine translation and
 Data Oriented Parsing.
 
 

DTEND:20051216T163000
DTSTART:20051216T150000
LOCATION:11 Large
SUMMARY:A Better N-Best List - Practical Determinization of Weighted Finite Tree Automata [Jonathan May]
UID:20051216T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Leading Question-Answering systems employ a variety of means to boost the
 accuracy of their answers.  Such methods include redundancy (getting the
 same answer from multiple documents/sources), deeper parsing of questions
 and texts (hence improving the accuracy of confidence measures),
 inferencing (proving the answer from information in texts plus background
 knowledge) and sanity-checking (verifying that answers are consistent with
 known facts).  To our knowledge, however, no QA system deliberately asks
 additional questions in order to derive constraints on the answers to the
 original questions.
 
 We present in this talk the method of QA-by-Dossier-with-Constraints (QDC).
 This is an extension of the simpler method of QA-by-Dossier, in which
 definitional questions ("Who/what is X") are addressed by asking a set of
 questions about anticipated properties of X.  In QDC, the collection of
 Dossier candidate answers, along with possibly other answers to questions
 asked expressly for this purpose, are subjected to satisfying a set of
 naturally-arising constraints.  For example, for a "Who is X" question, the
 system will ask about birth, accomplishment and death dates, which, if they
 exist, must occur in that order, and also obey other constraints such as
 lifespan.  Temporal, spatial and kinship relationships seem to be
 particularly amenable to this treatment, but it would seem that almost any
 "factoid" question can benefit from QDC.  We will discuss the setting-up
 and application of constraint networks, and talk about how (and whether) to
 develop the constraint sets automatically.  We will demonstrate several
 applications of QDC, and present one evaluation in which the F-measure for
 a set of questions improved with QDC from .39 to .69.

DTEND:20040116T150000
DTSTART:20040116T140000
LOCATION:11 Large
SUMMARY:Using Constraints to Improve Question-Answering Accuracy [John Prager (IBM)]
UID:20040116T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 

DTEND:20040716T163000
DTSTART:20040716T150000
LOCATION:11 Large
SUMMARY:Practice Talks for ACL (+workshops) [Hal Daume III and Radu Soricut]
UID:20040716T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Following the recent adoption by the machine translation community of
 automatic evaluation using the BLEU/NIST scoring process, we conduct an
 in-depth study of a similar idea for evaluating summaries. The results
 show that automatic evaluation using unigram co-occurrences between
 summary pairs correlates surprisingly well with human evaluations, based
 on various statistical metrics; while direct application of the BLEU
 evaluation procedure does not always give good results.
 

DTEND:20030516T160000
DTSTART:20030516T150000
LOCATION:11 Large
SUMMARY:Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics [Chin-Yew Lin]
UID:20030516T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will address the problem of assessing the correctness of MT
 output on the word level. I will give an overview of word confidence
 measures for SMT.  Different variants of word posterior probabilities that
 can be directly used as confidence measures will be presented. Their
 connection with the Bayes decision rule and the underlying error measure
 will be shown. Experimental comparison of different word confidence
 measures will be presented on a translation task consisting of technical
 manuals.
 
 Additionally, I will show how word confidence measures can be applied in
 an interactive SMT system. This system predicts translations, taking parts
 of the sentence into account that have already been accepted or typed by
 the user. Through the use of confidence measures, the performance of the
 prediction engine can be improved.
 
 
 About the Speaker:
 
 Nicola Ueffing is a graduate research assistant at the group for "Human
 Language Technology and Pattern Recognition" (Lehrstuhl fuer Informatik
 VI) at RWTH Aachen University. She received her diploma in mathematics
 from RWTH Aachen University in 2000. Her research topic is statistical
 machine translation, focusing on confidence measures for SMT. In 2003, she
 was a member of the team working on "Confidence Estimation for SMT" at the
 CLSP workshop at JHU.
 

DTEND:20041217T163000
DTSTART:20041217T150000
LOCATION:11 Large
SUMMARY:Word-Level Confidence Measures for SMT [Nicola Ueffing]
UID:20041217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: My presentation will survey recent activities on Chinese-English SMT
 carried out at ITC-irst (Trento, Italy).  After an overview of the
 complete architecture of our system, I will focus on progress made in
 Chinese word-segmentation, phrase-based modeling and decoding, log-linear
 modeling and minimum error training, and language model adaptation.
 Experimental results will be provided in terms of BLEU and NIST scores on
 two translation tasks:  basic traveling expressions and news reports,
 respectively adopted by the C-STAR consortium and for the 2002 and 2003
 NIST MT evaluation campaigns.
 
 Bio:
 
 Marcello Federico has been a permanent researcher at ITC-irst since 1991.  
 During 1998-2003, he led the "Multilingual natural speech technologies"
 (MUNST) research line at ITC-irst.  Since 2004, he has been head of the
 "Cross-language information processing" (Hermes) research line. His
 interests include automatic speech recognition, statistical language
 modeling, information retrieval, and machine translation.
 
 

DTEND:20040617T163000
DTSTART:20040617T150000
LOCATION:4th Floor
SUMMARY:Statistical Machine Translation at ITC-irst [Marcello Federico]
UID:20040617T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As a discipline of biology, the field of neuroscience suffers greatly from
 information overload, non-standardization and complexity. In the absence
 of a mathematical theoretical structure for the subject, scientists use
 their own ad-hoc methods of collating and synthesizing information from
 both the primary literature and their own data. In order to eventually
 formalize and accelerate the development of theoretical approaches in the
 subject, we are combining an Electronic Laboratory Notebook (ELN) with
 asset management of the primary research literature to construct a
 knowledge engineering framework based around the organizational unit of a
 neuroscience laboratory. This project, called 'NeuroScholar'
 (http://www.neuroscholar.org/), is open-source, and is being tested and
 used in the laboratories of Prof. Larry Swanson and Prof. Alan Watts at
 USC. In each laboratory, the system will operate on top of a 'laboratory
 corpus' of knowledge resources (data files, full-text PDF files, etc.)
 that summarizes the relevant knowledge for that laboratory. Not only will
 this collection provide a valuable resource for the members of the
 laboratory, it will also provide a platform for natural language processing and
 knowledge engineering to answer formally-defined research questions. The
 Society for Neuroscience's annual meeting attracts over 30,000 attendees,
 who collectively form a potential user base for this software.
 
 I will talk about the ideas underlying the project, the current
 implementation of NeuroScholar, developments from collaboration with the
 natural language group at ISI and possible collaborations for the future.
 
 

DTEND:20050617T240000
DTSTART:20050617T103000
LOCATION:11 Large
SUMMARY:The neuroscience laboratory as a knowledge factory: challenges, approaches and tools [Gully Burns]
UID:20050617T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the 1990s, researchers applied their new developments in transducer
 theory using widely available easy-to-use toolkits for string transducers,
 and made well-known advances in parsing, machine translation, and other
 areas. Rapid prototyping via software such as the AT&T toolkit and carmel
 was useful for proofs of concept and in many cases led to unforeseen
 developments in novel areas. In the current NLP research environment,
 tree-based strategies and new models have shown promising results in advancing
 the state of the art, and recent developments in weighted tree automata
 theory are enriching the bedrock created 40 years ago, but as of yet there
 is no toolkit available with the necessary capabilities to turn promise
 into solution.
 
 Tiburon is the first probabilistic tree transducer toolkit. Similar in form
 and function to the string-based toolkits of yesteryear, it is designed to
 be easy to use, with simple but expressive definitions of tree automata
 and a concise set of vital operations that can be used to construct many
 useful tree-based NLP projects. Although a work in progress, Tiburon is
 already a usable tool with active users between the ages of 6 and 41. I
 will describe the current status of the system, demonstrate ease of use
 and potential power, and discuss the challenges ahead.

DTEND:20060317T163000
DTSTART:20060317T150000
LOCATION:4th Floor
SUMMARY:Tiburon: A Finite State Tree Automata Toolkit [Jon May]
UID:20060317T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: An Overview of Question Answering Challenge
 Jun'ichi Fukumoto and Tsuneaki Kato
 
 In this talk, we will present an overview of Question Answering
 Challenge (QAC), which is the question answering task of the NTCIR
 Workshop.  QAC-1 (the first evaluation of QAC) was carried out
 at NTCIR Workshop 3 in October 2002, and QAC-2 will be at
 NTCIR Workshop 4 in December 2003.  In the QAC, systems to be
 evaluated are expected to return exact answers consisting of a noun
 or noun compound denoting, for example, the names of persons,
 organizations, or various artifacts or numerical expressions such
 as money, size, or date.  These basically range over the Named
 Entity (NE) elements of MUC and IREX but are not limited to them.
 QAC consists of three kinds of subtasks: Task 1, where the systems
 are allowed to return five ranked possible answers; Task 2, where
 the systems are required to return a complete list of answers; and
 Task 3, where the systems are required to answer a series of questions that
 contain anaphora and zero-anaphora.  We will present the results of
 QAC-1, and the vision and prospects for QAC-2.
 
 NTCIR -- the Way Ahead
 Noriko Kando
 
 Dr. Noriko Kando is the leader of the NTCIR (Test Collections and Evaluation
 of IR, Text Summarization, Q&A, etc.) project, and an associate professor
 at the National Institute of Informatics (NII).  She received her Ph.D. in 1995
 from Keio University.  Her research interests include evaluation of
 information retrieval systems, technologies to "Make Information Usable
 for Users", cross-lingual information retrieval, and analysis of text
 structure, genre, citation & link.  She is a member of the editorial boards of
 International Journal on Information Processing and Management,
 ACM Transactions on Asian Language Information Processing, etc.
 
 Jun'ichi Fukumoto and Tsuneaki Kato are the task organizers of QAC.
 Dr. Jun'ichi Fukumoto is an associate professor at Ritsumeikan
 University.  He received his Ph.D. in 1999 from the University of Manchester
 Institute of Science and Technology.  His research interests include
 Q&A, automatic summarization, and dialogue processing.
 Dr. Tsuneaki Kato is an associate professor at the University of Tokyo.
 He received his Dr. of Engineering in 1995 from Tokyo Institute of
 Technology.  His research interests include multimodal dialogue
 processing, multimodal presentation generation, and domain-independent
 question answering.  He is a member of the editorial committee of the
 Transactions on Information and Systems of the Institute of Electronics,
 Information and Communication Engineers.
 

DTEND:20031117T240000
DTSTART:20031117T103000
LOCATION:4th Floor
SUMMARY:An Overview of the QA Challenge + NTCIR -- The Way Ahead [Dr. Kato and Dr. Fukumoto (NTCIR)]
UID:20031117T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The annual Computational Linguistics Open House will be held at USC's Information
 Sciences Institute from 3:00-4:30pm in the 11th floor Conference Room. Researchers from
 ISI, including Eduard Hovy, Daniel Marcu, and Kevin Knight will present overviews of
 their latest research.  We will also hear about the research activities of Dani Byrd of
 the Linguistics Department, Shri Narayanan's group in EE, and David Traum and Andrew
 Gordon of USC's Institute for Creative Technologies.
 

DTEND:20031017T163000
DTSTART:20031017T150000
LOCATION:11 Large
SUMMARY:Introduction to CL Research [Hovy, Marcu, Knight, Byrd, Narayanan, Traum, Gordon]
UID:20031017T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This summer we held a three-month workshop on syntax-driven machine 
 translation, in which we learned syntactic transformations automatically
 from Chinese/English translated corpora and applied them to translate new
 text.  We'll give a progress report!
 
 

DTEND:20040917T163000
DTSTART:20040917T150000
LOCATION:11 Large
SUMMARY:About Syntax Fest 2004 (Part II) [Various]
UID:20040917T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 

DTEND:20050218T163000
DTSTART:20050218T150000
LOCATION:11 Large
SUMMARY:TBA [Inderjeet Mani (Georgetown)]
UID:20050218T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030718T160000
DTSTART:20030718T150000
LOCATION:11 Large
SUMMARY:A Maryland Yankee in King Eduard's Court: Some Remarks on a Year in Paradise [Doug Oard]
UID:20030718T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is the second of three tutorial lectures on ontologies.  It
 first shows some details of various Upper Ontologies: ResearchCYC, SUMO,
 DOLCE, and the Penman Upper Model.  It then discusses the problem of
 creating content for the 'Middle Model' zone of ontologies, and outlines a
 methodology for moving from words to word senses to concepts.  It
 concludes by describing ISI's Omega ontology and showing how Omega has
 been used in annotation projects to support semantic labeling of texts.
 
 Please bring a pen or pencil and some paper; there is a small exercise!
 

DTEND:20050318T163000
DTSTART:20050318T150000
LOCATION:11 Large
SUMMARY:Methodologies of ontology content construction [Ed Hovy]
UID:20050318T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Previous research has indicated that when a polysemous word appears two
 or more times in a discourse, it is extremely likely that the occurrences will all
 share the same sense (Gale et al. 92). However, those results were
 based on a coarse-grained distinction between senses (e.g.,
 "sentence" in the sense of a 'prison sentence' vs. a 'grammatical
 sentence'). I conducted an analysis of multiple senses within two
 sense-tagged corpora, Semcor and DSO. These corpora used WordNet for
 their sense inventory. I found significantly more occurrences of
 multiple senses per discourse than reported in (Gale et al. 92) (33%
 instead of 4%). I also found classes of ambiguous words in which as
 many as 45% of the senses in the class co-occur within a document. I
 will discuss the implications of these results for the task of
 word-sense tagging and for the way in which senses should be
 represented.

DTEND:20031219T163000
DTSTART:20031219T150000
LOCATION:11 Large
SUMMARY:More than One Sense Per Discourse [Robert Krovetz (Ask Jeeves)]
UID:20031219T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the past decade, researchers have explored many approaches to
 automatically extract large collections of knowledge from text. In this
 talk, we present Espresso, a weakly-supervised, general-purpose, and
 broad-coverage algorithm for harvesting binary semantic relations. The
 main contributions are: i) a method for exploiting generic patterns by
 filtering incorrect instances using the Web; and ii) a principled measure
 of pattern and instance reliability enabling the filtering algorithm. We
 present an empirical comparison of Espresso with various state of the art
 systems, on different size and genre corpora, on extracting various
 general and specific relations. Experimental results show that our
 exploitation of generic patterns substantially increases system recall
 with small effect on overall precision.
 

DTEND:20060519T163000
DTSTART:20060519T150000
LOCATION:11 Large
SUMMARY:Espresso: Making Use of Generic Patterns for Mining Relations from Small and Large Corpora [Patrick Pantel]
UID:20060519T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As DARPA's TIDES (Translingual Information Detection, Extraction, and
 Summarization) program comes to an end, I will give a summary of what we
 have learned from TIDES in summarization and a brief overview of our
 current effort in developing automatic evaluation methods that go beyond
 surface n-gram matching. Topics to be covered:
 
 (1) Summary of DUCs 2001 - 2004
 (2) Automatic Evaluations in Summarization and MT
 (3) Basic Elements - New Efforts in Summarization at ISI

DTEND:20041119T163000
DTSTART:20041119T150000
LOCATION:11 Large
SUMMARY:After TIDES, What's Left? - Finding Basic Elements [Chin-Yew Lin]
UID:20041119T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will be presenting some recent results of mine regarding the possibility
 of automatic evaluation in summarization.  I will discuss both my own 
 findings, as well as those of people here and at Columbia, and attempt to
 explain in a principled fashion why there are disparate opinions on the 
 plausibility of performing automatic evaluation in this task.  I will
 discuss my (perhaps pessimistic) views on the plausibility of doing any
 sort of evaluation of summarization, automatic or otherwise.
 
 The results and experimental setups developed in connection with 
 summarization will be extended to machine translation.  I will review
 possible reasons why metrics such as BLEU have experienced significantly
 more success in machine translation than in summarization.  I will also
 connect the evaluation criteria developed in the context of summarization
 to machine translation, and discuss the automation of these methods.
 
 In short: I'll talk about why I've been doing so much data elicitation
 recently.
 
 This will be a highly informal seminar and participation is highly
 encouraged.
 

DTEND:20040220T160000
DTSTART:20040220T150000
LOCATION:4 Large
SUMMARY:Some Results in Automatic Evaluation for Summarization and MT [Hal Daume III]
UID:20040220T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Narratology analyzes the discursive structure of narratives as finalized
 products of human invention, such as novels, short-stories, or
 fairy-tales. Those narratives are rendered in a given surface form;
 Narratology focuses on narratives in natural language. Narratologists
 assume that each narrative surface representation is associated with a
 neutral, abstract event sequence, the "Story" (histoire, fabula). The
 abstractness of Story is illustrated by the fact that the same Story can
 be realized in different surface texts. By discursive structure or
 "Discourse" (discours, sjuzhet), narratologists mean the relation between an
 abstract Story and its concrete expression in a sequential text. For
 example, if the chronological order of the Story is not respected in its
 textual recount, we are dealing with the Discourse parameter of order.
 Other Discourse parameters include the frequency with which Story events
 are evoked, the point of view from which they are narrated (perceived,
 evaluated,...), or framed narratives with several narrative levels.
 
 The Story Generator Algorithms project at the University of Hamburg
 evaluated several existing Story Generators with respect to their
 discursive abilities. It became obvious that most Story Generators
 concentrate on creating a coherent and chronological abstract Story,
 which is directly mapped onto natural language. This results in a
 predominance of 1:1 relations between Story and surface, and in most
 cases corresponds to a default or zero instantiation of Discourse
 parameters. As a consequence, Story Generator outputs tend to be very
 explicit and straightforward, and are likely to be perceived as uniform
 and boring.
 
 Narratological expert knowledge might be useful to future enhanced Story
 Generators and to Natural Language Generation systems dealing with
 narrative. One of the aims of Computational Narratology is to model that
 expert knowledge. Ideally, narratological knowledge will be integrated
 into a Narratological Structurer, as a processing component of an
 advanced system that creates narratives. In such a system, the
 Narratological Structurer will be the interface between a Story Generator
 and subsequent Natural Language Generation modules. The talk also
 presents examples of the knowledge that is being modelled.
 
 
 About the Speaker:
 
 Birte Lönneker graduated from the University of Hamburg, Germany, with a
 degree in French with Finno-Ugristics (Finnish) and Business
 Administration. Since then, her main fields of publication have been Cognitive
 Linguistics and electronic resources for Natural Language Processing,
 with special focus on frames and metaphors, as well as electronic
 dictionaries, corpora, and recently part-of-speech tagging. Her PhD on
 Concept Frames and Relations, also published as a book in 2003, was
 co-supervised at the Institute for Romance Languages and at the
 Department of Informatics in Hamburg. For her Slovenian-German online
 dictionary, Birte Lönneker was twice awarded the EURALEX Laurence Urdang
 Award. From 2002 to 2004, she received various research grants for
 Slovenia, where she was working in the Corpus Laboratory of the Institute
 of Slovenian Language.
 
 Since 2004, Birte Lönneker has carried out research on Story Generator
 Algorithms within the Narratology Research Group Hamburg. She is also a
 board member of the German Cognitive Linguistics Association.
 

DTEND:20050620T113000
DTSTART:20050620T100000
LOCATION:11 Small
SUMMARY:Between Story Generation and Natural Language Generation [Birte Loenneker (Hamburg)]
UID:20050620T100000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030520T160000
DTSTART:20030520T150000
LOCATION:11 Large
SUMMARY:Discourse Segmentation of Multi-Party Conversation [Michel Galley]
UID:20030520T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, we introduce a methodology for analyzing judgment opinions.
 We define a judgment opinion as consisting of a valence, a holder, and a
 topic. We decompose the task of opinion analysis into four parts: 1)
 recognizing the opinion; 2) identifying the valence; 3)  identifying the
 holder; and 4) identifying the topic. We evaluate our methodology using
 both intrinsic and extrinsic measures.

DTEND:20060421T163000
DTSTART:20060421T150000
LOCATION:11 Large
SUMMARY:Identifying and Analyzing Judgment Opinions [Soo-Min Kim]
UID:20060421T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The large corpora of written text available to the language
 community have largely been utilized for language understanding; they have
 been somewhat ignored in the context of language generation. Recent
 developments in stochastic generation have allowed such systems to shift
 the burden from hand-crafted databases (lexicons, grammars, ontologies) to
 the knowledge implicitly found in written text. However, when building a
 dialogue system, generation is largely interactive, very different from
 the written structure of most corpora.
 
 In this talk, I will discuss my recent work on applying a stochastic
 generator, HALogen, and its newswire language model to a dialogue system,
 TRIPS. I'll describe the difficulties in mapping the TRIPS semantic form
 into HALogen's representation, the critical differences between newswire
 and dialogue, and the possibility of using HALogen and a large newswire
 model as a domain-independent generator.
 

DTEND:20030221T160000
DTSTART:20030221T150000
LOCATION:11 Large
SUMMARY:Statistical Language Generation in a Dialogue System [Nate Chambers]
UID:20030221T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will be about automatic speech-to-speech translation.  In our
 system, a doctor speaks one language, the patient speaks another language,
 and the machine translates their utterances from one language to the
 other.  The talk will be followed by a demo of our system.
 
 One approach we have been successful with is phrase classification, i.e.,
 classifying a noisy speech-recognized utterance into one of many meaning
 categories.  Phrase classification is computationally cheap and can
 provide high-quality translations for in-domain utterances almost
 instantaneously. Speed is important for speech translation, where
 processing delay is a great concern.
 
 In this talk, different aspects of building a classification-based speech
 translator are discussed. Following an overview of automatic
 speech-to-speech translation and its challenges, a comparison of different
 classification methods is presented and data collection techniques for
 that application are introduced.
 
 

DTEND:20040621T160000
DTSTART:20040621T150000
LOCATION:11 Large
SUMMARY:Speech-to-Speech Translation: A Phrase Classification Approach [Emil Ettelaie]
UID:20040621T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Term weighting methods have been shown to give significant increases
 in information retrieval performance. Term weights are typically
 calculated using frequency counts across the whole retrieval
 collection, frequency of each term within individual documents and
 compensation for varying document length. The presence of pronominal
 references in documents effectively reduces the within-document term
 frequency of associated words with a consequent effect on term weights
 and information retrieval behaviour. This presentation will describe
 an experimental investigation into the impact on information retrieval
 performance of broad coverage automatic pronoun resolution. Results
 using a standard information retrieval test collection indicate that
 calculating term weights using a pronoun-resolved version of the
 document test collection can improve both fixed cutoff and average
 retrieval precision.
 

DTEND:20030321T160000
DTSTART:20030321T150000
LOCATION:11 Large
SUMMARY:An Investigation of the Application of Broad Coverage Automatic Pronoun Resolution in Information Retrieval [Gareth Jones]
UID:20030321T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural Language Understanding: A fast and accurate Statistical Learning Approach for Dialogue Systems
 
 Natural Language Understanding (NLU) is an essential module of a good
 dialogue system. To achieve satisfactory performance levels, real time
 dialogue systems need the NLU module to be both fast and accurate. Finite
 State Model (FSM) based systems are fast and accurate but lack robustness
 and flexibility. The Statistical Learning Model (SLM) based systems are
 robust and flexible but lack accuracy and are often slow.
 
 In this talk, I am going to talk about an SLM based NLU approach for
 dialogue utterances that is both accurate and fast. The system has high
 accuracy and produces frames in real time.
 
 A Community of Words: Understanding Social Relationships from E-mail
 
 A corpus of e-mail messages presents a number of challenges for NLP
 techniques, with its nearly unconstrained structure and vocabulary,
 mistyped words and ungrammatical sentences, and extensive contextual
 information that is never explicitly stated. Yet, the intrinsically social
 nature of such communication provides an opportunity to study not just a
 bag of words, but also the relationships, competencies, and activities
 behind them.
 
 This talk presents work with Eduard Hovy as part of the MKIDS project.
 

DTEND:20040521T163000
DTSTART:20040521T150000
LOCATION:11 Large
SUMMARY:Statistical Learning for Dialogue System and A Community of Words [Tom Murray and Rahul Bhagat]
UID:20040521T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I am going to be talking about stuff that I have been working on over the
 past 6-9 months. This includes randomized algorithms and their application
 to two NLP problems: noun clustering and noun-pair clustering. I will also
 be commenting on my experience of working with very large amounts of
 real natural language text. (This includes processing and working with data
 available from the web. This corpus is not the standard newspaper text
 that we are so used to in the NLP community.) This talk will also cover a
 large part of my thesis work.

DTEND:20050422T163000
DTSTART:20050422T150000
LOCATION:11 Large
SUMMARY:Working with Large Corpus, High speed clustering and its applications [Deepak Ravichandran]
UID:20050422T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030822T160000
DTSTART:20030822T150000
LOCATION:11 Large
SUMMARY:Information Extraction, IR and QA [Satoshi Sekine]
UID:20030822T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: EM has proved to be a great and useful technique for unsupervised learning
 problems in natural language.  Unfortunately, it cannot solve every
 problem out there, either because the E-step is intractable, the M-step is
 intractable or both.  Typically our community resorts to a Viterbi
 approximation in this case, which really isn't very justified and can
 easily diverge from our expectations (no pun intended). Moreover, EM --
 like all maximum likelihood methods -- suffers from a need for ad-hoc and
 undesirable smoothing.  All of these problems -- intractable E- or
 M-steps, the Viterbi approximation, and the annoyance of smoothing -- are
 solved by using Bayesian methods. Moreover, from a theoretical point of
 view, the Bayesian paradigm is much more foundationally well justified
 than the frequentist use of estimators (such as the maximum likelihood
 estimator), at some cost in computation (though not as much as you might
 believe).
 
 In this tutorial, I will discuss Bayesian methods as they can be used in
 natural language processing.  The first half will be background (some of
 which you probably won't have seen, some of which you probably will have
 seen, but which will probably be presented in a different way than you're
 used to) including graphical models, EM, priors and pro- (and con-)
 Bayesian arguments.  The second half of the tutorial will focus on solving
 complex inference problems, essentially building on what we've seen from
 EM.  I'll cover MAP (*not* Bayesian -- if you can't tell me why, then you
 should come to the tutorial!), summing, Monte Carlo, MCMC, Laplace,
 variational and expectation propagation.  Time permitting, I will briefly
 discuss Bayesian discriminative models (basically what a Bayesian uses
 instead of SVMs), non-parametric (infinite) models and Bayesian decision
 theory, all of which make use of the inference techniques we will have
 already covered.
 
 This tutorial is intended to be largely self contained, though I will
 expect that you know what probabilities are, what distributions are and
 the standard manipulations of conditional/joint distributions. Familiarity
 with EM would be helpful, but I'll cover this topic in some depth since it
 will be important for understanding the rest of the tutorial.  I hope --
 though this never really seems to come to fruition -- that this will be a
 semi-interactive talk and I will attempt to adjust according to what
 people are interested in and what is putting people to sleep.
 
 (see http://www.isi.edu/~hdaume/bayesnlp/ for more information)
 

DTEND:20050622T163000
DTSTART:20050622T130000
LOCATION:11 Large
SUMMARY:Beyond EM: Bayesian Techniques for NLP Researchers [Hal Daume III]
UID:20050622T130000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is a practice tutorial for one I am giving at HLT/NAACL one week
 later.  Comments/feedback are very welcome.
 
 ----------------------------------------------------------------------
 
 Expectation Maximization (EM) has proved to be a great and useful
 technique for unsupervised learning problems in speech and language
 processing.  Unfortunately, its range of applications is limited either by
 intractable E- or M-steps, or by its reliance on the maximum likelihood
 estimator.  The natural language processing community typically resorts to
 ad-hoc approximation methods to get (some reduced form of) EM to apply to
 NLP tasks.  However, many of the problems that plague EM can be solved
 with Bayesian methods, which are theoretically better justified.  In
 this tutorial, I discuss Bayesian methods as they can be used in natural
 language processing.  The two primary foci of this tutorial are specifying
 prior distributions and carrying out the computations needed for
 inference in Bayesian models.  I focus on unsupervised techniques (for
 which EM is the obvious choice), but discuss supervised and discriminative
 techniques at the conclusion with pointers to relevant literature.
 
 Depending on one's inference technique of choice, the math required to
 build Bayesian learning models can be difficult.  Compounding this problem
 is the fact that current written tutorials on Bayesian techniques tend to
 focus on continuous-valued problems, a poor match for the high-dimensional
 discrete world of text.  This combination makes the cost of entrance to
 the Bayesian learning literature often too high.  The goal of this
 tutorial is to provide sufficient motivation, intuition and vocabulary
 mapping so that one can easily understand recent papers in Bayesian
 learning that are published at conferences like NIPS, and increasingly at
 ACL.  In addition to the standard tutorial materials (slides), this
 tutorial is accompanied by a technical report that spells out all the
 mathematical derivations in great detail, for those who wish to start
 research projects in this field.
 
 This tutorial should be accessible to anyone with a basic understanding of
 statistics.  I use a query-focused summarization task as a motivating
 running example for the tutorial, which should be of interest to
 researchers in natural language processing and in information retrieval.  
 Additionally, though the tutorial does not focus on speech problems, those
 attendees interested in graphical modeling techniques for automatic speech
 recognition might also find the tutorial of interest.

DTEND:20060524T240000
DTSTART:20060524T090000
LOCATION:4th Floor
SUMMARY:Beyond EM: Bayesian Techniques for Human Language Technology Researchers [Hal Daume III]
UID:20060524T090000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As part of an effort to encode the commonsense knowledge we need in
 natural language understanding, I have been looking at several very common
 words and their uses in diverse corpora, and asking what we have to know
 to understand this word in this context.  In this talk, I will describe
 the investigations of the uses of two words -- the adverb "now" and the
 preposition "like".
 
 One might think that "now" simply expresses a temporal property of an
 event.  But in fact in almost every instance, it is used to point up a
 contrast -- "This is true now.  Something else was true then."  It is thus
 more of a relation than a property.  I will describe several categories of
 such relations.  Another question of interest about "now" is "How long a
 period is the word "now" describing in its various uses?": "I'm typing an
 abstract now" vs. "We travel by automobile now."  I suggest some
 categories of knowledge that need to be encoded to answer this question.
 
 When we successfully understand "A is like B", we have figured out some
 property that A and B have in common.  How can we find that property
 computationally?  In the data I looked at, in 80% of the instances, the
 property is explicit in the nearby text, and I will talk about how we can
 identify it.  For the remainder I examine the knowledge we would need in
 order to infer the common property.
 

DTEND:20041022T163000
DTSTART:20041022T150000
LOCATION:11 Large
SUMMARY:Like Now:  Two Explorations in Deep Lexical Semantics [Jerry Hobbs]
UID:20041022T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll describe our entry into the DUC 2004 automatic document summarization
 competition.  We competed only in the single document, headline generation
 task.  Our system is based on a novel kernel dubbed the tree position
 kernel, combined with two other well-known kernels.  Our system performs
 well on white-box evaluations, but does very poorly in the overall DUC
 evaluation.  C'est la vie.

DTEND:20040423T160000
DTSTART:20040423T150000
LOCATION:10 Large
SUMMARY:A Tree-Position Kernel for Document Compression [Hal Daume III]
UID:20040423T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural language interfaces designed for agents that interact with users
 in shared environments (e.g. training simulators, videogames) must
 incorporate knowledge about the users' context in order to address the
 many ambiguities of situated language use. We introduce a model of
 situated language acquisition that operates in two phases.  First,
 intentional context is represented and inferred from user actions using
 probabilistic context-free grammars.  Then, utterances are mapped onto
 this representation in a noisy channel framework.  The acquisition model
 is trained on unconstrained speech collected from subjects playing an
 interactive game, and tested using an understanding task.  Discussion of
 results focuses on the implications both for theoretical models of
 cognition and for natural language applications in shared
 environments.
 

DTEND:20050623T240000
DTSTART:20050623T103000
LOCATION:11 Small
SUMMARY:Intentional Context in Situated Language Learning [Michael Fleischman (MIT)]
UID:20050623T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 1) A serious bottleneck in the development of trainable text summarization
 systems is the shortage of training data. Constructing such data is a very
 tedious task, especially because there are in general many different
 correct ways to summarize a text. Fortunately we can utilize the Internet
 as a source of suitable training data. In this paper, we present a
 summarization system that uses the web as the source of training data. The
 procedure involves structuring the articles downloaded from various
 websites, building adequate corpora of (summary, text) and (extract,
 text) pairs, training on positive and negative data, and automatically
 learning to perform extraction-based summarization.
 
 2) Headlines are useful for users who only need information on the main
 topics of a story. We present a headline summarization system that is
 built at ISI for this purpose and is a top performer for DUC2003's task 1,
 generating very short summaries (10 words or fewer).
 

DTEND:20030523T160000
DTSTART:20030523T150000
LOCATION:11 Large
SUMMARY:A Web-Trained Extraction Summarization System and Headline Summarization at ISI [Liang Zhou]
UID:20030523T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 3:30pm  Mark Hopkins (UCLA)
 Tree Sequence Automata: A Unifying Framework for Tree Relation Formalisms
 
 There exists a wide variety of competing formalisms for representing a
 language of ordered tree pairs.  These include (bottom-up and top-down)  
 tree transducers, synchronous tree-substitution grammars (STSGs),
 synchronous tree-adjoining grammars (STAGs), and inversion transduction
 grammars (ITGs).  Since these formalisms have all developed independently
 of one another, it is difficult to compare their respective
 representational power.  This work seeks to make this task simpler by
 viewing these formalisms as instances of a general unifying formalism,
 which we call tree sequence automata (TSA).  By casting these different
 formalisms in a single framework, we can compare them directly by studying
 the specific subclass of TSA that they fall into.
 
 4:00pm  Jason Riesa (Johns Hopkins)
 A case study in building a cost-effective speech-to-speech machine translation system with sparse resources: English - Iraqi Arabic
 
 The Arabic spoken dialect of Iraq is a language deprived of the vast
 resources that researchers enjoy when working with its written
 counterpart, Modern Standard Arabic (MSA). The Iraqi Arabic lexicon and
 grammar are also sufficiently distinct that the use of existing tools
 or corpora for MSA yields little or no positive effect on machine
 translation output quality.  One can see that building a machine
 translation system normally dependent on a large parallel corpus is a
 particularly difficult task when given just a 37,000-line translated
 parallel text based on transcribed speech. This talk will explore the
 constraints involved in working with this type of data, describe how we
 endeavored to mitigate such problems as a non-standard orthography and a
 highly inflected grammar, and propose a cost-effective way of dealing with
 such projects in the future.
 
 4:30pm  Preslav Nakov (UC Berkeley)
 Multilingual Word Alignment
 
 Recently there has been a growing number of available multilingual
 parallel texts. One such source is the European Union, which publishes its
 official documents in the official languages of all member states
 (sometimes also in the languages of the candidates). Another source is
 the United Nations. These corpora are a great source of training data for
 machine translation between new language pairs. But they also offer the
 opportunity to obtain better pairwise word alignments by looking at
 multiple languages in parallel. In this talk I will present my research as
 a summer intern at ISI on getting better French (Fr) to English (En) word
 alignments using an additional language (Xx). First, I will introduce two
 heuristics which start with pairwise alignments between Fr-Xx, En-Xx and
 Fr-En and then combine them probabilistically (in a linear model) or
 graph-theoretically (by looking at in- and out-degrees for each word).  
 Then I will present two Model 1-inspired alignment models: (a) from "Fr and
 Xx" to En; and (b) from Fr to "En and Xx".

DTEND:20050824T170000
DTSTART:20050824T153000
LOCATION:11 Large
SUMMARY:Summer Student Presentations [Hopkins, Riesa, and Nakov]
UID:20050824T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present an algorithm, Searn (for "search-learn"), that is designed to
 solve structured prediction problems: problems whose goal is to learn to
 predict complex objects such as parts-of-speech, parse trees,
 translations, etc.  Searn functions by "breaking apart" structured
 prediction problems into classification problems in the process of search.  
 I analyze Searn in the framework of learning reductions and show that good
 performance on the underlying classification problems implies good search
 performance.  Moreover, Searn is computationally efficient in a superset
 of the settings where previous algorithms are efficient and is not limited
 by conditional independence assumptions (as in CRFs).  This excessively
 simple and general algorithm turns out to have excellent state-of-the-art
 performance.
 
 This is joint work with John Langford (TTI-C) and Daniel Marcu; and, to a
 lesser extent, with Drew Bagnell (CMU) and Bianca Zadrozny (IBM TJ
 Watson).

DTEND:20060224T163000
DTSTART:20060224T150000
LOCATION:11 Large
SUMMARY:Search-based Structured Prediction [Hal Daume III]
UID:20060224T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Since its inception more than 30 years ago, electronic mail (email)
 has developed into a powerful communication medium with applications
 that extend well beyond simple asynchronous message exchange between
 individuals. Automated tools to support the use of email in
 individual, organizational and social contexts have received
 increasing attention in recent years. Among the tasks that are now
 supported are filtering (e.g., spam detection), aggregation (e.g.,
 mailing list digests), workflow management (e.g., help desk routing),
 and reuse (e.g., retrospective search). We are interested in how
 today's email will be used in the future -- some will certainly be
 preserved (indeed, some MUST be preserved!), and those records will
 serve as powerful evidence of how we lived our lives and organized our
 societies. The challenges of managing many types of electronic record
 collections are receiving increasing attention, but we are not aware
 of any work yet on supporting access to electronic mail archives.
 That will be the focus of this talk.
 
 We will introduce the Open Archival Information Systems (OAIS) model,
 and then focus on two key processes: ingestion and access. Our focus
 in ingestion is on support for review and redaction, which we believe
 will be key enablers to acquisition and near-term access. For access,
 we will address both browsing based on provenance (original order) and
 user-guided reorganization based on search and visualization. Along
 the way, we will identify potentially productive opportunities to
 apply natural language processing technologies such as topic
 segmentation, link detection, and summarization. We will then
 describe two test collections, and demonstrate a system that we have
 developed to explore user-guided reorganization through visualization
 for one of those collections. We will conclude the talk by sketching
 out a research agenda. At that point, we will expect suggestions and
 comments from the audience. Knowing this audience, it is unlikely
 that we will need to wait that long :-).
 

DTEND:20030124T160000
DTSTART:20030124T150000
LOCATION:11 Large
SUMMARY:Access to Archival Collections of Electronic Mail [Doug Oard & Anton Leuski]
UID:20030124T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Parallel texts -- texts that are translations of each other -- are an
 important resource in many cross-lingual NLP applications, such as lexical
 acquisition, cross-language IR, and annotation projection. However, their
 importance is paramount for Statistical Machine Translation (SMT), as they
 provide the training data from which all the translation knowledge is
 learned. The state of the art in SMT is advanced enough that, given
 sufficient parallel data (i.e. a few million words) for any language pair
 in a given domain, a generic SMT system trained on it will achieve a
 reasonable translation performance in that domain. The main reason why SMT
 systems exist only for a handful of languages is that, for most language
 pairs, parallel training data is simply not available.
 
 One way to alleviate this lack of parallel data is to exploit a much
 richer and more diverse resource: comparable corpora, texts which are not
 strictly parallel but related. The prototypical example of comparable
 texts is a pair of news articles in different languages that report on the
 same event. I will present methods for automatic extraction of parallel
 data from such corpora. I will show how to detect parallel data at various
 levels of granularity: parallel documents, parallel sentences, and even
 parallel sub-sentence fragments. The parallel corpora obtained using these
 methods help improve translation performance for both resource-scarce
 language pairs (such as Romanian-English) and resource-rich ones (such as
 Arabic-English).
 

DTEND:20060324T163000
DTSTART:20060324T150000
LOCATION:11 Large
SUMMARY:Automatic creation of parallel corpora [Dragos Munteanu]
UID:20060324T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In recent years a standard model in statistical machine
 translation has emerged, which is based on the translation
 of sequences of words (so-called "phrases") at a time.
 I will describe this model, how to train and decode with it,
 but the focus of this talk will be how to address the
 challenges to advance and move beyond the model: my thesis
 work on noun phrase translation, making use of syntax, and
 better modeling, such as discriminative training.
 
 Bio: Philipp Koehn is the author of papers on natural language
 processing, machine translation, and machine learning. He
 received his PhD from the University of Southern California
 in 2003 (advisor: Kevin Knight), and is currently employed as
 a postdoc at the Massachusetts Institute of Technology, working
 with Michael Collins. He has worked at AT&T Laboratories on
 text-to-speech systems, and at WhizBang! Labs on text
 categorization.
 

DTEND:20040524T170000
DTSTART:20040524T160000
LOCATION:11 Large
SUMMARY:Challenges in Statistical Machine Translation [Philipp Koehn]
UID:20040524T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present some preliminary results on the problem of domain 
 adaptation in maximum entropy models, specifically in the case when there 
 is a large amount of "out of domain" data, and only a very small amount of 
 "in domain" data.  The model and algorithms I present are based on the 
 technique of conditional Expectation Maximization (CEM) and allow for 
 relatively fast optimization of these models.  Preliminary results on some 
 tasks are quite promising.
 
 

DTEND:20040924T163000
DTSTART:20040924T150000
LOCATION:11 Large
SUMMARY:Domain Adaptation in Maximum Entropy Models [Hal Daume III]
UID:20040924T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditional statistical MT systems mostly work at the word
 and phrase level. For different language pairs, the performance of such
 systems varies from some 15% to 35%. These systems suffer from problems
 such as sparse data, with huge vocabulary sizes leading to less
 reliable probability estimates. In our current research, we aim to
 come up with a better MT system by looking inside the words. In almost
 every language, a root (stem) can have many different forms
 (inflectional, derivational, etc.). If we can identify the roots, the
 size of the vocabulary will be quite small, and we can have better
 probability estimates, reducing the sparse data problem and
 potentially leading to higher accuracy. We are trying to come up with
 a model that induces morphology automatically from a bilingual corpus
 and achieves this improvement.
 

DTEND:20030425T160000
DTSTART:20030425T150000
LOCATION:11 Large
SUMMARY:Statistical MT with Bilingual Morphology [Quamrul Tipu]
UID:20030425T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030725T160000
DTSTART:20030725T150000
LOCATION:11 Large
SUMMARY:Super-Carmel for Trees [Jonathan Graehl and Kevin Knight]
UID:20030725T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree-based probability models of translation have been proposed to take
 advantage of parse trees on one side, both sides, or neither side of a parallel
 corpus.  I will present comparative results for these three approaches for
 the task of word alignment on Chinese-English and French-English data, as
 well as some analysis of what is going on behind the numbers.
 

DTEND:20040625T160000
DTSTART:20040625T150000
LOCATION:11 Large
SUMMARY:Syntactic Supervision and Tree-Based Alignment [Dan Gildea]
UID:20040625T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Scamseek project aims to build a surveillance tool for identifying
 financial scams on the Internet by performing document classification of
 Internet pages. There are three principal types of documents of concern:
 those that give financial advice by unregistered advisors, unlawful
 investment schemes, and share ramping.
 
 The first phase of the project has been completed and a working system,
 known as ScamAlert, has been installed at the Australian Securities and Investment
 Commission (ASIC). The independent audit of the performance of the system
 proved satisfactory with a result for precision of .75, recall .43, and
 F = .54, along with identification of 4 scams misclassified by the client.
 Significant improvement in recall is foreshadowed in the 2nd phase of the
 project.  The results are satisfying in the context of the structure of
 the data where the density of scam documents is about 1.8% of the total
 corpus.
 
 The good performance of the operational system is ascribed to the
 combination of using a strong linguistic model of language (Systemic
 Functional Linguistics) to define the scam documents in parallel with a
 rich statistical analysis of the structure of non-scam documents and scam
 look-alikes. A large amount of the experimental program has concentrated
 on understanding and exploiting the interaction between the linguistically
 described aspects of the documents and the statistical properties. Each
 type of data has been used to inform and modify the usage of the other.
 
 The operational aspects of the project have proven to be as challenging as
 the research objectives. The project has a budget of $2.2M over 15 months.
 It has been managed so as to create a balance in resources between the
 needs of both the research objectives and the engineering objectives.
 Software development has concentrated on three aspects: first, producing
 an environment for the strong directive management of computational
 linguistics experiments; second, creating tools to support the linguists'
 manual analysis; and third, applying best-practice software engineering
 principles to ensure a clean automated rollout of the production system
 for ASIC.
 
 The contributing partners in the Scamseek project are The Capital Markets
 Co-operative Research Centre (CMCRC), ASIC, the University of Sydney and
 Macquarie University.

DTEND:20040325T240000
DTSTART:20040325T103000
LOCATION:11 Large
SUMMARY:ScamSeek: Capturing Financial Scams at the Coalface by Language Technology [Jon Patrick (U. of Sydney)]
UID:20040325T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Speech is a crucial component in human computer interaction. While
 tremendous progress has been made in automatic speech recognition, speech
 transcription -- which is the output of automatic speech recognition -- is
 far from providing all the information that one could retrieve from
 speech. For example, prominence, pause, rhythm, and rate of speech all
 carry important information in speech and are crucial in speech
 perception. Inclusion of such information can facilitate better machine
 recognition and understanding of speech.
 
 In this talk, we will introduce our research efforts and results on speech
 rate, prominence, disfluency, and utterance boundary detection. We will
 also show some interesting applications utilizing these features in
 natural language understanding and dialog management.

DTEND:20050325T163000
DTSTART:20050325T150000
LOCATION:11 Large (THIS HAS CHANGED!!!)
SUMMARY:Metalinguistic feature study for spontaneous speech in human computer interaction [Dagen Wang]
UID:20050325T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Parsing and translating natural languages can be viewed as
 structured-prediction problems. We outline the crucial design
 decisions that must be made to build a machine to solve structured
 prediction problems, and explain our particular choices for these two
 large-scale NLP problems.  Our approach uses a purely discriminative
 learning method that scales up well to problems of this size.  Unlike
 currently popular methods, this one does not require a great deal of
 feature engineering a priori, because it performs feature selection
 over a compound feature space as it learns.  Accuracy on constituent
 parsing was at least as good as other comparable methods.  To our
 knowledge, it is the first purely discriminative learning algorithm
 for translation with tree-structured models.  Experiments demonstrate
 the method's versatility, accuracy, and efficiency.
 

DTEND:20060623T163000
DTSTART:20060623T150000
LOCATION:11 Large
SUMMARY:Discriminative Training for Large-Scale NLP [Joseph Turian (NYU)]
UID:20060623T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will introduce some of the technologies which
 we have developed in the project on an English reading assistant system
 called English Reading Wizard. The technologies include a method for
 mining translations from the web (non-parallel corpora), a method for word
 translation disambiguation based on bootstrapping, which is called
 Bilingual Bootstrapping, and a general method of bootstrapping, which is
 called Collaborative Bootstrapping. First, I will introduce the main
 features of English Reading Wizard. Next, I will introduce each of the
 methods. The translation mining method is based on a naïve Bayesian
 ensemble and the EM algorithm. Bilingual Bootstrapping uses the
 asymmetric translation relationship between words in the two languages
 in translation and can construct reliable classifiers for word
 translation disambiguation. Collaborative Bootstrapping contains the
 co-training algorithm as its special case, and it uses the strategy of
 uncertainty reduction in training of the two classifiers.
 
 Bio:
 
 Hang Li is a researcher at the Natural Language Computing Group
 of Microsoft Research in Beijing, China. He is also an adjunct professor at
 Xian Jiaotong University. Hang Li obtained a B.S. in Electrical
 Engineering from Kyoto University (Japan) in 1988 and an M.S. in Computer
 Science from Kyoto University in 1990. He earned his Ph.D. in Computer
 Science from the University of Tokyo in 1998. From 1990 to 2001, Hang
 Li worked at the Research Laboratories of NEC Corporation in Kawasaki,
 Japan. He joined Microsoft Research in 2001.  His research interests
 include statistical learning, natural language processing, data mining,
 and information retrieval. Hang Li's web site:
   http://research.microsoft.com/users/hangli/
 

DTEND:20031125T240000
DTSTART:20031125T223000
LOCATION:11th Floor Large
SUMMARY:Using Bilingual Data to Mine and Rank Translations [Hang Li (MSR Beijing)]
UID:20031125T223000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 3:00pm  Victoria Fossum (Michigan)
 Exploring the Continuum between Phrase-based and Syntax-based Machine Translation
 
 State-of-the-art statistical machine translation systems use lexical
 phrases as the basic unit of translation.  Phrase-based systems can
 capture those aspects of translation that are sensitive to local context.  
 Syntax-based systems, on the other hand, make use of linguistically
 motivated syntactic structure, can capture long-distance dependencies and
 reorderings, and offer greater generalization in translation rules.  
 However, their performance lags that of phrase-based systems.
 
 Hierarchical phrase-based translation, introduced by [Chiang 05], provides
 an elegant framework for exploring the continuum between phrase-based and
 syntax-based translation.  This system adopts the "formal machinery" of
 syntax-based systems without any "linguistic commitment" to a particular
 syntactic structure [Chiang 05].
 
 I will present results from my re-implementation of Chiang's hierarchical
 phrase-based system, and (if time permits) compare those results with the
 following systems on Chinese-English translation: ISI's phrase-based
 system, and ISI's syntax-based system.  Between now and December 2005, I
 plan to incrementally explore the space between phrase-based and
 syntax-based systems by augmenting these hierarchical phrase-based rules
 with richer syntactic annotation.
 
 
 3:30pm  Liang Huang (Penn) and Hao Zhang (Rochester)
 Efficient Integration of n-gram Language Models with Syntax-based Decoding
 
 We first give an overview of the ISI syntax-based MT system which is based
 on tree-to-string (xRs) translation rules. The biggest problem at this
 stage is the inefficiency of the integration of n-gram models.  Without
 n-gram models, the xRs translation rules can be easily binarized with
 respect to the foreign language to ensure cubic-time decoding. With n-gram
 models, however, binarization without considering both languages will lead
 to exponential complexity.
 
 Inspired by Inversion Transduction Grammar (ITG) (Wu, 97), we will focus
 on the so-called ITG-binarizable rules, which account for over 99% of the
 whole rule set. A simple linear-time algorithm will be presented to do the
 binarization. Decoding with ITG-like rules is of low polynomial complexity
 in both time and space. We will discuss experimental results on both
 efficiency and accuracy of decoding with the new binarization.  If time
 permits, we will also present the "hook trick" (inspired by (Eisner and
 Satta, 99)) to even further reduce the polynomial complexity of the
 decoding process.

DTEND:20050826T163000
DTSTART:20050826T150000
LOCATION:11 Large
SUMMARY:Summer Student Presentations [Fossum, Huang and Zhang]
UID:20050826T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many research efforts are addressing the problem of enabling automatic
 summarization of opinions and assessments stated on the web in product
 reviews, discussion forums, and blogs. One key difficulty is that relevant
 assessments scattered throughout web pages are obscured by variations in
 natural language. In this paper, we focus on a novel aspect of enabling
 aggregations of assessments of the degree to which a given property holds for
 a given entity (for instance, how touristy is Boston). We present
 GrainPile, a user interface for extracting from the web, aggregating and
 quantifying degree assessments of unconstrained topics. The interface
 provides a variety of functions: a) identification of dimensions of
 comparison (properties) relevant to a particular entity or set of
 entities, b) comparisons of like entities on user-specified properties
 (for example, which university is more prestigious, Yale or Cornell), c)
 tracing the derived opinions back to their sources (so that the reasons
 for the opinions can be found). A central contribution in GrainPile is the
 evaluated demonstration of the feasibility of mapping the recognized
 expressions (such as fairly, very, extremely, and so on) to a common scale
 of numerical values and aggregating across all the extracted assessments
 to derive an overall assessment of degree. GrainPile's novel
 assessment and aggregation of degree expressions is shown to strongly
 outperform an interpretation-free, co-occurrence based method.
 
 Full paper:
 
 http://www.isi.edu/~timc/papers/IUI06-grainpile-chkl.pdf
 
 

DTEND:20060126T140000
DTSTART:20060126T130000
LOCATION:4th floor
SUMMARY:GrainPile: Deriving Quantitative Overviews of Free Text Assessments on the Web [Tim Chklovski]
UID:20060126T130000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will survey results of several recent projects we have been
 undertaking in automated text categorization based upon the style,
 rather than the topic, of the documents.  I will describe a general
 text-categorization framework using machine learning along with general
 principles for choosing stylistically relevant sets of features for
 learning effective classification models.  Applications of these methods
 include determining author gender and text genre in published books and
 articles, authorship attribution of email messages, and analysis of
 language use in different scientific fields.  In many cases, the models
 that are learned also give some insight into the respective styles being
 distinguished, which I will also discuss.
 
 Shlomo Argamon is an associate professor at the Illinois Institute of
 Technology Chicago.
 

DTEND:20040326T150000
DTSTART:20040326T133000
LOCATION:11 Large
SUMMARY:On Writing, Our Selves: Explorations in Stylistic Text Categorization [Shlomo Argamon]
UID:20040326T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: These are two practice talks for our upcoming thesis defenses.  The titles 
 and abstracts are:
 
 --------------------------------------------------------------------------
 
 NATURAL LANGUAGE GENERATION FOR TEXT-TO-TEXT APPLICATIONS USING AN INFORMATION-SLIM REPRESENTATION
 
 Radu Soricut
 
 In this talk, I describe a new natural language generation paradigm, based
 on direct transformation of textual information into well-formed textual
 output.  I support this language generation paradigm with theoretical
 contributions in the field of formal languages, new algorithms, empirical
 results, and software implementations. At the core of this work is a novel
 representation formalism for probability distributions over finite
 languages. Due to its convenient representation and computational
 properties, this formalism supports a wide range of language generation
 needs, from sentence realization to text planning.
 
 Based on this formalism, I describe, implement, and analyze theoretically
 a family of algorithms that perform language generation using direct
 transformations of text. These algorithms use stochastic models of
 language to drive the generation process. I perform extensive empirical
 evaluations using my implementation of these algorithms. These evaluations
 show state-of-the-art performance in automatic translation, and
 significant improvements in state-of-the-art performance in abstractive
 headline generation and coherent discourse generation.
 
 
 --------------------------------------------------------------------------
 
 PRACTICAL STRUCTURED LEARNING FOR NATURAL LANGUAGE PROCESSING
 
 Hal Daume III
 
 Natural language processing is replete with problems whose outputs are
 highly complex and structured.  The current state-of-the-art in machine
 learning is not yet sufficiently general to be applied to general problems
 in NLP.  In this thesis, I present Searn (for "search" + "learn"), an
 approach to learning for structured outputs that is applicable to the wide
 variety of problems encountered in natural language.  Searn operates by
 transforming structured prediction problems into a collection of
 classification problems, to which any standard binary classifier may be
 applied.  From a theoretical perspective, Searn satisfies a strong
 fundamental performance guarantee: given a good classification algorithm,
 Searn yields a good structured prediction algorithm.  To demonstrate
 Searn's general applicability, I present applications in such diverse
 areas as automatic document summarization and entity detection and
 tracking.  In these applications, Searn is empirically shown to achieve
 state-of-the-art performance.

DTEND:20060526T170000
DTSTART:20060526T150000
LOCATION:11 Large
SUMMARY:Defense Practice Talks: Generation and Learning [Radu Soricut and Hal Daume III]
UID:20060526T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030827T160000
DTSTART:20030827T150000
LOCATION:11 Large
SUMMARY:Syntax for Statistical MT [Michel Galley and Mark Hopkins]
UID:20030827T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A multi-document summary gives the "gist" of what is contained in a
 collection of related documents. But how can we define a "gist?" We
 explore this question by analyzing human written summaries for clusters of
 document sets. In particular, we estimate the probability that a word will
 be chosen by a human to be included in a summary. We demonstrate that if
 this probability model were given by an oracle, then a simple automatic
 method of summarization can produce extract summaries which are
 statistically indistinguishable from the human summaries.
 
 About the Speaker:
 
 John M. Conroy received a B.S. in Mathematics from Saint Joseph's
 University in 1980 and a Ph.D. in Applied Mathematics from the University
 of Maryland in 1986. Since then he has been a research staff member for
 the IDA Center for Computing Sciences in Bowie, MD. His research interest
 is applications of numerical linear algebra and statistics. He is a member
 of the Society for Industrial and Applied Mathematics, Institute of
 Electrical and Electronics Engineers (IEEE), and the Association for
 Computational Linguistics.
 

DTEND:20060127T163000
DTSTART:20060127T150000
LOCATION:11 Large
SUMMARY:Multi-Document Summary Space: What do People Agree is Important? [John Conroy]
UID:20060127T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030627T160000
DTSTART:20030627T150000
LOCATION:10 Large
SUMMARY:Offline Strategies for Online Question Answering: Answering Questions Before They Are Asked and Maximum Entropy Models for FrameNet Classification [Michael Fleischman]
UID:20030627T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We describe a new sentence realization framework for text-to-text
 applications. This framework uses IDL-expressions as a representation
 formalism, and a generation mechanism based on algorithms for intersecting
 IDL-expressions with probabilistic language models. We present both
 theoretical and empirical results concerning the correctness and
 efficiency of these algorithms.
 

DTEND:20050527T163000
DTSTART:20050527T150000
LOCATION:11 Small
SUMMARY:Towards Developing Generation Algorithms for Text-to-Text [Radu Soricut]
UID:20050527T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Probabilistic parsing methods have in recent years transformed our ability to
 robustly find correct parses for open domain sentences.  Much of this work has
 been within a common architecture of heuristic search for good parses in
 lexicalized probabilistic context-free grammars, with many layers of back-off
 to avoid problems of sparse data. 
 
 In this talk, I will outline some different ideas that we have been pursuing. 
 I will connect stochastic parsing with finding shortest paths in hypergraphs,
 and show how this approach naturally provides a chart parser for arbitrary
 probabilistic context-free grammars (finding shortest paths in a hypergraph is
 easy; the central problem of parsing is that the hypergraph has to be
 constructed on the fly). From this viewpoint, a natural approach is to use the
 A* algorithm to cut down the work in finding the best parse. On unlexicalized
 grammars, this can reduce the parsing work done dramatically, by at least 97%.
 This approach is competitive with methods standardly used in statistical
 parsers, while ensuring optimality, unlike most heuristic approaches to
 best-first parsing. 
 
 Finally, I will present a novel modular generative model in which semantic
 (lexical dependency) and syntactic structures are scored separately. This
 factored model is conceptually simple, linguistically interesting, admits exact
 inference with an extremely effective A* algorithm, and provides
 straightforward opportunities for separately improving the component models. In
 particular, I will mention some of the work we have done focusing on the PCFG
 component to produce a very high accuracy unlexicalized grammar. 
 
 This is joint work with Dan Klein. 
 
 About the Speaker:
 
 Christopher Manning is an Assistant Professor of Computer Science and
 Linguistics at Stanford University. He received his Ph.D. from Stanford
 University in 1995, and served on the faculty of the Computational Linguistics
 Program at Carnegie Mellon University (1994-1996) and the University of Sydney
 Linguistics Department (1996-1999) before returning to Stanford. His research
 interests include probabilistic models of language, natural language parsing,
 constraint-based linguistic theories, syntactic typology, information
 extraction and text mining, and computational lexicography. He is the author of
 three books, including Foundations of Statistical Natural Language Processing
 (MIT Press, 1999, with Hinrich Schuetze). 
 
 Chris' schedule is available in Postscript (manning.ps) or PDF
 (manning.pdf) format.

DTEND:20031027T110000
DTSTART:20031027T100000
LOCATION:11 Large
SUMMARY:Natural Language Parsing: Graphs, the A* Algorithm, and Modularity [Christopher Manning (Stanford)]
UID:20031027T100000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 

DTEND:20040428T170000
DTSTART:20040428T150000
LOCATION:11 Large
SUMMARY:Practice Talks for HLT/NAACL [Dragos Munteanu, Radu Soricut and Hal Daume III]
UID:20040428T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Research on extracting event duration information from texts is 
 potentially very important in applications in which the time course of 
 events is to be extracted from news.  For example, whether two events 
 overlap or are in sequence often depends very much on their durations.  If 
 a war started yesterday, we can be pretty sure it is still going on today. 
 If a hurricane started last year, we can be sure it is over by now.
 
 In the talk, I will first present our work on constructing an annotated 
 corpus for extracting information about the typical durations of events 
 from texts, including the annotation guidelines, the event classes we 
 categorized, the way we use normal distributions to model such vague and 
 implicit temporal information, and how we evaluate inter-annotator 
 agreement. I will then show that machine learning techniques applied to 
 this data yield coarse-grained event duration information, considerably 
 outperforming a baseline and approaching human performance.
 
 At the beginning of the talk, I will also give a brief overview of the 
 time ontology (OWL-Time, formerly DAML-Time) we have developed, which is 
 represented in both first-order logic and the OWL web ontology language.
 

DTEND:20060428T163000
DTSTART:20060428T150000
LOCATION:11 Large
SUMMARY:Learning Event Durations from Event Descriptions [Feng Pan]
UID:20060428T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We introduce two probabilistic models that can be used to identify 
 elementary discourse units and build sentence-level discourse parse 
 trees. The models use syntactic and lexical features. A discourse parsing
 algorithm that implements these models derives discourse parse trees with
 an error reduction of 18.8% over a state-of-the-art decision-based
 discourse parser. A set of empirical evaluations shows that our discourse
 parsing model is sophisticated enough to yield discourse trees at an
 accuracy level that matches near-human levels of performance.
 

DTEND:20030228T160000
DTSTART:20030228T150000
LOCATION:11 Large
SUMMARY:Sentence Level Discourse Parsing using Syntactic and Lexical Information [Radu Soricut]
UID:20030228T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk has two parts. In the first part, I will introduce research
 activities in Speech-to-Speech Translation at ATR, including on-going
 research on statistical machine translation. In the second part, I will
 present a new approach to QA named Question-Biased Term Extraction (QBTE).
 The QBTE directly extracts answers as terms biased by the question. To
 confirm the feasibility of our QBTE approach, we conducted experiments on
 the CRL QA Data based on 10-fold cross validation, using Maximum Entropy
 Models as an ML technique. Experimental results showed that the trained
 system achieved approximately 0.35 in MRR and 50% in TOP5 accuracy. This
 part is an English version of my presentation given in IPSJ SIGNL-163 in
 2004 in Japanese. If time allows, I would like to introduce the NTCIR-5
 (2004/2005) Cross-Lingual QA task (CLQA) that I am going to organize.
 
 About the speaker:
 
 Yutaka Sasaki received his Ph.D. in Engineering from the University of
 Tsukuba, Japan in 2000 for his work on generating Information Extraction
 rules with hierarchically sorted Inductive Logic Programming. He joined NTT
 Laboratories in 1988. Since then, he has been involved in research in
 rule-based CAI, inductive logic programming, Information Extraction, and
 Question Answering. From 1995 to 1996, he spent one year at Simon Fraser
 University, Canada as a visiting researcher. From 1999, he led a subgroup
 to develop the first practical Japanese Question Answering System SAIQA.
 Then, he applied SVMs to automatically construct the QA system SAIQA-II
 from QA and NE data. In June 2004, he moved to ATR Spoken Language
 Translation Research Laboratories. Currently, he is the head of Department
 of Natural Language Processing. He is also an organizer of the NTCIR 5
 Cross-Lingual Question Answering Task.
 

DTEND:20050128T163000
DTSTART:20050128T150000
LOCATION:11 Large
SUMMARY:Research Activities in Speech Translation at ATR/QA as Question-Biased Term Extraction [Yutaka Sasaki (ATR)]
UID:20050128T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030829T160000
DTSTART:20030829T150000
LOCATION:11 Large
SUMMARY:Deepening Representations [Stefan Riezler]
UID:20030829T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20030729T160000
DTSTART:20030729T150000
LOCATION:11 Small
SUMMARY:A Model of Word Movement for Machine Translation [Michael Brasser]
UID:20030729T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 
 The topics & approximate start times:
 
 (3:00 sharp) My 7-10 min bit for panel discussion on "Manual vs. Automated
 Knowledge Acquisition"
 
 Will touch on web extraction vs. learning from volunteers -- strengths and
 weaknesses, new thoughts on synergies
 
 (3:15) Designing Intelligent Acquisition Interfaces for Collecting World
 Knowledge from Web Contributors
 (paper by Timothy Chklovski, Yolanda Gil)
 
 (3:55) Collecting Paraphrase Corpora from Volunteer Contributors (paper by
 Timothy Chklovski)

DTEND:20050929T163000
DTSTART:20050929T150000
LOCATION:11 Large
SUMMARY:Previews of my talks for K-CAP [Tim Chklovski]
UID:20050929T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Summarization requires one to identify the internal structure of
 information and to bring that to the surface both operationally and
 organizationally.
 
 How does one put this theory to practice and build real summarization
 systems? How do the systems built based on this idea perform?
 

DTEND:20040430T163000
DTSTART:20040430T150000
LOCATION:11 Large
SUMMARY:Automating the Building of Summarization Systems [Liang Zhou]
UID:20040430T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: PropBank: the next stage of Treebank
 
 Natural-language engineers the world over are coming to a consensus that a
 degree of semantic knowledge is a necessary addition to purely structural
 representations of language.  This talk describes the Propbank project at
 Penn, which provides a complete shallow semantic parse of the Treebank II
 corpus.
 
 Inducing a Chronology of the Pali Canon:
 
 Works such as Kroch (1989), Taylor (1994) and Han (2000) have demonstrated
 that syntactic change can be described mathematically as the competition
 between innovating and archaic formations.  This paper demonstrates how
 this same mathematical description can be turned around to predict the
 date of a historical text.  The Middle Indic period showed dramatic change
 in the morphological system, such as the collapse of the past-tense verbal
 system.  Whereas Sanskrit had three competing formations, each with
 multiple possible morphological realizations, Pali (a Middle Indo-Aryan
 language) had only a single formation, based mostly on the sigmatic aorist
 although many archaic nonsigmatic aorists are also attested.  The
 proportions of the archaic and innovative forms can be easily calculated
 for each text in the Pali Canon and these proportions used to assign an
 approximate date for each text.  The accuracy of the method can be
 assessed qualitatively by comparing the derived chronology to chronologies
 based on various non-linguistic criteria, or quantitatively by comparing
 the derived chronology to a known dating scheme.  For the latter it is
 necessary to turn to a different dataset, such as that describing the rise
 of do-support in Early Modern English, as described in Ellegard (1953) and
 Kroch (1989).
 
 Bio:
 
 Paul Kingsbury graduated summa cum laude in linguistics from Ohio State
 University in 1993 with a thesis on "Some sources for L-words in
 Sanskrit".  He subsequently entered the University of Pennsylvania to
 study historical linguistics and Sanskrit, but (like most historical
 students) was diverted to computational issues.  He joined the Propbank
 project in 2000 and soon thereafter engineered a major rethinking of the
 methods and goals of the project, in order to make the annotation
 linguistically meaningful.  He completed his doctorate in 2002 with a
 thesis entitled 'The Chronology of the Pali Canon: the case of the
 aorist'.
 

DTEND:20040130T163000
DTSTART:20040130T150000
LOCATION:11 Large
SUMMARY:PropBank: the next stage of Treebank and Inducing a Chronology of the Pali Canon [Paul Kingsbury (Penn)]
UID:20040130T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This is a practice talk for a paper by Giorgio Satta and Enoch Peserico)
 
 This paper investigates some computational problems associated with
 probabilistic translation models that have recently been adopted in the
 literature on machine translation. These models can be viewed as pairs of
 probabilistic context-free grammars working in a `synchronous' way. Two
 hardness results for the class NP are reported, along with an exponential
 time lower-bound for certain classes of algorithms that are currently used
 in the literature.
 

DTEND:20050930T163000
DTSTART:20050930T150000
LOCATION:4 Large
SUMMARY:Some Computational Complexity Results for Synchronous Context-Free Grammars [David Chiang]
UID:20050930T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will give a status report on my current thesis work on
 noun phrase translation. The motivation of this work is
 to break up the machine translation problem into smaller,
 more manageable units. The treatment of noun phrase translation
 as a subtask of machine translation is both linguistically
 and empirically motivated. My approach is to generate 
 an n-best list of candidate translations with a statistical 
 machine translation system and rerank the candidates with
 additional features. For about 90% of all noun phrases we
 can find an acceptable translation in the 100-best list, while 
 an acceptable translation comes out on the very top for only 
 about 60% of the noun phrases. I will discuss a variety of 
 linguistic and empirical features that (may) help to move 
 the acceptable translations higher in the list. I will also
 present results on modeling issues such as phrase-based 
 translation and compound splitting. This talk is also 
 intended as a fishing expedition for feature suggestions by
 the audience.
 

DTEND:20030131T160000
DTSTART:20030131T150000
LOCATION:11 Large
SUMMARY:Noun Phrase Translation [Philipp Koehn]
UID:20030131T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Prague Dependency Treebank project is aimed at a linguistically
 complex, multi-tier annotation of relatively large amounts of naturally
 occurring sentences of natural language. There are four tiers at present:
 the basic token tier (level 0), and the morphological, surface-syntactic,
 and semantic (called "tectogrammatics") tiers. The syntactic and
 tectogrammatic tiers are based on a richly labelled dependency
 representation principle. So far, the project has produced three corpora: the
 Czech-language-only Prague Dependency Treebank, the Prague Czech-English
 Dependency Treebank and the Prague Arabic Dependency Treebank. In the
 talk, the principles of the Prague Dependency Treebank linguistic
 annotation scheme will be presented. Some technical details will also be
 discussed, as well as some of the tools developed both for the manual
 annotation itself and for corpus-based NLP of Czech, English and Arabic.
 

DTEND:20050805T120000
DTSTART:20050805T103000
LOCATION:11 Large
SUMMARY:The Family of Prague Dependency Treebanks [Jan Hajic (Charles U)]
UID:20050805T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Practice tutorial for ACL/COLING 2006)
 
 Once upon a time, synchronous grammars and tree transducers were esoteric
 topics in formal language theory, far removed from the practice of
 building real, large-scale natural language systems. However, these tools
 are now rapidly becoming essential for modeling machine translation and
 other complex language transformations. It has therefore become practical
 and important to understand the basic properties of tree transformation
 systems, which we cover in this tutorial.
 

DTEND:20060630T170000
DTSTART:20060630T140000
LOCATION:11 Large
SUMMARY:Synchronous Grammars and Tree Transducers [David Chiang and Kevin Knight]
UID:20060630T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is a practice talk for my Ph.D. defense, which  
 will be held on Aug 24th 3-5pm, SAL 322.
 
 An important problem in the area of homeland security and fraud  
 detection is to identify abnormal entities in large datasets.   
 Although there are methods from knowledge discovery and data mining  
 focusing on finding anomalies in numerical datasets, there has been  
 little work aimed at discovering abnormal or suspicious instances in  
 large and complex semantic graphs whose nodes are richly connected  
 with many different types of links. In this talk, I will describe a  
 novel, domain-independent and unsupervised framework to identify such  
 instances.  Besides discovering suspicious instances, we believe that  
 to complete the discovery process and to deal with the "curse of  
 false positives", a system has to convince the users by providing  
 explanations for its findings. Therefore, in the second part of the  
 talk I will describe an explanation mechanism to automatically  
 generate human-understandable explanations for the discovered  
 results. Experimental results show that our discovery system  
 outperforms state-of-the-art unsupervised network algorithms used to  
 analyze the 9/11 terrorist network by a large margin. Additionally, a  
 human study we conducted demonstrates that our explanation system,  
 which provides natural language explanations for its findings,  
 allowed human subjects to perform complex data analysis in a much  
 more efficient and accurate manner.
 
 

DTEND:20060804T163000
DTSTART:20060804T153000
LOCATION:11 Large
SUMMARY:Ph.D. defense practice talk [Shou-de Lin]
UID:20060804T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora
 Dragos Munteanu
 
 We present a novel method for extracting parallel sub-sentential fragments
 from comparable bilingual corpora. Currently, the state of the art in
 comparable corpus mining is only able to extract full sentence pairs which
 are judged to be parallel. We advance the state of the art by showing how
 to obtain useful data even from not-fully-parallel sentences. By analyzing
 sentence pairs using a signal-processing-inspired approach, we detect
 which segments of the source sentence are translated into segments of the
 target sentence, and which are not. We evaluate the quality of the
 extracted data by showing that it improves the performance of a
 state-of-the-art machine translation system.
 
 
 Advances in Discriminative Parsing
 Joseph Turian
 
 The present work advances the accuracy and training speed of
 discriminative parsing. Our discriminative parsing method has no
 generative component, yet surpasses a generative baseline on constituent
 parsing, and does so with minimal linguistic cleverness. Our model can
 incorporate arbitrary features of the input and parse state, and performs
 feature selection incrementally over an exponential feature space during
 training. We demonstrate the flexibility of our approach by testing it
 with several parsing strategies and various feature sets.

DTEND:20060711T160000
DTSTART:20060711T143000
LOCATION:11 Large
SUMMARY:Practice Talks for ACL [Dragos Munteanu + Joseph Turian]
UID:20060711T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is about an improved approach for learning dependency parsers
 from treebank data. Our technique is based on two ideas for improving
 large margin training in the context of dependency parsing.  First, we
 incorporate local constraints that enforce the correctness of each
 individual link, rather than just scoring the global parse tree. Second,
 to cope with sparse data, we smooth the lexical parameters according to
 their underlying word similarities using Laplacian Regularization.  To
 demonstrate the benefits of our approach, we consider the problem of
 parsing Chinese treebank data using only lexical features, that is,
 without part-of-speech tags or grammatical categories.  We achieve
 state-of-the-art performance, improving upon current large margin approaches.
 
 Here is the link for the paper:
   http://www.cs.ualberta.ca/~wqin/papers/depar_margin_conll06.pdf
 
 About the speaker:
 
 Qin Iris Wang is a Ph.D. student at the University of Alberta,
 working with Dekang Lin and Dale Schuurmans. Her research interests
 are in natural language processing and machine learning. Specifically,
 she has been working on dependency parsing using both generative and
 discriminative methods.

DTEND:20060728T163000
DTSTART:20060728T150000
LOCATION:11 Large
SUMMARY:Improved Large Margin Dependency Parsing via Local Constraints and Laplacian Regularization [Qin Iris Wang (Alberta)]
UID:20060728T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Noisy word alignments negatively affect the quality of the translation 
 rules extracted by the ISI syntax-based MT system.  In the literature, 
 alignment is typically treated as a separate process from subsequent 
 stages in the MT pipeline.  By contrast, we allow rule extraction to 
 guide the alignment process.
 
 We present an unsupervised algorithm for identifying and removing "bad" 
 links using GHKM syntax-based rule extraction.  We show that
 we can improve upon the precision of GIZA union (measured against a gold 
 standard set of manually aligned Chinese-English sentence pairs),
 while only decreasing recall slightly.
 
 
 (Note: This is part of the Summer Intern Series)

DTEND:20060825T153000
DTSTART:20060825T150000
LOCATION:11 Large
SUMMARY:Improving Precision of Word Alignments Using GHKM Syntax-Based Rule Extraction [Victoria Fossum (Michigan)]
UID:20060825T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA
 
 (Note: This is part of the Summer Intern Series)

DTEND:20060823T160000
DTSTART:20060823T153000
LOCATION:11 Large
SUMMARY:Speeding-up Syntax-based Decoding [Joseph Turian (NYU)]
UID:20060823T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is about modeling the Syntax-Based Machine Translation  
 (SBMT) problem within the Searn (Search & Learn) framework developed by Hal Daume in  
 his PhD thesis. I will present the way we define the states, actions
 and the search space and how to implement the cost function.
 
 
 (Note: This is part of the Summer Intern Series)

DTEND:20060823T153000
DTSTART:20060823T150000
LOCATION:11 Large
SUMMARY:Towards combining Searn and Syntax-Based Machine Translation (SBMT) [Oana-Diana Postolache]
UID:20060823T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Textual Entailment has been proposed recently as a generic framework
 for modeling semantic variability in many Natural Language Processing
 applications, such as Question Answering, Information Extraction,
 Information Retrieval and Document Summarization. The Textual
 Entailment relationship holds between two text fragments, termed text
 and hypothesis, if the truth of the hypothesis can be inferred from
 the text.
 
 In this talk, the Textual Entailment framework will be introduced.
 I'll then present an algorithm for large-scale Web-based acquisition
 of entailment rules, a type of knowledge needed for robust inference.
 Finally, I will present an unsupervised Relation Extraction approach
 based on the Textual Entailment framework.
 
 About the speaker:
 
 Idan Szpektor is a PhD student under the supervision of Dr. Ido Dagan
 at Bar Ilan University, Israel. His current research activity is in
 acquisition of knowledge for textual entailment.
 
 

DTEND:20060811T163000
DTSTART:20060811T150000
LOCATION:11 Large
SUMMARY:Textual Entailment: Framework, Learning and Applications [Idan Szpektor (Bar-Ilan U)]
UID:20060811T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this summer project, we investigate a scalable method to extract
 Chinese-English name transliterations from large comparable corpora,
 which consist of texts in two languages discussing the same or similar
 topics. We show that the bigram Jaccard coefficient is a good similarity
 measure for comparing English and Chinese names at the Chinese
 pronunciation (Pinyin) level. Based on this phonetic similarity score, an
 efficient randomized algorithm is then used to find name pair candidates
 from English and Chinese lists. Finally, context information, such as
 dates, frequency, place and titles, is combined with the phonetic
 similarity to improve the accuracy of the name pairs list.
 
 (Note: This is part of the Summer Intern Series)

DTEND:20060818T153000
DTSTART:20060818T150000
LOCATION:11 Large
SUMMARY:Name Entity Transliteration Discovery from Large Bilingual Comparable Corpora [Chenhai Xi]
UID:20060818T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will examine problems encountered in coming to some
 kind of understanding of one sonnet by Shakespeare (his 64th), ask
 what it would take to solve these problems computationally, and
 suggest routes to the solution.  The general conclusion is that we
 are closer to this goal than one might think.  Or are we?
 
 Bio:
 
 Jerry Hobbs is famous primarily for having an office next to Kevin
 Knight's and a parking space next to Ed Hovy's.  He has read
 everything of Shakespeare's that survives, including his will and
 plays of dubious authorship.  But that was all a long time ago.

DTEND:20061215T163000
DTSTART:20061215T150000
LOCATION:11 Large
SUMMARY:When Will Computers Understand Shakespeare? [Jerry Hobbs]
UID:20061215T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you speak a little perl, are an occasional perl-scripter, and  
 would like to know more about how to use it as a (p)ortable,
 (e)fficient, and (r)eadable (l)anguage, you may be interested in my  
 brown bag (read: bring your own) lunch seminar:
 
 I will talk about using Perl in a portable fashion, the environment  
 it is run in, and how to avoid common mistakes and misconceptions. Perl  
 offers more than a thousand ways to solve a problem, but some are  
 more portable or more efficient than others. If time permits, simple  
 hands-on examples can be tried out during the talk, so power for  
 laptops will be provided.

DTEND:20061023T133000
DTSTART:20061023T240000
LOCATION:11 Large
SUMMARY:perl - how to use it, not abuse it [Jens-Soenke Voeckler]
UID:20061023T240000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As machine learning algorithms and their application for NLP become
 better understood, attention turns toward the production of annotated
 corpora to which they can be applied.  Numerous phenomena present
 themselves for annotation, including aspects in lexical semantics,
 discourse, pragmatics, and dialogue.  But several questions
 immediately must be answered:  
 
 1. How does one obtain a balanced corpus to annotate?  What is a
 balanced corpus?  
 
 2. How does one decide which aspects to annotate? How does one
 adequately express the theory behind the phenomena in simple annotation steps? 
 
 3. Which annotators does one hire?  How does one ensure that they are adequately trained?  
 
 4. How does one establish a simple, fast, and trustworthy annotation
 procedure?  What interfaces does one build?  How does one ensure that
 the interfaces do not affect the annotation results?  
 
 5. How does one evaluate the results? What are the appropriate agreement
 measures?  At which cutoff points should one re-do the annotations?
 How does one ensure improvement?
 
 6. How should one formulate and store the results?  How does one
 ensure compatibility with other existing resources?  How does one make
 results available for best impact?  
 
 7. How does one report the annotation effort and results?  How does
 one actually get a paper on this work published at an important
 conference?  What should the paper contain?   
 
 Despite their being so basic, there is almost no established procedure
 or standard set of answers to these questions today.  In this talk I
 discuss some of these aspects, pointing to the lessons learned in the
 ongoing OntoNotes project (joint with BBN, the University of Colorado
 (PropBank), the University of Pennsylvania (Treebank), and ISI).

DTEND:20060922T163000
DTSTART:20060922T150000
LOCATION:11 Large
SUMMARY:Toward a 'Science' of Annotation: Experiences from OntoNotes  [Eduard Hovy]
UID:20060922T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Inflected languages in a low-resource setting present a data sparsity problem for
 statistical machine translation. In this work, we present a minimally  
 supervised algorithm for morpheme segmentation on Arabic dialects  
 which reduces unknown words at translation time by over 50%, total  
 vocabulary size by over 40%, and yields a significant increase in  
 BLEU score over a previous state-of-the-art phrase-based statistical MT system.

DTEND:20060825T160000
DTSTART:20060825T153000
LOCATION:11 Large
SUMMARY:Minimally Supervised Morphological Segmentation with Applications to Machine Translation [Jason Riesa]
UID:20060825T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We begin by describing a set of pruning constraints that are applied
 in the literature to effectively restrict the search space of
 synchronous PCFGs intersected with target language model contexts. We
 apply these constraints to non-binarized grammars with a large number
 of non-terminals and demonstrate effective parsing within the
 framework of Wu, 97.  
 
 We then present a novel parsing approach that avoids language model
 context intersection during parsing in favor of language model driven
 n-best list extraction.  The parsing step produces a sentence
 spanning parse forest which is explored in left-to-right target order
 by the N-Best extraction method. 
 
 This method avoids lossy pruning during the parsing process, searching
 a much larger effective parse space than practically possible in the
 full intersection scenario, and has the important benefit of allowing
 integration of a high order language model within the N-Best search process,
 rather than only in parse re-scoring.  
 
 We demonstrate the impact of this parsing approach using the SPCFG
 approach described in Zollmann, Venugopal, Vogel 06, which is similar
 to Galley et al., 04 and compare performance against full
 intersection.  
 
 This is joint work with Andreas Zollmann.
 
 About the Speaker:
 
 Ashish Venugopal is a Ph.D. candidate at the Language Technologies
 Institute at Carnegie Mellon University, and holds B.S. (SCS,
 Univ. Honors) and M.S. degrees from the same institution. He is a Siebel
 Scholar and has received the annual Graduate Student Teaching Award at
 Carnegie Mellon. His research focus is on syntax augmented machine
 translation. 
 

DTEND:20060929T163000
DTSTART:20060929T150000
LOCATION:11 Large
SUMMARY:Delayed LM Intersection and Left-to-Right N-Best Extraction for Syntax-Based MT [Ashish Venugopal (CMU)]
UID:20060929T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Since part 1 of the Perl tutorial didn't cover the juicy bits (like a  
 unique function in Perl), based on feedback from participants, I am  
 offering a part 2 "Perl - Advanced Magick" covering:
 
 o the slides from roughly page 40
     - The Schwartzian Transform
     - Dissecting a program
 o What to do, if you do need popen or backticks?
 o OO Perl - a start
 o C embedding - definitely only a "start here"
 o Useful recipes, e.g. interpolating variables in configuration  
 scripts from Perl values.
 
 If there is something you are especially interested in seeing, please  
 send me an email.

DTEND:20061103T170000
DTSTART:20061103T153000
LOCATION:11 Large
SUMMARY:perl part 2 - advanced magick [Jens-Soenke Voeckler]
UID:20061103T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Practical dialogue systems must exploit context to interpret user
 utterances correctly.  Received views of context and coordination in
 pragmatic theory equate utterance context with the occurrent
 subjective states of interlocutors using notions like common knowledge
 or mutual belief.  We argue that these views are not well suited for
 practical modeling due to the uncertainty and robustness of context
 dependence in human-human dialogue.  We present an alternative
 characterization of utterance context as objective and normative.  On
 this view, an interlocutor's representation of context reflects
 private uncertainty about the true objective context as determined by
 prior speaker meanings.  As conversation moves forward, new utterances
 provide interlocutors with retrospective insight about each other's
 prior meanings and therefore about what the true context really is.
 This view reconciles the need for uncertainty with received intuitions
 about coordination, and can directly inform computational approaches
 to dialogue.
 
 Joint work with Matthew Stone, Rutgers and Rich Thomason, Michigan
 
 About the Speaker:
 
 David DeVault is a Ph.D. candidate in the Department of Computer
 Science at Rutgers University.  He holds a B.S. in Engineering and
 Applied Science from the California Institute of Technology and an
 M.A. in Philosophy from Rutgers University.  David's research aims to
 develop techniques to allow computational agents to participate in
 flexible task-oriented conversations with human beings.  His recent
 work has drawn on design challenges encountered in building such an
 agent to try to articulate practical, learnable, and theoretically
 satisfying representations of context, utterance meaning, and speaker
 intention for implemented conversational systems.

DTEND:20061117T163000
DTSTART:20061117T150000
LOCATION:11 Large
SUMMARY:Scorekeeping in an Uncertain Language Game [David DeVault (Rutgers)]
UID:20061117T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Lexical cohesion refers to structure created in a text by use of words with
 related meanings. Apart from its importance in theoretical and applied
 linguistics, lexical cohesion detection is used in NLP tasks like topic
 segmentation, extractive summarization, spelling correction, etc.  However, the
 intuitive potential of lexical cohesion for such tasks is often not realized in
 practice, possibly due to shortcomings of detection algorithms.
 
 I will briefly describe an experiment with readers aimed at providing reliable
 data for a computational investigation of lexical cohesion. We then discuss a
 number of informative features for cohesion detection, drawing on sources like
 WordNet, distributional information, free associations, and the structure of
 information in the text itself.  Finally, I report experiments
 with supervised learning of lexical cohesion.
 
 About the speaker:
 
 Beata Beigman Klebanov is a PhD candidate at the Hebrew University of Jerusalem,
 Israel, currently a visiting scholar at Northwestern University. Beata's 
 interests are in experimental, computational and applied research in text
 pragmatics.

DTEND:20070105T163000
DTSTART:20070105T150000
LOCATION:11 Large
SUMMARY:Experimental and Computational Investigation of Lexical Cohesion in Texts [Beata Klebanov (Hebrew U)]
UID:20070105T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We discuss preliminary work on a possible approach to exploiting
 syntax in an effective way for machine translation. The driving
 guideline is to devise a machine translation system that can perform
 effectively, given a very limited quantity of parsed training data.  

DTEND:20061127T163000
DTSTART:20061127T150000
LOCATION:11 Large
SUMMARY:Towards the Effective Exploitation of Syntax in Machine Translation [Mark Hopkins (Potsdam)]
UID:20061127T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Knowledge representation is hard.  As natural language scientists and
 engineers, we'd like something that 
 
 - is expressive enough to capture how natural language works
 
 - permits tractable inference
 
 - admits learning algorithms for automatic knowledge acquisition
 
 - leads to modular system construction
 
 This talk will look at knowledge representation for capturing natural
 language transformations.  A lot of what we do falls into this
 category.  Examples of transformations include language translation
 (French to English), question answering (Question to Answer),
 transliteration (foreign script to Roman alphabet), summarization
 (long text to short text), parsing (string to tree), language
 generation (meaning to string), etc. 
 
 I'll show various knowledge formats (starting with simple finite-state
 transducers) and show how they stack up on the 4 criteria above, using
 theorems and examples.  We'll see that different types of tree and
 string automata lead to good behavior on various subsets of the 4
 criteria, but getting 4 out of 4 is still elusive. 
 
 This is a Krazy Theory talk -- since this kind of talk should not go
 on and on, I promise to finish within 50 minutes.

DTEND:20070112T153000
DTSTART:20070112T140000
LOCATION:11 Large
SUMMARY:Capturing Natural Language Transformations [Kevin Knight]
UID:20070112T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A major obstacle in syntax-based machine translation is the  
 prohibitively large search space for decoding with an integrated  
 language model. We develop faster approaches for this problem based  
 on lazy algorithms for k-best parsing. When comparing against  
 Chiang's technique of cube pruning, our method runs up to twice as  
 fast without making more search errors or decreasing translation  
 accuracy as measured by BLEU. We demonstrate the effectiveness of the  
 algorithm on a large-scale translation system.
 
 Interestingly, these techniques can be applied to speed up bilexical  
 parsing as well, where the (bi-) lexical probabilities can be viewed  
 as n-gram probabilities that cause non-monotonicity. This method  
 fits naturally into the coarse-to-fine grained multi-pass parsing  
 schemes.
 
 To push this direction even further, we can generalize cube and lazy  
 cube pruning as generic tools for reducing complicated search spaces,  
 as alternatives to the well-known A* and annealing techniques.
 
 This is joint work with David Chiang (ISI).

DTEND:20061214T150000
DTSTART:20061214T133000
LOCATION:11 Large
SUMMARY:Faster Decoding with Synchronous Grammars and n-gram Language Models [Liang Huang (Penn)]
UID:20061214T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you understood all of the world's languages, you would still not be
 able to read many of the texts that you find on the world wide web,
 because they are written in non-Roman scripts -- often ones that have
 been arbitrarily encoded for electronic transmission in the absence of
 an accepted standard.  This very modern nuisance reflects a dilemma as
 ancient as writing itself: the association between a language as it is
 spoken and its written form has a sort of internal logic to it that we
 can comprehend, but the conventions are different in every individual
 case --- even among languages that use the same script, or between
 scripts used by the same language.  This conventional association
 between language and script, called a <i>writing system</i>, is indeed
 reminiscent of the Saussurean conception of language itself, a
 conventional association of meaning and sound, upon which modern
 linguistic theory is based.  Despite linguists' reliance upon writing
 to present and preserve linguistic data, however, writing systems were
 a largely forgotten corner of linguistics until the 1960s, when Gelb
 presented their first classification.
 
 This talk will describe recent work that aims to place the study of
 writing systems upon a sound computational and statistical foundation.
 While archaeological decipherment may eternally remain the holy grail
 of this area of research, it also has applications to speech
 synthesis, machine translation, and multilingual document retrieval.

DTEND:20070126T163000
DTSTART:20070126T150000
LOCATION:11 Large
SUMMARY:The Quantitative Study of Writing Systems [Gerald Penn (Toronto)]
UID:20070126T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a survey of recent research into using information visualization
 to reveal new insights about linguistic data.  Our recent work includes
 using WordNet hyponymy as a basis for document visualization and visualizing
 the uncertainty in machine translation in an instant messaging chat
 context.  We will present our preliminary findings and prototype
 visualization for machine translation data resulting from a week of
 collaboration with ISI researchers.
 
 About the speaker:
 
 Christopher Collins is a PhD candidate in information visualization and
 computational linguistics at the University of Toronto.  He works with Prof.
 Gerald Penn and Prof. Sheelagh Carpendale (University of Calgary).
 

DTEND:20070420T163000
DTSTART:20070420T150000
LOCATION:11 Large
SUMMARY:Information Visualization to Support Computational Linguistics [Christopher Collins (Toronto)]
UID:20070420T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The medieval Voynich Manuscript has been called "the most 
 mysterious document in the world".  Its pages contain bizarre drawings 
 of strange plants and astrological diagrams, as well as an undeciphered 
 script of 20,000 running words, written in a character set that has never 
 been seen elsewhere.  Its origin is also controversial, with many theories 
 abounding.  I will describe the document, show samples, explain where it 
 may have come from, and present some properties of the text. 
 
 This will be more of a history/mystery talk than 
 a computer science talk.

DTEND:20070309T163000
DTSTART:20070309T150000
LOCATION:11 Large
SUMMARY:The Voynich Manuscript [Kevin Knight]
UID:20070309T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The talk gives an overview of Multilayered Extended Semantic Networks
 (abbreviated MultiNet), which is one of the most comprehensively
 described knowledge representation paradigms used as a semantic
 interlingua in large-scale NLP applications and for linguistic
 investigations into the semantics and pragmatics of natural
 language. As with other semantic networks, concepts are represented in
 MultiNet by nodes, and relations between concepts are represented as
 arcs between these nodes. In addition, every node is
 classified according to a predefined conceptual ontology forming a
 hierarchy of sorts, and the nodes are embedded in a multidimensional
 space of layer attributes and their values. MultiNet provides a set of
 about 150 standardized relations and functions which are described in
 a very concise way including an axiomatic apparatus, where the axioms
 are classified according to predefined types. The representational
 means of MultiNet claim to fulfill the criteria of universality,
 homogeneity, and cognitive adequacy. The talk also shows
 how MultiNet can be used for the semantic representation of different
 semantic phenomena. To overcome the quantitative barrier in building
 large knowledge bases and semantically oriented computational lexica,
 MultiNet is associated with a set of tools including a semantic
 interpreter NatLink for automatically translating natural language
 expressions into MultiNet networks, a workbench LIA for the computer
 lexicographer, and a workbench MWR for the knowledge engineer for
 managing and graphically manipulating semantic networks. The
 applications of MultiNet as a semantic interlingua range from natural
 language interfaces to the Internet and to dedicated databases, through
 question-answering systems, to systems for automatic knowledge
 acquisition.
 
 About the speaker:
 
 Prof. Helbig is head of the chair Intelligent Information and Communication 
 Systems at the University of Hagen, Germany. His main research areas are
 Knowledge Representation, Semantic Natural Language Processing, and 
 Question-Answering.
 
 A CV can be found <a href="slides/CV-En-HH.pdf"> here</a>.

DTEND:20070323T163000
DTSTART:20070323T150000
LOCATION:4 CR
SUMMARY:Multilayered Extended Semantic Networks as a Knowledge Representation Paradigm and Interlingua for Meaning Representation  [Hermann Helbig (U at Hagen, Germany)]
UID:20070323T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We have recently proposed Recognizing Textual Entailment (RTE) as a
 generic task that captures major semantic inferences across different
 natural language processing applications. The talk will first review
 the motivation and definition of the textual entailment task and the
 PASCAL RTE-1,2&3 Challenges benchmarks. Then we will demonstrate
 directions for building textual entailment systems, based on knowledge
 acquisition and inference, and for utilizing them within concrete
 applications. Furthermore, we suggest that textual entailment modeling
 may become a comprehensive framework for applied semantics
 research. Such a framework introduces useful variants of known semantic
 problems and highlights important tasks which have hardly been investigated
 so far at an applied computational level. The semantic modeling
 perspective will be illustrated in more detail by a case study for an
 entailment-based variant of word sense disambiguation. 
 
 About the speaker:
 
 Ido Dagan is a Senior Lecturer at the Department of Computer Science
 at Bar Ilan University, Israel. His areas of interest are largely
 within empirical NLP, particularly empirical approaches for applied
 semantic processing. In the last few years Ido and his colleagues
 introduced <i>textual entailment</i> as a generic framework for applied
 semantic inference and have organized the first three rounds of the
 PASCAL Recognizing Textual Entailment Challenges. Ido received his
 Ph.D. from the Technion. He has been a research fellow at the IBM
 Haifa Scientific Center and a Member of Technical Staff at AT&T Bell
 Laboratories. During 1998-2003 he was co-founder and CTO of
 FocusEngine and VP of Technology of LingoMotors.  

DTEND:20070330T163000
DTSTART:20070330T150000
LOCATION:11 Large
SUMMARY:Textual entailment as a framework for applied semantics [Ido Dagan (Bar-Ilan U)]
UID:20070330T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Consider Donald Norman's quote, "The power of the unaided mind is
 highly overrated. Without external aids, memory, thought, and
 reasoning are all constrained. But human intelligence is highly
 flexible and adaptive, superb at inventing procedures and objects that
 overcome its own limits. The real powers come from devising external
 aids that enhance cognitive abilities." (Norman, 1993) Common methods
 for externalization include making sketches on whatever happens to be
 handy -- paper napkins, program margins, etc. -- and/or finding a
 colleague or two to discuss the problem with. It would seem then, that
 visualization and collaboration are natural possibilities for creating
 positive cognitive aids. I will discuss our approach to developing
 interactive information visualizations both to support individuals and
 small groups of collaborators and briefly describe some of our recent
 results. 
 
 About the speaker:
 
 Sheelagh Carpendale holds a Canada Research Chair in Information
 Visualization at the University of Calgary. Her research focuses on
 the visualization, exploration and manipulation of information;
 visualizing such topics as ecological dynamics, uncertainty in
 information, social and communication information and investigating
 the development of information visualization environments that support
 collaboration. Dr. Carpendale's research in information visualization
 and interaction design draws on her dual background in Computer
 Science (BSc. and Ph.D. Simon Fraser University) and Visual Arts
 (Sheridan College, School of Design and Emily Carr, College of Art). 

DTEND:20070504T163000
DTSTART:20070504T150000
LOCATION:11 Large
SUMMARY:Information Visualization and Collaboration [Sheelagh Carpendale (Calgary)]
UID:20070504T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This will be an extended version of the talk for NAACL-HLT 2007. It's
 based on my summer internship work at IBM T.J. Watson Research Center
 last year.) 
 
 The project aimed to address the problems encountered when trying to
 match available employees to open job positions, based on skill
 matches. Currently, job search applications, like IBM's Professional
 Marketplace, only find exact matches. A skill affinity computation is
 desired to allow searches to be expanded to related/similar skills,
 and return more potential matches. 
 
 In this talk, I will explore the problem of computing text similarity
 between verb phrases describing skilled human behavior for the purpose
 of finding approximate matches. Four parsers (Charniak's parser,
 Stanford's parser, IBM XSG slot grammar parser, and Lin's MINIPAR) are
 evaluated on a corpus of skill statements extracted from an
 enterprise-wide expertise taxonomy. A similarity measure utilizing
 common semantic role features extracted from parse trees was found
 superior to an information-theoretic measure of similarity and
 comparable to the level of human agreement. 
 
 

DTEND:20070518T163000
DTSTART:20070518T150000
LOCATION:11 Large
SUMMARY:Computing Semantic Similarity between Skill Statements for Approximate Matching [Feng Pan]
UID:20070518T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We compare and contrast the strengths and weaknesses of a syntax-based
 machine translation model with a phrase-based machine translation
 model on several levels.  We briefly describe each model, highlighting
 points where they differ.  We include a quantitative comparison of the
 phrase pairs that each model has to work with, as well as the reasons
 why some phrase pairs are not learned by the syntax-based model.  We
 then propose improvements to the syntax-based extraction techniques to
 capture more phrases.  We also compare the translation accuracy for
 all variations. 

DTEND:20070511T163000
DTSTART:20070511T150000
LOCATION:11 Large
SUMMARY:What Can Syntax-based MT Learn from Phrase-based MT? [Steve DeNeefe]
UID:20070511T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Psychiatric document retrieval attempts to help people to efficiently
 and effectively locate the consultation documents relevant to their
 depressive problems. Individuals can understand how to alleviate their
 symptoms according to recommendations in the relevant documents. This
 work proposes the use of high-level topic information extracted from
 consultation documents to improve the precision of retrieval
 results. The topic information adopted herein includes negative life
 events, depressive symptoms and semantic relations between symptoms,
 which are beneficial for better understanding of users'
 queries. Experimental results show that the proposed approach achieves
 higher precision than the word-based retrieval models, namely the
 vector space model (VSM) and Okapi model, adopting word-level
 information alone.  
 
 About the speaker:
 
 Liang-Chih Yu (<a href="http://www.isi.edu/~liangchi">http://www.isi.edu/~liangchi</a>) is
 a visiting student at the Information Sciences Institute (ISI) of the
 University of Southern California (USC), hosted by Dr. Eduard Hovy, and
 a PhD candidate in the Department of Computer Science and Information
 Engineering, National Cheng Kung University, Tainan, Taiwan, advised by
 Dr. Chung-Hsien Wu. Research interests include natural language
 processing, text mining, information retrieval, ontology construction,
 and spoken dialogue systems.
 

DTEND:20070608T153000
DTSTART:20070608T150000
LOCATION:11 Large
SUMMARY:Topic Analysis for Psychiatric Document Retrieval (Practice Talk for ACL) [Liang-Chih Yu (Cheng Kung U)]
UID:20070608T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We show that phrase structures in Penn Treebank style parses
 are not optimal for syntax-based machine translation. We
 exploit a series of binarization methods to restructure the
 Penn Treebank style trees such that syntactified phrases
 smaller than Penn Treebank constituents can be acquired and
 exploited in translation. We find that employing the EM
 algorithm to determine the binarization of a parse tree
 among a set of alternative binarizations gives us the best
 translation result.

DTEND:20070525T153000
DTSTART:20070525T150000
LOCATION:11 Large
SUMMARY:Binarizing Syntax Trees to Improve Syntax-Based Machine Translation Accuracy [Wei Wang (Language Weaver)]
UID:20070525T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We describe existing forward and backward bisimulation minimisation
 algorithms for nondeterministic automata and extend these algorithms
 to weighted tree automata. The extended algorithms, which work for all
 semirings, retain the time complexity of their counterparts for
 unweighted tree automata on additively cancellative semirings, and
 are only slightly more expensive (linear instead of logarithmic in the
 number of states) on other semirings. We describe the effectiveness of an
 implementation of these algorithms on a typical task in natural
 language processing. 
 
 This is joint work with Johanna Högberg, Umeå University and Andreas
 Maletti, Technische Universität Dresden.

DTEND:20070608T160000
DTSTART:20070608T153000
LOCATION:11 Large
SUMMARY:Bisimulation Minimisation for Weighted Tree Automata [Jonathan May]
UID:20070608T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this paper, we analyze the effect of resampling techniques,
 including under-sampling and over-sampling used in active learning for
 word sense disambiguation (WSD). Experimental results show that
 under-sampling causes negative effects on active learning, but
 over-sampling is a relatively good choice. To alleviate the
 within-class imbalance problem of over-sampling, we propose a
 bootstrap-based over-sampling (BootOS) method that works better than
 ordinary over-sampling in active learning for WSD. Finally, we
 investigate when to stop active learning, and adopt two strategies,
 max-confidence and min-error, as stopping conditions for active
 learning. According to experimental results, we suggest a prediction
 solution by considering max-confidence as the upper bound and
 min-error as the lower bound for stopping conditions. 

DTEND:20070601T153000
DTSTART:20070601T150000
LOCATION:11 Large
SUMMARY:Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem [Jingbo Zhu]
UID:20070601T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Large corpora of parsed sentences with semantic role labels (e.g. PropBank)
 provide training data for use in the creation of high-performance automatic
 semantic role labeling systems. Despite the size of these corpora,
 individual verbs (or rolesets) often have only a handful of instances in
 these corpora, and only a fraction of English verbs have even a single
 annotation. In this paper, we describe an approach for dealing with this
 sparse data problem, enabling accurate semantic role labeling for novel
 verbs (rolesets) with only a single training example. Our approach involves
 the identification of syntactically similar verbs found in PropBank, the
 alignment of arguments in their corresponding rolesets, and the use of their
 corresponding annotations in PropBank as surrogate training data.

DTEND:20070601T160000
DTSTART:20070601T153000
LOCATION:11 Large
SUMMARY:Generalizing Semantic Role Annotations Across Syntactically Similar Verbs [Andrew S. Gordon]
UID:20070601T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this paper, we address the problem of extracting data records and
 their attributes from unstructured biomedical full text. There has
 been little effort reported on this in the research community. We
 argue that semantics is important for record extraction or
 finer-grained language processing tasks. We derive a data record
 template including semantic language models from unstructured text
 and represent them with a discourse-level Conditional Random Fields
 (CRF) model. We evaluate the approach from the perspective of
 Information Extraction and achieve significant improvements on system
 performance compared with other baseline systems. 

DTEND:20070615T113000
DTSTART:20070615T110000
LOCATION:11 Large
SUMMARY:Extracting Data Records from Unstructured Biomedical Full Text [Donghui Feng]
UID:20070615T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic word alignment is the problem of automatically annotating
 parallel text with translational correspondence. Previous generative
 word alignment models have made structural assumptions such as the
 1-to-1, 1-to-N, or phrase-based consecutive word assumptions, while
 previous discriminative models have either made one of these
 assumptions directly or used features derived from a generative model
 using one of these assumptions. We present a new generative alignment
 model which avoids these structural limitations, and show that it is
 effective when trained using both unsupervised and semi-supervised
 training methods. Experiments show strong improvements in word
 alignment accuracy, and using the generated alignments in
 hierarchical and phrasal SMT systems improves the BLEU score.

DTEND:20070615T110000
DTSTART:20070615T103000
LOCATION:11 Large
SUMMARY:Getting the structure right for word alignment: LEAF [Alex Fraser]
UID:20070615T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: There is a hierarchy of generative devices that generate trees:
 starting with regular tree languages (RTLs), which are contained
 within context-free tree languages (CFTLs), and so on.  The string
 yield of the RTLs is exactly the set of Context-Free Languages,
 while the yield of the CFTLs is exactly the set of Indexed Languages.
 In this talk we introduce Adjoining Tree Languages (ATLs) which sit
 in between RTLs and CFTLs.
 
 The yield of ATGs is exactly the set of Tree-Adjoining Languages.
 Just like RTGs are stronger than CFGs, ATGs are stronger than TAGs.
 In addition we will show that the ATG notation simplifies many of
 the foundational proofs for TAGs including proofs of the closure
 properties. In particular, ATLs do not use adjunction constraints,
 and thus are much easier to understand than TAGs.
 
 We compare ATGs with previously proposed simplifications of CFTGs,
 called monadic simple CFTGs, which also have been shown to be weakly
 equivalent to TAG (i.e. they generate the same set of string
 languages). We consider the question of whether these two weakly
 equivalent formalisms are strongly equivalent (i.e. generate exactly
 the same set of tree languages).
 
 Finally, we will show that the standard definition used for
 probabilistic TAG is (surprisingly) very different from the natural
 definition of probabilistic ATL. Using an example of PP-attachment
 ambiguity we show that the two probabilistic models are different
 from each other. 
 
 About the speaker:
 
 Anoop Sarkar is an assistant professor in the Department of Computing
 Science at Simon Fraser University. He received his PhD in 2002
 from the Department of Computer and Information Science at the
 University of Pennsylvania, with Prof. Aravind Joshi as his advisor.
 His research work is on machine learning, especially semi-supervised
 learning, applied to the processing of natural language and stochastic
 formal grammars.
 
 Anoop Sarkar's web-page: <a href=http://www.cs.sfu.ca/~anoop/> http://www.cs.sfu.ca/~anoop</a>

DTEND:20070816T163000
DTSTART:20070816T150000
LOCATION:11 Large
SUMMARY:Extensions of Regular Tree Grammars and their relation to Tree Adjoining Grammars [Anoop Sarkar (Simon Fraser)]
UID:20070816T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Noisy Language Models
 
 The language models used in statistical machine translation are often
 quite large, requiring significant memory and sometimes pre-processing
 in order to be utilized effectively. It would be desirable to have
 more compact representations of language models while minimizing the
 impact on translation quality. Various quantization methods and lossy
 storage of language models will be presented.
 
 Context for Syntax-Based Translation Rules
 
 The rules that a translation system employs should be applicable in
 many contexts.  This ensures that a rich language is expressible with
 a minimum number of rules.  However, when rules that are applicable in
 too many contexts are combined, they result in nonsensical
 translations.  How can we keep rules general but constrain the context
 of their use?  This summer we explored the approach of constraining
 the context by conditioning on various neighboring elements of each
 rule. 
 

DTEND:20070824T170000
DTSTART:20070824T153000
LOCATION:11 Large
SUMMARY:Summer Intern Presentations: Noisy Language Models AND Context for Syntax-Based Translation Rules [Wei Ho (Princeton) <br> Jennifer Gillenwater (Rice)]
UID:20070824T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Composition of Tree Transducers
 
 Since finite-state (string) transducers are not expressive enough for many NLP
 applications, computational linguistics has started to investigate tree
 transducers for tasks such as machine translation. A good deal of
 successful work has been done on generalizing results from string transducers
 to tree transducers. But when it comes to composition, the results are not
 satisfying, because tree transducers are in general not closed under
 composition. Still, we think that most of the tree transducers used in NLP are
 composable, which is why we defined the composition problem for two
 individual transducers instead of for the whole class. During the summer we
 started with linear nondeleting tree transducers with epsilon rules and
 worked toward an algorithm that decides, for two such transducers, whether their
 composition is again in the same class.
 
 Using the Perceptron Algorithm to Tune Large Numbers of Feature Weights for Syntax-Based Statistical Machine Translation
 
 Current state-of-the-art syntax-based statistical machine translation
 systems produce many candidate translations out of which the output translation
 is selected by taking the argmax over all candidates i of &lt;w,f_i&gt; where w is a 
 weight vector and f_i is a vector of the feature values for candidate i. The
 features used by the system and their corresponding weights have a major impact
 on a system's performance.  Currently, Minimum Error Rate Training (MERT) is used to
 tune the weights of the features.  A drawback of this is that it isn't tractable
 to tune large numbers of feature weights.  I will discuss using the perceptron 
 algorithm to tune feature weights for statistical machine translation.  If I get interesting
 results before my talk, I may also discuss new classes of features (potentially very large
 numbers of features) that can be used for improving MT performance.  

DTEND:20070829T163000
DTSTART:20070829T150000
LOCATION:11 Large
SUMMARY:Summer Intern Presentations: Composition of Tree Transducers AND Using the Perceptron Algorithm to Tune Large Numbers of Feature Weights for Syntax-Based Statistical Machine Translation [Carmen Heger (Dresden) <br> Michael Bloodgood (Delaware)]
UID:20070829T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:
DTEND:20050211T163000
DTSTART:20050211T150000
LOCATION:11 Large
SUMMARY:Unsupervised Word Sense Disambiguation Using Wordnet Relatives [Hae-Chang Rim]
UID:20050211T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Note that this is a MONDAY!)
 

DTEND:20050214T163000
DTSTART:20050214T150000
LOCATION:11 Large
SUMMARY:Collecting Broad-Coverage Knowledge Bases from Volunteers [Tim Chklovski]
UID:20050214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: There are many tools available to the NLP community for natural language parsing (i.e., converting a raw sentence into a parse tree). NLP researchers usually use some "off-the-shelf" parser which has been trained on the Wall Street Journal (WSJ) corpora and then apply the WSJ-trained parser to their data. This works in many cases, especially for systems which use data from WSJ or similar corpora. However, in real-life applications, the data may be compiled from many different sources and span different genres, and may not be similar to the WSJ corpora in terms of sentence structure, etc. A particular parser might parse well on some corpora and not so well on others. Choosing the right parser for your data may have an impact on the performance of the NLP system as a whole. But in order to measure the accuracy of any parser for a given corpus, we require a set of gold-standard parse trees corresponding to the sentences within the corpus. Generating a gold-standard set takes a lot of manual work, and in many real-life applications it is not feasible to generate gold-standard parses for large corpora.
 
 We attempted to build a system which can predict the accuracy (in terms of f-measure) of the Charniak parser (a popular parsing tool) on any given sentence corpus. Without using any additional information (i.e., gold-standard parses), our system predicts how accurately the Charniak parser could parse the given corpus. In order to evaluate our system's predictions on a particular corpus, we compute the correlation between the actual accuracies (using the gold standard) and the predicted accuracies (from our system) for the given corpus. We tested our system on different corpora using different methods and will present these results.

DTEND:20071005T163000
DTSTART:20071005T150000
LOCATION:11 Large
SUMMARY:Will this parser work with my data? - Predicting Parser Accuracy without Gold-Standard information [Sujith Ravi]
UID:20071005T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Randomized data structures can help us scale discrete models encountered in NLP. This talk will describe their use in language modeling and present some more general related results.
 
 N-gram language models are fundamental to speech recognition and machine translation. Unfortunately, the n-gram parameter space grows exponentially with the dimension of the feature vector. I will describe how randomization can be used to remove the space-dependency of such models on the a priori parameter space.
 
 The novel extensions of the Bloom filter that I will present are able to take advantage of the entropy of the distribution of values assigned to feature vectors to save space in a discrete statistical model. I will review some results applying these models to language modeling in machine translation and relate their space-requirements to a novel lower bound on the general problem of querying a map of key/value pairs.
 
 No prior knowledge of randomized data structures will be assumed.
 
 

DTEND:20071012T163000
DTSTART:20071012T150000
LOCATION:11 Large
SUMMARY:Scalable Language Modeling: Breaking the Curse of Dimensionality [David Talbot (Edinburgh)]
UID:20071012T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The value of mathematical formalisms for speech recognition, language generation, and machine translation has long been recognized. Less effort, though, has been spent reconciling these formalisms with linguistic theories. In this talk I'll propose a theoretical descriptive mechanism based on feature logic, which is central to construction- and constraint-based linguistic theories like construction grammar and HPSG, and which can be used to view tree transducers and tree-adjoining grammars as giving rise to a construction-based framework.

DTEND:20071102T163000
DTSTART:20071102T150000
LOCATION:11 Large
SUMMARY:Constructions, Constraints, Transducers, and TAGs: A unifying view through Feature Logic [Bill Rounds (Michigan and Stanford)]
UID:20071102T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The School of Information Technologies at the University of Sydney has
 had a 3 year partnership with the Intensive Care Unit at the Royal
 Prince Alfred Hospital, Sydney. In that time they have managed 8 joint
 projects aimed at producing software solutions that enhance
 productivity in the Unit and in some cases enable entirely new
 functionalities in their information systems. The principal motivation
 for the research is the processing of the narratives in clinical notes,
 but concomitant problems in information systems have also been tackled,
 and the combination of the two disciplines has led to the two related
 processing systems to be described in this presentation.
 
 
 - Ward Rounds Information Systems (WRIS) & Handovers -
 The WRIS is designed to support the work of all clinical staff in
 their ward rounds activities. The system, when activated,
 automatically populates from the resident clinical database a pro
 forma report with the most recent relevant data about the patient,
 such as vital signs, pathology reports, and other diagnostic
 measurements, presented as a web page. The clinical staff then write
 their progress notes into the web page which converts the text to
 SNOMED CT codes and other relevant concepts and entities. The
 clinician is given the opportunity to change any analyses done by the
 processor. This clinician-approved data is loaded into the patient
 record. The essential elements of this system, that is, computing an
 extract of the patient record, accepting narrative input, and
 analysing the text for coding, are a productivity gain in themselves but,
 more importantly, also constitute the beginning of a hospital-wide
 Handovers System for use throughout each step in the patient
 journey. This system is being tested at the RPAH ICU in readiness for
 ward usage. The impact of this system in improving the quality and
 safety of handovers has the potential to be very significant.
 
 
 - Clinical Data Analytics Language (CDAL) -
 General purpose access to data from clinical information systems,
 beyond retrieval for point of care work, is needed for many aspects of
 the hospital's work particularly for clinical research, logistics &
 operational planning, and auditing patient safety. Most current
 clinical systems only provide access to data identified in standard
 reports with no flexibility to make ad hoc enquiries or to pursue new
 directions of enquiry. The clinical data analytics language developed
 enables the expression of any question that can be answered from the
 data in the database in a restricted natural language. A prototype of
 the language has been developed for the CareVue information system
 used in the ICU at the Royal Prince Alfred Hospital. It provides for
 the use of local medical dialects, SNOMED CT terminology including all
 forms of collective expressions in SNOMED (e.g. infectious diseases),
 specification of patient groups, a variety of statistical functions,
 and constraints over any medical variable, Time, and Location. CDAL is
 general in that it can be bolted on to any clinical information system
 and is applicable to any clinical specialisation.
 
 
 

DTEND:20071017T163000
DTSTART:20071017T153000
LOCATION:11 Large
SUMMARY:Enhancement Technologies for ICU Information Systems [Jon Patrick (Univ. of Sydney)]
UID:20071017T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Treebank parsing can be seen as the search for an optimally refined
 grammar consistent with a coarse training treebank. We describe a
 method in which a minimal grammar is hierarchically refined using EM
 to give accurate, compact grammars. The resulting grammars are
 extremely compact compared to other high-performance parsers, yet the
 parser gives the best published accuracies on several languages, as
 well as the best generative parsing numbers in English. In addition,
 we give an associated coarse-to-fine inference scheme which vastly
 improves inference time with no loss in test set accuracy. 

DTEND:20071019T113000
DTSTART:20071019T103000
LOCATION:11 Large
SUMMARY:Learning and Inference for Hierarchically Split PCFGs [Slav Petrov (Berkeley)]
UID:20071019T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Naturalis, the Dutch National Museum of Natural History, harbours one of the largest treasures of the world: the key specimens of millions of animals found throughout the world through centuries of biological expeditions. While the depot where the animals are stored is a technical marvel, a Noah's ark of the 21st century, it is hard to search through it. Research in taxonomy, the evolution of life and biodiversity revolves around the specimens in the depot. The main keys to accessing the depot are the (mostly) handwritten expedition logs and registration books, which are currently being photographed and keyed in to be stored in searchable digital archives. Such digital logs already enable a kind of "Biogoogle" search, but actual research questions are more complicated ("how did this kind of frog develop over the last century in the Amazon rainforests?") and demand more intelligent handling. This is where the MITCH project comes in.
 
 The goal of MITCH is to turn the field logs and registration books into a populated semantic network, in which concepts such as animal specimens are related to all other concepts that define them: where, when, under which circumstances and by whom were they found, who described them first in the academic literature, who prepared them for storage in the Naturalis depot, which registration number was assigned to them, etc. This means that all textual descriptions of a specimen need to be parsed into exactly these concepts and their relations. All of this needs to be done at a scale that goes far beyond human capacity, as tens of thousands of digitized but unanalysed textual records are waiting for semantic analysis. This necessitates the use of state-of-the-art machine learning methods that learn from examples automatically.
 
 The project addresses its goals on three levels. The basic level is the development and application of automatic data cleaning and markup tools. On top of this, semi-structured textual material, such as fieldbook logs and scientific papers, is semi-automatically converted to a searchable knowledge base. Search results are visualised by displaying maps and specimen photos. The conversion phase assumes the active intervention of domain experts, such as collection managers, to correct and steer the automatic extraction procedure. At the top level, information resources are cross-linked using a domain ontology, populating a semantic network that can be hooked up to any other standardised cultural heritage knowledge base or to a search engine.

DTEND:20071214T163000
DTSTART:20071214T150000
LOCATION:11 Large
SUMMARY:MITCH: Mining for Information in Texts from the Cultural Heritage [Marieke van Erp]
UID:20071214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll talk about some unsupervised learning experiments -- how I was satisfied with the initial results, how I became very dissatisfied, and how I became (somewhat) satisfied again.

DTEND:20080111T163000
DTSTART:20080111T150000
LOCATION:11 Large
SUMMARY:How to Make EM Do What You Want [Kevin Knight]
UID:20080111T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatically word-aligning a parallel bitext in the source and target languages constitutes the first stage of most statistical machine translation pipelines.  Automatic word alignment is error-prone, and produces many incorrect links.  Incorrect links that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntactic machine translation.  We present an algorithm for identifying and deleting incorrect word alignment links, using features of the extracted rules.  We obtain gains in both alignment quality and translation quality in Chinese-English and Arabic-English translation experiments, relative to a GIZA++ union baseline.

DTEND:20080118T163000
DTSTART:20080118T150000
LOCATION:11 Large
SUMMARY:Using Syntax to Improve Word Alignment Precision for Syntactic Machine Translation [Victoria Fossum]
UID:20080118T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This tutorial is intended to provide attendees with working knowledge of the Arabic writing system. No previous experience with Arabic is required. At the end of this tutorial you should be able to read and segment individual Arabic characters, read common ligatures, identify possible affixes on stems, and understand the various lexical normalizations used in Arabic text preprocessing. The focus will be on the formal writing system in printed text for Modern Standard Arabic, although handwriting will be briefly discussed.

DTEND:20080325T113000
DTSTART:20080325T103000
LOCATION:11 Large
SUMMARY:Tutorial on Arabic Orthography [Jason Riesa]
UID:20080325T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a method to transliterate names in the framework of
 end-to-end statistical machine translation. The system is trained to
 learn when to transliterate.
 
 For Arabic to English MT, we developed and trained a transliterator on a
 bitext of 7 million sentences and Google's English terabyte ngrams and
 achieved better name translation accuracy than 3 out of 4 professional
 translators. The talk also includes a discussion of challenges in name
 translation evaluation.

DTEND:20080404T160000
DTSTART:20080404T150000
LOCATION:11 Large
SUMMARY:Name Translation in Statistical Machine Translation: Learning When to Transliterate [Ulf Hermjakob]
UID:20080404T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Randomized algorithms are those which use randomness to achieve efficient performance with a bounded probability of error; typically, the bound is adjustable and the performance depends on the bound. Randomized data structures, likewise, use randomness to achieve efficient storage with a bounded probability of error. I will give an overview of the use of such data structures, namely, Bloom filters and "Bloomier" filters, for storing very large n-gram language models, and will discuss possibilities for using randomized data structures for other purposes as well. 

DTEND:20080425T160000
DTSTART:20080425T150000
LOCATION:11 Large
SUMMARY:Tutorial: Randomized data structures for large statistical NLP models [David Chiang]
UID:20080425T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a method for improving word alignment for statistical syntax-based machine translation that employs a syntactically informed alignment model closer to the translation model than commonly-used word alignment models. This leads to extraction of more useful linguistic patterns and improved BLEU scores on translation experiments in Chinese and Arabic.

DTEND:20080411T160000
DTSTART:20080411T150000
LOCATION:11 Large
SUMMARY:Syntactic Re-Alignment Models for Machine Translation [Jon May]
UID:20080411T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Paraphrases are textual expressions that convey the same meaning using different words. They capture variability, which is a common phenomenon in language. Given this, paraphrases have been shown to be useful in many natural language applications like Question-Answering, Machine Translation, Summarization and Information Retrieval. In this talk, I'll discuss the phenomenon of paraphrasing and focus on methods for automatically acquiring paraphrases from text.

DTEND:20080418T160000
DTSTART:20080418T150000
LOCATION:11 Large
SUMMARY:Learning Paraphrases from Text [Rahul Bhagat]
UID:20080418T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a novel approach to weakly supervised semantic class learning from
 the web, using a single powerful hyponym pattern combined with graph
 structures, which capture two properties associated with pattern-based
 extractions: popularity and productivity. Intuitively, a candidate is popular
 if it was discovered many times by other instances in the hyponym pattern. A
 candidate is productive if it frequently leads to the discovery of other
 instances. Together, these two measures capture not only frequency of
 occurrence, but also cross-checking that the candidate occurs both near the
 class name and near other class members. We developed two algorithms that begin
 with just a class name and one seed instance and then automatically generate a
 ranked list of new class instances. We conducted experiments on four semantic
 classes and consistently achieved high accuracies.

DTEND:20080502T160000
DTSTART:20080502T150000
LOCATION:11 Large
SUMMARY:Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs [Zornitsa Kozareva]
UID:20080502T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Models that align phrases instead of words offer an
 appealing alternative to the standard relative frequency estimates of
 phrase translation probabilities.  But, while some effective word
 alignment models (Model 1, Model 2 & HMM) can be estimated tractably
 with EM, phrase alignment models cannot.  I'll talk about how to show
 that estimation and inference under these models is intractable.
 Then, I'll present two useful approximation techniques.
 
 First, I'll talk about how to cast phrase alignment search as an
 integer linear programming (ILP) problem and find the optimal
 alignment reliably and quickly with off-the-shelf ILP software.  Some
 applications of this technique include training phrase alignment
 models and interpreting the output of word alignment models.
 
 Second, we'll look at how to estimate translation probabilities under
 a phrase alignment model using a Gibbs sampling procedure.  The
 sampler has some nice asymptotic convergence properties and also seems
 to produce good results in practice. I'll walk through the different
 models we've trained and how they performed.
 
 Time permitting, I'll also talk about some of the ways in which we
 could potentially extend this work to syntactic MT.

DTEND:20080509T160000
DTSTART:20080509T150000
LOCATION:11 Large
SUMMARY:Inference in phrase alignment models [John DeNero (Berkeley)]
UID:20080509T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will briefly overview the landscape of event-oriented information
 extraction (IE) systems and explain why it is especially challenging
 to learn IE systems without annotated training data. Then I will
 describe one attempt to do so by decoupling the tasks of finding
 relevant text regions and applying extraction patterns. First, a
 self-trained relevant sentence classifier identifies relevant regions
 in documents. Second, a "semantic affinity" measure identifies
 domain-relevant extraction patterns.  We further distinguish between
 "primary" patterns and "secondary" patterns and apply the patterns
 selectively in the relevant regions.  This approach is weakly
 supervised, requiring only a few seed patterns plus relevant and
 irrelevant (but unannotated) documents for training.  The resulting IE
 system achieves reasonably good performance, despite the fact that the
 relevant region classifier leaves a lot to be desired.

DTEND:20080613T160000
DTSTART:20080613T150000
LOCATION:11 Large
SUMMARY:Effective Information Extraction with Relevant Regions and Semantic Affinity Patterns [Ellen Riloff]
UID:20080613T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk investigates the use of domain knowledge to constrain and improve the unsupervised learning of a classifier, by placing limits or biases on the possible hypotheses for each input. Theoretically, we view the contribution of the knowledge source as a reduction in the uncertainty of the model's decisions, quantified by the resulting conditional entropy of the label distribution given the input corpus. Evaluating on the simple case of an unsupervised HMM tagger, we find surprising levels of improvement from little knowledge, with more stable and efficient training convergence and label assignment, and a high degree of correlation between classification entropy and model performance. We conclude that, while we should always seek better generic models and techniques, for applications in an unsupervised setting, knowledge may still be key.

DTEND:20080606T160000
DTSTART:20080606T150000
LOCATION:11 Large
SUMMARY:Knowledge as a Constraint on Uncertainty for Unsupervised Classification [Tom Murray (USC)]
UID:20080606T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: BLEU is the de facto standard for evaluation and development of statistical machine translation systems.  We describe three real-world situations involving comparisons between different versions of the same systems where one can obtain improvements in BLEU scores that are questionable or even absurd. We propose a very conservative modification to BLEU that addresses these issues while improving correlation with human judgements, then explore some deeper modifications that alleviate the problems further.

DTEND:20080530T153000
DTSTART:20080530T150000
LOCATION:11 Large
SUMMARY:BLEU Sway Issues: one way to get statistical significance, two ways to get a better score, and three ways to thwart them [Steve DeNeefe]
UID:20080530T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Topic models, a class of Bayesian probabilistic models for discrete
 data, have recently gained popularity in applications ranging from
 document modeling to computer vision.  Since the introduction of
 Latent Dirichlet Allocation (LDA) in 2003, there have been numerous
 extensions to this archetype.  I will review the theory behind LDA,
 and discuss subsequent models, including (some of): Correlated Topic
 Model, Dynamic Topic Model, Hierarchical Topic Model, Special Words
 Topic Model, Hierarchical Dirichlet Process Model, Pachinko Allocation
 Machine, Topics and Syntax Model, Bi-LDA, Author-Topic Model,
 Supervised Topic Model, Spatial LDA, etc. 

DTEND:20080516T160000
DTSTART:20080516T150000
LOCATION:11 Large
SUMMARY:Theory and Applications of Topic Modeling [David Newman (UCI)]
UID:20080516T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Solutions for many natural language processing problems such as speech recognition, transliteration, and translation have been described as weighted finite-state transducer cascades. The transducer formalism is very useful for researchers, not only for its ability to expose the deep similarities between seemingly disparate models, but also because expressing models in this formalism allows for rapid implementation of real, data-driven systems. Finite-state toolkits can interpret and process transducer chains using generic algorithms and many real-world systems have been built using these toolkits. Current research in NLP makes use of syntax-rich models that are poorly suited to extant transducer toolkits, which process linear input and output. Tree transducers can handle these models, and a weighted tree transducer toolkit with appropriate generic algorithms will lead to the sort of gains in syntax-based modeling that were achieved with string transducer toolkits. In this thesis proposal practice talk I will briefly trace the history of finite-state transducers and automata as they relate to natural language processing and the evolution of formalisms and the toolkits that support them, leading up to motivation for the design and creation of Tiburon, the toolkit referenced in this talk's title. I will describe previous, current, and future work on Tiburon's algorithms and the effectiveness of both algorithms and software at cleanly representing syntax-based NLP models from the literature and at constructing and evaluating novel models.

DTEND:20080711T160000
DTSTART:20080711T150000
LOCATION:11 Large
SUMMARY:Thesis Proposal Practice Talk:  A Weighted Tree Transducer Toolkit for Syntactic Natural Language Processing Models [Jon May]
UID:20080711T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will be talking about deciphering letter-substitution ciphers *optimally* using only minimal knowledge (bigrams, trigrams, etc.) of the source language, instead of relying on large look-up dictionaries. We also plan to show how our empirical results compare with Shannon's predictions on the equivocation curves and unicity distance measure.

DTEND:20080718T160000
DTSTART:20080718T150000
LOCATION:11 Large
SUMMARY:Deciphering Ciphers Optimally Using Only Minimal Knowledge of the Source Language [Sujith Ravi]
UID:20080718T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Minimum Bayes risk (MBR) decoding improves the output of
 machine translation systems by selecting a translation that matches a
 large proportion of the k-best hypotheses of a system.  We extend this
 idea to apply to packed forests by selecting an output sentence that
 matches a large proportion of all hypotheses in the pruned forest of
 derivations from a syntax-based translation system.

DTEND:20080820T161500
DTSTART:20080820T154500
LOCATION:11 Small
SUMMARY:Intern Final Talk: Minimum Risk Decoding over Forests [John DeNero (Berkeley)]
UID:20080820T154500@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Probabilistic context-free grammars can describe probability distributions over strings, i.e., the sum of probabilities of all generated strings is 1. This condition is often called consistency. It has applications in fields of natural language processing such as probabilistic parsing (disambiguate by picking the parse with the highest score), or speech recognition (rank hypotheses returned by a speech recognizer).
  
 The talk is a survey of some of the previous results. We investigate how we can determine if a probabilistic context-free grammar is consistent, and if such a test can always be done. Also, we study a method, namely normalization, which guarantees consistent probabilistic context-free grammars. Moreover, we mention briefly some techniques that train probabilistic context-free grammars and guarantee consistency.

DTEND:20080822T153000
DTSTART:20080822T150000
LOCATION:11 Large
SUMMARY:Intern Final Talk:  On the Consistency of Probabilistic Context-Free Grammars [Catalin Tirnauca (Univ. Rovira i Virgili)]
UID:20080822T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: String-to-tree machine translation decoders are effective but very slow, especially compared to other decoding approaches.  We explore various methods to identify constraints on the search space, with the aim of improving the efficiency of the syntax-based decoder.

DTEND:20080822T161500
DTSTART:20080822T154500
LOCATION:11 Large
SUMMARY:Intern Final Talk: Structural constraints for efficient decoding. [Amittai Axelrod (UW)]
UID:20080822T154500@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The fundamental task in statistical machine translation (SMT) is to
 characterize the probability of a target sentence given its source
 sentence; for translating French into English, P(e | f). By applying
 Bayes' rule, we derive the fundamental theorem of SMT: e maximizing
 P(e) P(f | e). Advances in SMT come from improving estimations of
 these two terms, or from more efficient ways of searching for optimal  
 solutions (Brown et al. 1993).
 
 In the case of language modeling, Shannon (1949) and Brown et al.  
 (1992) identified upper and lower bounds for the per-character entropy  
 of English, H(e), for humans and machines, respectively. We ask the  
 same question for SMT, H(e | f), comparing the results for human  
 translators and a simple machine baseline based on IBM Model 1. These  
 numbers are the upper and lower bounds for SMT systems trained on  
 parallel data.

DTEND:20080820T153000
DTSTART:20080820T150000
LOCATION:11 Small
SUMMARY:Intern Final Talk:  The Entropy of English given French [Kyle Gorman (Penn)]
UID:20080820T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic Prediction of Parser Accuracy (practice talk for EMNLP)
 
 Statistical parsers have become increasingly accurate, to the point where they are useful in many natural language applications. However, estimating parsing accuracy on a wide variety of domains and genres is still a challenge in the absence of gold-standard parse trees.
 
 We propose a technique that automatically takes into account certain characteristics of the domains of interest, and accurately predicts parser performance on data from these new domains. As a result, we have a cheap (no annotation involved) and effective recipe for measuring the performance of a statistical parser on any given domain.
 (Joint work with Kevin Knight and Radu Soricut)
 
 
 
 Overcoming Vocabulary Sparsity in MT Using Lattices  (practice talk for AMTA)
 
 Source languages with complex word formation rules present a challenge for statistical machine translation (SMT). In this paper, we take on three facets of this challenge: (1) common stems are fragmented into many different forms in training data, (2) rare and unknown words are frequent in test data, and (3) spelling variation creates additional sparseness problems. We present a novel, lightweight technique for dealing with this fragmentation, based on bilingual data, and we also present a combination of linguistic and statistical techniques for dealing with rare and unknown words. Taking these techniques together, we demonstrate +1.3 and +1.6 BLEU increases on top of strong baselines for Arabic-English machine translation. 
 (Joint work with Ulf Hermjakob and Kevin Knight)
 

DTEND:20081010T161500
DTSTART:20081010T150000
LOCATION:11 Large
SUMMARY:Practice talks for AMTA/EMNLP [Sujith Ravi + Steve DeNeefe]
UID:20081010T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Using Bilingual Chinese-English Word Alignments to Resolve PP-Attachment Ambiguity in English (practice talk for AMTA)
 
 Errors in English parse trees impact the quality of syntax-based MT systems trained using those parses. Frequent sources of error for English parsers include PP-attachment ambiguity, NP-bracketing ambiguity, and coordination ambiguity. Not all ambiguities are preserved across languages. We examine a common type of ambiguity in English that is not preserved in Chinese: given a sequence "VP NP PP", should the PP be attached to the main verb, or to the object noun phrase? We present a discriminative method for exploiting bilingual Chinese-English word alignments to resolve this ambiguity in English. On a heldout test set of Chinese-English parallel sentences, our method achieves 86.3% accuracy on this PP-attachment disambiguation task, an improvement of 4 percentage points over the accuracy of the baseline Collins parser (82.3%).
 
 Online Large-Margin Training of Syntactic and Structural Translation Features (practice talk for EMNLP)
 
 Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to MERT. We first show that by parallel processing and exploiting more of the parse forest, we can obtain results using MIRA that match or surpass MERT in terms of both translation quality and computational cost. We then test the method on two classes of features that address deficiencies in the Hiero hierarchical phrase based model: first, we simultaneously train a large number of Marton and Resnik's soft syntactic constraints, and, second, we introduce a novel structural distortion model. In both cases we obtain significant improvements in translation performance. Optimizing them in combination, for a total of 56 feature weights, we improve performance by 2.6 Bleu on a subset of the NIST 2006 Arabic-English evaluation data.
 
 (Joint work with Yuval Marton and Philip Resnik)
 
 

DTEND:20081014T161500
DTSTART:20081014T150000
LOCATION:11 Large
SUMMARY:Practice talks for AMTA/EMNLP [Victoria Fossum + David Chiang]
UID:20081014T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Considering the adversity of the conditions under which linguistic communication takes place in everyday life---ambiguity of the signal, environmental competition for our attention, speaker error, our limited memory, and so forth---it is perhaps remarkable that we are as successful at it as we are.  Perhaps the leading explanation of this success is that (a) the linguistic signal is redundant, (b) diverse information sources are generally available that can help us infer the intended message (or something close enough) when comprehending an utterance, and (c) we use these diverse information sources very quickly and to the fullest extent possible.  This explanation can be thought of as treating language comprehension as a rational, evidential process.  Nevertheless, there are a number of prominent phenomena reported in the sentence processing literature that remain clear puzzles for the rational approach.  In this talk I address three such phenomena, whose common underlying thread is an apparent failure to use information available in a sentence appropriately in global or incremental inferences about the correct interpretation of a sentence.  I argue that the apparent puzzle posed by these phenomena for models of rational sentence comprehension may derive from the failure of existing models to appropriately account for the environmental and cognitive constraints---namely, noisy input and limited memory---under which comprehension takes place.  I present two new probabilistic models of language comprehension under noisy input and limited memory, and show that these models lead to solutions to the above puzzles.  More generally, these models suggest how appropriately accounting for environmental and cognitive constraints can lead to a more nuanced and ultimately more satisfactory picture of key aspects of human cognition.

DTEND:20090123T160000
DTSTART:20090123T150000
LOCATION:11 Large
SUMMARY:Noise and memory in rational human language comprehension [Roger Levy (UCSD)]
UID:20090123T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: What is in common, and what is different, between translating from English to Chinese and compiling C++ into machine code?
 
 In this talk I will first introduce a tree-based (aka syntax-directed) paradigm for machine translation, inspired by both human translators and compilers. In this paradigm, a source language sentence is first parsed into a syntactic tree, which is then recursively converted into a target language sentence via tree-to-string transformation rules. Since the translation process is driven by the syntax, this approach resembles the classical "syntax-directed translation" method in compiling theory.
 
 However, natural languages are crucially different from programming languages in that they are fundamentally ambiguous. So we don't (and will probably never) have perfect parsers, and parsing errors adversely affect translation quality. To alleviate this problem, an obvious idea is to use the top-k parses, rather than a single 1-best, but this only helps a little bit due to the limited scope of the k-best list. We instead propose a "forest-based approach", which translates a packed forest encoding *exponentially* many parses in a compact (polynomial) space by sharing common subtrees. Large-scale experiments showed very significant improvements (over the 1-best baseline) in terms of translation quality, which outperforms the best reported systems to date. More interestingly, translating a forest of millions of trees is even faster than translating on top-30 individual trees thanks to dynamic programming.
 
 This talk includes joint work with Kevin Knight and Aravind Joshi (first part), and with Haitao Mi and Qun Liu (second/third parts).
 
 
 Short Bio:
 
 Liang Huang recently completed his PhD study at the University of Pennsylvania, co-supervised by Aravind Joshi and Kevin Knight (USC/ISI). He is mainly interested in the theoretical aspects of computational linguistics, in particular, efficient algorithms in parsing and machine translation, generic dynamic programming, and formal properties of synchronous grammars. His thesis develops a set of "forest-based methods" that have been applied to many problems in NLP including k-best parsing, forest rescoring and reranking, and forest-based translation. His awards include an Outstanding Paper Award at ACL 2008, and a University Teaching Award at Penn in 2005.
 
 http://www.cis.upenn.edu/~lhuang3/ 

DTEND:20081217T160000
DTSTART:20081217T150000
LOCATION:4th Floor CR
SUMMARY:Tree-based and Forest-based Translation [Liang Huang (UPenn => Google Research)]
UID:20081217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In many application domains, we face the task of characterizing the distribution of continuous random variables.  For instance, in automatic speech recognition (ASR), these variables are acoustic properties of speech signals.  For such tasks, Gaussian mixture models (GMMs) are widely used as a very effective density estimator. Particularly, in the context of ASR, they are embedded in continuous-density hidden Markov models (CD-HMMs) to yield emission probabilities, i.e., the likelihoods of acoustic observations conditioned on hidden states such as phonemes. Meanwhile, the transition probabilities in CD-HMMs attempt to capture temporal properties of speech signals. Similar modeling choices arise in other applications, for instance, in activity recognition.
 
 Various techniques have been developed to estimate the parameters of CD-HMMs. In particular, discriminative techniques such as conditional maximum likelihood and minimum classification error have attracted significant research attention. When carefully and skillfully implemented, they often lead to lower error rates (in speech recognition) than traditional techniques of maximum likelihood estimation.
 
 In this talk, I will describe a new discriminative technique that is based on the principle of large margin, a key framework in many machine learning algorithms including support vector machines and boosting. The new technique differs from previous discriminative methods for ASR in the goal of margin maximization. In particular, in our large margin training of CD-HMMs, model parameters are optimized to maximize the gap (or the margin)  between correct and incorrect classifications.  I will present an extensive empirical evaluation of our approach on two benchmark problems in speech recognition: phonetic classification and recognition on the TIMIT speech database.  In both tasks, large margin systems obtain significantly better performance than systems trained by maximum likelihood estimation or competing discriminative frameworks.  An in-depth analysis also reveals some 
 interesting features of our approach, which contribute to the superior performance.
 
 Towards the end of the talk, I will discuss briefly the connection of our work to the structured prediction problems in the machine learning community. I will also discuss the future direction of this line of work and other application potentials.
 

DTEND:20080919T160000
DTSTART:20080919T150000
LOCATION:11 Large
SUMMARY:Large margin based parameter estimation for hidden Markov models [Fei Sha (USC)]
UID:20080919T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: EM (the Expectation Maximization Algorithm) is a well known technique for unsupervised learning (where one does not have any hand labeled solutions available, but instead one must learn from the raw text). Unfortunately EM is known to fail to find good solutions in many (most?) applications on which it is tried.  In this talk we present some recent work on using EM to learn how to resolve pronoun-anaphora: determining that "the dog" is the antecedent of "he" and "his" in "When Sally fed the dog he wagged his tail". For this application EM works strikingly well, determining tens of thousands of parameters and resulting in a program that probably produces state of the art results, although because this is preliminary work, and pronoun-anaphora has no standard evaluation metrics, this is just a guess.
 
 
 About the Speaker: 
 
 Eugene Charniak is Professor of Computer Science and Cognitive Science at Brown University. He received an A.B. degree in Physics from the University of Chicago and a Ph.D. from M.I.T. in Computer Science. He has published four books: Computational Semantics, with Yorick Wilks (1976); Artificial Intelligence Programming (now in a second edition) with Chris Riesbeck, Drew McDermott, and James Meehan (1980, 1987); Introduction to Artificial Intelligence with Drew McDermott (1985); and Statistical Language Learning (1993). He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. His research has always been in the area of language understanding or technologies which relate to it, such as knowledge representation, reasoning under uncertainty, and learning. Over the last few years he has been interested in statistical techniques for language understanding. His research in this area has included work in the subareas of part-of-speech tagging, probabilistic context-free grammar induction, and, more recently, syntactic disambiguation through word statistics, efficient syntactic parsing, and lexical resource acquisition through statistical means.
 

DTEND:20080926T160000
DTSTART:20080926T150000
LOCATION:11 Large
SUMMARY:EM Works for Pronoun-Anaphora Resolution [Eugene Charniak (Brown University)]
UID:20080926T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A group of 60 researchers have been asked to comment on what they perceive to be
 
 - the most important contributions in the fields of speech recognition, language modeling, and machine translation;
 
 - past ideas that failed to lead to substantial improvements;
 
 - and contributions that are most likely to have a material impact in the future.
 
 This talk summarizes the perceptions and trends identified in the collection of answers provided by the researchers. 

DTEND:20081107T160000
DTSTART:20081107T150000
LOCATION:11 Large
SUMMARY:The best/worst Speech Recognition, Language Modeling, and Machine Translation ideas [Daniel Marcu]
UID:20081107T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you ever needed to extract information, e.g. LHS, RHS words, features, etc., from XRS rules, this talk is for you. Over the years, a variety of regular expressions have been used to obtain data from XRS rules. However, in light of recent pipeline efforts, the copy-n-paste culture led to expressions that were sometimes too complex for the task at hand, unnecessarily slowing down processing steps, or too trivial to work correctly on boundary cases. A unified effort by Steve, David, Wei, Michael and Jens culminated in the NLPRules module for Perl. While the talk centers on the Perl module, and some surprising benchmark results, any language supporting libpcre (Perl Compatible Regular Expressions) will benefit from the insights, and from knowing the right regular expression for the task at hand.
 

DTEND:20081017T160000
DTSTART:20081017T150000
LOCATION:11 Large
SUMMARY:Parsing XRS with(out) regular expressions [Jens Voeckler]
UID:20081017T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This high-level survey will describe the results of statistical machine translation (SMT) research since 1948. Part of the survey will cover the explosion of work in the past few years that has resulted from intense interest on the part of scientists, funders, and industry. We will also examine the roots of SMT in World War II decipherment activities. Some of the concepts from that era have become core to the field, while others still remain to be picked up.

DTEND:20090130T160000
DTSTART:20090130T150000
LOCATION:11 Large
SUMMARY:Sixty Years of Statistical Machine Translation [Kevin Knight]
UID:20090130T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: People make explicit subjective judgments of speech when doing things like tutoring students in a foreign language, or testing a child's reading skills.  On what do we base these judgments, and how can they be made automatically?  The "quality" of speech does not exist on any one scale alone, and is not limited strictly to pronunciation - it is manifested through a multiplicity of simultaneous and interacting cues of various sizes.  In this talk I'll discuss modeling strategies for categorical pronunciation on several scales, cognitive models for estimating student knowledge demonstrated through speech, and applications in the fields of education and speech synthesis.
 

DTEND:20090213T160000
DTSTART:20090213T150000
LOCATION:11 Large
SUMMARY:Estimating Subjective Judgments of Speech on Multiple Levels [Joseph Tepperman (Signal Analysis and Interpretation Laboratory, USC)]
UID:20090213T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Advances in technologies to capture and process multimedia signals are enabling new opportunities for understanding and modeling human behavior, and designing new human-centered applications. Intelligent environments equipped with a range of audio-visual sensors provide suitable means for automatically monitoring and tracking the behavior, strategies and engagement of the participants in multiperson interactions such as meetings, at various levels of interest. We describe a case study of a "Smartroom" being developed at USC in which high-level features are calculated from active speaker segmentations, automatically annotated by our system, to infer the interaction dynamics between the participants. The results show that it is possible to accurately estimate in real-time not only the flow of the interaction, but also how dominant and engaged each participant was during the discussion.
 
 Additionally, we describe analysis of human expressive behavior that can be afforded by such audio-visual data. We describe an analysis of the interrelation between facial gestures and speech using a multimodal approach. Using a controlled setting, motion capture technology was used to simultaneously acquire speech and detailed facial information. Our results indicate that the verbal and non-verbal channels of human communication are internally and intricately connected. The interplay is observed across the different communication channels such as various aspects of speech, facial expressions, and movements of the hands, head and body, and is greatly affected by the linguistic and emotional content of the message being communicated. As a result of the analysis, applications in automatic emotion recognition and synthesis of expressive communication are presented.
 
 [This research was supported in part by funds from the NSF, NIH, and the Department of the Army]
 
 

DTEND:20090227T160000
DTSTART:20090227T150000
LOCATION:11 Large
SUMMARY:Multimodal Processing of Human Behavior in Intelligent Instrumented Spaces: A Focus on Expressive Human Communication [Carlos Busso (USC)]
UID:20090227T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: More than three decades of research has sought to uncover the
 principles that determine how hearers interpret pronouns in context.
 This work has focused predominantly on identifying so-called
 'preferences' or 'heuristics' that hearers utilize based on linguistic
 properties of antecedent expressions.  This focus is a departure from
 the type of approach outlined in Hobbs (1979), which argues that the
 mechanisms that drive pronoun interpretation are driven predominantly
 by semantics, world knowledge, and inference, with particular
 reference to how these are used to establish the coherence of
 discourses.
 
 In this talk, I report on new experimental evidence in support of a
 coherence-driven analysis, and describe how the analysis can
 accommodate a range of previous findings suggestive of conflicting
 preferences and biases.  Case studies of four commonly-cited
 preferences are described, specifically (i) the parallel grammatical
 role preference (e.g., Smyth 1994), (ii) thematic role preferences
 (e.g., Stevenson et al. 1994), (iii) implicit causality biases (e.g.,
 Caramazza et al. 1977), and (iv) the subject assignment strategy
 (e.g., Crawley et al. 1990).  In each case, the experimental results
 offer an explanation of what the underlying source of the bias is, and
 predict in what contexts evidence for it will surface.
 
 These results suggest that pronoun interpretation is incrementally
 influenced in part by the probabilistic expectations that hearers have
 about how the discourse will be coherently continued.  They are also
 argued to leave various myths by the roadside, e.g., that pronoun
 interpretation can be profitably thought of as a 'search and match'
 procedure, and that coherence relations need not be controlled for in
 experimental stimuli.
 
 This talk includes joint work with Laura Kertz, Hannah Rohde, and
 Jeffrey Elman.
 

DTEND:20090508T160000
DTSTART:20090508T150000
LOCATION:11 Large
SUMMARY:Coherence and the (Psycho-) Linguistics of Pronoun Interpretation  [Andrew Kehler (UCSD)]
UID:20090508T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Weighted tree automata are equivalent to weighted tree grammars, which can be used, for example, to easily model weighted context-free grammars. In contrast to context-free grammars, tree automata work directly on a tree representation and not on strings. We will introduce weighted tree automata and review the important results on minimization of them. For example, it is known that deterministic devices over commutative semifields (commutative semirings with multiplicative inverses) can be effectively minimized. In the main part of the talk, we present the first efficient algorithm for this minimization. If the operations can be performed in constant time, then our algorithm constructs an equivalent minimal (with respect to the number of states) deterministic automaton in time linear in the maximal rank of the input symbols, the number of (useful) transitions, and the number of states of the input automaton.
 
 

DTEND:20090306T160000
DTSTART:20090306T150000
LOCATION:11 Large
SUMMARY:Minimizing Deterministic Weighted Tree Automata [Andreas Maletti]
UID:20090306T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In previous work on "Learning by Reading" we successfully extracted entities, states and events from technical natural language descriptions of processes. The research described here is aimed at the automatic discovery of causal and temporal ordering relations among states and events, specifically, among molecular and other events in biomedical articles. We have annotated causal and temporal relations in articles on the cell cycle, and we discuss our annotation guidelines and the issue of inter-annotator agreement. We then describe the natural language parsing and the inference system we use to extract these relations. We have created axioms manually for this system, focusing on temporal, causal and aspectual information and we have used semi-automatic means to augment these axioms. We have evaluated the performance of this system, and the results are promising.

DTEND:20090319T143000
DTSTART:20090319T140000
LOCATION:4th floor CR
SUMMARY:Discovering Causal and Temporal Relations in Biomedical Texts (practice talk for AAAI Spring Symposium) [Rutu Mulkar]
UID:20090319T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Paraphrases are textual expressions that convey the same meaning using different surface forms. Capturing the variability of language, they play an important role in many natural language applications including question answering, machine translation, and multi-document summarization. In linguistics, paraphrases are characterized by approximate conceptual equivalence. Since no automated semantic interpretation systems available today can identify conceptual equivalence, paraphrases are difficult to acquire without human effort. The aim of this thesis is to develop methods for automatically acquiring and filtering phrase-level paraphrases using a monolingual corpus.
 
 Noting that the real world uses far more quasi-paraphrases than the logically equivalent ones, we first present a general typology of quasi-paraphrases together with their relative frequencies, to our knowledge the first such typology. We then present a method for automatically learning the contexts in which quasi-paraphrases obtained from a corpus are mutually replaceable. Knowing that quasi-paraphrases are often inexact because they contain semantic implications which can be directional, we present an algorithm called LEDIR to learn the directionality of quasi-paraphrases. Since semantic classes play a crucial role in our work, we also investigate the use of a semi-supervised clustering algorithm for learning semantic classes.
 
 We next investigate the task of learning surface paraphrases, i.e., paraphrases that do not require the use of any syntactic interpretation. Since one would need a very large corpus to find enough surface variations, we use a really large but unprocessed corpus of 150GB (25 billion words) obtained from Google News to do this learning. We show that these paraphrases can be used to learn surface patterns for relation extraction. Finally, we use paraphrases to learn patterns for domain-specific information extraction.
 
 Thus, in this thesis we define quasi-paraphrases, present methods to learn them from a corpus, and show that quasi-paraphrases are useful for information extraction.
 

DTEND:20090417T160000
DTSTART:20090417T150000
LOCATION:11 Large
SUMMARY:Learning Paraphrases from Text (Ph.D. Defense practice talk) [Rahul Bhagat]
UID:20090417T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 11,001 New Features for Statistical Machine Translation (David Chiang)
 - Winner of Best Paper Award at NAACL/HLT 2009
 
 We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a
 large number of new features to two machine translation systems: the
 Hiero hierarchical phrase based translation system and our
 syntax-based translation system. On a large-scale Chinese-English
 translation task, we obtain statistically significant improvements of
 +1.5 BLEU and +1.1 BLEU, respectively. We analyze the impact of the new features and the performance of the learning algorithm. 

DTEND:20090515T160000
DTSTART:20090515T150000
LOCATION:4th flr CR
SUMMARY:Practice talks for NAACL HLT [David Chiang]
UID:20090515T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Hadoop is an open-source implementation of the Map/Reduce framework introduced by Google Research. It is a simple abstraction for describing parallelizable algorithms that admits very efficient execution: in one case, one of my (poorly implemented) algorithms was improved from a typical runtime of 72 hours to 3 hours. I will give a short introduction to Hadoop that is highly colored by my experiences with it and the likely experiences of other natural language processing researchers at ISI. I will show how to run Hadoop on HPC, how to use Hadoop Streaming (which allows implementation in any language you choose), and how to define Map/Reduce algorithms for a few incarnations of a typical NLP task, relative-frequency estimation of a large probability distribution. Input from others who are more experienced with Hadoop than I am is welcome!

DTEND:20090327T160000
DTSTART:20090327T150000
LOCATION:11 Large
SUMMARY:Tutorial on Hadoop [David Chiang]
UID:20090327T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The task of coreference resolution identifies those expressions in a text that point to the same discourse entity. Natural language applications such as information extraction, question answering and machine translation can greatly benefit from its output (the different pieces of information in connection with the same entity are linked, pronouns are disambiguated, etc.). The task is extremely complex since a number of knowledge sources come into play, from morphology to discourse structure and world knowledge. In this talk I present the results of my PhD research to date, including the development of two 400k-word corpora for Spanish and Catalan (AnCora) annotated at various levels (morphology, syntax, semantics, pragmatics), a 100k-word corpus for English, and a series of experiments towards building a learning-based coreference resolution system. More specifically, I'll discuss issues concerning the definition of the annotation scheme, the selection of features for machine learning, and the effect of sample selection, and I'll introduce CISTELL, the new learning approach we propose for coreference resolution.

DTEND:20090529T160000
DTSTART:20090529T150000
LOCATION:11 Large
SUMMARY:Learning-based Coreference Resolution for Spanish and Catalan [Marta Recasens Potau (Universitat de Barcelona)]
UID:20090529T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Talk-1: Learning Phoneme Mappings for Transliteration without Parallel Data
 
 We present a method for performing machine transliteration without any parallel resources. We frame the transliteration task as a decipherment problem and show that it is possible to learn cross-language phoneme mapping tables using only monolingual resources.  We compare various methods and evaluate their accuracies on a standard name transliteration task.
 
 This is joint work with Kevin Knight.
 
 ----------------------------------------------------
 
 Talk-2: A New Objective Function for Word Alignment
 
 We develop a new objective function for word alignment that measures the size of the bilingual dictionary induced by an alignment. A word alignment that results in a small dictionary is preferred over one that results in a large dictionary.  In order to search for the alignment that minimizes this objective, we cast the problem as one of integer linear programming.  We then extend our objective function to align corpora at the sub-word level, which we demonstrate on a small Turkish-English corpus.
 
 This is joint work with Tugba Bodrumlu and Kevin Knight.
 

DTEND:20090514T160000
DTSTART:20090514T150000
LOCATION:4th flr CR
SUMMARY:Practice talks for NAACL HLT [Sujith Ravi]
UID:20090514T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recent research on statistical machine translation has led to the rapid development of syntax-based translation models, which
 exploit syntactic information to direct translation. In this talk, I
 will give an overview of tree-to-string translation models, one of the
 state-of-the-art syntax-based models. In a tree-to-string model, the source side is a phrase structure parse tree and the target side is a
 string. This talk includes the following topics: (1) tree-based tree-to-string model, (2) tree-sequence based tree-to-string model,
 (3) forest-based tree-to-string model, and (4) context-aware
 tree-to-string model. Experimental results show that the forest-based
 tree-to-string system outperforms Hiero significantly on Chinese-to-English translation.
 
 About the speaker:
 
 Yang Liu is an Assistant Researcher at the Institute of Computing
 Technology (ICT), Chinese Academy of Sciences. He received his PhD
 degree in Computer Science from ICT in 2007. His major research
 interests include statistical machine translation and Chinese
 information processing. He has been working on syntax-based modeling,
 word alignment, and system combination. His paper on tree-to-string
 translation won the Meritorious Asian NLP Paper Award of COLING/ACL
 2006. He has served as a reviewer for TALIP, TSLP, JNLE, ACL, EMNLP, AMTA, and SSST.
 

DTEND:20090715T170000
DTSTART:20090715T160000
LOCATION:11 Large
SUMMARY:An Overview of Tree-to-String Translation Models [Yang Liu (ICT China)]
UID:20090715T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Combining Constituent Parsers (Victoria Fossum: 3:00pm -- 3:30pm)
 
 Combining the 1-best output of multiple parsers via parse selection or
 parse hybridization improves f-score over the best individual parser
 (Henderson and Brill, 1999; Sagae and Lavie, 2006). We propose three ways to improve upon existing methods for parser combination.
 
 ---------------------------------------------------------
 
 Disambiguation of Preposition Sense Using Linguistically Motivated
 Features (Dirk Hovy: 3:30pm -- 4:00pm)
 
 Classifying polysemous words into their proper sense classes is
 potentially useful to any NLP application that needs to extract
 information from text or build a semantic representation of the
 textual information. Like instances of other word classes, many
 prepositions are ambiguous, carrying different semantic meanings
 (including notions of instrumental, accompaniment, location, etc.).
 In this paper, we present a supervised classification approach for
 disambiguation of preposition senses. We use the SemEval 2007
 Preposition Sense Disambiguation datasets to evaluate our system and
 compare its results to those of the systems participating in the
 workshop. We derived linguistically motivated features from both sides
 of the preposition. Instead of restricting these to a fixed window
 size, we utilized the phrase structure. Testing with five different
 classifiers, we can report an increased accuracy (76.4%) that
 outperforms the best system in the SemEval task.

DTEND:20090522T160000
DTSTART:20090522T150000
LOCATION:11th flr CR
SUMMARY:Practice talks for NAACL HLT [Victoria Fossum <br> Dirk Hovy]
UID:20090522T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Both coarse-to-fine and A* parsing use simple grammars to guide search in
 complex ones. We compare the two approaches in a common, agenda-based
 framework, demonstrating the tradeoffs and relative strengths of each
 method. Overall, coarse-to-fine is much faster for moderate levels of search
 errors, but below a certain threshold A* is superior. In addition,
 we present the first experiments on hierarchical A* parsing, in
 which computation of heuristics is itself guided by
 meta-heuristics. Multi-level hierarchies are helpful in both
 approaches, but are more effective in the coarse-to-fine case because
 of accumulated slack in A* heuristics.

DTEND:20090619T160000
DTSTART:20090619T150000
LOCATION:11 Large
SUMMARY:Hierarchical Search for Parsing (and Machine Translation) [Adam Pauls (UC Berkeley)]
UID:20090619T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree Adjoining Grammars have well-known advantages, but are typically
 considered too difficult for practical systems.  We demonstrate that,
 when done right, adjoining improves translation quality without
 becoming computationally intractable.  Using adjoining to model
 optionality allows general translation patterns to be learned without
 the clutter of endless variations of optional material, with extra
 information spliced in as needed.
 
 In this paper, we describe a novel method for learning a type of
 Synchronous Tree Adjoining Grammar and associated probabilities from
 aligned tree/string training data.  We introduce a method of
 converting these grammars to a weakly equivalent tree transducer for
 efficient decoding.  Finally, we show that adjoining results in an
 end-to-end improvement of +0.8 BLEU over a baseline statistical
 syntax-based MT model on a large-scale Arabic/English MT task.

DTEND:20090626T153000
DTSTART:20090626T150000
LOCATION:11 Large
SUMMARY:Synchronous Tree Adjoining Machine Translation (Practice talk for EMNLP) [Steve DeNeefe]
UID:20090626T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 
 K-Best A* Parsing (Adam Pauls)
 
 A* parsing makes 1-best search efficient by suppressing unlikely 1-best
 items. Existing k-best extraction methods can efficiently search for top
 derivations, but only after an exhaustive 1-best pass. We present a
 unified algorithm for k-best A* parsing which preserves the efficiency of
 k-best extraction while giving the speed-ups of A* methods. Our algorithm
 produces optimal k-best parses under the same conditions required for
 optimality in a 1-best A* parser. Empirically, optimal k-best lists can
 be extracted significantly faster than with other approaches, over a
 range of grammar types.
 
 ------------------------------------------
 
 Improved Word Alignment with Statistics and Linguistic Heuristics (Ulf Hermjakob)
 
 We present a method to align words in a bitext that combines elements
 of a traditional statistical approach with linguistic knowledge.
 We demonstrate this approach for Arabic-English, using an alignment
 lexicon produced by a statistical word aligner, as well as linguistic
 resources ranging from an English parser to heuristic alignment rules
 for function words. These linguistic heuristics have been generalized
 from a development corpus of 100 parallel sentences.
 Our aligner, UALIGN, outperforms both the commonly used GIZA++ aligner
 and the state-of-the-art LEAF aligner on F-measure and produces
 superior scores in end-to-end statistical machine translation,
 +1.3 BLEU points over GIZA++, and +0.7 over LEAF.
 

DTEND:20090724T161500
DTSTART:20090724T150000
LOCATION:11 Large
SUMMARY:Practice talks for EMNLP [Adam Pauls (UC Berkeley) <br> Ulf Hermjakob]
UID:20090724T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will be divided into two parts. In the first part I will
 talk about using Transfer Learning techniques to improve the task of
 Word Sense Disambiguation (WSD).
 Usually in supervised WSD, we suffer from a paucity of labeled data, as
 some words occur infrequently in the data and it is very difficult to
 get enough labeled data for them. In such cases it is very difficult to
 build high-accuracy supervised learning models for these words. So, we
 propose an approach called TransFeat (based on the MDL principle) which
 "transfers information" from similar words, in the form of a feature
 relevance prior, to get improved accuracies on these rare words. Besides
 this, our experiments show that we also get a decent improvement in
 accuracy for words that have more labeled data available. TransFeat
 gives accuracies that
 are in the worst case comparable to state-of-the-art on ONTONOTES and
 SENSEVAL-2 datasets.
 
 In the second part of the talk I will talk about incorporating
 non-local constraints in Named Entity Recognition (NER) systems. The
 main idea is that some linguistic constraints (e.g. every occurrence
 of the word "Einstein" in the data should have the tag PER,
 i.e. person) are very useful and can give improved performance, but
 they are non-local and hence intractable, and cannot be
 efficiently modeled using state-of-the-art sequence modeling methods
 like CRFs. People have used Skip-chain CRFs (with Loopy
 BP) (Sutton and McCallum '04) and Gibbs Sampling (Finkel and Manning
 '05) to enforce these non-local constraints, but these turn out to be
 really inefficient and custom-tailored to one particular kind of
 constraint, say, consistency constraints of the type mentioned
 above. We propose a constrained version of EM in which a general set
 of constraints (not limited to consistency constraints!) can be
 incorporated into the model. In the end I will show some results of
 this approach on CoNLL 03 English and CoNLL 02 Spanish NER shared tasks.

DTEND:20090717T160000
DTSTART:20090717T150000
LOCATION:11 Large
SUMMARY:Transfer Learning for WSD & Non-local constraints for Named Entity Recognition [Paramveer Dhillon (Penn)]
UID:20090717T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Philipp Koehn and I will do a machine translation tutorial at ACL.
 Instead of an introductory tutorial, we'll do short 15-minute segments
 on various hot topics in MT research.  For the ISI NL seminar, I'll
 present 3 or 4 of those topics, determined by audience vote.

DTEND:20090710T160000
DTSTART:20090710T150000
LOCATION:11 Large
SUMMARY:Excerpts from ACL-09 Tutorial on "Topics in Machine Translation" [Kevin Knight]
UID:20090710T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Current statistical machine translation systems usually
 extract rules from bilingual
 corpora annotated with 1-best alignments. They are prone to learn
 noisy rules due to alignment mistakes. We propose a new structure
 called weighted alignment matrix
 to encode all possible alignments for a parallel text compactly. The
 key idea is to assign a probability to each word pair to indicate how
 well they are aligned. We design new algorithms for extracting phrase
 pairs from weighted alignment matrices and estimating their
 probabilities. Our experiments on multiple language pairs show that
 using weighted matrices achieves consistent improvements over using
 n-best lists in significantly less extraction time.
 
 About the speaker:
 
 Yang Liu is an Assistant Researcher at the Institute of Computing
 Technology (ICT), Chinese Academy of Sciences. He received his PhD
 degree in Computer Science from ICT in 2007. His major research
 interests include statistical machine translation and Chinese
 information processing. He has been working on syntax-based modeling,
 word alignment, and system combination. His paper on tree-to-string
 translation won the Meritorious Asian NLP Paper Award of COLING/ACL
 2006. He has served as a reviewer for TALIP, TSLP, JNLE, ACL, EMNLP, AMTA, and SSST.
 

DTEND:20090716T113000
DTSTART:20090716T103000
LOCATION:11 Large
SUMMARY:Weighted Alignment Matrices for Statistical Machine Translation [Yang Liu (ICT China)]
UID:20090716T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Cube pruning is a fast inexact method for generating the items of a
 beam decoder.  Here we show that cube pruning is essentially
 equivalent to A* search on a specific search space with specific
 heuristics.  We use this insight to develop faster and exact variants
 of cube pruning.
 

DTEND:20090723T154500
DTSTART:20090723T150000
LOCATION:11 Large
SUMMARY:Cube Pruning as Heuristic Search (Practice talk for EMNLP) [Mark Hopkins (Language Weaver)]
UID:20090723T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TALK 1: Erica Greene
 
 Title: A Statistical Foray into Poetry
 
 Abstract: Although the analysis and generation of poetry is often considered an
 exclusively human task, we have taken some initial steps to automate
 the process.  We build a series of finite state transducers to analyze
 poetic meter and train them on a handmade corpus of poetry. We then
 use these trained transducers to generate poetry.  Specifically, we
 focus on generating sonnets and limericks.
 
 ------------------------------------------
 
 TALK 2: Paramveer Dhillon
 
 Title: Learning to simplify target language for MT + Unsupervised log-linear
 models for Word Alignment
 
 Abstract: We consider the Machine Translation task for the language pair
 Chinese-English, where English is the target language. English has many
 redundancies relative to Chinese: it has capitalization (the first word
 of each sentence is capitalized) and richer morphology (noun and verb
 endings), neither of which is present in Chinese. Because of these
 redundancies, we end up learning that a single Chinese word "tamen"
 translates to both "They" and "they", and that another Chinese word
 translates to "run", "runs" and "running". We present generative models
 which learn to "cluster" the target language vocabulary by removing
 these redundancies, namely capitalization and morphological variation.
 We show results on how this "clustering" affects the translation
 quality in end-to-end MT experiments.
 
 In the last part of the talk, I will talk about using unsupervised
 log-linear (discriminative) models for improving word alignments. There
 are very few precedents for using discriminative models for word
 alignment in totally unsupervised settings. (Taskar et al. '05) and
 (Lacoste-Julien et al. '06) used maximum weight bipartite matching in
 a "nearly" unsupervised setting and (Blunsom et al. '06) used CRFs for
 supervised word alignment. We use log-linear models in totally
 unsupervised settings to do word alignments. Specifically, we use
 Contrastive Estimation (Smith et al. '05) to shift the probability
 mass to the correct set of alignments from a well-chosen
 "neighborhood" of those alignments. In the end I will show some
 preliminary word alignment results using our approach.

DTEND:20090827T160000
DTSTART:20090827T150000
LOCATION:11 Large
SUMMARY:Intern Final Talks [Erica Greene (Haverford) <br> Paramveer Dhillon (Penn)]
UID:20090827T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree-to-String Alignment Models
 
 Machine translation systems typically rely on some form of alignment as a
 preprocessing step. Typically, these alignments take the form of
 word-to-word alignments. In this talk, we will introduce several
 models aimed at aligning foreign words to either English words or
 nodes in the English parse tree. Such word-to-node alignments offer
 several potential advantages over traditional word-to-word
 alignments. Firstly, since the extraction process for some syntactic
 systems explicitly considers the English trees, we expect that also
 considering the trees at alignment time will produce alignments that
 will better suit the extraction process. Secondly, aligning foreign
 function words to English tree nodes can admit highly desirable
 syntactic transfer rules which cannot be directly expressed as word-to-word
 alignments. Finally, word-to-node alignments can effectively model
 many-to-one alignments.  We present four models of increasing
 complexity and show preliminary results for each model.
 

DTEND:20090828T160000
DTSTART:20090828T150000
LOCATION:11 Large
SUMMARY:Intern Final Talks [Adam Pauls (UC Berkeley) <br> Michael Auli (Edinburgh)]
UID:20090828T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 
 Jointly parsing two languages has been shown to improve accuracies on
 either or both sides. However, its search space is much bigger than
 in the monolingual case, forcing existing approaches to employ
 complicated modeling and crude approximations. Here we propose a much
 simpler alternative, bilingually-constrained monolingual parsing,
 where a source-language parser learns to exploit reorderings as
 additional observations, without bothering to build the target-side
 tree as well. We show specifically how to enhance a shift-reduce
 dependency parser to use alignment features to resolve shift-reduce
 conflicts. Experiments on the bilingual portion of the Chinese Treebank
 show that, with just 3 bilingual features, we can improve parsing
 accuracies by 0.6% for both English and Chinese, with negligible (~6%)
 efficiency overhead, thus much faster than biparsing.
 
 http://www.cis.upenn.edu/~lhuang3/biparsing.pdf

DTEND:20090821T161500
DTSTART:20090821T150000
LOCATION:4th Floor Conference Room
SUMMARY:Bilingually-Constrained (Monolingual) Shift-Reduce Parsing [Liang Huang]
UID:20090821T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a general framework for automatically extracting social networks and biographical facts from conversational
 speech. Our approach relies on fusing the output produced by multiple information extraction modules, including entity
 recognition and detection, relation detection, and event detection modules. We describe the specific features and algorithmic
 refinements effective for conversational speech. These cumulatively increase the performance of social network
 extraction from 0.06 to 0.30 for the development set, and from 0.06 to 0.28 for the test set, as measured by f-measure on the
 ties within a network. The same framework can be applied to other genres of text -- we have built an automatic biography
 generation system for general domain text using the same approach.
 
 --
 
 Brief Bio:
 Nanda Kambhatla has nearly 17 years of research experience in the areas of
 Natural Language Processing (NLP), text mining, information extraction, dialog systems, and
 machine learning. He holds 6 U.S. patents and has authored over 30 publications in books,
 journals, and conferences in these areas. Nanda holds a B.Tech in Computer Science and Engineering
 from the Institute of Technology, Benaras Hindu University, India, and a Ph.D. in Computer
 Science and Engineering from the Oregon Graduate Institute of Science & Technology, Oregon, USA.
 
 Currently, Nanda is the manager of the Data Analytics Group at IBM's India Research Lab (IRL), Bangalore. The group is focused on research on machine translation, Natural Language Processing, text analysis and machine learning techniques for developing analytics
 solutions to help IBM's services divisions. Most recently, Nanda was the manager of the Statistical
 Text Analytics Group at IBM's T.J. Watson Research Center, the Watson co-chair of the Natural
 Language Processing PIC, and the task PI for the Language Exploitation Environment (LEE) subtask
 for the DARPA GALE project. He has been leading the development of information extraction
 tools/products and his team has achieved top tier results in successive Automatic Content Extraction
 (ACE) evaluations conducted by NIST for extracting entities, events and relations from text from
 multiple sources, in multiple languages and genres.
 
 Earlier in his career, Nanda worked on natural language web-based and spoken dialog systems at IBM. Before joining IBM, he worked on information retrieval and filtering algorithms as a senior research scientist at WiseWire Corporation, Pittsburgh, and on image compression algorithms as a postdoctoral fellow under Prof. Simon Haykin at McMaster University, Canada.
 
 Nanda's research interests are focused on NLP and technology solutions for creating, storing, searching, and processing large volumes of unstructured data (text, audio, video, etc.) and specifically on applications of statistical learning algorithms to these tasks.

DTEND:20091009T160000
DTSTART:20091009T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Extracting Social Networks and Biographical Facts from Conversational Speech Transcripts [Nandakishore Kambhatla (IBM India)]
UID:20091009T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This tutorial will be a short introduction to using the Linux cluster at 
 USC's High-Performance Computing (HPC) facility. Topics will include:
     (1) basics of starting jobs on the cluster using Torque/PBS,
     (2) dealing with common problems like jobs not starting or 
 spontaneously dying,
     (3) maximizing the performance of your jobs (both yours and other 
 people's), e.g., using the correct filesystem and tuning it for better speed, 
     (4) embarrassingly parallel processing and poor-man's workflows.
 
 It will NOT cover 
     Hadoop,
     MPI,
     real workflow management tools like Condor.
 
 

DTEND:20090911T160000
DTSTART:20090911T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tutorial on HPC [David Chiang]
UID:20090911T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural Language Decipherment: Solving Problems in Natural Language Processing without Labeled Data (Thesis Proposal practice talk)
 
 A wide variety of problems in NLP require parallel data to train supervised models to perform different tasks. For example, in machine translation (where the task is to translate between two languages automatically) parallel data containing source/target language sentence pairs is required to train various models which can then be used to translate new sentences or documents. The dependency on parallel data for many of these NLP tasks limits their applications to specific domains, or language pairs for which a lot of training data is readily available. On the other hand, collecting parallel data for new domains, language pairs, etc. is a costly as well as time-intensive operation. For such tasks, the development of novel unsupervised approaches which require only non-parallel data for training can enable their application to new domains and potentially broaden the impact and benefits of NLP research to wider areas.
 
 A similar problem has been tackled by cryptographers and archaeologists in a different context, namely for "decipherment" purposes. During the 1940's and 1950's, mathematicians and scientists worked on code-breaking operations, which spurred the development of many research ideas for modern computer science. For such problems, it is unrealistic to assume the availability of parallel data relating the ciphertext and plaintext, yet cryptographers and archaeologists have attempted to solve such tasks using various decipherment techniques along with other non-parallel sources of information.
 
 In this thesis proposal practice talk, I will show how we combine the two ideas (decipherment and unsupervised learning for NLP problems) together and present a unified decipherment-based approach for modeling a wide range of problems in NLP. Instead of relying on parallel data, I propose to use alternate sources of linguistic knowledge and large quantities of readily available monolingual data to induce strong bilingual connections in problems such as machine transliteration and translation. The talk will describe how various NLP problems such as unsupervised part-of-speech tagging, word alignment, transliteration, and machine translation can be formulated as decipherment tasks. I will present decipherment algorithms for tackling many of these problems and show that it is possible to achieve good results for many problems of interest in NLP without using any parallel data at all.

DTEND:20090826T160000
DTSTART:20090826T150000
LOCATION:11 Large
SUMMARY:Natural Language Decipherment: Solving Problems in Natural Language Processing without Labeled Data (Thesis Proposal practice talk) [Sujith Ravi]
UID:20090826T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree Adjoining Grammars have well-known advantages but are typically considered too difficult for practical systems.  We propose that, when done right, adjoining improves translation quality without becoming computationally intractable.  Using adjoining to model optionality allows general translation patterns to be learned without the clutter of endless variations of optional material.  The appropriate modifiers can later be spliced in as needed to translate details.
 
 In this proposal, we describe challenges encountered by phrase-based and syntax-based machine translation (MT) systems today, and present an in-depth, quantitative comparison of both models. Then, we describe a novel model for statistical MT which addresses these challenges using a Synchronous Tree Adjoining Grammar.  We introduce a method of converting these grammars to a weakly equivalent tree transducer for decoding.   And we present a method for learning the rules and associated probabilities of this grammar from aligned tree/string training data.
 
 Finally, our initial results show that adjoining already delivers an end-to-end improvement of +0.8 BLEU over a baseline statistical syntax-based MT model on a medium-scale Arabic/English MT task.  Furthermore, we demonstrate it is a competitive entry in the Urdu-English track of the 2009 NIST MT evaluation.  We then propose improvements to the model, decoding, and extraction that promise to allow this new, linguistically-motivated MT model to surpass its syntax-based and phrase-based cousins in a wide range of scenarios and language pairs.
 

DTEND:20091023T160000
DTSTART:20091023T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tree Adjoining Machine Translation (thesis proposal practice talk) [Steve DeNeefe]
UID:20091023T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automated techniques that can support the human activities of search and
 sense-making in large email collections are of increasing importance for a
 broad range of uses, including historical scholarship, law enforcement and
 intelligence applications, and lawyers involved in "e-discovery" incident
 to civil litigation.  In this talk, I'll briefly describe some of the work
 to date on searching large email collections, and then for most of the
 talk I will focus on the more challenging task of support for
 sense-making.  Specifically, I'll describe joint work with Tamer Elsayed
 to automatically resolve the identity of people who are mentioned
 ambiguously (e.g., just by first name) in a collection of email from a
 failed corporation (Enron).  Our results indicate that for people who are
 well represented in the collection we can use a generative model to guess
 the right identity about 80% of the time, and for others we are right
 about half the time.  I'll conclude the talk with a few remarks on our
 next directions for techniques, evaluation, and additional types of
 collections to which similar ideas might be applied.
 
 About the Speaker:
 
 Douglas Oard is an Associate Professor at the University of Maryland,
 College Park, with joint appointments in the College of Information
 Studies and the Institute for Advanced Computer Studies; he is on
 sabbatical at Berkeley's iSchool for the Fall 2009 semester.  Dr. Oard
 earned his Ph.D. in Electrical Engineering from the University of
 Maryland, and his research interests center around the use of emerging
 technologies to support information seeking by end users.  His recent work
 has focused on interactive techniques for cross-language information
 retrieval and techniques for search and sense-making in conversational
 media.  Additional information is available at
 http://www.glue.umd.edu/~oard/.

DTEND:20091021T160000
DTSTART:20091021T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Who 'Dat? Identity resolution in large email collections [Douglas W. Oard (Maryland)]
UID:20091021T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will present Ensemble Semantics (ES), a new general framework for information extraction developed at Yahoo!, that combines multiple sources of information and extractors. The ES framework is based on the hypothesis that although distributional and pattern-based extraction algorithms are complementary, they do not exhaust the semantic space; other sources of evidence can be leveraged to better combine them.  In this presentation, I will focus on a specific implementation of ES for the task of entity extraction. I will report experimental results showing large gains in performance, by combining state-of-the-art distributional and pattern-based systems with a large set of features from a document webcrawl, one year of query logs, and a snapshot of Wikipedia. I will also present an analysis of feature correlations and interactions showing the value of the different feature sets. I will conclude by discussing some issues that can affect the overall performance of entity extraction algorithms.

DTEND:20091120T160000
DTSTART:20091120T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Entity Extraction via Ensemble Semantics [Marco Pennacchiotti (Yahoo! Research)]
UID:20091120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Modeling query concepts through term dependencies has been shown to have a significant positive effect on retrieval performance, especially for tasks such as Web search, where relevance at high ranks is particularly critical. Most previous work, however, treats all concepts as equally important, an assumption that often does not hold, especially for longer, more complex queries. In this talk, I will describe state-of-the-art practices for modeling query term dependencies for information retrieval using Markov random fields. Within this context I will discuss why many NLP-inspired approaches to the problem, such as query segmentation, have failed to show consistent improvements when applied to information retrieval tasks. I will present results of experiments carried out on a number of TREC and Yahoo! Web search test collections, showing the effectiveness of various types of term (in)dependence models.
 
 Brief bio:
 Donald Metzler is a Research Scientist in the Search and Computational Advertising group at Yahoo! Research. He obtained his Ph.D. from the University of Massachusetts in 2007. He is an active member of the information retrieval and web search communities, having served on the program committees of SIGIR, ECIR, HLT, EMNLP, WSDM, WWW, and ICML. He has published over 35 research papers, has 13 patents pending, and is the co-author of Search Engines: Information Retrieval in Practice. His research interests include information retrieval, web search, computational advertising, and applications of machine learning to large-scale text problems.

DTEND:20091204T160000
DTSTART:20091204T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Query Concept Importance Using a Weighted Dependence Model [Donald Metzler (Yahoo! Research)]
UID:20091204T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND:20091211T160000
DTSTART:20091211T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:TBA [Anselmo Peñas (UNED, Spain)]
UID:20091211T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar/
END:VEVENT
END:VCALENDAR
