(a.k.a. "Empirical Methods in Natural Language Processing".)
(There is a sister course,
CS 544, "Natural Language Processing", offered in the Spring semesters.
You can take these two in either order.)
Time:  Tu/Th, 11:00am–12:20pm 
Location:  GFS 223 
Type:  Lectures 
Instructors:  David Chiang (chiang@isi.edu)
Liang Huang (lhuang@isi.edu) 
Teaching Assistant:  Oana Nicolov (oana@isi.edu) 
This graduate course covers the basics of statistical methods for processing human language, intended for:
Students will experiment with existing NLP software toolkits and write their own programs. There will be six programming assignments and a research project; there will be no exams. Grades will be based on:
Students are expected to submit only their own work for homework assignments. They may discuss the assignments with one another but may not collaborate with or copy from one another.
All assignments and the project are due at the beginning of class (11am) on the due date. Late assignments will be accepted with a 20% penalty until the beginning of the following class, and with a 40% penalty up to one week after the due date. Exceptions will be made only for grave reasons.
Date  Topic  Instructor  Slides/Readings  Assignments 

Tue, Aug 25  Introduction: real-world applications of NLP  slides    
Thu, Aug 27  Basic linguistic theory. Words, parts of speech, ambiguity, morphology, phrase structure, word senses, speech. Text corpora and processing tools.  Notes: pp. 1–6; J & M: Ch. 1  HW 0: nothing to turn in.  
Unit One: Sequence models  
Tue, Sep 1  Basic automata theory. Finite-state acceptors and intersection. NFA determinization and DFA minimization. Regular expressions.  Notes: pp. 8–13; J & M: Ch. 2; slides (TuTh)    
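The acceptance check at the heart of finite-state acceptors can be sketched in a few lines. This is a minimal illustration, not code from the course; the machine below (binary strings with an even number of 1s) is a standard textbook example chosen for brevity.

```python
# A minimal sketch of a deterministic finite-state acceptor (DFA).
# The machine accepts binary strings containing an even number of 1s.
# States, alphabet, and transitions here are invented for illustration.

START = "even"
ACCEPT = {"even"}
# transition function as a dict: (state, symbol) -> state
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

def accepts(string):
    state = START
    for symbol in string:
        state = DELTA[(state, symbol)]   # deterministic: exactly one move
    return state in ACCEPT

print(accepts("1010"))  # True: two 1s
print(accepts("111"))   # False: three 1s
```

Intersection of two acceptors runs both machines in lockstep over pairs of states, which is why the deterministic representation above composes so cleanly.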
Thu, Sep 3  Finite-state transducers and composition. Applications in morphology, text-to-sound conversion, word alignment.  Notes: pp. 14–19; J & M: Ch. 3  HW 1 out  
Tue, Sep 8  Basic probability theory. Conditional probability, Bayes' rule, estimating parameter values from data, priors, building generative stochastic models.  Notes: pp. 20–27; slides (TuTh)    
Thu, Sep 10  The noisy-channel framework. Probabilistic finite-state acceptors and transducers.  Notes: pp. 28–34  HW 1 due  
Tue, Sep 15  Language modeling and using language models to resolve ambiguities across a wide range of applications.  Notes: pp. 36–53; J & M: Ch. 4; Adv. smoothing: [CG98]    
Thu, Sep 17    
Tue, Sep 22  Training and testing data. The sparse data problem. Smoothing with held-out data.    HW 2 out (LM)
Thu, Sep 24    
Tue, Sep 29  String transformations. A simple framework for stochastically modeling many types of string transformations, such as tagging word sequences with parts of speech, cleaning up misspelled word sequences, and automatically marking up names, organizations, and locations in raw text. Estimating parameter values from annotated data. Dynamic programming algorithms for sequences.  Notes: pp. 54–65; J & M: Ch. 5; [slides]; Semirings and DP: [H08 (part 1)]  HW 2 due  
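The sequence dynamic programming covered here can be sketched as a Viterbi decoder for a toy HMM part-of-speech tagger. The tag set and probability tables below are invented for illustration, not taken from the course notes.

```python
import math

# Viterbi decoding for a toy HMM tagger: find the highest-probability
# tag sequence for a word sequence. All model parameters are invented.

tags = ["D", "N", "V"]
trans = {("<s>", "D"): 0.9, ("<s>", "N"): 0.1,
         ("D", "N"): 1.0, ("N", "V"): 0.8, ("N", "N"): 0.2,
         ("V", "D"): 0.5, ("V", "N"): 0.5}
emit = {("D", "the"): 0.9, ("N", "dog"): 0.5, ("N", "barks"): 0.1,
        ("V", "barks"): 0.9}

def viterbi(words):
    # chart[i][t] = (best log-prob of tagging words[:i+1] ending in t, backpointer)
    chart = [{} for _ in words]
    for t in tags:
        p = trans.get(("<s>", t), 0) * emit.get((t, words[0]), 0)
        if p > 0:
            chart[0][t] = (math.log(p), None)
    for i in range(1, len(words)):
        for t in tags:
            e = emit.get((t, words[i]), 0)
            if e == 0:
                continue
            best = None
            for prev, (score, _) in chart[i - 1].items():
                tr = trans.get((prev, t), 0)
                if tr > 0:
                    cand = score + math.log(tr) + math.log(e)
                    if best is None or cand > best[0]:
                        best = (cand, prev)
            if best:
                chart[i][t] = best
    # follow backpointers from the best final tag
    last = max(chart[-1], key=lambda t: chart[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(chart[i][path[-1]][1])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # ['D', 'N', 'V']
```

The same chart structure, with max replaced by sum (a different semiring, per [H08]), gives the forward algorithm used later in the course.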
Thu, Oct 1  HW 3 out (transliteration). [data] 

Tue, Oct 6    
Thu, Oct 8  Problems involving incomplete data. Hidden parameters.  Notes: pp. 66–90; J & M: Ch. 6; [DLR77] [C97] [D02] [B06]  HW 3 due  
Tue, Oct 13  The Expectation-Maximization algorithm    
Thu, Oct 15  HW 4 out (EM / FB) 

Tue, Oct 20  The Forward-Backward algorithm    
Thu, Oct 22    
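The forward pass of the Forward-Backward algorithm can be sketched in a few lines: it is Viterbi with max replaced by sum, computing the total probability of the observations under all hidden state sequences. The toy HMM parameters below are invented for illustration.

```python
# Forward algorithm for a toy two-state HMM: sum over all state
# sequences instead of maximizing as in Viterbi. Parameters invented.

states = ["H", "C"]
trans = {("<s>", "H"): 0.5, ("<s>", "C"): 0.5,
         ("H", "H"): 0.7, ("H", "C"): 0.3,
         ("C", "H"): 0.4, ("C", "C"): 0.6}
emit = {("H", "3"): 0.4, ("H", "1"): 0.2,
        ("C", "3"): 0.1, ("C", "1"): 0.5}

def forward(obs):
    # alpha[t] = P(obs so far, current state = t)
    alpha = {t: trans[("<s>", t)] * emit[(t, obs[0])] for t in states}
    for o in obs[1:]:
        alpha = {t: sum(alpha[p] * trans[(p, t)] for p in states) * emit[(t, o)]
                 for t in states}
    return sum(alpha.values())  # total probability of the observations

print(round(forward(["3", "1"]), 4))  # about 0.077
```

The backward pass runs the same recurrence right to left; combining the two gives the expected counts that the EM algorithm uses for re-estimation.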
Unit Two: Tree models  
Tue, Oct 27  Limitations of finite-state machines. Natural language syntax. Context-free grammars.  J & M: Ch. 12; [slides (5 lectures)]  Project Guidelines out.  
Thu, Oct 29  A toy grammar for English; probabilistic CFGs; parsing: the CKY algorithm.  J & M: Ch. 13–14.    
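The CKY algorithm introduced here can be sketched as a recognizer for a CFG in Chomsky normal form. The toy grammar below is invented for illustration (it is not the course's toy grammar for English).

```python
from itertools import product

# CKY recognition for a CFG in Chomsky normal form.
# The toy grammar is invented for illustration.

binary = {("NP", "VP"): {"S"}, ("D", "N"): {"NP"}}   # A -> B C rules
lexical = {"the": {"D"}, "dog": {"N"}, "barks": {"VP"}}  # A -> word rules

def cky(words):
    n = len(words)
    # chart[i][j] = set of nonterminals deriving words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical.get(w, set()))
    for span in range(2, n + 1):              # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):         # try every split point
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary.get((b, c), set())
    return "S" in chart[0][n]

print(cky(["the", "dog", "barks"]))  # True
```

Replacing the sets of nonterminals with probabilities (taking max over splits) turns the recognizer into a probabilistic CKY parser; keeping all derivations gives the packed forests discussed next.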
Tue, Nov 3  CKY parsing examples; ambiguities in the chart. Chomsky normal form and binarization.  J & M: Ch. 14.  HW 4 due  
Thu, Nov 5  Parsing as deduction, intersection, and generation. Packed forests and hypergraphs. Parser evaluation metrics.  [SSP 95] [H08 (part 2)]; J & M: Ch. 14.7  HW 5 out (parsing / forest).
Tue, Nov 10  The Inside-Outside algorithm; PCFG re-estimation.  [LY 90]; J & M: Ch. 14.3  Initial project proposal due  
Thu, Nov 12  Applications in machine translation.  [slides (2 lectures)]; J & M: Ch. 25.10.    
Tue, Nov 17  HW 5 due  
Unit Three: (Structured) Discriminative training  
Thu, Nov 19  Discriminative training: (Structured) Perceptron  [slides (3 lectures)]  Final project proposal due. Data preparation and baseline finished.  
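The perceptron can be sketched in its binary form, which is the building block behind the structured perceptron covered in this unit; the structured version replaces the argmax over {+1, -1} with Viterbi decoding over tag sequences. The features and data below are invented for illustration.

```python
# A minimal binary perceptron with sparse feature vectors (dicts).
# Features and training data are invented for illustration.

def dot(w, feats):
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def train(data, epochs=10):
    w = {}
    for _ in range(epochs):
        for feats, y in data:            # y in {+1, -1}
            y_hat = 1 if dot(w, feats) > 0 else -1
            if y_hat != y:               # update only on a mistake
                for f, v in feats.items():
                    w[f] = w.get(f, 0.0) + y * v
    return w

data = [({"good": 1}, 1), ({"bad": 1}, -1), ({"good": 1, "ok": 1}, 1)]
w = train(data)
print(dot(w, {"good": 1}) > 0)  # True
```

In the structured setting the update is the same mistake-driven rule, but applied to the difference between the gold tag sequence's features and the Viterbi-predicted sequence's features.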
Tue, Nov 24  Maximum-entropy models. Conditional random fields. Stochastic gradient ascent.    HW 6 out (discriminative tagging)
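The maximum-entropy model in its binary form is logistic regression, and stochastic gradient ascent on its log-likelihood fits in a few lines. This is a toy sketch with invented features and data; CRFs apply the same gradient form over sequences, with expectations computed by forward-backward.

```python
import math

# A tiny binary maximum-entropy (logistic regression) classifier
# trained by stochastic gradient ascent on the log-likelihood.
# Features, data, and the learning rate are invented for illustration.

def dot(w, feats):
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def train(data, rate=0.5, epochs=50):
    w = {}
    for _ in range(epochs):
        for feats, y in data:            # y in {0, 1}
            p = 1.0 / (1.0 + math.exp(-dot(w, feats)))  # model probability
            for f, v in feats.items():
                # gradient of the log-likelihood: (observed - expected) * v
                w[f] = w.get(f, 0.0) + rate * (y - p) * v
    return w

data = [({"good": 1}, 1), ({"bad": 1}, 0)]
w = train(data)
print(w["good"] > 0 and w["bad"] < 0)  # True
```

The "(observed - expected)" form of the gradient is the same one that appears in CRF training, which is why this tiny example is a useful mental model for the sequence case.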
Thu, Nov 26  Thanksgiving  NO CLASS  
Tue, Dec 1  Max-margin methods      
Thu, Dec 3  Research project presentations    HW 6 due  
Wed, Dec 16        Research project due by email 
[LY 90]  The estimation of stochastic context-free grammars using the inside-outside algorithm. 
K. Lari and S. J. Young.  
Computer Speech and Language, 1990; 4: 35–56.  
[SSP 95]  Principles and Implementation of Deductive Parsing. 
Stuart Shieber, Yves Schabes, and Fernando Pereira.  
Journal of Logic Programming, 1995; 24 (1–2): 3–36.  
[H08]  Tutorial on Semirings and Advanced Dynamic Programming in NLP (slides). 
Liang Huang.  
Tutorial given at COLING 2008 and NAACL 2009.  
[KS09]  Tutorial on Writing Systems, Transliteration, and Decipherment (slides). 
Kevin Knight and Richard Sproat.  
Tutorial given at NAACL 2009.  
[CG98]  An Empirical Study of Smoothing Techniques for Language Modeling. 
Stanley F. Chen and Joshua Goodman.  
Technical Report TR-10-98, Computer Science Group, Harvard University, 1998.  
[DLR77]  Maximum Likelihood from Incomplete Data via the EM Algorithm. 
A. P. Dempster, N. M. Laird, and D. B. Rubin.  
Journal of the Royal Statistical Society, Series B, Vol. 39, No. 1, 1977.  
[C97]  The EM Algorithm 
Michael Collins  
[D02]  The Expectation Maximization Algorithm 
Frank Dellaert  
Technical Report, 2002 