(a.k.a. "Empirical Methods in Natural Language Processing".)
(There is a sister course,
CSCI 544, "Natural Language Processing", offered in the Spring semesters.
You can take these two in either order.)
Time:  Tu/Th, 11:00am-12:20pm 
Location:  KAP 145 
Type:  Lectures 
Instructors: 
Prof. Liang Huang (lhuang@isi.edu)
Prof. Kevin Knight (knight@isi.edu) 
Teaching Assistant:  Jason Riesa (riesa@isi.edu) 
This graduate course covers the basics of statistical methods for processing human language.
If you have not taken CSCI 561 and want to enroll in CSCI 562, email us if you are a PhD student or if you received an A in CSCI 570 (Algorithms).
Students will experiment with existing NLP software toolkits and write their own programs. There will be six programming assignments, four quizzes, and a research project. Grades will be based on the assignments, quizzes, and project.
Students are expected to submit only their own work for homework assignments. They may discuss the assignments with one another but may not collaborate with or copy from one another.
All assignments and the project are due at the beginning of class (11am) on the due date. Late assignments will be accepted up to one week late with a 30% penalty. Exceptions will be made only for grave reasons.
Date  Topic  Instructor  Slides/Readings  Assignments 
Tue, Aug 24  Introduction: real-world applications of NLP. Python tutorial.  Huang  intro slides; python slides  
Thu, Aug 26  Python for text processing. Basic linguistic theory.  Huang  HW0 (nothing to turn in): word.in, word.out; Solutions: perm.py, word.py, word3.py  
Unit One: Sequence Models (10 lectures)  
Tue, Aug 31  Basic automata theory. Finite-state acceptors and transducers.  Knight  Quiz 0: 20 min; Solutions  
Thu, Sep 2  HW1 out: FSTs  
Tue, Sep 7  Probability theory and estimation; WFSTs; the noisy-channel model  Knight  
Thu, Sep 9  
Tue, Sep 14  Language modeling and using language models to resolve ambiguities across a wide range of applications. Training and testing data. The sparse-data problem; smoothing with held-out data.  Knight  J&M Ch. 4  HW1 due 
Thu, Sep 16  HW2 out: LMs  
Tue, Sep 21  
Thu, Sep 23  String transformations. A simple framework for stochastically modeling many types of string transformations: tagging word sequences with parts of speech, cleaning up misspelled word sequences, automatically marking up names, organizations, and locations in raw text. Estimating parameter values from annotated data. Viterbi algorithm.  Knight  HW2 due  
Tue, Sep 28  HW3 out: Tagging or Transliteration  
Thu, Sep 30  Quiz 1: 15 min  
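The language-modeling lectures above cover smoothing for sparse data; as one hedged illustration (add-one smoothing on a made-up two-sentence corpus, not course data or the course's required method), a bigram model can be sketched as:

```python
# Bigram language model with add-one (Laplace) smoothing.
# Toy illustration only: the corpus and function names are made up.
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams, with <s> as a start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def prob(w2, w1, unigrams, bigrams, vocab_size):
    # P(w2 | w1) with add-one smoothing: unseen bigrams get nonzero mass.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size)

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi = train_bigram(corpus)
V = len(set(w for s in corpus for w in s) | {"<s>"})
# A seen bigram outscores an unseen one, but neither gets zero probability.
assert prob("cat", "the", uni, bi, V) > prob("cat", "dog", uni, bi, V) > 0
```

Without the add-one term, the unseen bigram ("dog", "cat") would get probability zero and veto any sentence containing it; smoothing redistributes a little mass to unseen events.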
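The string-transformation lectures center on the Viterbi algorithm; a minimal sketch for HMM-style tagging follows, where all tags, words, and probabilities are invented toy values, not material from the assignments:

```python
# Toy Viterbi decoder for HMM part-of-speech tagging.
# The tag set and probability tables below are made-up illustrations.

def viterbi(words, tags, start, trans, emit):
    """Return the most probable tag sequence for `words`.

    start[t]       -- P(t at position 0)
    trans[(u, t)]  -- P(t | previous tag u)
    emit[(t, w)]   -- P(w | t)
    """
    # best[i][t] = probability of the best tag path ending in t at word i
    best = [{t: start.get(t, 0.0) * emit.get((t, words[0]), 0.0) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            p, prev = max((best[i - 1][u] * trans.get((u, t), 0.0) *
                           emit.get((t, words[i]), 0.0), u) for u in tags)
            best[i][t], back[i][t] = p, prev
    # Trace back pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

tags = ["N", "V"]
start = {"N": 0.8, "V": 0.2}
trans = {("N", "N"): 0.3, ("N", "V"): 0.7, ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "fish"): 0.6, ("V", "fish"): 0.4,
        ("N", "swim"): 0.1, ("V", "swim"): 0.9}
print(viterbi(["fish", "swim"], tags, start, trans, emit))  # -> ['N', 'V']
```

The dynamic program keeps one best score per (position, tag) cell, so decoding is linear in sentence length rather than exponential in the number of tag sequences.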
Unit Two: Tree Models (7 lectures)  
Tue, Oct 5  Context-Free Grammars (CFGs)  Huang  slides  
Thu, Oct 7  Probabilistic Context-Free Grammars (PCFGs); CKY algorithm  HW3 due  
Tue, Oct 12  Binarization; Unary Rules (lecture by D. Chiang)  
Thu, Oct 14  Parsing as Deduction and Intersection; Forests and Hypergraphs; Dynamic Programming on Hypergraphs; From Viterbi to CKY; From Dijkstra to Knuth 77  HW4 out: Parsing data 
Tue, Oct 19  
Thu, Oct 21  
Tue, Oct 26  Synchronous Grammars and Machine Translation  Quiz 2: 15 min  
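The CKY lectures in this unit rest on chart parsing for grammars in Chomsky normal form; a hedged sketch of CKY recognition follows, using a tiny invented grammar rather than the course's parsing data:

```python
# CKY recognition for a context-free grammar in Chomsky normal form.
# The grammar and sentence below are toy illustrations, not course data.

def cky(words, lexical, binary):
    """Return True iff `words` is derivable from the start symbol "S".

    lexical[w]  -- set of nonterminals A with rule A -> w
    binary      -- set of rules (A, B, C) meaning A -> B C
    """
    n = len(words)
    # chart[i][j] = set of nonterminals deriving words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical.get(w, ()))
    for span in range(2, n + 1):            # widths, smallest first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for (a, b, c) in binary:
                    if b in chart[i][k] and c in chart[k][j]:
                        chart[i][j].add(a)
    return "S" in chart[0][n]

lexical = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
binary = {("S", "NP", "VP"), ("VP", "V", "NP")}
print(cky(["she", "eats", "fish"], lexical, binary))  # -> True
```

Filling cells smallest-span first is why binarization matters: with only unary lexical and binary rules, every derivation decomposes into two already-filled chart cells, giving the familiar O(n^3) recognizer.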
Unit Three: Machine Learning (8 lectures)  
Thu, Oct 28  Unsupervised Learning: The EM Algorithm  Knight  HW4 due (Sun, Oct 31); Project Guidelines out 
Tue, Nov 2  Forward-Backward Algorithm  HW5 out: EM  
Thu, Nov 4  Word Alignment  
Tue, Nov 9  Inside-Outside Algorithm  Initial Project Proposal due  
Thu, Nov 11  Discriminative Learning: The Perceptron  Huang  slides  
Tue, Nov 16  Conditional Random Fields (CRFs)  HW5 due; HW6 out: Discriminative Learning 
Thu, Nov 18  Max-margin methods  Final proposal due  
Tue, Nov 23  More on Discriminative Training  Quiz 3: 15 min  
Thu, Nov 25  Thanksgiving! No class.  
Tue, Nov 30  Project Midway Presentations  Knight/Huang  
Thu, Dec 2  Project Midway Presentations  Knight/Huang  HW6 due 
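The discriminative-learning lectures in Unit Three begin with the perceptron; a minimal sketch follows, where the feature vectors and labels are an invented linearly separable toy set, not data from HW6:

```python
# Plain binary perceptron with the classic mistake-driven update.
# The data set below is a made-up, linearly separable toy example.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(data, dims, epochs=10):
    """data: list of (feature_vector, label) pairs with label in {-1, +1}."""
    w = [0.0] * dims
    for _ in range(epochs):
        for x, y in data:
            if y * dot(w, x) <= 0:              # a mistake (or a tie):
                w = [wi + y * xi for wi, xi in zip(w, x)]  # nudge w toward y*x
    return w

# Toy set: the label is the sign of the first feature.
data = [([1.0, 0.2], +1), ([0.9, -0.3], +1),
        ([-1.0, 0.4], -1), ([-0.8, -0.1], -1)]
w = train_perceptron(data, dims=2)
assert all(y * dot(w, x) > 0 for x, y in data)  # every point classified correctly
```

Because updates happen only on mistakes, the weight vector is a sum of misclassified examples; on separable data the algorithm converges after finitely many updates, which is the starting point for the CRF and max-margin lectures that follow.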
[LY90]  The estimation of stochastic context-free grammars using the inside-outside algorithm. 
K. Lari and S. J. Young.  
Computer Speech and Language, 1990: 4:35-56.  
[SSP95]  Principles and Implementation of Deductive Parsing. 
Stuart Shieber, Yves Schabes, and Fernando Pereira.  
Journal of Logic Programming, 1995: 24(1-2): 3-36.  
[H08]  Tutorial on Semirings and Advanced Dynamic Programming in NLP (slides). 
Liang Huang  
Tutorial given at COLING 2008 and NAACL 2009.  
[KS09]  Tutorial on Writing Systems, Transliteration, and Decipherment (slides). 
Kevin Knight and Richard Sproat  
Tutorial given at NAACL 2009.  
[CG98]  An Empirical Study of Smoothing Techniques for Language Modeling. 
Stanley F. Chen and Joshua Goodman.  
Technical Report TR-10-98, Computer Science Group, Harvard University.  
[DLR77]  Maximum Likelihood from Incomplete Data via the EM Algorithm. 
A. P. Dempster, N. M. Laird, and D. B. Rubin.  
Journal of the Royal Statistical Society, Series B, Vol. 39, No. 1, 1977.  
[C97]  The EM Algorithm 
Michael Collins  
[D02]  The Expectation Maximization Algorithm 
Frank Dellaert  
Technical Report, 2002  
[B06]  The Expectation Maximization Algorithm: A Short Tutorial. 
Sean Borman.