(a.k.a. "Empirical Methods in Natural Language Processing".)
(There is a sister course, CSCI 544, "Natural Language Processing", offered in the Spring semesters. You can take these two in either order.)
Prof. Liang Huang (email@example.com)
Prof. Kevin Knight (firstname.lastname@example.org)
|Teaching Assistant:||Jason Riesa (email@example.com)|
This graduate course covers the basics of statistical methods for processing human language, intended for:
If you have not taken CSCI 561 and want to enroll with our CSCI 562: email us if you are a PhD student or if you received an A in CSCI 570 (Algorithms).
Students will experiment with existing NLP software toolkits and write their own programs. There will be six programming assignments, four quizzes, and a research project. Grades will be based on:
Students are expected to submit only their own work for homework assignments. They may discuss the assignments with one another but may not collaborate with or copy from one another.
All assignments and the project will be due at the beginning of class (11am) on the due date. Late assignments will be accepted with a 30% penalty up to one week late. No exceptions can be made except for a grave reason.
|Tue, 8/24||Introduction: real-world applications of NLP.
|Thu, 8/26||Python for text processing.
Basic linguistic theory.
|Huang||HW0: nothing to turn in.
Solutions: perm.py word.py word3.py
|Unit One: Sequence Models (10 lectures)|
|Tue, Aug 31||Basic automata theory. Finite-state acceptors and transducers.||Knight||Quiz 0: 20 mins Solutions|
|Thu, Sep 2||HW1 out: FSTs|
|Tue, Sep 7||Probability theory and estimations; WFSTs; Noisy-channel||Knight|
|Thu, Sep 9|
|Tue, Sep 14||Language modeling and using language models to resolve ambiguities across a wide range of applications.
Training and testing data.
The sparse data problem; smoothing with held-out data.
|Knight||J&M Ch. 4||HW1 due|
|Thu, Sep 16||HW2 out: LMs|
|Tue, Sep 21|
|Thu, Sep 23||String transformations. A simple framework for stochastically modeling many types of string transformations, such as: tagging word sequences with parts of speech, cleaning up misspelled word sequences, automatically marking-up names, organizations, and locations in raw text, etc. Estimating parameter values from annotated data. Viterbi algorithm.||Knight||HW2 due|
|Tue, Sep 28||HW3 out: Tagging or Transliteration|
|Thu, Sep 30||Quiz 1: 15 min|
|Unit Two: Tree Models (7 lectures)|
|Tue, Oct 5||Context-Free Grammars (CFGs)||Huang||slides|
|Thu, Oct 7||Probabilistic Context-Free Grammars (PCFGs)
|Tue, Oct 12||Binarization; Unary Rules (lecture by D. Chiang)|
|Thu, Oct 14||Parsing as Deduction and Intersection
Forest and Hypergraphs
Dynamic Programming on Hypergraphs
From Viterbi to CKY
From Dijkstra to Knuth 77
|HW4 out: Parsing
|Tue, Oct 19|
|Thu, Oct 21|
|Tue, Oct 26||Synchronous Grammars and Machine Translation||Quiz 2: 15 min|
|Unit Three: Machine Learning (8 lectures)|
|Thu, Oct 28||Unsupervised Learning: The EM Algorithm||Knight||HW4 due (Sun Oct 31)
Project Guidelines out
|Tue, Nov 2||Forward-Backward Algorithm||HW5 out: EM|
|Thu, Nov 4||Word-Alignment|
|Tue, Nov 9||Inside-Outside Algorithm||Initial Project Proposal due|
|Thu, Nov 11||Discriminative Learning: The Perceptron||Huang||slides|
|Tue, Nov 16||Conditional Random Fields (CRFs)||HW5 due
HW6 out: Discriminative Learning
|Thu, Nov 18||Max-margin methods||Final proposal due|
|Tue, Nov 23||More on Discriminative Training||Quiz 3: 15 min|
|Thu, Nov 25||Thanksgiving! No class.|
|Tue, Nov 30||Project Midway Presentations||Knight/Huang|
|Thu, Dec 2||Project Midway Presentations||Knight/Huang||HW6 due|
|[LY 90]||The estimation of stochastic context-free grammars using the inside-outside algorithm.|
|K. Lari and S. J. Young.|
|Computer Speech and Language, 1990: 4:35-56.|
|[SSP 95]||Principles and Implementation of Deductive Parsing.|
|Stuart Shieber, Yves Schabes, and Fernando Pereira.|
|Journal of Logic Programming, 1995: 24 (1-2): 3-36.|
|[H08]||Tutorial on Semirings and Advanced Dynamic Programming in NLP (slides).|
|Tutorial given at COLING 2008 and NAACL 2009.|
|[KS09]||Tutorial on Writing Systems, Transliteration, and Decipherment (slides).|
|Kevin Knight and Richard Sproat|
|Tutorial given at NAACL 2009.|
|[CG98]||An Empirical Study of Smoothing Techniques for Language Modeling|
|Stanley F. Chen and Joshua Goodman|
|Technical Report TR 10-98, Computer Science Group, Harvard University.||[DLR77]||Maximum Likelihood from Incomplete Data via the EM Algorithm|
|A.P. Dempster, N.M. Laird, and D.B. Rubin|
|Journal of the Royal Statistical Society, Vol. 39, No. 1, 1977|
|[C97]||The EM Algorithm|
|[D02]||The Expectation Maximization Algorithm|
|Technical Report, 2002|
|[B06]||The Expectation Maximization Algorithm -- A short Tutorial.|
Last year's version is cached here.