CSCI 662 Fall 2020 course page

Jonathan May

Website https://www.isi.edu/~jonmay/cs662_fa20_web/
Lectures Online, Mondays and Wednesdays 10:00–11:50 am
Instructor & office hours Jonathan May, Online, Mondays and Wednesdays 9:00–10:00 am or by appointment
TAs & Office hours Mozhdeh Gheini, TBD, Online
  Meryem M'Hamdi, TBD, Online
Textbook Required: Natural Language Processing - Eisenstein [1]
  Required: Selected papers from the NLP literature, see (evolving) schedule
  Optional: Introduction to Deep Learning - Charniak [2]
  Optional: Speech and Language Processing 3rd edition - Jurafsky, Martin [3]
Grading 10%: In-class participation
  10%: Posted questions before each in-class selected paper presentation
  10%: In-class selected paper presentation
  30%: Three homeworks (10% each)
  40%: Project, comprising proposal (10%), final conference-quality paper (15%), and 20-minute in-class presentation (15%) (may be done in small groups)
Contact us On Piazza or in class/office hours. Please do not email (unless notified otherwise).

Topics (subject to change per instructor/class whim) (will not be presented in this order):

date | material | reading | presentation | other
8/24 | intro, applications | Eisenstein 1 (not mandatory) | |
8/26 | probability basics, ethics | Eisenstein Appendix A, Goldwater probability tutorial [4], The Social Impact of Natural Language Processing [5] | | project assignment out (due 9/9)
8/31 | corpora, text processing, linear classifiers (Naive Bayes, Logistic Regression, Perceptron) | Eisenstein 2, Nathan Schneider's unix notes [6], Unix for poets [7], sculpting text [8] | Thumbs up? Sentiment Classification using Machine Learning Techniques [9] |
9/2 | nonlinear classifiers, feed-forward neural networks, backpropagation, gradient descent | Eisenstein 3, Charniak 1 | A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning [10] | HW1 out (due 9/30)
9/7 | LABOR DAY NO CLASS | | |
9/9 | POS tags, HMMs, search | Eisenstein 7 | | project proposal due
9/14 | parsing and syntax 1: treebanks, evaluation, CKY, grammar induction, PCFGs | Eisenstein 9,2, 10 | Building a Large Annotated Corpus of English [11] |
9/16 | parsing and syntax 2: dependencies, shift-reduce | Eisenstein 11, A Fast and Accurate Dependency Parser using Neural Networks [12] | |
9/21 | evaluation, annotation, Mechanical Turk | Eisenstein 4.5 | An Empirical Investigation of Statistical Significance in NLP [13] |
9/23 | semantics: word sense, PropBank, AMR, distributional lexical | Eisenstein 13, 14 | Linguistic Regularities in Continuous Space Word Representations [14], The word analogy testing caveat [15] | HW2 out (due 10/21)
9/28 | NO CLASS | | |
9/30 | language models: ngram, feed-forward | Eisenstein 7 | | HW1 due
10/5 | language models: recurrent | | |
10/7 | Machine Translation history, evaluation, statistical | Eisenstein 18.1, 18.2 | |
10/12 | Neural Machine Translation, summarization, generation | Eisenstein 18.3, 19.1, 19.2 | |
10/14 | Transformers | Attention Is All You Need [16], Illustrated Transformer [17] | | HW3 out (due 11/11)
10/19 | Large Contextualized Language Models (ELMo, BERT, GPT(2), etc.) | Illustrated BERT, ELMo, and co. [18] | |
10/21 | Information Extraction: Entity/Relation, CRF | Eisenstein 17.1, 17.2 | 25 years of IE [19] | HW2 due
10/26 | Information Extraction: Events, Zero-shot | Eisenstein 17.3 | |
10/28 | Blade Runner NLP | | GLUE |
11/2 | Creative Generation, structure-to-text, text-to-text | Eisenstein 19.1, 19.2 | |
11/4 | Dialogue | Eisenstein 19.3 | A Diversity-Promoting Objective Function for Neural Conversation Models [20] |
11/9 | Power and Ethics | | Energy and Policy Considerations for Deep Learning in NLP [21] |
11/11 | How to write a paper | Neubig slides on Piazza | | HW3 due
11/16 | TBD (QA? Bertology? Fairness and inclusion?) | | |
11/18 | Presentations | | |
11/23 | Presentations | | |

TODO:

  1. project presentations
  2. integrate paper presentations
  3. schedule homeworks more properly

Reading list:

  1. Classification: Thumbs up? Sentiment Classification using Machine Learning Techniques https://www.aclweb.org/anthology/W02-1011/
  2. Syntax/Corpus Annotation: Building a Large Annotated Corpus of English https://www.aclweb.org/anthology/J93-2004.pdf
  3. Transition to Neural: A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning https://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf
  4. Embeddings (pro analogy test): https://www.aclweb.org/anthology/N13-1090.pdf
  5. Embeddings (con analogy test): https://www.aclweb.org/anthology/N18-2039.pdf
  6. Working with Mechanical Turk
  7. Evaluation: An Empirical Investigation of Statistical Significance in NLP http://nlp.cs.berkeley.edu/pubs/BergKirkpatrick-Burkett-Klein_2012_Significance_paper.pdf
  8. Transformer
  9. Distributional Hypothesis
  10. Bertology? https://arxiv.org/pdf/2002.12327.pdf
  11. Ethical Considerations: Energy and Policy Considerations for Deep Learning in NLP https://aclweb.org/anthology/papers/P/P19/P19-1355/
  12. What semantics do language models know
  13. Dialogue: A Diversity-Promoting Objective Function for Neural Conversation Models https://www.aclweb.org/anthology/N16-1014/



Footnotes

[1] https://mitpress.mit.edu/books/introduction-natural-language-processing or free version https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
[2] https://mitpress.mit.edu/books/introduction-deep-learning (first three chapters at https://cs.brown.edu/courses/csci1460/assets/files/deep-learning.pdf)
[3] https://web.stanford.edu/~jurafsky/slp3/
[4] http://homepages.inf.ed.ac.uk/sgwater/teaching/general/probability.pdf
[5] https://www.aclweb.org/anthology/P16-2096/
[6] https://github.com/nschneid/unix-text-commands
[7] https://www.cs.upc.edu/~padro/Unixforpoets.pdf
[8] http://matt.might.net/articles/sculpting-text/
[9] https://www.aclweb.org/anthology/W02-1011/
[10] https://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf
[11] https://www.aclweb.org/anthology/J93-2004.pdf
[12] https://www.aclweb.org/anthology/D14-1082/
[13] http://nlp.cs.berkeley.edu/pubs/BergKirkpatrick-Burkett-Klein_2012_Significance_paper.pdf
[14] https://www.aclweb.org/anthology/N13-1090.pdf
[15] https://www.aclweb.org/anthology/N18-2039.pdf
[16] https://arxiv.org/abs/1706.03762
[17] http://jalammar.github.io/illustrated-transformer/
[18] http://jalammar.github.io/illustrated-bert/
[19] on Piazza or https://www.cambridge.org/core/journals/natural-language-engineering/article/twentyfive-years-of-information-extraction/0E5BB0D6AE906BB3C25037E2D74CA8F3/share/5ce1ad8430e190e282cc234c79c320c49906a7e2
[20] https://www.aclweb.org/anthology/N16-1014/
[21] https://aclweb.org/anthology/papers/P/P19/P19-1355/


jonmay@isi.edu