*Class schedule is subject to revision throughout the semester.
Week Date Topic Reference Assignment (due next class)
1 8/29 (T) Intro to CL, setup, string processing
Python 3 Notes Exercise 1: Python refresher quiz
8/31 (Th) Encoding systems, Python structural programming
[Lecture2.pdf, palindrome]
L&C ch.1 Encoding language Exercise 2: Python quiz, Pig Latin
2 9/5 (T) Encoding systems & Unicode; text processing with NLTK
NLTK ch.1, ch.2, ch.3 Exercise 3: Processing O. Henry
9/7 (Th) Unicode, spell checking fundamentals: edit distance, more NLTK
L&C ch.2 Writers' aids: spell checkers HW 1: Spell checker, text processing
3 9/12 (T) N-gram context, n-gram frequency, data resources on web, list comprehension
Exercise 4: Austen vs. ENABLE
9/14 (Th) Conditional probability, n-gram frequency, conditional frequency distribution
[Lecture6.pdf, CFD + ENABLE practice shell txt/PDF]
NLTK ch.2, ch.3 HW 2: Bigram Speak
4 9/19 (T) N-gram language models, web resources
[Lecture7.pdf, process Norvig's unigram data txt/PDF]
J&M ed.3 ch.3 N-gram language models Exercise 5: Big-data n-gram stats
9/21 (Th) N-gram resources, NLTK's corpus tools, corpus linguistics
HW 3: BU/JA EFL writing (week-long)
5 9/26 (T) Type, token, TTR; Zipf's law, freq distribution, n-grams
9/28 (Th) HW3 review, classifier intro
[Lecture10.pdf, name gender classifier txt/PDF]
NLTK ch.6: 6.1.3 Learning to classify text Exercise 6: Sentiment analysis of movie reviews
6 10/3 (T) Naive Bayes classifier
L&C ch.5 Classifying documents HW 4: Who said it? (week-long)
10/5 (Th) Bayes theorem, evaluation metrics
NLTK 6.5 Naive Bayes classifiers, 6.3 Evaluation -
7 10/10 (T) Homework 4 review
10/12 (Th) Midterm exam
8 10/17 (T) Regular expressions
NLTK 3.4 Regular expressions
J&M ed.3 ch.2 Regular expressions
Exercise 7: Regex vs. Steve Jobs
10/19 (Th) RE in Python
L&C ch.4 FSA
HW 5: Python's re library
9 10/24 (T) RE wrap, FSA, Morphology, FST
J&M ed.2 (older edition!) ch.3 Words and Transducers
Hulden (2011) Morphological analysis with FSTs
Exercise 8: Foma
10/26 (Th) FST and Foma
HW 6: Morphological analysis with Foma
10 10/31 (T) FST review, POS tags
L&C ch.3 Language tutoring systems 3.4 Tokenization, POS tagging
NLTK ch.5 Categorizing and tagging words
Exercise 9: POS in Brown Corpus
11/2 (Th) POS tagging
[Lecture19.pdf, Brown & Treebank demo txt/PDF]
NLTK ch.5.5 N-Gram Tagging
J&M ed.3 ch.8 POS tagging
HW 7: N-gram tagger
11 11/7 (T) N-gram tagger review, HMM tagger, Trees, CFG
[Lecture20.pdf, notes on Trees]
L&C ch.3 Beyond words
J&M ed.3 ch.12 Constituency Grammars
NLTK 7.4.2 Trees!
Exercise 10: Trees
11/9 (Th) Parsing, CFG, Treebanks
NLTK ch.8 Analyzing sentence structure HW 8: CFG and parsing
12 11/14 (T) Probabilistic CFG, dependency grammar; Computational semantics: WordNet
[lecture22.html, Lecture22.pdf]
J&M ed.3 ch.18 Dependency parsing
J&M ed.3 ch.23 Word sense and WordNet
NLTK 2.5 WordNet
Exercise 11: WordNet, lexical resources
11/16 (Th) Computational semantics: formal, semantic roles
[Lecture23.pdf, PropBank demo]
NLTK ch.10 Analyzing the meaning of sentences
J&M ed.3 ch.24 Semantic role labeling
HW 9: PropBank, word vectors (due 11/30)
Thanksgiving break (whole week)
13 11/28 (T) Vector semantics, word embeddings
[Lecture24.pdf, Word vectors demo]
J&M ed.3 ch.6 Vector semantics and embeddings -
11/30 (Th) Deep learning language models (by Tianyi Zheng), Machine translation
[Tianyi presentation, Lecture25.pdf]
L&C ch.7 Machine translation
Eisenstein (2019) Ch.18 MT, draft copy
J&M ed.3 Appendix B The noisy channel model, PPT slides
HW 10: Reflections on NLP, AI (week-long)
14 12/5 (T) MT wrap, formal language theory
Eisenstein (2019) Ch.9 Formal language theory, draft copy -
12/7 (Th) Formal language theory
Partee et al. (1993) Ch.16 -
15 12/13 (W) 4-5:50pm Final exam
