Na-Rae Han's home page
Took the course in the past? Earlier course pages are available for 2022, 2021, 2020, 2019, and 2018. (E-mail Na-Rae for the password.)

LING 1330/2330 Introduction to Computational Linguistics

Fall 2023, University of Pittsburgh

Meetings: Tue & Thu 11am - 12:15pm   Classroom: 407 CL

Description

This course aims to introduce students who have been exposed to linguistics to real-world applications of computational linguistics and natural language processing. Students will first learn the fundamentals of how computers are used to represent and process textual and spoken information. They will then be introduced to the challenges of real-world language engineering problems and learn how they are handled with the latest language technologies. The topics include: spell-checking, machine translation, part-of-speech tagging, parsing, document classification, and corpus building and exploration. Students will be given hands-on training on the basics of text processing using Python and will have a chance to work with NLTK, a popular natural language processing application suite. This course is designed specifically for students in the humanities; computer science majors (who are not linguists) are encouraged to take CS 1671 or CS 1571 instead.

Prerequisites

Intro-level linguistics and basic Python knowledge are required: LING 1000 "Introduction to Linguistics" and CS 0012 "Introduction to Computing for the Humanities" (or an equivalent class, grade B or above). Having Python programming as a prerequisite, instead of teaching it in class, frees up valuable class time to explore more computational linguistics topics and to focus more on linguistic motivations. Linguistics majors and grad students remain the target audience of this course.

Students are required to bring their own laptop to class. It should be running Windows (10 or 11), macOS, or Linux (any distribution). Mobile or cloud-based devices such as Android/Apple tablets or Chromebooks are not suitable. Additional setup and software requirements are found on the Checklist page.

Instructors

Who | Pitt email | Office hours | Location
Na-Rae Han | naraehan@pitt | Mon 11:30am-1pm; Wed 3-4:30pm | In-person: G17 CL (Language Media Center); Virtual: MS Teams chat & video calls
Tianyi Zheng | tiz65@pitt | Mon 1-4pm; Wed 1-3pm | In-person: G17 CL (LMC); Virtual: MS Teams chat & video calls

(We are also available to meet by appointment.)

Textbooks

[1] Language and Computers. Markus Dickinson et al. Wiley-Blackwell. 2012. (Digital copy available through Pitt Library)
[2] Speech and Language Processing (2nd Edition and 3rd Edition). Jurafsky & Martin.
[3] Natural Language Processing with Python. (updated edition based on Python 3 and NLTK 3) Steven Bird et al. O'Reilly Media.
[4] Python tutorial: Python 3 Notes

Course Organization

Each meeting will have lecture and lab components. Topics presented in the textbooks will be covered in a lecture-and-discussion format. In lab, students will get hands-on training using Python and the Natural Language Toolkit (NLTK). "Learning by doing" is the core design principle of this class!

In addition to in-person class meetings, we will be using multiple digital platforms: this course home page (everything that's public: syllabus, policies, course schedule, class materials such as lecture slides, homework and exercise assignments), Canvas (private things: course announcements, assignment submission, grades), and MS Teams (chat-based communication, one-on-one video calls). They are all linked on the top right.

Assignments, Requirements, Grading and Policies

As a rule, there will always be some form of assignment between classes. There are two types: homework assignments and programming exercises. Both are administered via Canvas and are due before the beginning of the next class. The assignment schedule is posted on the Class Schedule page. Details on all requirements, grading, and other policies can be found on the Policies page.