LING 1330/2330 Introduction to Computational Linguistics
Fall 2023, University of Pittsburgh
Meetings: Tue & Thu 11am - 12:15pm
Classroom: 407 CL
Description
This course aims to introduce students who have been exposed to linguistics to real-world applications of computational linguistics and natural language processing. Students will first learn the fundamentals of how computers are used to represent and process textual and spoken information. They will then be introduced to the challenges of real-world language engineering problems and learn how they are handled with the latest language technologies. The topics include: spell-checking, machine translation, part-of-speech tagging, parsing, document classification, and corpus building and exploration. Students will be given hands-on training on the basics of text processing using Python and will have a chance to work with NLTK, a popular natural language processing application suite. This course is designed specifically for students in the humanities; computer science majors (who are not linguists) are encouraged to take CS 1671 or CS 1571 instead.
Prerequisites
Intro-level linguistics and basic Python knowledge are required: LING 1000 "Introduction to Linguistics" and CS 0012 "Introduction to Computing for the Humanities" (or an equivalent class, grade B or above). Having Python programming as a prerequisite, instead of teaching it in class, frees up valuable class time to explore more computational linguistics topics and to focus more on linguistic motivations. Linguistics majors and grad students remain the target audience of this course.
Students are required to bring their own laptop to class. It should be running Windows (10 or 11), macOS, or Linux (any distribution). Mobile or cloud-based machines such as Android/Apple tablets or Chromebooks are not suitable. Additional setup and software requirements are listed on the Checklist page.
Textbooks
 Language and Computers. Markus Dickinson et al. Wiley-Blackwell. 2012. (Digital copy available through Pitt Library)
 Speech and Language Processing (2nd Edition, 3rd Edition). Jurafsky & Martin.
 Natural Language Processing with Python. (updated edition based on Python 3 and NLTK 3) Steven Bird et al. O'Reilly Media.
 Python tutorial: Python 3 Notes
Course Organization
Each meeting will have lecture and lab components. Topics presented in the textbooks will be covered in a lecture-and-discussion format. In lab, students will get hands-on training using Python and the Natural Language Toolkit (NLTK). "Learning by doing" is the core design principle of this class!
In addition to in-person class meetings, we will be using multiple digital platforms: this course home page (everything that's public: syllabus, policies, course schedule, class materials such as lecture slides, homework and exercise assignments), Canvas (private things: course announcements, assignment submission, grades), and MS Teams (chat-based communication, one-on-one video calls). They are all linked on the top right.
Assignments, Requirements, Grading and Policies
As a rule, there will always be some form of assignment between classes. There are two types: homework assignments and programming exercises, both of which are administered via Canvas and due before the beginning of the next class. The assignment schedule is posted on the Class Schedule page. Details on all requirements, grading, and other policies can be found on the Policies page.