Course Description and Learning Objectives

This course is a gentle introduction to Data Science. The following topics will be covered:

  1. Introduction to Data Science
  2. Introduction to R and RStudio
  3. Data Visualization
  4. Data Wrangling
  5. Supervised Learning
  6. Model Evaluation
  7. Unsupervised Learning
  8. Professional Reporting and Reproducible Analysis
  9. Basic Programming in R
  10. Professional Ethics

Course Logistics

Prerequisites

Statistics knowledge at the level of STAT 1000 or above is expected. No prior programming knowledge is required.

Required Textbook

Baumer et al., Modern Data Science with R. CRC Press.

Course Management System: Canvas

  • Lecture slides
  • Reading material
  • Homework assignments, labs, and quizzes
  • Data sets

Grading Components

1. Homework Assignments and Lab Reports (50%)

  • Homework and/or lab activities will be assigned weekly.
  • Homework questions require coding in R.

2. Quizzes (20%)

  • Four quizzes throughout the semester.
  • Check the class schedule for dates and topics.

3. Projects (30%)

  • Project I: 10%
  • Project II: 20%

4. Attendance Bonus (2%)

  • Attendance is encouraged. Students who attend at least 30 (out of 35) classes will receive a 2% bonus.
  • Attendance will be recorded through TopHat starting from Monday, September 13, 2021.

Course Grades:

Grade  Percentage
A+     [97%, 100%]
A      [93%, 97%)
A-     [90%, 93%)
B+     [87%, 90%)
B      [83%, 87%)
B-     [80%, 83%)
C+     [77%, 80%)
C      [73%, 77%)
C-     [70%, 73%)
D+     [67%, 70%)
D      [63%, 67%)
D-     [60%, 63%)
F      [0%, 60%)
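
As a rough illustration of how the grading components and the grade table above might combine, here is a minimal Python sketch. It rests on several assumptions that the syllabus does not state: each component score is already averaged into a single percentage, the attendance bonus is added as 2 percentage points to the weighted total, and the result is capped at 100%. The names `final_percentage` and `letter_grade` are hypothetical helpers, not part of any course tool.

```python
from bisect import bisect_right

# Component weights (percent of the final grade) from Grading Components.
WEIGHTS = {"homework_labs": 50, "quizzes": 20, "project_1": 10, "project_2": 20}

# Lower bounds of the letter-grade intervals from the Course Grades table.
CUTOFFS = [60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93, 97]
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def final_percentage(scores, classes_attended):
    """Weighted total of component percentages, plus the attendance bonus.

    `scores` maps each key of WEIGHTS to a percentage in [0, 100].
    """
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    if classes_attended >= 30:
        total += 2  # assumption: the bonus is 2 percentage points
    return min(total, 100.0)  # assumption: the total is capped at 100%


def letter_grade(percentage):
    """Map a percentage to a letter; each interval includes its lower bound."""
    return LETTERS[bisect_right(CUTOFFS, percentage)]


scores = {"homework_labs": 90, "quizzes": 80, "project_1": 85, "project_2": 95}
print(letter_grade(final_percentage(scores, classes_attended=25)))  # B+ (88.5%)
print(letter_grade(final_percentage(scores, classes_attended=32)))  # A- (90.5%)
```

In this example the attendance bonus moves the total from 88.5% to 90.5%, which crosses the 90% boundary between B+ and A-.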

University Policies:

Academic Integrity

Students in this course will be expected to comply with the University of Pittsburgh’s Policy on Academic Integrity. Any student suspected of violating this obligation for any reason during the semester will be required to participate in the procedural process, initiated at the instructor level, as outlined in the University Guidelines on Academic Integrity. This may include, but is not limited to, the confiscation of the examination of any individual suspected of violating University Policy. Furthermore, no student may bring any unauthorized materials to an exam, including dictionaries and programmable calculators.

To learn more about Academic Integrity, visit the Academic Integrity Guide for an overview of the topic. For hands-on practice, complete the Understanding and Avoiding Plagiarism tutorial.

Statement on Classroom Recording

To ensure the free and open discussion of ideas, students may not record classroom lectures, discussion and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student’s own private use.

Diversity and Inclusion

The University of Pittsburgh does not tolerate any form of discrimination, harassment, or retaliation based on disability, race, color, religion, national origin, ancestry, genetic information, marital status, familial status, sex, age, sexual orientation, veteran status or gender identity or other factors as stated in the University’s Title IX policy. The University is committed to taking prompt action to end a hostile environment that interferes with the University’s mission. For more information about policies, procedures, and practices, see: http://diversity.pitt.edu/affirmative-action/policies-procedures-and-practices.

I ask that everyone in the class strive to help ensure that other members of this class can learn in a supportive and respectful environment. If there are instances of the aforementioned issues, please contact the Title IX Coordinator, by calling 412-648-7860, or e-mailing . Reports can also be filed online: https://www.diversity.pitt.edu/make-report/report-form. You may also choose to report this to a faculty/staff member; they are required to communicate this to the University’s Office of Diversity and Inclusion. If you wish to maintain complete confidentiality, you may also contact the University Counseling Center (412-648-7930).

Disability Services

If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.

Accessibility

The Canvas LMS platform was built using the most modern HTML and CSS technologies, and is committed to W3C’s Web Accessibility Initiative and Section 508 guidelines. Specific details regarding individual feature compliance are documented and updated regularly.

Health and Safety Statement

Please visit https://www.coronavirus.pitt.edu/ and check your Pitt email for updates before each class.

Tentative Class Schedule

Class  Lecture  Day  Date       Topics                                              Reading          Due
1      1        F    Aug 27     Introduction                                        MDSR Ch.1
2      2        M    Aug 30     R Basics (1)                                        MDSR Appendix B
3      3        W    Sep 1      R Basics (2)                                        MDSR Appendix B
4               F    Sep 3      Lab 1: R Basics
                M    Sep 6      Labor Day: No Class
5      4        W    Sep 8      Data Visualization (1)                              MDSR Ch.2
6               F    Sep 10     Lab 2: R Markdown                                                    HW 1
7      5        M    Sep 13     Data Visualization (2)                              MDSR Ch.3
8      6        W    Sep 15     Data Visualization (3)                              MDSR Ch.3
9               F    Sep 17     Lab 3: Data Visualization                                            HW 2
10     7        M    Sep 20     Data Visualization (4)                              MDSR Ch.3
11     8        W    Sep 22     Data Wrangling: One Table (1)                       MDSR 4.1, 4.2
12              F    Sep 24     Quiz 1: Data Visualization                                           HW 3
13     9        M    Sep 27     Data Wrangling: One Table (2)                       MDSR 4.1, 4.2
14     10       W    Sep 29     Data Wrangling: Two Tables (1)                      MDSR 4.3
15              F    Oct 1      Lab 4: Visualization and Wrangling
16     11       M    Oct 4      Data Wrangling: Two Tables (2)                      MDSR 4.3
17     12       W    Oct 6      Summary of Data Wrangling                           MDSR 5.1, 5.2
18              F    Oct 8      Lab 5: Data Wrangling to explore baseball database                   HW 4
19     13       M    Oct 11     Statistical Foundation                              MDSR 7.1, 7.2
20     14       W    Oct 13     Sampling Distribution and Bootstrap                 MDSR 7.3
                F    Oct 15     Fall Break: No Class                                                 HW 5
21     15       M    Oct 18     Statistical Modeling: Regression (1)                MDSR 7.5
22     16       W    Oct 20     Statistical Modeling: Regression (2)                MDSR 7.6
23              F    Oct 22     Lab 6: Tidying data with tidyr
24     17       M    Oct 25     Supervised Learning (1)                             MDSR 8.1
25     18       W    Oct 27     Supervised Learning (2)                             MDSR 8.1
26              F    Oct 29     Quiz 2: Data Wrangling                                               HW 6
27     19       M    Nov 1      Supervised Learning (3)                             MDSR 8.2
28     20       W    Nov 3      Supervised Learning (4)                             MDSR 8.2
29              F    Nov 5      Lab 7: Visualizing regression results
30     21       M    Nov 8      Supervised Learning (5)                             MDSR 8.4
31     22       W    Nov 10     Supervised Learning (6)
32              F    Nov 12     Quiz 3: Regression                                                   HW 7
33     23       M    Nov 15     Unsupervised Learning (1)                           MDSR 9.1
34     24       W    Nov 17     Unsupervised Learning (2)                           MDSR 9.2
35              F    Nov 19     Lab 8: Many Models (1)
                     Nov 22-26  Thanksgiving Break: No Class
36     25       M    Nov 29     Machine Learning Summary
37     26       W    Dec 1      Professional Ethics                                 MDSR Ch.6
38              F    Dec 3      Quiz 4: Machine Learning                                             HW 8
39     27       M    Dec 6      R Programming (1)
40     28       W    Dec 8      R Programming (2)
41              F    Dec 10     Lab 9: Many Models (2)