IS 2065
Fall 2001 (02-1)
All available information about this course may be found at: http://www.pitt.edu/~hirtle/is2065.html
The course will provide an important foundation for further study in diverse areas, such as information retrieval, cognitive science, and marketing. The techniques discussed also underlie many modern data mining methods. In addition, the course will count as one of the two required statistics courses for the Information Science Track of the DIST PhD program.
Materials. The primary text for the term
is
A complete set of PowerPoint slides to accompany
the text can be found at: http://www.cs.sfu.ca/~han/DM_Book.html.
Additional readings will be assigned each week to complement the text.
Students should read both the text and the readings each week. However,
given your own level of knowledge and interest, you will be free to focus
on either the text or the readings. Whichever material is not the
focus of your study may be skimmed. Both text and readings will
be discussed each week during the lectures. Generally, the text will
be covered in the first half of the lecture and the readings in the
second half.
Assignments will require accounts on icarus.sis, unixs.cis, and vms.cis. See me if you have trouble getting an account on any of these machines. We will be using several packages and programs, including S-plus and SPSS, during the semester. Additional links related to the class can be found at the following sites:
Evaluation. Evaluation will occur through
a combination of three short papers and a term project. The papers can
be one of two types: a review or an analysis. A review will describe
a current problem in the data mining field and discuss various proposed
solutions, including any solutions that you might suggest. An analytical
paper will consist of a short, written analysis of a data set using one
or more techniques. The papers will be limited to 5 pages of text,
plus supporting graphs and tables. For each paper, there will be a set
of guidelines/topics that will be distributed two weeks before the paper
is due. Each paper will count for 50 points. A late paper will lose 2 points
for each day it is late. No paper will be accepted more than 7 days after the
due date. All papers must be completed independently. You must submit
at least one paper of each type at some point in the term.
In addition to the short papers, there will be a separate term project, which will be similar to the papers in style, but will include a general discussion and must cover at least two of the topic areas. The written part of the project will be limited to 8 pages of text. The project will also be presented orally during the last class meeting. The entire project will be worth 100 points, including 10 points for the oral presentation. Any extenuating circumstances that would result in missing the final deadline must be discussed in advance with the instructor. The oral presentation cannot be made up.
Special circumstances. If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and the Office of Disability Resources and Services, 216 William Pitt Union, (412-648-7890/TTY: 412-383-7355) as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course. In addition, you should be aware that my office is up a short flight of stairs. If this is problematic, I am happy to arrange a meeting in an accessible location at any time.
Course Readings
8/30: Introduction
Other References | CSNA | KD Nuggets | Citeseer