INFO680/MKTG680 Introduction to Data Mining for Managers
Summer 2011

Instructor: Dr. Greg Smith
Email: smithg2@xavier.edu
Office: 209 Smith Hall
Phone: 745-3245
Fax: 745-3455
Office Hours: Monday 3:00 pm – 6:00 pm
              Thursday 3:00 pm – 6:00 pm
              Other times by appointment
Course Site: blackboard.xavier.edu
Classroom: 15 Hailstones Hall
Class time: Monday & Wednesday 6:00 pm – 9:15 pm

In case of emergency class cancellation, an email will be sent to advise of the situation and provide further information. In addition, a posting will be made on Blackboard.

Williams College of Business Mission: “We educate students of business, enabling them to improve organizations and society, consistent with the Jesuit tradition.”

Class Text, Hardware, & Software:
«Required»
Olson, D. and Shi, Y., Introduction to Business Data Mining, McGraw-Hill, 2007. ISBN10: 0072959711 ISBN13: 9780072959710
Applied Analytics Using SAS Enterprise Miner 6.x, SAS Institute, Cary, NC.
Individual Readings to be presented in-class and on our course Blackboard site.

Smith – Summer 2011 Course Syllabus – Data Mining

Data Files: This class will employ SAS Enterprise Miner Software. The software environment is free and available on the SAS Cloud on PCs only.

Course Description: This introductory course will familiarize students with popular data mining methods for extracting knowledge from data. Principles of data mining will be presented and discussed, but students will also acquire hands-on experience using state-of-the-art software to develop data mining solutions to real-world business problems. The course will be delivered from both a technological view and a marketing/management view. Topics and related methods discussed in the class include: data mining processes and knowledge discovery, database support to data mining, associations, classifications and prediction, clustering, recommendation systems and developing issues in data mining.
My Vision: In the last decade we have seen an explosion in the quantity of data available to businesses. Transactional data from point-of-sale scanners are now routinely available; data from direct marketing are growing exponentially; and e-commerce and web-browsing data are everywhere. Naturally, there is strong interest in extracting value or knowledge from this data. My vision of this course is to present and discuss data mining technologies and their application to data sets in an effort to support tactical and strategic business decisions. However, the overriding focus will be learning when and how to use the technologies.

Course Goals: Upon completion of this course, you should be able to:
• Understand popular data mining techniques, how to apply them, and when they are applicable
• Utilize a state-of-the-art commercial data mining package
• Apply popular data mining techniques to solve real-world problems

Course Policies:
• I will take attendance at every class period. This is simply for my information and will only come into play if attendance is poor. In this class, if you miss a session, it will be extremely hard for you to catch up because of the “learning-by-doing” emphasis.
• Assignments are to be submitted on the due date. Late assignments will not be accepted unless prior arrangements have been made with the instructor. A score of 0 will be recorded for any assignment received beyond the due date.
• Grade tracking and averaging are the responsibility of the student. Blackboard will be kept up-to-date for your convenience.

Academic Honesty: “All work submitted for academic evaluation must be the student’s own. Certainly, the activities of other scholars will influence all students. However, the direct and unattributed use of another’s efforts is prohibited as is the use of any work untruthfully submitted as one’s own.
The penalty for violation of this policy will be a zero for that assignment if it is a first offense. Subsequent violation will result in an F for the course.”

Exams: There will be three exams covering material from the textbook, readings, assignments, and Enterprise Miner usage.

Homework: There will be several homework assignments throughout the summer covering data mining applications.

In-class work: We will be performing a number of in-class assignments (to be applied outside of class) using SAS Enterprise Miner; therefore, it is important that you attend class regularly.

Class readings: Published articles will be presented for reading, review, and in-class discussion. These articles will cover current trends and practices in “real-world” data mining.

Grade Components:
  Exam 1      30%
  Exam 2      30%
  Exam 3      30%
  Homework    10%

Grade Distribution:
  A     95-100%      C+    77-79%
  A-    90-94%       C     73-76%
  B+    87-89%       C-    70-72%
  B     83-86%       D     60-69%
  B-    80-82%       F     Below 60%

Class Schedule (This is simply a guide and WILL be changed periodically.
Check Blackboard for changes)

Class Date   Class Topics                                                       Reading
5/16/11      Course Introduction
             Chapter 1 – Initial Description of Data Mining in Business         pp 1-14
             Chapter 2 – Data Mining Process and Knowledge Discovery            pp 19-31
5/18/11      Chapter 3 – Database Support for Data Mining                       pp 34-48
             Chapter 4 – Overview of Data Mining Techniques                     pp 53-70
5/23/11      Chapter 5 – Cluster Analysis                                       pp 73-96
             Chapter 11 – Market Basket Analysis                                pp 211-219
5/25/11      Chapter 8 – Decision Tree Algorithms                               pp 135-160
             Chapter 6 – Regression Algorithms in Data Mining                   pp 99-119
5/30/11      Memorial Day (No-Class)
6/1/11       Exam 1 – Chapters 1,2,3,4,5,11
             Chapter 7 – Neural Networks in Data Mining                         pp 122-131
6/6/11       Chapter 10 – Business Data Mining Applications                     pp 187-208
             Chapter 13 - Ethical Aspects of Data Mining                        pp 250-257
6/8/11       Enterprise Miner 1 & 2
6/13/11      Exam 2 – Chapters 8, 6, 7, 10, 13
             Enterprise Miner 3
6/15/11      Enterprise Miner 4 & 5
6/20/11      Enterprise Miner 6 & 7
6/22/11      Enterprise Miner 8 & 9
             Final Exam (Enterprise Miner)