Instructor: | Dr. Vijay Raghavan |
Office: | OLVR 305 |
Office Hours: | Appointment only |
Phone: | (337) 280-8451 |
Email: | raghavan@louisiana.edu |
Grader: | Titli Sarkar |
Office: | LINC Lab (Oliver 216) |
Office Hours: | Appointment Only |
Phone: | (337) 522-8307 |
Email: | titli010203@gmail.com, C00222141@louisiana.edu |
Page Content
- Roster
- Prerequisites
- Course Outline
- Policies on Cheating
- References
- Grading Policy
- Class Notes
- Assignments
- Final Project
- Useful Links
- Sample Exam Papers
Roster
Click here to check the class roster. Please check it and let me (the TA) know if your name is not on the roster.
Prerequisites
CMPS 460G or consent of the instructor.
Some background knowledge of WWW protocols for database access from web browsers is assumed.
Course Outline
Model representation, evaluation, and search methods in data mining; knowledge discovery; classification and clustering, trend and deviation analysis, dependency derivation; integrated discovery systems, augmented database systems, and applications.
Policies on Cheating
Cheating: Cheating of any sort will NOT be tolerated. All work you submit must be entirely your own. Any student found cheating on an assignment (programming or non-programming) will be given a 0 for that assignment. This applies both to the person who shares their work and to the person who copies it. Any student found cheating on a test will be given a grade of 'C' or 'F', and in some cases the matter will also be brought to the attention of the Dean (again, this applies both to the person who shares their work and to the person who copies it).
References
- Han, Kamber & Pei, "Data Mining: Concepts and Techniques", Morgan Kaufmann, Third Edition. [pdf] [ppt]
- Tan, Steinbach & Kumar, "Introduction to Data Mining", Addison-Wesley, First Edition.
- Adriaans & Zantinge, "Data Mining", Addison-Wesley, 1996.
- Thomsen, "OLAP Solutions: Building Multidimensional Information Systems", Wiley, Second Edition. [pdf]
Grading Policy
- Term Project: 30-40%*
- Homework Assignments and Quizzes: 25-30%
- Term Test: 10-15%
- Final Exam: 20-30%
*Typically, a term project involves the design and implementation of search and indexing algorithms, interface requirements, or other information retrieval system components.
Class Notes
Index | Lecture | Link | Link |
1 | Introduction (Part 1) | [pdf] | |
2 | Introduction (Part 2) | [pdf] | |
3 | Data Objects and Attribute Types | Notes on transparencies | |
4 | Measuring Data Similarity/Dissimilarity | [pdf] | |
5 | Statistical Descriptions and Data Preprocessing | [pdf] | |
6 | Data Warehousing | [pdf] | |
7 | Data Warehouse (Part 2) | [pdf] | |
8 | Association Rule Mining | [pdf] | |
9 | Association Rule Mining (Part 2) | [pdf] | |
10 | Classification: Basic Concepts | [pdf] | |
11 | Clustering Analysis: Basic Concepts and Methods | [pdf] | |
Assignments
Note:
- All non-programming assignments should be written legibly (Please check Policies on Cheating).
- Before submission, make a photocopy of the assignment for your own reference.
- Submit only the original.
- Retain the photocopy. DO NOT submit it.
- Please staple the question paper on top of the answer sheet.
- Answer sheets that are not stapled properly will not be graded.
- All assignments should be done individually unless otherwise stated.
- Academic dishonesty will be prosecuted in accordance with the rules and regulations specified by the university.
- All answer sheets should be numbered.
- Begin the answer to each question on a separate page.
- Please provide an index, stating each question number and the corresponding page number where its answer can be found.
Useful Links
- Chapter 1 scanned version from Advances in KDD book [Scanned_PDF]
- Data Mining task primitives [Scanned_PDF]
- Chapter 1, Han et al. text
- Chapter 2 from Vipin Kumar's book [Scanned_PDF]
- Chapter 2.4, Han et al. text
- Chapter 4 from the Adriaans book [Scanned_PDF]
- Chapter 3, Han et al. text
- Section 2.6.1 from Han et al. text, 2nd Edition [Scanned_PDF]
- Test of Statistical Independence for Categorical Attributes [Scanned_PDF]
- Some useful data mining tools [DM Tools]
Introduction (Part 1)
Introduction (Part 2)
Data Objects and Attribute Types
Measuring Data Similarity/Dissimilarity
Statistical Descriptions and Data Preprocessing
Useful Data Mining tools
Last updated: January 22, 2020