Data Mining

CSCE 566
Tuesday and Thursday - 9:30 am to 10:45 am
Offered remotely
Spring 2021


Instructor: Dr. Vijay Raghavan
Office: OLVR 305
Office Hours: Appointment only
Phone: (337) 280-8451
Email: raghavan@louisiana.edu
Grader: Titli Sarkar
Office: LINC Lab (Oliver 216)
Office Hours: Appointment only
Phone: (337) 522-8307
Email: titli010203@gmail.com, C00222141@louisiana.edu


Roster

Click here to check the class roster

Please check the roster and let me (the TA) know if your name is not on it.


Prerequisites

CMPS 460G or consent of the instructor.

Some background knowledge on WWW protocols for database access from web browsers is assumed.


Outline

Model representation, evaluation, and search methods in data mining; knowledge discovery; classification and clustering, trend and deviation analysis, dependency derivation; integrated discovery systems, augmented database systems, and applications.


Policies on Cheating

Cheating: Cheating of any kind will NOT be tolerated. All work you submit must be entirely your own. Any student found cheating on an assignment (programming or non-programming) will receive a 0 for that assignment. This applies both to the person who shares their work and to the person who copies it. Any student found cheating on a test will receive a grade of 'C' or 'F', and in some cases the matter will also be brought to the attention of the Dean (again, this applies both to the person who shares their work and to the person who copies it).


References

 


Grading Policy

*Typically, a term project involves the design and implementation of search and indexing algorithms, interface requirements, or other information retrieval system components.


Class Notes

Index  Lecture                                          Link
1      Introduction (Part 1)                            [pdf]
2      Introduction (Part 2)                            [pdf]
3      Data Objects and Attribute Types                 Notes on transparencies
4      Measuring Data Similarity/Dissimilarity          [pdf]
5      Statistical Descriptions and Data Preprocessing  [pdf]
6      Data Warehousing                                 [pdf]
7      Data Warehouse (Part 2)                          [pdf]
8      Association Rule Mining                          [pdf]
9      Association Rule Mining (Part 2)                 [pdf]
10     Classification: Basic Concepts                   [pdf]
11     Clustering Analysis: Basic Concepts and Methods  [pdf]

Assignments

  1. Assignment 1: Due 02/13/2020
  2. Assignment 2: Due 03/29/2020

Note:

  1. All non-programming assignments should be written legibly (Please check Policies on Cheating).
  2. Before submission, make a photocopy of the assignment for your own reference.
  3. Only the original should be submitted.
  4. Retain the photocopy. DO NOT submit it.
  5. Please staple the question paper on top of the answer sheet.
  6. Answer sheets that are not stapled properly will not be graded.
  7. All assignments should be done individually unless otherwise stated.
  8. Academic dishonesty will be prosecuted in accordance with the rules and regulations specified by the university.
  9. All answer sheets should be numbered.
  10. Begin the answer to each question on a separate page.
  11. Please provide an index, stating each question number and the corresponding page number where its answer can be found.

Final Project

Class project proposal and the final report should have the following details

Useful Links

    Introduction (Part 1)
    1. Chapter 1, scanned version from Advances in KDD book [Scanned_PDF]
    2. Data Mining task primitives [Scanned_PDF]
    Introduction (Part 2)
    1. Chapter 1, Han et al. text
    Data Objects and Attribute Types
    1. Chapter 2 from Vipin Kumar's book [Scanned_PDF]
    Measuring Data Similarity/Dissimilarity
    1. Chapter 2.4, Han et al. text
    Statistical Descriptions and Data Preprocessing
    1. Chapter 4 from Adriaans' book [Scanned_PDF]
    2. Chapter 3, Han et al. text
    3. Section 2.6.1 from Han et al. text, 2nd Edition [Scanned_PDF]
    4. Test of Statistical Independence for Categorical Attributes [Scanned_PDF]
    Useful Data Mining Tools
    1. Some useful data mining tools [DM Tools]

Sample Exam Papers

These are the previous midterm and final exam question papers, provided for your reference.


Last updated: January 22, 2020