Big Data Analytics
CMPS 499-003 / CSCE 598-003
Tuesday and Thursday - 3:30 pm to 4:45 pm
OLVR 116
Spring 2018
Instructor: | Dr. Vijay Raghavan |
Office: | OLVR 305 |
Office Hours: | Tue: 1:30 pm - 3:00 pm; Wed: 4:00 pm - 5:30 pm |
Phone: | (337) 482-6603 |
Email: | raghavan@louisiana.edu |
Grader: | TBA |
Office: | |
Office Hours: | |
Phone: | |
Email: |
Roster
Click here to check the class roster. Please check it and let me (the TA) know if your name is not on the roster.
Outline
Essentials of Big Data analytics. Topics include: challenges and opportunities posed by Big Data in a variety of domains; predictive analytics and other advanced methods to extract value from data; innovative statistical techniques to glean insights from data; frameworks for parallelizing data pre-processing and data analytics, such as Hadoop and Spark; and distributed algorithms to accelerate knowledge discovery.
Policies on Cheating
Cheating: Cheating of any kind will NOT be tolerated. All work you submit must be entirely your own. Any student found cheating on an assignment (programming or non-programming) will be given a 0 for that assignment. This applies both to the person who shares their work and to the person who copies it. Any student found cheating on a test will be given a grade of 'C' or 'F' and, in some cases, will also be reported to the Dean (again, this applies both to the person who shares their work and to the person who copies it).
References
- "Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information", Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2013. ISBN 9780124047242.
- Murali K. Pusala, Mohsen Amini Salehi, Jayasimha R. Katukuri, Ying Xie and Vijay V. Raghavan, "Massive Data Analysis: Tasks, Tools, Applications and Challenges". [Full_Text]
- Uthayasankar Sivarajah, Muhammad Mustafa Kamal, Zahir Irani and Vishanth Weerakkody, "Critical analysis of Big Data challenges and analytical methods". [Full_Text]
- Jure Leskovec, Anand Rajaraman and Jeffrey D. Ullman, "Mining of Massive Datasets". [Full_Text]
- Pieter Adriaans and Dolf Zantinge, "Data Mining", Addison-Wesley, 1996. [Full_Text]
- Pang-Ning Tan, Michael Steinbach and Vipin Kumar, "Introduction to Data Mining", Addison-Wesley, 2006. [Full_Text]
Grading Policy
- Term Project: 30-40%*
- 2 - 4 Homework Assignments: 25-30%
- One Term Test + 3 Quizzes: 10-15%
- Final Exam: 20-30%
*Typically, a term project involves the design and implementation of analytic applications using Big Data tools or platforms, or of user interfaces or other system components for Big Data Search (Reduction) systems.
Class Notes
Index | Lecture | PDF | PPT |
1 | Introduction | [pdf] | [ppt] |
2 | Introduction II | [pdf] | |
3 | Visual Analytics Sandbox | [pdf] | [ppt] |
4 | Hadoop Basics | [pdf] | [ppt] |
5 | Introduction to Spark | [pdf] | [ppt] |
6 | Data Preprocessing in Spark | [pdf] | [ppt] |
7 | Big Data Project Failures | [pdf] | [ppt] |
8 | Matrix-Vector Multiplication by MapReduce | [pdf] | [ppt] |
9 | Types of Data | [pdf] | |
10 | Clustering Tips and Tricks | [pdf] | |
11 | Sentiment Analysis with PySpark | [pdf] | [ppt] |
Assignments
Note:
- All non-programming assignments should be written legibly (please check Policies on Cheating).
- Before submission, make a photocopy of the assignment (for reference).
- Only the original should be submitted.
- Retain the photocopy. DO NOT submit it.
- Please staple the question paper on top of the answer sheet.
- Answer sheets that are not stapled properly will not be graded.
- All assignments should be done individually unless otherwise stated.
- Academic dishonesty will be prosecuted in accordance with the rules and regulations specified by the university.
- All answer sheets should be numbered.
- Begin the answer to each question on a separate page.
- Please provide an index, stating each question number and the corresponding page number where its answer can be found.
Last updated: January 27, 2018