
Big Data Introduction

The document provides an overview of the MAT 6015: Big Data Analytics and Visualization course, including details about the course coordinator, Dr. Jisha Francis, and the course structure, which comprises lectures and lab sessions. It outlines the course objectives, expected outcomes, and modules covering various topics in big data analytics, as well as evaluation criteria. The course is scheduled for December - April 2025 and emphasizes practical applications and contemporary issues in the field.

Uploaded by

aritra sarkar

MAT 6015: Big Data Analytics and Visualization

Course Introduction and Overview

Dr. Jisha Francis

Department of Mathematics
School of Advanced Sciences
Vellore Institute of Technology
Vellore Campus, Vellore - 632 014
India

Dr. Jisha Francis Course Overview 1 / 18


Contact of Course Coordinator

Dr. Jisha Francis
Assistant Professor,
Department of Mathematics,
School of Advanced Sciences
Vellore Institute of Technology, Vellore.
Office: PRP 215E
Email : [email protected]



Course Details with L-T-P-J-C

Session : December - April 2025
Semester : VI
Subject Code : MAT6015
Subject Name : Big Data Analytics and Visualization
Lectures (L) : 2 hours/week
Laboratory (P) : 2 hours/week
Credits (C) : 3



Time Table

Day            Slot (Time)                              Activity
Wednesday      C1 (8:00 AM - 8:50 AM)                   Lecture
Friday         C1 (9:00 AM - 9:50 AM)                   Lecture
Tuesday        L37 + L38 (2:00 PM - 3:40 PM)            Lab
Office Hours   PRP 215E (Through prior appointment)     Doubts clarification



Classroom Details

Mode : Physical (Regular)
Theory Class : PRP 629
Lab Class : SJT 119

Attendance Policy:
Attendance will be taken within the first 10 minutes of class commencement.
After 10 minutes, no attendance will be recorded.
No exceptions will be made, regardless of the reason for late arrival.



Course Objectives

To understand the functioning of industries and business strategies.
To introduce the power of big data analytics and data visualization techniques in contributing to business value creation.
To solve a variety of complex data-centered business problems using computer software tools.



Expected Course Outcomes

Display conceptual understanding of big data analytics and visualization techniques.
Demonstrate a systematic understanding of database management concepts and their connections with big data analytics.
Develop skills in big data network analytics, text mining, and social media data mining.
Demonstrate critical awareness of how big data analytics is used for business value creation.
Apply big data techniques using statistical software.



Course Modules Overview

1. Introduction to Big Data Analytics (3 hours)
2. Advanced Analytics (4 hours)
3. Big Data Analysis Models and Algorithms (5 hours)
4. Research Trends and Applications (2 hours)
5. Data Analytics Methods Using Statistical Packages (4 hours)
6. Testing of Hypotheses (6 hours)
7. Factor Analysis (4 hours)
8. Contemporary Issues (2 hours)



Module 1: Introduction to Big Data Analytics (3 hours)

Big Data Overview: Definition and Characteristics
State of the Practice in Analytics: Current Trends
The Data Scientist: Role and Skills
Big Data Analytics in Industry Verticals: Applications in healthcare, finance, retail, etc.
Data Analytics Lifecycle: Steps from data collection to decision making



Module 2: Advanced Analytics (4 hours)

K-means Clustering: Introduction, algorithm, and applications.
Association Rules: Concepts of support, confidence, and lift.
Linear Regression: Basic model and assumptions, use cases.
Logistic Regression: Binary classification, interpretation of coefficients.
Naïve Bayes: Probabilistic classifier, assumptions, and applications.
Decision Trees: Algorithm and use cases in classification.
Time Series Analysis: Methods for analyzing temporal data.
Text Analysis: Introduction to Natural Language Processing (NLP) techniques.
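
As an illustration of the three association-rule measures named in this module, here is a minimal Python sketch. The transactions and the function names are hypothetical and invented for this example; they are not taken from the course material. It uses the standard definitions: support is the fraction of transactions containing an itemset, confidence of A -> B is support(A and B) / support(A), and lift is confidence(A -> B) / support(B).

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence, rule_lift)
```

A lift below 1, as in this toy data, suggests the two items appear together less often than they would if they were independent.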



Module 3: Big Data Analysis Models and Algorithms (5 hours)

Analytics for Unstructured Data: Importance of unstructured data and approaches.
MapReduce and Hadoop: Introduction to parallel computing, Hadoop framework.
The Hadoop Ecosystem: Key tools in the Hadoop ecosystem (Hive, Pig, Spark).
In-database Analytics: SQL essentials for big data, using MADlib for advanced in-database analytics.
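
To make the MapReduce idea in this module concrete, the following is a rough single-process sketch of the classic word-count example in plain Python. It does not use Hadoop or any Hadoop API; the function names and the sample documents are assumptions made for illustration. It only mimics the three stages (map, shuffle, reduce) that a real framework would distribute across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input documents.
documents = ["big data analytics", "big data visualization"]
counts = word_count(documents)
print(counts)
```

In a real cluster the map calls would run in parallel on separate data blocks, and the shuffle would move data over the network; the sketch keeps only the logical flow.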



Module 4: Research Trends and Applications (2 hours)

Operationalizing an Analytics Project: Moving from model development to deployment.
Creating Final Deliverables: Presenting findings to stakeholders.
Data Visualization Techniques: Tools for effective visualization (Tableau, Power BI).
Big Data Analytics Challenge: Practical application of the data analytics lifecycle.



Module 5: Data Analytics Methods Using Statistical Packages (4 hours)

Analyzing and Exploring Data: Descriptive statistics, summary statistics.
Importing and Exporting Files: Working with different file formats (CSV, Excel, etc.).
Visual Binning: Grouping continuous variables.
Graphical Plots: Box plot, scatter plot, histogram, bar and pie charts.
Fitting Curves: Parabola, cubic, and exponential fitting.
Correlation and Regression: Simple, multiple, rank correlation.
Residual Analysis: Model adequacy, outliers, and influential observations.



Module 6: Testing of Hypotheses (6 hours)

Two-sample and Paired Samples t-test: Applications and assumptions.
F-test for Variance: Comparing variances of two populations.
Chi-Square Test: Independence of attributes.
One-way and Two-way ANOVA: Analysis of variance with multiple groups.
Non-Parametric Tests: Kolmogorov-Smirnov, Kruskal-Wallis, Friedman tests.
MANOVA: Multivariate analysis of variance.
Repeated Measures ANOVA: Analysis with multiple observations on the same subjects.



Module 7: Factor Analysis (4 hours)

Principal Component Analysis (PCA): Identification and interpretation of principal components.
Varimax Rotation: Improving interpretability of factors.
Discriminant Analysis: Classification techniques and variable selection.
Logistic Regression: Applying logistic regression for classification.
Factorial Designs: Split plot designs and their applications.



Module 8: Contemporary Issues (2 hours)

Industry Expert Lecture: Insights from professionals working in Big Data.
Challenges in Big Data Analytics: Data privacy, security, and ethical issues.
Emerging Trends: Future directions in Big Data and AI technologies.



Textbooks and References
Textbooks:
Lemahieu, W., Vanden Broucke, S., Baesens, B. (2018). Principles of Database
Management: The Practical Guide to Storing, Managing and Analyzing Big and Small
Data. Cambridge University Press.
Sanders, N.R. (2014). Big Data Driven Supply Chain Management: A Framework for
Implementing Analytics and Turning Information into Intelligence. Pearson FT Press.
Reference Books:
Luke, D.A. (2015). A User’s Guide to Network Analysis in R. Springer.
Kolaczyk, E.D., Csardi, G. (2014). Statistical Analysis of Network Data with R. Springer.
Ohlhorst, F.J. (2013). Big Data Analytics, Turning Big Data into Big Money. Wiley.
Minelli, M., Chambers, M., Dhiraj, A. (2013). Big Data, Big Analytics: Emerging
Business Intelligence and Analytic Trends. Wiley.
Sathi, A. (2012). Big Data Analytics: Disruptive Technologies for Changing the Game.

Evaluation Breakdown: Total Marks (100M)

Total Marks = Internal Marks (60) + External Marks (40)

Internal Marks (60):
Assignment (Online Submission): 10M
Continuous Assessment Test (CAT-I): 15M (27-01-2025 to 02-02-2025)
Quiz-I (Multiple Choice): 10M
Continuous Assessment Test (CAT-II): 15M (16-03-2025 to 22-03-2025)
Quiz-II (Multiple Choice): 10M

External Marks (40):
Final Assessment Test (FAT): 40M (Scheduled by Controller of Examinations)
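
The breakdown above can be checked with a short Python sketch. It assumes that component marks are simply added without any scaling, which the slide suggests but does not state outright; the student scores and the function name are hypothetical.

```python
# Component maxima as listed on the evaluation slide.
INTERNAL = {"Assignment": 10, "CAT-I": 15, "Quiz-I": 10, "CAT-II": 15, "Quiz-II": 10}
EXTERNAL = {"FAT": 40}


def total_marks(scores):
    """Sum the scores after checking each against its component maximum.

    Assumes marks are added as-is, with no rescaling (an assumption).
    """
    maxima = {**INTERNAL, **EXTERNAL}
    for name, score in scores.items():
        if name not in maxima:
            raise KeyError(f"unknown component: {name}")
        if not 0 <= score <= maxima[name]:
            raise ValueError(f"{name} must be between 0 and {maxima[name]}")
    return sum(scores.values())


internal_max = sum(INTERNAL.values())
external_max = sum(EXTERNAL.values())

# Hypothetical scores for one student.
sample_scores = {"Assignment": 8, "CAT-I": 12, "Quiz-I": 7,
                 "CAT-II": 11, "Quiz-II": 9, "FAT": 30}
sample_total = total_marks(sample_scores)
print(internal_max, external_max, sample_total)
```

The internal components sum to 60 and the external component to 40, which matches the stated total of 100.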



Thank You...
