CHO AI 105 - Data Analytics-As Shared


Course Plan

A. Course Handout (For Students & Faculty)

Institute/School/College Name: Chitkara University Institute of Engineering & Technology
Department/Centre Name: Department of Computer Science & Engineering
Programme Name: Bachelor of Engineering - Computer Science & Engineering (Artificial Intelligence)
Course Name: Data Analytics | Session: 2022-23
Course Code: AI105 | Semester/Batch: 3rd / 2022
Lecture/Tutorial (Per Week): 2-0-4 | Course Credits: 4
Course Coordinator Name: Dr. Sushil Kumar Narang

1. Scope & Objective of the Course:


With great amounts of data comes a great need for data analysts. Organizations generate and collect
an exponentially growing amount of data: wringing actionable answers and insights out of the chaos is
a valuable and in-demand skill set to have. Organizations across industries need these answers and
insights to improve the decisions they make. B2B and B2C commerce, health care, manufacturing, and
marketing all use data analytics to improve processes and enhance profits.
This course prepares students for a career in the high-growth field of data analytics. It introduces the
process for planning data analysis solutions and the various data analytic processes involved. The course
takes students through five key factors that indicate the need for specific AWS services in collecting,
processing, analyzing, and presenting data. This includes learning basic architectures, value
propositions, and potential use cases.
The core objectives of this course are:

● To gain an immersive understanding of the practices and processes used by a data analyst in
their day-to-day job operations
● To teach how to clean and organize data for analysis, and complete analysis and calculations
using spreadsheets, SQL and Python programming
● To inculcate key analytical skills (data cleaning, analysis, & visualization) and tools (Python
programming, Tableau, Power BI)
● To know how to visualize and present data findings in dashboards, presentations and
commonly used visualization platforms.

2. Course Learning Outcomes:

At the end of the course, students will be able to:


CLO01: Build proficiency in the statistical analysis of data
CLO02: Apply data science concepts and methods to solve problems in real-world contexts and
communicate these solutions effectively
CLO03: Carry out standard data visualization and formal inference procedures
CLO04: Perform data cleaning, and transform variables to facilitate analysis by integrating data from
disparate sources
CLO05: Build and enhance business intelligence capabilities by adapting the appropriate technology
and software solutions

CLO-PO Mapping Grid (Program Outcomes (POs) are available as part of the Academic Program Guide)

Course Learning Outcomes | PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CLO1 | L L L L L L L M
CLO2 | H M H H M L L M
CLO3 | H M H H M L L M
CLO4 | H M H H M L L M
CLO5 | M M M H M L L M

3. Recommended Books (Reference Books/Textbooks):


B01: Jojo Moolayil, “Smarter Decisions: The Intersection of IoT and Data Science”, PACKT, 2016.
B02: Cathy O’Neil and Rachel Schutt, “Doing Data Science”, O'Reilly, 2015.
B03: David Dietrich, Barry Heller, Beibei Yang, “Data Science and Big Data Analytics”, EMC, 2013.
B04: Raj, Pethuru, “Handbook of Research on Cloud Infrastructures for Big Data Analytics”, IGI Global.

4. Other readings & relevant websites:

S.N. Link of Journals, Magazines, Websites, and Research Papers

1. https://www.tutorialspoint.com/statistics/index.html
2. https://iridl.ldeo.columbia.edu/dochelp/StatTutorial/index.html
3. https://www.khanacademy.org/math/statistics-probability
4. https://www.analyticsvidhya.com/blog/2021/02/an-intuitive-guide-to-visualization-in-python/
5. https://www.educba.com/data-science/data-science-tutorials/tableau-tutorial/
6. https://matplotlib.org/stable/tutorials/index.html
7. https://seaborn.pydata.org/tutorial.html

5. Recommended Tools and Platforms

Python, Jupyter Notebook, Visual Studio Code, Anaconda, Tableau, Microsoft Power BI

6. Course Plan:

Lecture Number | Topics | Details
1 | Understanding data | Introduction – Types of Data: Numeric – Categorical – Graphical – High Dimensional Data
2 | Classification of digital Data | Structured, Semi-Structured and Unstructured; Sources of Data
3 | Case Studies on Different types of Data | Time Series – Transactional Data – Biological Data – Spatial Data – Social Network Data
4 | Data Evolution | Understand issues relating to acquisition, cleaning and loading of data, Data Deluge, Data lake
5-7 | Python Numpy | Arrays, Indexing, Slicing; Different Array Operations; Linear Algebra Operations
8-12 | Python Pandas | Dataframes, Handling of data; Series; Data wrangling, Alignment and Indexing, Handling Missing Data, Data Cleaning; Merging and Joining Dataframes, Grouping; Concatenation and Aggregation; Masking, Performing Mathematical Operations on Data


13-16 | Access and combine data from CSV, JSON, logs, APIs, and databases | Using Pandas to access different sources of Data
17-19 | Using SQL with Databases | DDL, DML, Select and Joins
20-23 | Advanced Operations using Pandas | Statistical Functions; Descriptive Statistics; Working with Text Data; Time Delta; Basic Data Visualization using pandas
24-28 | Matplotlib, Seaborn, Cufflinks | Different Plot types: Scatter, bar, histogram, box, pie, violin; Subplots, axes and figures; Text, labels and annotations; Colormaps; Plotting with Seaborn; Plotting Categorical and Continuous Data; Visualizing Regression models; Plotting interactive plots using Cufflinks
29-33 | Using Power BI for Visualization | Connecting to different data sources; Data types; Working with Meta Data; Calculations; Purpose of Data Analysis Expressions (DAX); DAX operators & Functions; Power Query; Different Charts & Reports; Exploring Data geographically; Building dashboard to see insights
34-35 | Time-Series Analysis | Understanding Trend, Seasonality and Residuals, Moving Averages, Exponentially Weighted Moving Averages, Autocorrelation, Autoregression models, ARIMA, SARIMAX
ST-1 (Syllabus covered from Lecture 1 to 30)
36-38 | Statistical Learning | Important statistical concepts used in data science; Difference between population and sample, Types of variables, Measures of central tendency; Measures of variability, Coefficient of variation, Skewness and Kurtosis; Exploratory data analysis: Missing value analysis; The correlation matrix, Outlier detection analysis; Inferential Statistics: Normal distribution
39-44 | Test hypotheses, Parametric and Non-parametric tests | Central limit theorem, Confidence interval; T-test, Type I and II errors, Student’s T distribution; Non-Parametric Tests: Sign Test; Wilcoxon’s Signed Rank Test; Mann-Whitney test; Kolmogorov-Smirnov test
45-47 | Understanding Regression | Linear and Non-linear Regression
48-50 | ANOVA | One-way and Two-way Analysis of Variance; R square, Correlation and causation; Dependent and independent variables - Case Studies
51-53 | Identification of regression problems | Case Studies
54-57 | Identification of Classification problems | Case Studies
58-60 | Identification of Clustering Problems | Case Studies

ST-2 (Syllabus covered from Lecture)

END TERM – FULL SYLLABUS

7. Delivery/Instructional Resources

Lecture No. | Topics | PPT (Link of ppts on the central server) | Industry Expert Session (If yes: link of ppts on the central server) | Web References | Audio-Video
1 | Understanding data | | | https://www.sqlshack.com/introduction-to-data-science-data-understanding-and-preparation/ ; https://ibm-cloud-architecture.github.io/refarch-data-ai-analytics/preparation/data-understanding/ |
2 | Classification of digital Data | | | https://ibm-cloud-architecture.github.io/refarch-data-ai-analytics/preparation/data-understanding/ | https://www.youtube.com/watch?v=mm2A5tKVIpg
3 | Case Studies on Different types of Data | | | https://data-flair.training/blogs/big-data-case-studies/ |


4 | Data Evolution | | | https://www.kdnuggets.com/2014/06/data-lakes-vs-data-warehouses.html | https://www.youtube.com/watch?v=E49BFhThC3U
5-7 | Python Numpy | | | https://numpy.org/doc/stable/user/index.html#user | https://www.youtube.com/watch?v=j31ah5Qa4QI
8-12 | Python Pandas | | | https://pandas.pydata.org/docs/user_guide/index.html | https://www.youtube.com/watch?v=UB3DE5Bgfx4
13-16 | Access and combine data from CSV, JSON, logs, APIs, and databases | | | https://pandas.pydata.org/docs/user_guide/io.html | https://www.youtube.com/watch?v=GFBxxxjAzaU
17-19 | Using SQL with Databases | | | https://www.sqltutorial.org/ | https://www.youtube.com/watch?v=zbMHLJ0dY4w
20-23 | Advanced Operations using Pandas | | | https://www.kdnuggets.com/2019/10/5-advanced-features-pandas.html | https://www.youtube.com/watch?v=DUgd48QYmfI ; https://www.youtube.com/watch?v=RlIiVeig3hc
24-28 | Matplotlib, Seaborn, Cufflinks | | | https://matplotlib.org/stable/plot_types/index.html ; https://seaborn.pydata.org/tutorial.html ; https://www.analyticsvidhya.com/blog/2021/06/advanced-python-data-visualization-libraries-plotly/ | https://www.youtube.com/watch?v=3Xc3CA655Y4 ; https://www.youtube.com/watch?v=ooqXQ37XHMM ; https://www.youtube.com/watch?v=7n5GzKuvPsw
29-33 | Using Power BI for Visualization | | | https://www.tutorialspoint.com/power_bi/index.htm | https://www.youtube.com/watch?v=NalazxBo-90 ; https://www.youtube.com/watch?v=3u7MQz1EyPY


34-38 | Statistical Learning | | | https://realpython.com/python-statistics/ | https://www.youtube.com/watch?v=mQ-3KwrBIN0
39-44 | Test hypotheses, Parametric and Non-parametric tests | | | https://machinelearningmastery.com/nonparametric-statistical-significance-tests-in-python/ | https://www.youtube.com/watch?v=IcLSKko2tsg
45-47 | Understanding Regression | | | https://statisticsbyjim.com/regression/regression-tutorial-analysis-examples/ | https://www.youtube.com/watch?v=nk2CQITm_eo
48-50 | ANOVA | | | https://www.reneshbedre.com/blog/anova.html | https://www.youtube.com/watch?v=ITf4vHhyGpc ; https://www.youtube.com/watch?v=QOl0_OdvbdE
51-53 | Identification of regression problems | | | https://users.stat.ufl.edu/~winner/cases.html | https://www.youtube.com/watch?v=HgfHefwK7VQ&t=268s
54-57 | Identification of Classification problems | | | https://www.techguruspeaks.com/case-study-classification/ | https://www.youtube.com/watch?v=T5zJHhTO1FA
58-60 | Identification of Clustering Problems | | | http://ucanalytics.com/blogs/customer-segmentation-cluster-analysis-telecom-case-study-example/ | https://www.youtube.com/watch?v=lc7MLQpjqZ8

8. Action plan for different types of learners

Slow Learners:
● Multiple Remedial Extra Classes
● Encouragement for improvement using Peer Tutoring

Average Learners:
● Doubt-sessions
● Pre-coded algorithms to illustrate concepts and notions
● E-notes and E-exercises to read in addition to pedagogic material

Fast Learners:
● More Practice assignments on real life problems
● Engaging students to hold hands of slow learners by creating a Peer Tutoring Group
● Participation in Hackathons, competitions


9. Evaluation Scheme & Components:

Evaluation Component | Type of Component | No. of Assessments | Weightage of Component | Mode of Assessment
Component 1 | Subjective Test/Sessional Tests (STs) | 02* | 40% | Offline/Online
Component 2 | End Term Examinations | 01 | 60% | Offline/Online
Total | | | 100% |
*The ERP system automatically takes the average of the marks of the 02 STs as the final ST marks.

10. Details of Evaluation Components:

Evaluation Component | Description | Syllabus Covered (%) | Timeline of Examination | Weightage (%)
Component 1 | ST 01 | Up to 36% | 4th April, 2022 | 40% (ST 01 and ST 02 combined)
Component 1 | ST 02 | 37% - 100% | 23rd May, 2022 |
Component 2 | End Term Examination* | 100% | At the End of the Semester | 60%
Total | | | | 100%
*As per Academic Guidelines, a minimum of 75% attendance is required to be eligible to appear in the End Semester Examination.

Evaluation Components

Type of Assessment | Timeline of Conduct | Total Marks | Question Paper Format: 1 Mark MCQ | 2 Mark MCQ | 5 Mark Question | 10 Mark Algorithm/Case Study
Sessional Test 1 | 4th April, 2022 | 40 | 5 | 5 | 1 | 2
Sessional Test 2 | 23rd May, 2022 | 40 | 5 | 5 | 1 | 2
End Term Examination | | 60 | 10 | 5 | 4 | 2
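
As a rough illustration of how the two components might combine, the sketch below averages the two Sessional Test marks and adds the End Term marks, using the 40% and 60% weightages and the 40-mark and 60-mark totals stated in the evaluation tables. The function name `final_percentage` and the linear scaling are assumptions made for this example only; the handout does not give an explicit formula, and the ERP system may compute final marks differently.

```python
# Illustrative sketch only: the linear scaling below is an assumption,
# not an official formula from the handout or the ERP system.

def final_percentage(st1, st2, end_term, st_max=40, end_max=60):
    """Combine sessional and end-term marks into a score out of 100."""
    # Reject marks outside the stated totals (40 per ST, 60 for End Term).
    for mark, limit in ((st1, st_max), (st2, st_max), (end_term, end_max)):
        if not 0 <= mark <= limit:
            raise ValueError(f"mark {mark} is outside 0..{limit}")
    st_average = (st1 + st2) / 2          # average of the two STs
    st_part = 40 * st_average / st_max    # Component 1 carries 40%
    end_part = 60 * end_term / end_max    # Component 2 carries 60%
    return st_part + end_part


if __name__ == "__main__":
    # Hypothetical marks: 30 and 34 in the STs, 45 in the End Term.
    print(final_percentage(30, 34, 45))
```

Under these assumptions, the ST average contributes at most 40 points and the End Term at most 60, so full marks in every assessment give 100.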

B. Syllabus of the Course

Subject: Data Analytics | Subject Code: AI105

S.N. | Topic(s) | No. of Lectures | Weightage %
1 | Understanding data: Introduction – Types of Data: Numeric – Categorical – Graphical – High Dimensional Data – Classification of digital Data: Structured, Semi-Structured and Unstructured - Example, Applications, Sources of Data: Time Series – Transactional Data – Biological Data – Spatial Data – Social Network Data – Data Evolution, Understand issues relating to acquisition, cleaning and loading of data, Data Deluge, Data lake | 4 | 10%
2 | Python Numpy, Python Pandas: Dataframes, Handling of data, Data wrangling, Alignment and Indexing, Handling Missing Data, Data Cleaning, Merging and Joining Dataframes, Grouping, Masking, Performing Mathematical Operations on Data, Access and combine data from CSV, JSON, logs, APIs, and databases | 14 | 25%
3 | Data Visualization using pandas, Matplotlib, Seaborn, Using Tableau for Visualization, Connecting to different data sources, Data types, Working with Meta Data, Calculations, Purpose of Data Analysis Expressions (DAX), DAX operators & Functions, Power Query, Different Charts & Reports, Exploring Data geographically, Building dashboard to see insights | 10 | 20%
4 | Important statistical concepts used in data science, Difference between population and sample, Types of variables, Measures of central tendency, Measures of variability, Coefficient of variation, Skewness and Kurtosis. Exploratory data analysis: Missing value analysis, The correlation matrix, Outlier detection analysis | 5 | 8%
5 | Inferential Statistics: Normal distribution, Test hypotheses, Parametric and Non-parametric tests, Central limit theorem, Confidence interval, T-test, Type I and II errors, Student’s T distribution, Non-Parametric Tests: Sign Test, Wilcoxon’s Signed Rank Test, Mann-Whitney test, Kolmogorov-Smirnov test | 6 | 10%
6 | Regression, ANOVA (One-way and Two-way Analysis of Variance), R-square, Correlation and causation, Introduction to classification problems, Identification of a regression problem, dependent and independent variables, Clustering Problems, Different Case Studies | 16 | 27%

This document is approved by:

Designation Name Signature


Course Coordinator Dr. Sushil Kumar Narang
Program In-charge Dr. Kamal Deep Garg
Cluster Dean Dr. Sushil Kumar Narang
Date (DD/MM/YYYY)
