Alternate_Simulated_Practice_Exam_ETC1010

The document is an alternate simulated practice exam for an introductory data analysis course, covering exploratory data analysis (EDA) concepts, data visualization, clustering, and regression. It includes questions on tidy data, reproducibility, graph types, clustering methods, and regression assumptions. Each section contains multiple-choice, true/false, and open-ended questions designed to assess understanding of key data analysis principles.

Uploaded by

Tani C

Alternate Simulated Practice Exam - ETC1010/5510: Introduction to Data Analysis

PART A: EDA Concepts

Q1. [3 marks]

Explain the term 'tidy data'. Why is it important in data analysis?

Answer:

Tidy data refers to a standard format where each variable forms a column, each observation forms a row, and each type of observational unit forms a table. It ensures consistency and makes data easier to manipulate and visualize in R.

Q2. [3 marks]

You have a data frame where dates are in column names (e.g., Jan_2023, Feb_2023). Is this tidy? Explain.

Answer:

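No, it is not tidy. The column names (Jan_2023, Feb_2023) hold values of a variable (date) rather than variable names. The dates should be in one column and the measured values in another, so that each variable has its own column and each row is one observation.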
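As a minimal sketch of the reshaping described above, the following base R code turns a small wide table into a tidy one. The data frame `sales_wide`, its `store` column, and the sales figures are hypothetical and made up for illustration; in the tidyverse, `tidyr::pivot_longer()` is the usual tool for the same job.

```r
# Hypothetical wide data: one row per store, one column per month.
sales_wide <- data.frame(
  store    = c("A", "B"),
  Jan_2023 = c(100, 80),
  Feb_2023 = c(120, 90)
)

month_cols <- c("Jan_2023", "Feb_2023")

# Stack the month columns: one row per store-month combination.
sales_long <- data.frame(
  store = rep(sales_wide$store, times = length(month_cols)),
  month = rep(month_cols, each = nrow(sales_wide)),
  sales = unlist(sales_wide[month_cols], use.names = FALSE)
)

sales_long
```

The result has three columns (`store`, `month`, `sales`) and four rows, one for each store in each month.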

Q3. [1 mark]

True or False: Reproducibility includes sharing your code and data.

Answer:

True. Sharing code and data allows others to verify and replicate the analysis.

Q4. [5 marks]

Describe three ways to make your R analysis more reproducible.

Answer:

1. Use R Markdown to combine code, output, and text.

2. Set a seed for random operations with `set.seed()`.

3. Comment your code and use version control (e.g., Git).
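The effect of `set.seed()` in point 2 can be illustrated with a short sketch. The seed value 2023 is arbitrary and chosen only for this example.

```r
# Fixing the seed makes random draws repeatable.
set.seed(2023)
first_draw <- sample(1:100, 5)

# Resetting the same seed reproduces exactly the same draw.
set.seed(2023)
second_draw <- sample(1:100, 5)

identical(first_draw, second_draw)  # TRUE
```

Anyone who runs this script with the same seed (and the same R version and sampling settings) should obtain the same sample, which is what makes the analysis checkable by others.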

PART B: Data Visualisation

Q5. [3 marks]

What type of graph is most suitable to compare the distribution of test scores across three different schools?

Answer:

Side-by-side boxplots (one per school) are suitable, as they show the median, IQR, and potential outliers for each school on a common scale.
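A minimal sketch of such a plot in base R follows. The `scores` data are simulated purely for illustration (the school names, means, and spreads are assumptions, not real results).

```r
# Simulated test scores for three hypothetical schools (30 students each).
set.seed(1)
scores <- data.frame(
  school = rep(c("School A", "School B", "School C"), each = 30),
  score  = c(rnorm(30, mean = 70, sd = 8),
             rnorm(30, mean = 65, sd = 12),
             rnorm(30, mean = 75, sd = 5))
)

# One box per school, drawn on a shared y-axis so the groups are comparable.
bp <- boxplot(score ~ school, data = scores,
              xlab = "School", ylab = "Test score",
              main = "Test scores by school")

# bp$stats holds the five-number summary behind each box (one column per school).
bp$stats
```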
Q6. [3 marks]

Interpret the following: A line graph shows a steep increase in sales in December compared to November.

Answer:

It suggests a sharp rise in sales, possibly due to seasonal factors like holidays or promotions.

Q7. [6 marks]

What are two key visual elements that enhance graph readability? Provide examples.

Answer:

1. Labels and titles - clearly indicate what each axis and plot represents.

2. Appropriate use of color - e.g., different colors for product lines in a sales chart.

Q8. [3 marks]

Why should pie charts be avoided for comparing many categories?

Answer:

Pie charts become cluttered and hard to interpret with many slices. Bar charts offer better visual comparison.

PART C: Clustering

Q9. [1 mark]

True or False: k-means clustering is deterministic.

Answer:

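False. Standard k-means starts from randomly chosen centroids, so different runs can produce different clusterings. It is only repeatable when the random seed is fixed.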
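A short sketch of this behaviour, using simulated two-dimensional data (the data, the choice of three centres, and the seed values are assumptions made for illustration):

```r
# Simulated data: two loose groups of 50 points in two dimensions.
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))

# Same seed before each call: the random starting centroids are identical,
# so the two fits give identical cluster labels.
set.seed(1)
fit_a <- kmeans(x, centers = 3)
set.seed(1)
fit_b <- kmeans(x, centers = 3)

identical(fit_a$cluster, fit_b$cluster)  # TRUE

# nstart runs several random starts and keeps the best one, which makes
# the result less sensitive to any single initialisation.
set.seed(1)
fit_multi <- kmeans(x, centers = 3, nstart = 25)
```

Without the `set.seed()` calls, repeated runs are not guaranteed to agree.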

Q10. [1 mark - MCQ]

Which statement is TRUE?

a) Clusters must be spherical

b) Clustering is always accurate

c) Distance metric affects clustering

d) Number of clusters is not important

Answer:

c) Distance metric affects clustering

Q11. [3 marks]

What is a dendrogram? How can it help in choosing the number of clusters?

Answer:

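A dendrogram is a tree-like diagram showing the order in which observations and clusters are merged, and the dissimilarity (height) at which each merge happens. Cutting the tree at a height where there is a large vertical gap between successive merges suggests an appropriate number of clusters.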
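A minimal sketch of drawing a dendrogram and cutting it into clusters, using the built-in `USArrests` data. The choice of complete linkage and of four clusters is an assumption for illustration, not a recommendation.

```r
# Standardise the variables so no single one dominates the distances.
d  <- dist(scale(USArrests))
hc <- hclust(d, method = "complete")

# Draw the tree; the y-axis is the dissimilarity at which clusters merge.
plot(hc, main = "Dendrogram of USArrests", xlab = "", sub = "")

# Outline the groups obtained by cutting the tree into 4 clusters.
rect.hclust(hc, k = 4, border = "red")

# Cluster membership for each state after the cut.
groups <- cutree(hc, k = 4)
table(groups)
```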

Q12. [4 marks]

Explain the difference between complete and single linkage in hierarchical clustering.

Answer:

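Complete linkage defines the distance between two clusters as the maximum distance between any pair of observations, one from each cluster, which tends to produce compact clusters. Single linkage uses the minimum such distance, which can lead to chaining effects, where clusters grow by absorbing one nearby point at a time.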
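The difference can be checked by hand on a tiny example. The three points below (0, 1, and 3 on a line) are made up for illustration. Both methods first merge 0 and 1 at distance 1; the distance from that cluster to the point 3 is then min(3, 2) = 2 under single linkage and max(3, 2) = 3 under complete linkage.

```r
# Three points on a line: pairwise distances are 1, 2 and 3.
x <- c(0, 1, 3)
d <- dist(x)

hc_single   <- hclust(d, method = "single")
hc_complete <- hclust(d, method = "complete")

# Heights at which the two merges happen under each linkage.
hc_single$height    # 1 2
hc_complete$height  # 1 3
```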

Q13. [6 marks]

What are two limitations of hierarchical clustering?

Answer:

1. Computationally expensive for large datasets.

2. Sensitive to outliers and noise.

Q14. [3 marks]

True or False: In clustering, the order of data points affects the final output.

Answer:

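It depends on the method. k-means can give different results from run to run because of its random starting centroids, unless a seed is fixed. Hierarchical clustering with a given distance and linkage does not depend on the order of the rows, except possibly when distances are tied.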

PART D: Regression

Q15. [1 mark - True/False]

Adding a variable to a regression model always increases R-squared.

Answer:

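True in the sense that R-squared never decreases when a variable is added (it can stay the same in edge cases). A higher R-squared does not mean the model is better, so adjusted R-squared, which penalises extra variables, is better for comparing models.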
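A small simulation sketches this behaviour. The data are simulated, and the variable `noise` is unrelated to `y` by construction; all names and values here are assumptions made for illustration.

```r
set.seed(10)
n     <- 100
x     <- rnorm(n)
noise <- rnorm(n)               # pure noise, unrelated to y
y     <- 2 + 3 * x + rnorm(n)

fit_small <- lm(y ~ x)
fit_big   <- lm(y ~ x + noise)

# R-squared of the larger model is never below that of the smaller one.
r2_small <- summary(fit_small)$r.squared
r2_big   <- summary(fit_big)$r.squared
r2_big >= r2_small  # TRUE

# Adjusted R-squared penalises the extra term, so it can go down.
adj_small <- summary(fit_small)$adj.r.squared
adj_big   <- summary(fit_big)$adj.r.squared
c(adj_small, adj_big)
```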

Q16. [1 mark]

Which function extracts fitted values from a model?

Answer:

fitted()

Q17. [2 marks]

What does a negative coefficient for 'price' imply in a sales prediction model?

Answer:

It implies that as price increases, sales are expected to decrease.

Q18. [5 marks]

What are two assumptions of linear regression and how can you check them?

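Answer:

1. Linearity - plot residuals against fitted values; the residuals should scatter around zero with no systematic pattern such as a curve.

2. Homoscedasticity - in the same plot, check that the residuals have roughly constant spread across the range of fitted values.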
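Both checks use a residuals vs. fitted plot, which can be sketched as follows. The example uses the built-in `cars` data and a simple model chosen only for illustration.

```r
# Simple linear model on a built-in dataset.
fit <- lm(dist ~ speed, data = cars)

# Residuals against fitted values: look for curvature (non-linearity)
# or changing vertical spread (non-constant variance).
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)  # reference line at zero
```

Calling `plot(fit, which = 1)` produces R's built-in version of the same diagnostic.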

Q19. [4 marks]

What does a funnel shape in a residual vs. fitted plot indicate?

Answer:

It indicates heteroscedasticity - non-constant variance of errors.
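A funnel shape can be produced deliberately by simulating errors whose standard deviation grows with the predictor. Everything in this sketch (sample size, coefficients, the `0.5 * x` error scale) is an assumption chosen for illustration.

```r
set.seed(3)
x <- runif(200, min = 1, max = 10)
y <- 1 + 2 * x + rnorm(200, sd = 0.5 * x)  # error spread grows with x

fit <- lm(y ~ x)

# The residuals fan out as the fitted values increase.
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Rough numeric check: residual spread in the lower vs upper half
# of the fitted values.
upper   <- fitted(fit) > median(fitted(fit))
sd_low  <- sd(resid(fit)[!upper])
sd_high <- sd(resid(fit)[upper])
c(sd_low, sd_high)
```

With this setup the residual spread in the upper half is expected to be clearly larger than in the lower half.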

Q20. [2 marks]

Write an R function to multiply two numbers.

Answer:

```r
# Return the product of two numbers.
multiply <- function(x, y) {
  return(x * y)
}
```

Q21. [3 marks]

In regression, why is it important to standardize predictors?

Answer:

Standardizing puts predictors on a common scale, so coefficient magnitudes can be compared directly, and it prevents variables measured in large units from dominating scale-sensitive methods.
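A minimal sketch using the built-in `mtcars` data (the choice of `wt` and `hp` as predictors is an assumption for illustration). It standardizes the predictors with `scale()` and compares the two fits.

```r
# Model on the raw predictors (weight in 1000 lbs, horsepower).
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)

# Same model with each predictor centred and scaled to unit variance.
mtcars_std <- transform(mtcars,
                        wt = as.numeric(scale(wt)),
                        hp = as.numeric(scale(hp)))
fit_std <- lm(mpg ~ wt + hp, data = mtcars_std)

# Raw slopes are per original unit; standardized slopes are per one
# standard deviation of each predictor, so they are directly comparable.
coef(fit_raw)
coef(fit_std)

# For ordinary least squares the fitted values are unchanged.
all.equal(fitted(fit_raw), fitted(fit_std))  # TRUE
```

Each standardized slope equals the raw slope multiplied by that predictor's standard deviation.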
