Data Science Question Bank Updated - Google Docs

The document is a comprehensive question bank for a Data Science course, covering topics such as the definition and significance of Data Science, its interdisciplinary fields, the roles of Data Scientists, and the Data Science process. It includes sections on statistics, machine learning concepts, and data visualization, along with practical numerical problems and ethical considerations in data science. Each unit contains detailed questions aimed at assessing knowledge and understanding of the subject matter.

Uploaded by

Piyush Patil

CE0630-Data Science Question Bank

Unit 1 Introduction to Data Science

Introduction to Data Science

1. Define Data Science and explain its significance. How does it
differ from traditional data analysis and Business Intelligence?
2. What are the key interdisciplinary fields that contribute to Data
Science? Explain their roles.
3. What are the main responsibilities of a Data Scientist? How do
they differ from Data Analysts and Data Engineers?
4. What key skills are required to become a Data Scientist? Discuss
both technical and non-technical skills.
5. How does Data Science contribute to business decision-making?
Give real-world examples.
6. List out important use cases of data science.
7. Give the differences between a Data Engineer, a Data Analyst, and a Data
Scientist.
8. Discuss some major applications of Data Science in healthcare,
finance, e-commerce, and social media analytics.
9. Explain the relationship between Data Science and Big Data.
What are the key characteristics (6Vs) of Big Data?
10. What is the role of Machine Learning in Data Science?
Differentiate between Supervised, Unsupervised, and
Reinforcement Learning.
11. How is Data Science used in fraud detection, recommendation
systems, and predictive maintenance? Explain with examples.
12. Discuss the role of Data Science in social media analytics,
customer segmentation, and personalized marketing.
13. Discuss the application areas of Data Science.
14. Discuss the use cases of Data Science.

Data Science Process Overview

1. List out and explain the various phases of the data science process.
2. Why is goal definition important in a Data Science project? How
does domain expertise help in this process? Explain the first step of
the data science process: setting the research goal and project charter.
3. What is the data retrieval phase? What are the common sources of
data in Data Science?
4. Explain the data preparation phase with data cleaning, transformation,
and combining data.
5. Explain various types of data entry errors and how to fix them in
the data preparation phase.
6. What is data cleaning, and why is it crucial? Discuss handling
missing data, outliers, and feature scaling.
7. What is Exploratory Data Analysis (EDA)? How do visualization
techniques help in understanding data? OR Explain the data exploration
phase in the data science process.
8. What is data modeling in Data Science? What is a hold-out sample?
How does cross-validation improve model performance?

Unit II Introduction to Statistics

1. Basics of Statistics: Descriptive Statistics

1. What is statistics? How is it used in Data Science?

2. Differentiate between Descriptive and Inferential Statistics with
examples.
3. Explain the concepts of Population and Sample in statistics. Why is
sampling important?
4. What are the different types of variables in statistics? Give examples.
5. Define Measures of Central Tendency. Why is it important in data
science? Explain Mean, Median, and Mode with examples.
6. What are Measures of Variability? Discuss Range, Variance, and
Standard Deviation.
7. Define Coefficient of Variation (CV) and explain its significance.
8. What is Skewness? How does it indicate the shape of a distribution?
9. What is Kurtosis? How does it describe the characteristics of a
probability distribution?

3. Inferential Statistics

10. What is the Normal Distribution, and why is it important in statistics?

11. Explain Hypothesis Testing and its importance in statistics.
12. What is the Central Limit Theorem (CLT)? Why is it important in
inferential statistics?
13. What is a Confidence Interval? How is it calculated?
14. What is a t-test? Explain its applications with examples.
15. Differentiate between Type I and Type II errors in hypothesis testing.

Numerical Problems

15. The number of points scored by two teams in a hockey match is
given below. With the help of the Coefficient of Variation, determine
which team is more consistent.

16. Coefficients of Variation and Standard Deviation of two series X and Y
are 55.43% and 48.86%, and 25.5 and 24.43, respectively. Find the
means of series X and Y.
17. The standard deviation and mean of the data are 8.5 and 14.5
respectively. Find the coefficient of variation.
18. If the mean and coefficient of variation of the data are 13 and 38
respectively, find the value of the standard deviation.
19. The mean and standard deviation of marks received by 40 students of a
class in three subjects, Mathematics, English and Economics, are given
below. Which of the three subjects shows the highest variation
and which shows the lowest variation in marks?

Subject      Mean   Standard deviation
Maths        65     10
English      60     12
Economics    57     14

20. In a small business firm, two typists are employed: Typist A and Typist B.
Typist A types out, on average, 30 pages per day with a standard
deviation of 6. Typist B, on average, types out 45 pages with a
standard deviation of 10. Which typist shows greater consistency in
output?
21. The male population’s weight data follows a normal distribution. It
has a mean of 70 kg and a standard deviation of 15 kg. What would
the mean and standard deviation of a sample of 50 men be if a
researcher looked at their records?
22. A distribution has a mean of 69 and a standard deviation of 420.
Find the mean and standard deviation if a sample of 80 is drawn
from the distribution.
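
Questions 21 and 22 can be read as asking for the mean and standard deviation of the sampling distribution of the sample mean. That reading is an assumption, since the questions do not say so explicitly. Under it, the mean is unchanged and the standard deviation becomes sigma / sqrt(n). A minimal Python sketch, where the helper name `sampling_distribution` is hypothetical:

```python
import math


def sampling_distribution(mu, sigma, n):
    """Return (mean, standard error) of the sample mean for samples of size n.

    Assumes independent draws from a population with mean mu and
    standard deviation sigma.
    """
    return mu, sigma / math.sqrt(n)


# Question 21: population mean 70 kg, standard deviation 15 kg, n = 50
mean_21, se_21 = sampling_distribution(70, 15, 50)
print(f"Q21: mean = {mean_21}, standard error = {se_21:.4f}")

# Question 22: population mean 69, standard deviation 420, n = 80
mean_22, se_22 = sampling_distribution(69, 420, 80)
print(f"Q22: mean = {mean_22}, standard error = {se_22:.4f}")
```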
23. A nutritionist claims that the average sugar content in a brand of cereal is
less than 10 grams per serving. A random sample of 30 cereal boxes
shows an average sugar content of 9.5 grams with a standard deviation
of 1.2 grams. At a 5% significance level (α = 0.05), test whether the
nutritionist's claim is supported.
24. A manufacturer claims that the average lifespan of its LED bulbs is at
least 25,000 hours. A consumer protection agency tests 40 randomly
selected bulbs and finds an average lifespan of 24,500 hours with a
standard deviation of 1,200 hours. At a 5% significance level (α =
0.05), test whether the agency’s data contradicts the manufacturer’s
claim.
25. A soft drink company claims that the average sugar content in its cola is
39 grams per can. A health organization collects a random sample of
50 cans and finds the average sugar content is 40 grams, with a
standard deviation of 2 grams. At a 1% significance level (α = 0.01),
test if the actual sugar content is different from 39 grams.
26. A company manufacturing automobiles finds that tyre life is normally
distributed with a mean of 40,000 km and a standard deviation of 3,000
km. It is believed that a change in the production process will result in a
better product, and the company has developed a new tyre. A sample of
100 new tyres has been selected. The company has found that the mean
life of these new tyres is 40,900 km. Can it be concluded that the new
tyre is significantly better than the old one, using a significance level of
0.01?
Hint: we are interested in testing whether or not there has been an
increase in the mean life of tyres, that is, whether the mean life of the new
tyre has increased beyond 40,000 km.
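
One way to check the tyre problem is a one-tailed z-test on the mean. The sketch below assumes the old standard deviation of 3,000 km still applies to the new tyres (the question does not state this), and the variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu0 = 40_000          # mean tyre life under the old process (km)
sigma = 3_000         # assumed population standard deviation (km)
n = 100               # sample size
sample_mean = 40_900  # observed mean life of the new tyres (km)
alpha = 0.01          # significance level

# H0: mu <= 40,000 km   versus   H1: mu > 40,000 km (right-tailed test)
z = (sample_mean - mu0) / (sigma / math.sqrt(n))
z_critical = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)
reject_null = z > z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Problems 23 to 25 have sample standard deviations rather than a known population value, so a t-test may be the intended method there; this sketch does not cover that case.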
28. Following are the runs scored by two batsmen in 5 cricket matches. Who
is more consistent in scoring runs?

Batsman A: 38 47 34 18 33

Batsman B: 37 35 41 27 35
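
Consistency questions of this kind are usually answered by comparing coefficients of variation: the lower value indicates the more consistent series. The sketch below uses the population standard deviation, which is an assumption; using the sample standard deviation changes the figures but not the ranking here. The helper name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent (population SD / mean * 100)."""
    return pstdev(values) / mean(values) * 100


batsman_a = [38, 47, 34, 18, 33]
batsman_b = [37, 35, 41, 27, 35]

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)

# The lower coefficient of variation marks the more consistent batsman.
more_consistent = "A" if cv_a < cv_b else "B"
print(f"CV(A) = {cv_a:.2f}%, CV(B) = {cv_b:.2f}%, more consistent: {more_consistent}")
```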

29. Find the skewness for the given data (2, 4, 6, 6):

Skewness = 3(Mean – Median)/S.D.
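
The formula above (Pearson's median skewness coefficient) can be checked with Python's standard library. This is a sketch that assumes the population standard deviation is intended; the function name is hypothetical.

```python
from statistics import mean, median, pstdev


def pearson_median_skewness(values):
    """Return 3 * (mean - median) / SD, using the population SD."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


data = [2, 4, 6, 6]
skew = pearson_median_skewness(data)

# A negative value indicates a distribution skewed to the left.
print(f"mean = {mean(data)}, median = {median(data)}, skewness = {skew:.4f}")
```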

30. For the given observations {23, 24, 56, 55, 28, 38, 48}, calculate:
● Skewness
● Kurtosis
● Determine the type of kurtosis

31. Given the weights of five persons: 120, 140, 150, 160, and 180 find the
following:
● Mean
● Median
● Mode
● Standard deviation
● Variance
● Interquartile range
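
The quantities listed in Question 31 can be cross-checked with the `statistics` module. The sketch assumes population variance and standard deviation, and uses the module's default ("exclusive") quartile method; other textbook conventions for quartiles can give a different interquartile range.

```python
from statistics import mean, median, multimode, pstdev, pvariance, quantiles

weights = [120, 140, 150, 160, 180]

summary = {
    "mean": mean(weights),
    "median": median(weights),
    # Every value occurs once, so multimode returns all of them (no single mode).
    "modes": multimode(weights),
    "variance": pvariance(weights),
    "std_dev": pstdev(weights),
}

# Quartiles with the default "exclusive" method; IQR = Q3 - Q1.
q1, _, q3 = quantiles(weights, n=4)
summary["iqr"] = q3 - q1

for name, value in summary.items():
    print(f"{name}: {value}")
```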

32. A random sample of n = 500 observations from a binomial population
produced x = 240 successes.

● Find a point estimate for p and place a 95% confidence interval.

● Find a 90% confidence interval for p.
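
A common approach to Question 32 is the normal-approximation (Wald) interval for a proportion. The sketch below assumes that method is acceptable for n = 500; the helper name `proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Return (p_hat, lower, upper) using the normal approximation."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


p_hat, lo95, hi95 = proportion_ci(240, 500, 0.95)
_, lo90, hi90 = proportion_ci(240, 500, 0.90)

print(f"point estimate = {p_hat:.2f}")
print(f"95% CI = ({lo95:.4f}, {hi95:.4f})")
print(f"90% CI = ({lo90:.4f}, {hi90:.4f})")
```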

33. Given the observations {6, 8, 10, 12, 14, 16, 18, 20, 22, 24}, calculate the
following:

● Mean
● Median
● Standard deviation
● Variance
● Skewness
● Kurtosis
● Lower quartile
● Upper quartile
● Middle quartile
● Interquartile range
● Range

35. Calculate Population Skewness, Population Kurtosis from the following
grouped data and explain the type of kurtosis and skewness of the data.

34. Calculate Sample mean, sample variance, sample skewness and sample
kurtosis from the following grouped data:

Class Interval   Frequency
2-4              3
4-6              4
6-8              2
8-10             1

Unit III: Machine Learning - Introduction and Concepts

1. What is Machine Learning? Explain the Modeling Process in Machine
Learning. What are its key steps?
2. Explain the following key terminologies in Machine Learning: Features,
Target, Training Data, Testing Data, Overfitting, and Underfitting.
3. What are the four main phases in the machine learning modeling
process?
4. What is the role of feature engineering in Machine Learning?
5. What is the role of model training in machine learning? What is its
significance?
6. What is model selection and validation, and why is it important in
machine learning? How is model scoring used to assess the effectiveness
of a machine learning model?
7. What are some key methods for validating a Machine Learning model?
8. How does a trained model make predictions on new observations?
9. List out and explain the role of various Python tools used in machine
learning for data science.
10. What is Supervised Learning? How does it work? Explain the differences
between Regression and Classification in Supervised Learning.
11. What is a Naïve Bayes classifier? Explain it in the context of the case
study on handwritten digit recognition.
12. What is Unsupervised Learning? How does it differ from Supervised
Learning?
13. Explain linear regression with suitable examples for supervised learning.
14. How can a confusion matrix help evaluate the performance of a
classification model?
15. What role does principal component analysis (PCA) play in unsupervised
learning? OR
How does PCA help in reducing input variables while maintaining
important information?
16. What are Clustering Algorithms? Explain their applications in Data
Science. Explain the K-means clustering algorithm with suitable examples.
17. What are the key evaluation metrics for classification and
regression models? Provide examples.

Examples:

KNN Classification

6. Given the following dataset with two features (X1, X2) and class labels:

X1   X2   Class
1    1    A
2    2    A
3    3    B
6    6    B

Using KNN with K=3, classify a new data point (4,4).

7. You have a dataset of fruits classified based on their weight and size.

Weight (g)   Size (cm)   Fruit Type
150          8           Apple
180          10          Apple
200          12          Orange
220          14          Orange

Classify a fruit with weight 190g and size 11cm using KNN (K=3).
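
The two KNN classification problems above can be worked through with a small from-scratch classifier. This is a sketch assuming Euclidean distance on the raw, unscaled features and a simple majority vote; the function name `knn_classify` is hypothetical.

```python
import math
from collections import Counter


def knn_classify(points, labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(points, labels), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Question 6: classify the point (4, 4) with K = 3
q6 = knn_classify([(1, 1), (2, 2), (3, 3), (6, 6)], ["A", "A", "B", "B"], (4, 4), k=3)

# Question 7: classify a fruit of 190 g and 11 cm with K = 3
q7 = knn_classify(
    [(150, 8), (180, 10), (200, 12), (220, 14)],
    ["Apple", "Apple", "Orange", "Orange"],
    (190, 11),
    k=3,
)

print(f"Q6 class: {q6}")
print(f"Q7 fruit: {q7}")
```

In the fruit example the weight values are much larger than the size values, so weight dominates the distance; scaling the features first is a common refinement that this sketch leaves out.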

KNN Regression

8. A dataset provides the exam scores of students based on study hours:

Hours Studied   Exam Score
1               45
2               50
3               55
4               60
5               65

Predict the score for a student who studies 3.5 hours using KNN
regression with K=3.

9. Given house price data:

Area (sq. ft.)   Price (₹ in Lakhs)
1000             50
1500             70
2000             90
2500             110

Predict the price for a 1750 sq. ft. house using KNN regression with
K=2.
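
KNN regression predicts the average target value of the K nearest neighbours. A minimal sketch for Question 9, assuming absolute distance on a single feature and an unweighted mean (the function name `knn_regress` is hypothetical):

```python
def knn_regress(xs, ys, query, k):
    """Return the mean target of the k training points nearest to query."""
    ranked = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / k


# Question 9: predict the price of a 1750 sq. ft. house with K = 2
areas = [1000, 1500, 2000, 2500]
prices = [50, 70, 90, 110]
price_1750 = knn_regress(areas, prices, 1750, k=2)

print(f"Predicted price: {price_1750} lakhs")
```

In Question 8, the points at 2 hours and 5 hours are equally distant from 3.5 hours, so with K=3 the third neighbour is tied and the prediction depends on the tie-breaking rule chosen.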

10. Given the dataset of house prices:

Area (sq. ft.)   Price (₹ in Lakhs)
1000             50
1500             75
2000             100

Find the linear regression equation (y = mx + c) and predict the price of
a 1250 sq. ft. house.
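
The regression problems in this set can be checked with ordinary least squares for a single predictor. The sketch below is one way to do it for Question 10, written from the standard slope and intercept formulas; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # m = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    c = y_bar - m * x_bar
    return m, c


# Question 10: area (sq. ft.) versus price (lakhs)
m, c = fit_line([1000, 1500, 2000], [50, 75, 100])
predicted_price = m * 1250 + c

print(f"y = {m:.2f}x + {c:.2f}")
print(f"Predicted price for 1250 sq. ft.: {predicted_price:.1f} lakhs")
```

The same function can be reused for Questions 11 to 13 by swapping in their data.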

11. A company’s advertisement spending (₹ Lakhs) and sales revenue (₹
Crores) are given below:

Ad Spend (₹ Lakhs)   Sales Revenue (₹ Crores)
1                    10
2                    20
3                    30
4                    40

Fit a linear regression model and estimate the sales revenue if the ad spend is
₹2.5 Lakhs.

12. Given student study hours and exam scores:

Hours Studied   Score
2               50
4               60
6               70
8               80

Calculate the linear regression equation and predict the score for a student who
studies 5 hours.

13. A company records employee experience and salary:

Experience (Years)   Salary (₹ Lakhs)
1                    3
3                    6
5                    9

Find the regression equation and predict the salary for an employee with 4
years of experience.

Unit IV: Data Visualization

Data visualization options – Filters – Python libraries for visualization –
Matplotlib – seaborn
Data Science Ethics – Doing good data science – Owners of the data -
Valuing different aspects of privacy - Getting informed consent - The
Five Cs – Diversity – Inclusion – Future Trends.

Data Visualization
Filters:
1. What is the purpose of filters in data visualization, and how do they
enhance the viewer's experience?
2. Can you explain the difference between static and interactive filters
in data visualizations?
Python Libraries for Visualization:
1. Compare and contrast Matplotlib and Seaborn. What are the
primary use cases for each library?
2. How would you choose the right visualization library for a given
data science project? Provide examples where you would prefer one
over the other.
Matplotlib:
1. What are the basic components of a Matplotlib plot?
2. How can you create a multi-line plot using Matplotlib?
3. Describe the box plot in detail.
Seaborn:
1. Describe how Seaborn integrates with other Python libraries. What
advantages does this offer?
2. Provide an example of a complex data visualization that can be
more easily generated with Seaborn than with Matplotlib.
Data Science Ethics
Doing Good Data Science:
1. What are the ethical considerations a data scientist must keep in
mind when designing a new algorithm?
2. How can data scientists ensure their work contributes positively to
society?
Owners of the Data:
1. Discuss the implications of data ownership in the context of
personal versus corporate data.
2. How does the concept of data ownership affect data access for
scientific research?
Valuing Different Aspects of Privacy:
1. What challenges arise when balancing individual privacy with the
benefits of big data analytics?
2. Provide examples of privacy-preserving techniques in data science.
The Five Cs of Data Science:
1. Define the "Five Cs" in the context of ethical data science practices.
2. How can adhering to the Five Cs improve the outcome of a data
science project?
Diversity and Inclusion:
1. Why is diversity important in data science teams and data
collection?
2. Discuss an example where lack of diversity in data collection led to
biased outcomes.

Note: Question Bank is for reference purposes. Mid-semester and
End Semester Exam Question papers will be drawn from the syllabus
mentioned in the Course file.
