Data Analyst Multiple Choice Questions
1. What is the measure of central tendency that is most affected by extreme values?
o A. Mean *
o B. Median
o C. Mode
o D. Range
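To illustrate question 1, the sketch below uses made-up salary figures (an assumption, not data from this quiz) to show how one extreme value shifts the mean far more than the median.

```python
import statistics

# Hypothetical salaries in thousands; the values are illustrative only.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [500]  # add one extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

# The mean jumps sharply; the median barely moves.
print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"median: {median_before:.1f} -> {median_after:.1f}")
```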
2. What does a p-value less than 0.05 indicate in hypothesis testing?
o A. The null hypothesis should be accepted
o B. The test is invalid
o C. There is strong evidence against the null hypothesis *
o D. The alternative hypothesis is false
3. Which of the following distributions is typically used when the sample size is small (n < 30) and the population standard deviation is unknown?
o A. Normal distribution
o B. Poisson distribution
o C. Student’s t-distribution *
o D. Chi-square distribution
4. In regression analysis, what does a high R-squared value indicate?
o A. The model explains most of the variability in the dependent variable *
o B. The model is not statistically significant
o C. The model has multicollinearity issues
o D. The independent variables are not correlated
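For question 4, R-squared can be computed as 1 minus the ratio of residual to total sum of squares. The sketch below is a minimal illustration; the function name r_squared and all data values are hypothetical.

```python
def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot for paired observations."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Illustrative values only.
actual = [2.0, 4.0, 6.0, 8.0, 10.0]
good_fit = [2.1, 3.9, 6.2, 7.8, 10.0]  # predictions close to the data
poor_fit = [6.0, 6.0, 6.0, 6.0, 6.0]   # always predicts the mean

print(f"good fit R^2: {r_squared(actual, good_fit):.4f}")
print(f"poor fit R^2: {r_squared(actual, poor_fit):.4f}")
```

A model that always predicts the mean explains none of the variability, so its R-squared is zero; predictions that track the data closely push it toward 1.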
5. What is the purpose of standard deviation in statistics?
o A. To measure the central tendency
o B. To measure the dispersion of data points from the mean *
o C. To calculate probability
o D. To determine correlation strength
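As a small illustration of question 5, the two hypothetical data sets below share the same mean but differ in spread, and the standard deviation captures that difference.

```python
import statistics

# Illustrative values only: same mean, different dispersion.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    spread = statistics.pstdev(data)  # population standard deviation
    print(f"{name}: mean={mean}, std dev={spread:.2f}")
```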
6. What type of data is used in a chi-square test?
o A. Continuous data
o B. Categorical data *
o C. Interval data
o D. Ordinal data
7. In hypothesis testing, what is the null hypothesis?
o A. A hypothesis that assumes no effect or difference *
o B. A hypothesis that assumes a significant effect
o C. The main research hypothesis
o D. An untestable assumption
8. What does a confidence interval of 95% mean?
o A. 95% of the data falls within the mean
o B. About 95% of intervals constructed this way from repeated samples would contain the true parameter *
o C. The interval is always accurate
o D. It means the hypothesis is 95% correct
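The 95% in question 8 describes the interval-building procedure over repeated sampling. The simulation below is a rough sketch of that idea, assuming a normal population with a known standard deviation; the population values, sample size, and seed are all hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and sampling setup.
TRUE_MEAN, SIGMA, N, TRIALS = 100.0, 15.0, 30, 2000
Z_95 = 1.96  # two-sided 95% critical value of the standard normal
half_width = Z_95 * SIGMA / N ** 0.5

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    sample_mean = statistics.fmean(sample)
    # Count the intervals that contain the true mean.
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

The printed share should land near 0.95, although any single run will differ slightly.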
9. What type of correlation exists when both variables increase together?
o A. Negative correlation
o B. Zero correlation
o C. Positive correlation *
o D. No correlation
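For question 9, the sign of the Pearson correlation coefficient shows the direction of a linear relationship. The helper pearson_r and the data below are hypothetical and written out by hand so that the formula stays visible.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Illustrative values only.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]  # rises as hours rise
errors = [9, 8, 6, 4, 2]       # falls as hours rise

print(f"hours vs scores: {pearson_r(hours, scores):+.2f}")  # positive
print(f"hours vs errors: {pearson_r(hours, errors):+.2f}")  # negative
```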
10. What is the main goal of descriptive statistics?
12. In a normal distribution, what percentage of data falls within one standard deviation of the mean?
• A. 50%
• B. 68% *
• C. 90%
• D. 99%
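The 68% figure in question 12 can be checked from the standard normal cumulative distribution function. This sketch uses NormalDist from Python's statistics module (available from Python 3.8).

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Probability mass between -k and +k standard deviations of the mean.
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_two_sd = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"within 1 SD: {within_one_sd:.4f}")
print(f"within 2 SD: {within_two_sd:.4f}")
```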
15. What type of test is used to compare the means of two independent groups?
• A. ANOVA
• B. Chi-square test
• C. Independent t-test *
• D. Correlation analysis
16. What is the purpose of the central limit theorem?
19. Which statistical test is used to compare the means of three or more groups?
• A. T-test
• B. ANOVA *
• C. Chi-square test
• D. Regression analysis
• A. To collect data
• B. To summarize data
• C. To make predictions or generalizations about a population *
• D. To clean raw data
• A. SELECT *
• B. UPDATE
• C. INSERT
• D. DELETE
• A. AVERAGE
• B. COUNT
• C. SUM *
• D. CONCATENATE
42. Which Excel feature allows users to filter data based on conditions?
• A. Pivot Table
• B. Data Validation
• C. Conditional Formatting
• D. AutoFilter *
43. What is the shortcut key to open the Format Cells dialog box in Excel?
• A. Ctrl + F
• B. Ctrl + 1 *
• C. Ctrl + Shift + F
• D. Alt + Enter
• A. MIN
• B. MAX *
• C. LARGE
• D. AVERAGE
46. What feature allows you to quickly apply a format based on specific conditions?
• A. Data Validation
• B. Conditional Formatting *
• C. Pivot Table
• D. Goal Seek
47. Which Excel tool allows you to summarize data and analyze patterns?
• A. Pivot Table *
• B. AutoFilter
• C. CONCATENATE
• D. Flash Fill
• A. .xls
• B. .xlsx *
• C. .csv
• D. .xlsm
51. Which function is used to count only numeric values in a range?
• A. COUNTA
• B. COUNT *
• C. COUNTIF
• D. COUNTBLANK
• A. To filter data
• B. To perform logical tests *
• C. To find duplicate values
• D. To create charts
56. Which function can be used to count cells that meet a specific condition?
• A. COUNT
• B. COUNTA
• C. COUNTIF *
• D. COUNTBLANK
58. What feature in Excel allows you to fill a series based on patterns?
• A. Flash Fill *
• B. AutoFormat
• C. Data Validation
• D. Goal Seek
• A. Power Query
• B. Power Pivot
• C. Power View *
• D. Power Automate
• A. Classification
• B. Clustering
• C. Data Replication *
• D. Association Rule Mining
A. Data visualization
C. Machine learning
D. Web scraping
92. Which of the following libraries is used for data visualization in Python?
• A. Matplotlib *
• B. Scikit-learn
• C. NumPy
• D. Pandas
• A. Power BI
• B. Tableau
• C. Apache NiFi
• D. Talend *
103. What is the main advantage of using a star schema in a data warehouse?
• A. NoSQL databases
• B. Relational databases *
• C. Time-series databases
• D. Key-value stores
• A. AWS Redshift *
• B. Google Drive
• C. Microsoft OneDrive
• D. GitHub
D. To clean data
A. To visualize data
D. To clean data
A. Structured data
B. Schema-on-write
C. Schema-on-read
D. Transactional processing
A. To visualize data
C. To visualize data
A. Apache Airflow
B. Apache NiFi
C. AWS Glue *
D. Talend
D. To clean data
125. Which of the following is a NoSQL database commonly used in data engineering?
A. MySQL
B. PostgreSQL
C. MongoDB *
D. SQL Server
A. To visualize data
D. To clean data
A. To store data
C. To visualize data
B. To visualize data
D. To clean data
B. Data replication
C. Data cleaning
D. Data analysis
130. In the ETL process, what does the "Transform" step primarily involve?
A. Collecting data without telling individuals
B. Telling individuals their data is being collected, but not why
C. Obtaining permission from individuals after fully explaining how their data will be used *
D. Collecting data only from public sources