Crash Course Data Science
(BEGINNER LEVEL)
DATA COLLECTION
1) Data collection is the process of gathering relevant
information from various sources to analyze and derive insights.
4) The range is NOT a measure of central tendency. It is a measure of spread
(maximum minus minimum); the median is the measure that represents the middle
value in a dataset.
5) The interquartile range (IQR) is a measure of spread that represents the range
between the first quartile (Q1) and the third quartile (Q3).
9) Standard deviation measures how far values typically lie from the mean; it is
the square root of the average squared deviation from the mean.
15) Correlation measures the strength and direction of the linear relationship
between two numerical variables.
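
The measures above can be sketched with Python's standard library. This is a minimal illustration, not a definitive implementation: the sample lists and the helper name pearson_correlation are made up for the example, and the "inclusive" quartile method is only one of several conventions, so other tools may report slightly different Q1 and Q3 values.

```python
import math
import statistics

# Hypothetical sample data, used only for illustration.
values = [2, 4, 4, 5, 7, 9, 11]

# Range: a measure of spread (largest value minus smallest value).
value_range = max(values) - min(values)

# Median: the middle value of the sorted data (a measure of central tendency).
median = statistics.median(values)

# Quartiles: with n=4, statistics.quantiles returns the cut points [Q1, Q2, Q3].
q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: spread of the middle half of the data

# Population standard deviation: square root of the mean squared deviation.
std_dev = statistics.pstdev(values)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariation / math.sqrt(var_x * var_y)


# Hypothetical paired data: y rises with x in one case and falls in the other.
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]
score_down = [10, 8, 6, 4, 2]

print("range:", value_range)
print("median:", median)
print("IQR:", iqr)
print("standard deviation:", round(std_dev, 3))
print("correlation (rising):", round(pearson_correlation(hours, score_up), 3))
print("correlation (falling):", round(pearson_correlation(hours, score_down), 3))
```

A coefficient near +1 indicates a strong positive linear relationship, near -1 a strong negative one, and near 0 little or no linear relationship.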
EXPLORATORY DATA ANALYSIS
1) Exploratory data analysis involves summarizing and visualizing data to
gain insights and understand patterns.
8) Exploratory data analysis can reveal potential data quality issues, such as
inconsistent or erroneous values, and identify data anomalies that require
further investigation.
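
A first pass at exploratory analysis might look like the sketch below. It assumes a hypothetical list of ages in which None marks a missing entry, and it flags anomalies with the common 1.5 x IQR rule; that rule is a convention chosen for this example, not something these notes prescribe, and the function name summarize is invented here.

```python
import statistics

# Hypothetical sample: one missing entry (None) and one implausible value (210).
ages = [23, 25, None, 31, 29, 210, 27, 24]


def summarize(values):
    """Return basic summary statistics plus simple data quality flags."""
    present = [v for v in values if v is not None]
    q1, _q2, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    # Values beyond 1.5 * IQR from the quartiles are flagged for review.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
        "outliers": [v for v in present if v < low or v > high],
    }


report = summarize(ages)
for name, value in report.items():
    print(f"{name}: {value}")
```

A flagged value is a prompt for further investigation rather than proof of an error; whether 210 is a typo or a unit mix-up has to be checked against the data source.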
2) Bar charts, line charts and pie charts are some of the common types of
visualisation charts.
8) The points on a scatter plot show the relationship between two
variables.
9) In a vertical bar chart, the y-axis shows the dependent variable (the measured
value) while the x-axis shows the independent variable (the categories).
10) Python is one of the most commonly used programming languages for creating
interactive data visualisations.
DATA CLEANING
1) Imputation is a technique used to fill in missing values.
9) Mean imputation: replacing missing values with the mean of the variable.
10) Forward filling: filling each missing value with the most recent non-missing
value before it.
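
The two imputation techniques above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: None stands for a missing value, and the names mean_impute, forward_fill and readings are invented for the example. In practice a library such as pandas is often used instead (for instance its Series.fillna and Series.ffill methods).

```python
import statistics


def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot compute a mean when every value is missing")
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def forward_fill(values):
    """Replace each missing value (None) with the last observed value before it.

    A missing value at the very start has nothing before it, so it stays None.
    """
    filled = []
    last_seen = None
    for v in values:
        if v is not None:
            last_seen = v
        filled.append(last_seen)
    return filled


# Hypothetical sensor readings with gaps.
readings = [10.0, None, 14.0, None, None, 18.0]

print(mean_impute(readings))   # gaps become the mean of 10.0, 14.0 and 18.0
print(forward_fill(readings))  # gaps repeat the previous reading
```

Mean imputation keeps the overall mean unchanged but shrinks the variability of the variable, while forward filling suits ordered data such as time series, where the previous observation is a reasonable stand-in.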