ZZZZ

The document discusses statistical concepts including population, sample, parameter, statistic, variable, data, descriptive statistics, statistical inference, tables, graphs, probability, proportion, data visualization techniques, correlation, correlation matrix, simple regression analysis and finding the line of best fit using different methods.

Uploaded by

Kheane Edineh Bonggo


4.1 REVIEW OF COMMON STATISTICAL TERMS

Population – is the entire group of individuals you want to study, and a sample is a subset of that group.

Representative Sample – a subset of the population whose characteristics closely mirror those of the population.

Parameter – is a quantitative characteristic of the population that you are interested in estimating or
testing (such as a population mean or proportion).

Statistic – is a quantitative characteristic of a sample that often helps estimate or test the population
parameter (such as a sample mean or proportion).
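The difference between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of 1,000 scores, the sample size of 50, and the variable names are all made up for illustration. Here the sample mean (a statistic) estimates the population mean (a parameter).

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 exam scores between 40 and 100.
population = [random.randint(40, 100) for _ in range(1000)]

# A simple random sample of 50 individuals drawn from that population.
sample = random.sample(population, 50)

# Parameter: describes the whole population (usually unknown in practice).
population_mean = statistics.mean(population)

# Statistic: computed from the sample and used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

A random sample tends to be representative, but the two means will rarely match exactly; the gap between them is sampling error.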

Variable – is a characteristic of interest for each person or object in a population.

• Categorical Variable – a variable that takes on values that are names or labels
• Numerical Variable – a variable that takes on values that are indicated by numbers

Data – a set of observations (a set of possible outcomes). Most data can be put into two groups:

• Qualitative – an attribute whose value is indicated by a label.
• Quantitative – an attribute whose value is indicated by a number. Quantitative data are either discrete or continuous:
   • Discrete data – the result of counting (e.g., the number of Covid-19 deaths in Cebu)
   • Continuous data – the result of measuring (e.g., weight of new-born babies)

Descriptive Statistics – are summary values computed from a set of data; each one describes a single feature of the data set.

• Mean – shows the arithmetic mean of the sample data.

• Standard Error – shows the standard error of the mean (the sample standard deviation divided by the
square root of the count; it measures how far the sample mean is likely to be from the population mean)
• Median – shows the middle value in the data set (the value that separates the largest half of the
values from the smallest half of the values)
• Mode – shows the most common value in the data set
• Standard Deviation – shows the sample standard deviation measure for the data set.
• Sample Variance – shows the sample variance for the data set (the squared standard deviation)
• Kurtosis – shows how heavy the tails of the distribution are compared with a normal distribution
• Skewness – shows how asymmetric the data set’s distribution is (whether it has a longer tail on one side)
• Range – shows the difference between the largest and smallest values in the data set
• Minimum – shows the smallest value in the data set
• Maximum – shows the largest value in the data set
• Sum – adds all the values in the data set together to calculate the sum
• Count – counts the number of values in a data set
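• Largest (X) – shows the X-th largest value in the data set (X = 1 gives the maximum)
• Smallest (X) – shows the X-th smallest value in the data set (X = 1 gives the minimum)
• Confidence Level (X) Percentage – shows the margin of error for the mean at the given confidence level
of X percent (the mean plus or minus this value gives the confidence interval)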
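Most of the measures listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch, not a replacement for a spreadsheet tool; the function name `describe` and the sample values are assumptions made for illustration.

```python
import math
import statistics

def describe(values):
    """Return common descriptive statistics for a sample (hypothetical helper)."""
    n = len(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "mean": statistics.mean(values),
        "standard_error": stdev / math.sqrt(n),  # standard error of the mean
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard_deviation": stdev,
        "sample_variance": statistics.variance(values),  # squared standard deviation
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
        "count": n,
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Kurtosis, skewness, and the confidence-level margin are left out because the standard library has no direct function for them.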

Statistical Inference – refers to using your data (and its descriptive statistics) to draw conclusions about
the population.

Table – contains quantitative data organized into rows and columns with categorical labels.

Graph – (or chart) is primarily used to show relationships among data and portrays values encoded as
visual objects (e.g., lines, bars, or points). Numerical values are displayed along the axes providing scales.

Probability – a number between zero and one (inclusive) that gives the likelihood that a specific event will
occur.

Proportion – (or percentage) the number of successes divided by the total number in the sample.

4.2 DATA VISUALIZATION

Data Visualization – is the graphic representation of data.

• It involves producing images that communicate relationships among the represented data to
viewers of the images. It makes complex data more accessible, understandable, and usable.
• Data visualization is viewed as a branch of descriptive statistics by some, but also as a tool by
others.

Bar Graphs – show numbers that are independent of each other.

Pie Charts – show how a whole is divided into different parts.

Line Charts – show how numbers have changed over time; they are most useful when the data points are
connected and expected to reveal trends.

Excel Box and Whisker Plot

Box and Whisker Plot in Excel – is an exploratory chart used to show statistical highlights and the distribution
of the data set. This chart shows a five-number summary of the data: Minimum Value, First Quartile Value,
Median Value, Third Quartile Value, and Maximum Value.

• Minimum Value – the smallest value in the dataset
• First Quartile Value – the median of the lower half of the data (about 25% of the values fall at or below it)
• Median Value – the middle value of the dataset
• Third Quartile Value – the median of the upper half of the data (about 75% of the values fall at or below it)
• Maximum Value – the largest value in the dataset
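The five-number summary behind a box and whisker plot can be sketched in Python as follows. Quartile conventions differ between tools; this sketch assumes the "inclusive" interpolation method of `statistics.quantiles`, and the function name and data are made up for illustration.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a data set."""
    # n=4 splits the data into quartiles; "inclusive" treats the data
    # as containing the smallest and largest possible values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

weights = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements
minimum, q1, median, q3, maximum = five_number_summary(weights)
print(minimum, q1, median, q3, maximum)
```

Other quartile methods (for example, the default "exclusive" method) can give slightly different Q1 and Q3 values for the same data.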

4.3 CORRELATION

Correlation – denoted by r, measures the amount of linear association between two variables. The value
of r is always between -1 and 1 inclusive. The R-squared value, denoted by R², is called the coefficient of
determination. It measures the proportion of variation in the dependent variable that can be attributed
to the independent variable. The value of R² is always between 0 and 1 inclusive.
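As an illustration, r can be computed directly from its definition: the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations. The sketch below uses made-up data and hypothetical variable names. It relies on the fact that, with a single independent variable, R² equals the square of r.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]    # hypothetical independent variable
scores = [2, 4, 5, 4, 5]   # hypothetical dependent variable

r = pearson_r(hours, scores)
r_squared = r ** 2  # coefficient of determination for simple regression
print(f"r = {r:.3f}, R-squared = {r_squared:.3f}")
```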

Interpreting Pearson r

Value of r                       Strength
-1.0 to -0.5 or 0.5 to 1.0       strong relationship
-0.5 to -0.1 or 0.1 to 0.5       weak relationship
-0.1 to 0.1                      none or very weak

Examples:

• Correlation r = 0.0; R-squared = 0.0. There is no association between the variables.
• Correlation r = -0.3. Small negative association.
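The interpretation guide above can be written as a small lookup function. The guide lists the boundary values 0.5 and 0.1 in two rows each, so this sketch assumes that ±0.5 counts as strong and ±0.1 counts as none or very weak. The function name is hypothetical.

```python
def interpret_r(r):
    """Map a Pearson r value to the strength labels in the guide above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1 inclusive")
    size = abs(r)  # strength depends on magnitude, not sign
    if size >= 0.5:   # assumption: the 0.5 boundary counts as strong
        return "strong relationship"
    if size > 0.1:    # assumption: the 0.1 boundary counts as very weak
        return "weak relationship"
    return "none or very weak"

for value in (0.0, -0.3, 0.8):
    print(value, "->", interpret_r(value))
```

The sign of r gives the direction of the association (negative or positive), which this function deliberately ignores.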

How to Read a Correlation Matrix

1. -1 indicates a perfectly negative linear correlation between two variables.

2. 0 indicates no linear correlation between two variables.
3. 1 indicates a perfectly positive linear correlation between two variables.

When to use a Correlation Matrix

1. A correlation matrix conveniently summarizes a dataset.
A correlation matrix is a simple way to summarize the correlations between all variables in a
dataset.
2. A correlation matrix serves as a diagnostic for regression.
One key assumption of multiple linear regression is that no independent variable in the model is
highly correlated with another independent variable in the model. When two independent variables
are highly correlated, this results in a problem known as multicollinearity, which can make it hard to
interpret the results of the regression.

One of the easiest ways to detect a potential multicollinearity problem is to look at a correlation
matrix and visually check whether any of the variables are highly correlated with each other.
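The check described above can be sketched in plain Python: build the matrix of pairwise correlations, then list the pairs whose correlation is large in absolute value. The 0.8 cutoff, the function names, and the data are all assumptions for illustration; the notes do not define what counts as "highly correlated".

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

def highly_correlated_pairs(columns, threshold=0.8):
    """List variable pairs whose |r| meets the (assumed) threshold."""
    matrix = correlation_matrix(columns)
    return [(a, b, matrix[a][b])
            for a, b in combinations(matrix, 2)
            if abs(matrix[a][b]) >= threshold]

data = {  # hypothetical independent variables
    "hours_studied": [1, 2, 3, 4, 5],
    "pages_read": [2, 4, 6, 8, 11],
    "hours_slept": [8, 6, 7, 5, 7],
}
matrix = correlation_matrix(data)
flagged = highly_correlated_pairs(data)
print(flagged)
```

In this made-up data, only the pair `hours_studied` and `pages_read` is flagged, which suggests that using both as independent variables in one regression could cause multicollinearity.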

4.4 SIMPLE REGRESSION

Regression Analysis – finds the equation of the line that best describes the relationship between two
variables to help make accurate or reliable predictions.

The estimates of the y-intercept and slope can be calculated by hand from a more complex equation. A
quicker and more effective way to calculate them is by using Data Analysis.
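For reference, the slope and y-intercept of the line of best fit can also be computed from the standard least-squares formulas: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The sketch below uses made-up data; it is not the Data Analysis procedure itself, and the function name is hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and y-intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

hours = [1, 2, 3, 4, 5]    # hypothetical independent variable
scores = [2, 4, 5, 4, 5]   # hypothetical dependent variable

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # prediction for a new x value of 6
print(f"y = {intercept:.2f} + {slope:.2f}x; prediction at x = 6: {predicted:.2f}")
```

Predictions are most trustworthy for x values inside or close to the range of the observed data.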
