Chapter 4

TOPIC 4

DATA ORGANISATION AND DESCRIPTION

4.1 INTRODUCTION

STATISTICS

• A scientific method of collecting, organizing, summarizing, presenting, analyzing, and interpreting data.
• Can be divided into two branches:
   - Descriptive statistics
   - Inferential statistics

DESCRIPTIVE STATISTICS | INFERENTIAL STATISTICS
Techniques for collecting, tabulating, presenting, and summarizing information in order to describe data. | Techniques for analyzing sample data and making inferences in order to draw conclusions about a population.

[Figure: Population and Sample]

DEFINITIONS | EXPLANATIONS
Population | A population consists of all elements (individuals, items, or objects) whose characteristics are being studied.
Sample | A portion of the population selected for study is referred to as a sample.
Census (Malay: bancian) | A survey that includes every member of the population.
Parameter | A parameter of a population is some quantity that relates to the population, such as its mean or median.

EXAMPLE 1

Explain whether each of the following constitutes a population or a sample.

a) Pounds of bass caught by all participants in a bass fishing derby
b) Credit card debts of 100 families selected from a city
c) Number of home runs hit by all Major League baseball players in the
2009 season
d) Number of parole violations by all 2147 parolees in a city
e) Amount spent on prescription drugs by 200 senior citizens in a large
city

DATA

DEFINITIONS | EXPLANATIONS
Discrete Data | Data that can be counted.
   Examples: number of houses, number of cars
Continuous Data | Data obtained by measuring; the accuracy depends on the measuring instrument.
   Examples: length, age, height, weight, time
Ungrouped Data | Raw data that have not been organized numerically.
   Example: These are the numbers of rooms in each of 20 houses in a particular town:
   5 6 6 4 5 4 6 8 2 4
   7 8 3 5 4 2 4 8 8 3
   The data above can also be summarized with the help of a tally chart, as shown below:
   Number of rooms in each house | Tally | Number of houses
   2 | // | 2
   3 | // | 2
   4 | ///// | 5
   5 | /// | 3
   6 | /// | 3
   7 | / | 1
   8 | //// | 4
Grouped Data | We can summarize the data above by grouping them into classes.
   Example:
   Number of rooms | Frequency
   2-3 | 4
   4-5 | 8
   6-7 | 4
   8-9 | 4

VARIABLES

Type of Variable
   - Quantitative
      - Continuous
      - Discrete
   - Qualitative or categorical

DEFINITIONS | EXPLANATIONS
Variable | A variable is a characteristic under study that assumes different values for different elements. In contrast to a variable, the value of a constant is fixed.
Quantitative | A variable that can be measured numerically is called a quantitative variable.
Discrete | A variable whose values are countable is called a discrete variable. In other words, a discrete variable can assume only certain values with no intermediate values.
Continuous | A variable that can assume any numerical value over a certain interval or intervals is called a continuous variable.
Qualitative or categorical | A variable that cannot assume a numerical value but can be classified into two or more nonnumeric categories is called a qualitative or categorical variable. The data collected on such a variable are called qualitative data.

EXAMPLE 2:

Indicate which of the following variables are quantitative and which are qualitative. Hence,
classify the quantitative variables as discrete or continuous.
a) Number of typographical errors in newspapers
b) Monthly TV cable bills
c) Spring break locations favored by college students
d) Number of cars owned by families
e) Lottery revenues of states
LEVEL OF MEASUREMENT

DEFINITIONS | EXPLANATIONS
Nominal | The nominal level of measurement classifies data into mutually exclusive (nonoverlapping) categories in which no order or ranking can be imposed on the data.
   Example:
   - Gender, religion, political party, marital status.
Ordinal | The ordinal level of measurement classifies data into categories that can be ranked; however, precise differences between the ranks do not exist.
   Example:
   - From student evaluations, guest speakers might be ranked as superior, average, or poor.
   - Floats in a homecoming parade might be ranked as first place, second place, etc.
Interval | The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero.
   Example:
   - There is a meaningful difference of 1 point between an IQ of 109 and an IQ of 110.
   - Temperature is another example of interval measurement, since there is a meaningful difference of 1°F between each unit, such as 72°F and 73°F.
Ratio | The ratio level of measurement possesses all the characteristics of interval measurement, and there exists a true zero. In addition, true ratios exist when the same variable is measured on two different members of the population.
   Example:
   - Height, weight, area, and number of phone calls received.

EXAMPLE 3:

What level of measurement would be used to measure each variable?

a) The ages of authors who wrote the hardback versions of the top 25
fiction books sold during a specific week
b) The colors of baseball hats sold in a store for a specific year
c) The highest temperature for each day of a specific month
d) The ratings of bands that played in the homecoming parade at a
college
METHODS OF SAMPLING

SAMPLING METHOD | EXPLANATIONS
Random Sampling | Subjects are selected by random numbers.
Systematic Sampling | Subjects are selected by using every kth number after the first subject is randomly selected from 1 through k.
Stratified Sampling | Subjects are selected by dividing up the population into groups (strata), and subjects are randomly selected within groups.
Cluster Sampling | Subjects are selected by using an intact group that is representative of the population.
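
The four sampling methods in the table can be sketched in Python as follows. This is only an illustration: the population of 100 numbered subjects, the three strata, the ten clusters, and all function names are hypothetical and chosen for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Random sampling: select n subjects using random numbers."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic sampling: random start among the first k subjects, then every kth."""
    start = rng.randrange(k)  # 0-based index of the first selected subject
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: randomly select subjects within each group (stratum)."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, rng):
    """Cluster sampling: pick one intact group and use every subject in it."""
    name = rng.choice(sorted(clusters))
    return name, clusters[name]

rng = random.Random(4)  # fixed seed so the sketch is repeatable
subjects = list(range(1, 101))  # hypothetical population: subjects numbered 1..100
strata = {"low": subjects[:30], "average": subjects[30:70], "high": subjects[70:]}
clusters = {f"group {g}": subjects[10 * g:10 * (g + 1)] for g in range(10)}

random_pick = simple_random_sample(subjects, 10, rng)
systematic_pick = systematic_sample(subjects, 10, rng)
stratified_pick = stratified_sample(strata, 6, rng)
cluster_name, cluster_pick = cluster_sample(clusters, rng)
```

With k = 10 and 100 subjects, the systematic sample always contains 10 subjects spaced exactly 10 apart, whatever the random start.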

DATA COLLECTION

• The next step after the sample is identified and selected by using the appropriate sampling
technique is to determine the best way to reach the respondents in order to obtain the
required data.
• There are several methods of collecting data and each has its own advantages and
disadvantages.
• A researcher must choose the methods that provide the most information at minimum cost.
• The common methods of data collection are as follows:
a) Face-to-face interview (personal interview)
b) Telephone interview
c) Direct questionnaire (questionnaires are distributed and collected personally)
d) Mail or postal questionnaire (questionnaires are sent and received back through the
post)
e) Direct observation (respondents are observed and data recorded)
f) Other methods (e-mail, video recording)

EXAMPLE 4

State which sampling method was used.

a) Out of 10 hospitals in a municipality, a researcher selects one and collects
records for a 24-hour period on the types of emergencies that were treated
there.
b) A researcher divides a group of students according to gender, major field,
and low, average, and high grade point average. Then she randomly selects
six students from each group to answer questions in a survey.
c) The subscribers to a magazine are numbered. Then a sample of these
people is selected using random numbers.
d) Every 10th bottle of Energized Soda is selected, and the amount of liquid in
the bottle is measured. The purpose is to see if the machines that fill the
bottles are working properly.
4.2 DATA ORGANISATION

QUALITATIVE DATA

Type of data representation

Example:

Twenty-five army inductees were given a blood test to determine their blood type. The data set
is

A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A

We can represent the data using the following types of data representation:

1) Categorical Frequency Distributions

Data are organized by forming categories of values and indicating the number of data values that fall into each category.

Type of Blood | Tally | Frequency | Percent
A | ///// | 5 | 20
B | ///// // | 7 | 28
O | ///// //// | 9 | 36
AB | //// | 4 | 16
Total | | 25 | 100

From the frequency distribution:

i. More people have type O blood than any other type.

ii. Fewer people have type AB blood than any other type.
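
As a rough check, the frequency and percent columns can be reproduced with Python's standard `collections.Counter`. The variable names are illustrative only; the data are the 25 blood types listed above.

```python
from collections import Counter

blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

counts = Counter(blood_types)
total = sum(counts.values())

# For each category: (frequency, percent of all observations)
distribution = {
    blood_type: (counts[blood_type], 100 * counts[blood_type] / total)
    for blood_type in ["A", "B", "O", "AB"]
}

for blood_type, (frequency, percent) in distribution.items():
    print(f"{blood_type:<6}{frequency:>4}{percent:>8.0f}")
print(f"{'Total':<6}{total:>4}{100:>8}")
```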

2) PIE CHART

A pie chart consists of a circle divided into sectors that show the number or percentage of objects in each group or category. The angle of each sector is proportional to the number or percentage of elements in the category.

From the pie chart:

i. The number of people with type A blood is $\frac{20}{100} \times 25 = 5$.
ii. Type O is the most common blood type, at 36% of people.
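
Because each sector angle is proportional to the category's share, the angles for the blood type data can be worked out as a share of 360 degrees. The short sketch below assumes the frequencies 5, 7, 9, and 4 from the frequency distribution; the variable names are illustrative.

```python
frequencies = {"A": 5, "B": 7, "O": 9, "AB": 4}
total = sum(frequencies.values())

# Each sector's angle is the category's share of the full 360-degree circle.
angles = {blood_type: 360 * f / total for blood_type, f in frequencies.items()}

# Going the other way: recover a count from a percentage (type A is 20% of 25 people).
type_a_count = 20 / 100 * total

for blood_type, angle in angles.items():
    print(f"{blood_type}: {angle:.1f} degrees")
```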
3) BAR CHART

A bar chart uses the length of vertical columns or horizontal bars to represent quantities
or percentages.

[Figure: bar chart "Type of Blood"; horizontal axis: type of blood (A, B, O, AB); vertical axis: frequency; bar heights 5, 7, 9, 4]

From the bar chart:

i. 9 people have type O blood, more than any other type.

ii. 4 people have type AB blood.

EXAMPLE 5

The Brunswick Research Organization surveyed 50 randomly selected individuals and asked them the primary way they
received the daily news. Their choices were via newspaper (N), television (T), radio (R), or
Internet (I). Construct a categorical frequency distribution for the data and interpret the results.

N N T T T I R R I T
I N R R I N N I T N
I R T T T T N R R I
R R I N T R T I I T
T I N T T I R N R T

Primary news source | Tally | Frequency | Percent
N | | |
T | | |
R | | |
I | | |
Total | | |
EXAMPLE 6

The pie chart shows the population of an area. If the number of employees is 1500, find the
number of

a) Widowed

b) Single

EXAMPLE 7

The graphs show that first-year college students spend the most on electronic equipment.

a) Calculate the percentage that students spend on clothing.

b) What is the difference between the highest and the lowest amount spent on electronic equipment?
QUANTITATIVE DATA

Type of data representation

The following data are the scores in a Mathematics Account test for the 29 students of Class 2A.

50 50 50 50 53 53 53 54 61 62
64 64 64 68 68 70 79 79 79 79
79 79 80 80 83 83 83 95 95

We can represent the data using the following types of data representation:

1) Stem and leaf plot

This plot separates data entries into leading digits and trailing digits. The guidelines for
constructing stem-and-leaf plots are as follows.
i. Split each score or value into two sets of digits. The first (or leading) set of digits
is the stem, and the second (or trailing) set of digits is the leaf.
ii. List all the possible stem digits from the lowest to the highest.
iii. For each score in the mass of data, write down the leaf numbers on the line
labelled by the appropriate stem number.

Leading digit (Stem) | Trailing digit (Leaf)
5 | 0 0 0 0 3 3 3 4
6 | 1 2 4 4 4 8 8
7 | 0 9 9 9 9 9 9
8 | 0 0 3 3 3
9 | 5 5

Key: 2 | 7 means 27 marks

From stem and leaf plot:

i. mode = 79
ii. median = 68
iii. min = 50
iv. maximum = 95
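
The three construction steps can be sketched in Python as below. This is a minimal illustration that assumes two-digit whole-number scores, so the stem is the tens digit and the leaf is the units digit; the function name `stem_and_leaf` is made up for the example.

```python
from collections import defaultdict
from statistics import median, mode

scores = [
    50, 50, 50, 50, 53, 53, 53, 54, 61, 62,
    64, 64, 64, 68, 68, 70, 79, 79, 79, 79,
    79, 79, 80, 80, 83, 83, 83, 95, 95,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digit)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # step i: split into stem and leaf
        plot[stem].append(leaf)         # step iii: write the leaf beside its stem
    # Step ii: list every stem from lowest to highest, even one with no leaves.
    return {stem: plot[stem] for stem in range(min(plot), max(plot) + 1)}

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

summary = {
    "mode": mode(scores),
    "median": median(scores),
    "min": min(scores),
    "max": max(scores),
}
```

The `summary` dictionary agrees with the values read from the plot: mode 79, median 68, minimum 50, maximum 95.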

2) Frequency distribution table

• A frequency table summarizes the data collected by forming intervals of values and indicating the number of data values that fall into each interval.
• A frequency table with class intervals is known as the frequency distribution of grouped data.
• Grouping the data is often desirable because it reduces the complexity of the data and helps to smooth out irregularities in the distribution.
• There are several guidelines that can be followed in constructing a grouped frequency distribution.
   i. Firstly, the class intervals should be mutually exclusive. This means that the class intervals should not overlap and must be clearly defined.
   ii. Secondly, it is good practice to ensure that class intervals are of equal width, except for open-ended classes. If there are no observations in a particular interval, it should still be included to avoid a misleading impression of the data.
   iii. Thirdly, there should be neither too few nor too many classes. The rule of thumb is that the number of classes should not be less than 5 and should not be more than 15.

From the previous example, we can make the frequency distribution table below:

Class limit | Frequency
50-59 | 8
60-69 | 7
70-79 | 7
80-89 | 5
90-99 | 2
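
The table can be checked with a short Python sketch. It assumes whole-number scores and inclusive class limits of equal width, as in the table; the function name and its parameters are hypothetical.

```python
scores = [
    50, 50, 50, 50, 53, 53, 53, 54, 61, 62,
    64, 64, 64, 68, 68, 70, 79, 79, 79, 79,
    79, 79, 80, 80, 83, 83, 83, 95, 95,
]

def grouped_frequency(values, start, width, n_classes):
    """Count values in equal-width, non-overlapping classes with inclusive limits."""
    classes = [(start + i * width, start + (i + 1) * width - 1) for i in range(n_classes)]
    table = {limits: 0 for limits in classes}  # empty classes stay in the table with 0
    for value in values:
        for lower, upper in classes:
            if lower <= value <= upper:
                table[(lower, upper)] += 1
                break
    return table

table = grouped_frequency(scores, start=50, width=10, n_classes=5)
for (lower, upper), frequency in table.items():
    print(f"{lower}-{upper}  {frequency}")
```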

3) Histogram and frequency polygon

• Histograms look like bar charts but are actually different.
• The area of a rectangle in a histogram is proportional to the frequency of its class.
• All rectangles are drawn side by side, and any empty space between two rectangles means that the class there has zero frequency.
• To sketch a histogram:
   i. Mark the class boundaries on the horizontal axis.
   ii. Mark the frequency on the vertical axis.
   iii. For every class, draw a rectangle whose height equals the class frequency.

From the previous example, the resulting histogram is given:

Class limit | Midpoint | Frequency
40-49 | 44.5 | 0
50-59 | 54.5 | 8
60-69 | 64.5 | 7
70-79 | 74.5 | 7
80-89 | 84.5 | 5
90-99 | 94.5 | 2
100-109 | 104.5 | 0

[Figure: Histogram; horizontal axis: score (49 to 109); vertical axis: frequency (0 to 10)]

Frequency polygon

A frequency polygon is obtained by connecting the midpoints (class marks) of the classes, each plotted at the top of its bar in the histogram.

[Figure: Frequency Polygon; horizontal axis: classes 40-49 to 100-109; vertical axis: frequency (0 to 9)]

4) Ogive

The two types of cumulative frequency curve, or ogive, are:

i. the “more than” cumulative frequency curve, where the cumulative frequency plotted at a class's lower limit is the sum of the frequencies of that class and all classes above it.
ii. the “less than” cumulative frequency curve, where the cumulative frequency plotted at a class's upper limit is the sum of the frequencies of that class and all classes below it.

[Figure: two ogives for the score data; vertical axis: cumulative frequency (%).
"Ogive less than", plotted against the upper limits 49, 59, 69, 79, 89, 99: 0.00%, 27.59%, 51.72%, 75.86%, 93.10%, 100.00%.
"Ogive more than", plotted against the lower limits: 100.00%, 72.41%, 48.28%, 24.14%, 6.90%, 0.00%.]
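
The cumulative percentages plotted in the two ogives can be recomputed from the class frequencies 8, 7, 7, 5, 2 of the score data. The sketch below uses `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

class_limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [8, 7, 7, 5, 2]
n = sum(frequencies)

# "Less than": running total up to and including each class (read at its upper limit).
less_than = list(accumulate(frequencies))
# "More than": what remains from each class upward (read at its lower limit).
more_than = [n - c for c in [0] + less_than[:-1]]

less_than_pct = [round(100 * c / n, 2) for c in less_than]
more_than_pct = [round(100 * c / n, 2) for c in more_than]

for (lower, upper), lt, mt in zip(class_limits, less_than_pct, more_than_pct):
    print(f"{lower}-{upper}: less than {lt}%, more than {mt}%")
```

Each curve also needs one extra point at 0%: before the first class for the "less than" ogive, and after the last class for the "more than" ogive.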

EXAMPLE 8
A listing of calories per 1 ounce of selected salad dressings (not fat-free) is given below.
Construct a stem and leaf plot for the data.

100 130 130 130 110 110 120 130 140 100
140 170 160 130 160 120 150 100 145 145
145 115 120 100 120 160 140 120 180 100
160 120 140 150 190 150 180 160

Stem Leaf

EXAMPLE 9

Using the histogram shown here, answer these questions.

a) How many values are in the class 27.5–30.5?

b) How many values fall between 24.5 and 36.5?

c) How many values are below 33.5?

d) How many values are above 30.5?

EXAMPLE 10

Shown is an ogive depicting the cumulative frequency of the average mathematics SAT scores
by state.

a) How many students have an average score of 549.5?

b) How many students have an average score more than or equal to 522.5 but less than 603.5?
4.3 MEASURE OF CENTRAL TENDENCY

Type | Formula

Mean ($\bar{x}$) | The mean of a set of data $x_1, x_2, x_3, \dots, x_n$ is written $\bar{x}$ and defined as

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Mode | The mode of a set of data is the value that occurs most frequently.

Median | The median of a data set is the middle value when the original data values are arranged in descending or ascending numerical order.

Median/Quartile/Percentile |

$$Q_1 = P_{25} = X_{\frac{1}{4}n}, \qquad \text{med} = Q_2 = P_{50} = X_{\frac{1}{2}n}, \qquad Q_3 = P_{75} = X_{\frac{3}{4}n}$$

   where $X_k$ is the value at location $k$ in the ordered data.
   • If the location is not an integer, take the next location.

Interquartile Range | $\text{IQR} = Q_3 - Q_1$

Semi Interquartile Range | $\text{SIQR} = \frac{1}{2}(Q_3 - Q_1)$

Min | Lowest value

Max | Highest value

Range | Maximum - Minimum
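
These measures can be tried out on the 29 Class 2A scores from section 4.2. The sketch below follows the location rule given in the table (a non-integer location moves to the next location). The notes do not say what to do when the location is a whole number; this sketch simply uses the value at that location, which is an assumption. The function name `value_at_location` is made up for the example.

```python
import math
from statistics import mean, mode

scores = [
    50, 50, 50, 50, 53, 53, 53, 54, 61, 62,
    64, 64, 64, 68, 68, 70, 79, 79, 79, 79,
    79, 79, 80, 80, 83, 83, 83, 95, 95,
]

def value_at_location(sorted_values, fraction):
    """Return X at location fraction * n, counting locations from 1."""
    location = fraction * len(sorted_values)
    position = math.ceil(location)  # non-integer location: take the next location
    # Assumption: a whole-number location is used as it is.
    return sorted_values[position - 1]

ordered = sorted(scores)
q1 = value_at_location(ordered, 1 / 4)   # location 7.25 -> 8th value
med = value_at_location(ordered, 1 / 2)  # location 14.5 -> 15th value
q3 = value_at_location(ordered, 3 / 4)   # location 21.75 -> 22nd value

iqr = q3 - q1
siqr = iqr / 2
data_range = max(scores) - min(scores)
average = mean(scores)
most_common = mode(scores)
```

For these scores the sketch gives Q1 = 54, median = 68, Q3 = 79, IQR = 25, SIQR = 12.5, and range = 45.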

SHAPE OF DATA DISTRIBUTION

Symmetry and Skewness for the Data Distribution

• The positions of the mean, median, and mode on the histogram or frequency curve can be used to determine the general shape of the data distribution.

• The 3 important shapes are:
   - Symmetrical
   - Skewed to the right, or positively skewed
   - Skewed to the left, or negatively skewed

Positively skewed: mode < median < mean

Negatively skewed: mode > median > mean

Symmetrical: mean = median = mode

4.4 MEASURE OF DISPERSION AND SKEWNESS

Variance:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n-1}$$

Standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n-1}}$$
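
The computational formula for the sample variance can be checked against Python's `statistics.variance`, which also divides by n − 1. The six data values below are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import math
from statistics import variance

def sample_variance(values):
    """Sample variance: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [2, 4, 4, 5, 7, 8]  # hypothetical data: sum = 30, sum of squares = 174

s2 = sample_variance(data)  # (174 - 900 / 6) / 5 = 24 / 5
s = math.sqrt(s2)

library_s2 = variance(data)  # standard-library result for comparison
```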
COEFFICIENT OF VARIATION

The coefficient of variation is the standard deviation divided by the mean of the same
data set, and expressed as a percentage.

Formula:

$$\text{Coefficient of Variation} = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$$

A larger coefficient of variation means that the data is more dispersed and less
consistent.

The Pearson’s Coefficient of Skewness

$$S_k = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} \quad \text{or} \quad S_k = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$$
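
Both the coefficient of variation and Pearson's coefficient of skewness are simple ratios, so they can be sketched as small Python functions. The function names and the summary values (mean 40, median 38, mode 35, standard deviation 5) are hypothetical and serve only as an illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

def pearson_skewness_median(mean, median, std_dev):
    """Pearson's coefficient of skewness using the median."""
    return 3 * (mean - median) / std_dev

def pearson_skewness_mode(mean, mode, std_dev):
    """Pearson's coefficient of skewness using the mode."""
    return (mean - mode) / std_dev

# Hypothetical summary values for one data set.
cv = coefficient_of_variation(std_dev=5.0, mean=40.0)
sk_median = pearson_skewness_median(mean=40.0, median=38.0, std_dev=5.0)
sk_mode = pearson_skewness_mode(mean=40.0, mode=35.0, std_dev=5.0)

print(f"CV = {cv}%, Sk (median) = {sk_median}, Sk (mode) = {sk_mode}")
```

Here the mean lies above both the median and the mode, so both coefficients come out positive, which points to a distribution skewed to the right.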

4.5 EXPLORATORY DATA ANALYSIS

Boxplot

• Another graphical representation of data.

• Constructed from the lowest value, lower quartile ($Q_1$), median ($Q_2$), upper quartile ($Q_3$), and the highest value.
• Can be represented horizontally or vertically.

Lower boundary/fence: $Q_1 - 1.5(Q_3 - Q_1) = Q_1 - 1.5\,\text{IQR}$

Upper boundary/fence: $Q_3 + 1.5(Q_3 - Q_1) = Q_3 + 1.5\,\text{IQR}$

[Figure: boxplot with each fence drawn $1.5(Q_3 - Q_1)$ beyond its quartile]
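
The fence formulas can be sketched in Python as below. The quartile values Q1 = 54 and Q3 = 79 and the short list of test values are illustrative assumptions. Treating values outside the fences as outliers is the usual boxplot convention, which these notes do not state explicitly.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5 IQR, Q3 + 1.5 IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outside_fences(values, q1, q3):
    """List the values that fall below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

# Illustrative quartiles: IQR = 79 - 54 = 25, so each fence is 37.5 beyond its quartile.
lower_fence, upper_fence = fences(54, 79)

# Hypothetical values to test against the fences.
flagged = outside_fences([10, 50, 95, 120], 54, 79)
```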

The skewness by Pearson’s Coefficient

Skewness | Skewed to the LEFT | Symmetrical | Skewed to the RIGHT
Pearson’s Coefficient | $S_k < -0.1$ | $S_k \approx 0$ | $S_k > 0.1$

Interpretation of the shape of the distribution

Skewness | Skewed to the LEFT | Symmetrical | Skewed to the RIGHT
Graphs | [Figure: left-skewed curve] | [Figure: symmetrical curve] | [Figure: right-skewed curve]
Measure of Location | Mean < Median < Mode | Mean = Median = Mode | Mode < Median < Mean
Box-Plot | $Q_2 - Q_1 > Q_3 - Q_2$ | $Q_2 - Q_1 = Q_3 - Q_2$ | $Q_3 - Q_2 > Q_2 - Q_1$
Central Tendency | Median | Mean | Median
EXAMPLE 11
The stem and leaf diagram shows the number of flies caught in an insect trap for 27 days.

Stem | Leaf
0 | 1 1 2
1 | 2 3 5 5 6
2 | 2 2 3 5 8 8
3 | 4 4 4 4 5 7 7 9
4 | 2 6 7 7 8

Key: 1 | 2 means 12

(a) Find

(i) mean, mode and median.

(ii) Q1, Q3 and semi-interquartile range.

(iii) 81st percentile.

(iv) variance and standard deviation.

(b) Illustrate the above data by constructing a box and whisker plot. Hence, describe the
skewness of the distribution.
EXAMPLE 12
The table shows the distribution of grades of students for a certain subject in an examination.

Grade 1 2 3 4 5 6 7 8 9
Number of Students 7 13 9 7 7 2 1 1 1

(a) Find

(i) mean, mode and median.

(ii) first quartile, third quartile and P12.

(iii) standard deviation.

(b) Construct the box and whisker plot. Hence, state the shape of distribution.
EXAMPLE 13
The following is the systolic blood pressure, in mm Hg, of 10 patients in a hospital.

146 135 151 155 158 146 149 124 162 173

(a) Find the mean and mode. Describe the shape of the distribution.

(b) Find the standard deviation of the systolic blood pressure of the 10 patients. Hence, find the
Pearson’s coefficient of skewness. Comment on the distribution.
(c) Find the number of patients whose systolic blood pressures exceed one standard deviation
above or below the mean.

EXAMPLE 14

Calculate the coefficient of variation

(a) for a set of data having mean 14.0 and standard deviation 2.3.

(b) for a set of data having mean 7, and variance 0.6.
