Centrall - Tendency - IV

Here are the steps to solve this homework problem: 1) Calculate the measures of central tendency and variability for the midterm exam scores: Mean = 92 Median = 88 Mode = 88 Range = 126 - 67 = 59 Standard Deviation = 15.2 2) To compare the treatment and no treatment groups, calculate the mean, standard deviation, and conduct a t-test to determine if the difference in means is statistically significant. This would help determine if the treatment was effective in improving language proficiency. Let me know if you need any clarification or have additional questions!

Uploaded by

Radhika Mohata

Measure of central tendency

„ Central tendency
„ A statistical measure that identifies a single
score as representative for an entire
distribution. The goal of central tendency is
to find the single score that is most typical
or most representative of the entire group.
Measure of central tendency
„ The mean
„ Population mean vs. sample mean
$\mu = \frac{\sum X}{N}$ (population)    $\bar{x} = \frac{\sum X}{n}$ (sample)
„ n = 4: 3, 7, 4, 6
$\bar{x} = \frac{\sum X}{n} = \frac{20}{4} = 5$
Measure of central tendency
„ The weighted mean
„ Group A: $\bar{x} = 6$, n = 12
„ Group B: $\bar{x} = 7$, n = 8
„ Weighted mean = $\frac{\sum X_1 + \sum X_2}{n_1 + n_2} = \frac{72 + 56}{12 + 8} = 6.4$

„ The mean is highly sensitive to extreme scores.
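The weighted mean and the mean's sensitivity to extreme scores can be checked with a few lines of Python. This is a minimal sketch: the function name `weighted_mean` and the outlier value 100 are illustrative assumptions, while the group means and sizes come from the slide.

```python
def weighted_mean(groups):
    """Overall mean of several groups given as (group_mean, group_size) pairs."""
    # mean * n recovers each group's sum of scores (6 * 12 = 72, 7 * 8 = 56).
    total = sum(mean * n for mean, n in groups)
    count = sum(n for _, n in groups)
    return total / count

# Group A (mean 6, n = 12) and Group B (mean 7, n = 8) from the slide.
combined = weighted_mean([(6, 12), (7, 8)])
print(combined)  # 6.4

# Sensitivity to extreme scores: one made-up outlier (100) drags the mean upward.
scores = [3, 7, 4, 6]
mean_without = sum(scores) / len(scores)             # 5.0
mean_with = sum(scores + [100]) / (len(scores) + 1)  # 24.0
```

Note that simply averaging the two group means would give 6.5; weighting by group size gives 6.4 because Group A contributes more scores.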


Measure of central tendency
„ Median
„ The score that divides a distribution exactly
in half. Exactly 50 percent of the
individuals in a distribution have scores at
or below the median.
„ odd: 3, 5, 8, 10, 11 → median = 8
„ even: 3, 3, 4, 5, 7, 8 → median = (4 + 5)/2 = 4.5
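The odd and even cases can be written out as a short function. This is a sketch of the usual textbook rule, not a particular library's implementation; the function name `median` is chosen here for illustration.

```python
def median(scores):
    """Middle score of the sorted data; mean of the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair

odd_median = median([3, 5, 8, 10, 11])    # 8
even_median = median([3, 3, 4, 5, 7, 8])  # 4.5
```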
Measure of central tendency
„ Median
„ The median is often used as a measure of
central tendency when the number of
scores is relatively small, when the data
have been obtained by rank-order
measurement, or when a mean score is
not appropriate.
Measure of central tendency
„ Mode
„ Most frequently obtained score in the data
„ Problems:
„ No mode
Measure of central tendency
„ Choosing a measure of central tendency
„ the level of measurement of the variable concerned
(nominal, ordinal, interval or ratio);
„ the shape of the frequency distribution;
„ what is to be done with the figure obtained.
„ The mean is really suitable only for ratio and interval
data. For ordinal variables, where the data can be ranked
but one cannot validly talk of `equal differences' between
values, the median, which is based on ranking, may be
used. Where it is not even possible to rank the data, as in
the case of a nominal variable, the mode may be the
only measure available.
Measure of central tendency
„ Central tendency and the shape of the
distribution
Summary
1. The purpose of central tendency is to determine the single value that best
represents the entire distribution of scores. The three standard measures of
central tendency are the mode, the median, and the mean.
2. The mean is the arithmetic average. It is computed by summing all the
scores and then dividing by the number of scores. Conceptually, the mean
is obtained by dividing the total (ΣX) equally among the number of
individuals (N or n). Although the calculation is the same for a population
or a sample mean, a population mean is identified by the symbol μ and a
sample mean is identified by x̄.
3. Changing any score in the distribution will cause the mean to be changed.
When a constant value is added to (or subtracted from) every score in a
distribution, the same constant value is added to (or subtracted from) the
mean. If every score is multiplied by a constant, the mean will be multiplied
by the same constant. In nearly all circumstances, the mean is the best
representative value and is the preferred measure of central tendency.
Summary
1. The median is the value that divides a distribution exactly in half. The
median is the preferred measure of central tendency when a
distribution has a few extreme scores that displace the value of the
mean. The median also is used when there are undetermined
(infinite) scores that make it impossible to compute a mean.
2. The mode is the most frequently occurring score in a distribution. It
is easily located by finding the peak in a frequency distribution graph.
For data measured on a nominal scale, the mode is the appropriate
measure of central tendency. It is possible for a distribution to have
more than one mode.
3. For symmetrical distributions, the mean will equal the median. If
there is only one mode, then it will have the same value, too.
4. For skewed distributions, the mode will be located toward the side
where the scores pile up, and the mean will be pulled toward the
extreme scores in the tail. The median will be located between these
two values.
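The ordering of the three measures in a skewed distribution can be illustrated with a small, made-up data set. The scores below are an assumption for illustration only; they pile up at the low end with a few high values in the tail (positive skew).

```python
import statistics

# Made-up, positively skewed scores: most are low, a few extreme scores are high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode = statistics.mode(scores)      # 2, where the scores pile up
median = statistics.median(scores)  # 3.0, between the mode and the mean
mean = statistics.mean(scores)      # 4.6, pulled toward the tail
```

For a negatively skewed distribution the order would be reversed, with the mean the lowest of the three.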
Homework

Imagine that you received the following data on the vocabulary test mentioned earlier:
20 22 23 23 23
23 23 23 24 25
28 29 30 30 30
30 30 30 31 32
32 33 33 34 35
35 36 36 37 37

1. Chart the data and draw the frequency polygon.

2. Compute the mean, mode, and median of the data and decide which of the three you
believe to be best for the central tendency of the data.
Measure of variability
„ Variability provides a quantitative
measure of the degree to which scores
in a distribution are spread out or
clustered together.
Measure of variability
„ Range
„ range=Xhighest – Xlowest
„ Quartile:
„ A statistical term describing a division of observations into four defined intervals
based upon the values of the data and how they compare to the entire set of
observations.
Each quartile contains 25% of the total observations. Generally, the data are
ordered from smallest to largest: observations in the lowest 25% of the data
fall in the 1st quartile, observations between 25% and 50% fall in the 2nd
quartile, observations between 50% and 75% fall in the 3rd quartile, and the
remaining observations fall in the 4th quartile.
„ Interquartile: The interquartile range is a measure of spread or dispersion. It
is the difference between the 75th percentile (often called Q3) and the 25th
percentile (Q1). The formula for interquartile range is therefore: Q3-Q1.
„ Semi-interquartile: The semi-interquartile range is a measure of spread or
dispersion. It is computed as one half the difference between the 75th
percentile [often called (Q3)] and the 25th percentile (Q1). The formula for
semi-interquartile range is therefore: (Q3-Q1)/2.
„ TOEFL: (560-470)/2=45
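The range-based measures above reduce to simple arithmetic once the quartiles are known. The sketch below reproduces the TOEFL figure and then estimates quartiles from raw data with the standard library. The raw scores are made up, and the `"inclusive"` method is only one of several quartile conventions, so other software may give slightly different quartiles.

```python
import statistics

def semi_interquartile_range(q1, q3):
    """Half the distance between the first and third quartiles."""
    return (q3 - q1) / 2

# TOEFL example from the slide: Q1 = 470, Q3 = 560.
toefl_siqr = semi_interquartile_range(470, 560)  # 45.0

# With raw data the quartiles have to be estimated first (scores are made up).
scores = [430, 455, 470, 500, 510, 540, 560, 590, 610]
data_range = max(scores) - min(scores)           # highest minus lowest
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1
siqr = semi_interquartile_range(q1, q3)
```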
Measure of variability
„ Variance
„ Deviation: deviation of one score from the
mean
„ Variance: taking the distribution of all
scores into account.
[Figure: computing the sum of squares (SS), n = 24]
Measure of variability
„ Standard deviation
score    mean    deviation*    squared deviation
  8      9.67      -1.67             2.79
 25      9.67     +15.33           235.01
  7      9.67      -2.67             7.13
  5      9.67      -4.67            21.81
  8      9.67      -1.67             2.79
  3      9.67      -6.67            44.49
 10      9.67      +0.33             0.11
 12      9.67      +2.33             5.43
  9      9.67      -0.67             0.45
sum of squared deviations = 320.01

Standard deviation = square root(sum of squared deviations / (N - 1))
                   = square root(320.01 / (9 - 1))
                   = square root(40)
                   = 6.32
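The same calculation can be reproduced step by step in Python. This is a minimal sketch using the nine scores from the table; the variable names are illustrative. Because the code does not round the mean to 9.67, the sum of squares comes out as 320 rather than 320.01.

```python
import math
import statistics

scores = [8, 25, 7, 5, 8, 3, 10, 12, 9]  # the nine scores from the table

n = len(scores)
mean = sum(scores) / n                                 # about 9.67
sum_of_squares = sum((x - mean) ** 2 for x in scores)  # about 320
sample_sd = math.sqrt(sum_of_squares / (n - 1))        # about 6.32

# The standard library applies the same N - 1 formula for a sample.
library_sd = statistics.stdev(scores)
```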
Measure of variability
„ The larger the standard deviation figure, the wider
the range of distribution away from the measure of
central tendency
Measure of variability
„ Adding a constant to each score does
not change the standard deviation.
„ Multiplying each score by a constant
causes the standard deviation to be
multiplied by the same constant.
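Both properties are easy to verify numerically. The sketch below reuses the nine scores from the standard deviation example; the constants 5 and 3 are arbitrary choices for illustration.

```python
import statistics

scores = [8, 25, 7, 5, 8, 3, 10, 12, 9]
sd = statistics.stdev(scores)

# Adding a constant shifts every score equally, so the spread is unchanged.
shifted_sd = statistics.stdev([x + 5 for x in scores])

# Multiplying by a constant stretches every deviation by that constant.
scaled_sd = statistics.stdev([x * 3 for x in scores])
```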
Measure of variability
Group A Group B
11 20
8 10
10 1
9 8
8 0
12 30
10 13
11 6
Measure of variability

Reporting the standard deviation (APA):

                  Type of instrument
             Listening          Watching
             Mean     SD        Mean     SD
Males        15.72    4.43      6.94     2.26
Females       3.47    1.12      2.61     0.98
Measure of variability
„ Standard deviation and normal distribution
Homework
1. Calculate the mean, median, mode, range and standard
deviation for the following sample:

Midterm Exam
X X
100 85
88 82
83 96
105 107
78 102
98 113
126 94
85 119
67 91
88 100
88 72
77 88
114 85
Homework
2. Suppose that the following scores were obtained on administering a
language proficiency test to ten aphasics who had undergone a course
of treatment, and ten otherwise similar aphasics who had not
undergone the treatment:
Experimental group Control group
15 31
28 34
62 47
17 41
31 28
58 54
45 36
11 38
76 45
43 32
Calculate the mean score and standard deviation for each group, and
comment on the results.
Locating scores and finding
scales in a distribution
Percentiles, quartiles, deciles
Mind work

Imagine that you conducted an in-service course for ESL teachers. To receive university credit for the
course, the teachers must take examinations--in this case, a midterm and a final. The midterm was a
multiple-choice test of 50 items and the final exam presented teachers with 10 problem situations to
solve. Sue, like most teachers, was a whiz at taking multiple-choice exams, but bombed out on the
problem-solving final exam. She received a 48 on the midterm and a 1 on the final. Becky didn't do so
well on the midterm. She kept thinking of exceptions to answers on the multiple-choice exam. Her score
was 39. However, she really did shine on the final, scoring a 10. Since you expect students to do well on
both exams, you reason that Becky has done a creditable job on each and Sue has not. Becky gets the
higher grade. Yet, if you add the points together, Sue has 49 and Becky has 49. The question is whether
the points are really equal.
Should Sue also do this bit of arithmetic, she might come to your office to complain of the injustice of it
all. How will you show her that the value of each point on the two tests is different?
Locating scores and finding
scales in a distribution
„ Standard score (z-score): $z = \frac{X - \bar{x}}{s}$
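The z-score formula answers the question about Sue and Becky: a point is worth more on the test where scores are less spread out. The sketch below shows the idea. The class means and standard deviations are NOT given in the slides; the values used here are hypothetical and chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (x - mean) / sd

# Hypothetical class statistics (assumptions, not from the source).
MIDTERM_MEAN, MIDTERM_SD = 40.0, 5.0  # 50-item multiple-choice midterm
FINAL_MEAN, FINAL_SD = 5.0, 2.5       # 10-problem final

# Raw scores from the example: Sue 48 and 1, Becky 39 and 10.
sue_total = z_score(48, MIDTERM_MEAN, MIDTERM_SD) + z_score(1, FINAL_MEAN, FINAL_SD)
becky_total = z_score(39, MIDTERM_MEAN, MIDTERM_SD) + z_score(10, FINAL_MEAN, FINAL_SD)

print(sue_total, becky_total)
```

Under these assumed values both students have 49 raw points, yet Becky's combined standard score is clearly higher, because her 10 on the final lies further above the class mean than Sue's 48 on the midterm.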
Locating scores and finding
scales in a distribution
Mind work

Suppose that we have measured the times taken by a very large number of
people to utter a particular sentence, and have shown these times to be
normally distributed with a mean of 3.45 sec and a standard deviation of
0.84 sec. Armed with this information, we can answer various questions.
1. What proportion of the (potentially infinite) population of utterance
times would be expected to fall below 3 sec?
2. What proportion would lie between 3 and 4 sec?
3. What is the time below which only 1 per cent of the times would be
expected to fall?
Mind work

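„ 1. z-score for 3 sec: $z = \frac{3 - 3.45}{0.84} = -0.54$
„ 2. Check the normal distribution table: 29.46 per cent of the times fall below z = −0.54.
„ 3. z-score for 4 sec: $z = \frac{4 - 3.45}{0.84} = 0.66$; 25.46 per cent of the times fall above it.
„ 4. 100 − 29.46 − 25.46 = 45.1 per cent of the times lie between 3 and 4 sec.
„ 5. z-score for 1 per cent: −2.33
„ 6. $-2.33 = \frac{x - 3.45}{0.84}$, so x = (−2.33 × 0.84) + 3.45 = 1.49 sec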
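The three questions can also be answered without a printed table, using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch; the results differ slightly from the table-based figures because the z-scores are not rounded to two decimals first.

```python
from statistics import NormalDist

# Utterance times: normally distributed, mean 3.45 sec, SD 0.84 sec.
times = NormalDist(mu=3.45, sigma=0.84)

# 1. Proportion of times expected to fall below 3 sec.
below_3 = times.cdf(3.0)

# 2. Proportion expected to lie between 3 and 4 sec.
between_3_and_4 = times.cdf(4.0) - times.cdf(3.0)

# 3. Time below which only 1 per cent of the times fall.
first_percentile = times.inv_cdf(0.01)

print(round(below_3, 3), round(between_3_and_4, 3), round(first_percentile, 2))
```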
Normal Distribution Table
Locating scores and finding
scales in a distribution
„ T-score
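„ T score = 10(z) + 50
„ For a standardized scale with mean 500 and standard deviation 100: z = (score − 500)/100
„ Converting a z-score back to a raw score: $X = z \times s + \bar{x}$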
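These conversions are linear rescalings of the z-score and can be sketched as three small functions. The function names are illustrative assumptions; the default mean of 500 and standard deviation of 100 follow the scale mentioned in the slides.

```python
def t_score(z):
    """T score: a scale with mean 50 and standard deviation 10."""
    return 10 * z + 50

def z_from_scaled(score, mean=500, sd=100):
    """z-score from a standardized scale (default: mean 500, SD 100)."""
    return (score - mean) / sd

def raw_from_z(z, mean, sd):
    """Raw score corresponding to a z-score in a distribution with this mean and SD."""
    return z * sd + mean

t = t_score(1.5)                        # 65.0
z = z_from_scaled(620)                  # 1.2
raw = raw_from_z(-2.33, 3.45, 0.84)     # about 1.49
```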
Mind work

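A foreign languages institute stipulates in its graduate programme that a
student who scores below 75 in the examination for any one course loses the
right to write a thesis. This is clearly unscientific, because in effect it
forces examinations of different kinds into a single comparison. The same
score of 75 means different things in different examinations: in a very easy
examination it may be a relatively low score, while in a more difficult one
it may be a relatively high score. Barring everyone below that score from
writing a thesis is both unscientific and unfair. The scientific approach is
to convert the scores for each course into standard scores and then set a
standard-score threshold below which students are not eligible to write a
thesis. As in the previous example, once standard scores are available, the
results for all courses can also be combined into a total or an average and
ranked, and a criterion can then be set for the total or average score
required to be eligible to write a thesis.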
Locating scores and finding
scales in a distribution
„ Distributions with nominal data
„ Implicational scaling (Guttman scaling)
„ Coefficient of scalability
Homework
I. The following scores are obtained by 50 subjects on a language aptitude test:
42 62 44 32 47 42 52 76 36 43
55 27 46 55 47 28 53 44 15 61
18 59 58 57 49 55 88 49 50 62
61 82 66 80 64 50 40 53 28 63
63 25 58 71 82 52 73 67 58 77

1. Draw a histogram to show the distribution of the scores.


2. Calculate the mean and standard deviation of the scores.
3. Suppose Lihua scored 55 in this test, what’s her position in the
whole class?
II. Suppose there will be 418,900 test takers for the NMET in 2006 in
Guangdong, and the key universities in China plan to enroll 32,000 students
altogether from Guangdong. What is the lowest score a student needs in order
to be enrolled by the key universities? (Remember that the mean is 500 and
the standard deviation is 100.)
Sample statistics and population
parameter: estimation
„ Standard error
„ Sampling distribution of the mean
„ Standard error of mean
„ Standard error = $\frac{s}{\sqrt{N}}$
„ In order to halve the standard error, we should have to
take a sample which was four times as big.
„ Central limit theorem:
„ For any population with mean μ and standard deviation
σ, the distribution of sample means for sample size n
will approach a normal distribution with a mean of μ and a
standard deviation of $\sigma / \sqrt{n}$ as n approaches infinity.
„ In practice, the approximation is usually treated as adequate for samples above 30.
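The claim that halving the standard error requires four times the sample follows directly from the square root in the formula. A minimal check, assuming a sample standard deviation of 23.6 (the value used in the confidence-limits example elsewhere in these slides); the function name is illustrative.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return s / math.sqrt(n)

se_50 = standard_error(23.6, 50)    # about 3.3
se_200 = standard_error(23.6, 200)  # four times the sample, half the error

print(round(se_50, 2), round(se_200, 2))
```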
Sample statistics and population
parameter: estimation
„ Interpreting standard error: confidence
limits
Sample statistics and population
parameter: estimation

„ Normal distribution: sample is large


„ t-distribution: sample is small
„ Degree of freedom: N-1
„ When the sample is large, t ≈ z
Sample statistics and population
parameter: estimation
„ Interpreting standard error: confidence
limits
„ Mean=58.2
„ s=23.6
„ N=50
„ Standard error = $\frac{s}{\sqrt{N}} = \frac{23.6}{\sqrt{50}} = 3.3$
„ $z = \frac{58.2 - \mu}{3.3}$, so $-1.96 \le \frac{58.2 - \mu}{3.3} \le 1.96$, which gives $51.7 \le \mu \le 64.7$
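The 95 per cent confidence limits for the mean can be reproduced directly from the three sample statistics. This is a sketch using the normal critical value 1.96, which the slides treat as appropriate for a sample of 50; variable names are illustrative.

```python
import math

sample_mean, s, n = 58.2, 23.6, 50  # values from the slide
z_crit = 1.96                       # two-tailed critical value for 95 per cent

se = s / math.sqrt(n)               # standard error, about 3.3
lower = sample_mean - z_crit * se
upper = sample_mean + z_crit * se

print(round(lower, 1), round(upper, 1))  # 51.7 64.7
```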
Sample statistics and population
parameter: estimation
„ Confidence limits for proportions
„ Standard error = $\sqrt{\frac{p(1 - p)}{N}}$
„ Confidence limits = proportion in sample ± (critical value × standard error)
Sample statistics and population
parameter: estimation
Suppose that we have taken a random sample of 500 finite verbs from a text,
and found that 150 of them have present tense form. How can we set
confidence limits for the proportion of present tense finite verbs in the whole
text, the population from which the sample is taken?

95% confidence limits = proportion in sample ± (1.96 × standard error)
= 0.30 ± (1.96 × 0.02) = 0.30 ± 0.04 = 0.26 to 0.34.
We can thus be 95 per cent confident that the proportion of present tense
finite verbs in the population lies between 26 and 34 per cent.
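The arithmetic behind this interval can be written out as follows. This is a minimal sketch of the normal-approximation interval used in the example; variable names are illustrative.

```python
import math

present_tense, n = 150, 500  # counts from the example
z_crit = 1.96                # critical value for 95 per cent confidence

p = present_tense / n                # 0.30
se = math.sqrt(p * (1 - p) / n)      # about 0.02
lower = p - z_crit * se
upper = p + z_crit * se

print(round(lower, 2), round(upper, 2))  # 0.26 0.34
```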
Sample statistics and population
parameter: estimation
„ Estimating required sample sizes
„ Standard error = $\sqrt{\frac{p(1 - p)}{N}}$

In a paragraph there are 46 word tokens, of which 11 are two-letter
words. The proportion of such words is thus 11/46 or 0.24.
How big a sample of words should we need in order to be 95 per
cent confident that we had measured the proportion to within an
accuracy of 1 per cent?
0.01 = 1.96 × standard error
Standard error = 0.01 / 1.96
$N = \frac{0.24 \times 0.76}{(0.01 / 1.96)^2} = 7007$
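Solving the standard error formula for N gives a reusable sample-size calculation. The sketch below follows the worked example; the function name and parameter names are illustrative assumptions.

```python
def required_sample_size(p, margin, z_crit=1.96):
    """Sample size N such that z_crit * sqrt(p * (1 - p) / N) equals the margin."""
    target_se = margin / z_crit          # the standard error we can tolerate
    return p * (1 - p) / target_se ** 2  # rearranged from SE = sqrt(p(1 - p) / N)

# Proportion 0.24 measured to within 0.01 with 95 per cent confidence.
n_required = required_sample_size(0.24, 0.01)

print(round(n_required))  # 7007
```

In practice the result is usually rounded up to the next whole number, since a fraction of an observation cannot be sampled.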
Homework
I. The following are the times (in seconds) taken for a group of 30 subjects
to carry out the detransformation of a sentence into its simplest form:
0.55 0.56 0.52 0.59 0.51 0.50
0.42 0.41 0.37 0.22 0.24 0.41
0.49 0.59 0.75 0.65 0.63 0.61
0.72 0.77 0.76 0.39 0.26 0.68
0.30 0.32 0.44 0.61 0.54 0.47

Calculate (i) the mean, (ii) the standard deviation, (iii) the standard error
of the mean, (iv) the 99 per cent confidence limits for the mean.

II. A random sample of 300 finite verbs is taken from a text, and it is found
that 63 of these are auxiliaries. Calculate the 95 per cent confidence
limits for the proportion of finite verbs which are auxiliaries in the text
as a whole.
III. Using the data in question II, calculate the size of the sample of finite
verbs which would be required in order to estimate the proportion of
auxiliaries to within an accuracy of 1 per cent, with 95 per cent
confidence.
Probability and Hypothesis
Testing
„ Null hypothesis (H0)
„ The null hypothesis states that in the general
population there is no change, no difference, or no
relationship. In the context of an experiment, H0
predicts that the independent variable (treatment)
will have no effect on the dependent variable for
the population. H0: μA- μB=0 or μA= μB
„ Alternative hypothesis (H1)
„ The alternative hypothesis (H1) states that there is
a change, a difference, or a relationship for the
general population. H1: μA≠ μB
Probability and Hypothesis
Testing
„ Null hypothesis (H0)
„ When we reject the null hypothesis, we want the probability to be
very low that we are wrong. If, on the other hand, we must accept
the null hypothesis, we still want the probability to be very low that
we are wrong in doing so.
„ Type I error and Type II error
„ A type I error is made when the researcher rejects the null
hypothesis when it should not have been rejected.
„ A type II error is made when the researcher accepts the null
hypothesis when it should have been rejected.
„ In research, we test our hypothesis by finding the probability of
our results. Probability is the proportion of times that any
particular outcome would happen if the research were repeated
an infinite number of times.
Probability and Hypothesis
Testing
„ Two-tailed and one-tailed hypothesis
„ When we specify no direction for the null hypothesis (i.e.,
whether our score will be higher or lower than more typical
scores), we must consider both tails of the distribution. This
is called two-tailed hypothesis.
„ If we have good reason to believe that we will find a
difference (e.g., previous studies or research findings
suggest this is so), then we will use a one-tailed hypothesis.
One-tailed tests specify the direction of the predicted
difference. We use previous findings to tell us which
direction to select.
Critical values of z:
            α = .05    α = .01
1-tailed     1.64       2.33
2-tailed     1.96       2.58
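Critical values of z for a given α level can be computed from the standard normal distribution rather than looked up. This is a sketch using `statistics.NormalDist`; the function name `critical_z` is an illustrative assumption. A two-tailed test splits α between the two tails, which is why its critical values are larger.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical z for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    # Each tail receives alpha / tails of the probability.
    return NormalDist().inv_cdf(1 - alpha / tails)

one_tailed_05 = critical_z(0.05, 1)
one_tailed_01 = critical_z(0.01, 1)
two_tailed_05 = critical_z(0.05, 2)
two_tailed_01 = critical_z(0.01, 2)

for value in (one_tailed_05, one_tailed_01, two_tailed_05, two_tailed_01):
    print(round(value, 2))
```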
Probability and Hypothesis
Testing
„ Steps in hypothesis testing
1. State the null hypothesis.
2. Decide whether to test it as a one- or two-tailed hypothesis. If there is
no research evidence on the issue, select a two-tailed hypothesis. This
will allow you to reject the null hypothesis in favor of an alternative
hypothesis. If there is research evidence on the issue, select a
one-tailed hypothesis. This will allow you to reject the null
hypothesis in favor of a directional hypothesis.
3. Set the probability level (α level). Justify your choice.
4. Select the appropriate statistical test(s) for the data.
5. Collect the data and apply the statistical test(s).
6. Report the test results and interpret them correctly.
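The steps above can be sketched for the simplest case, a two-tailed one-sample z-test. The sample statistics (mean 58.2, s = 23.6, N = 50) are borrowed from the estimation example in these slides; the hypothesized population mean of 50 is a made-up value for illustration, and the z-test is only one possible choice of test.

```python
import math
from statistics import NormalDist

# Step 1: null hypothesis H0: mu = mu_0 (mu_0 is hypothetical, not from the source).
mu_0 = 50.0
# Step 2: no prior evidence about direction, so use a two-tailed test.
# Step 3: set the alpha level.
alpha = 0.05
# Step 4: a z-test, assuming the sample is large enough for the normal approximation.
sample_mean, s, n = 58.2, 23.6, 50

# Step 5: apply the test.
se = s / math.sqrt(n)
z = (sample_mean - mu_0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails

# Step 6: report and interpret.
reject_null = p_value < alpha
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_null}")
```

With these assumed numbers z is about 2.46, which exceeds the two-tailed critical value of 1.96, so H0 would be rejected at the .05 level.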
Probability and Hypothesis
Testing
„ Parametric vs. nonparametric
„ Parametric procedures
„ Make strong assumptions about the distribution of the
data
„ Assume the data are NOT frequencies or ordinal scales
but interval data
„ Data are normally distributed
„ Nonparametric procedures
„ Do not make strong assumptions about the shape of the
distribution of the data
„ Work with frequencies and rank-ordered scales
„ Used when the sample size is small
Homework