Statistics JEE Advanced
and $\sigma_2^2$ are the variances of A and B respectively. If $\mu$ and $\sigma^2$ denote the mean and variance of the combined pool of A and B, i.e., the data set x, 3, 5, 11, 15, y, 9, 11, and the relation
$\sigma^2 = \dfrac{5\sigma_1^2 + 3\sigma_2^2}{8}$
holds true, then which of the following statement(s) is/are correct?
(a) The number of possible ordered pairs (x, y), where x and y are natural numbers, is finite.
(b) The possible natural values of x will form an arithmetic progression with common difference 5.
(c) The possible natural values of y will form an arithmetic progression with common difference 3.
(d) Smallest possible natural values of x and y are respectively 1 and 2.
11. Let $M_1$ denote the median of data set A: 5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27, and let $M_2$ denote the median of the distribution
x    1    2    3    4    5    6    7    8    9
f    8   10   11   16   20   25   15    9    6
Which of the following statements are correct?
(a) $M_1 > M_2$
(b) $M_1 \cdot M_2 = 135$
(c) Both $M_1$ and $M_2$ are less than 20.
(d) Both $M_1$ and $M_2$ are divisors of 540.
12. The ages (in years) and incomes (in Rupees) of the 10 employees at XYZ Pvt. Ltd. are given in the following table.
Age    Income
25     23500
30     25000
40     30000
53     47500
29     32000
45     37500
40     32000
55     50500
35     40000
47     43750
Assuming that all employees remain with the company for the next 5 years and that each income is multiplied by 1.5 over that period, which of the following statements is/are correct?
(a) Current standard deviation of ages is 10.21 yrs.
(b) Standard deviation of salaries 5 years later will be Rs. 13836.15.
(c) Standard deviation of ages after 5 years will be 10.21 years.
(d) Standard deviation of salaries now is Rs. 13836.15.
13. The mean yearly salary of all the employees at ABC Pvt. Ltd is Rs. 42,500 and the standard deviation is Rs. 4,000. The mean number of years of education for the employees is 16 and the standard deviation is 2.5 years. Which of the following statements are correct?
(a) Data about education is more variable.
(b) Data about salaries is more consistent.
(c) Data about education is more consistent.
(d) Both the data sets are equally consistent.
14. Consider two data sets A: 2, 5, 8, 11, 14 and B: 2, 8, 14. Let $\mu_1$ and $\mu_2$ be the means of A and B respectively and $\sigma_1^2$ and $\sigma_2^2$ be the variances of A and B respectively. If $\mu$ and $\sigma^2$ denote the mean and variance of the combined pool of A and B, i.e., the data set 2, 5, 8, 11, 14, 2, 8, 14, then which of the following statement(s) is/are correct?
(a) $\mu_1 = \mu_2$
(b) $\sigma_1^2 = \sigma_2^2$
(c) $\sigma^2 = \dfrac{5\sigma_1^2 + 3\sigma_2^2}{8}$
(d) $\sigma^2 = \dfrac{\sigma_1^2 + \sigma_2^2}{2}$
15. All the students of a class performed poorly in Mathematics. The teacher decided to give grace marks of 10 to each of the students. Which of the following statistical measures will change after the grace marks were given?
(a) Mean (b) Median
(c) Mode (d) Variance
16. In a set of 2n observations, half of them are equal to 'a' and the remaining half are equal to '−a'. If the standard deviation of all the observations is 2, then the value of a can be
(a) 2 (b) $\sqrt{2}$
(c) –2 (d) $2\sqrt{2}$
17. The mean of the numbers a, b, 8, 5, 10 is 6 and the variance is 6.80. Then which one of the following is correct?
(a) a = 3, b = 4 (b) a = 4, b = 3
(c) a = 3, b = 3 (d) a = 4, b = 4
18. A scientist is weighing each of 30 fishes. Their mean weight worked out is 30 gm with a standard deviation of 2 gm. Later, it was found that the measuring scale was misaligned and always under-reported every fish weight by 2 gm. The correct values are:
(a) $\bar{x} = 32$ (b) $\bar{x} = 28$
(c) σ = 2 (d) σ = 4
19. Let $\bar{X}$ and M.D. be the mean and the mean deviation about $\bar{X}$ of n observations $x_i$, i = 1, 2, ..., n. If each of the observations is increased by 5, then the new mean and the mean deviation about the new mean, respectively, are:
(a) $\bar{X}$ (b) $\bar{X} + 5$
(c) M.D. (d) M.D. + 5
20. If the standard deviation of the numbers 2, 3, a and 11 is 3.5, then which of the following is true?
(a) $3a^2 - 34a + 91 = 0$
(b) Product of all values of a is 28
(c) a can be a prime number
(d) $3a^2 - 32a + 84 = 0$
21. Standard deviation of a data set is 7. When each observation is decreased by 7, the standard deviation of the new data is
22. A sample of 20 observations has mean of 50 and variance of 1, while a sample of 40 observations has mean of 50 and standard deviation 2. The 2 samples are combined to give a complete set of 60 observations with variance $\sigma^2$; then $3\sigma^2$ is equal to
to (Given $x_1 < x_2 < x_3 < \dots < x_n$ and $\bar{X}$ is mean of all these observations)
24. If the standard deviation of the first 10 positive integral multiples of 3 is equal to k then find [k], where [.] is G.I.F.
25. The mean and standard deviation of 100 observations were calculated as 40 and 5.1 respectively by a student who took by mistake 50 instead of 40 for one observation. If the sum of correct mean and standard deviation is k then find [k], where [.] is G.I.F.
26. If coefficient of variation of a distribution is 60% and its standard deviation is 18, then its arithmetic mean is
27. The variance of $x_1, x_2, x_3, \dots, x_{100}$ is 13.75. If the variance of $5x_1, 5x_2, 5x_3, \dots, 5x_{100}$ is equal to k then find [k], where [.] is G.I.F.
28. The mean and variance of seven observations are 8 and 16 respectively. If five of these are 2, 4, 10, 12, 14 and the other two are $a_1, a_2$ such that $a_1 < a_2$, then $a_2^{\,a_2 - a_1}$ is equal to
29. If the mean and standard deviation of 10 observations $x_1, x_2, x_3, \dots, x_{10}$ are 2 and 3 respectively, then the mean of $(x_1 + 1)^2, (x_2 + 1)^2, \dots, (x_{10} + 1)^2$ is equal to
30. Let X = {11, 12, 13, …, 40, 41} and Y = {61, 62, 63, …, 90, 91} be the two sets of observations. If $\bar{x}$ and $\bar{y}$ are their respective means and $\sigma^2$ is the variance of all the
(a) 10 (b) 20
(c) 30 (d) 40
32. The value of [σ] is
(a) 13 (b) 14
(c) 15 (d) 16
Passage-II
for i = 1, 2, 3.
$n_1 = 5$, $\mu_1 = 8$, and $\sigma_1^2 = 18$
$n_2 = 3$, $\mu_2 = 8$, and $\sigma_2^2 = 24$
$n_3 = 3$, $\mu_3 = 16$, and $\sigma_3^2 = 24$
Now answer the following questions. ([.] denotes the greatest integer function)
Choose the correct answer:
33. $\sigma_{12}^2$ is equal to
(a) 11 (b) 20
(c) 33 (d) 44
34. $\sigma_{13}^2$ is equal to
(a) 11 (b) 22
(c) 35 (d) 44
Passage-III
Let $\bar{x}$, M and $\sigma^2$ be respectively the mean, mode and variance of n observations $x_1, x_2, \dots, x_n$ and $d_i = -x_i - a$, $i = 1, 2, \dots, n$, where a is any number.
35. Variance of $d_1, d_2, \dots, d_n$ is
Height (inch)    60-62    63-65    66-68    69-71    72-74
Frequency          5       18       42       27        8
Assuming the mid-point of a class interval as class mark, match the entries in Column-I to the entries in Column-II ([.] denotes the greatest integer function).
Column-I                                                        Column-II
A  If mean of the data is m then [m] is equal to                 p. 2
B  If mean deviation of the data is MD then [MD] is equal to     q. 4
C  If standard deviation of data is σ then [σ] is equal to       r. 3
D  If coefficient of variation is CV then [CV] is equal to       s. 67
                                                                 t. 65
(a) A → s; B → p; C → p; D → q
(b) A → s; B → p; C → r; D → q
(c) A → p; B → q; C → r; D → s
(d) A → p; B → q; C → r; D → t
38. Let $R_i$, $M_i$, $\sigma_i$ and $C_i$ denote the range, mean deviation, standard deviation and coefficient of variation for dataset $S_i$ for i = 1, 2. Consider the following datasets.
$S_1$: 12, 6, 7, 3, 15, 10, 18, 5 and $S_2$: 9, 3, 8, 8, 9, 8, 9, 18 ([.] denotes the greatest integer function). Match the entries in Column-I to entries in Column-II.
Column-I                                              Column-II
A  Value of $\dfrac{R_1 + R_2}{2}$ is equal to        p. 6
B  Value of $[M_1 + M_2]$ is equal to                 q. 7
C  Value of $[\sigma_1] + [\sigma_2]$ is equal to     r. 8
D  Value of $[C_1 - C_2]$ is equal to                 s. 15
                                                      t. 12
(a) A → s; B → p; C → p; D → r
(b) A → s; B → p; C → r; D → q
(c) A → p; B → q; C → r; D → s
(d) A → t; B → p; C → q; D → s
39. The following question contains statements given in two columns, which have to be matched. The statements in Column-I are labelled as A, B, C and D while the statements in Column-II are labelled as p, q, r and s. Any given statement in Column-I can have correct matching with one or more statement(s) in Column-II.
Column-I                                                             Column-II
A  Better measure of central tendency from data 1, 7, 8, 9, 98 is    P  Mean
B  Which is not independent of change of scale?                      Q  Median
C  Which is not dependent on change of origin?                       R  Mode
D  The value of range of data is always greater than or equal to     S  S.D.
(a) A → r; B → p; C → s; D → r, q
(b) A → q, r; B → p, q, r, s; C → s; D → s
(c) A → s; B → p, q; C → p, r; D → p, s, q
(d) A → q; B → p; C → s; D → r
Paragraph Based
Paragraph for question numbers 1 to 3
Given values $x_1, x_2, \dots, x_{10}$ such that $\sum_{i=1}^{10} x_i = 120$ and $\sum_{i=1}^{10} x_i^2 = 1530$, then
40. A.M. of $x_1, x_2, \dots, x_{10}$ is
(a) 12 (b) 10
(c) 11 (d) 13
41. Variance of $x_1, x_2, \dots, x_{10}$ is
(a) 2 (b) 3
(c) 9 (d) 16
42. Coefficient of variation is
(a) 36% (b) 41%
(c) 25% (d) None of these
Paragraph for question numbers 4 to 6
To analyze data using mean, median and mode, we need to use the most appropriate measure of central tendency. The mean is useful for predicting future results when there are no extreme values in the data set. The median may be more useful than the mean when there are extreme values in the data set as it is not affected by the extreme values. The median is the most commonly quoted figure used to measure property prices, as the mean property price is affected by a few expensive properties that are not representative of the general property market. The mode is useful when the most common item or characteristic of a data set is required. The mode has applications in printing. It is important to print more of the most popular books.
43. For the data shown, the value of the appropriate measure of central tendency is
Staff                   1        2       4       5        6        2
Salary (in rupees)    15000    10000    7000    12000    90000    95000
(a) 95000 (b) 18350
(c) 90000 (d) 12000
44. For a normally distributed sample as shown, the most appropriate representative of the data is
(a) Mean (b) Mode
(c) Median (d) Any one of mean or median
45. Based upon collection of data of the number of days it snows, rains or is sunny in a month for the three months December, January and February of last year, the weather forecast points out that snow is likely to be in January. Which measure is used for this forecast?
(a) Mean (b) Mode
(c) Median (d) Range
Answer Key
1. (d) 2. (a) 3. (b) 4. (c) 5. (a) 6. (b) 7. (c) 8. (c) 9. (b,c) 10. (b,c)
11. (a,b,d) 12. (a,b,c) 13. (a,b) 14. (a,c) 15. (a,b,c) 16. (a,c)
17. (a,b) 18. (a,c) 19. (b,c) 20. (b,d) 21. [7] 22. [9] 23. [1] 24. [8]
25. [44] 26. [30] 27. [343] 28. [64] 29. [18] 30. [603] 31. (d)
32. (b) 33. (b) 34. (c) 35. (a) 36. (c) 37. (a) 38. (a) 39. (b)
40. (a) 41. (c) 42. (c) 43. (d) 44. (a) 45. (b)
EXPLANATION
1. (d)
Since root mean square ≥ arithmetic mean,
$\sqrt{\dfrac{\sum_{i=1}^{n} x_i^2}{n}} \ge \dfrac{\sum_{i=1}^{n} x_i}{n} \Rightarrow \sqrt{\dfrac{400}{n}} \ge \dfrac{80}{n} \Rightarrow n \ge 16$
Hence, possible value of n = 18.
2. (a)
Given, $82 = \dfrac{(27 + x) + (31 + x) + (89 + x) + (107 + x) + (156 + x)}{5}$
⇒ 82 × 5 = 410 + 5x ⇒ 410 − 410 = 5x
⇒ x = 0
∴ Required mean is
$\bar{x} = \dfrac{130 + x + 126 + x + 68 + x + 50 + x + 1 + x}{5} = \dfrac{375}{5} = 75$
3. (b)
$\bar{x} = \dfrac{\sum x_i}{100} = 0$
$\dfrac{\sum |x_i - \bar{x}|}{100} = 5 \Rightarrow \sum x_i^2 + 2\sum_{1 \le i < j \le 100} |x_i x_j| = (500)^2$
$\Rightarrow \dfrac{\sum x_i^2}{100} = \dfrac{(500)^2 - 2\sum |x_i x_j|}{100} = 2500 - 1600$
$\text{S.D.} = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{100}} = \sqrt{900} = 30$
4. (c)
Median $= \dfrac{7k + 9k}{2} = 8k$
Mean deviation about median = 11
$\dfrac{\sum |x_i - 8k|}{6} = 11$
7k + 5k + k + k + 3k + 5k = 66
⇒ 22k = 66
⇒ k = 3
5. (a)
A                        B                               A + B
$\bar{x}_1 = 40$         $\bar{x}_2 = 55$                $\bar{x} = 50$
$\sigma_1 = \alpha$      $\sigma_2 = 30 - \alpha$        $\sigma^2 = 350$
$n_1 = 100$              $n_2 = n$                       $100 + n$
$\bar{x} = \dfrac{100 \times 40 + 55n}{100 + n}$
5000 + 50n = 4000 + 55n
1000 = 5n
n = 200
$\sigma_1^2 = \dfrac{\sum x_i^2}{100} - 40^2$
$\sigma_2^2 = \dfrac{\sum x_j^2}{200} - 55^2$
$350 = \sigma^2 = \dfrac{\sum x_i^2 + \sum x_j^2}{300} - (\bar{x})^2$
$350 = \dfrac{(1600 + \alpha^2) \times 100 + \left((30 - \alpha)^2 + 3025\right) \times 200}{300} - (50)^2$
$2850 \times 3 = \alpha^2 + 2(30 - \alpha)^2 + 1600 + 6050$
$8550 = \alpha^2 + 2(30 - \alpha)^2 + 7650$
$\alpha^2 + 2(30 - \alpha)^2 = 900$
$\alpha^2 - 40\alpha + 300 = 0$
Taking α = 10 gives $\sigma_1 = 10$ and $\sigma_2 = 20$, so
$\sigma_1^2 + \sigma_2^2 = 10^2 + 20^2 = 500$
6. (b)
$9 = x_1 < x_2 < \dots < x_7$
9, 9 + d, 9 + 2d, …, 9 + 6d
Shifting the origin by 9 (which leaves the variance unchanged):
0, d, 2d, …, 6d
$\bar{x}_{new} = \dfrac{21d}{7} = 3d$
$16 = \dfrac{1}{7}\left(0^2 + 1^2 + \dots + 6^2\right)d^2 - 9d^2$
$= \dfrac{1}{7} \cdot \dfrac{6 \times 7 \times 13}{6}\, d^2 - 9d^2$
$16 = 4d^2$
$d^2 = 4$
d = 2
$\bar{x} + x_6 = 6 + 9 + 10 + 9 = 34$
7. (c)
8. (c)
$\left(\sum x_i\right)^2 - 2\sum x_i x_j = \sum x_i^2 = 300$; $\dfrac{\sum x_i^2}{10} = 30$
$\sigma^2 = \dfrac{\sum x_i^2}{10} - \left(\dfrac{\sum x_i}{10}\right)^2$; $\sigma = \sqrt{30 - 25} = \sqrt{5}$
9. (b, c)
63.5 to 68.5 is a one standard deviation interval about the mean, hence the percentage of females having heights in this range is 68. 61.0 to 71.0 is a two standard deviation interval about the mean, hence the percentage of females having heights in this range is 95. 58.5 to 73.5 is a three standard deviation interval about the mean, hence the percentage of females having heights in this range is 99.7.
10. (b, c)
The formula $\sigma^2 = \dfrac{N_1 \sigma_1^2 + N_2 \sigma_2^2}{N_1 + N_2}$ holds true for two data sets having $N_1$ and $N_2$ number of elements and variances $\sigma_1^2$ and $\sigma_2^2$ if their means are equal.
Hence x and y have to satisfy 5y = 3x + 2.
11. (a, b, d)
Let us arrange the values in ascending order: $0, 5, 11, 19, 21, 27, 30, 36, 42, 50, 52$
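The two medians in question 11 can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original solution, and the frequency table is expanded into a flat list before taking its median.

```python
from statistics import median

# Data set A from question 11.
data_a = [5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27]
m1 = median(data_a)

# Frequency distribution from question 11: value -> frequency.
freq = {1: 8, 2: 10, 3: 11, 4: 16, 5: 20, 6: 25, 7: 15, 8: 9, 9: 6}

# Expand the table into individual observations (120 in total).
expanded = [x for x, f in freq.items() for _ in range(f)]
m2 = median(expanded)

print("M1 =", m1)
print("M2 =", m2)
print("M1 * M2 =", m1 * m2)
print("both divide 540:", 540 % m1 == 0 and 540 % m2 == 0)
```

With these inputs the sketch gives $M_1 = 27$ and $M_2 = 5$, which is consistent with options (a), (b) and (d).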
$CV = \dfrac{4000}{42500} \times 100 = 9.4\%$
And the coefficient of variation for years of education is $CV = \dfrac{2.5}{16} \times 100 = 15.6\%$. Years of education has a higher relative variation.
14. (a, c)
Mean of first set $= \dfrac{1}{5}(2 + 5 + 8 + 11 + 14) = 8 = \mu_1$
Mean of second set $= \dfrac{1}{3}(2 + 8 + 14) = 8 = \mu_2$
Variance of first set
$\sigma_1^2 = \dfrac{1}{5}\left[(2 - 8)^2 + (5 - 8)^2 + (8 - 8)^2 + (11 - 8)^2 + (14 - 8)^2\right] = 18$
Variance of second set
$\sigma_2^2 = \dfrac{1}{3}\left[(2 - 8)^2 + (8 - 8)^2 + (14 - 8)^2\right] = 24$
The mean of the combined sets
$= \dfrac{2 + 5 + 8 + 11 + 14 + 2 + 8 + 14}{5 + 3} = 8 = \mu$
The variance of the combined set is
$(2 - 8)^2 + (5 - 8)^2 + (8 - 8)^2 + (11 - 8)^2 +$
Now, standard deviation $\sigma = \sqrt{\dfrac{\sum (x - A)^2}{2n}}$
$2 = \sqrt{\dfrac{(a - 0)^2 + (a - 0)^2 + \dots + (-a - 0)^2 + \dots}{2n}} = \sqrt{\dfrac{a^2 \cdot 2n}{2n}} = |a|$
Hence, |a| = 2, i.e. a = 2 or a = −2.
17. (a, b)
Mean of a, b, 8, 5, 10 is 6
$\Rightarrow \dfrac{a + b + 8 + 5 + 10}{5} = 6 \Rightarrow a + b = 7$
$\dfrac{(a - 6)^2 + (b - 6)^2 + (8 - 6)^2 + (5 - 6)^2 + (10 - 6)^2}{5} = 6.80$
$\Rightarrow a^2 - 12a + 36 + (1 - a)^2 + 21 = 34$
$\Rightarrow 2a^2 - 14a + 24 = 0 \Rightarrow a^2 - 7a + 12 = 0$
⇒ a = 3 or 4 ⇒ b = 4 or 3
∴ The possible values of a and b are a = 3 and b = 4 or, a = 4 and b = 3.
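The result of solution 17 can be cross-checked by evaluating the mean and variance for each of the four option pairs. This is only an illustrative sketch: the helper name `mean_and_variance` is hypothetical, and it assumes the population variance (division by n), which is the convention the solutions above use.

```python
from statistics import fmean, pvariance

def mean_and_variance(a, b):
    """Return (mean, population variance) of the data a, b, 8, 5, 10."""
    data = [a, b, 8, 5, 10]
    return fmean(data), pvariance(data)

# The four (a, b) pairs offered as options in question 17.
options = {"a": (3, 4), "b": (4, 3), "c": (3, 3), "d": (4, 4)}

for label, (a, b) in options.items():
    mean, var = mean_and_variance(a, b)
    ok = abs(mean - 6) < 1e-9 and abs(var - 6.8) < 1e-9
    print(f"({label}) a={a}, b={b}: mean={mean}, variance={var}, matches={ok}")
```

Only the pairs (3, 4) and (4, 3) reproduce both the mean 6 and the variance 6.80, in line with the answer (a, b).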
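Mean $= \dfrac{\sum_{i=1}^{n} x_i}{n}$
20. (b, d)
$\bar{x} = \dfrac{2 + 3 + a + 11}{4} = \dfrac{a}{4} + 4$
$\sigma = \sqrt{\dfrac{\sum x_i^2}{n} - (\bar{x})^2}$
$\Rightarrow 3.5 = \sqrt{\dfrac{4 + 9 + a^2 + 121}{4} - \left(\dfrac{a}{4} + 4\right)^2}$
$\Rightarrow \dfrac{49}{4} = \dfrac{4\left(134 + a^2\right) - \left(a^2 + 256 + 32a\right)}{16}$
$\Rightarrow 3a^2 - 32a + 84 = 0$
Integer / Numerical Type
21. [7]
Standard deviation does not change when each data point is changed by the same quantity.
22. [9]
Since the mean of both samples is the same,
$\sigma^2 = \dfrac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2}$
$\sigma^2 = \dfrac{20 \times 1 + 40 \times 2^2}{60} = 3$
23. [1]
Mean deviation is minimum about the median.
24. [8]
Variance $= \dfrac{3^2 + 6^2 + \dots + 30^2}{10} - (16.5)^2 = 74.25$
∴ standard deviation = 8.62
25. [44]
Mean $= 40 = \dfrac{\sum x_i}{100} \Rightarrow \sum x_i = 4000$
and S.D. = 5.1 ⇒ $\sigma^2 = (5.1)^2$
$\therefore \dfrac{\sum x_i^2}{100} - 40^2 = 26.01$
$\therefore \sum x_i^2 = 162601$
∴ Correct $\sum x_i = 3990$, and $\sum x_i^2 = 161701$
∴ Correct mean $= \dfrac{3990}{100} = 39.9$
and correct S.D. $= \sqrt{\dfrac{161701}{100} - (39.9)^2} = \sqrt{25} = 5$
∴ Sum of S.D. and mean = 44.9.
27. [343]
Variance is multiplied by $5^2$.
∴ Variance = 13.75 × 25 = 343.75
28. [64]
The numbers are $a_1 = 6$ and $a_2 = 8$
$\therefore a_2^{\,a_2 - a_1} = 8^2 = 64$
29. [18]
$x_1 + x_2 + x_3 + \dots + x_{10} = 20$
$\sigma = 3 \Rightarrow \dfrac{\sum_{i=1}^{10} (x_i - \bar{x})^2}{10} = 9$
$\sum \left(x_i^2 - 4x_i + 4\right) = 90$
$\sum x_i^2 - 4\sum x_i + 4\sum 1 = 90$
$\sum_{i=1}^{10} x_i^2 - 4 \times 20 + 4 \times 10 = 90$
$\sum_{i=1}^{10} x_i^2 = 90 + 80 - 40 = 130$
Required mean $= \dfrac{\sum x_i^2 + 2\sum x_i + 10}{10} = \dfrac{130 + 40 + 10}{10} = \dfrac{180}{10} = 18$
30. [603]
$\bar{x} = \dfrac{\sum_{i=11}^{41} i}{31} = \dfrac{11 + 41}{2} = 26$ (31 elements)
$\bar{y} = \dfrac{\sum_{j=61}^{91} j}{31} = \dfrac{61 + 91}{2} = 76$ (31 elements)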
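Combined mean, $\mu = \dfrac{31 \times 26 + 31 \times 76}{31 + 31} = \dfrac{26 + 76}{2} = 51$
$\sigma^2 = \dfrac{1}{62} \times \left[\sum_{i=1}^{31} (x_i - \mu)^2 + \sum_{i=1}^{31} (y_i - \mu)^2\right] = 705$
Here $\sum_{i=1}^{31} (x_i - \mu)^2 = \sum_{i=1}^{31} (y_i - \mu)^2$.
$\therefore \bar{X} = \dfrac{1}{N}\sum x_i \Rightarrow \sum x_i = n\bar{X} = 200 \times 40 = 8000$
corrected $\sum x_i$ = incorrect $\sum x_i$ − (sum of incorrect values) + (sum of correct values)
= 8000 – 34 + 43 = 8009
∴ Corrected mean $= \dfrac{\text{corrected } \sum x_i}{n} = \dfrac{8009}{200} = 40.045$
Now, $\sigma = 15 \Rightarrow 15^2 = \dfrac{1}{200}\sum x_i^2 - \left(\dfrac{1}{200}\sum x_i\right)^2$
$\Rightarrow 225 = \dfrac{1}{200}\sum x_i^2 - \left(\dfrac{8000}{200}\right)^2$
$\Rightarrow 225 = \dfrac{1}{200}\sum x_i^2 - 1600$
$\Rightarrow$ incorrect $\sum x_i^2 = 365000$
corrected $\sum x_i^2$ = incorrect $\sum x_i^2$ – (sum of squares of incorrect values) + (sum of squares of correct values)
$= 365000 - (34)^2 + (43)^2 = 365693$
So, corrected $\sigma = \sqrt{\dfrac{1}{n}\sum x_i^2 - \left(\dfrac{1}{n}\sum x_i\right)^2} = \sqrt{\dfrac{365693}{200} - \left(\dfrac{8009}{200}\right)^2}$
33. (b)
where $\mu = \dfrac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2}$ holds true in general.
That gives us $\sigma_{13}^2 = 35.25$
35. (a)
36. (c)
$\bar{x} = \dfrac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
$\sigma^2 = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Mean of $d_1, d_2, d_3, \dots, d_n$
$= \dfrac{d_1 + d_2 + d_3 + \dots + d_n}{n}$
$= \dfrac{(-x_1 - a) + (-x_2 - a) + (-x_3 - a) + \dots + (-x_n - a)}{n}$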
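The effect of the transformation $d_i = -x_i - a$ from Passage-III can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the original solution: the data are borrowed from data set A of question 14, the shift `a = 3` is arbitrary, and the population variance (division by n) is used.

```python
from statistics import fmean, pvariance

# Illustrative data (data set A from question 14) and an arbitrary shift a.
xs = [2, 5, 8, 11, 14]
a = 3

# Passage-III transformation: d_i = -x_i - a.
ds = [-x - a for x in xs]

mean_x, mean_d = fmean(xs), fmean(ds)
var_x, var_d = pvariance(xs), pvariance(ds)

print("mean of x:", mean_x, " mean of d:", mean_d)
print("variance of x:", var_x, " variance of d:", var_d)
```

In this example the mean changes to $-\bar{x} - a$ while the variance stays the same, since multiplying by −1 scales the variance by $(-1)^2 = 1$ and a shift of origin does not affect it.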
Hence $MD = \dfrac{\sum f\,|x - \bar{X}|}{N} = \dfrac{226.5}{100} = 2.26$ inch
Standard deviation can be calculated as shown below
Height (inch)    Class Mark (x)    x − X̄ = x − 67.45    (x − X̄)²    Frequency (f)    f(x − X̄)²
60 − 62          61                –6.45                 41.6025      5                208.0125
63 − 65          64                –3.45                 11.9025      18               214.2450
66 − 68          67                –0.45                 0.2025       42               8.5050
69 − 71          70                2.55                  6.5025       27               175.5675
$\bar{X} = \dfrac{9 + 3 + 8 + 8 + 9 + 8 + 9 + 18}{8} = \dfrac{72}{8} = 9$
$MD = \dfrac{\sum |X - \bar{X}|}{N}$
$= \dfrac{|9 - 9| + |3 - 9| + |8 - 9| + |8 - 9| + |9 - 9| + |8 - 9| + |9 - 9| + |18 - 9|}{8}$
$= \dfrac{0 + 6 + 1 + 1 + 0 + 1 + 0 + 9}{8} = 2.25 = M_2$
Calculations for standard deviations
For $S_1$:
$\sigma_1 = \sqrt{\dfrac{(12 - 9.5)^2 + (6 - 9.5)^2 + (7 - 9.5)^2 + (3 - 9.5)^2 + (15 - 9.5)^2 + (10 - 9.5)^2 + (18 - 9.5)^2 + (5 - 9.5)^2}{8}}$
$= \sqrt{23.75} = 4.87$
For $S_2$:
$\sigma_2 = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N}}$
$= \sqrt{\dfrac{(9 - 9)^2 + (3 - 9)^2 + (8 - 9)^2 + (8 - 9)^2 + (9 - 9)^2 + (8 - 9)^2 + (9 - 9)^2 + (18 - 9)^2}{8}}$
$= \sqrt{15} = 3.87$
Now coefficients of variation
$CV = \dfrac{\sigma}{\bar{x}} \times 100,\ \bar{x} \ne 0$
$C_1 = \dfrac{4.87}{9.5} \times 100 = 51.26$ and
$C_2 = \dfrac{3.87}{9} \times 100 = 43$
39. (b)
(a) Due to the low value 1, mean is not preferred.
(b) Mean, Median, Mode and S.D. are dependent on change of scale.
(c) S.D. is independent of change of origin.
(d) Range is always greater than or equal to S.D.
40. (a)
41. (c)
42. (c)
Sol. (40-42)
$\sigma = \sqrt{\dfrac{\sum x_i^2}{N} - (\bar{x})^2}$
$\sigma = \sqrt{\dfrac{1530}{10} - 144} = 3$
$cv = \dfrac{\sigma}{\bar{x}} \times 100 = 25\%$
43. (d)
44. (a)
45. (b)
Sol. (43-45)
As the data has outliers at 90,000 and 95,000, we should use the median in place of the mean.
Median $= \dfrac{10\text{th value} + 11\text{th value}}{2} = 12000$
For normally distributed data, we may use either mean or median. However, mean is preferred as it includes all the values in the data set for its calculation and any change in any of the scores will affect the value of the mean, which is not the case with median or mode.
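The median quoted in the solution to question 43 can be reproduced with a short script. This is a sketch under one stated assumption: the "Staff" row of the table is read as the number of staff members earning each salary, which gives 20 observations in total. The variable names are illustrative only.

```python
from statistics import fmean, median

# Question 43 table, read as: number of staff at each salary (assumption).
staff = [1, 2, 4, 5, 6, 2]
salary = [15000, 10000, 7000, 12000, 90000, 95000]

# Expand into one salary per staff member and sort in ascending order.
expanded = sorted(s for s, n in zip(salary, staff) for _ in range(n))

print("number of observations:", len(expanded))
print("10th and 11th values:", expanded[9], expanded[10])
print("median:", median(expanded))
print("mean:", fmean(expanded))
```

Under that reading the 10th and 11th values are both 12000, so the median is 12000, while the mean is pulled far upward by the salaries of 90,000 and 95,000.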