MMPC 05 Ebook
Includes very important questions only
Based on syllabus and marks
Easy language
Easy to understand
Correct solutions
Writer: Self Gyan
MMPC 05 Quantitative Analysis for Managerial Applications
FIRST PRIORITY MOST IMPORTANT QUESTIONS
Q1- What is probability sampling and non-probability sampling? Describe the four designs of probability sampling. (v v v v v imp)
Ans – TYPES OF SAMPLING
There are two basic types of sampling depending on who or what is allowed to govern the selection
of the sample. We shall call them by the names of probability sampling and non-probability sampling.
Probability Sampling
In probability sampling the decision whether a particular element is included in the sample or not, is
governed by chance alone. All probability sampling designs ensure that each element in the
population has some nonzero probability of getting included in the sample. This would mean
defining a procedure for picking up the sample, based on chance, and avoiding changes in the
sample except by way of a pre-defined process again. The picking up of the sample is therefore
totally insulated against the judgment, convenience or whims of any person involved with the study.
That is why probability sampling procedures tend to become rigorous and at times quite time-consuming to ensure that each element has a nonzero probability of getting included in the sample. On the other hand, when probability sampling designs are used, it is possible to quantify the
magnitude of the likely error in inference made and this is of great help in many situations in
building up confidence in the inference.
Non-probability Sampling
Any sampling process which does not ensure some nonzero probability for each element in the
population to be included in the sample would belong to the category of non-probability sampling.
In this case, samples may be picked up based on the judgment or convenience of the enumerator.
Usually, the complete sample is not decided at the beginning of the study but it evolves as the study
progresses. However, the very same factors which govern the selection of a sample e.g. judgment or
convenience, can also introduce biases in the study. Moreover, there is no way that the magnitude
of errors can be quantified when non-probability sampling designs are used. Many times samples
are selected by interviewers or enumerators "at random" meaning that the actual sample selection
is left to the discretion of the enumerators. Such a sampling design would also belong to the non-
probability sampling category and not the category of probability or random sampling.
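Reader's note: the four probability sampling designs usually asked for here are simple random sampling, systematic sampling, stratified sampling and cluster sampling (systematic and stratified sampling are discussed later in these notes). As a rough illustration of the difference between chance-governed and convenience-based selection, here is a minimal Python sketch; the frame of 1,000 customer IDs and the sample size of 50 are purely illustrative assumptions.

import random

# Hypothetical frame of 1,000 customer IDs (illustrative only)
frame = list(range(1, 1001))

# Probability sampling: chance alone decides who enters the sample,
# and every element has a known, nonzero probability of selection.
probability_sample = random.sample(frame, k=50)   # simple random sample of size 50

# Non-probability sampling: selection governed by convenience or judgment,
# e.g. simply taking the first 50 names in the frame.
convenience_sample = frame[:50]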
Q2- What is skewness? Distinguish between Karl Pearson's and Bowley's coefficients of skewness. Which one of these would you prefer and why? (v v v v v imp) Or: what is RELATIVE SKEWNESS?
Ans – SKEWNESS
The measures of central tendency and variation do not reveal all the characteristics of a given set of
data. For example, two distributions may have the same mean and standard deviation but may
differ
widely in the shape of their distribution. Either the distribution of data is symmetrical or it is not. If
the distribution of data is not symmetrical, it is called asymmetrical or skewed. Thus skewness refers
to the lack of symmetry in distribution.
A simple method of detecting the direction of skewness is to consider the tails of the distribution
(Figure I). The rules are:
Data are symmetrical when there are no extreme values in a particular direction so that low and
high values balance each other. In this case, mean = median = mode. (see Fig I(a) ).
If the longer tail is towards the lower value or left hand side, the skewness is negative. Negative
skewness arises when the mean is decreased by some extremely low values, thus making mean <
median < mode. (see Fig I(b) )
If the longer tail of the distribution is towards the higher values or right hand side, the skewness is
positive. Positive skewness occurs when mean is increased by some unusually high values, thereby
making mean > median > mode. (see Fig I(c) )
RELATIVE SKEWNESS
In order to make comparisons between the skewness in two or more distributions, the coefficient of
skewness (given by Karl Pearson) can be defined as:
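Reader's note (the original formula is not reproduced in these notes; the following are the standard definitions):
Karl Pearson's coefficient of skewness: Sk_p = (Mean - Mode) / Standard Deviation, or, when the mode is ill-defined, Sk_p = 3 (Mean - Median) / Standard Deviation.
Bowley's coefficient of skewness (based on quartiles): Sk_b = (Q3 + Q1 - 2 Median) / (Q3 - Q1).
Bowley's measure is usually preferred when the distribution is open-ended or the mode is indeterminate, since it uses only the quartiles and the median; Pearson's measure uses every observation through the mean and the standard deviation.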
Q3- Discuss the different approaches to probability theory. All these approaches share some basic axioms; state these. (v v v v v imp)
Probability, in common parlance, connotes the chance of occurrence of an event or happening. In
order that we are able to measure it, a more formal definition is required. This is achieved through
the study of certain basic concepts in probability theory, like experiment, sample space and event. In
this section we explore these concepts.
Experiment- The term experiment is used in probability theory in a much broader sense than in
physics or chemistry. Any action, whether it is the tossing of a coin, or measurement of a product's
dimension to ascertain quality, or the launching of a new product in the market, constitutes an experiment in probability theory terminology.
Sample Space- The set of all possible outcomes of an experiment is defined as the sample space.
Each outcome is thus visualised as a sample point in the sample space. Thus, the set (head, tail)
defines the sample space of a coin tossing experiment. Similarly, (success, failure) defines the
sample space for the launching experiment. You may note here, that given any experiment, the
sample space is fully determined by listing down all the possible outcomes of the experiment.
Event- An event, in probability theory, constitutes one or more possible outcomes of an experiment.
Thus, an event can be defined as a subset of the sample space. Unlike the common usage of the
term, where an event refers to a particular happening or incident, here, we use an event to refer to
a single outcome or a combination of outcomes. Suppose, as a result of a market study experiment
of a product, we find that the demand for the product for the next month is uncertain, and may take
values from 100, 101, 102, ..., 150. We can obtain different events by combining one or more of these outcomes.
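As an illustrative sketch in Python (the particular events chosen here are assumptions for illustration only), the sample space and a few events for this demand example could be written as:

# Sample space: every possible demand value for next month
sample_space = set(range(100, 151))          # 100, 101, ..., 150

# Events are subsets of the sample space
event_high_demand = {d for d in sample_space if d > 140}     # "demand exceeds 140"
event_exactly_120 = {120}                                     # a single-outcome event
event_even_demand = {d for d in sample_space if d % 2 == 0}   # a combination of outcomes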
Q4- What is the chi-square distribution? How would you use it in testing the goodness of fit and testing independence of categorised data? (v v v v v imp) Or explain the SAMPLING DISTRIBUTION OF THE VARIANCE.
By now it is implicitly clear that we use the sample mean to estimate the population mean, when
that parameter is unknown. Similarly, we use a sample statistic called the sample variance to estimate the population variance. The sample variance is usually denoted by s² and it again captures some kind of an average of the square of deviations of the sample values from the sample mean. Let us put it in an equation form:
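A reconstruction of the standard formula (the original expression is not reproduced in these notes):
s² = Σ (xi - x̄)² / (n - 1),
where the sum runs over the n sample values xi and x̄ is the sample mean.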
Consequently, we can calculate the sample variance based only on the sample values without
knowing the value of any population parameter. The division by (n - 1) is due to a technical reason, to make the expected value of s² equal σ², which it is supposed to estimate.
If the random variable x has the standard normal distribution, what would be the distribution of x²? Intuitively speaking, it would be quite different from a normal distribution because now x², being a squared term, can assume only non-negative values. The probability density of x² will be the highest near 0, because most of the x values are close to 0 in a standard normal distribution. This
distribution is called the chi-square distribution with 1 degree of freedom and is shown in Figure II
below.
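(Figure II is not reproduced in these notes.) To connect the chi-square distribution with the two uses asked about in the question, here is a small illustrative Python sketch using scipy; the observed die counts and the 2 x 2 table are made-up numbers.

from scipy import stats

# Goodness of fit: do observed die frequencies match a fair-die expectation?
observed = [18, 22, 19, 21, 16, 24]            # counts from 120 illustrative rolls
expected = [20] * 6                            # fair die: 120 / 6 = 20 per face
chi2_gof, p_gof = stats.chisquare(f_obs=observed, f_exp=expected)

# Test of independence: are preference and region independent?
contingency = [[30, 10],                       # illustrative 2 x 2 table of counts
               [20, 40]]
chi2_ind, p_ind, dof, expected_freq = stats.chi2_contingency(contingency)
# Small p-values lead to rejection of the hypothesis of good fit / independence.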
Q5- What do you mean by a statistical hypothesis? Explain the characteristics of a good hypothesis. Elaborate on the concepts of the significance level and the p-value of a test. (v v v v v imp)
We shall now discuss some concepts which will come in handy when we attempt to set up a
procedure for testing of hypothesis.
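Symbolically, for the purchase manager's problem used as the running example, the hypotheses can be written (a reconstruction from the text that follows, since the original expression is not reproduced) as:
Ho: µ ≤ 20 (the population mean hardness is not greater than 20)
H1: µ > 20 (the population mean hardness is greater than 20)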
What we have represented symbolically above can be interpreted to mean that the null hypothesis
is that the population mean is not greater than 20, whereas the alternative hypothesis is that the
population mean is greater than 20. It is clear that both Ho and H1 cannot be true and also that one of them will always be true. At the end of our testing procedure, if we come to the conclusion that Ho should be rejected, this also amounts to saying that H1 should be accepted and vice versa. It is not difficult to identify the pair of hypotheses relevant in any decision situation. Can any one of the two be called the null hypothesis? The answer is a big no, because the roles of Ho and H1 are not symmetrical. One can conceptualise the whole procedure of testing of hypothesis as trying to
answer one basic question: Is the sample evidence strong enough to enable us to reject Ho? This
means that Ho will be rejected only when there is strong sample evidence against it. However, if the
sample evidence is not strong enough, we shall conclude that we cannot reject Ho and so we accept
Ho by default. Thus, Ho is accepted even without any evidence in support of it whereas it can be
rejected only when there is an overwhelming evidence against it. Perhaps the problem faced by the
purchase manager in 15.1 above will help us in understanding the role of the null hypothesis better.
The new supplier has claimed that his castings have higher hardness than the competitor's. The
mean hardness of castings supplied by the existing suppliers is 20 and so the purchase manager can
test the claim of the new supplier by setting up the following hypotheses:
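Ho: µ ≤ 20 against H1: µ > 20 (reconstructed as above, since the original expression is not reproduced).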
In such a case, the purchase manager will reject the null hypothesis only when the sample evidence
is overwhelmingly against it-e.g. if the sample mean from the sample of castings supplied by the new
supplier is 30 or 40, this evidence might be taken to be overwhelmingly strong so that Ho can be rejected and so purchase effected from the new supplier. On the other hand, if the sample mean is 20.1 or 20.2, this evidence may be found to be too mild to reject Ho, so that Ho is accepted even
when the sample evidence is against it.
In other words, the decision maker is somewhat biased towards the null hypothesis and he does not
mind accepting the null hypothesis. However, he would reject the null hypothesis only when the
sample evidence against the null hypothesis is too large to be ignored. We shall explore the reasons
for this bias below. The null hypothesis is called by this name because in
many situations, acceptance of this hypothesis would lead to null action. For example, if our
purchase manager accepts the null hypothesis, he would continue to buy castings from the existing
suppliers and so status quo ante would be maintained. On the other hand, rejecting the null
hypothesis would lead to a change in status quo ante and purchase is to be made from the new
supplier.
If we wrongly reject Ho when in reality Ho is true, the error is called a type I error. Similarly, when we wrongly accept Ho when Ho is false, the error is called a type II error. Let us go back to the
decision problem faced by the purchase manager, referred to in the Null Hypothesis above. If the
purchase manager rejects Ho and places orders with the new supplier when the mean hardness of
the castings supplied by the new supplier is in reality no better than the mean hardness of castings
supplied by the existing suppliers, he would be making a type I error. In this situation, a type II error
would mean not to buy castings from the new supplier when his castings are really better. Both
these errors are bad and should be reduced to the minimum. However, they can be completely
eliminated only when the full population is examined-in which case there would be no practical
utility of the testing procedure. On the other hand, for a given sample size, these two errors
neutralise each other, as we shall see later in this unit. This implies that if the testing procedure is so designed as to reduce the probability of occurrence of type I error, simultaneously the probability of type II error would go up and vice versa. What can at best be achieved is a reasonable balance
between these two errors. In all testing of hypothesis procedures, it is implicitly assumed that type I
error is much more severe than type II error and so needs to be controlled. If we go back to the
purchase manager's problem, we shall notice that type I error would result in a real financial loss to
the company since the company would have switched from the existing suppliers to the new
supplier who is in reality no better. The new castings are no better and perhaps worse than the
earlier ones, thus affecting the quality of the final product (machine tools) produced. On top of it, the
new supplier might be given a higher rate for his castings as these have been claimed to have higher
hardness. And finally, there is a cost associated with any change.
Compared to this, type II error in this situation would result in an opportunity loss since the
company would forego the opportunity of using better castings. The greater the difference in costs
between type I and type II errors, the stronger would be the evidence needed to be able to reject
Ho-i.e. the probability of type I error would be kept down to lower limits. It is to be noted that type I
error occurs only when Ho is wrongly rejected.
In all tests of hypothesis, type I error is assumed to be more serious than type II error and so the
probability of type I error needs to be explicitly controlled. This is done through specifying a
significance level at which a test is conducted. The significance level, therefore, sets a limit to the
probability of type I error and test procedures are designed so as to get the lowest probability of
type II error subject to the significance level. The probability of type I error is usually represented by
the symbol α (read as alpha) and the probability of type II error by β (read as beta).
Suppose we have set up our hypotheses as follows:
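A reconstruction consistent with the discussion that follows (the original expression is not reproduced):
Ho: µ = 50 against H1: µ ≠ 50.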
We would perhaps use the sample mean x̄ to draw inferences about the population mean µ. Also, since we are biased towards Ho, we would be compelled to reject Ho only when the sample evidence is strongly against it. For example, we might decide to reject Ho only when x̄ > 52 or x̄ < 48.
Now suppose that Ho is in reality true, i.e. the true value of µ is 50. In that case, if the population distribution is normal or if the sample size is sufficiently large (n > 30), the distribution of x̄ will be normal as shown in Figure I above. Remember that our criterion for rejecting Ho states that if x̄ < 48 or x̄ > 52, we shall reject Ho. Referring to Figure I, we find that the shaded area (under both tails of
the distribution of x̄) represents the probability of rejecting Ho when Ho is true, which is the same as the probability of type I error. All tests of hypotheses hinge upon this concept of the significance level and it is possible that a null hypothesis can be rejected at α = .05 whereas the same evidence is not strong enough to reject the null hypothesis at α = .01. In other words, the inference drawn can be sensitive to the significance level used. Testing of hypothesis suffers from the limitation that the financial or the economic costs of consequences are not considered explicitly. In practice, the significance level is supposed to be arrived at after considering the cost consequences. It is very difficult to specify the ideal value of α in a specific situation; we can only give a guideline that the higher the difference in costs between type I error and type II error, the greater is the importance of type I error as compared to type II error. Consequently, the risk or probability of type I error should be lower, i.e. the value of α should be lower. In practice, most tests are conducted at α = .01, α = .05 or α = .1 by convention as well as by convenience.
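As a rough illustration of conducting such a test at a chosen significance level, here is a minimal Python sketch using scipy; the hardness figures are made-up numbers and a one-sample t-test is used only as an example of the general procedure.

from scipy import stats

# Illustrative hardness measurements from castings of the new supplier
sample = [20.8, 21.5, 19.9, 22.3, 21.1, 20.6, 21.9, 22.0, 20.2, 21.4]

# Ho: mu <= 20 against H1: mu > 20, tested at significance level alpha = 0.05
t_stat, p_value = stats.ttest_1samp(sample, popmean=20, alternative='greater')

alpha = 0.05
if p_value < alpha:
    print("Reject Ho: the sample evidence supports the higher-hardness claim")
else:
    print("Cannot reject Ho: the sample evidence is not strong enough")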
SECOND PRIORITY MOST IMPORTANT QUESTIONS
Q6- What is mode and modal class? State the equation for obtaining the mode from grouped data. What are the various elements in the equation for the mode? (v v v v v imp)
Ans – MODE
The mode is the typical or commonly observed value in a set of data. It is defined as the value which
occurs most often or with the greatest frequency. The dictionary meaning of the term mode is 'most usual'. For example, in the series of numbers 3, 4, 5, 5, 6, 7, 8, 8, 8, 9, the mode is 8 because it occurs
the maximum number of times. The calculations are different for the grouped data, where the
modal class is defined as the class with the maximum frequency. The following formula is used for
calculating the mode.
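A reconstruction of the usual grouped-data formula (the original expression is not reproduced in these notes):
Mode = L + [(f1 - f0) / (2 f1 - f0 - f2)] × h,
where L is the lower limit of the modal class, f1 the frequency of the modal class, f0 the frequency of the class preceding the modal class, f2 the frequency of the class following the modal class, and h the width of the modal class.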
The chief advantage of the mode is that it is, by definition, the most representative value of the
distribution. For example, when we talk of modal size of shoe or garment, we have this average in
mind. Like median, the value of mode is not affected by extreme values and its value can be
determined in open-end distributions. The main disadvantage of the mode is its indeterminate
value, i.e., we cannot calculate its value precisely in a grouped data, but merely estimate it. When a
given set of data has two or more values with the maximum frequency, it is a case of bimodal or
multimodal distribution and the value of mode cannot be determined. The mode has no useful
mathematical properties. Hence, in actual practice the mode is more important as a conceptual idea
than as a working average.
A stratum can therefore be conceived of as a sub-population which is more homogeneous than the
complete population-the members of a stratum, are similar to each other and are different from the
members of another stratum in the characteristics that we are measuring.
Proportional stratified sampling: After defining the strata, a simple random sample is picked up
from each of the strata. If we want to have a total sample of size 100, this number is allocated to the
different strata-either in proportion to the size of the stratum in the population or otherwise. If the
different strata have similar variances of the characteristic being measured, then the statistical
efficiency will be the highest if the sample sizes for different strata are in the same proportion as the
size of the respective stratum in the population. Such a design is called proportional stratified
sampling and is shown in Table 4 below. If we want to pick up a proportional stratified sample of size
n from a population of size N, which has been stratified to p different strata with sizes N1,
N2, ..., Np respectively, then the sample sizes for different strata, viz. n1, n2, ..., np will be given
by
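ni = n × (Ni / N), for i = 1, 2, ..., p (a reconstruction of the standard proportional-allocation formula, since the original expression is not reproduced).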
Q8- What do you understand by time series analysis? Describe the components of a time series. What relation is generally assumed between the three components? (v v v v v imp)
Ans – Time Series Analysis or Stochastic Models
Stochastic models, of which time series analysis and the Box-Jenkins models are examples, are also used for forecasting.
The demand or variable of interest when plotted as a function of time yields what is commonly
called a `time-series'. This plot of demand at equal time intervals may show random patterns of
behaviour and our objective in Models on Historical Data was to identify the basic underlying
pattern that should be used to explain the data. After hypothesising a model (linear, parabolic or
other) regression was used to estimate the model parameters, using the criterion of minimising the
sum of squares of errors.
The observed value of the time series could then be expressed as a product (or some other function)
of the above factors. Another treatment that may be given to a time series is to use the framework developed by Box and Jenkins (1976) in which a stochastic model of the autoregressive (AR) variety, moving average (MA) variety, mixed autoregressive-moving average
variety (ARMA) or an integrated autoregressive-moving average variety (ARIMA) model may be
chosen. Stochastic models are inherently complicated and require greater efforts to construct.
However, the quality of forecasting generally improves. Computer codes are available to implement
the procedures [see for instance Box and Jenkins (1976)].
Box and Jenkins (1976) have proposed a sophisticated methodology for stochastic model building
and forecasting using time series. The purpose of this section is merely to acquaint you with some of
the terms, models and methodology developed by Box and Jenkins. A time series may be classified
as stationary (in equilibrium about a constant mean value) or non-stationary (when the process has
no natural or stable mean). In stochastic model building, a non-stationary process is often converted to a stationary one by differencing. The two major classes of models used popularly in
time series analysis are Auto-regressive and Moving Average models.
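As an illustrative numerical sketch of these ideas (the series below are simulated, not taken from the original text), the following Python code generates a non-stationary random walk, makes it stationary by first differencing, and simulates a stationary AR(1) series:

import numpy as np

rng = np.random.default_rng(0)
n = 200
noise = rng.normal(0, 1, n)            # the random shocks a(t)

# A non-stationary series: a random walk has no stable mean
random_walk = np.cumsum(noise)

# First differencing converts the random walk into a stationary series
differenced = np.diff(random_walk)

# A stationary AR(1) series: z(t) = 0.6 * z(t-1) + a(t)
z = np.zeros(n)
for t in range(1, n):
    z[t] = 0.6 * z[t - 1] + noise[t]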
MMPC 05 NUMERICAL QUESTIONS VERY IMPORTANT
Ans – For the method of calculation, a video has been uploaded on the Self Gyan channel; please search the MS 08 Self Gyan unit-wise videos. MMPC 05 and MS 08 share this unit.
Q2- Given a frequency distribution, calculate the quartile deviation and its coefficient. (v v v v v imp)
Ans – For the method of calculation, a video has been uploaded on the Self Gyan channel; please search the MS 08 Self Gyan unit-wise videos. MMPC 05 and MS 08 share this unit.
Ans – Please see Unit 14, Testing for the Significance of the Correlation Coefficient.
Q2- Exponential smoothing for demand forecasting? (v v v v v imp)
Ans – Exponential smoothing is an averaging technique where the weightage given to the past
data declines (at an exponential rate) as the data recedes into the past. Thus all the values are taken
into consideration, unlike in moving averages, where all data points prior to the period of the
Moving Average are ignored. If Ft is the one-period-ahead forecast made at time t and Dt is the demand for period t, then
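the usual updating equation (a reconstruction, since the original expression is not reproduced) is
F(t+1) = α D(t) + (1 - α) F(t), where 0 < α < 1 is the smoothing constant.
A minimal Python sketch, with illustrative monthly demand figures:

def exponential_smoothing(demand, alpha=0.2, initial_forecast=None):
    # F(t+1) = alpha * D(t) + (1 - alpha) * F(t)
    forecast = initial_forecast if initial_forecast is not None else demand[0]
    forecasts = [forecast]            # forecasts[i] is available before observing demand[i]
    for d in demand:
        forecast = alpha * d + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts                  # last element is the forecast for the next, unseen period

print(exponential_smoothing([100, 110, 105, 120, 118], alpha=0.3))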
Q4- Poisson Process and Poisson Distribution? (v v v v v imp)
Ans – Conditions specific to the Poisson process are easily seen by establishing them in the context
of the Bernoulli process. Let us consider a Bernoulli process with n trials and probability p of success in each trial.
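Reader's note completing the argument: if n is made large and p small while the mean number of successes np = λ is held constant, the binomial probabilities of the Bernoulli process approach the Poisson distribution, whose probability mass function is
P(X = k) = e^(-λ) λ^k / k!, for k = 0, 1, 2, ...
Here λ is the mean rate of occurrence per unit of time (or space) in the Poisson process.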
Q5- QUANTILES? (v v v v v imp)
Ans – Quantiles are the related positional measures of central tendency. These are useful and
frequently employed measures of non-central location. The most familiar quantiles are the quartiles,
deciles, and percentiles. Quartiles: Quartiles are those values which divide the total data into four
equal parts. Since three points divide the distribution into four equal parts, we shall have three
quartiles. Let us call them Q1, Q2, and Q3. The first quartile, Q1, is the value such that 25% of the
observations are smaller and 75% of the observations are larger. The second quartile, Q2, is the
median, i.e., 50% of the observations are smaller and 50% are larger. The third quartile, Q3, is the
value such that 75% of the observations are smaller and 25% of the observations are larger.
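As an illustrative Python sketch (the data values are made-up), the quartiles, deciles and percentiles can be computed with numpy:

import numpy as np

data = [12, 15, 17, 19, 21, 22, 25, 28, 30, 34, 40]   # illustrative observations

q1 = np.quantile(data, 0.25)    # first quartile: 25% of observations lie below it
q2 = np.quantile(data, 0.50)    # second quartile = median
q3 = np.quantile(data, 0.75)    # third quartile: 75% of observations lie below it
deciles = [np.quantile(data, k / 10) for k in range(1, 10)]
p90 = np.percentile(data, 90)   # 90th percentile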
Q6- Non-Probability Sampling Methods? (v v v v v imp)
Ans – Probability sampling has some theoretical advantages over non-probability sampling. The
bias introduced due to sampling could be completely eliminated and it is possible to set a confidence
interval for the population parameter that is being studied. In spite of these advantages of
probability sampling, non-probability sampling is used quite frequently in many sampling surveys.
This is so mainly because of practical considerations.
Probability sampling requires a list of all the sampling units and this frame is not available in many
situations nor is it practically feasible to develop a frame of say all the households in a city or zone or
ward of a city. Sometimes the objective of the study may not be to draw a statistical inference about
the population but to get familiar with extreme cases or other such objectives. In a dealer survey,
our objective may be to get familiar with the problems faced by our dealers so that we can take
some corrective actions, wherever possible. Probability sampling is rigorous and this rigour e.g. in
selecting samples, adds to the cost of the study. And finally, even when we are doing probability
sampling, there are chances of deviations from the laid out process especially where some samples
are selected by the interviewers at site-say after reaching a village. Also, some of the sample
members may not agree to be interviewed or may not be available to be interviewed, and our sample may turn out to be a non-probability sample in the strictest sense of the term.
Ans – In fitting a straight line (or any other function) to a set of data points we would expect some
points to fall above or below the line resulting in both positive and negative error terms (see Figure
II). It is true that we would like the overall error to be as small as possible. The most common
criterion in the determination of model parameters is to minimise the sum of squares of errors, or
residuals as they are often called. This is known as the least squares criterion, and is the one most
commonly used in regression analysis.
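As a rough illustration of the least squares criterion (the data points are made-up numbers), a straight line can be fitted in Python as follows:

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1])

# Fit y = b0 + b1 * x by minimising the sum of squared residuals
b1, b0 = np.polyfit(x, y, deg=1)        # polyfit returns the slope first, then the intercept

residuals = y - (b0 + b1 * x)           # some positive, some negative, as expected
sse = np.sum(residuals ** 2)            # the quantity the least squares criterion minimises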
Ans – The quartile deviation, also known as semi-interquartile range, is computed by taking the
average of the difference between the third quartile and the first quartile. In symbols, this can be
written as:
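Q.D. = (Q3 - Q1) / 2, and the coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1) (a reconstruction of the standard formulas, since the original expression is not reproduced).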
SECOND PRIORITY MOST IMPORTANT SHORT NOTES
QUESTIONS
Ans – In this section we shall discuss one of the most important results of applied statistics which
is also known by the name of the central limit theorem.
We need to use the central limit theorem when the population distribution is either unknown or known to be non-normal. If the population distribution is known to be normal, then x̄ will also be distributed normally, as we have seen above, irrespective of the sample size.
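Reader's note: the theorem itself states that for a sufficiently large sample size n (commonly taken as n > 30), the sampling distribution of the sample mean x̄ is approximately normal with mean µ and standard deviation σ/√n, whatever the shape of the population distribution.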
Q12- The Power Curve of a Test? (v v v v v imp)
Ans - Let us go back to the purchase manager's problem referred to earlier where we set up our
hypotheses as follows
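(The worked figures of the original are not reproduced here.) As before, the hypotheses are Ho: µ ≤ 20 against H1: µ > 20. Briefly, the power of a test is the probability of rejecting Ho when it is in fact false, i.e. power = 1 - β. The power curve plots this probability against the possible true values of the parameter (here µ); a good test has a power curve that rises steeply as the true value moves further away from the hypothesised value.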
Q13- Moving Average models? Mixed Auto-regressive-moving average
models? (v v v v v imp) DISCUSSED IN stochastic model building and
forecasting
Ans - Moving Average models- Another kind of model of great importance is the moving average
model where Zt is made linearly dependent on a finite number q of previous a's (error terms)
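In the usual Box-Jenkins notation this may be written (a reconstruction, since the original expression is not reproduced) as
Z(t) = a(t) - θ1 a(t-1) - θ2 a(t-2) - ... - θq a(t-q),
called an MA(q) model. A mixed autoregressive-moving average (ARMA) model combines p autoregressive terms in Z with q moving-average terms in a.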
The main contribution of Box and Jenkins is the development of procedures for identifying the
ARMA model that best fits a set of data and for testing the adequacy of that model. The various
stages identified by Box and Jenkins in their iterative approach to model building are identification, estimation and diagnostic checking. For details on how such models are developed, refer to Box and Jenkins.
Ans - A frequently used relative measure of variation is the coefficient of variation, denoted by C.V.
This measure is simply the ratio of the standard deviation to mean expressed as the percentage.
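In symbols (a reconstruction of the standard formula, since the original expression is not reproduced): C.V. = (Standard Deviation / Mean) × 100.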
Q15- Stratified Sampling? (v v v v v imp) DISCUSSED IN PROBABILITY
SAMPLING METHODS
The concept: Suppose we are interested in estimating the demand of non-aerated beverages in a
residential colony. We know that the consumption of these beverages has some relationship with
the family income and that the families residing in this colony can be classified into three categories-
viz., high income, middle income and low income families. If we are doing a sampling study we
would like to make sure that our sample does have some members from each of the three
categories-perhaps in the same proportion as the total number of families belonging to that
category-in which case we would have used proportional stratified sampling. On the other hand, if
we know that the variation in the consumption of these beverages from one family to another is
relatively large for the low income category whereas there is not much variation in the high income
category, we would perhaps pick up a smaller than proportional sample from the high income
category and a larger than proportional sample from the low income category. This is what is done in disproportional stratified sampling. The basis for using
stratified sampling is the existence of strata such that each stratum is more homogeneous within
and markedly different from another stratum. The higher the homogeneity within each stratum, the
higher the gain in statistical efficiency due to stratification.
What are strata?: The strata are so defined that they constitute a partition of the population-i.e.,
they are mutually exclusive and collectively exhaustive. Every element of the population belongs to
one stratum and not more than one stratum, by definition.
Q17- Pascal Distribution? (v v v v v imp)
Ans – Suppose we are interested in finding the p.m.f. of the number of trials (n) required to get 5
successes, given the probability p, of success in any trial.
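For the case described (r = 5 successes), the standard result, known as the Pascal or negative binomial distribution, is
P(N = n) = C(n-1, 4) p^5 (1-p)^(n-5), for n = 5, 6, 7, ...;
more generally, for r successes, P(N = n) = C(n-1, r-1) p^r (1-p)^(n-r). (A reconstruction, since the original expression is not reproduced.)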
Ans – Quite often data is available in the form of some ranking for different variables. It is common
to resort to rankings on a preferential basis in areas such as food testing, competitive events (e.g.
games, fashion shows, or beauty contests) and attitudinal surveys. The primary purpose of
computing a correlation coefficient in such situations is to determine the extent to which the two
sets of rankings are in agreement. The coefficient that is determined from these ranks is known as
Spearman's rank correlation coefficient, r.
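The usual computing formula (a reconstruction, since the original expression is not reproduced) is
rs = 1 - [6 Σ di²] / [n (n² - 1)],
where di is the difference between the two ranks assigned to the i-th item and n is the number of items ranked.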
Q20- Systematic Sampling? (v v v v v imp)
Ans – Systematic sampling proceeds by picking up one element after a fixed interval depending on
the sampling ratio. For example, if we want to have a sample of size 10 from a population of size
100, our sampling ratio would be n/N = 10/100 = 1/10. We would, therefore, have to decide where
to start from among the first 10 names in our frame. If this number happens to be 7 for example,
then the sample would contain members having serial numbers 7,17,27, ........97 in the frame. It is to
be noted that the random process establishes only the first member of the sample; the rest follow automatically because of the known sampling ratio.
Systematic sampling in the previous example would choose one out of ten possible samples each
starting with either number 1, or number 2, or ....number 10. This is usually decided by allowing
chance to play its role-e.g. by using a table of random numbers.
Systematic sampling is relatively much easier to implement compared to simple random sampling.
However, there is one possibility that should be guarded against while using systematic sampling-the
possibility of a strong bias in the results if there is any periodicity in the frame that parallels the
sampling ratio. One can give some ridiculously simple examples to highlight the point. If you were
making studies on the demand for various banking transactions in a bank branch by studying the
demand on some days randomly selected by systematic sampling-be sure that your sampling ratio is
not 1/7 or 1/14 etc. Otherwise you would always be studying the demand on the same day of the
week and your inferences could be biased depending on whether the day selected is a Monday or a
Friday and so on. Similarly, when the frame contains addresses of flats in buildings all alike and
having say 12 flats in one building, systematic sampling with a sampling ratio of 1/6, 1/60 or any
other such fraction would bias your sample with flats of only one type-e.g. a ground floor corner flat
i.e., all types of flats would not be members of your sample; and this might lead to biases in the
inference made.
If the frame is arranged in an order, ascending or descending, of some attribute, then the location of
the first sample element may affect the result of the study. For example, if our frame contains a list
of students arranged in a descending order of their percentage in the previous examination and we
are picking a systematic sample with a sampling ratio of 1/50. If the first number picked is 1 or 2,
then the sample chosen will be academically much better off compared to another systematic
sample with the first number chosen as 49 or 50. In such situations, one should devise ways of
nullifying the effect of bias due to starting number by insisting on multiple starts after a small cycle
or other such means. On the other hand, if the frame is so arranged that similar elements are
grouped together, then systematic sampling produces almost a proportional stratified sample and
would be, therefore, more statistically efficient than simple random sampling. Systematic sampling is
perhaps the most commonly used method among the probability sampling designs and for many
purposes e.g. for estimating the precision of the results, systematic samples are treated as simple
random samples.
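As a rough illustration (the frame of 100 serial numbers is an assumption for the example), systematic sampling can be sketched in Python as:

import random

def systematic_sample(frame, n):
    # Pick a systematic sample of size n from the frame (a list of elements).
    k = len(frame) // n                        # sampling interval, e.g. 100 / 10 = 10
    start = random.randint(1, k)               # the random start decides the whole sample
    return [frame[i - 1] for i in range(start, len(frame) + 1, k)][:n]

frame = list(range(1, 101))                    # serial numbers 1 to 100
print(systematic_sample(frame, 10))            # e.g. [7, 17, 27, ..., 97] if the start is 7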