R_-_III_UNIT[1]

The document covers fundamental concepts in statistics, including definitions and comparisons of descriptive and inferential statistics, types of data, measures of central tendency, and various statistical measures such as variance and standard deviation. It also explains probability concepts, events, and methods for calculating probabilities, including classical, relative frequency, and subjective probabilities. Additionally, it discusses graphical representations of data and the characteristics of distributions, including skewness and kurtosis.

Uploaded by

Lishanth

Statistical Computing and R Programming – III UNIT

2 marks:
1. What is Statistics? Mention its types.
Statistics is a branch of mathematics that involves the collection, analysis, interpretation,
presentation, and organization of data. It can be categorized into two main types:
• Descriptive Statistics
• Inferential Statistics
2. Compare Descriptive Statistics and Inferential statistics.
Descriptive statistics summarize and present data in a meaningful way. They include measures of
central tendency (mean, median, mode) and measures of variability (range, standard deviation).
Inferential statistics involve making predictions or inferences about a population based on a sample
of data. They include hypothesis testing, estimating parameters, and making predictions using
regression analysis.
3. What are the four Types of Data & Measurement Scales?
• Nominal Data: Categorical data without any inherent order or ranking.
• Ordinal Data: Categorical data with a meaningful order or ranking.
• Interval Data: Numeric data with a consistent interval between values but no true zero point.
• Ratio Data: Numeric data with a consistent interval between values and a true zero point.
4. What are nominal data and ordinal data? Give an example.
• Nominal Data: Nominal data are categories without any inherent order.
Example: Colors (Red, Blue, Green)
• Ordinal Data: Ordinal data are categories with a meaningful order.
Example: Education levels (High School, Bachelor's, Master's)
5. What are interval data and ratio data? Give an example.
• Interval Data: Numeric data with a consistent interval between values but no true zero point.
Example: Temperature in Celsius.
• Ratio Data: Numeric data with a consistent interval between values and a true zero point.
Example: Height in centimeters.
6. Define Measure of Central Tendency. List its types.
Measures of central tendency yield information about the center, or middle part, of a group of
numbers. Types include:
• Mean
• Median
• Mode
7. Define Mode. Determine the mode for the following numbers.
2 4 8 4 6 2 7 8 4 3 8 9 4 3 5
The mode is the value that appears most frequently in a dataset.
In the numbers above, 4 occurs four times, more often than any other value, so the mode is 4.
8. Define Median. Write the steps to calculate Median.
Median is the middle value in an ordered array of numbers. Steps to calculate the median:
(i) Arrange the data in ascending order.
(ii) If the number of observations (n) is odd, the median is the middle value.
(iii) If n is even, the median is the average of the two middle values.
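
As an illustrative sketch (Python, not part of the original notes), the three steps above can be written as a small function. The function name is an assumption chosen for this example.

```python
def median(values):
    """Median by the three steps: sort, then pick or average the middle."""
    ordered = sorted(values)                 # step (i): ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                           # step (ii): odd n, single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # step (iii): even n, average the two

# The fifteen numbers used in the mode and median questions of these notes.
print(median([2, 4, 8, 4, 6, 2, 7, 8, 4, 3, 8, 9, 4, 3, 5]))   # 4
```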

9. Define Median. Determine the median for the numbers.


2 4 8 4 6 2 7 8 4 3 8 9 4 3 5
Median is the middle value in an ordered array of numbers. Median calculation:
(i) Arrange the data: 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 7, 8, 8, 8, 9.
(ii) Since there are 15 observations (odd), the median is the middle (8th) value.
(iii) Median = 4.
10. Determine the mode and median for the following numbers.
213 345 609 073 167 243 444 524 199 682
Mode: There is no mode as no value repeats.
Median Calculation:
(i) Arrange the data: 073, 167, 199, 213, 243, 345, 444, 524, 609, 682.
(ii) Since there are 10 observations (even), the median is the average of the 5th and 6th values.
(iii) Median = (243 + 345) / 2 = 294.
11. Compute the mean for the following numbers.
17.3 44.5 31.6 40.0 52.8 38.8 30.1 78.5
Mean = (17.3 + 44.5 + 31.6 + 40.0 + 52.8 + 38.8 + 30.1 + 78.5) / 8 = 333.6 / 8
Mean = 41.7
12. Define Percentiles. Write the steps to calculate location of Percentiles.
Percentiles are measures of central tendency that divide a group of data into 100 parts.
(i) Order the data from smallest to largest.
(ii) Calculate the percentile location (i) by: i = (P / 100)(N)
where P = the percentile of interest and N = the number of observations
(iii) Determine the location by either (a) or (b).
a. If i is a whole number, the Pth percentile is the average of the value at the ith location and
the value at the (i+1)th location.
b. If i is not a whole number, the Pth percentile value is located at the whole number part of i+1.
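
The location rule above can be sketched in Python as follows. This is only an illustration of the rule as stated in these notes (statistical packages often use other interpolation conventions, so their results may differ); the function name and the restriction 0 < P < 100 are assumptions made for the example.

```python
def percentile(values, p):
    """Pth percentile using the location rule i = (P / 100) * N."""
    if not 0 < p < 100:
        raise ValueError("this sketch assumes 0 < p < 100")
    ordered = sorted(values)                 # step (i)
    i = p * len(ordered) / 100               # step (ii)
    if i == int(i):
        k = int(i)
        # (a) whole number: average the values at locations i and i + 1 (1-based)
        return (ordered[k - 1] + ordered[k]) / 2
    # (b) not whole: value at the whole-number part of i + 1 (1-based),
    # which is index int(i) when counting from zero
    return ordered[int(i)]

# Eight numbers used in the percentile questions of these notes.
print(percentile([14, 12, 19, 23, 5, 13, 28, 17], 30))   # 13
```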
13. What are Quartiles? Determine Q3 for 14, 12, 19, 23, 5, 13, 28, 17.
Quartiles are measures of central tendency that divide a group of data into four subgroups or parts.
Q3 is the third quartile, which corresponds to P75.
Order the data: 5 12 13 14 17 19 23 28
Calculate i: i = (P / 100)(N) = (75 / 100)(8) = 6
Because i is a whole number, P75 is the average of the 6th and the 7th numbers.
The value of Q3 is P75 = (19 + 23) / 2 = 21.
14. Determine the 30th percentile of the following eight numbers: 14, 12, 19, 23, 5, 13, 28, 17.
Order the data: 5 12 13 14 17 19 23 28
Calculate i: i = (P / 100)(N) = (30 / 100)(8) = 2.4

Because i is not a whole number, the value of i + 1 is 2.4 + 1 = 3.4. The whole-number part of 3.4 is
3. The 30th percentile is located at the third value.
P30 = 13
15. Define Range. Write the range of following numbers.
16 28 29 13 17 20 11 34 32 27 25 30 19 18 33
The range is the difference between the maximum and minimum values in a dataset.
Range calculation : Range = Max – Min = 34 – 11 = 23
16.Define Interquartile Range. Write its formulae.
The interquartile range is the range of values between the first and third quartile. Essentially, it is
the range of the middle 50% of the data.
IQR = Q3 – Q1
17. Define Mean Absolute Deviation. Write its formulae.
The mean absolute deviation (MAD) is the average of the absolute values of the deviations around
the mean for a set of numbers.
MAD = Σ|x - μ| / N
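
A minimal Python sketch of the MAD calculation (sum of absolute deviations from the mean, divided by the number of values) is shown here. The function name is an assumption for this example; the data are the eight values that appear in a later exercise of these notes.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    mu = sum(values) / len(values)
    return sum(abs(x - mu) for x in values) / len(values)

# Mean is 4; the absolute deviations add up to 14, so MAD = 14 / 8.
print(mean_absolute_deviation([4, 3, 0, 5, 2, 9, 4, 5]))   # 1.75
```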

18. Define Variance. Write its formulae.
The variance is the average of the squared deviations about the arithmetic mean for a set of
numbers. The population variance is denoted by σ².
σ² = Σ(x - μ)² / N

19. Define Standard Deviation. Write its formulae.
The standard deviation is the square root of the variance. The population standard deviation is
denoted by σ.
σ = √[ Σ(x - μ)² / N ]

20. Write the formulae for sample Variance and sample Standard Deviation.
Sample Variance: s² = Σ(x - x̄)² / (n - 1)
Sample Standard Deviation: s = √[ Σ(x - x̄)² / (n - 1) ]
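
The following Python sketch contrasts the population and sample versions, assuming the usual definitions: the sum of squared deviations from the mean is divided by N for a population and by n - 1 for a sample, and the standard deviation is the square root of the variance. The function names and the `sample` flag are assumptions made for this illustration.

```python
import math

def variance(values, sample=False):
    """Population variance by default; sample variance when sample=True."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1) if sample else squared_deviations / n

def std_dev(values, sample=False):
    """Square root of the matching variance."""
    return math.sqrt(variance(values, sample))

# Eight values from a later exercise in these notes, treated as a sample.
data = [4, 3, 0, 5, 2, 9, 4, 5]
print(round(variance(data, sample=True), 3))   # 6.857
print(round(std_dev(data, sample=True), 3))    # 2.619
```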


21. Define Z score. Write its formulae.
A z score represents the number of standard deviations a value (x) is above or below the mean of a
set of numbers when the data are normally distributed.
z = (x - μ) / σ
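
As a small illustration, the z score can be computed in Python from the usual formula z = (x - μ) / σ. The function name and the numbers in the usage lines are hypothetical, chosen only to show the sign convention.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical values: mean 50, standard deviation 10.
print(z_score(70, 50, 10))   # 2.0
print(z_score(45, 50, 10))   # -0.5
```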

22. State Empirical Rule. List the condition.
The Empirical Rule, also known as the 68-95-99.7 rule, states that for a normal distribution:
• About 68% of the data falls within one standard deviation of the mean.
• About 95% falls within two standard deviations.
• About 99.7% falls within three standard deviations.
Condition: The Empirical Rule applies to approximately symmetric, bell-shaped distributions.
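
The three percentages can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```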

23. Define Coefficient of Variation. Write its formulae.


The coefficient of variation is the ratio of the standard deviation to the mean expressed in
percentage.
CV = (σ / μ)(100)

24. Define Measures of Shape. Mention its types.


Measures of shape are tools that can be used to describe the shape of a distribution of data. Types
include:
• Skewness: Measures asymmetry
• Kurtosis: Measures the tail's thickness or thinness

25. What is Skewness? Draw its types.

Skewness measures the asymmetry of a distribution. Types:
• Positive Skewness (Right-skewed)
• Negative Skewness (Left-skewed)
• Zero Skewness (Symmetrical)
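26. Write the formulae to calculate Coefficient of Skewness using Karl Pearson.
Sk = 3(μ - Md) / σ
where μ = mean, Md = median, σ = standard deviation

27. Write the formulae to calculate Coefficient of Skewness using Bowley's.
Sk = (Q3 + Q1 - 2·Q2) / (Q3 - Q1)
where Q1, Q2, Q3 = first, second (median), and third quartiles
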
28. Write the relationship between mean, median and mode in various skewness.
Negatively Skewed (Left): Mean < Median < Mode
Positively Skewed (Right): Mean > Median > Mode
Symmetrical: Mean = Median = Mode

29. What is Kurtosis? Mention its types.


Kurtosis describes the amount of peakedness of a distribution. Types:
• Leptokurtic: distributions that are high and thin
• Platykurtic: distributions that are flat and spread out
• Mesokurtic: normal distribution

30. What are the components of Box-and-Whisker Plots?


• Minimum: The smallest data point within the lower fence.
• Maximum: The largest data point within the upper fence.
• Q1 (First Quartile): Median of the lower half of the data.
• Q3 (Third Quartile): Median of the upper half of the data.
• Median (Q2): The middle value of the dataset.
• Outliers: Data points beyond the fences, which lie 1.5 times the interquartile range below Q1 and above Q3.

31. What are Histograms, Pie charts, Frequency polygons and Bar charts?
Histogram: A histogram is a series of contiguous bars or rectangles that represent the frequency of
data in given class intervals.
Pie Chart: A pie chart is a circular statistical graphic divided into slices to illustrate numerical
proportions. The size of each slice represents the proportion of data it represents.
Frequency Polygon: A frequency polygon is a line graph that represents the frequencies of different
values in a dataset.
Bar Chart: A bar chart is a graphical representation of data where individual bars represent different
categories. The length of each bar corresponds to the quantity it represents.

32. What is a Stem-and-Leaf plot?


A stem-and-leaf plot is constructed by separating the digits for each number of the data into two
groups, a stem and a leaf. The leftmost digits are the stem and consist of the higher valued digits.
The rightmost digits are the leaves and contain the lower values.

33. What is Probability? Mention its types.


Probability is a measure of the likelihood that a given event will occur. It is expressed as a number
between 0 and 1. Types of Probability:
• Marginal Probability
• Union
• Joint Probability
• Conditional Probability

34. What is an Experiment and Event? Give Example.
Experiment: An experiment is a process that produces an outcome with uncertainty. For example,
rolling a die is an experiment.
Event: An event is an outcome or a set of outcomes of an experiment. For example, getting a 3
when rolling a die is an event.
35. What is the classical method of assigning a Probability? Give Example.
Probability is determined by the number of favorable outcomes divided by the total number of
possible outcomes, assuming all outcomes are equally likely.
Example: The probability of rolling a 3 on a fair six-sided die is 1/6.
36. What is the relative frequency of occurrence method of assigning a Probability? Give
Example.
Probability is estimated based on the observed frequency of an event in a large number of trials.
Example: If a coin is flipped 100 times and it lands on heads 60 times, the relative frequency of
getting heads is 60/100=0.6
37. What are the subjective probabilities? Give Example.
Subjective probabilities are based on the feelings, knowledge, and experience of the person
determining the probability.
Eg: A weather forecaster predicting a 70% chance of rain based on their knowledge of weather
patterns.
38. Define Elementary Events. Give an example.
Events that cannot be decomposed or broken down into other events are called elementary events.
In rolling a six-sided die, the elementary events are getting a 1, 2, 3, 4, 5, or 6.
39. Define Sample Space. Give an example.
A sample space is a complete list of all elementary events for an experiment.
Example: In flipping a coin, the sample space is {Heads, Tails}.
40. Give an example for Unions and Intersections.
Let A be the event of rolling an even number (A = {2, 4, 6}) and B be the event of rolling a number
greater than 4 (B = {5, 6}).
The union of A and B (A ∪ B) is {2, 4, 5, 6}, and the intersection of A and B (A ∩ B) is {6}.
41. Define Mutually Exclusive Events and Independent Events. Give an example.
Mutually Exclusive Events: Mutually exclusive events cannot occur at the same time. If one event
happens, the other cannot. Example: On a single roll of a die, getting a 1 and getting a 2 are
mutually exclusive events.
Independent Events: Independent events are events where the occurrence of one event does not
affect the occurrence of the other. Example: Flipping a coin twice, where the outcomes are
independent.

42. Define Collectively Exhaustive Events. Give an example.
A set of events is collectively exhaustive if at least one of them must occur. A list of collectively
exhaustive events contains all possible elementary events for an experiment.
Example: When rolling a six-sided die, the events of getting a 1, 2, 3, 4, 5, or 6 are collectively
exhaustive.
43. Define Complementary Events. Give an example.
The complement of an event A (denoted as A') is the set of all outcomes not in A.
Example: If event A is getting a head when flipping a coin, then A' is getting a tail.
44. If a population consists of the positive even numbers through 30 and if A = {2, 6, 12, 24}, what
is A’ ?
A′ (the complement of A) would be the set of positive even numbers through 30 that are not in A.
Therefore, A′={4,8,10,14,16,18,20,22,26,28,30}.
45. What are the three types of Counting the Possibilities?
mn counting rule: For an operation that can be done m ways and a second operation that can be
done n ways, the two operations then can occur, in order, in mn ways.
Sampling from a Population with Replacement: sampling n items from a population of size N
with replacement provides N^n possibilities.
Combinations: sampling n items from a population of size N without replacement provides
NCn = N! / (n!(N - n)!) possibilities.
46. Write the general Addition and Special Addition Laws.
General Addition Law: P(X∪Y) = P(X) + P(Y) - P(X∩Y)
Special Addition Law (for mutually exclusive events): P(X∪Y) = P(X) + P(Y)
47. Write the General Law of Multiplication and Special Law of Multiplication.
General Multiplication Law: P(X∩Y) = P(X) · P(Y | X) = P(Y) · P(X | Y)
Special Multiplication Law (for independent events): P(X∩Y) = P(X) · P(Y)
48. Write the Conditional Probability.
Conditional probability is denoted P(E1 | E2). This expression is read: the probability that E1 will
occur given that E2 is known to have occurred. The information that is known or given is written to
the right of the vertical line in the probability statement.
An example of conditional probability is the probability that a person owns a Chevrolet given that
she owns a Ford.
49. What are discrete random variables? Give an example.
These are random variables that can take on a countable number of distinct values.
Example: The number of heads obtained when flipping a coin multiple times.
50. What are Continuous random variables? Give an example.
These are random variables that can take on values at every point over a given interval.
Example: The height of a person can be considered a continuous random variable.

51. Write the formulae for Mean, Variance, and Standard Deviation of Discrete Distributions.
Mean: μ = E(x) = Σ[x · P(x)]
where E(x) = long-run average, x = an outcome, P(x) = probability of that outcome
Variance: σ² = Σ[(x - μ)² · P(x)]
where x = an outcome, P(x) = probability of a given outcome, μ = mean
Standard Deviation: σ = √( Σ[(x - μ)² · P(x)] )

52. List the assumptions of Binomial Distribution.


• The experiment involves n identical trials.
• Each trial results in a success or failure.
• The probability of success (p) is constant for each trial.
• The trials are independent.
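53. Write the formulae of binomial distribution.
P(x) = nCx · p^x · q^(n - x) = [n! / (x!(n - x)!)] · p^x · q^(n - x)
where n = number of trials, x = number of successes, p = probability of success on one trial, q = 1 - p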
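
As a hedged illustration, the binomial probability in its usual form, P(x) = nCx · p^x · (1 - p)^(n - x), can be evaluated in Python with `math.comb`. The function name and the coin-toss numbers are assumptions chosen for this example.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))   # 0.375
```

Summing `binomial_pmf(x, n, p)` over x = 0, 1, ..., n should give 1, which is a quick sanity check on any table of binomial probabilities.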

54. A company places a seven-digit serial number on each part that is made. Each digit of the
serial number can be any number from 0 through 9. Digits can be repeated in the serial
number. How many different serial numbers are possible?
For each digit in the serial number, there are 10 possibilities (0 through 9). Since there are 7 digits
in the serial number, you multiply the number of possibilities for each digit:
10 × 10 × 10 × 10 × 10 × 10 × 10 = 10⁷
So, there are 10⁷ or 10,000,000 possible different serial numbers.
55. A small company has 20 employees. Six of these employees will be selected randomly to be
interviewed as part of an employee satisfaction program. How many different groups of six
can be selected?
Given, n = 20, r = 6
The number of ways to choose 6 employees out of 20 is given by the combination formula:
nCr = n! / (r!(n - r)!)
20C6 = 20! / (6!(20 - 6)!) = 38,760
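
Both counting answers above can be checked with a few lines of Python. This is only a verification sketch; the variable names are assumptions, and `math.comb` requires Python 3.8 or later.

```python
import math

# Seven digits, each chosen from 10 possibilities with repetition allowed: N^n.
serial_numbers = 10 ** 7

# Groups of 6 chosen from 20 employees, order ignored, no repetition: 20C6.
groups_of_six = math.comb(20, 6)

print(serial_numbers)   # 10000000
print(groups_of_six)    # 38760
```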

56. What is the Poisson distribution? Give an example.

The Poisson distribution is a discrete distribution that focuses only on the number of discrete
occurrences over some interval or continuum.
Eg: The number of phone calls received at a call center in one hour.
57. List the characteristics of Poisson distribution.
• It is a discrete distribution.
• It describes rare events.
• Each occurrence is independent of the other occurrences.
• It describes discrete occurrences over a continuum or interval.
• The occurrences in each interval can range from zero to infinity.
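58. Write the formulae of Poisson distribution.
P(x) = (λ^x · e^(-λ)) / x!
where x = 0, 1, 2, 3, ..., λ = long-run average number of occurrences per interval, e = 2.718282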
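
A short Python sketch of the Poisson probability, assuming the standard form P(x) = λ^x · e^(-λ) / x!, is given below. The function name and the rate of 3 calls per hour are hypothetical values used only for illustration.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences in an interval when the long-run average is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Hypothetical call center averaging 3 calls per hour: chance of exactly 2 calls.
print(round(poisson_pmf(2, 3), 3))   # 0.224
```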

59. What are uniform distributions? Write the probability density function of uniform
distribution.
The uniform distribution, sometimes referred to as the rectangular distribution, is a relatively simple
continuous distribution in which the same height, or f(x), is obtained over a range of values.
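Probability density function of uniform distribution,
f(x) = 1 / (b - a) for a ≤ x ≤ b, and f(x) = 0 for all other values

60. Write the formulae of mean and standard deviation of a uniform distribution.
Mean: μ = (a + b) / 2
Standard Deviation: σ = (b - a) / √12

61. Write the formulae of Probabilities in a Uniform Distribution.
P(x1 ≤ x ≤ x2) = (x2 - x1) / (b - a), where a ≤ x1 ≤ x2 ≤ b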

62. List the characteristics of normal distribution.


• It is a continuous distribution.
• It is a symmetrical distribution about its mean.
• It is asymptotic to the horizontal axis.
• It is a family of curves.
• Area under the curve is 1.
63. Write the probability density function of the Normal Distribution.
f(x) = [1 / (σ√(2π))] · e^(-(1/2)[(x - μ)/σ]²)

64. What is the t Distribution? Write the formula for the t statistic.
Gosset developed the t distribution, which is used instead of the z distribution for doing inferential
statistics on the population mean when the population standard deviation is unknown and the
population is normally distributed. The formula for the t statistic is
t = (x̄ - μ) / (s / √n)

65. Write the Confidence Intervals formulae in t statistic.
x̄ ± t(α/2, n-1) · s / √n, with degrees of freedom df = n - 1

66. Write the Z formulae for sample mean.
z = (x̄ - μ) / (σ / √n)

Long Answer Questions


1. Explain four Types of Data & Measurement Scales with example.
(i) Nominal Data: Categories with no inherent order or ranking.
Example: Colors (Red, Blue, Green).
(ii) Ordinal Data: Categories with a meaningful order, but the intervals between them are not
uniform.
Example: Education Levels (High School, Bachelor's, Master's).
(iii) Interval Data: Ordered categories with uniform intervals between them, but no true zero
point.
Example: Temperature in Celsius (0°C does not mean the absence of temperature).
(iv) Ratio Data: Ordered categories with uniform intervals, and a true zero point.
Example: Height in centimeters (0 cm indicates no height).

2. Explain Kurtosis types with diagram.


(i) Leptokurtic: Tails are heavier, indicating more data in the tails. Peaks are higher and tails are
fatter compared to a normal distribution.

(ii) Mesokurtic: Similar to a normal distribution, neither heavy-tailed nor light-tailed.

(iii) Platykurtic: Tails are lighter, indicating less data in the tails. Peaks are lower and tails are
thinner compared to a normal distribution.
3. Explain Measure of Skewness with its types.
Skewness measures the asymmetry of a distribution. There are three types:
(i) Negative Skewness (Left-skewed) : The distribution's tail is extended to the left. Tail on the left
side is longer.

(ii) Positive Skewness (Right-skewed): The distribution's tail is extended to the right. Tail on the
right side is longer.

(iii) Zero Skewness: The distribution is perfectly symmetrical. Left and right sides are mirror
images.

4. The number of U.S. cars in service by top car rental companies in a recent year according
to Auto Rental News follows. Compute the mode, the median, and the mean.

Answer:

5. Compute the 35th percentile, the 55th percentile, Q1, Q2, and Q3 for the following data.
16 28 29 13 17 20 11 34 32 27 25 30 19 18 33

6. The following shows the top 16 global marketing categories for advertising spending for a
recent year according to Advertising Age. Spending is given in millions of U.S. dollars.
Determine the first, the second, and the third quartiles for these data.

Answer:

7. A data set contains the following seven values.
6 2 4 9 1 3 5
a. Find the range.
b. Find the mean absolute deviation.
c. Find the population variance.
d. Find the population standard deviation.
e. Find the interquartile range.
f. Find the z score for each value.
g. Calculate Coefficient of Variation.
Ans:

g. Coefficient of Variation, CV = (σ / μ)(100) = (2.491 / 4.2857)(100) = 58.12

8. A data set contains the following eight values.
4 3 0 5 2 9 4 5
a. Find the range.
b. Find the mean absolute deviation.
c. Find the sample variance.
d. Find the sample standard deviation.
e. Find the interquartile range.
f. Calculate Coefficient of Variation
Ans:

f. Coefficient of Variation, CV = (s / x̄)(100) = (2.619 / 4)(100) = 65.475

9. Shown here is a sample of six of the largest accounting firms in the United States and the
number of partners associated with each firm as reported by the Public Accounting Report.
Calculate sample variance and sample standard deviation.

Firm Number of Partners


Deloitte & Touche 2654
Ernst & Young 2108
PricewaterhouseCoopers 2069
KPMG 1664
RSM McGladrey 720
Grant Thornton 309

Ans:

10. Use your calculator or computer to find the population variance and population standard
deviation for the following data.
123 090 546 378
392 280 179 601
572 953 749 075
303 468 531 646

11. On a certain day the average closing price of a group of stocks on the New York Stock
Exchange is $35 (to the nearest dollar). If the median value is $33 and the mode is $21, is the
distribution of these stock prices skewed? If so, how?

12. A local hotel offers ballroom dancing on Friday nights. A researcher observes the
customers and estimates their ages. Discuss the skewness of the distribution of ages if the
mean age is 51, the median age is 54, and the modal age is 59.
mean = 51
median = 54
mode = 59
The distribution is skewed to the left. More people are older but the most extreme ages are younger
ages.
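
The comparison used in this kind of question (mean < median < mode for left skew, mean > median > mode for right skew, all equal for symmetry) can be sketched as a small Python function. The function name and the returned labels are assumptions made for this illustration; orderings that fit none of the three patterns are reported as undetermined.

```python
def skew_direction(mean, median, mode):
    """Classify skewness from the ordering of mean, median and mode."""
    if mean < median < mode:
        return "negatively skewed (left)"
    if mean > median > mode:
        return "positively skewed (right)"
    if mean == median == mode:
        return "symmetrical"
    return "undetermined by this rule"

print(skew_direction(51, 54, 59))   # negatively skewed (left)
print(skew_direction(35, 33, 21))   # positively skewed (right)
```

The second call uses the stock-price figures from question 11 (mean $35, median $33, mode $21).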

13. Suppose the following data are the ages of Internet users obtained from a sample. Use
these data to compute a Pearsonian coefficient of skewness. What is the meaning of the
coefficient?
41 15 31 25 24
23 21 22 22 18
30 20 19 19 16
23 27 38 34 24
19 20 29 17 23
Ans:

As the value of Sk is positive, the distribution is positively skewed.

14. Construct a box-and-whisker plot on the following data. Do the data contain any outliers?
Is the distribution of data skewed?
540 690 503 558 490 609
379 601 559 495 562 580
510 623 477 574 588 497
527 570 495 590 602 541
Ans:
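
One way to obtain the numbers needed for the plot is sketched below in Python. It assumes the percentile location rule from the 2-mark section of these notes for the quartiles and fences placed 1.5 × IQR beyond Q1 and Q3; the helper name `percentile` is an assumption, and statistical software using other quartile conventions may give slightly different values.

```python
def percentile(values, p):
    """Pth percentile (0 < p < 100) by the location rule i = (P / 100) * N."""
    ordered = sorted(values)
    i = p * len(ordered) / 100
    if i == int(i):
        k = int(i)
        return (ordered[k - 1] + ordered[k]) / 2   # average of ith and (i+1)th values
    return ordered[int(i)]                         # value at whole-number part of i + 1

data = [540, 690, 503, 558, 490, 609, 379, 601, 559, 495, 562, 580,
        510, 623, 477, 574, 588, 497, 527, 570, 495, 590, 602, 541]

q1, q2, q3 = (percentile(data, p) for p in (25, 50, 75))
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)                     # 500.0 558.5 589.0 89.0
print(lower_fence, upper_fence, outliers)  # 366.5 722.5 []
```

Under these assumptions no value falls outside the fences, so the data contain no outliers. The median (558.5) sits closer to Q3 than to Q1, which suggests a left (negative) skew.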

15. Shown here is a list of the top five industrial and farm equipment companies in the United
States, along with their annual sales ($ millions).
Construct a pie chart and a bar graph to represent these data, and label the slices with the
appropriate percentages.
Firm Revenue ($ million)
Caterpillar 30,251
Deere 19,986
Illinois Tool Works 11,731
Eaton 9,817
American Standard 9,509
Ans:
Bar Graph

Pie Chart
Firm                   Revenue (x)   Proportion (p = x/T)   Degrees (p × 360)
Caterpillar                 30,251          0.372                  134
Deere                       19,986          0.246                   89
Illinois Tool Works         11,731          0.144                   52
Eaton                        9,817          0.121                   43
American Standard            9,509          0.117                   42
T = 81,294
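
The proportion and degree columns can be reproduced with a short Python sketch: each slice is its share of the total T multiplied by 360. The function name `pie_slices` is an assumption chosen for this example.

```python
revenue = {
    "Caterpillar": 30251,
    "Deere": 19986,
    "Illinois Tool Works": 11731,
    "Eaton": 9817,
    "American Standard": 9509,
}

def pie_slices(amounts):
    """Map each name to (proportion of total, slice angle in degrees)."""
    total = sum(amounts.values())
    return {name: (x / total, 360 * x / total) for name, x in amounts.items()}

for name, (p, degrees) in pie_slices(revenue).items():
    print(f"{name:<20} {p:.3f} {degrees:4.0f}")
```

Because each angle is rounded separately, the rounded degrees may not always add up to exactly 360; the unrounded angles do.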

16. The following list shows the top six pharmaceutical companies in the United States and
their sales figures ($ millions) for a recent year. Use this information to construct a pie chart
and a bar graph to represent these six companies and their sales.
Pharmaceutical Company Sales
Pfizer 52,921
Johnson & Johnson 47,348
Merck 22,886
Bristol-Myers Squibb 21,886
Abbott Laboratories 20,473
Wyeth 17,358
Ans:

Pharmaceutical Company   Sales (x)   Proportion (p = x/T)   Degrees (p × 360)
Pfizer                      52,921          0.289                  104
Johnson & Johnson           47,348          0.259                   93
Merck                       22,886          0.125                   45
Bristol-Myers Squibb        21,886          0.120                   43
Abbott Laboratories         20,473          0.112                   40
Wyeth                       17,358          0.095                   34
T = 182,872

17. The following data represent the costs (in dollars) of a sample of 30 postal mailings by a
company.

Using dollars as a stem and cents as a leaf, construct a stem-and-leaf plot of the data.
Ans:

18. Construct a histogram and a frequency polygon for the following data.

Ans:

Histogram Frequency Polygon

19. Construct a histogram and a frequency polygon for the following data.

Ans:

Histogram Frequency Polygon

20. Construct a stem-and-leaf plot using two digits for the stem.

Ans:

21. Explain general Methods of assigning probabilities with example.


(i) Classical Method of Assigning Probabilities : When probabilities are assigned based on laws
and rules, the method is referred to as the classical method of assigning probabilities.

For example, if a company has 200 workers and 70 are female, the probability of randomly
selecting a female from this company is 70/200 = .35
(ii) Relative Frequency of Occurrence : It is based on cumulated historical data. With this method,
the probability of an event occurring is equal to the number of times the event has occurred in the
past divided by the total number of opportunities for the event to have occurred.

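For example, a company wants to determine the probability that its inspectors are going to reject the
next batch of raw materials from a supplier. That probability is estimated as the number of batches
from the supplier that were rejected in the past divided by the total number of batches received.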
(iii) Subjective Probability : It is based on the feelings or insights of the person determining the
probability. Subjective probability comes from the person’s intuition or reasoning.
For example, A weather forecaster predicting a 70% chance of rain based on their knowledge of
weather patterns.

22. A supplier shipped a lot of six parts to a company. The lot contained three defective parts.
Suppose the customer decided to randomly select two parts and test them for defects. How
large a sample space is the customer potentially working with?
List the sample space. Using the sample space list, determine the probability that the customer
will select a sample with exactly one defect.
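
One way to work this exercise is to enumerate the sample space directly. The Python sketch below is an illustration, not the original answer; the part labels D1..D3 (defective) and G1..G3 (good) are hypothetical names introduced for the example.

```python
from itertools import combinations

parts = ["D1", "D2", "D3", "G1", "G2", "G3"]   # three defective, three good

# Every unordered pair of parts that could be drawn without replacement.
sample_space = list(combinations(parts, 2))

# Pairs that contain exactly one defective part.
one_defect = [pair for pair in sample_space
              if sum(part.startswith("D") for part in pair) == 1]

print(len(sample_space))                     # 15
print(len(one_defect))                       # 9
print(len(one_defect) / len(sample_space))   # 0.6
```

The size of the sample space agrees with the combination formula, 6C2 = 15, and the probability of exactly one defect is 9/15 = .60.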

23. Given X = {1, 3, 5, 7, 8, 9}, Y = {2, 4, 7, 9}, and Z = {1, 2, 3, 4, 7}, solve the following.
a. X∪Z b. X∩Y c. X∩Z
d. X∪Y∪Z e. X∩Y∩Z f. (X∪Y) ∩ Z
g. (Y∩Z) ∪ (X∩Y) h. X or Y i. Y and Z
a) X∪Z= {1, 2, 3, 4, 5, 7, 8, 9}
b) X∩Y = {7,9}
c) X∩Z= {1, 3, 7}
d) X∪Y∪Z= {1, 2, 3, 4, 5, 7, 8, 9}
e) X∩Y∩Z= {7}
f) (X∪Y) ∩Z= {1, 2, 3, 4, 5, 7, 8, 9} ∩ {1, 2, 3, 4, 7} = {1, 2, 3, 4, 7}
g) (Y∩Z) ∪ (X∩Y) = {2, 4, 7} ∪ {7,9} = {2, 4, 7, 9}
h) X or Y= X ∪ Y = {1, 2, 3, 4, 5, 7, 8, 9}
i) Y and Z = Y∩Z= {2, 4, 7}
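
These set answers can be verified with Python's built-in set type, where `|` is union and `&` is intersection. This is only a checking sketch; the dictionary name `results` is an assumption.

```python
X = {1, 3, 5, 7, 8, 9}
Y = {2, 4, 7, 9}
Z = {1, 2, 3, 4, 7}

results = {
    "a": X | Z,
    "b": X & Y,
    "c": X & Z,
    "d": X | Y | Z,
    "e": X & Y & Z,
    "f": (X | Y) & Z,
    "g": (Y & Z) | (X & Y),
    "h": X | Y,          # "or" corresponds to union
    "i": Y & Z,          # "and" corresponds to intersection
}

for label, value in results.items():
    print(label, sorted(value))
```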

24. A company’s customer service 800 telephone system is set up so that the caller has six
options. Each of these six options leads to a menu with four options. For each of these four
options, three more options are available. For each of these three options, another three
options are presented. If a person calls the 800 number for assistance, how many total options
are possible?
Ans: By the mn counting rule, 6 × 4 × 3 × 3 = 216
25. A bin contains six parts. Two of the parts are defective and four are acceptable. If three of
the six parts are selected from the bin, how large is the sample space? Which counting rule
did you use, and why? For this sample space, what is the probability that exactly one of the
three sampled parts is defective?

26. Explain Marginal, Union, Joint and Conditional Probabilities with example.
(i) Marginal probability is denoted P(E), where E is some event. A marginal probability is usually
computed by dividing some subtotal by the whole.
For example, the probability of a person wearing glasses is a marginal probability. This probability
is computed by dividing the number of people wearing glasses by the total number of people.

(ii) Union probability is the union of two events and is denoted P(E1∪E2), where E1 and E2 are
two events. P(E1∪E2) is the probability that E1 will occur or that E2 will occur or that both E1 and
E2 will occur.
An example of union probability is the probability that a person owns a Ford or a Chevrolet. To
qualify for the union, the person only has to have at least one of these cars.
(iii) Joint probability is the intersection of two events E1 and E2 and is denoted P(E1∩E2).
Sometimes P(E1∩E2) is read as the probability of E1 and E2. To qualify for the intersection, both
events must occur.
An example of joint probability is the probability of a person owning both a Ford and a Chevrolet.
Owning one type of car is not sufficient.
(iv) Conditional probability is denoted P(E1 | E2). This expression is read: the probability that E1
will occur given that E2 is known to have occurred. Conditional probabilities involve knowledge of
some prior information. The information that is known or given is written to the right of the vertical
line in the probability statement.
An example of conditional probability is the probability that a person owns a Chevrolet given that
she owns a Ford. This conditional probability is only a measure of the proportion of Ford owners
who have a Chevrolet—not the proportion of total car owners who own a Chevrolet.
27. The client company data from the Decision Dilemma reveal that 155 employees worked
one of four types of positions. Shown here again is the raw values matrix (also called a
contingency table) with the frequency counts for each category and for subtotals and totals
containing a breakdown of these employees by type of position and by sex. If an employee of
the company is selected randomly, what is the probability that the employee is female or a
professional worker?

Ans: Let F denote the event of female and P denote the event of professional worker.

The question is P(F∪P) = ?

By the general law of addition, P(F∪P) = P(F) + P(P) - P(F∩P)

Of the 155 employees, 55 are women. Therefore, P(F) = 55/155 = .355.
The 155 employees include 44 professionals. Therefore, P(P) = 44/155 = .284.
Because 13 employees are both female and professional, P(F∩P) = 13/155 = .084.
The union probability is solved as

P(F∪P) = .355 + .284 - .084 = .555

A second way to produce the answer from the raw value matrix is to add all the cells one time that
are in either the Female column or the Professional row
3 + 13 + 17 + 22 + 31 = 86
and then divide by the total number of employees, N = 155, which gives

P(F∪P) = 86/155 = .555

28. Given P(A) = .10, P(B) = .12, P(C) = .21, P(A∩C) = .05, and P(B∩C) = .03, solve the
following.
a. P(A∪C)
b. P(B∪C)
c. If A and B are mutually exclusive, P(A∪B)

a) P(A∪C) = P(A) + P(C) - P(A∩C) = .10 + .21 - .05 = .26

b) P(B∪C) = P(B) + P(C) - P(B∩C) = .12 + .21 - .03 = .30

c) If A and B are mutually exclusive, P(A∪B) = P(A) + P(B) = .10 + .12 = .22

29. According to the U.S. Bureau of Labor Statistics, 75% of the women 25 through 49 years
of age participate in the labor force. Suppose 78% of the women in that age group are
married. Suppose also that 61% of all women 25 through 49 years of age are married and are
participating in the labor force.
a. What is the probability that a randomly selected woman in that age group is married or is
participating in the labor force?
b. What is the probability that a randomly selected woman in that age group is married or is
participating in the labor force but not both?
c. What is the probability that a randomly selected woman in that age group is neither
married nor participating in the labor force?
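Ans: Let M denote married and L denote participating in the labor force; NM and NL denote
their complements.
Given, P(L) = .75, P(M) = .78, P(M∩L) = .61

a) P(M∪L) = P(M) + P(L) - P(M∩L) = .78 + .75 - .61 = .92

b) P(M∪L) but not both = P(M∪L) - P(M∩L) = .92 - .61 = .31

c) P(NM ∩ NL) = 1 - P(M∪L) = 1 - .92 = .08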
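
The three parts of this answer can be reproduced in a few lines. This is a minimal sketch; the helper name `union` and the variable names are assumptions made for illustration.

```python
def union(p_a, p_b, p_both):
    """General law of addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_m, p_l, p_both = 0.78, 0.75, 0.61

p_either = union(p_m, p_l, p_both)   # part a: married or in the labor force
p_exactly_one = p_either - p_both    # part b: one or the other, but not both
p_neither = 1 - p_either             # part c: complement of the union

print(round(p_either, 2), round(p_exactly_one, 2), round(p_neither, 2))
```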


30. A company has 140 employees, of which 30 are supervisors. 80 of the employees are
married, and 20% of the married employees are supervisors. If a company employee is
randomly selected, what is the probability that the employee is married and is a supervisor?
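
One way to work this question is the multiplication law, P(M∩S) = P(M) · P(S | M), where M is married and S is supervisor. The sketch below assumes that reading of the problem; the variable names are illustrative.

```python
from fractions import Fraction

total_employees = 140
married = 80
p_supervisor_given_married = Fraction(20, 100)  # 20% of married employees

# P(M) comes from the counts; P(S | M) is given directly in the question
p_married = Fraction(married, total_employees)
p_married_and_supervisor = p_married * p_supervisor_given_married

print(p_married_and_supervisor, round(float(p_married_and_supervisor), 4))
```

Under these assumptions the joint probability is 4/35, or about .1143. The figure of 30 supervisors is not needed for this particular question.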

31. Use the values in the contingency table to solve the equations given.

a. P(A∩E) b. P(D∩B) c. P(D∩E) d. P(A∩B)


Ans:
a. P(A∩E) = 16/57 = .2807
b. P(D∩B) = 3/57 = .0526
c. P(D∩E) = .0000
d. P(A∩B) = .0000

32. A study by Peter D. Hart Research Associates for the Nasdaq Stock Market revealed that
43% of all American adults are stockholders. In addition, the study determined that 75% of
all American adult stockholders have some college education. Suppose 37% of all American
adults have some college education. An American adult is randomly selected.
a. What is the probability that the adult does not own stock?
b. What is the probability that the adult owns stock and has some college education?
c. What is the probability that the adult owns stock or has some college education?
d. What is the probability that the adult has neither some college education nor owns stock?
e. What is the probability that the adult does not own stock or has no college education?
f. What is the probability that the adult has some college education and owns no stock?
Ans: Let S = stockholder
Let C = some college education
NS and NC denote the complements of S and C.
P(S) = .43, P(C) = .37, P(C | S) = .75
a) P(NS) = 1 - .43 = .57
b) P(S∩C) = P(S) P(C | S) = (.43)(.75) = .3225

c) P(S∪C) = P(S) + P(C) - P(S∩C) = .43 + .37 - .3225 = .4775

d) P(NS ∩ NC) = 1 - P(S∪C) = 1 - .4775 = .5225

e) P(NS ∪ NC) = P(NS) + P(NC) - P(NS ∩ NC) = .57 + .63 - .5225 = .6775
f) P(C∩NS) = P(C) - P(C∩S) = .37 - .3225 = .0475
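
The six parts can be cross-checked numerically. This is a minimal sketch with illustrative variable names; part e is computed here through De Morgan's law, P(NS ∪ NC) = 1 - P(S∩C), which is an alternative route to the same value rather than the method used in the answer.

```python
p_s = 0.43          # P(S): owns stock
p_c = 0.37          # P(C): some college education
p_c_given_s = 0.75  # P(C | S)

p_not_s = 1 - p_s                      # a
p_s_and_c = p_s * p_c_given_s          # b: multiplication law
p_s_or_c = p_s + p_c - p_s_and_c       # c: general law of addition
p_neither = 1 - p_s_or_c               # d: complement of the union
p_not_s_or_not_c = 1 - p_s_and_c       # e: De Morgan's law
p_c_and_not_s = p_c - p_s_and_c        # f: college but no stock

for label, value in zip("abcdef", (p_not_s, p_s_and_c, p_s_or_c,
                                   p_neither, p_not_s_or_not_c, p_c_and_not_s)):
    print(label, round(value, 4))
```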

33. Determine the mean, the variance, and the standard deviation of the following discrete
distribution.
x P(x)
1 .238
2 .290
3 .177
4 .158
5 .137
Ans:
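
One way to obtain these values is to apply the definitions μ = Σ x·P(x) and σ² = Σ (x - μ)²·P(x) directly. The sketch below does that; the function name `describe_discrete` is an assumption made for illustration.

```python
import math

def describe_discrete(dist):
    """Return (mean, variance, standard deviation) for a dict {x: P(x)}."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance, math.sqrt(variance)

# Distribution from question 33
dist_33 = {1: 0.238, 2: 0.290, 3: 0.177, 4: 0.158, 5: 0.137}
mean_33, var_33, sd_33 = describe_discrete(dist_33)

print(round(mean_33, 3), round(var_33, 3), round(sd_33, 3))
```

With these inputs the mean is about 2.666, the variance about 1.836, and the standard deviation about 1.355.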

34. Determine the mean, the variance and the standard deviation of the following discrete
distribution.
Number of Crises Probability
0 .37
1 .31
2 .18
3 .09
4 .04
5 .01

Mean: 1.15
Variance: 1.41
Standard Deviation: 1.19
35. A Gallup survey found that 65% of all financial consumers were very satisfied with their
primary financial institution. Suppose that 25 financial consumers are sampled. If the
Gallup survey result still holds true today, what is the probability that exactly 19 are very
satisfied with their primary financial institution? (Using the binomial distribution formula)

36. According to the U.S. Census Bureau, approximately 6% of all workers in Jackson,
Mississippi, are unemployed. In conducting a random telephone survey in Jackson, what is
the probability of getting two or fewer unemployed workers in a sample of 20? (Using
Binomial Distribution formulae)
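
Questions 35 and 36 both rely on the binomial formula P(x) = C(n, x) · p^x · (1 - p)^(n - x). The following is a minimal sketch of that formula; the function name `binomial_pmf` is an illustrative assumption.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Question 35: exactly 19 very satisfied out of n = 25, with p = .65
p_exactly_19 = binomial_pmf(19, 25, 0.65)

# Question 36: two or fewer unemployed out of n = 20, with p = .06
p_two_or_fewer = sum(binomial_pmf(x, 20, 0.06) for x in range(3))

print(round(p_exactly_19, 4), round(p_two_or_fewer, 4))
```

Under these inputs the two probabilities come out at roughly .0908 and .8850. "Two or fewer" is a cumulative probability, so it sums the terms for x = 0, 1, and 2.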

37. Suppose bank customers arrive randomly on weekday afternoons at an average of 3.2
customers every 4 minutes. What is the probability of exactly 5 customers arriving in a 4-
minute interval on a weekday afternoon? The lambda for this problem is 3.2 customers per 4
minutes. The value of x is 5 customers per 4 minutes. (Using Poisson Formula)
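We have λ = 3.2 customers per 4 minutes and x = 5. The Poisson formula is

P(x) = (λ^x)(e^-λ) / x!

The probability of 5 customers randomly arriving during a 4-minute interval when the long-run
average has been 3.2 customers per 4-minute interval is

P(5) = (3.2^5)(e^-3.2) / 5! = (335.54)(.0408) / 120 = .1141

If a bank averages 3.2 customers every 4 minutes, the probability of 5 customers arriving during
any one 4-minute interval is .1141.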

38. Bank customers arrive randomly on weekday afternoons at an average of 3.2 customers
every 4 minutes. What is the probability of having more than 7 customers in a 4-minute
interval on a weekday afternoon? (Using Poisson Formula)
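
Questions 37 and 38 can both be computed from the Poisson formula P(x) = λ^x · e^(-λ) / x!. This is a minimal sketch; the function name `poisson_pmf` is an illustrative assumption.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the interval mean is lam."""
    return lam ** x * exp(-lam) / factorial(x)

lam = 3.2  # customers per 4-minute interval

# Question 37: exactly 5 customers in a 4-minute interval
p_exactly_5 = poisson_pmf(5, lam)

# Question 38: more than 7 customers, taken as 1 - P(x <= 7)
p_more_than_7 = 1 - sum(poisson_pmf(x, lam) for x in range(8))

print(round(p_exactly_5, 4), round(p_more_than_7, 4))
```

The unrounded result for question 37 is close to .1140; solutions that round e^-3.2 to .0408 give .1141. For question 38 the upper tail is roughly .017, and using the complement avoids summing an infinite number of terms.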

39. A bank has an average random arrival rate of 3.2 customers every 4 minutes. What is the
probability of getting exactly 10 customers during an 8-minute interval? (Using Poisson
Formula)
Given, λ = 3.2 customers per 4 minutes and x = 10 customers per 8 minutes.

The right way to approach this problem is to adjust the interval for lambda so that it and x have the
same interval. The interval for x is 8 minutes, so lambda should be adjusted to an 8-minute interval.
Logically, if the bank averages 3.2 customers every 4 minutes, it should average twice as many, or
6.4 customers, every 8 minutes. If x were for a 2-minute interval, the value of lambda would be
halved from 3.2 to 1.6 customers per 2-minute interval. Always adjust the lambda value so that
its interval matches that of x. After lambda has been adjusted for an 8-minute interval, the
solution is

P(x = 10 | λ = 6.4) = (6.4^10)(e^-6.4) / 10! = .0528

40. Find the following values by using the Poisson formula.


a. P(x = 5 | λ = 2.3)
b. P(x = 2 | λ = 3.9)
c. P(x ≤ 3 | λ = 4.1)
d. P(x = 0 | λ = 2.7)
e. P(x = 1 | λ = 5.4)
f. P(4 < x < 8 | λ = 4.4)
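
The six parts of question 40 can be evaluated with the same Poisson formula. This is a minimal sketch with an illustrative helper name; note that part c is cumulative (x = 0 through 3) and part f uses strict inequalities, so it covers x = 5, 6, and 7 only.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the interval mean is lam."""
    return lam ** x * exp(-lam) / factorial(x)

answers = {
    "a": poisson_pmf(5, 2.3),
    "b": poisson_pmf(2, 3.9),
    "c": sum(poisson_pmf(x, 4.1) for x in range(0, 4)),   # x <= 3
    "d": poisson_pmf(0, 2.7),
    "e": poisson_pmf(1, 5.4),
    "f": sum(poisson_pmf(x, 4.4) for x in range(5, 8)),   # 4 < x < 8
}

for part, value in answers.items():
    print(part, round(value, 4))
```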

41. Suppose the amount of time it takes to assemble a plastic module ranges from 27 to 39
seconds and that assembly times are uniformly distributed. Describe the distribution. What is
the probability that a given assembly will take between 30 and 35 seconds? Fewer than 30
seconds?

42. According to the National Association of Insurance Commissioners, the average annual
cost for automobile insurance in the United States in a recent year was $691.
Suppose automobile insurance costs are uniformly distributed in the United States with a
range of from $200 to $1,182. What is the standard deviation of this uniform distribution?
What is the height of the distribution? What is the probability that a person’s annual cost for
automobile insurance in the United States is between $410 and $825?
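
For a uniform distribution on [a, b], the height is 1/(b - a), the mean is (a + b)/2, the standard deviation is (b - a)/√12, and the probability of an interval is its length times the height. The sketch below applies these formulas to questions 41 and 42; the function names are illustrative assumptions.

```python
from math import sqrt

def uniform_summary(a, b):
    """Height, mean, and standard deviation of a uniform distribution on [a, b]."""
    return {"height": 1 / (b - a), "mean": (a + b) / 2, "sd": (b - a) / sqrt(12)}

def uniform_prob(lo, hi, a, b):
    """P(lo <= x <= hi), clipping the interval to [a, b] first."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

# Question 41: assembly times between 27 and 39 seconds
q41 = uniform_summary(27, 39)
p_30_to_35 = uniform_prob(30, 35, 27, 39)
p_under_30 = uniform_prob(27, 30, 27, 39)

# Question 42: insurance costs between $200 and $1,182
q42 = uniform_summary(200, 1182)
p_410_to_825 = uniform_prob(410, 825, 200, 1182)

print(q41, round(p_30_to_35, 4), round(p_under_30, 4))
print(q42, round(p_410_to_825, 4))
```

For question 42 the computed mean is 691, which matches the average cost quoted in the question.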

43. What is the probability of obtaining a score greater than 700 on a GMAT test that has a
mean of 494 and a standard deviation of 100? Assume GMAT scores are normally distributed.
P(x > 700 | μ = 494 and σ = 100) = ?
This problem calls for determining the area of the upper tail of the distribution. The z score for this
problem is

z = (x - μ) / σ = (700 - 494) / 100 = 2.06

The area under the curve for z = 2.06 is .4803. This value is the probability of randomly drawing a
GMAT score between the mean and 700.
Finding the probability of getting a score greater than 700, which is the tail of the distribution,
requires subtracting the probability value of .4803 from .5000, because each half of the distribution
contains .5000 of the area:

P(x > 700) = .5000 - .4803 = .0197

44. A GMAT test has a mean of 494 and a standard deviation of 100. Assume GMAT scores
are normally distributed. What is the probability of randomly obtaining a score between 300
and 600 on the GMAT exam?
P(300 < x < 600 | μ = 494 and σ = 100) = ?

45. A GMAT test has a mean of 494 and a standard deviation of 100. Assume GMAT scores
are normally distributed. What is the probability of randomly drawing a score that is 550 or
less?
P(x ≤ 550 | μ = 494 and σ = 100) = ?

46. A GMAT test has a mean of 494 and a standard deviation of 100. Assume GMAT scores
are normally distributed. What is the probability of getting a score between 350 and 450 on
the same GMAT exam?
P(350 < x < 450 | μ = 494 and σ = 100) = ?
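
The four GMAT questions (43 to 46) can be checked without a z table by using the normal cumulative distribution function. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

gmat = NormalDist(mu=494, sigma=100)

# Question 43: upper tail beyond 700
p_above_700 = 1 - gmat.cdf(700)

# Question 44: area between 300 and 600 (spans both sides of the mean)
p_300_to_600 = gmat.cdf(600) - gmat.cdf(300)

# Question 45: everything at or below 550
p_at_most_550 = gmat.cdf(550)

# Question 46: area between 350 and 450 (both values below the mean)
p_350_to_450 = gmat.cdf(450) - gmat.cdf(350)

for value in (p_above_700, p_300_to_600, p_at_most_550, p_350_to_450):
    print(round(value, 4))
```

The results are approximately .0197, .8292, .7123, and .2550. Table-based answers can differ in the fourth decimal place because z scores are rounded to two decimals before the lookup.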

47. Suppose the following data are selected randomly from a population of normally
distributed values. Construct a 90% confidence interval to estimate the population mean
(using the t statistic). The sample mean is 13.56 and the sample standard
deviation is 7.8.
6 21 17 20 7 0 8 16 29
3 8 12 11 9 21 25 15 16

48. Assuming x is normally distributed, use the following information to compute a 99%
confidence interval to estimate the population mean (using the t
statistic). The sample mean is 2.14 and the sample standard deviation is 1.29.
3 1 3 2 5 1 2 1 4 2 1 3 1 1
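
Both intervals follow the form x̄ ± t · s/√n with n - 1 degrees of freedom. The sketch below is a minimal illustration: the function name `t_interval` is an assumption, and the critical values 1.740 (t at .05 with 17 df) and 3.012 (t at .005 with 13 df) are hard-coded from a standard t table because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_critical):
    """Confidence interval for the mean: x-bar +/- t * s / sqrt(n)."""
    n = len(data)
    margin = t_critical * stdev(data) / sqrt(n)  # stdev is the sample (n - 1) form
    center = mean(data)
    return center - margin, center + margin

data_47 = [6, 21, 17, 20, 7, 0, 8, 16, 29, 3, 8, 12, 11, 9, 21, 25, 15, 16]
data_48 = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]

# Critical values taken from a t table (assumed inputs, not computed here)
ci_47 = t_interval(data_47, 1.740)  # 90% confidence, df = 17
ci_48 = t_interval(data_48, 3.012)  # 99% confidence, df = 13

print([round(v, 2) for v in ci_47])
print([round(v, 2) for v in ci_48])
```

With these inputs the intervals are roughly 10.36 to 16.76 for question 47 and 1.10 to 3.18 for question 48; the last digit can shift slightly depending on whether the rounded or unrounded sample mean and standard deviation are used.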
