AE 248: AI and Data Science: Prabhu Ramachandran 2024-01-01
Prabhu Ramachandran
2024-01-01
• Self study
• Reference text book:
Introduction to Probability and Statistics for Engineers and Scientists, by Sheldon M. Ross,
Academic Press.
• Quick review
• Numerical data
– Stem-and-leaf, boxplots, box-whisker plot
– Frequency distributions: histograms
– Cumulative frequency plots: ogives
– Relative frequencies
• Categorical data
– Bar charts, pie-charts
Histograms
Descriptive measures
Spread of data
Question
A) 2.087
B) 3.238
C) 4.580
D) 4.947
Question
A) -1
B) 0
C) 1
D) 2
Question
A) -1
B) -4
C) 1
D) 2
Question
A) 5
B) 2
C) 5 and 2
D) 4
Question
60 61 62 63 64 65
1 2 3 4 2 1
A) 62.54
B) 62.5
C) 63
D) 62
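The slide does not label the two rows of numbers above. Assuming the first row lists the observed values and the second row their frequencies, the usual measures of location can be checked with a short sketch like this one:

```python
from statistics import mean, median, multimode

# Assumption: first row = observed values, second row = their frequencies.
values = [60, 61, 62, 63, 64, 65]
freqs = [1, 2, 3, 4, 2, 1]

# Expand the frequency table into the raw data set (13 observations).
data = [v for v, f in zip(values, freqs) for _ in range(f)]

print(round(mean(data), 2))   # sample mean, rounded to 2 decimals
print(median(data))           # sample median
print(multimode(data))        # sample mode(s)
```

Expanding the table keeps the code close to the definitions. For large tables a weighted sum over (value, frequency) pairs would avoid building the full list.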
Question
For the data set shown in the histogram, which is larger: the mean or the median?
A) Mean
B) Median
C) Can’t tell without knowing the numbers
Chebyshev's inequality
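\[ \frac{N(S_k)}{n} \ge 1 - \frac{n-1}{n k^2} > 1 - \frac{1}{k^2} \]
Here N(S_k) is the number of data points within k sample standard deviations of the sample mean.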
• Handy
• Universal
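A quick numerical check of the inequality, as a sketch only. It assumes the textbook form of the statement: S_k is the set of points with |x_i - mean| < k s, where s is the sample standard deviation. The exponential sample and the seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

def fraction_within(data, k):
    """Fraction of points with |x_i - mean| < k * s (s = sample std dev)."""
    xbar, s = mean(data), stdev(data)
    return sum(abs(x - xbar) < k * s for x in data) / len(data)

random.seed(0)  # arbitrary seed, only for reproducibility
data = [random.expovariate(1.0) for _ in range(1000)]  # skewed, non-normal data

n = len(data)
for k in (1.5, 2, 3):
    bound = 1 - (n - 1) / (n * k**2)
    # The observed fraction should never fall below the Chebyshev bound.
    print(k, fraction_within(data, k), bound)
```

The bound holds for any data set, which is what makes it universal. For most data the observed fraction is well above the bound.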
Normal distributions
Correlation coefficient
\[ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x s_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} \]
• −1 ≤ 𝑟 ≤ 1
• A "small" |𝑟| (close to 0) implies weaker linear correlation
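The two ways of writing r can be compared numerically. The sketch below uses only the standard library. The data are made up for illustration (roughly y = 2x), and the second sum under the square root is taken over y.

```python
from math import sqrt
from statistics import mean, stdev

def corr_via_std(x, y):
    """r using the sample standard deviations s_x and s_y."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    num = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    return num / ((n - 1) * stdev(x) * stdev(y))

def corr_via_sums(x, y):
    """r using sums of squared deviations directly."""
    xbar, ybar = mean(x), mean(y)
    num = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    den = sqrt(sum((a - xbar) ** 2 for a in x) * sum((b - ybar) ** 2 for b in y))
    return num / den

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data, roughly y = 2x

print(corr_via_std(x, y), corr_via_sums(x, y))  # the two values should agree
```

The factors of (n - 1) cancel between the numerator and the standard deviations, which is why both forms give the same number.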
Gotchas
Sampling
• Importance of samples
• Importance of a random sample.
Basic Probability
Notion of Probability
Axioms
• 0 ≤ 𝑃 (𝐸) ≤ 1
• 𝑃 (𝑆) = 1
• 𝑃 (∪𝑖 𝐸𝑖 ) = ∑𝑖 𝑃 (𝐸𝑖 ) when the 𝐸𝑖 are mutually exclusive
• 𝑃 (𝐸 ′ ) = 1 − 𝑃 (𝐸)
• 𝑃 (𝐸 ∪ 𝐹 ) = 𝑃 (𝐸) + 𝑃 (𝐹 ) − 𝑃 (𝐸𝐹 )
• Note: odds of an event: 𝑃 (𝐴)/(1 − 𝑃 (𝐴))
• Counting carefully!
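These rules can be checked by direct counting on a small sample space. The sketch below assumes a single fair die with equally likely outcomes. The events E and F are arbitrary choices for illustration.

```python
from fractions import Fraction

S = set(range(1, 7))                 # sample space of one fair die

def P(event):
    """Probability of an event (a subset of S) with equally likely outcomes."""
    return Fraction(len(event & S), len(S))

E = {2, 4, 6}                        # outcome is even
F = {4, 5, 6}                        # outcome is greater than 3

assert 0 <= P(E) <= 1
assert P(S) == 1
assert P(S - E) == 1 - P(E)                      # complement rule
# Note: on Python sets, "|" is union and "&" is intersection.
assert P(E | F) == P(E) + P(F) - P(E & F)        # inclusion-exclusion

odds_E = P(E) / (1 - P(E))                       # odds of E
print(P(E | F), odds_E)                          # prints: 2/3 1
```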
Question
Prof. Rao has 10 books that he is going to put on his bookshelf. Of these, 4 are mathematics
books, 3 are chemistry books, 2 are history books, and 1 is a language book. Prof. Rao wants to
arrange his books so that all the books dealing with the same subject are together on the shelf.
How many different arrangements are possible?
A) 4!3!2!2!
B) 4!3!2!1!
C) 4!4!3!2!2!
D) 4!4!3!2!1!
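One way to count such arrangements: first order the subject blocks, then order the books inside each block. The sketch below assumes subject sizes 4, 3, 2 and 1, which add up to the stated 10 books. The brute-force helper is a hypothetical check for small cases only.

```python
from itertools import permutations
from math import factorial, prod

def grouped_arrangements(sizes):
    """Ways to shelve distinct books so that each subject stays contiguous."""
    # Order the subject blocks, then order the books inside each block.
    return factorial(len(sizes)) * prod(factorial(m) for m in sizes)

def brute_force(sizes):
    """Count the same thing by enumerating every shelf order (small cases only)."""
    subject_of = [s for s, m in enumerate(sizes) for _ in range(m)]
    count = 0
    for order in permutations(range(len(subject_of))):
        labels = [subject_of[i] for i in order]
        # Subjects are contiguous exactly when the label changes
        # (number of subjects - 1) times along the shelf.
        changes = sum(a != b for a, b in zip(labels, labels[1:]))
        count += changes == len(sizes) - 1
    return count

print(grouped_arrangements([2, 2, 1]), brute_force([2, 2, 1]))  # prints: 24 24
print(grouped_arrangements([4, 3, 2, 1]))                       # prints: 6912
```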
Question
For Mood Indigo, a group of size 5 is to be selected as the chief organizers from a collection
of 6 girls and 9 boys. If the selection is random, what is the probability that the group has 3
girls and 2 boys?
A) 420/1001
B) 240/1001
C) 360/1001
D) 300/1001
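With a random selection, every group of 5 is equally likely, so the probability is a ratio of counts. A minimal sketch using binomial coefficients:

```python
from fractions import Fraction
from math import comb

girls, boys, group_size = 6, 9, 5

# Choose 3 of the 6 girls and 2 of the 9 boys.
favourable = comb(girls, 3) * comb(boys, 2)
# Choose any 5 of the 15 people.
total = comb(girls + boys, group_size)

p = Fraction(favourable, total)
print(p)  # prints: 240/1001
```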
Conditional Probability
• Very important
• 𝑃 (𝐸|𝐹 )𝑃 (𝐹 ) = 𝑃 (𝐸𝐹 )
• 𝑃 (𝐸|𝐹 )𝑃 (𝐹 ) = 𝑃 (𝐹 |𝐸)𝑃 (𝐸)
• 𝑃 (𝐸) = 𝑃 (𝐸𝐹 ) + 𝑃 (𝐸𝐹 ′ )
• 𝑃 (𝐸) = 𝑃 (𝐸|𝐹 )𝑃 (𝐹 ) + 𝑃 (𝐸|𝐹 ′ )(1 − 𝑃 (𝐹 ))
= 𝑃 (𝐸|𝐹 )𝑃 (𝐹 ) + 𝑃 (𝐸|𝐹 ′ )𝑃 (𝐹 ′ )
• Independence: 𝑃 (𝐸𝐹 ) = 𝑃 (𝐸)𝑃 (𝐹 )
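The identities above can be verified by enumeration. The sketch below assumes two fair dice. The events E (the sum is 7) and F (the first die shows 3) are illustrative choices that happen to be independent.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # two fair dice, 36 outcomes

def P(pred):
    """Probability of the event described by the predicate pred."""
    return Fraction(sum(1 for w in S if pred(w)), len(S))

def P_given(pred, cond):
    """Conditional probability P(pred | cond) = P(pred and cond) / P(cond)."""
    return P(lambda w: pred(w) and cond(w)) / P(cond)

def E(w):
    return w[0] + w[1] == 7      # sum is 7

def F(w):
    return w[0] == 3             # first die shows 3

def not_F(w):
    return not F(w)

def EF(w):
    return E(w) and F(w)

assert P_given(E, F) * P(F) == P(EF)                                   # product rule
assert P_given(E, F) * P(F) == P_given(F, E) * P(E)                    # symmetry
assert P(E) == P_given(E, F) * P(F) + P_given(E, not_F) * P(not_F)     # total probability
assert P(EF) == P(E) * P(F)                                            # independence
print(P(E), P(F), P(EF))  # prints: 1/6 1/6 1/36
```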
Question
A bin contains 5 defective (that immediately fail when put in use), 10 partially defective (that
fail after a couple of hours of use), and 25 acceptable transistors. A transistor is chosen at
random from the bin and put into use. If it does not immediately fail, what is the probability
it is acceptable?
A) 7/8
B) 5/7
C) 6/7
D) 5/8
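This is a direct use of P(E|F) = P(EF)/P(F), with E "acceptable" and F "does not fail immediately". A small sketch from the counts in the question:

```python
from fractions import Fraction

counts = {"defective": 5, "partially defective": 10, "acceptable": 25}
total = sum(counts.values())

# F: the transistor does not fail immediately (it is not outright defective).
p_F = Fraction(total - counts["defective"], total)
# EF: acceptable and does not fail immediately (every acceptable one qualifies).
p_EF = Fraction(counts["acceptable"], total)

p_E_given_F = p_EF / p_F
print(p_E_given_F)  # prints: 5/7
```

Conditioning on F effectively shrinks the bin to the 35 transistors that survive the first moment of use.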
\[ P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A')\,P(B \mid A')} \]
Question
An insurance company believes that people can be divided into two classes — those that are
accident prone and those that are not. Their statistics show that an accident-prone person
will have an accident at some time within a fixed 1-year period with probability .4, whereas
this probability decreases to .2 for a non-accident-prone person. If we assume that 30 percent
of the population is accident prone, what is the probability that a new policy holder will have
an accident within a year of purchasing a policy?
A) 0.3
B) 0.12
C) 0.26
D) 0.14
Question
Suppose that a new policy holder has an accident within a year of purchasing his policy. What
is the probability that he is accident prone?
A) 0.3
B) 0.4615
C) 0.26
D) 0.4
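Both insurance questions follow from the law of total probability and Bayes' formula, with A "accident prone" and B "has an accident within the year". The helper names below are hypothetical, chosen for illustration.

```python
def total_probability(prior, p_b_given_a, p_b_given_not_a):
    """P(B) = P(A) P(B|A) + P(A') P(B|A')."""
    return prior * p_b_given_a + (1 - prior) * p_b_given_not_a

def bayes_posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A|B) = P(A) P(B|A) / P(B)."""
    return prior * p_b_given_a / total_probability(prior, p_b_given_a, p_b_given_not_a)

# Numbers from the question: P(A) = 0.3, P(B|A) = 0.4, P(B|A') = 0.2.
p_accident = total_probability(0.3, 0.4, 0.2)
p_prone_given_accident = bayes_posterior(0.3, 0.4, 0.2)

print(round(p_accident, 4), round(p_prone_given_accident, 4))  # prints: 0.26 0.4615
```

Having an accident raises the probability of being accident prone from the prior 0.3 to about 0.46.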
Question
A laboratory blood test is 99 percent effective in detecting a certain disease when it is, in
fact, present. However, the test also yields a “false positive” result for 1 percent of the healthy
persons tested. (That is, if a healthy person is tested, then, with probability .01, the test result
will imply he or she has the disease.) If .5 percent of the population actually has the disease,
what is the probability a person has the disease given that his test result is positive?
A) 0.3322
B) 0.3
C) 0.5
D) 0.99
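One way to see why the answer is far below 0.99 is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 200,000, chosen only so that every count is a whole number.

```python
from fractions import Fraction

population = 200_000                      # hypothetical population size

diseased = population * 5 // 1000         # 0.5 percent have the disease
healthy = population - diseased

true_positives = diseased * 99 // 100     # test detects 99 percent of the diseased
false_positives = healthy * 1 // 100      # 1 percent of the healthy test positive

# Among everyone who tests positive, the fraction who are actually diseased.
posterior = Fraction(true_positives, true_positives + false_positives)
print(posterior, round(float(posterior), 4))  # prints: 99/298 0.3322
```

Because the disease is rare, the false positives from the large healthy group outnumber the true positives by about two to one.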
Discussion
General case