
Session 4 Week 2

Nivedita Nadkarni
Statistics in the Courtroom
November 15th 2024, NLSIU
Today’s Statistics Topics
• Events and Probability
• Conditional Probability
• Bayes’ Theorem
• Sensitivity and Specificity
• Two Common Fallacies
• ROC Curve
• Relative Risk and Odds ratio
• Problem solving
Events and Probability
• An event is the basic element to which probability can be applied.
• It is the result of an observation or experiment, or the description
of some potential outcome.
• For example, an event can be that two 50-year-old criminals are
given 20-year sentences.
• Let A represent the event that Criminal 1 lives to serve his 20-year
sentence
• Let B be the event that Criminal 2 lives to serve his 20-year
sentence.
Events and Probability
• A number of different operations can be performed on events.
• The intersection of the two events A and B, denoted by A∩B, is
defined as the event “both A and B”.
• Therefore, in the previous example, it would be the event that both
criminals live to serve their respective 20-year sentences.
• The union of A and B, denoted by A ∪ B, is the event “either A or B,
or both A and B”.
• In the above example, it would mean either C1 or C2 or both live to
serve their 20-year sentence.
Events and Probability
• The complement of an event A, denoted by Ac or Ā, is the
event “not A”.
• Consequently, the event that Criminal 1 dies prior to completing
his 20-year sentence is Ac.
• A Venn diagram is a very useful way of depicting the relationship
between events.
• Refer to page 126, Figure 6.1
• We can now move onto the concept of probability.
Events and Probability
• The frequentist definition of probability is as follows:
• If an experiment is repeated n times under essentially identical
conditions, and if the event A occurs m times, then as n grows
large, the ratio m/n approaches a fixed limit that is the probability
of A:
• P(A) = m/n.
• P(A) is therefore the relative frequency of occurrence of A: the
proportion of times the event occurs in a large number of trials
repeated under virtually identical conditions.
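The frequentist definition can be illustrated with a short simulation. A minimal sketch in Python; the die-rolling experiment is illustrative, not from the text:

```python
import random

def relative_frequency(trials: int, seed: int = 0) -> float:
    """Estimate P(A) for A = 'a fair die shows a six' by repeating
    the experiment `trials` times and counting the occurrences m."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return m / trials  # m/n, the relative frequency of A

# As n grows, m/n settles near the fixed limit P(A) = 1/6 ≈ 0.1667.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```

For small n the estimate wanders; by a million trials it sits very close to 1/6, which is exactly what the definition asserts.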
Events and Probability
• The numerical value of a probability lies between 0 and 1.
• If a particular event happens with certainty, it occurs in each of
the n trials and has probability n/n = 1.
• In the previous example of the 50-year-old criminal living to serve
his sentence;
• P(A U Ac) = P(A or Ac or both)
• = P(Criminal lives to serve his sentence or he does not)
• =1,
• Since it is certain he will either live or die.
Events and Probability
• In figure 6.1(c), A and Ac fill up the entire box.
• Furthermore, it is impossible for the two events to occur
simultaneously.
• Therefore, the probability of their intersection is 0.
• An event that can never occur is called a null event, and it is
represented by the symbol φ.
• Most events have probabilities between 0 and 1.
Events and Probability
• Using the frequentist definition of the probability of an event A, we
can calculate the probability of the complementary event Ac in a
straightforward manner.
• So, if the experiment is repeated under essentially identical
conditions n times, and the event A occurs m times, the event Ac
must occur n-m times,
• Hence, for large n, P(Ac) = (n − m)/n = 1 − m/n = 1 − P(A).
• Two events A and B that cannot occur simultaneously are said to
be mutually exclusive or disjoint.
Events and Probability
• What are examples of mutually exclusive events?
• If A is the event that the age of a juvenile offender is less than 13
years, and B is the event that it is between 13 and 15 years, for
example, the events A and B are mutually exclusive.
• Any juvenile cannot be in both categories at the same time!
• A∩B = φ and P(A∩B) = 0.
• See figure 6.2 for a Venn diagram representing the mutually
exclusive events.
• In such a situation, the additive rule of probability states that
• P(A ∪ B) = P(A) + P(B).
Events and Probability
• The additive rule can be extended to the case of three or more
mutually exclusive events
• If A1, A2,…, An are n events such that A1 ∩ A2 = φ , A1 ∩ A3 = φ and
so on for all possible pairs, then
• P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)
• If the events A and B are not mutually exclusive, then the additive
rule no longer applies.
• If A: age of juvenile is under 13 years and B: age of juvenile is
under 15 years.
Events and Probability
• The two events can occur simultaneously: consider a juvenile
aged 12 years; there is some area where the two events overlap.
• Therefore, the probability that either of the events occur is the
sum of their individual probabilities minus the probability of their
intersection.
• P(A ∪ B) = P(A) + P(B) − P(A∩B).
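The general additive rule is easy to check numerically. A minimal sketch with assumed, illustrative probabilities for the two overlapping juvenile-age events:

```python
# A: juvenile is under 13; B: juvenile is under 15. The events overlap,
# since anyone under 13 is also under 15. Probabilities are assumed
# for illustration only.
p_a = 0.20          # P(A), assumed
p_b = 0.45          # P(B), assumed
p_a_and_b = 0.20    # A is contained in B here, so P(A∩B) = P(A)

# Additive rule for events that are NOT mutually exclusive
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.45, as expected since A lies inside B
```

Had we naively added P(A) + P(B) = 0.65, the overlap would have been counted twice; subtracting P(A∩B) corrects for that.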
Events and Probability
• Let A represent the event that a particular individual is exposed to
high levels of violence and B is the event that he / she is exposed
to high levels of stress.
1. What is the event AUB?
2. What is the event A∩B?
3. What is the complement of A?
4. Are the events A and B mutually exclusive?
Conditional Probability
• There is often an interest in determining the probability that an event
B will occur given that we already know the outcome of another
event A.
• Does the prior occurrence of A cause the probability of B to change?
• For example, given that a person has already lived till the age of 60,
we might want to know what the probability is that the person
survives till the age of 65.
• Here we are dealing with conditional probability.
• P(B|A) denotes the probability of B given that A has already occurred.
Conditional Probability
• The multiplicative rule of probability states that the probability that two
events A and B will both occur is equal to the probability of A times the
probability of B given that A has already occurred.
• P(A∩B)=P(A) P(B|A). (1)
• Since it is arbitrary which event we call A and which we call B,
• P(A∩B)=P(B) P(A|B). (2)
• Dividing both sides of equation (1) by P(A), we get the formula for
conditional probability to be:
• P(B|A) = P(A∩B)/P(A), given P(A) ≠ 0, and similarly,
• P(A|B) = P(A∩B)/P(B), given P(B) ≠ 0.
Conditional Probability
• Let us review this concept with the example on page 130.
• When we are concerned with two events such that the outcome of one
event has no effect on the occurrence or non-occurrence of the other,
the events are said to be independent.
• In such a case,
• P(A|B) = P(A) and P(B|A) = P(B).
• In this special case, the multiplicative rule of probability may be written
as:
• P(A∩B) = P(A)P(B)
• It is important to note that the terms independent and mutually
exclusive do not mean the same thing.
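The distinction between independence and mutual exclusivity can be made concrete with assumed probabilities:

```python
# Independent events satisfy P(A∩B) = P(A)P(B).
# Mutually exclusive events satisfy P(A∩B) = 0, a very different condition.
p_a, p_b = 0.5, 0.3   # assumed, illustrative probabilities

p_both_if_independent = p_a * p_b   # multiplicative rule, special case
p_both_if_exclusive = 0.0           # disjoint events never co-occur

# Consequence: two events that each have non-zero probability cannot be
# both independent and mutually exclusive, since P(A)P(B) > 0 ≠ 0.
print(p_both_if_independent, p_both_if_exclusive)
```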
Bayes’ Theorem
• Let us revisit the hearing impairment data.
• The 163,157 persons in the study were divided into three mutually
exclusive categories.

Employment Status         Population   Impairments
Currently employed            98,917           552
Currently unemployed           7,462            27
Not in the labour force       56,778           368
Total                        163,157           947


Bayes’ Theorem
• Let E1 be the event that an individual included in the survey is
currently employed,
• E2 the event that he / she is currently unemployed,
• E3 the event that the individual is not in the labour force.
• Under the assumption that these numbers are big enough to
satisfy the frequentist definition of probability, based on the data:
• P(E1) = 98,917/163,157 = 0.6063
• P(E2) = 7462/163,157 = 0.0457
• P(E3) = 56,778/163,157 = 0.3480
Bayes’ Theorem
• Let S be the event that either one of these three situations holds.
• S = E1 U E2 U E3.
• These three are mutually exclusive, using the additive rule of
probability: P(S) = P(E1 U E2 U E3) =P(E1)+P(E2)+P(E3)
• = 163,157/163,157 = 1.
• When the probabilities of mutually exclusive events add to 1,
they are said to be exhaustive.
• Let H be the event that an individual has hearing impairment due
to injury.
Bayes’ Theorem
• Overall,
• P(H) = 947/163,157 = 0.0058
• If we look at the employment status subgroup separately;
• P(H|E1) = P(an individual has a hearing impairment | he / she is
currently employed) = 552 / 98,917 = 0.0056,
• P(H|E2) = P(an individual has a hearing impairment | he / she is
currently unemployed) = 27 / 7462 = 0.0036,
• P(H|E3) = P(an individual has a hearing impairment | he / she is
not in the labour force) = 368 / 56,778 = 0.0065.
Bayes’ Theorem
• We can walk through the logic of arriving at the probability of P(H)
via events defined and conditional probability rules along with
their properties.
• H is the union of three mutually exclusive events: E1∩H, E2∩H
and E3∩H.
• Hence, P(H) = P(E1∩H) + P(E2∩H) + P(E3∩H)
• This is sometimes called the rule of total probability.
• P(H) =P(E1)P(H|E1) + P(E2)P(H|E2) + P(E3)P(H|E3) (3)
• Based on the data values summarized in the table on page
133, P(H) comes out to be approximately 0.0058.
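The rule of total probability can be verified directly from the survey counts above:

```python
# Hearing-impairment counts per employment category, from the text.
population = {"employed": 98_917, "unemployed": 7_462, "not_in_labour_force": 56_778}
impaired   = {"employed": 552,    "unemployed": 27,    "not_in_labour_force": 368}

total = sum(population.values())   # 163,157 persons in all

p_e = {k: n / total for k, n in population.items()}           # P(Ei)
p_h_given_e = {k: impaired[k] / population[k] for k in population}  # P(H|Ei)

# Rule of total probability: P(H) = Σ P(Ei) P(H|Ei)
p_h = sum(p_e[k] * p_h_given_e[k] for k in population)
print(round(p_h, 4))  # 0.0058, matching the direct calculation 947/163,157
```

Because the P(Ei) weights and the conditionals share the same denominators, the sum collapses algebraically to 947/163,157, which is exactly the direct calculation.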
Bayes’ Theorem
• The more complicated expression (3) may be useful when we are
unable to calculate P(H) directly.
• From the multiplicative rule of probability, we have
• P(E1∩H) = P(H)P(E1|H);
• Therefore, P(E1|H) = P(E1∩H)/P(H).
• Applying the multiplicative rule again to the numerator:
• P(E1|H) = P(E1)P(H|E1)/P(H)
Bayes’ Theorem
• Substituting expression (3) into the denominator, we get:
• P(E1|H) = P(E1)P(H|E1) / [P(E1)P(H|E1) + P(E2)P(H|E2) + P(E3)P(H|E3)]

• This is known as Bayes’ Theorem.


• Now, since we have all the probabilities, by substituting these, we
get; P(E1|H) = 0.583. Try it out!
• Cross-checking this with the original data also gives us
• P(E1|H) = 552/947 = 0.583.
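Bayes' theorem can likewise be checked against the counts:

```python
# (population, impaired) counts per employment category, from the text.
counts = {
    "employed": (98_917, 552),
    "unemployed": (7_462, 27),
    "not_in_labour_force": (56_778, 368),
}
total = sum(pop for pop, _ in counts.values())

def p_e(cat: str) -> float:        # P(Ei)
    return counts[cat][0] / total

def p_h_given(cat: str) -> float:  # P(H|Ei)
    pop, imp = counts[cat]
    return imp / pop

# Bayes' theorem: numerator over the total-probability denominator
denom = sum(p_e(c) * p_h_given(c) for c in counts)   # this is P(H)
p_e1_given_h = p_e("employed") * p_h_given("employed") / denom
print(round(p_e1_given_h, 3))  # 0.583, agreeing with 552/947
```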
Sensitivity and Specificity
• Diagnostic testing is one area where Bayes’ theorem is applied.
• Suppose we are interested in two mutually exclusive and exhaustive states of
conviction:
• D1: Event that an individual is innocent
• D2: Event that an individual is guilty
• T+: the event of a positive DNA test
• P(test negative | guilty) is a false negative.
• The probability of a positive test result given that the individual is actually
guilty is called the sensitivity of the test.
• P(test positive | innocent) is a false positive.
• The specificity of a test is the probability that its result is negative given that
the individual is innocent.
Sensitivity and Specificity
• Let us go through the example on page 139
• D1: Individual suffering from TB
• P(D1) can also be interpreted as the proportion of individuals who
have TB at a given point in time, or a prevalence of the disease.
• D2: Individual not suffering from TB
• Let T+ denote a positive x-ray result
• Using Bayes’ theorem, we wish to find the probability that an
individual who tests positive for the disease actually has it.
• P(D1 | T+) is the probability we wish to find.
Sensitivity and Specificity
• So, we get P(D1 | T+) = 0.00239.
• This implies that of every 100,000 positive x-rays, only 239 signal true TB cases.
• Note, that prior to taking an x-ray, a randomly selected individual from
the population has a 0.000093 chance of having TB.
• This is called the prior probability.
• After an x-ray is taken and the result is positive, the same individual has
a 0.00239 chance of being afflicted with TB. This is the posterior
probability.
• Although 99,761/100,000 persons with a positive x-ray do not actually
have the disease, we have greatly increased the chance of a proper
diagnosis.
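The diagnostic calculation is just Bayes' theorem and can be packaged as a small function. The sensitivity and specificity below are assumed placeholders, not the page-139 values, which are not reproduced here:

```python
def posterior(prevalence: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) via Bayes' theorem.
    prevalence  = P(D1), the prior probability of disease
    sensitivity = P(T+ | D1); specificity = P(T- | D2)."""
    # Rule of total probability for a positive result:
    p_positive = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    return prevalence * sensitivity / p_positive

# TB prior from the text (9.3 per 100,000); test characteristics assumed.
print(posterior(prevalence=0.000093, sensitivity=0.80, specificity=0.97))
```

Even with a fairly good test, a tiny prior keeps the posterior small: most positives come from the huge disease-free group, which is the point the slide makes.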
Two common fallacies
• The prosecutor’s fallacy is an error of logic that can arise in
reasoning about DNA profile evidence and may be regarded as a
generalization of the error in elementary logic of confusing “A
implies B” with “B implies A”.
• This consists of confusing P(A|B) with P(B|A).
• The probability of DNA evidence given that the suspect is innocent
is often taken to be very small.
• It does not however, immediately follow that the probability that
the suspect is innocent, given the DNA evidence is also very
small.
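A small numerical sketch, with every figure assumed for illustration, shows how far apart the two conditional probabilities can be:

```python
# All numbers below are assumptions for illustration only.
p_match_given_innocent = 1e-6       # random-match probability of the profile
p_match_given_guilty = 1.0          # the true source always matches
prior_guilt = 1 / 100_000           # suspect drawn from a large pool

# Bayes' theorem for P(innocent | match):
p_match = (prior_guilt * p_match_given_guilty
           + (1 - prior_guilt) * p_match_given_innocent)
p_innocent_given_match = ((1 - prior_guilt) * p_match_given_innocent) / p_match
print(round(p_innocent_given_match, 3))  # 0.091
```

P(match | innocent) is one in a million, yet P(innocent | match) here is about 9%: the two conditionals differ by five orders of magnitude, which is exactly the prosecutor's fallacy.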
Two common fallacies
• Another error of logic that can arise in connection with DNA
evidence usually favours the defendant and is consequently
dubbed the defendant’s fallacy.
• Suppose a crime occurs in a nation of 100 million people and a
profile frequency is reported as 1 in 1 million.
• The fallacy consists of arguing that since the expected number of
people in the nation with a matching profile is 100, the probability
that the defendant is guilty is only 1/100.
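The arithmetic behind the fallacy is easy to reproduce; the error is in treating the expected matches as equally likely suspects while ignoring all other evidence:

```python
population = 100_000_000
match_prob = 1 / 1_000_000   # reported profile frequency

# Expected number of people in the nation whose profile matches
expected_matches = population * match_prob   # about 100 people

# The fallacious step: spreading guilt uniformly over those 100 people,
# as if the DNA profile were the only evidence in the case.
naive_guilt = 1 / expected_matches           # the "1/100" of the fallacy
print(expected_matches, naive_guilt)
```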
ROC Curves
• Diagnosis is an imperfect process.
• Though it is desirable to have both high sensitivity and high specificity,
no diagnostic procedure achieves both perfectly; improving one typically
comes at the expense of the other.
• Table 6.1 on page 141 shows the sensitivity and specificity of the serum
creatinine level for predicting transplant rejection.
• The relationship between sensitivity and specificity may be illustrated
using a graph known as a Receiver Operator Characteristic (ROC)
curve.
• It is a line graph that plots the probability of a true positive result
(sensitivity) against the probability of a false positive result for a range
of different cut-off points.
• See figure 6.4, page 142.
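The construction of an ROC curve can be sketched as follows, using made-up test scores and disease labels; each cut-off classifies scores at or above it as positive:

```python
# Made-up diagnostic scores and true disease status (1 = diseased).
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1,   1,   0,    1,   0,    0,   1,   0]

def roc_points(scores, labels):
    """For each cut-off, return (false positive rate, true positive rate),
    i.e. (1 - specificity, sensitivity): the points of the ROC curve."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / neg, tp / pos))
    return points

for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these pairs gives the ROC curve: a strict cut-off sits near (0, 0), a lenient one near (1, 1), and the trade-off between sensitivity and specificity is the shape of the curve between them.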
The Relative Risk and Odds Ratio
• The concept of relative risk is often useful when we want to compare
the probabilities of disease in two different groups or situations.
• The relative risk (RR) is the chance that a member of a group receiving
some exposure will develop disease relative to the chance that a
member of an unexposed group will develop the same disease.
• RR = P(disease | exposed) / P(disease | unexposed)
• You can think about it like this: what is the chance that an individual in
a locality with a high crime rate will develop depression or PTSD,
relative to an individual in a locality without a high crime rate?
The Relative Risk and Odds Ratio
• In general, a relative risk of 1.0 indicates that the probabilities of
disease are identical in the exposed and unexposed groups.
• If we take the earlier example and are presented with a RR = 1.33,
how to interpret this?
• A relative risk of 1.33 implies that individuals exposed to high
levels of crime are 33% more likely to develop depression or PTSD.
• To generalize, a RR > 1 implies an increased risk of disease for the
exposed vs unexposed group and,
• A RR < 1 indicates a decreased risk of disease for the exposed
relative to the unexposed.
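A quick check of the RR = 1.33 interpretation, with group probabilities assumed purely to produce that ratio:

```python
# Assumed, illustrative probabilities (not from the text):
p_disease_exposed = 0.20     # P(depression/PTSD | high-crime locality)
p_disease_unexposed = 0.15   # P(depression/PTSD | other locality)

rr = p_disease_exposed / p_disease_unexposed
print(round(rr, 2))  # 1.33: the exposed group is 33% more likely to develop disease
```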
The Relative Risk and Odds Ratio
• Another commonly used measure of the relative probabilities of
disease is the odds ratio, or relative odds.
• If an event takes place with probability p, the odds in favour of the
event are p / (1-p) to 1.
• Using the TB example, if for every 100,000 individuals there are 9.3
cases of TB, the odds of a randomly selected person’s having the
disease are:
• (9.3/100,000) / (99,990.7/100,000) = 0.0000931 to 1.
• Conversely, if we know that the odds in favour of an event are a
to b, the probability that the event will occur is a / (a+b).
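Both conversions can be written as one-line functions and checked on the TB figures:

```python
def odds_from_prob(p: float) -> float:
    """Odds in favour of an event with probability p: p/(1-p) to 1."""
    return p / (1 - p)

def prob_from_odds(a: float, b: float) -> float:
    """If the odds in favour of an event are a to b, P(event) = a/(a+b)."""
    return a / (a + b)

# TB example from the text: 9.3 cases per 100,000 individuals.
p_tb = 9.3 / 100_000
print(odds_from_prob(p_tb))            # roughly 0.000093 to 1
print(prob_from_odds(9.3, 99_990.7))   # recovers 0.000093
```

For rare events the odds and the probability are nearly the same number, which is why the two quantities are easy to conflate.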
The Relative Risk and Odds Ratio
• The odds ratio is defined as the odds of disease among exposed
individuals divided by the odds of disease among the unexposed.

• OR = [P(disease | exposed)/(1 − P(disease | exposed))] /
[P(disease | unexposed)/(1 − P(disease | unexposed))]
• It can also be defined as the odds of exposure among the diseased
divided by the odds of exposure among the non-diseased.
• Mathematically, these two are equivalent.
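The equivalence of the two definitions can be confirmed on a made-up 2×2 table of counts:

```python
# Made-up 2×2 table:            diseased   healthy
exposed_d, exposed_h = 30, 70        # exposed row
unexposed_d, unexposed_h = 10, 90    # unexposed row

# Definition 1: odds of disease among exposed / odds among unexposed
or_disease = (exposed_d / exposed_h) / (unexposed_d / unexposed_h)

# Definition 2: odds of exposure among diseased / odds among non-diseased
or_exposure = (exposed_d / unexposed_d) / (exposed_h / unexposed_h)

# Both reduce to the cross-product ratio (30*90)/(70*10) = 27/7
print(or_disease, or_exposure)
```

Both expressions rearrange to the same cross-product ratio ad/bc, which is why the OR can be estimated from case-control studies where only the exposure odds are observable.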
The Relative Risk and Odds Ratio
• The RR and the OR are two different measures that attempt to
explain the same phenomenon.
• Although the RR might be more intuitive, the OR has better
statistical properties as a result of which it is preferred.
• In any event, for rare diseases, the odds ratio is a close
approximation of the relative risk.
• Refer to page 147 for the proof.
• Let us walk through the subsequent example on the same page to
understand it from a numerical perspective.
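The rare-disease approximation can be seen numerically with assumed small probabilities:

```python
# Assumed rare-disease probabilities in the two groups:
p_exp, p_unexp = 0.002, 0.001

rr = p_exp / p_unexp
odds_ratio = (p_exp / (1 - p_exp)) / (p_unexp / (1 - p_unexp))

# When disease is rare, 1 - P(disease) is close to 1 in both groups,
# so the odds ratio barely differs from the relative risk.
print(rr, round(odds_ratio, 4))  # 2.0 vs roughly 2.002
```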
Numericals
• Review exercises 6.7: Problem 8 on page 155
Next Session
• Theoretical probability distributions
• Normal distribution
• Binomial distribution
• Sampling distribution of the mean
• Central limit theorem
• Confidence intervals
Prescribed reading
• From Statistical Science in the Courtroom: Statistical consulting
in the legal environment, pages 245-261
• From Statistics in the Law: Discrimination in Employment (93-98),
Driving while Black (137-146, 154-158), Racial Steering (168-173)
• Please read as much as possible.
• Some statistical topics in the readings have been covered in class,
while some will be covered subsequently.
