NLP_Week_02

The document provides an overview of text normalization and tokenization in natural language processing (NLP), detailing processes such as word segmentation, morphology, and the importance of language models. It discusses techniques like stemming, lemmatization, and Byte-Pair Encoding for effective tokenization, along with challenges in tokenization across different languages. Additionally, it covers fundamental concepts of probability relevant to language models, including joint and conditional probabilities, and introduces Bayes' theorem.


CSCS 366 – Intro. to NLP
Faizad Ullah

Text Normalization

Tokenization
• Before almost any natural language processing of a text, the text has to be normalized, a task called text normalization.
1. Tokenizing (segmenting) words
2. Normalizing word formats
3. Segmenting sentences
Tokenization
• Tokenization: to extract linguistic units of interest from running text
• Linguistic unit?
• Character, word, sentence, paragraph, …
• The most common is the word
• The 1989 edition of the Oxford English Dictionary had 615,000 entries.
Words
• Type: an element of the vocabulary; the types are the distinct words in a corpus.
• If the set of words in the vocabulary is V, the number of types is the vocabulary size |V|.
• Token/Instance: an occurrence of a type in running text.
• The word instances are the total number N of running words.


Words
• How many words/tokens and types are in the following sentence?
• He stepped out into the hall, was delighted to encounter a water brother.
• This sentence has 13 words if we don’t count punctuation marks as words, 15 if we count punctuation.
• Whether we treat the period (“.”), comma (“,”), and so on as words depends on the task.
Words
• I do uh main- mainly business data processing.
• This utterance has two kinds of disfluencies.
• The broken-off word main- is called a fragment.
• Words like uh and um are called fillers or filled pauses.
• Do we consider these to be words?
• It depends on the application.
Morphology
• A morpheme is the smallest meaning-bearing unit of a language; for example, the word unwashable has the morphemes un-, wash, and -able.
• Some languages, like Japanese, don’t have spaces between words, so word tokenization becomes more difficult.
Word Normalization
• Word normalization is the task of putting words or tokens in a standard format.
• The world has 7,097 languages at the time of this writing, according to the online Ethnologue catalog (Simons and Fennig, 2018).
• It is important to test algorithms on more than one language, and particularly on languages with different properties; by contrast there is an unfortunate current tendency for NLP algorithms to be developed or tested just on English (Bender, 2019).
• Code switching: using more than one language within a single communicative act.
How many words?

N = number of tokens/instances
V = vocabulary = the set of types
|V| = the size of the vocabulary
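A minimal sketch of counting tokens and types (assuming Python, a simple regex tokenizer, lowercasing before counting types, and a made-up toy sentence):

```python
import re

sentence = "The cat sat on the mat and the dog sat on the rug."

tokens = re.findall(r"\w+", sentence.lower())  # word instances, punctuation ignored
types = set(tokens)                            # the vocabulary V

print(len(tokens))  # N = 13
print(len(types))   # |V| = 8  ("the", "sat", and "on" each occur more than once)
```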
Issues in Tokenization

• Finland’s capital → Finland Finlands Finland’s ?
• what’re, I’m, isn’t → what are, I am, is not
• Hewlett-Packard → Hewlett Packard ?
• state-of-the-art → state of the art ?
• lowercase → lower-case lowercase lower case ?
• San Francisco → one token or two?
• m.p.h., Ph.D. → ??
Word Tokenization in Chinese

• Also called Word Segmentation

• Chinese words are composed of characters

• Characters are generally 1 syllable and 1 morpheme.

• Standard baseline segmentation algorithm:

• Maximum Matching (also called Greedy)


Maximum Matching Word Segmentation
• Given a wordlist of Chinese, and a string:
1. Start a pointer at the beginning of the string
2. Find the longest word in the dictionary that matches the string starting at the pointer
3. Move the pointer over that word in the string
4. Go to step 2
Maximum Matching Word Segmentation
• Thecatinthehat → the cat in the hat
• Thetabledownthere → the table down there (the intended segmentation)
  → theta bled own there (what greedy maximum matching actually produces)
• Doesn’t generally work in English!
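A toy Python sketch of the greedy procedure above, assuming a small hand-built wordlist (real Chinese segmenters use a large dictionary and language-specific handling); it reproduces both example segmentations:

```python
def max_match(text, wordlist):
    """Greedy left-to-right maximum matching segmentation."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest possible match first; fall back to a single character.
        for j in range(len(text), i, -1):
            if text[i:j] in wordlist or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

wordlist = {"the", "table", "theta", "bled", "own", "down", "there", "cat", "in", "hat"}
print(max_match("thecatinthehat", wordlist))     # ['the', 'cat', 'in', 'the', 'hat']
print(max_match("thetabledownthere", wordlist))  # ['theta', 'bled', 'own', 'there']
```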


Example
• He sat on the chair, but he likes sitting on the floor.
• N = ?
• V = ?
• Normalization: lowercasing, stemming, lemmatization, stopword removal, punctuation removal, vectorization
Stemming and Lemmatization
• Stemming: reduces words to their base or root form by chopping off affixes (e.g., "running" → "run").
• Lemmatization: converts words to their dictionary form (e.g., "better" → "good" or "running" → "run") using context and linguistic rules.
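A small illustration using NLTK's Porter stemmer and WordNet lemmatizer (assuming NLTK is installed and the WordNet data has been downloaded; the outputs in the comments are the usual ones, but details can vary across NLTK versions):

```python
# pip install nltk ; then: import nltk; nltk.download("wordnet")
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("running"))                   # 'run'   (affix chopped off)
print(stemmer.stem("studies"))                   # 'studi' (a stem need not be a real word)
print(lemmatizer.lemmatize("running", pos="v"))  # 'run'   (verb lemma)
print(lemmatizer.lemmatize("better", pos="a"))   # 'good'  (adjective lemma)
```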
Stemming and Lemmatization
• He sat on the chair but he likes sitting on the floor
• he sit <SW> <SW> chair but he like sit <SW> <SW> floor
• N = ?
• V = ?
• <DATE>, <UNK>: special placeholder tokens (for dates and unknown words, respectively)
Byte-Pair Encoding: A Bottom-up Tokenization Algorithm

Byte-Pair Encoding
• BPE is most commonly used by large language models for word tokenization.
• Instead of defining tokens as words (whether delimited by spaces or more complex algorithms), or as characters (as in Chinese), we can use our data to automatically tell us what the tokens should be.
• NLP algorithms often learn some facts about language from one corpus (a training corpus) and then use these facts to make decisions about a separate test corpus and its language.
Byte-Pair Encoding
• Thus, if our training corpus contains, say, the words low, new, and newer, but not lower, then if the word lower appears in our test corpus, our system will not know what to do with it.
• To deal with this unknown word problem, modern tokenizers automatically induce sets of tokens that include tokens smaller than words, called subwords.
Byte-Pair Encoding Algorithm
• The BPE algorithm starts with a vocabulary containing only individual
characters.
• It scans the training corpus to find the two symbols that are most
frequently adjacent (e.g., ‘A’ and ‘B’).
• A new merged symbol (e.g., ‘AB’) is added to the vocabulary, and every
occurrence of adjacent ‘A’ and ‘B’ is replaced with ‘AB’ in the corpus.
• This process of counting and merging continues, forming longer character
strings until k merges have been completed, resulting in k novel tokens.
• k is a parameter of the algorithm, determining the number of new tokens.
• The final vocabulary consists of the original set of characters plus the k
new symbols.
Byte-Pair Encoding Algorithm
• The algorithm is usually run inside words (not merging across word boundaries).
• The input corpus is first white-space-separated to give a set of strings, each corresponding to the characters of a word plus a special end-of-word symbol (e.g., _), and its counts.
Byte-Pair Encoding Algorithm
Corpus of 18 word tokens with counts for each word (the word low appears 5 times, the word newer 6 times, and so on), which would have a starting vocabulary of 11 letters.
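A minimal Python sketch of the BPE token learner described above, run on the example corpus (the full breakdown low ×5, lowest ×2, newer ×6, wider ×3, new ×2 is assumed from the stated counts); `_` marks the end of a word:

```python
from collections import Counter

# Assumed example corpus: each word is a tuple of characters plus the end-of-word marker "_".
corpus = {("l", "o", "w", "_"): 5, ("l", "o", "w", "e", "s", "t", "_"): 2,
          ("n", "e", "w", "e", "r", "_"): 6, ("w", "i", "d", "e", "r", "_"): 3,
          ("n", "e", "w", "_"): 2}

def get_pair_counts(corpus):
    """Count how often each adjacent symbol pair occurs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in corpus.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(corpus, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for word, freq in corpus.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

k = 8  # number of merges (the algorithm's parameter)
merges = []
for _ in range(k):
    pairs = get_pair_counts(corpus)
    best = max(pairs, key=pairs.get)   # most frequent adjacent pair
    merges.append(best)
    corpus = merge_pair(corpus, best)

print(merges)  # e.g. the first merges: ('e','r'), ('er','_'), ('n','e'), ('ne','w'), ...
```

The final vocabulary is the original characters plus the k merged symbols, exactly as described on the previous slide.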
Sentence Segmentation

Sentence Segmentation
• !, ? are relatively unambiguous
• Period “.” is quite ambiguous
• Sentence boundary
• Abbreviations like Inc. or Dr.
• Numbers like .02% or 4.3
• Build a binary classifier
• Looks at a “.”
• Decides End-of-Sentence/Not-End-of-Sentence
• Classifiers: hand-written rules, regular expressions, or
machine learning
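A toy rule-based sketch of such a classifier (assuming Python; the abbreviation list and the example sentence are illustrative only, and real systems typically learn this decision from data):

```python
import re

ABBREVIATIONS = {"dr.", "mr.", "mrs.", "prof.", "inc.", "e.g.", "i.e."}

def split_sentences(text):
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]", text):
        end = match.end()
        words = text[:end].split()
        token = words[-1].lower() if words else ""   # word containing the candidate boundary
        following = text[end:].lstrip()[:1]          # first non-space character after it
        if match.group() == "." and (token in ABBREVIATIONS
                                     or following.islower() or following.isdigit()):
            continue  # likely an abbreviation or a number such as 4.3, not a boundary
        sentences.append(text[start:end].strip())
        start = end
    if text[start:].strip():                         # trailing material without final punctuation
        sentences.append(text[start:].strip())
    return sentences

print(split_sentences("Dr. Smith arrived at 4.30 p.m. He paid $3.5 million to Acme Inc. yesterday."))
# ['Dr. Smith arrived at 4.30 p.m.', 'He paid $3.5 million to Acme Inc. yesterday.']
```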
Determining if a word is end-of-sentence
Language Models

Language Models
• A language model (LM) is a machine learning model that predicts upcoming words.
• More formally, a language model assigns a probability to each possible next word, or equivalently gives a probability distribution over possible next words.
• Language models can also assign a probability to an entire sentence.
Language Models
• Thus, an LM could tell us that the following sequence has a much higher probability of appearing in a text:
• all of a sudden I notice three guys standing on the sidewalk
• than does this same set of words in a different order:
• on guys all I of notice sidewalk three a sudden standing the
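A minimal count-based sketch of the idea that an LM gives a probability distribution over possible next words (this is a toy bigram estimate, not the method of any particular model; the words appended after the slide's example sentence are invented so that "the" has more than one possible continuation):

```python
from collections import Counter, defaultdict

# Toy corpus: the slide's example sentence plus a few invented words.
corpus = ("all of a sudden I notice three guys standing on the sidewalk "
          "standing on the corner of the street").split()

# Count which words follow which.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_distribution(prev):
    """Relative-frequency estimate of P(next word | previous word)."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_distribution("the"))       # {'sidewalk': 0.33.., 'corner': 0.33.., 'street': 0.33..}
print(next_word_distribution("standing"))  # {'on': 1.0}
```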


Basic Probability

Probability

• Fair Coin Toss:
• Probability of heads: ½ → P(H) = 0.5
• Probability of tails: ½ → P(T) = 0.5
• The fair coin toss universe has only two outcomes; there is no other possibility.
Probability

• Fair die roll
• Probability of getting a 6: 1/6 → P(‘6’) ≈ 0.167
• There are 6 possible outcomes in this universe, each equally likely.


Joint Probability
• Joint probability is a statistical measure of the likelihood of two events occurring together, at the same point in time.
• Suppose we throw a white die and a black die simultaneously. What is the probability that the outcome sums to 3?
• (1,2) and (2,1) are the only two of the 36 possibilities that sum to 3.
• So: P(sum = 3) = 2/36


Conditional Probability
• Conditional probability is the probability of an event or outcome occurring, given that a previous event or outcome has occurred.
• Now suppose we have already thrown the black die and got a 2.
• What is the probability of “sums to 3” given this event?
• So: P(sum = 3 | black die shows 2) = 1/6
• Only one possibility out of the 6 remaining outcomes gives a sum of 3.
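A short sanity check of these two numbers by enumerating the 36 equally likely outcomes (a sketch in Python; outcomes are written as (white, black) pairs):

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))        # all 36 (white, black) rolls

sum_is_3   = [o for o in outcomes if sum(o) == 3]      # {(1,2), (2,1)}
black_is_2 = [o for o in outcomes if o[1] == 2]        # 6 outcomes
both       = [o for o in sum_is_3 if o[1] == 2]        # {(1,2)}

print(len(sum_is_3) / len(outcomes))   # P(sum = 3) = 2/36 ≈ 0.056
print(len(both) / len(black_is_2))     # P(sum = 3 | black = 2) = 1/6 ≈ 0.167
```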
Conditional Probability
• A Universe with all possible outcomes
• Interested in some subset of them (some event)
• Assume we are studying diabetes:
• We observe people and see whether they have diabetes or not
• If we take as our Universe, all the people participating in our
study, then there are two possible outcomes for any individual:
Either they have diabetes, or they do not have diabetes
• We can then split our universe in two events:
• The event “people with diabetes” (designated as 𝐴)
• The event “people with no diabetes” (designated as ~𝐴)
Conditional Probability

• So, what is the probability that a randomly chosen person has diabetes?
• It is the number of elements in A divided by the number of elements in U (the universe).
• We denote the number of elements of A as |A| (the cardinality of A).
• We define the probability of A as: P(A) = |A| / |U|
• Since A can have at most the same number of elements as U, the probability P(A) can be at most 1.
Conditional Probability
• Let’s say there is a new screening test
that is supposed to measure
something
• That test will be “positive” for some
people, and “negative” for others.
• If we take the event B to be “people
for whom the test is positive”
• What is the probability that the test
will be “positive” for a randomly
selected person?
The Two Events Jointly
• What happens if we put them together?
• We can then compute the probability of both events occurring together as: P(A ∩ B) = |A ∩ B| / |U|
The Two Events Jointly
• We are dealing with:
• An entire Universe (all people)
• The event A (people with diabetes)
• The event B (people for whom the test is positive)
• There is also an overlap, the event AB (A ∩ B): “People with diabetes and with a positive test result.”
• There is also the event B − AB: “People with a positive test result and without diabetes.”
• And the event A − AB: “People with diabetes and with a negative test result.”


Conditional Probability
• “Given that the test is positive for a randomly selected individual, what is the probability that said individual has diabetes?”
• In terms of our Venn diagram: given that we are in region B, what is the probability that we are in region AB?
• Or stated differently: “If we make region B our new Universe, what is the probability of A?”
• The notation for this is P(A|B) (the probability of A given B).
Conditional Probability

• Let us convert the counts to probabilities: P(A|B) = |AB| / |B|
• Dividing both the numerator and the denominator by |U|, we get:
• P(A|B) = P(AB) / P(B)   (Equation 1)
• What we’ve effectively done is change the Universe from U (all people) to B (people for whom the test is positive), but we are still dealing with probabilities defined in U.
Conditional Probability
• Now let’s ask the converse question:
• “Given that a randomly selected individual has diabetes (event A), what is the probability that the test is positive for that individual (event B)?”
• P(B|A) = |AB| / |A| = P(AB) / P(A)   (Equation 2)
The Bayes Theorem
• Now we have everything we need to derive Bayes’ theorem. Putting Equations 1 and 2 together:
• From Equation 1: P(AB) = P(A|B) P(B). From Equation 2: P(AB) = P(B|A) P(A).
• Therefore P(A|B) P(B) = P(B|A) P(A), which gives P(A|B) = P(B|A) P(A) / P(B).
• Which is to say, P(A ∩ B) is the same whether you’re looking at it from the point of view of A or B.
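As a quick numeric check, using the dice example from earlier (taking A = “sum is 3” and B = “black die shows 2”):
• P(A) = 2/36, P(B) = 6/36 = 1/6, and P(B|A) = 1/2 (the black die shows 2 in exactly one of the two outcomes (1,2) and (2,1)).
• Bayes’ theorem then gives P(A|B) = P(B|A) P(A) / P(B) = (1/2)(2/36) / (1/6) = (1/36) / (1/6) = 1/6, which matches the conditional probability computed directly on the earlier slide.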
Independence
• If the probability of occurrence of an event A is not affected
by the occurrence of another event B, then A and B are said
to be independent events.
• A = “Today is Friday”
• B = “Heads on fair coin”
• If A and B are independent:
• P(A∩B) = P(A)P(B)
• Or stated a bit differently:
• P(A|B) = P(A) if P(B) > 0 and P(B|A) = P(B) if P(A) > 0
• P(A|B) = P(A∩B) / P(B) is not defined when P(B) = 0
• P(B|A) = P(A∩B) / P(A) is not defined when P(A) = 0
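A quick check of P(A∩B) = P(A)P(B) by enumeration, using exact fractions (a sketch assuming the two-dice setup from earlier; the particular events are chosen here only for illustration):

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))    # all 36 (white, black) rolls

A = {o for o in outcomes if o[0] % 2 == 0}        # "white die is even"
B = {o for o in outcomes if o[1] == 2}            # "black die shows 2"

p_a  = Fraction(len(A), 36)                       # 1/2
p_b  = Fraction(len(B), 36)                       # 1/6
p_ab = Fraction(len(A & B), 36)                   # 1/12

print(p_ab == p_a * p_b)                          # True: A and B are independent
```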
Independence and Mutual Exclusion
Summary
• For independent events A and B: P(A ∩ B) = P(A) P(B)
• For independent events A, B and C: P(A ∩ B ∩ C) = P(A) P(B) P(C)
• For dependent events A and B: P(A ∩ B) = P(A) P(B|A)
• For dependent events A, B and C: P(A ∩ B ∩ C) = P(A) P(B|A) P(C|A ∩ B)


Sources
• https://web.stanford.edu/~jurafsky/slp3/3.pdf
