Jamie and Debbie
Background
Jamie and Debbie got married. A couple of weeks after getting married,
they left England and started travelling around the world. Jamie and
Debbie stayed in touch with their family and friends via email. However,
after about 2 years, Debbie's mother started to worry about her.
She contacted the police and said:
Question: What do you think is the problem?
Their opinion
“We think that the emails we receive from Debbie are not written by
Debbie.” They thought that the wording of the emails was not the type of
wording that their daughter would use.
Question:
Would your family notice if you did not write your own email?
Why do you think so?
Police investigation
Datasets
The detectives at Nottingham police station collected the emails that the
mother had received from Debbie’s email address.
The emails were divided into two datasets:
• Dataset 1: Emails received from Debbie before marriage
• Dataset 2: Emails received from Debbie after marriage
The detectives also felt that the style of writing was not the same in Datasets 1
and 2. They then collected emails sent by Jamie to his family. This is the third
dataset.
• Dataset 3: Emails received from Jamie
Dataset names
The police now have three datasets.
• Dataset 1: Emails received from Debbie before marriage
• Dataset 2: Emails received from Debbie after marriage
• Dataset 3: Emails received from Jamie
The investigators also used another dataset containing thousands of emails
written by different people.
• Dataset 4: Large collection of emails from many different senders
Question: Decide which datasets are questioned, known or reference.
Datasets
• Questioned (3,000 words)
• Known 1: Debbie (28,000 words)
• Known 2: Jamie (6,000 words)
• Reference (1 million words)
Datasets
If the language features in two datasets are similar, the same person may have
written them.
If the language features in two datasets are NOT similar, the same person may
NOT have written them.
So, we need to discover if:
The features in the questioned dataset are similar to a known corpus
The features in the questioned dataset are different to a known corpus
Question: How can we do this?
Answers
Deep learning
• This will probably work, but it is difficult to explain in court
Statistical analysis
• This may also work, but it is also difficult to explain in court
Habits of language (idiosyncratic language - 口癖)
• This works and is easy to explain.
Question: How can we systematically identify idiosyncratic language?
Is your language idiosyncratic?
• Thin: Thin, Slim, Slender, Skeletal, Skinny, Emaciated
• Fat: Fat, Plump, Tubby, Chubby, Podgy, Overweight, Obese
• Delicious: Delicious, Tasty, Yummy, Flavorsome, Delectable, Scrumptious
• I (in Japanese): 俺, 僕, 私, 儂, 自分, うち, あたし
Datasets
[Figure: the Questioned, Known 1 (Debbie), Known 2 (Jamie) and Reference datasets, each shown as a block of placeholder text]
Three common language features are used:
1. word frequency,
2. word frequency of words that are used more than expected, and
3. patterns following such words
Question: How do we identify word frequency?
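One possible answer is to count how often each word occurs in a dataset. The following is a minimal sketch in Python, not the method used by the investigators: the `tokenize` and `word_frequencies` helpers are hypothetical names, the tokenizer is deliberately simple (lowercase letters and apostrophes only), and the sample sentence is invented rather than taken from the case emails.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Return a Counter mapping each word to the number of times it occurs."""
    return Counter(tokenize(text))


# Invented sample text, used only to illustrate the counting step.
sample = "We stayed awhile in the city. The city was busy, so we left after awhile."
freqs = word_frequencies(sample)

# Show the most frequent words first.
for word, count in freqs.most_common(5):
    print(word, count)
```

Running the same function over each dataset separately gives one frequency list per dataset, which is the starting point for the next two features.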
Question: How do we identify frequency of words that are used more
than expected (keywords)?
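One common way to find words that are used more than expected is to compare a dataset against the reference dataset and score each word with a keyness statistic. The sketch below assumes the log-likelihood statistic, which is widely used in corpus linguistics; the slides do not say which statistic was used in the case. The function names and both sample texts are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, n1, n2):
    """Keyness of a word seen a times in n1 target tokens and b times in n2 reference tokens."""
    expected_target = n1 * (a + b) / (n1 + n2)
    expected_reference = n2 * (a + b) / (n1 + n2)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_target)
    if b > 0:
        total += b * math.log(b / expected_reference)
    return 2 * total


def keywords(target_text, reference_text, top_n=5):
    """Return the top_n (word, score) pairs that are overused in the target text."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    n1, n2 = sum(target.values()), sum(reference.values())
    if n1 == 0 or n2 == 0:
        return []
    scored = []
    for word, a in target.items():
        b = reference[word]
        # Keep only words that are relatively MORE frequent in the target.
        if a / n1 > b / n2:
            scored.append((word, log_likelihood(a, b, n1, n2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented sample texts, not taken from the case emails.
questioned_sample = "it took awhile to sort out the buisiness and awhile to rest"
reference_sample = (
    "it took a while to sort out the business and the paperwork "
    "and the rest of the work"
)
top = keywords(questioned_sample, reference_sample)
for word, score in top:
    print(f"{word}\t{score:.2f}")
```

With a real reference dataset of about a million words, common function words score low and unusual spellings or phrasings rise to the top of the list.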
Question: How do we identify patterns following keywords?
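One simple approach is to collect the words that come directly after each occurrence of a keyword and count the resulting sequences. The sketch below is an illustration under stated assumptions: `patterns_after` is a hypothetical helper, it handles single-word keywords only, it ignores sentence boundaries, and the sample text is invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def patterns_after(text, keyword, width=2):
    """Count the sequences of up to `width` words that directly follow `keyword`."""
    tokens = tokenize(text)
    patterns = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            following = tokens[i + 1 : i + 1 + width]
            if following:  # skip a keyword that is the very last token
                patterns[" ".join(following)] += 1
    return patterns


# Invented sample text, not taken from the case emails.
sample = "I was sat on the beach. She was sat on the bus. He was tired."
patterns = patterns_after(sample, "was")
for pattern, count in patterns.most_common():
    print(pattern, count)
```

Comparing these pattern counts across the Questioned and Known datasets shows whether the same habitual phrasing appears in both.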
Datasets
• Questioned
• Known 1: Debbie
• Known 2: Jamie
• Reference
Keyword counts:
• Awhile 3, Buisiness 2, was sat 1
• Awhile 2, Buisiness 4, was sat 1
Question: If these keywords were discovered, what would you conclude?
Guilty or Innocent?
Jamie Starbuck
Check the result online by searching for “Jamie Starbuck”
System Needed
Expert system
When preparing evidence for this case, the linguist had to:
• Analyze keywords in the Questioned dataset
• Analyze keywords in each Known dataset separately
• Create a table in Excel to compare the keywords
• Identify keywords that occur in both Questioned and Known datasets
An expert system is needed to streamline this process, and remove the chance
of human error.
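The comparison steps listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the expert system itself: `shared_keywords` is a hypothetical helper, the dataset names "Known A" and "Known B" are placeholders, and all counts are invented rather than taken from the case.

```python
def shared_keywords(questioned, known):
    """Return {keyword: (questioned_count, known_count)} for keywords found in both."""
    return {
        word: (questioned[word], known[word])
        for word in sorted(questioned.keys() & known.keys())
    }


# Invented keyword counts, standing in for the output of a keyword analysis.
questioned = {"awhile": 3, "holiday": 2, "lovely": 1}
known_sets = {
    "Known A": {"lovely": 4, "mum": 6},
    "Known B": {"awhile": 2, "holiday": 1, "football": 3},
}

# Compare the questioned keywords with each known dataset separately.
results = {name: shared_keywords(questioned, known) for name, known in known_sets.items()}
for name, shared in results.items():
    print(name)
    for word, (q_count, k_count) in shared.items():
        print(f"  {word}: questioned={q_count}, known={k_count}")
```

Automating this step replaces the manual table in Excel, so the overlap is computed the same way for every known dataset.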
