Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Online Fake News
Peter Broadwell (@PeterBroadwell)
Digital Library Program
University of California, Los Angeles
IFLA International News Media Conference
Reykjavík, Iceland, 27-28 April 2017
Peter Broadwell (@PeterBroadwell) - UCLA Digital Library
Building Archives to Facilitate Narrative Analyses of Fake News
IFLA International News Media Conference, 27 April 2017
Fake news is…
[1] Jacob L. Nelson. 2017. “Is ‘fake news’ a fake problem?” Columbia Journalism Review, January 31, 2017
• Difficult to define and identify
• A badly overused term – especially these days
• Prominent on social media:
30% of traffic to fake news sites comes from Facebook,
compared with only 8% for legitimate news sites
• Financed by online advertising systems (Google),
and maybe some governments
• Now seen as a serious issue by journalists and
some Internet companies (finally)
• Not as dire a problem as some believe? [1]
• Worthy of further study
Gregor Aisch, Jon Huang, Cecilia Kang, “Dissecting the #PizzaGate Conspiracy Theories,” New York Times, December 10, 2016
Analyzing the “shape” of conspiracy stories
TR Tangherlini, V Roychowdhury, B Glenn, CM Crespi, R Bandari, A Wadia, M Falahi, E Ebrahimzadeh, R Bastani. 2016. “‘Mommy Blogs’ and the Vaccination Exemption Narrative: Results From A Machine-Learning Approach for Story Aggregation on Parenting Social Media Sites.” JMIR Public Health and Surveillance 2:2.
Don’t believe everything you read online…
http://newsroom.ucla.edu/releases/ucla-researchers-teach-computer-to-read-the-internet
TR Tangherlini, V Roychowdhury, B Glenn, CM Crespi, R Bandari, A Wadia, M Falahi, E Ebrahimzadeh, R Bastani. 2016. “‘Mommy Blogs’ and the Vaccination Exemption Narrative: Results From A Machine-Learning Approach for Story Aggregation on Parenting Social Media Sites.” JMIR Public Health and Surveillance 2:2.
Narrative framework analysis
Building our own “fake news” web archive
Or… using someone else’s archive?
Small-scale case studies: Pizzagate and Bridgegate
Pre-selected “archives” of coverage
Web archives: too much data,
not enough information
In WANE files (3,376 unique names):
  Christie_(P): 376
  Tony_(P): 241
  Chris_Christie_(P): 236
  David_Wildstein_(P): 190
  Bridget_Anne_Kelly_(P): 169
Extracted by DBpedia Spotlight (954 unique Wikipedia entities):
  Chris_Christie_(P): 435
  David_Wildstein_(P): 263
  Bridget_Anne_Kelly_(P): 222
  Bill_Baroni_(P): 194
  Mark_Sokolich_(P): 130
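The raw name list appears to split one person across several surface forms (for example `Christie_(P)` and `Chris_Christie_(P)`). A minimal sketch of folding such counts together, assuming a hand-built alias table: the `ALIASES` mapping below is hypothetical and simply treats every bare "Christie" as Chris Christie, which a real disambiguation step would have to verify.

```python
from collections import Counter

# Top raw name counts as listed above for the WANE files.
wane_counts = Counter({
    "Christie_(P)": 376,
    "Tony_(P)": 241,
    "Chris_Christie_(P)": 236,
    "David_Wildstein_(P)": 190,
    "Bridget_Anne_Kelly_(P)": 169,
})

# Hypothetical alias table (an assumption for illustration, not from the talk).
ALIASES = {"Christie_(P)": "Chris_Christie_(P)"}


def merge_aliases(counts, aliases):
    """Sum the counts of surface forms that map to the same canonical name."""
    merged = Counter()
    for name, n in counts.items():
        merged[aliases.get(name, name)] += n
    return merged


merged = merge_aliases(wane_counts, ALIASES)
for name, n in merged.most_common():
    print(f"{name}: {n}")
```

Names without an alias entry, such as `Tony_(P)`, pass through unchanged, so ambiguous forms stay visible rather than being silently assigned.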
• 2 seed URLs, 1 from Huffington Post, 1 from NJ Record
• Full Archive-It WARCs contain 124,355 unique linked URLs
• But 47,398 are adverts/spam, 73,544 likely news articles
• Only 415 news articles (163 from HuffPo, 252 from the Record)
are relevant to Bridgegate (about 0.5%)
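The proportions can be checked directly from the counts on this slide. A small sketch: the variable names are illustrative, and the slide does not say which denominator its percentage uses, so both candidates are computed.

```python
# Counts reported on the slide for the Bridgegate Archive-It crawl.
unique_urls = 124_355
spam_or_ads = 47_398
likely_news = 73_544
relevant = {"Huffington Post": 163, "NJ Record": 252}

relevant_total = sum(relevant.values())


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


print(f"relevant articles: {relevant_total}")
print(f"share of likely news articles: {share(relevant_total, likely_news):.2f}%")
print(f"share of all unique URLs: {share(relevant_total, unique_urls):.2f}%")
print(f"URLs in neither category: {unique_urls - spam_or_ads - likely_news}")
```

Measured against the likely news articles the relevant share is about 0.56%; measured against all unique URLs it is about 0.33%.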
Returning to Pizzagate…
Gregor Aisch, Jon Huang, Cecilia Kang, “Dissecting the #PizzaGate Conspiracy Theories,” New York Times, December 10, 2016
Can we generate this computationally?
A simple Pizzagate network model
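One common way to get a first network of this kind is to link entities that are mentioned in the same article. The sketch below assumes that approach (the slides do not say how the model was built) and uses invented placeholder documents and entity names rather than any Pizzagate data.

```python
from collections import Counter
from itertools import combinations

# Placeholder input: each document reduced to the set of entities it mentions.
docs = [
    {"Person A", "Place B", "Organization C"},
    {"Person A", "Place B"},
    {"Place B", "Person D"},
]


def cooccurrence_edges(documents):
    """Count how many documents mention each unordered pair of entities."""
    edges = Counter()
    for entities in documents:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(entities), 2):
            edges[pair] += 1
    return edges


edges = cooccurrence_edges(docs)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
```

Edge weights here are document counts, so a pair that co-occurs often gets a heavier edge than a pair seen once.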
Towards a useful Pizzagate network model
Bridgegate network, highlighting “degree”
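Degree is the number of distinct neighbours a node has. A minimal sketch of computing it from an edge list, using a toy graph with placeholder labels (not the Bridgegate data) and making no assumption about the tool used for the slide's figure:

```python
from collections import defaultdict

# Toy undirected edge list; node labels are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def degree(edge_list):
    """Return the number of distinct neighbours of each node."""
    neighbours = defaultdict(set)
    for u, v in edge_list:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {node: len(adj) for node, adj in neighbours.items()}


deg = degree(edges)
# List nodes from highest to lowest degree, ties broken by label.
for node, d in sorted(deg.items(), key=lambda item: (-item[1], item[0])):
    print(node, d)
```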
Bridgegate network, showing “betweenness”
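Betweenness measures how often a node lies on shortest paths between other nodes, so it picks out brokers rather than merely well-connected actors. The sketch below implements Brandes' algorithm for an unweighted, undirected graph; the toy graph and its labels are placeholders, and nothing here is taken from the tool behind the slide's figure.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    Returns unnormalised scores: the number of shortest paths between other
    node pairs that pass through each node (fractional when paths tie).
    """
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        preds = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in score.items()}


# Toy graph: C is the only link between {A, B} and {D, E}.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = betweenness(graph)
for node in sorted(scores, key=lambda n: (-scores[n], n)):
    print(node, scores[node])
```

In this toy graph A and B have the same degree as D but zero betweenness, which is the kind of contrast a betweenness view is meant to expose.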
Archiving fake news for research…
• Is difficult to do well
• Requires constant monitoring and checking of
targeted sites, given the limits of existing tools
• Would benefit from coordination between
institutions to distribute web-crawling tasks
• Entails a great deal of manual and semi-automated
content classification and filtering after collection
• Calls for the use of more sophisticated tools for
named entity resolution, disambiguation and
versioning of archival content
• Is potentially very worthwhile, despite these issues
Thanks!
• Prof. Tim Tangherlini, UCLA Scandinavian Section
• Prof. Vwani Roychowdhury, UCLA Electrical Engineering
• Ph.D. students, UCLA Electrical Engineering:
• Ehsan Ebrahimzadeh
• Behnam Shahbazi
• Misagh Falahi
• Mark Graham, Internet Archive
• Karl Blumenthal, Archive-It
