NEWORDER – Science in the Online Knowledge Order
Stefan Dietze, 13.10.2023
Motivation: science online discourse vs offline society & policies
▪ Science discourse online (NEWORDER focus: news & social media)
▪ Society, Media, Politics & Policies (Offline & Online)
▪ Diagram labels: Discourse, Interactions, Algorithms/AI
Motivation: scientific online discourse is on the rise
Example: Twitter / X
▪ Percentage of tweets containing links to scientific articles (journals, publishers, science blogs, etc.)
▪ Uses a list of > 17K science web domains (URLs)
▪ Data source: 1% sample of Twitter (https://data.gesis.org/tweetskb/), > 14 bn tweets archived since 2013
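The measurement described on this slide (share of tweets that link to a known science domain) can be sketched as follows. This is a minimal illustration, not the project's pipeline: the three-entry `SCIENCE_DOMAINS` set stands in for the > 17K domain list, and the tweet record layout (`{"urls": [...]}`) is an assumption.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the > 17K science web domains mentioned on the slide.
SCIENCE_DOMAINS = {"nature.com", "arxiv.org", "sciencemag.org"}


def host_of(url):
    """Return the lowercased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_science_url(url, domains=SCIENCE_DOMAINS):
    """True if the host equals a listed domain or is a subdomain of one."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in domains)


def science_share(tweets, domains=SCIENCE_DOMAINS):
    """Percentage of tweets with at least one URL on a science domain."""
    if not tweets:
        return 0.0
    hits = sum(any(is_science_url(u, domains) for u in t["urls"]) for t in tweets)
    return 100.0 * hits / len(tweets)


# Assumed record layout: each tweet carries its already-expanded URLs.
sample = [
    {"urls": ["https://www.nature.com/articles/x"]},
    {"urls": ["https://example.com/news"]},
    {"urls": []},
    {"urls": ["https://export.arxiv.org/abs/1234.5678"]},
]
share = science_share(sample)
```

Matching on the host suffix rather than on a substring avoids counting look-alike hosts such as `notnature.com` as science links.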
NEWORDER project
Interdisciplinary approach, team and objectives
▪ Perception of roles, sources, and authority; impact on trustworthiness assessment (Cognitive Psychology)
▪ Dissolution of phases, hierarchies and contexts in the scientific process (Social Sciences, Media & Communication Studies)
▪ Computational methods for collecting, detecting and classifying scientific online discourse (Computer Science/AI & Computational Linguistics)
Cress, Utz (IWM & Uni Tübingen)
Marcinkowski, Koss (HHU)
Dietze, Boland, Jabeen (GESIS), Kallmeyer (HHU)
How can „scientific discourse“ be defined?
Example: Twitter / X
[Figure: example tweets labelled science claim, science reference, science relevance, or no science]
Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse, CIKM 2022
Training AI to detect science discourse: SciTweets dataset & classifier
▪ Manual annotation of a ground-truth dataset for testing models (heuristics-based sampling, annotation framework, > 1K annotated tweets)
▪ Training AI models to detect science discourse in large-scale discourse data (e.g. Web archives)
▪ Reasonable classification performance using a fine-tuned language model (SciBERT) applied to TweetsKB data
https://github.com/AI-4-Sci/SciTweets
What is science discourse and how does it evolve?
Increasing number and proportion of non-peer-reviewed scientific works
[Figure panels: absolute number of tweets sharing preprints; proportion of preprints among shared science URLs]
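The two quantities plotted on this slide could be derived per year roughly as below. The slides do not say how preprints were identified, so the `PREPRINT_HOSTS` set and the `(year, url)` input layout are assumptions made only for this sketch.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumed preprint hosts; the deck does not state how preprints were detected.
PREPRINT_HOSTS = {"arxiv.org", "biorxiv.org", "medrxiv.org"}


def preprint_stats(shares):
    """Map year -> (preprint count, proportion among science URLs shared that year)."""
    totals = defaultdict(int)
    preprints = defaultdict(int)
    for year, url in shares:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        totals[year] += 1
        if host in PREPRINT_HOSTS:
            preprints[year] += 1
    return {y: (preprints[y], preprints[y] / totals[y]) for y in sorted(totals)}


# Each entry is one shared science URL with the year of the tweet (made-up data).
shares = [
    (2019, "https://www.nature.com/a"),
    (2019, "https://arxiv.org/abs/1"),
    (2020, "https://www.biorxiv.org/content/1"),
    (2020, "https://medrxiv.org/content/2"),
    (2020, "https://www.nature.com/b"),
    (2020, "https://arxiv.org/abs/2"),
]
stats = preprint_stats(shares)
```

Reporting the count and the proportion together matters because the absolute number can rise simply with overall tweet volume, while the proportion shows a shift in what is being shared.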
How is public attention distributed?
Power law distribution
• 10% of studies receive > 75% of all Twitter mentions
• Long tail of studies with few mentions
• Data source: 1.67 M tweets mentioning at least one of the primary science studies in the „Altmetrics“ corpus
[Figure: share of Twitter mentions (%) by top x (%) of mentioned science studies]
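The concentration figure quoted above (top x% of studies versus their share of mentions) is a cumulative-share calculation. A minimal sketch, using made-up mention counts rather than the corpus data:

```python
def top_share(mention_counts, top_fraction):
    """Share (%) of all mentions received by the top `top_fraction` of studies."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    counts = sorted(mention_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one study, even for very small fractions.
    k = max(1, round(top_fraction * len(counts)))
    return 100.0 * sum(counts[:k]) / total


# Illustrative counts for ten studies (not taken from the slides).
counts = [800, 50, 40, 30, 25, 20, 15, 10, 5, 5]
share = top_share(counts, 0.10)
```

Evaluating `top_share` over a range of fractions gives the points of the curve described by the figure's axes.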
Challenge: online science discourse is not well-informed
Links to actual scientific studies/context missing in news & social media
▪ NLP models able to predict missing primary science reference (e.g. DOI or journal paper link) for given informal reference (e.g. “Heinsberg Studie”) or secondary reference (news article)
▪ Supervised & unsupervised approaches using DL language models
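To make the reference-prediction task concrete, the sketch below ranks candidate primary studies for an informal mention by plain bag-of-words cosine similarity. This is a deliberately simple lexical baseline swapped in for illustration, not the DL language models named on the slide; the identifiers and titles are made up.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(mention, candidates):
    """Return candidate identifiers ordered by similarity to the mention."""
    scored = [(cosine(mention, title), ident) for ident, title in candidates]
    return [ident for _, ident in sorted(scored, key=lambda s: -s[0])]


# Hypothetical candidates: (identifier, title) pairs, not real publications.
candidates = [
    ("paper-B", "Deep learning for image segmentation"),
    ("paper-A", "Infection fatality rate of a virus in a German community"),
]
mention = "new study on infection fatality rate in German town"
ranking = rank_candidates(mention, candidates)
```

A lexical baseline like this fails when the mention shares no words with the title, which is the usual case for nicknames of studies; that gap is what learned language-model representations are meant to close.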
Science discourse is „different“
[Figure: examples from http://snopes.com of a non-science claim and a science claim]
Computational (AI) challenge
NLP methods (e.g. for fact-checking) perform worse on science discourse
▪ Take-away: AI-based methods geared towards scientific discourse required
[Figure: performance of state-of-the-art AI/deep learning using standard benchmark datasets; panels: claim check-worthiness detection, fake news detection, claim verification]
Wrap-Up & Outlook: Interdisciplinary Work Plan
▪ WP1 Data collection & study preparation
▪ WP2 Dissolution of phases & contexts
▪ WP3 Perception of roles, credibility & trust
▪ WP4 NLP for classifying sources & roles
▪ WP5 Longitudinal online discourse analysis
Disciplines involved:
▪ Media & Communication Studies (spreading pattern & societal impact)
▪ Cognitive & Social Psychology (effects on individuals)
▪ Computer & Information Science (understanding online discourse)
http://gesis.org/en/kts
