TRUSTWORTHY AI AND OPEN SCIENCE
Beth Plale
Michael A. and Laurie Burns McRobbie Professor of Computer Engineering
Beilstein Open Science Symposium
October 06, 2021
Luddy School of Informatics, Computing, and Engineering
Data To Insight Center
Observations influenced by my role (2017-2020) in the
National Science Foundation working on agency policies
and practice in open science. Views expressed are
entirely my own.
Funding agency perspective on open science: how do
we bring visibility to the products of research (that we
fund)?
NSF funds the collection and capture
of research data through projects
ranging from a few hundred thousand
dollars to tens of millions of dollars.
The data are maintained in a
landscape of solutions to meet the
needs of researchers.
RESEARCH DATA LANDSCAPE
Specialist repositories
- Organizational resources
Generalist repositories
- Organizational resources
Data Portals
- Low velocity data
- Employs cloud resources
- Employs data-compute proximity for analysis
Observation networks
- High velocity data
- Employs cloud resources
Exemplar systems: SAGE, NEON, ARM, HPWREN, UWI, LTER, OOI, HydroShare, MGDS, IRIS, ICPSR, QDR, TAIR, MDF, IEDA, PDB, CCDC, DataVerse, Figshare, Dryad, Zenodo, IRs
Figure labels: data timeliness need; researcher depth of expertise; expectation for level of curation; expectation of data longevity
Publisher’s view of landscape (general public view as well)
Optimization for timeliness of research could suggest lower value over time
Generalist-Aided
- Deposit: engages generalist curators
- Metadata: generalist schema
- Reuse potential: moderate-low as metadata is curated but general
- Scope: discipline agnostic
- Discovery: broad name recognition
Specialist-DBMS
- Deposit: difficult so DB often read-only
- Metadata: data dictionary + DB schema
- Reuse potential: high as self contained
- Scope: subdiscipline
- Discovery: known within subdiscipline
Specialist-Aided
- Deposit: engages specialist curators
- Metadata: specialized schema
- Reuse potential: high due to specialists
- Scope: discipline
- Discovery: known within discipline
Specialist-Unaided
- Deposit: unaided deposit
- Metadata: specialized schema
- Reuse potential: moderate-high from discipline focus of metadata schema
- Scope: discipline
- Discovery: known within discipline
Generalist-Unaided
- Deposit: unaided deposit
- Metadata: generalist schema
- Reuse potential: low as metadata is minimal
- Scope: discipline agnostic
- Discovery: broad name recognition
i.e., institutional repositories
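The five repository types on this slide can be written down as a small lookup structure. The following is only a sketch of one possible encoding: the class name `RepositoryType`, its field names, and the helper `types_with_reuse` are illustrative assumptions, while the attribute values are copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryType:
    """One repository type from the slide (field names are assumptions)."""
    name: str
    deposit: str
    metadata: str
    reuse_potential: str
    scope: str
    discovery: str


REPOSITORY_TYPES = [
    RepositoryType("Generalist-Aided", "engages generalist curators",
                   "generalist schema", "moderate-low",
                   "discipline agnostic", "broad name recognition"),
    RepositoryType("Specialist-DBMS", "difficult so DB often read-only",
                   "data dictionary + DB schema", "high",
                   "subdiscipline", "known within subdiscipline"),
    RepositoryType("Specialist-Aided", "engages specialist curators",
                   "specialized schema", "high",
                   "discipline", "known within discipline"),
    RepositoryType("Specialist-Unaided", "unaided deposit",
                   "specialized schema", "moderate-high",
                   "discipline", "known within discipline"),
    RepositoryType("Generalist-Unaided", "unaided deposit",
                   "generalist schema", "low",
                   "discipline agnostic", "broad name recognition"),
]


def types_with_reuse(levels):
    """Return the names of repository types whose reuse potential is in `levels`."""
    return [t.name for t in REPOSITORY_TYPES if t.reuse_potential in levels]


# Which types does the slide rate as high or moderate-high for reuse?
print(types_with_reuse({"high", "moderate-high"}))
```

Filtering on the reuse rating returns only the three specialist types, which matches the later summary point that specialist repositories have higher reusability.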
FEDERAL RESEARCH DATA SUMMARY
• Observation networks and data portals are a fixed part of the
landscape. They have a different role in open science than do
repositories
• Generalist repositories are easier to use than specialist
repositories
• Specialist repositories have higher reusability
• Generalist repositories have economies of scale
• If specialist repositories can leverage generalist repositories as
back ends, it would reduce overall cost
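The last point, a specialist repository using a generalist repository as its back end, can be illustrated with a small sketch. Everything here is hypothetical: the class names `InMemoryGeneralist` and `SpecialistRepository`, the `store`/`fetch`/`deposit` methods, and the metadata fields are assumptions made for illustration and do not correspond to the API of any real repository.

```python
from typing import Dict, Set, Tuple


class InMemoryGeneralist:
    """Stand-in for a generalist repository: stores any payload with generic metadata."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def store(self, payload: bytes, metadata: Dict[str, str]) -> str:
        # Assign a simple sequential identifier (a real system would mint a PID or DOI).
        identifier = f"item-{len(self._items) + 1}"
        self._items[identifier] = (payload, dict(metadata))
        return identifier

    def fetch(self, identifier: str) -> Tuple[bytes, Dict[str, str]]:
        return self._items[identifier]


class SpecialistRepository:
    """Enforces discipline-specific metadata, then delegates storage to the back end."""

    def __init__(self, backend: InMemoryGeneralist, required_fields: Set[str]) -> None:
        self.backend = backend
        self.required_fields = required_fields

    def deposit(self, payload: bytes, metadata: Dict[str, str]) -> str:
        # The specialist layer adds value through stricter metadata checks.
        missing = sorted(f for f in self.required_fields if f not in metadata)
        if missing:
            raise ValueError(f"missing specialist metadata: {missing}")
        return self.backend.store(payload, metadata)


backend = InMemoryGeneralist()
repo = SpecialistRepository(backend, required_fields={"species", "assay"})
identifier = repo.deposit(
    b"ACGT",
    {"title": "sample", "species": "A. thaliana", "assay": "RNA-seq"},
)
print(identifier)  # item-1
```

In this arrangement the storage layer is shared, which is where the economies of scale would come from, while the curation rules that drive reusability stay in the specialist layer.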
OPEN SCIENCE ROLE IN AI TRUSTWORTHINESS
“ON ARTIFICIAL INTELLIGENCE, TRUST IS A MUST, NOT A NICE-TO-HAVE”
Margrethe Vestager, the European Commission executive vice president
who oversees digital policy for the 27-nation bloc
TRUST ↔ TRUSTWORTHINESS
TRUST
• An individual’s confidence in an
entity
• “I trust this web site”
TRUSTWORTHINESS
• An entity’s state of being
trustworthy or reliable
• An estimate of an object’s
worthiness to receive someone’s
trust
• Trustworthiness is difficult to
accurately quantify
INDIANA UNIVERSITY BLOOMINGTON
Broad Categories of AI
AI: Human-Machine Interaction
• Fitness smartwatch, smart hearing aids
• Co-bots, cyber-crews, digital twins
• Integration of smart machines into human body in form of computer-brain interfaces or cyborgs
AI: Autonomous and Semi-Autonomous Actors
• Weapon systems
• Robots in deep sea and space exploration
• Self-driving cars
• Bots in financial trade
AI: Big Data / Big Compute
• Deep learning / Machine Learning / Natural Language Processing
• Medical diagnosis, image recognition
Category with most urgency in issues of artificial moral agency
Research needed in policy and technical extensions that lead to greater and more measurable forms of accountability
INTERVENTION POINTS: ENHANCED TRUSTWORTHINESS
• Developer ethics, development process norms
• Societal influence: public pressure, legislation, regulatory oversight
• Technological manifestation: verifiable claims, explainability, accountability
AI algorithmic knowledge exhibiting higher levels of trustworthiness
Trustworthy AI is AI that is designed, developed, and used in a
manner that is lawful, fair, unbiased, accurate, reliable,
effective, safe, secure, resilient, understandable, and with
processes in place to regularly monitor and evaluate the AI
system’s performance and outcomes
Lynne Parker, Deputy US Chief Technology Officer and Director of the National Artificial Intelligence Initiative Office
ML PROCESS
M. Veale et al., CHI 2018
[Diagram components: data, training data, test data, feature extraction, learning algorithm, trained model, predict, new data, explainability inquiries, dev ops]
RESEARCH PRODUCTS
M. Veale et al., CHI 2018
[Same diagram as ML PROCESS]
Open science contributes to trustworthy
AI (trusted products)
The research products of AI need to
include intermediate results and
explainability services
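As a rough illustration of what publishing intermediate results and an explainability service alongside a trained model might look like, the sketch below runs a toy version of the ML process and records each stage in a manifest. All of it is assumed for illustration: the data, the mean-of-readings feature, the midpoint-threshold "learning algorithm", and the names `digest`, `extract_feature`, `train`, `predict`, `explain`, and `manifest` are hypothetical and are not taken from the talk or from Veale et al.

```python
import hashlib
import json


def digest(obj) -> str:
    """Stable SHA-256 fingerprint of a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def extract_feature(record) -> float:
    # Hypothetical feature extraction: the mean of the raw readings.
    return sum(record["readings"]) / len(record["readings"])


def train(features, labels) -> dict:
    # Toy learning algorithm: threshold midway between the two class means.
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == 0]
    return {"threshold": (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2}


def predict(model, feature) -> int:
    return 1 if feature > model["threshold"] else 0


def explain(model, feature) -> str:
    # Minimal "explainability service": report why a prediction came out as it did.
    relation = "above" if feature > model["threshold"] else "at or below"
    return f"feature {feature:.2f} is {relation} threshold {model['threshold']:.2f}"


training_data = [
    {"readings": [1.0, 2.0], "label": 0},
    {"readings": [2.0, 2.0], "label": 0},
    {"readings": [6.0, 8.0], "label": 1},
    {"readings": [8.0, 8.0], "label": 1},
]
test_data = [
    {"readings": [1.0, 1.0], "label": 0},
    {"readings": [9.0, 7.0], "label": 1},
]

train_features = [extract_feature(r) for r in training_data]
model = train(train_features, [r["label"] for r in training_data])
test_features = [extract_feature(r) for r in test_data]
correct = sum(predict(model, f) == r["label"] for f, r in zip(test_features, test_data))

# The manifest lists intermediate products, not only the final trained model.
manifest = {
    "training_data_sha256": digest(training_data),
    "test_data_sha256": digest(test_data),
    "train_features": train_features,
    "model": model,
    "test_accuracy": correct / len(test_data),
}
print(json.dumps(manifest, indent=2))
print(explain(model, test_features[1]))
```

Recording fingerprints of the training and test data, the extracted features, and the evaluation result next to the model gives a reader something to check claims against, which is the sense in which open research products could support trustworthiness.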
BETH PLALE
INDIANA UNIVERSITY
PLALE@INDIANA.EDU