Brief Introduction to Provenance
“As data becomes plentiful, verifiable truth becomes scarce”
http://go-to-hellman.blogspot.com/2010/02/named-graphs-argleton-and-truth-economy.html
For JISC KeepIt course on Digital Preservation Tools for Repository Managers
Module 3, Primer on preservation workflow, formats and characterisation
Westminster-Kingsway College, London, 2 March 2010
Provenance: example
The following excerpt and slides are taken with permission from Moreau, L.
The Open Provenance Model: Towards Inter-operability of Provenance
Systems http://users.ecs.soton.ac.uk/lavm/talks/iam09.pdf
Example: the provenance of a bottle of wine includes:
• Grapes from which it is made
• Where those grapes grew
• Processes in the wine’s preparation
• How the wine was stored
• Between which parties the wine was transported,
e.g. producer to distributor to retailer
• Where it was auctioned
Provenance Definition
• Oxford English Dictionary:
– the fact of coming from some particular source or quarter;
origin, derivation
– the history or pedigree of a work of art, manuscript, rare
book, etc.;
– concretely, a record of the passage of an item through its
various owners.
• The provenance of a piece of data is the
process that led to that piece of data
The Science Lifecycle
[Figure: the science lifecycle. Elements shown: scientists; experimentation; data, metadata, provenance, scripts, workflows, services, ontologies, blogs, ...; certified experimental results and analyses; preprints and metadata; peer-reviewed journal and conference papers; technical reports; reprints; local web repositories; digital libraries; virtual learning environment; undergraduate students; graduate students; next generation researchers]
Adapted from David De Roure’s slides
Finding the provenance of research outputs across all the systems
the data transited through
Open Provenance Model (OPM)
• Allows us to express all the causes of an item
• Allows for process-oriented and dataflow-oriented views
• Based on the notion of an annotated causality graph
Moreau, L., et al. OPM v1.00 (Dec 2007), OPM v1.01
(Jul 2008), OPM v1.1 (Dec 2009)
OPM Requirements
• To allow provenance information to be
exchanged between systems, by means of a
compatibility layer based on a shared provenance
model.
• To allow developers to build and share tools that
operate on such a provenance model.
• To define the model in a precise, technology-agnostic
manner.
• To define bindings to XML/RDF separately
• To support a digital representation of provenance
for any “thing”, whether produced by computer
systems or not
OPM Serialisation
• OPM is an abstract data model for representing past
executions and what caused data and processes to occur
• OPM can be serialised in different formats, referred to
as “technology bindings” or serialisations
• OPM XML schema
(http://openprovenance.org/model/v1.01.a)
• OPM RDF schema
• OPM OWL ontology
• Effort underway to ensure full equivalence of
representations
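To illustrate the difference between an abstract model and its technology bindings, here is a minimal Python sketch that holds one small graph in memory and writes it out in two formats. The key and element names (graph, node, edge, effect, cause) are invented for this illustration; they are not the OPM XML schema or any other official binding.

```python
import json
import xml.etree.ElementTree as ET

# One abstract graph, held independently of any file format.
# All names here are hypothetical, chosen only for this sketch.
graph = {
    "nodes": [
        {"id": "a1", "kind": "artifact"},
        {"id": "p1", "kind": "process"},
    ],
    "edges": [
        {"label": "used", "effect": "p1", "cause": "a1", "role": "input"},
    ],
}


def to_json(g):
    """Serialise the graph as JSON text."""
    return json.dumps(g, indent=2, sort_keys=True)


def to_xml(g):
    """Serialise the same graph as XML text, using ad hoc element names."""
    root = ET.Element("graph")
    for node in g["nodes"]:
        ET.SubElement(root, "node", id=node["id"], kind=node["kind"])
    for edge in g["edges"]:
        ET.SubElement(root, "edge", attrib=edge)
    return ET.tostring(root, encoding="unicode")


print(to_json(graph))
print(to_xml(graph))
```

Both outputs carry the same nodes and edges; only the binding differs, which is why equivalence between representations has to be checked.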
Nodes
• Artifact: Immutable piece of state, which
may have a physical embodiment in a
physical object, or a digital
representation in a computer system.
• Process: Action or series of actions
performed on or caused by artifacts, and
resulting in new artifacts.
• Agent: Contextual entity acting as a
catalyst of a process, enabling,
facilitating, controlling, affecting its
execution.
[Figure: node symbols for an artifact (A), a process (P) and an agent (Ag)]
Edges
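• used(R): process P used artifact A in role R
• wasGeneratedBy(R): artifact A was generated by process P in role R
• wasControlledBy(R): process P was controlled by agent Ag in role R
• wasTriggeredBy: process P2 was triggered by process P1
• wasDerivedFrom: artifact A2 was derived from artifact A1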
Edge labels are in the past tense to express that they describe past executions
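The node and edge kinds above can be sketched as a small data structure. This is a minimal illustration in Python, not an official OPM API: the class names Node and Edge, the EDGE_SIGNATURES table and the field names are assumptions made for the example. Each edge is stored from effect to cause, so a "used" edge runs from the process back to the artifact it used.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed (effect kind, cause kind) pair for each edge label.
# The table layout is an assumption of this sketch.
EDGE_SIGNATURES = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}


@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # "artifact", "process" or "agent"


@dataclass(frozen=True)
class Edge:
    label: str
    effect: Node
    cause: Node
    role: Optional[str] = None  # the (R) annotation, where one applies

    def __post_init__(self):
        # Reject labels and node-kind combinations outside the table.
        expected = EDGE_SIGNATURES.get(self.label)
        if expected is None:
            raise ValueError(f"unknown edge label: {self.label!r}")
        if (self.effect.kind, self.cause.kind) != expected:
            raise ValueError(
                f"{self.label} must link {expected[0]} to {expected[1]}"
            )


a = Node("A", "artifact")
p = Node("P", "process")
ag = Node("Ag", "agent")

edges = [
    Edge("used", p, a, role="R"),
    Edge("wasGeneratedBy", a, p, role="R"),
    Edge("wasControlledBy", p, ag, role="R"),
]
```

Validating the node kinds at construction time means a malformed edge, such as an artifact that "used" a process, raises an error immediately rather than producing a misleading graph.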
Illustration
• A process “used” artifacts and
“generated” artifacts
• Edge “roles” indicate the
function of the artifact with
respect to the process (akin
to function parameters)
• Edges and nodes can be
typed
Causation chain:
• P was caused by A1 and A2
• A3 and A4 were caused by P
• Does it mean that A3 and A4
were caused by A1 and A2?
[Figure: process P (type=division) with used(dividend) and used(divisor) edges to artifacts A1 and A2, and wasGeneratedBy(quotient) and wasGeneratedBy(rest) edges from artifacts A3 and A4]
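The division illustration can be written down as a list of edges and queried. The sketch below is a hypothetical Python encoding: the pairing of roles with particular artifacts (A1 as dividend, A3 as quotient, and so on) is an assumption, and candidate_sources is a name made up for this example. It lists the artifacts that the generating process used, which makes them candidates for having influenced the output. It does not settle the question on the slide: as far as we understand the OPM specification, used and wasGeneratedBy edges alone do not establish that one artifact was derived from another, and an explicit wasDerivedFrom edge is needed to assert that.

```python
# Edges point from effect to cause: (effect, label, role, cause).
# Which artifact plays which role is an assumption made for this sketch.
EDGES = [
    ("P", "used", "dividend", "A1"),
    ("P", "used", "divisor", "A2"),
    ("A3", "wasGeneratedBy", "quotient", "P"),
    ("A4", "wasGeneratedBy", "rest", "P"),
]


def candidate_sources(artifact, edges=EDGES):
    """Return the artifacts used by the process that generated `artifact`."""
    # Step 1: find the process (or processes) that generated the artifact.
    processes = {
        cause
        for effect, label, _role, cause in edges
        if effect == artifact and label == "wasGeneratedBy"
    }
    # Step 2: collect every artifact those processes used.
    return {
        cause
        for effect, label, _role, cause in edges
        if effect in processes and label == "used"
    }


print(sorted(candidate_sources("A3")))  # ['A1', 'A2']
```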
Time Constraints
[Figure: process P starts at T2 and ends at T5. P used(R) an artifact at T3; that artifact was generated, wasGeneratedBy(R), at T1. Another artifact wasGeneratedBy(R) P at T4 and is used, used(R), at T6. P wasControlledBy(R) agent Ag.]
T1<T3 (artifact must exist before being used)
T2<T3 (process must have started before using artifacts)
T3<T5 (process uses artifacts before it ends)
T2<T4 (process must have started before generating artifacts)
T4<T5 (process generates artifacts before it ends)
T4<T6 (artifact must exist before being used)
T2<T5 (process must have started before ending)
No constraint between T3 and T4
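The constraints above can be checked mechanically. The following Python sketch assumes the timestamps are supplied as a dictionary keyed by the labels T1 to T6; the function name check_time_constraints is made up for this example.

```python
# Each entry is (earlier, later, reason), taken from the list above.
CONSTRAINTS = [
    ("T1", "T3", "artifact must exist before being used"),
    ("T2", "T3", "process must have started before using artifacts"),
    ("T3", "T5", "process uses artifacts before it ends"),
    ("T2", "T4", "process must have started before generating artifacts"),
    ("T4", "T5", "process generates artifacts before it ends"),
    ("T4", "T6", "artifact must exist before being used"),
    ("T2", "T5", "process must have started before ending"),
]


def check_time_constraints(times):
    """Return a description of every constraint that `times` violates."""
    return [
        f"{earlier}<{later} ({reason})"
        for earlier, later, reason in CONSTRAINTS
        if not times[earlier] < times[later]
    ]


# A consistent record: no violations.
ok = {"T1": 1, "T2": 2, "T3": 3, "T4": 4, "T5": 5, "T6": 6}
print(check_time_constraints(ok))

# T3 and T4 may occur in either order, so swapping them is still valid.
swapped = dict(ok, T3=4, T4=3)
print(check_time_constraints(swapped))

# An artifact recorded as generated after it was used is flagged.
bad = dict(ok, T1=10)
print(check_time_constraints(bad))
```

Because there is no constraint between T3 and T4, a process may generate an output before or after it reads a given input, and the check accepts both orders.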
Dublin Core Profile (draft)
• To many people, provenance is primarily
about attribution, citation, bibliographic
information
• DC provides terms to relate resources to such
information
• DC profile aims to map the use of Dublin Core terms to
OPM concepts and graph patterns
with Simon Miles and Joe Futrelle
DC to OPM example: dc:publisher
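[Figure: graph pattern for dc:publisher. Elements shown: artifacts A1 and A2, one with state=unpublished and one with state=published, linked by wasSameResourceAs; process P (publish), with a wasGeneratedBy edge; agent Ag (person, name=Luc), with a wasActionOf edge]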
What have we learned about
provenance?
• Provenance describes and records the results of
processes on objects over time
• OPM is an abstract model; provenance expressed in it can be represented as XML
• OPM can also be serialised in other formats, e.g. RDF for the Semantic Web
• OPM is a work in progress
By working with an open standard model that can
pass information as XML and in other standard serialisation
formats (e.g. RDF), it should be possible to build
provenance services into repository environments
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 

KeepIt Course 3: Provenance (and OPM), based on slides by Luc Moreau

  • 1. Brief Introduction to Provenance
      "As data becomes plentiful, verifiable truth becomes scarce"
      http://go-to-hellman.blogspot.com/2010/02/named-graphs-argleton-and-truth-economy.html
      For JISC KeepIt course on Digital Preservation Tools for Repository Managers
      Module 3, Primer on preservation workflow, formats and characterisation
      Westminster-Kingsway College, London, 2 March 2010
  • 2. Provenance: example
      The following excerpt and slides are taken with permission from Moreau, L., The Open Provenance Model: Towards inter-operability of Provenance Systems, http://users.ecs.soton.ac.uk/lavm/talks/iam09.pdf
      Example: the provenance of a bottle of wine includes:
      • Grapes from which it is made
      • Where those grapes grew
      • Process in the wine’s preparation
      • How the wine was stored
      • Between which parties the wine was transported, e.g. producer to distributor to retailer
      • Where it was auctioned
  • 3. Provenance Definition
      • Oxford English Dictionary:
        – the fact of coming from some particular source or quarter; origin, derivation
        – the history or pedigree of a work of art, manuscript, rare book, etc.;
        – concretely, a record of the passage of an item through its various owners.
      • The provenance of a piece of data is the process that led to that piece of data
  • 4. The Science Lifecycle
      [Diagram labels: scientists; Local Web Repositories; Graduate Students; Undergraduate Students; Virtual Learning Environment; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; experimentation; Data, Metadata, Provenance, Scripts, Workflows, Services, Ontologies, Blogs, ...; Digital Libraries; Next Generation Researchers]
      Adapted from David De Roure’s slides
  • 5. Finding the Provenance of research outputs across all the systems data transited through
      [Diagram: the Science Lifecycle figure from slide 4, with the same labels]
  • 6. Open Provenance Model (OPM)
      • Allows us to express all the causes of an item
      • Allows for process-oriented and dataflow-oriented views
      • Based on a notion of annotated causality graph
      Moreau, L., et al.: OPM v1.00 (Dec 2007), OPM v1.01 (Jul 2008), OPM v1.1 (Dec 2009)
  • 7. OPM Requirements
      • To allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model.
      • To allow developers to build and share tools that operate on such a provenance model.
      • To define the model in a precise, technology-agnostic manner.
      • To define bindings to XML/RDF separately.
      • To support a digital representation of provenance for any “thing”, whether produced by computer systems or not.
  • 8. OPM Serialisation
      • OPM is an abstract data model to represent past execution and what causes data and processes to occur
      • OPM can be serialised in different formats, referred to as “technology bindings” or serialisations:
        – OPM XML schema (http://openprovenance.org/model/v1.01.a)
        – OPM RDF schema
        – OPM OWL ontology
      • Effort underway to ensure full equivalence of representations
  • 9. Nodes
      • Artifact (A): Immutable piece of state, which may have a physical embodiment in a physical object, or a digital representation in a computer system.
      • Process (P): Action or series of actions performed on or caused by artifacts, and resulting in new artifacts.
      • Agent (Ag): Contextual entity acting as a catalyst of a process, enabling, facilitating, controlling, affecting its execution.
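  • 10. Edges
      • used(R): process P used artifact A in role R
      • wasGeneratedBy(R): artifact A was generated by process P in role R
      • wasControlledBy(R): process P was controlled by agent Ag in role R
      • wasTriggeredBy: process P2 was triggered by process P1
      • wasDerivedFrom: artifact A2 was derived from artifact A1
      Edge labels are in the past tense to express that they describe past executions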
  • 11. Illustration
      • Processes “used” artifacts and “generated” artifacts
      • Edge “roles” indicate the function of the artifact with respect to the process (akin to function parameters)
      • Edges and nodes can be typed
      Causation chain:
      • P was caused by A1 and A2
      • A3 and A4 were caused by P
      • Does it mean that A3 and A4 were caused by A1 and A2?
      [Diagram: process P (type=division) linked to artifacts A1 and A2 by edges used(divisor) and used(dividend), and to artifacts A3 and A4 by edges wasGeneratedBy(rest) and wasGeneratedBy(quotient)]
  • 12. Time Constraints
      [Diagram: process P, controlled by agent Ag via wasControlledBy(R) from start T2 to end T5, has a used(R) edge at T3 and a wasGeneratedBy(R) edge at T4; the used artifact has its own wasGeneratedBy(R) edge at T1 and the generated artifact its own used(R) edge at T6]
      • T1<T3 (artifact must exist before being used)
      • T2<T3 (process must have started before using artifacts)
      • T3<T5 (process uses artifacts before it ends)
      • T2<T4 (process must have started before generating artifacts)
      • T4<T5 (process generates artifacts before it ends)
      • T4<T6 (artifact must exist before being used)
      • T2<T5 (process must have started before ending)
      • No constraint between T3 and T4
  • 13. Dublin Core Profile (draft)
      • To many people, provenance is primarily about attribution, citation, bibliographic information
      • DC provides terms to relate resources to such information
      • The DC profile aims to map Dublin Core terms to OPM concepts and graph patterns
      With Simon Miles and Joe Futrelle
  • 14. DC to OPM example: dc:publisher
      [Diagram labels: artifacts A1 and A2 (state=unpublished, state=published); process P (publish); agent Ag (person, name=Luc); edges wasSameResourceAs, wasGeneratedBy, wasActionOf]
  • 15. What have we learned about provenance?
      • Provenance: describes and records the results of processes on objects over time
      • OPM represents provenance as XML
      • OPM can be serialised in different formats
      • RDF, Semantic Web
      • OPM is a work in progress
      By working with an open standard model that can pass information as XML and in standard serialisation formats (e.g. RDF), it should be possible to build provenance services into repository environments
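
The node, edge and time-constraint slides (9 to 12) can be illustrated with a small in-memory graph. What follows is a minimal Python sketch, not an official OPM binding: the class `OPMGraph`, its methods and the function `violated_constraints` are hypothetical names made up for this illustration, and the assignment of roles to A1..A4 in the division example is an assumption, since the flattened slide text does not say which artifact carries which role.

```python
from dataclasses import dataclass, field

# Node kinds from the "Nodes" slide.
ARTIFACT, PROCESS, AGENT = "artifact", "process", "agent"

# Edge label -> (kind of the effect node, kind of the cause node).
# Edges point from the effect back to its cause, as on the "Edges" slide.
EDGE_KINDS = {
    "used": (PROCESS, ARTIFACT),
    "wasGeneratedBy": (ARTIFACT, PROCESS),
    "wasControlledBy": (PROCESS, AGENT),
    "wasTriggeredBy": (PROCESS, PROCESS),
    "wasDerivedFrom": (ARTIFACT, ARTIFACT),
}


@dataclass
class OPMGraph:
    """Hypothetical in-memory causality graph (illustration only)."""
    nodes: dict = field(default_factory=dict)  # node id -> (kind, annotations)
    edges: list = field(default_factory=list)  # (label, effect, cause, role)

    def add_node(self, node_id, kind, **annotations):
        self.nodes[node_id] = (kind, annotations)

    def add_edge(self, label, effect, cause, role=None):
        expected = EDGE_KINDS[label]
        actual = (self.nodes[effect][0], self.nodes[cause][0])
        if actual != expected:
            raise ValueError(f"{label} must link {expected}, got {actual}")
        self.edges.append((label, effect, cause, role))

    def direct_causes(self, node_id):
        """Ids of the nodes that node_id points to, i.e. its asserted causes."""
        return sorted(c for _, e, c, _ in self.edges if e == node_id)


def violated_constraints(t1, t2, t3, t4, t5, t6):
    """Return the names of the slide-12 constraints that do not hold."""
    checks = [
        ("T1<T3", t1 < t3), ("T2<T3", t2 < t3), ("T3<T5", t3 < t5),
        ("T2<T4", t2 < t4), ("T4<T5", t4 < t5), ("T4<T6", t4 < t6),
        ("T2<T5", t2 < t5),
    ]  # T3 and T4 are deliberately not compared with each other
    return [name for name, holds in checks if not holds]


# Division example from slide 11 (role-to-artifact mapping is assumed).
g = OPMGraph()
for artifact_id in ("A1", "A2", "A3", "A4"):
    g.add_node(artifact_id, ARTIFACT)
g.add_node("P", PROCESS, type="division")
g.add_edge("used", "P", "A1", role="divisor")
g.add_edge("used", "P", "A2", role="dividend")
g.add_edge("wasGeneratedBy", "A3", "P", role="rest")
g.add_edge("wasGeneratedBy", "A4", "P", role="quotient")

print(g.direct_causes("P"))   # ['A1', 'A2']
print(g.direct_causes("A3"))  # ['P']
print(violated_constraints(1, 2, 4, 3, 5, 6))  # [] (T3 after T4 is allowed)
```

The sketch records only the causes that are asserted directly. It does not try to answer the question on slide 11 of whether A3 and A4 were caused by A1 and A2; in this representation, such a link would have to be added explicitly as a `wasDerivedFrom` edge.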