Making small Data BIG
THROUGH INTERDISCIPLINARY
PARTNERSHIPS AMONG LONG-TAIL DOMAINS
AGU FM 2014: IN14B-01
K. Lehnert 1, S. Carbotte 1, R. Arko 1, V. L. Ferrini 1, L. Hsu 1, L. Song 1,
M. Ghiorso 2, J. D. Walker 3
1 Lamont-Doherty Earth Observatory, Columbia University, Palisades, NY,
2 OFM Research, Seattle, WA
3 University of Kansas, Lawrence, KS
DATA FACILITIES IN A BIG DATA
WORLD
small data
BIGdata
Data Centers & Facilities
X axis: Data Volume
Y axis: Data Size
Distributed datasets
Research Data Collections
HOW WE DEFINE ‘BIG’
Volume
Velocity
Variety
Veracity
VALUE
“The long tail is a
breeding ground for new
ideas and never before
attempted science.”
(Heidorn, B. 2008: “Shedding
Light on the Dark Data in the
Long Tail of Science”)
ADDING VALUE
citable
small data
BIG DATA
accessible
integrated
digital data collection
trustworthy repositories
domain standards
interoperable
APIs, OLP,
DATA FACILITIES
“acquire, curate, preserve, and/or disseminate data, software,
and/or models for one or more defined communities or
disciplines”
need to adhere to standards (e.g. ISO 16363, ICSU-WDS)
such as
• governance and organizational viability
• organizational structure and staffing
• procedural accountability and preservation policy framework
• financial sustainability
• contracts, licenses, and liabilities
DOMAIN-SPECIFIC
DATA FACILITIES
“With both content-area and
digital curation expertise, domain
repositories are uniquely capable
of ensuring that data and other
research products are adequately
preserved, enhanced, and made
available for replication,
collaboration, and cumulative
knowledge building.”
“Sustaining Domain Repositories for Digital Data: A Call for
Change from an Interdisciplinary Working Group of Domain
Repositories”
Inter-university Consortium for Political and Social Research
(ICPSR), 2013
IEDA: A MULTI-DISCIPLINARY DATA
FACILITY FOR LONG-TAIL SCIENCE
• Many disciplines
• geochemistry, marine geophysics, marine geology, geochronology, and more
• Many data types
• sensor data and sample-based observations & experiments
• raw data (e.g. multi-beam), field data, lab data, derived data, samples
• gridded data, point data, time-series data, maps, photos, and more
• File sizes varying from a few kilobytes to terabytes
DRIVEN BY MULTI-DISCIPLINARY
SCIENCE
• Ridge 2000
• MARGINS
• GeoPRISMS
FROM RESEARCH DATA COLLECTIONS
TO DATA FACILITY
“This Cooperative Agreement
converts a series of
proposal/award-driven activities
into a community-based facility
that serves to support, sustain,
and advance the geosciences by
providing a centralized location
for the registry of and access to
data essential for research in the
solid-earth and polar sciences.”
LDEO Data projects funded by NSF OCE,
EAR, OPP that were merged into IEDA
FROM RESEARCH DATA COLLECTIONS
TO DATA FACILITY
Formal Governance
Robust Infrastructure
Stable Expert Team
Accreditation
Adherence to
Community Standards
IEDA: small data gone BIG
IEDA Syntheses
 19 x 106 analytical values in EarthChem
 2.63 x 106 miles of data from 808 cruises in the
Global Multi-Resolution Topography (GMRT)
IEDA Repositories
 >500,000 files
 47 TB
 4 x 106 samples
LAYERED SERVICES:
THE EUDAT MODEL
Users
Discipline-specific Services
- Data capture (templates, software tools)
- Domain-specific workflows & GUIs
- Data products (syntheses)
- Community standards
- User support & training
Common Services
- data publication (DOI)
- data submission
- data management (investigator) support
- integrated data access & visualization
- interoperability (web services, RDF linked data, etc.)
- community governance
- community liaison (E&O)
IEDA: SCOPE & PARTNERS
EarthChem MGDS
Users (Data contribution & retrieval)
Geochron
IEDA Common Services
Solid Earth Observational Data
Areas of expertise: Sensor data & Sample data
IEDA: SCOPE & PARTNERS
EarthChem MGDS
Users (Data contribution & retrieval)
Geochron
IEDA Common Services
LEPR
PARTNERS
ROLES & RESPONSIBILITIES
Operation of partner systems & services
• Day-to-day operation (except sys admin)
• Planning improvements & new capabilities (supported by and in coordination with IEDA Implementation Team)
• Align partner systems with IEDA Common Services
• Plan & oversee budget for their activities
• Interact with their specific user communities (user support, training,
feedback, etc.)
Participate in IEDA Partner Assembly
• Contribute to strategic planning & development
• Contribute to planning & prioritization of IEDA developments & activities
• Recommend new opportunities & partnerships
• Participate in IEDA governance
• Participate in annual face-to-face meeting
EXAMPLE
IEDA
Repository
IEDA
Sample
Registry
IEDA Sys Op
J.D. Walker (KU):
- metadata schemas
- user interfaces
- web services
- community liaison
Geochron
IEDA Common
Services
EXAMPLE
IEDA
Repository
IEDA
Sample
Registry
IEDA Sys Op
M. Ghiorso
(OFM-Research):
- metadata schemas
- user interface
- web services
- community liaison
LEPR
IEDA Common
Services
A SCALABLE MODEL
EarthChem MGDS
Users (Data contribution & retrieval)
Geochron LEPR
IEDA Common Services
XX YY
. . . . . .
‘EXTERNAL’ PARTNERSHIPS
Partner: Funded through the Cooperative Agreement
Partner: Funded outside the CA; contract with IEDA
Users (Data contribution & retrieval)
IEDA Common Services
CONCLUSION
Data facilities can grow small data through partnerships
among data efforts in long tail communities
• Maintain the expertise and community liaison of domain-
specific data efforts
• Leverage data curation expertise & infrastructure of data
facilities
THE NEW IEDA
Interdisciplinary
Earth Data Alliance
• “IEDA strives to be a leading-edge inter-disciplinary data facility for solid earth data and information,
• founded in domain-specific data resources,
• to deliver integrated and streamlined data services that advance Ocean, Earth and Polar science and education.”


Editor's Notes

  • #2: It is ironic that we are starting this session on ‘Data Facilities in a Big Data world’ talking about small data.
  • #3: The data world generally distinguishes between big data and small data based on data size and data volume, meaning the number of datasets. The BIG data world in the Earth Sciences is represented by disciplines that generate massive volumes of observational or computed data using large-scale, shared instrumentation such as global sensor networks, satellites, or high-performance computing facilities. These data are typically managed and curated by well-supported community data facilities. The small data world is the one where data are primarily acquired by individual investigators or small teams (known as ‘long-tail data’). These small data are usually poorly shared and integrated, and lack data repositories that ensure persistent access, quality control, long-term archiving, standardization, and interoperability. Disciplines with small, PI-generated, distributed data often lack domain-specific data facilities: there is no consensus on data practices and insufficient funding to support data facilities. But these communities are often served by “Research Data Collections” of substantial scientific value. These collections are mostly short-term funded (research grants), PI-driven (a ‘single point of failure’), and lack the resources to implement repository standards.
  • #6: enabling transformational science; fostering technological development; promoting educational opportunities
  • #8: IEDA is a data facility that hosts observational solid earth data and tools from the marine, terrestrial, and polar environments. It comprises multiple diverse data systems that were developed independently, serving both sensor data from large collaborative cruise programs and sample-based measurements from unique analytical laboratories. IEDA data systems enable the data to be discovered and reused.
  • #21: Gain “trustworthiness” through IEDA’s shared repository services: DOI registration, long-term archiving solutions, linking to literature & awards. Augment usage & value through integration into IEDA’s multi-disciplinary data services: single-point data submission, sample registration, links to scientific literature, data access & visualization (GeoMapApp). Ensure representation in relevant data curation, data publication, and informatics communities. Improve sustainability.