
DATA SHARING TOOLKIT

March 2020

Approaches, guidance and resources
to unlock the value of data
About Smart Dubai

His Highness Sheikh Mohammed bin Rashid Al Maktoum, Vice President and Prime Minister of the UAE, and Ruler of Dubai, launched the Smart Dubai Initiative in 2013 with a vision of making Dubai the happiest and smartest city on Earth.

The Smart Dubai office was formed in 2015 to oversee Dubai’s smart transformation and accomplish the leadership’s vision. Collaborating with government and private sector partners, Smart Dubai is consistently adopting the latest technological innovations to provide efficient, seamless, safe and personalised city experiences for residents and visitors.

As of January 2020, the Smart Dubai office was officially renamed as the Smart Dubai Department.

www.smartdubai.ae

About Nesta

Nesta is an innovation foundation. For us, innovation means turning bold ideas into reality and changing lives for the better.

We use our expertise, skills and funding in areas where there are big challenges facing society.

Nesta is based in the UK and supported by a financial endowment. We work with partners around the globe to bring bold ideas to life to change the world for good.

www.nesta.org.uk

Acknowledgements

This toolkit was created by Tom Symons and Camilla Bertoncin from Nesta in partnership with the Smart Dubai team.

As part of the research process for this toolkit, we held two workshops and presented our work to external experts in Dubai and London.

We’d like to thank all those who have attended the workshops and contributed along the way. In particular, we would like to thank Thea Snow and Eddie Copeland for their insight and support.

This toolkit is a copyrighted work of Smart Dubai.
About this toolkit
  Who is it for?
  When should it be used?
  How should it be used?

Data sharing
  Public problems addressed through data and analysis
  Barriers to data sharing
  Examples of existing data sharing initiatives
  Labelling data sharing initiatives
  Data sharing decision tool

A: The decision matrix
  Canvasses
  Why share data?
  What data to share?
  Who to involve?
  What is the overall governance structure?
  What is the appropriate data infrastructure?
  How is data accessed?

B: Project foundations
  Project foundations
  Checklist
  Requirements
ABOUT THIS TOOLKIT

Globally, innovation in data governance is fairly embryonic. Recent years have seen a flurry of new activity, but much of this remains poorly defined or nascent. Few, if any, countries have ‘cracked the code’ of responsible and effective data sharing initiatives and governance.

Data’s value can be unlocked by creating trusted and ethical mechanisms for individuals, private and public sector organisations to share data.

Our research has focused on how trusted data sharing arrangements can be formed that ensure data is in the right hands and value can be extracted from it.

We approached this work by analysing a range of models for data sharing and boiling them down to the essential components that are common to all of them.
Who is it for?

— Innovators around the world interested in exploring data sharing initiatives and looking for a tool to spark the right discussions, anticipate issues and find new collaborative approaches to data sharing.
— International organisations, public institutions, businesses and non-profits.
— Those familiar with data and/or previously involved in traditional data sharing.
— Those grappling with slow or failed progress in activating data sharing collaborations on a bigger scale or dealing with complex initiatives.

This toolkit is designed to be used both by individuals and by teams and groups working through the activities and canvasses together.

When should it be used?

We designed a flexible decision tool that:

— Provides useful guidance and resources for private and public organisations to prepare for and design data sharing initiatives.
— Helps them identify the right combination of options for their specific context and circumstances.

We have created six canvasses for this toolkit. While this guide presents the stages in sequential order, in reality users may find that they begin in the middle and end at the start. This is fine. It may also be necessary to walk through the steps more than once.

Additional material can be found in the ‘Useful tools’ sections. The tools are all referenced in the toolkit, accessible online and easy to use if you wish to dive deeper.
How should it be used?

You are new to data sharing and would like to know more about what is theoretically out there.

— The first part of this toolkit provides the context, an overview of a range of different data sharing initiatives and plenty of case studies (see the ‘Data sharing’ section).

You are considering proposing a data sharing partnership to your team, but want to have an idea of what it would take to put one in place first.

— Start with the decision matrix (section A).
— You can go through the material of the canvasses and dive deeper using the additional resources in the ‘Useful tools’ sections.
— Move to the ‘Project foundations’ section (section B). Do you tick all the boxes of the checklist?
— If not, is there something you can do to facilitate the process of change that leads to all boxes being ticked?

You already have a partnership in place and want to kickstart aligning expectations and incentives, and need a tool to spark conversations across different stakeholders.

— Start with the ‘Project foundations’ section (section B).
— Discuss the checklist items and then move to the more practical exercise of the decision matrix (section A).
— Print each canvas on A3 (or a large piece of paper), divide into multiple groups and repeat the process if necessary with other stakeholders to guarantee maximum representation.
DATA SHARING

It is now widely recognised that the value of data held by individuals and organisations alike can increase exponentially if it is shared and combined with other sources of data.

Bringing together data sources breaks down traditional silos and unleashes the potential for data to generate important and meaningful insights. Public services in many parts of the world are exploring the potential of data analytics to address public problems, using their own data to become more efficient and effective.

The open data movement has led to governments worldwide sharing their data. Yet many datasets that could help solve public problems remain closed, proprietary or difficult to find or share.

For this reason, governments, public and private sector organisations around the world are experimenting with different ways of accelerating data sharing and collaboration between those who hold valuable data and those able to deliver solutions to unlock the value from data.
Public problems addressed through data and analysis

Labour markets

— Future of work
— Skills
— Jobs

The Open Skills Project is a public–private partnership focused on providing a dynamic, locally relevant, up-to-date and normalised taxonomy of skills and jobs. Its aim is to improve our understanding of the labour market and reduce frictions in the workforce data ecosystem by enabling a more granular common language of skills among industry, academia, government, and non-profit organisations.

Consumer data and retail

— Consumer sentiment and experience
— Consumption patterns
— Business operations

Linking consumer confidence index and social media sentiment analysis is an analysis of the correlation between the official consumer confidence index obtained from the MIER (Malaysian Institute of Economic Research) and social media big data (via sentiment analysis, from Twitter) on consumer purchasing behaviour for two types of products over the course of two years.

Smart cities and city data

— Transport, mobility and urban planning
— Congestion reduction
— Energy sector

City Brain, a partnership of the Chinese government with the commercial platform Alibaba, provides real-time data from 750 sources to tackle problems of traffic congestion, analyse energy and water consumption patterns and identify vulnerable residents in need of additional support.

Education

— Tailoring learning materials based on a student’s needs
— Diagnosing strengths, weaknesses or gaps in a student’s learning experience
— Providing automated feedback

The Baltimore Early Childhood Data Collaborative is a partnership of Baltimore City agencies serving young children and their families, sharing data to understand the experiences of young children in Baltimore and how those experiences relate to later educational outcomes.

Health research

— Rare diseases
— Genomic data mapping
— Population growth forecasts

Personal health datasets shared by voluntary patients and research centres are increasingly used to support research and preventive health care services, such as in the cases of MIDATA, the NCI Genomic Data Commons or Healx.

Environment

— Air and water pollution reduction
— Flood risk modelling
— Forest change monitoring

Fluxnet is a repository of eddy covariance measurements of carbon dioxide and water vapour exchange from more than 800 active and historic flux measurement sites, dispersed across most of the world’s climate space and representative biomes.
Barriers to data sharing

Data breaches and privacy missteps regularly make headlines and have recently caused a profound and widening lack of trust among individuals, institutions and governments in the notion of safe data sharing.

In addition, a culture of risk aversion in public sector agencies can mean that the privacy risks are seen to outweigh the potential benefits.

While data sharing offers many opportunities, there are also significant challenges to be addressed and barriers to be overcome. Other barriers to data sharing include:

— Risks associated with sharing commercially sensitive information
— The complexity of facilitating cross-border data flows
— Reputational concerns
— Regulatory or legal uncertainty
— A lack of dedicated personnel to drive and steward such initiatives
— Mixed levels of data maturity across organisations
— Unclear incentives (especially when engaging private companies)

Given the challenges, but in light of the significant opportunities outlined, there is a growing demand for trusted mechanisms for sharing data and meaningful privacy and data protection regulations.

Institutional frameworks needed to support the safe and trusted sharing and use of data between multiple different organisations are not yet well-established – there are not yet clearly codified processes that facilitate responsible data sharing.

The challenge facing public and private sector entities globally is how to strengthen trust and implement effective public–private data collaboration.

This toolkit aims to offer some insights and answers to tackle this challenge.
Examples of existing data sharing initiatives

At the outset of our research, we surveyed a range of different data sharing initiatives and grouped them by the following types. While these may seem very different, they are all united in their attempt to release greater value from data through sharing.

Data commons
A spectrum of initiatives in which data is shared as a common resource among individuals or organisations, who collectively decide on the rules that govern access to it.
— Dataverse
— DECODE

Data exchanges and markets
Usually this is a data platform where data is treated as an economic good, and access is regulated through price mechanisms.
— Copenhagen–Hitachi City Data Exchange

Data trusts
Legal structures that provide independent stewardship of data.
— Open Data Institute (ODI) data trusts pilots

Open data platforms and open APIs
Curated sets of open datasets and APIs (application programming interfaces).
— data.gov.uk
— Transport for London Unified APIs

Data collaboratives
Includes all forms of collaboration in which participants from different sectors – including private companies, research institutions and government agencies – exchange data to create value for the public good.
— Global Forest Watch
— California Data Collaborative

Data co‑operatives
Mutual organisations owned and controlled by their members, formed to collect and share data in the interests of their members.
— MIDATA

Offices of data analytics
Multiple organisations share data sourced from public sector bodies to improve services and make better decisions.
— Essex Centre for Data Analytics

Research partnerships and data hackathons
Projects focused on the identification of specific problems, collaborative work sessions and events where people work on data‑related projects.
— Consumer Data Research Centre
— Analytics Vidhya hackathon contests
Labelling data sharing initiatives

Because the field is so nascent, organisations have emerged that specialise in researching and promoting specific data sharing models.

To get a sense of the great variation in names and approaches, below we show the ODI Data Access Map. This is a clear example of how many terms and definitions can be attached to data sharing initiatives, and it is an attempt to navigate them by crowdsourcing definitions and case studies. These labels can be used interchangeably depending on the organisation in question, leading to confusion and a lack of rigour in their application.

Some of the resistance to exploring data sharing opportunities arises because this definitional confusion leaves decision‑makers with no firm framework to help them understand their options for sharing data in practice. In addition, each collaborative project involving data sharing is unique, set in a specific dynamic composed of different actors, laws and rules, expectations, levels of expertise, incentives and relations. In this context, predefined models and labels, such as those described above, are not useful as tools for translating theory into practice.

Rather than considering whether a ‘data trust’ or a ‘data collaborative’ or a ‘data commons’ is the right approach, it is much more important to think about the specific problem that sharing data is set to solve. From this problem and the specific context in which the partnership is being developed, other questions around governance, power and access will flow.

[Figure: ODI (2019), Data Access Map]
Data sharing decision tool

Our work has involved an in-depth review of a range of different data sharing collaboratives and partnerships to understand the essential components that are common to all of them.

This analysis was then translated into a flexible decision tool that covers two stages of development, which can happen almost in parallel, as the process of designing a data sharing initiative will involve some degree of iteration.

A: The decision matrix
Identifies six key decision points, prompting and guiding discussions about all key elements of a data sharing arrangement.

B: Project foundations
Identifies the conditions required to move forward with the data sharing project and provides the overarching legal, technical and relationship considerations.
A: THE DECISION MATRIX

Analysis of a range of models (such as trusts, collaboratives and co‑operatives) indicates that there are six essential components of a data sharing collaborative. These are framed in the decision matrix as the key decision points that anyone developing a data trust should consider when designing a data sharing approach.

These are ordered in what is in theory an ideal sequence, but often reality is more complex. There is a degree of necessary circularity to the process, meaning that the tool may be used multiple times, switching from one section to another, refining the level of detail and the participation of different groups of stakeholders.

Each decision point is described in more detail in the following section, with guidance around which option is best suited for the particular set of circumstances.
The decision matrix

Why share data?
— Discovery of new insights
— Unlocking innovation
— Faster decision-making
— Prediction capability and forecasting
— Efficiency and co-ordination

What data to share?
— Open data
— Public sector
— Private sector
— Individuals

Who to involve?
— Private sector organisations
— Third sector organisations
— Public sector organisations
— International entities
— Individuals

What is the overall governance structure?
— Top-down
— Bottom-up

What is the appropriate data infrastructure?
— Centralised
— Federated
— Decentralised

How is data accessed?
— No access
— Restricted
— Open data
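
For teams that prefer to record workshop outcomes digitally, the six decision points and their options can be written down as a simple lookup structure. The sketch below is only an illustration: the option labels are taken from the matrix, while the `review_choices` helper and the `draft` selections are hypothetical and not part of the toolkit.

```python
# Decision points and options transcribed from the decision matrix.
DECISION_MATRIX = {
    "Why share data?": [
        "Discovery of new insights",
        "Unlocking innovation",
        "Faster decision-making",
        "Prediction capability and forecasting",
        "Efficiency and co-ordination",
    ],
    "What data to share?": [
        "Open data", "Public sector", "Private sector", "Individuals",
    ],
    "Who to involve?": [
        "Private sector organisations",
        "Third sector organisations",
        "Public sector organisations",
        "International entities",
        "Individuals",
    ],
    "What is the overall governance structure?": ["Top-down", "Bottom-up"],
    "What is the appropriate data infrastructure?": [
        "Centralised", "Federated", "Decentralised",
    ],
    "How is data accessed?": ["No access", "Restricted", "Open data"],
}


def review_choices(choices):
    """Hypothetical helper: report decision points that have no selection
    yet, or a selection that does not appear on the matrix."""
    problems = []
    for question, options in DECISION_MATRIX.items():
        picked = choices.get(question, [])
        if not picked:
            problems.append(f"Not yet discussed: {question}")
        for option in picked:
            if option not in options:
                problems.append(f"Unknown option for {question}: {option}")
    return problems


# Illustrative, made-up workshop notes covering only two decision points.
draft = {
    "Why share data?": ["Faster decision-making"],
    "What data to share?": ["Public sector", "Private sector"],
}
for line in review_choices(draft):
    print(line)
```

Because the process is circular rather than strictly sequential, a helper like this can be rerun after each workshop round to see which decision points still need discussion.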
Canvasses

Why share data?
This step encourages you to define the purpose of sharing data. Defining this upfront, and having shared agreement of the purpose and what success looks like, is a precondition to a project’s success.
— Review Canvas 1

What data to share?
This step of the decision matrix supports consideration of datasets needed for the partnership and prompts stakeholders to discuss the types of data that need to be shared, their form and the ethical considerations that need to follow.
— Review Canvas 2

Who to involve?
This step supports consideration of which parties need to be involved in the arrangement to make it successful.
— Review Canvas 3

What is the overall governance structure?
This step of the matrix prompts parties to consider where power lies, and how the structure and roles of all parties interact in the data sharing arrangement.
— Review Canvas 4

What is the appropriate data infrastructure?
This step sets out the options for data storage infrastructure, which enable the data exchange across stakeholders.
— Review Canvas 5

How is data accessed?
This step helps you choose which form of data access best fits your initiative.
— Review Canvas 6
Canvas 1: Why share data?

How to use this canvas
Review the questions, identify the reasons why you might use data to achieve the purpose and define your problem statement/value proposition in the box below.

Questions to ask
— What type of problem are you trying to solve?
— What other things could you do instead of sharing data?
— Are there examples of similar projects that could help identify benefits and make the case for the partnership to be formed?
— Is the output of the initiative going to be a one-off analysis or does it require continuous data sharing?
— Is the data you want to share going to be used for strategic reasons or operational?
— Are you starting with a specific problem in mind or with the data?
— How is the project being communicated?

Useful tools
— GovLab Data collaboratives explorer
— Nesta DIY Toolkit Problem definition

Select the reasons you might use data to achieve your purpose

Discovery of new insights
Creating new knowledge, sharing it with multiple partners and identifying the key questions that need to be addressed.

Unlocking innovation
Identifying new sources of value by opening up data to third parties that can turn them into new, innovative data products with shared value.

Faster decision-making
Providing stakeholders with a more complete and accurate picture of complex issues for rapid decision-making. This can include improving public service design and delivery, and emergency response.

Increased prediction capability and forecasting
Identifying new drivers of more accurate forecasts from disparate, interrelated and interconnected data sources from the use of advanced data analytics.

Optimised process efficiency and co‑ordination
Providing additional insights from new data sources to augment co-ordination and reduce inefficiencies in day-to-day operations.

Define a specific problem statement or value proposition below

.............................................................................................................................
Why share data? – Case studies

Discovery of new insights

AMdEX (Amsterdam Data Exchange)
A data exchange initiative by the Amsterdam Economic Board, backed by Amsterdam Science Park and Amsterdam Data Science, and supported by the City of Amsterdam. The project is still at concept phase, and aims to collect city data held by government agencies, companies and others to provide broad access to data for researchers, businesses, governments and individuals in a secure marketplace. AMdEX explores possible use cases where data exchanges might be useful, including:

— Data Logistics for Logistics Data (DL4LD): An innovation project of the Dutch national technical institute TNO and the University of Amsterdam on sharing logistic data at a large scale.
— Chief E-Mobility: A data-driven optimisation project aimed at the creation of an electric car charging infrastructure in Amsterdam.
— Knowledge Mile: One of Amsterdam’s long streets, made into the smartest city street by the Amsterdam Creative Industries Network.

Unlocking innovation

Open Banking
Since 2018, a regulation from the Competition and Markets Authority mandates that UK-regulated banks allow authorised providers (such as licensed startups offering budgeting apps, or other banks) direct access to customer account information and data at transaction level through APIs. The idea behind this initiative is that it will bring more innovation to financial services thanks to third-party developers, who will create new tools that will positively impact on vulnerable communities’ financial inclusion.

Faster decision-making

Google Waze
A platform that provides real-time anonymised crowdsourced traffic data collected from participating drivers. In its Connected Cities programme, it shares its large amount of traffic data with government agencies, which can use this data to better inform policy or quickly deploy traffic assistance if needed. Some use cases of Waze data include:

— Reducing traffic
— Reducing incident response times
— Fixing potholes and identifying high-priority streets for maintenance services
— Sharing garbage trucks’ real‑time location

Increased prediction capability and forecasting

Flowminder
During the 2010 Haiti earthquake response, Flowminder researchers pioneered the use of de-identified data from mobile operators to follow population displacement. As a result, mobile phone data is increasingly used both in emergency contexts, providing operationally useful insight to humanitarian staff, and in discrete pieces of research looking, for instance, at the intersection of migration and the climate crisis. To support partners, Flowminder has created FlowKit, a suite of software tools designed to enable access and analysis of mobile data for humanitarian and development use cases.

Optimised process efficiency and co-ordination

Seoul Owl Bus
In Seoul, South Korea, where the metro system shuts down from midnight to 5 am, the municipal government used its citizens’ late-night calls and texts to plan routes for a new night bus service. A telecom company (KT) provided the government with anonymous phone data, which officials used to colour-code regions of the city by call volume. They then analysed the number of passengers who were getting on and off at each bus stop in the regions with heavy call volume and, based on this information, implemented the Owl Bus service along the nine most heavily trafficked late-night routes. The partnership created a service that saved late-night commuters $1.2 million in taxi fares from 2012 to 2014 and reduced car trips by 2.3 million annually (city buses emit 80 per cent less carbon monoxide than private cars). It also benefited low-income communities, providing them with a viable way to commute home to the outer boroughs after working night shifts in the city.
Canvas 2: What data to share?

How to use this canvas
Identify the datasets that would help you solve the problem or achieve the value proposition and place them in the diagram below (group by data owner).

In addition, you could also use Nesta’s Dataset Catalogue, which we created to help you list the datasets identified, rank them from most to least essential, and include useful information about the data.

Questions to ask
— What data would you need, and how much of it is already available?
— How are you going to incorporate data if it becomes available in the future?
— What are the data gaps and can you mitigate the effect of inequality in data availability?
— Where do these datasets fit on the spectrum of closed–open?
— How do we understand public norms around this type of data (e.g. mandatory portability)?
— What form does each of the datasets take?
— How mature is the data?
— How shareable and easy to link is it?
— What is the level of anonymisation of the data?

Useful tools
— Nesta Dataset Catalogue
— ODI Data Spectrum
— ODI Data Access Map
— ODI Data About Us
— Wellcome Trust, Understanding patient data
— Local Government Association Data Maturity self‑assessment tool

[Diagram: potential datasets, grouped by data owner: Open data, Public sector, Private sector, Individuals]
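
The listing and ranking exercise on this canvas can also be kept in a small script. The sketch below is a hypothetical stand-in, not the Nesta Dataset Catalogue itself: the `DatasetEntry` fields, the helper functions and the placeholder entries are all assumptions made for illustration. Only the four owner groups come from the canvas.

```python
from dataclasses import dataclass

# Owner groups as they appear on the canvas diagram.
OWNER_GROUPS = ("Open data", "Public sector", "Private sector", "Individuals")


@dataclass
class DatasetEntry:
    name: str        # short description of the dataset
    owner: str       # one of OWNER_GROUPS
    essential: int   # 1 = most essential; larger numbers = less essential
    available: bool  # is it already accessible to the partnership?


def ranked(entries):
    """Return entries ordered from most to least essential."""
    for entry in entries:
        if entry.owner not in OWNER_GROUPS:
            raise ValueError(f"Unknown owner group: {entry.owner}")
    return sorted(entries, key=lambda entry: entry.essential)


def data_gaps(entries):
    """Names of datasets that are needed but not yet available."""
    return [entry.name for entry in ranked(entries) if not entry.available]


# Placeholder entries for illustration only; replace with your own list.
catalogue = [
    DatasetEntry("Dataset B", "Private sector", essential=2, available=False),
    DatasetEntry("Dataset A", "Public sector", essential=1, available=True),
    DatasetEntry("Dataset C", "Individuals", essential=3, available=False),
]

for entry in ranked(catalogue):
    status = "available" if entry.available else "gap"
    print(f"{entry.essential}. {entry.name} ({entry.owner}): {status}")
```

Keeping availability alongside the ranking makes it easier to answer the first and third questions above: how much of the needed data is already available, and where the gaps are.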

▶ What data to share? – Case studies

Open data Private companies


Open data can be published by Data held by private companies
different actors, in different formats are a very rich resource.
or through APIs. For a list of open Examples include:
data hubs see 27 ▶
— Bank account information and
customer data at transaction level
Public sector (Open Banking).

There are many examples — De-identified data types produced


of initiatives using data held by mobile network operators,
by the public sector. increasingly used to inform public
health, urban planning and
Interesting examples are the pilots crisis response (e.g. Flowminder,
from the ODAs, which, combining Seoul Owl Bus).
data from multiple sources, tackle — Energy consumption data
issues of domestic abuse, school (e.g. the Ontario Smart
readiness, business and building Metering Initiative).
inspections and gang violence.

A growing number of cities are now Individuals


interested in unlocking the value
of urban data, and considering Examples of initiatives involving
to make it freely available to citizen data include:
the public, such as in the case
of Transport for West Midlands, — Crowdsourced traffic data and road
using Chordant’s oneTRANSPORT conditions from drivers and public
Data Marketplace to help deliver transport (Google Waze).
improved transport services for — Record data such as noise levels,
residents and travellers across pollution, temperature and humidity
the region. (DECODE’s citizen sensing pilot
in Barcelona).
Another example is environmental — Health data (MIDATA).
and air quality data (e.g. AirNow).
Canvas 3: Who to involve?

How to use this canvas
Review the questions and, using the stakeholder map below, identify the people you will need to involve (place the most essential ones at the centre).

Questions to ask
— Who is initiating the partnership?
— What are the incentives of stakeholders to take part in the partnership?
— Assess what the value distribution is in the partnership (i.e. is there an equity of value among all stakeholders? Who will not benefit?)
— Where is funding coming from?
— Who holds data that relates to this use case?
— Is there anyone else besides data providers who is needed to make this project work? For example, do you have the expertise in terms of data science?
— When data subjects’ involvement is required, how are they represented in the decisions?
— Is there an option for opting in or out?

Useful tools
— ODI Mapping Data Ecosystems
— Nesta Partnership Toolkit

Stakeholder map

Third sector organisations
e.g. universities and research centres, think tanks, civil society organisations, voluntary organisations (NGOs, community networks, etc.)

International organisations and companies
e.g. UN, World Bank, IMF, international charities, multinational organisations

Public sector organisations
e.g. national, regional and local entities (municipalities, etc.)

Individuals
e.g. patients, consumers, citizen scientists

Commercial entities
e.g. national entities (mobile networks, financial services, healthcare providers, etc.), local businesses (SMEs, etc.)
Canvas 4: What is the overall governance structure?

How to use this canvas
Discuss the questions below and identify the approach your data sharing initiative should take, considering risks and benefits summarised below.

Questions to ask
— How will each party report progress?
— How will decisions be made and in which forum?
— How will conflicts between the parties be resolved?
— Would anyone external be brought in?
— Who is accountable for what?
— How is risk being managed and, if needed, mitigated?

Useful tools
— Royal Society and British Academy, Data management and use
— ODI Lessons from pilots

Top-down
One organisation’s executive body or a group creating the conditions under which data is shared and used. This approach is disseminated under their authority to lower levels in the hierarchy of stakeholders, who are, to a greater or lesser extent, bound by them but not able to set the rules of the game.

Suitable when
— The project is driven by one organisation that initiates the collaboration, funds it and ultimately owns the problem it is set to solve.
— There are no identifiable ethical downsides or risks to privacy.

Risks
— Power concentration
— Lack of representativeness

Benefits
(to be filled in)

Trusted intermediaries
Formal or informal mediators, stewards or trustees that are appointed to manage an asset (in this case, data) for a purpose on behalf of a beneficiary or beneficiaries who own the asset. This could apply to both top-down and bottom-up approaches.

Suitable when
— There are competing incentives among stakeholders.
— The data is particularly sensitive and requires external mediation and additional measures to ensure it is shared safely.

Risks
— Competing regulations

Benefits
(to be filled in)

Bottom-up
Control is shared among multiple parties, which can (at various degrees) create the conditions under which data is shared and used. Governance can be managed through ad hoc contract-based networks with a shared vision and a set of governance principles, enshrined in interlocking agreements between all entities.

Suitable when
— There are risks associated with power concentration (e.g. for the sensitivity of data involved).
— Collective-choice arrangements are needed to allow all stakeholders to participate in decision‑making processes.

Risks
— Competing interests
— Difficult co-ordination

Benefits
(to be filled in)

▶ What is the overall governance structure? – Case studies

There are various options for governance of data sharing initiatives, which here are presented as a set of options on a spectrum of top-down to bottom-up approaches, depending on the power dynamic among those who make the decisions on the structure of the partnership, those who will run the data sharing initiative, those who will provide the data and the outputs derived from it.

In reality, many options will take the form of a hybrid, with some top-down involvement (either from the public or private sector), combined with an element of self‑governance by other stakeholders (e.g. a city, citizen group, commercial entity).

— Examples at the top-down end of the scale include Sidewalk Labs’ smart city project in Toronto and the partnership between DeepMind and the NHS in the UK, both beset by controversy and criticism, mainly around the lack of meaningful public engagement, the choice of providers and issues of data governance. There are, however, very successful top-down initiatives that involve data sharing such as:
  — The Cancer Genome Atlas (note that this, like many other health research initiatives, involves highly sensitive data).
  — The AirNow partnership on air quality data (therefore using less sensitive data).

— Top-down approaches where an intermediary is introduced include initiatives such as:
  — The Ontario Smart Metering Entity in Canada
  — The ODI data trust pilots in the UK

— At the opposite end of the spectrum, it is difficult to imagine a purely bottom-up model, as infrastructure and technologies often include decisions made by those outside of the stakeholders’ group, and can end up unrepresentative of the population. Examples of initiatives that get closer to bottom-up approaches are:
  — Membership models, often defined as ‘data co‑ops’, which give people shared ownership and decision-making power over a platform that gathers and shares data. These are early‑stage, but examples like MIDATA or Salus.coop have tried to show how this could work in practice.
  — DECODE project pilots, which demonstrated how bottom‑up approaches supported by enabling city authorities can operate as an effective hybrid model.

The best-fit governance model will depend upon the answers to a set of questions around themes such as who has decision-making power, where accountability lies and how risk is managed.
Data sharing toolkit / A: The decision matrix / Canvas 5

What is the appropriate data infrastructure?


How to use this canvas
On the spectrum, identify the architecture that is most suitable to your needs by reviewing the options provided below.

Questions to ask
— What is the structure that best fits the purpose and why?
— Will the data need curation? If so, who is responsible for it?
— Future proofing: what measures have been considered if circumstances change?

Centralised
Stakeholder data is consolidated and housed in the same physical location.

Suitable when
— Interoperability across stakeholders’ systems is not required.
— Use of legacy data systems and existing structures are preferred to creating a new one.
— Projects require lower implementation and maintenance costs.

Example
Data.gov.uk is the central platform that provides storage and open access to a wide variety of government data sources for the UK.

Federated
Predefined datasets reside within the infrastructure of data holders and metadata is searched through a central system engine.

Suitable when
— There’s a need for predefined control over what is shared and with whom.
— Both local security and regulatory compliance measures are mandatory but the need for global scale is also present.

Example
Dataverse is a platform where researchers can share their data into a ‘dataverse’, which is a container that stores datasets, documentation, code and metadata. Researchers can then track scholarly citations and have full control over their datasets, from who they share data with to when they publish it.

Decentralised
Parts of the system exist in separate locations.

Suitable when
— There is need for higher fault tolerance and scalability potential.
— Less central control and higher security (through encrypted communication protocols) are required.

Example
Ocean Protocol is a decentralised protocol and network of artificial intelligence (AI) data and services. It helps power marketplaces to buy and sell AI data and services, software for publishing and accessing commons data, and AI/data science tools for consuming data.
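
For teams that want to see the federated option in concrete terms, the following is a minimal illustrative sketch in Python, not a description of Dataverse or any other platform named in this canvas. It assumes that each data holder keeps its records locally and exposes only metadata to a central index. The names `DataHolder`, `CentralIndex` and `find_datasets` are hypothetical and chosen only for this example.

```python
class DataHolder:
    """A partner that keeps its datasets inside its own infrastructure (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._datasets = {}  # dataset id -> (metadata dict, records)

    def publish(self, dataset_id, metadata, records):
        self._datasets[dataset_id] = (metadata, records)

    def metadata(self):
        # Only metadata is exposed to the central search engine.
        return {ds_id: meta for ds_id, (meta, _) in self._datasets.items()}

    def fetch(self, dataset_id):
        # Records are served by the holder itself, never by the index.
        return self._datasets[dataset_id][1]


class CentralIndex:
    """Central search engine that stores metadata only (hypothetical)."""

    def __init__(self):
        self._entries = []  # list of (holder, dataset id, metadata)

    def register(self, holder):
        for ds_id, meta in holder.metadata().items():
            self._entries.append((holder, ds_id, meta))

    def search(self, keyword):
        keyword = keyword.lower()
        return [
            (holder.name, ds_id)
            for holder, ds_id, meta in self._entries
            if keyword in meta.get("description", "").lower()
        ]


def find_datasets(holders, keyword):
    """Build an index over all holders and return matching (holder, dataset) pairs."""
    index = CentralIndex()
    for holder in holders:
        index.register(holder)
    return index.search(keyword)
```

In this sketch the central index never stores records, so each holder keeps control over what is shared and with whom. A centralised design would instead copy the records into one place.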
Data sharing toolkit / A: The decision matrix / Canvas 6

How is data accessed? ▶


How to use this canvas
Think about the levels of security needed for your data. Within your own data sharing initiative, indicate what kind of data needs to have no access, restricted access and open access.

Questions to ask
— Will data be accessible?
— If so, what is the access model to the data?
— Is there the need to have multiple access models?
— If restricted, what kind of restriction does it require and why?

No access
The partnership requires sharing data, but this will not be made available to stakeholders.

Open access
Nationally and internationally, there is increasing commitment to the principle that data which are publicly funded should be publicly available. Open data is becoming increasingly available throughout the world, released by governments as part of the transparency agenda, in the forms of open APIs or open data hubs.

Below is a selection of open data hubs. This is not intended to be a complete list, but an indicator of what is openly available.

— International level: World Bank Open Data for global development, HDX for humanitarian data.
— Regional/Federal/National: US City-Data, the EU Open Data portal, the UK data.gov.uk, the ODI certified datasets.
— City-level: NYC Open Data for the city of New York, Amsterdam Open Data, Dubai Pulse.

Some cities, like London, are now exploring how their open data platforms could support the sharing of data that isn’t necessarily suitable to be published for public consumption. More information is available on the future of the London Datastore.

Restricted access
Access is regulated under specific terms, depending on the sensitivity of the data and governance arrangements. Some examples of data access types are:

— User registration, e.g. the SeaDataNet portal and all metadata services are public domain. However, user registration is required for submitting requests for datasets and for downloading datasets from the distributed data centres, which is arranged via the Common Data Index (CDI) service. Registration ensures that users agree with the SeaDataNet data policy and its associated User Licence, which governs all dataset deliveries via SeaDataNet. Moreover, it gives SeaDataNet partners insight into its users and their data requirements.
— Licences dependent on approval (i.e. tiered access, membership), e.g. access to the Dementia Platform UK Data Portal can be requested through an application process, limited to the variables needed for the scientific overview of the proposed project. The proposal is circulated to the data guardians of the cohort data requested. The analysis platform is housed within a separate remote desktop, which will appear as a window on the researcher’s personal host computer. Data cannot be downloaded from the analysis platform. Summary tables for the purposes of reporting may be downloaded following additional approval.
— Value exchange (money/tokens), e.g. any data marketplace, such as Qlik DataMarket.
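
The three access levels and the three kinds of restriction described in this canvas can be written down as a simple decision rule. The Python sketch below is only an illustration under stated assumptions: the names `AccessLevel`, `Restriction` and `can_access` are hypothetical, and it assumes that a restricted dataset is governed by exactly one restriction type. It does not describe how SeaDataNet, Dementia Platform UK or any marketplace implements access.

```python
from enum import Enum


class AccessLevel(Enum):
    NO_ACCESS = "no access"
    RESTRICTED = "restricted access"
    OPEN = "open access"


class Restriction(Enum):
    REGISTRATION = "user registration"
    APPROVAL = "licence dependent on approval"
    VALUE_EXCHANGE = "value exchange"


def can_access(level, restriction=None, *, registered=False, approved=False, paid=False):
    """Return True if a requester may access a dataset under the given model."""
    if level is AccessLevel.OPEN:
        return True
    if level is AccessLevel.NO_ACCESS:
        return False
    # Restricted access: the restriction type decides which condition applies.
    if restriction is Restriction.REGISTRATION:
        return registered
    if restriction is Restriction.APPROVAL:
        return approved
    if restriction is Restriction.VALUE_EXCHANGE:
        return paid
    raise ValueError("a restricted dataset needs a restriction type")
```

A partnership with multiple access models could tag each dataset with its own level and restriction and apply the same rule per dataset.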

▶ How is data accessed?

Open access
—

Restricted access
—

No access
—

B: PROJECT FOUNDATIONS

It is critical to recognise the important role that people play in supporting (or hindering) the success of data sharing initiatives.

At the end of the day, something as sophisticated as a data sharing initiative will only succeed if the people involved are working constructively in partnership with each other.

Investing in open and clear communication and effective working relationships is as critical as any other aspect of this work to ensuring success.
Data sharing toolkit / B: Project foundations

Project foundations

Once the key decisions are made, a data sharing model, or perhaps a number of possibilities will emerge. The important question then becomes how to turn that vision into a reality. There are two sets of considerations in this phase.

— A checklist, like the one on the following page, will help you go through the considerations that need to be addressed before initiating the partnership.
— The following section on requirements sums up additional requirements to consider, including legal, funding and technical requirements.

As previously mentioned, it is likely that in the design phase there will be a need to go back to the decision matrix while tackling a specific element of the project foundation and vice versa.

Project foundations
— Checklist: Senior buy-in, Incentives, Equitable contributions
— Requirements: Legal, Technical, Funding

Checklist
What to do
To establish a good collaboration environment there are three elements that need to be addressed.

If you can’t tick all three boxes, you might want to consider how to identify blockers and solve them before starting the process. This will avoid more issues down the line.

Tools
Nesta’s Partnership Toolkit is a very useful resource to support this element of the process, which identifies the practical steps that help create a successful partnership, write an effective partnership agreement and get stakeholders’ collaboration off to a good start.

The EAST framework, developed by the Behavioural Insights Team from its experience of applying behavioural insights over the past few years, sets out four simple principles for influencing behaviour: make it easy, attractive, social and timely (EAST).

Senior buy-in to work together
Are you engaging with people senior enough to make decisions and unlock issues when they arise?

What signals have you been given that there is senior buy-in, on both sides? Do you think this can last the test of time?

A clear incentive for all parties to be involved
For a data sharing partnership to work, all parties must benefit. You might have to help your potential partner to understand how they will benefit from the partnership. Remember that they will need to sell the idea internally, no matter how senior they are. Think outside the box and seek input from others who bring a fresh perspective. To help them understand the business case for partnership, consider whether they might benefit from these incentives:

Legal incentives
Regulatory measures can be taken by the government to compel data sharing.

An example is Open Banking, whereby the UK government has compelled banks to make their data open through APIs that authorised third-party organisations can use to develop personalised financial services.

Economic incentives
Value that directly or indirectly affects the bottom line by increasing revenue or reducing costs (such as efficiency gains, enlarging customer base or creating a competitive advantage) or from the direct commercialisation of data (e.g. any data exchange platform or market).

Sometimes an economic incentive to solve a particular problem can be created by announcing a challenge prize, such as in the case of the Taiwan Presidential Hackathon.

Non-economic incentives
If the benefits that arise from sharing data are not strictly economic nor result from regulations, they fall into this category. These include considerations on the common good (i.e. if the value generated by sharing data benefits society at large). An interesting case is MIDATA.coop, a Swiss co‑operative that gives people control over their medical data. One of the things that makes MIDATA different from other data storage platforms is that it does not use monetary rewards, but wider societal benefit to encourage data sharing, as they consider financial incentives the wrong incentive for people to share their health data.

Another important set of incentives in this group respond to the principle of reciprocity (i.e. people or organisations participate with the aim of helping and receiving advantages at the same time, of which reputation is a good example).

Equitable contributions
Each party involved in the data-sharing arrangement must be able to offer something to the partnership; however, making an equitable contribution does not mean making an equal contribution. Examples of contributions include money, time, resources, expertise, connections or data. What is required for a partnership to succeed is for all stakeholders to be clear and happy that the contributions brought into the partnership are fair and valuable to all parties.

Requirements
What to do
Each of these will vary considerably from initiative to initiative and will require ad hoc advice from legal and technical experts.

Make sure that everything decided in this sphere is aligned with the design decisions.

Regulation
Navigating this will require an ad hoc analysis of the project at multiple levels and will require:

— Investigating what the relevant legal/regulatory requirements are.
— Ensuring that the project complies with these requirements and justifying this to relevant authorities.
— Being aware of any evolving legal requirements and communicating these appropriately to relevant stakeholders and, when appropriate, to the public.

For example, the data protection regime in the local context where the project is held will influence the structure of the partnership. For global technology companies, though, the problem isn’t the imposition of a single regulatory regime, but rather the need to consider many, potentially conflicting ones to maintain their global businesses.

Technical
Together with individual evaluations of data maturity and technological infrastructure audits, other considerations should include:

— Data quality, standards and sharing frequency: Poor data quality and systems that are not interoperable pose challenges where ongoing data sharing is required. It is important to understand whether the project will require a one-off sharing, or whether it will require ongoing, routine data exchange. This will have a direct impact on the technical architecture that is needed for supporting data sharing.
— Is the data sharing architecture going to be outsourced or developed ad hoc? Commercial pre‑built solutions are available on the market, but might carry constraints in the way data is handled. Tailor-made data sharing technical architectures can also be developed in-house or contracted from external consultants and suppliers.

Funding
Different types of data sharing partnerships will need funding structures and mechanisms designed to support adequately the type of partnership.

— Investment of one lead organisation: Such as a government/private company or third-sector organisation.
— Equal or tiered funding: One, some or all partners form the oversight of the initiative, each contributing a predetermined amount (as per their tier).
— Tiers may be organised based on the level of input and/or incorporate a ‘cost-free’ tier, where partners can document their interest and support through data provision or consultancy where necessary.
— External funding: e.g. a grant, sum of money from a government or other organisation for running a particular data sharing project or pilot.
— Commercialisation as part of business model, with income generated through selling data created (e.g. 23andMe).
Copyright © Smart Dubai. 2020.
All rights reserved.

No part of this work may be reproduced or transmitted in any form or by any means, electronic, manual, photocopying, recording or by any information storage and retrieval system, without prior written permission of Smart Dubai.

Designed by soapbox.co.uk
