Applying Blockchain Technology in Global Data Infrastructure
ODI-TR-2016-001
1 | Open Data Institute 2016 | Applying blockchain technology in global data infrastructure
Table of contents
Executive summary
Introduction
The foundations of blockchain technology
Practical experimentation
Conclusions
About the ODI
Bibliography
Executive summary
Blockchains, or distributed ledgers, are part of a new area of technology that is generating a
lot of interest. They have the potential to become an important component of our global data
infrastructure.
In this report, we present an overview of blockchain technology and the issues that come with it,
for a non-technical audience seeking to understand the potential of distributed ledgers and
blockchains in a commercial or policy context. We focus on non-financial use cases, both to
avoid duplication of other work and to explore the wider impact of the technology.
Some of the key areas covered in this report are:
the basics of blockchain and distributed ledger technology
the landscape of use cases and applications
how blockchains will need to link into our global data infrastructure
how blockchains will need to grow and scale over time
the potential privacy implications for personal data in distributed ledgers, and
the risk of adding personal information to blockchain systems without careful
design
a practical exploration of building a blockchain system for data storage
We conclude that distributed ledgers are a potentially important area of technology, but that
we must avoid being swept up by blockchain hype, and remember to focus on solid user
needs first, before we choose the technology that we will use.
We recommend further that this is best done by organisations convening across sectors to
identify common data infrastructure needs.
Introduction
Data is essential for the modern age; it is infrastructure for the whole economy. It underpins
public service transformation, business innovation and democratic engagement. It connects
sectors.
Data infrastructure includes technology, processes and organisations. The Open Data Institute
(ODI) is in the process of developing its design principles for strengthening the data infrastructure
we rely on to build tools, products and services that benefit everyone. Sometimes we talk
about our data infrastructure as we do our road infrastructure: roads help you get from A to
B; data helps you get to a decision. Sometimes we need a superhighway or a motorway;
at other times a country lane. Data infrastructure needs to be as reliable as necessary and as
open as possible.
A new class of data infrastructure technologies has recently emerged, known as distributed
ledgers. Blockchains are one specific type of technology in that domain, and though far from
the only type, they are receiving a lot of attention. Blockchain technology emerged from the
digital currency Bitcoin, and has been hailed as a revolutionary step forward for data storage
and the decentralisation of computer systems.
With the support of Deutsche Bank, the ODI has been exploring emerging data infrastructure
technologies, including distributed ledgers and blockchains, along with their potential and the issues
they might bring. We have explored the technology landscape, raised questions based on
our recent work on data infrastructure and personal data, and begun to investigate technical
implementation. Each of these activities has led to new insights into the current state, and
potential, of these technologies.
While there are many technologies relevant to data infrastructure, we have focused recently on
blockchains because of the hype surrounding them, and a concern that it may lead to negative
outcomes if they are used in the wrong ways.
[Figure: the hype cycle, plotting visibility against time through the Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment and Plateau of Productivity.]
Many technologies go through a similar hype cycle; the peak of inflated expectations can
lead to the trough of disillusionment, before eventually we reach the plateau of productivity
(Gartner, 1995). Blockchains and distributed ledgers are clearly on the upward slope
towards the peak. In our work we hope to help accelerate progress through the cycle and,
rather than increasing the height of the peak, to raise the floor of the trough. Good ideas die
in the trough, and we want to help those good ideas survive through the cycle to success.
It is only together that we will be able to answer the question of whether blockchains are as
fundamental for forward progress in society as the Magna Carta or the Rosetta Stone (Swan,
2015, p. viii), whether they are irrevocably bound to the failed Bitcoin experiment (Hearn,
2016), or whether the truth lies somewhere in between.
In this report, we hope to help the reader form their own opinion on that emerging question,
and provide a roadmap for further exploration.
If an application does not need to meet all the above criteria, then it does not require a
blockchain solution; there are much more mature technologies available that meet many of
the above criteria (including traditional databases) and can also support distributed service
and organisational models.
Permissions
While classical blockchains (such as Bitcoin) are public and allow anyone to write to them,
some distributed ledgers work differently. Distributed ledgers can be classified as:
Public: anyone may have a copy of the database and anyone may write to
it. This is the classic Bitcoin-style public ledger, sometimes referred to as
unpermissioned or permissionless.
Permissioned: anyone may have a copy of the database, but only certain
parties may write to it (an audit log for government information, for example).
Private: only certain authorised users have access to the database, whether for
reading or writing (a blockchain used internally within an organisation's firewall,
for example).
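The three categories above can be read as a pair of access rules. The following is a minimal Python sketch of that classification, not taken from any real ledger software; the names `LedgerType`, `can_read` and `can_write`, and the single `authorised` flag, are assumptions made purely for illustration.

```python
from enum import Enum


class LedgerType(Enum):
    PUBLIC = "public"              # anyone may read, anyone may write
    PERMISSIONED = "permissioned"  # anyone may read, only certain parties may write
    PRIVATE = "private"            # only authorised users may read or write


def can_read(ledger: LedgerType, authorised: bool) -> bool:
    """Return True if a party may hold or read a copy of the ledger."""
    return ledger is not LedgerType.PRIVATE or authorised


def can_write(ledger: LedgerType, authorised: bool) -> bool:
    """Return True if a party may write to the ledger."""
    return ledger is LedgerType.PUBLIC or authorised


if __name__ == "__main__":
    # Show what an unauthorised outsider can do on each type of ledger.
    for ledger in LedgerType:
        print(ledger.value,
              "| outsider reads:", can_read(ledger, authorised=False),
              "| outsider writes:", can_write(ledger, authorised=False))
```

In practice the notion of "authorised" would be far richer than a single flag (keys, roles, governance rules), but the read/write split is what distinguishes the three types.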
The concepts of permissioned and private blockchains are controversial in the blockchain
development community, as they may reintroduce trusted intermediaries into the system and
therefore undermine one of the unique aspects of the technology.
Permission types in blockchains are aligned with the closed / shared / open components of
the ODI's Data Spectrum, which classifies data by who has access to it [1].
[1] See http://theodi.org/data-spectrum.
Section summary
Blockchains (or distributed ledgers) are a new type of database
with unique properties
Blockchains suit applications where multiple readers and writers
need to use the database without a trusted third party
Data stored in a blockchain cannot be changed afterwards
Public, permissioned and private blockchains are differentiated
by who can read or write data within them
Smart contract systems enable blockchains to run executable
code, allowing much more complex behaviour
Emerging platforms
At the current stage of development, platform technologies have not yet fully emerged. Many
blockchain systems are bespoke, and combine both platform and application functions. As
the field matures, commoditised platforms that allow applications to converge on a common,
underlying technology stack are likely to appear.
Some of the most interesting applications, at least from a technical perspective, are those that
use multiple distributed technologies and blockchain implementations in a single service. One
example of this is Alexandria [2], a system designed to allow users to distribute digital content.
It uses Bitcoin to handle payment, Florincoin [3] (an alternative cryptocurrency) to create a
searchable index, and the InterPlanetary File System [4] or IPFS (a distributed hash table or
DHT) for content storage and distribution.
Such emerging hybrid stacks show that complex services can be entirely implemented in a
shared, distributed manner. We find one promising stack to be the combination of Ethereum [5]
for the application or logic layer, IPFS for bulk storage, and BigchainDB [6] for database storage.
This shows the beginning of a true platform ecosystem mirroring the common LAMP [7]
web stack.
[2] See http://blocktech.com.
[3] See http://florincoin.org.
[4] See https://ipfs.io.
[5] See https://ethereum.org.
[6] See https://www.bigchaindb.com.
[7] Linux, Apache, MySQL, and PHP.
This list is published under an open licence for anyone to access, use and share at dlt-research.labs.theodi.org.
Governance
The following services aim to use smart contracts to inform decision-making processes,
and in some cases create Decentralised Autonomous Organisations (DAOs) or even
decentralised nations.
Colony: a tool to help organise distributed, democratic company-like entities or
organisations (colony.io)
Freecoin: an alternative currency for democratic social organisations
(freecoin.ch)
BoardRoom: a service that organises decision-making processes
(boardroom.to)
BitNation: a collaborative platform for governance for organisations and even
virtual nations (bitnation.co)
False promises
While there are promising applications as identified above, a great many of the ideas out
there are vapourware, with no viable implementation or model. For instance, development of
Honduras's land registry, which is being turned into a distributed ledger by Factom, has stalled
with no working system (Rizzo, 2015). This matters because the example is used
repeatedly to show that blockchains can be useful in traditional government applications, yet
it has not shown any results.
There are also many instances of old ideas being brought back to life with a sheen of new
technology. For instance, tracking benefit payments and how they are spent is a policy
idea that has been proposed and rejected in the past, but is now reappearing with blockchains.
Many such projects failed for good reasons in the past, and the addition of blockchains will not
change those reasons, which are more often social or cultural than technological.
Section summary
The distributed ledger technology stack is still emerging, with
unclear boundaries between platforms and applications
Many blockchain applications are technology-centric rather than
focused on user needs
While there are good examples of blockchains being used to
build useful services, they are currently in the minority
Interested parties must be alert to blockchain hype and myths
when surveying the landscape, and should focus on those
applications solving real problems
Linking blockchains
To find the balance between a few large and many small blockchains, blockchains will need
to be able to split and merge over time, and for this to work, they will need to be able to refer
to each other.
It is also important that new data infrastructure, built on blockchains, is compatible with the
Web we have already. We will need to be able to link to items in a blockchain from outside,
on the wider Internet. After all, data is at its most useful when it can be referred to and linked.
Therefore, transactions on blockchains should have stable URLs and an equivalent of HTTP
redirection to point to updated locations. We will need a standardisation effort to create
standard URL schemes for blockchain transactions, if these systems are to link into our global
data infrastructure.
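One way to picture stable URLs with an equivalent of HTTP redirection is a resolver that follows forwarding records left behind when a chain splits or merges. The sketch below is purely illustrative: the `ledger://` URL scheme, the chain names, the `REDIRECTS` table and the `resolve` function are all hypothetical, since no such standard exists yet.

```python
from typing import Dict

# Hypothetical forwarding records: old transaction URL -> new transaction URL.
# A real system would need a standard way to publish these; this dict stands in for it.
REDIRECTS: Dict[str, str] = {
    "ledger://food-ratings/tx/abc123": "ledger://food-ratings-north/tx/abc123",
    "ledger://food-ratings-north/tx/abc123": "ledger://uk-ratings/tx/abc123",
}


def resolve(url: str, redirects: Dict[str, str], max_hops: int = 10) -> str:
    """Follow forwarding records until a URL with no redirect is reached."""
    seen = {url}
    for _ in range(max_hops + 1):
        if url not in redirects:
            return url  # current location of the transaction
        url = redirects[url]
        if url in seen:
            raise ValueError("redirect loop detected")
        seen.add(url)
    raise ValueError("too many redirects")


if __name__ == "__main__":
    # An old link keeps working after the chain has been reorganised twice.
    print(resolve("ledger://food-ratings/tx/abc123", REDIRECTS))
```

The design mirrors how the Web handles moved pages: the original link never changes, and the burden of pointing to the new location falls on the infrastructure rather than on everyone who linked to it.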
Section summary
Fewer, larger blockchains will be more secure
More, smaller blockchains will be more scalable
Blockchains will need to be able to split or merge over time to
balance their size and security
We need to create URL and related standards to link blockchains
into our global data infrastructure
Arguably, we have a situation even now where once data is published on the Web, it can never
truly be removed. Certainly, removing a page from Google's search results under a 'right to be
forgotten' does not actually remove it from the Web; rather, it makes it harder to find. What is
different about blockchains is that if a court were to attempt to legally compel the removal of
data from them, it would be both hard to do and have very disruptive side-effects, which we
explain in the next section.
a donate button on a blog. Those who trade Bitcoins are therefore advised to hold several
addresses and not to transfer Bitcoins between those accounts to avoid others linking them
together. It is not clear how many Bitcoin holders are aware of this risk, and security-through-obscurity is well known to be insufficient.
Some of the proposed uses for blockchain, such as recording auditable benefits payments,
threaten to expose this kind of information about a much wider range of people, the benefits
they receive and with whom they spend them.
Blockchains do not have to expose personal data directly to reveal private information about
people. A blockchain recording visits to health practitioners (including midwives, mental health
teams and AIDS clinics) does not need to include the entirety of someone's health records to
reveal information about them. Much like phone records (Mayer & Mutchler, 2014) or browsing
histories, this metadata may be sufficient to reveal personal details.
Finally, it is possible to encrypt data stored within a blockchain. The main problem with this
approach is that if the decryption key for encrypted data is ever made public, the encrypted
content is readable by anyone with that key; there is no way of encrypting the data with a
different key once it is embedded within a blockchain. Conversely, if the key is ever lost, the
data cannot be read. And there is the problem of sharing the key for the data amongst all those
who legitimately need to be able to read it.
On top of that, a well-used blockchain will be a potentially eternal datastore, and over a
sufficiently long period any encryption will be broken, whether by discovery of loopholes,
backdoors, or the advent of new techniques such as quantum computing.
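The key-management problem described above can be sketched in a few lines. This is a toy illustration only: the XOR keystream cipher built from SHA-256 is an assumption chosen so the example runs with the Python standard library, it is NOT secure, and the ledger is simply a list that we agree never to modify. The record text is made up.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from the key (toy construction, NOT secure)."""
    blocks = []
    counter = 0
    while sum(len(b) for b in blocks) < length:
        blocks.append(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return b"".join(blocks)[:length]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call both encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, _keystream(key, len(data))))


# An append-only ledger: entries can be added but never changed or removed.
ledger = []
old_key = b"original-key"
ledger.append(toy_crypt(old_key, b"patient 42 visited clinic"))

# Re-encrypting under a new key can only append a second entry;
# the first ciphertext stays on the ledger, still readable with the old key.
new_key = b"replacement-key"
ledger.append(toy_crypt(new_key, b"patient 42 visited clinic"))

leaked = toy_crypt(old_key, ledger[0])        # anyone holding the leaked old key
garbage = toy_crypt(b"wrong-key", ledger[0])  # without the right key, no plaintext
```

Rotating keys, the usual response to a leak in a conventional database, does not help here: the old ciphertext cannot be withdrawn, so the old key keeps its power for as long as the ledger exists.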
Regardless of the approach taken to designing blockchains, every blockchain contains
transaction data. That data needs to be designed so that it is not disclosive in and of itself,
which may be a tricky balance as that data might also be necessary to assess whether the
transaction is valid and therefore prevent fraud or errors. Transactions should also be designed
so that they cannot be used to add comments that might include personal data.
Blockchains are not necessarily bad for privacy; it all depends on how they are designed. As
stated by Vitalik Buterin (2016), blockchains do not solve privacy issues, and are an authenticity
solution only. Anyone experimenting in the area should be thinking through the implications.
As the ICO's guidance on privacy by design suggests, designers should carry out a
privacy impact assessment or similar process up front, to ensure that the transparency of the
information stored in the blockchain does not infringe on people's privacy. Unlike with other
technologies, getting it wrong is really hard to reverse.
Section summary
Immutable data storage in blockchains may be incompatible with
legislation which requires changes to the official truth
Once added, removing data from blockchains can be impractical
and highly disruptive
Even if personal data is not stored on a blockchain, metadata can
be sufficient to reveal personal information
Blockchains by themselves are not a solution for personal or
private data
Any encryption used is likely to be broken in the future
Bad blockchain design decisions are very hard to reverse
Practical experimentation
During early 2016, the ODI Labs team carried out some practical exploration of blockchain
technologies and how they could be applied to data infrastructure. This was done in order
to understand the technology more deeply, but also to drive out further issues through
experimentation.
Creating a blockchain
The first step was to create a blockchain to experiment with. The team opted to build on
Multichain [10], a blockchain software toolkit that supports Bitcoin and other blockchains, but
adds support for metadata and custom assets.
As software developers, we found it reasonably simple to get a system up and running, although
this would be confusing and difficult for anyone without technical experience. There is a clear
need for user-friendly prototyping systems so that non-technical users (for instance, policymakers) can explore their own use cases.
version of the Food Standards Agency premises ratings dataset. If the FSA website were
compromised and ratings altered, the information on our blockchain, which could not be
deleted or altered, could be used to detect the tampering. We successfully imported the full
rating data (rather than just an audit hash) into the blockchain, storing the premises ID, along
with the inspection date and rating.
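The audit idea can be illustrated with a small hash-linked ledger. This is a minimal sketch of the general technique, not the Multichain-based system the team built: it omits mining, networking and consensus, and the function names and the ratings data are made up for illustration.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(record: Dict[str, str], previous_hash: str) -> str:
    """Hash a record together with the hash of the block before it."""
    payload = json.dumps(record, sort_keys=True) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records: List[Dict[str, str]]) -> List[Dict]:
    """Store each record in a block linked to its predecessor by hash."""
    chain, previous = [], "0" * 64
    for record in records:
        current = block_hash(record, previous)
        chain.append({"record": record, "previous": previous, "hash": current})
        previous = current
    return chain


def chain_is_intact(chain: List[Dict]) -> bool:
    """Recompute every hash; an edited record no longer matches its stored hash."""
    previous = "0" * 64
    for block in chain:
        if block["previous"] != previous:
            return False
        if block_hash(block["record"], previous) != block["hash"]:
            return False
        previous = block["hash"]
    return True


def find_tampering(chain: List[Dict], website: List[Dict[str, str]]) -> List[str]:
    """Return premises IDs whose website record differs from the chain."""
    recorded = {b["record"]["premises_id"]: b["record"] for b in chain}
    return [r["premises_id"] for r in website if recorded.get(r["premises_id"]) != r]


# Made-up ratings for illustration only.
ratings = [
    {"premises_id": "P001", "inspection_date": "2016-01-12", "rating": "5"},
    {"premises_id": "P002", "inspection_date": "2016-02-03", "rating": "2"},
]
chain = build_chain(ratings)

# A compromised website copy in which one rating has been raised.
altered_site = [dict(ratings[0]), {**ratings[1], "rating": "5"}]
```

Calling `find_tampering(chain, altered_site)` flags the premises whose published rating no longer matches the ledger, while `chain_is_intact` checks that the ledger itself has not been edited.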
Searching blockchains
While the team successfully stored open data in a blockchain, various issues emerged when
trying to use the data.
Firstly, finding a particular record is non-trivial. To find a particular food hygiene rating for a
premises, we needed to carry out a brute-force search of the blockchain. Many search systems
are available, and indeed there are many sites for searching and viewing the Bitcoin blockchain,
which index the blockchain into a searchable database. The problem then becomes that you
are now relying on the integrity of your search index to ensure you find the right information.
If the idea of the blockchain is to prevent centralisation (due to lack of trust or other reasons),
we cannot trust search indexes maintained by others. In order to find anything, not only would
each blockchain client have to have the entire blockchain stored, but also store (and update)
a search index built from it, which could be considerably larger. There is a need for search
indexing to be built into blockchain software if it is to be used for data.
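The trade-off between brute-force search and a locally maintained index can be sketched as follows. The chain here is just a list of made-up records, and `brute_force_latest`, `build_index` and `indexed_latest` are hypothetical helper names, not functions from any blockchain software.

```python
from typing import Dict, List, Optional

Block = Dict[str, str]

# Made-up blocks standing in for ratings stored on a chain.
CHAIN: List[Block] = [
    {"premises_id": "P001", "inspection_date": "2016-01-12", "rating": "5"},
    {"premises_id": "P002", "inspection_date": "2016-02-03", "rating": "2"},
    {"premises_id": "P001", "inspection_date": "2016-04-20", "rating": "4"},
]


def brute_force_latest(chain: List[Block], premises_id: str) -> Optional[Block]:
    """Scan every block; the cost grows with the length of the chain."""
    latest = None
    for block in chain:
        if block["premises_id"] == premises_id:
            latest = block  # later blocks supersede earlier ones
    return latest


def build_index(chain: List[Block]) -> Dict[str, List[int]]:
    """Map each premises ID to the positions of its blocks.

    The index is extra data that every client must store and keep up to date.
    """
    index: Dict[str, List[int]] = {}
    for position, block in enumerate(chain):
        index.setdefault(block["premises_id"], []).append(position)
    return index


def indexed_latest(chain: List[Block], index: Dict[str, List[int]],
                   premises_id: str) -> Optional[Block]:
    """Look up the most recent block for a premises without scanning the chain."""
    positions = index.get(premises_id)
    return chain[positions[-1]] if positions else None
```

Because each client builds the index from its own copy of the chain, it does not have to trust anyone else's search service, at the price of the extra storage and upkeep described above.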
Incentivising maintenance
The test blockchain we created consisted of only two nodes, mining and sending fake currency
to each other. The key to the integrity of blockchain technology is its decentralised nature.
Once one person or organisation owns the majority of nodes in a blockchain network, this
integrity breaks down, as discussed earlier.
If we were to open up our audit blockchain to the wider public, how would we incentivise
mining? With Bitcoin, once a miner adds new transactions to the blockchain, there is a fixed
reward (currently 25 BTC, almost £7,500 at today's exchange rate). As the size of the blockchain
network increases, so does the computational power required to add new blocks (if using a
proof-of-work algorithm); O'Dwyer and Malone (2014) have compared the electricity usage
of the Bitcoin network to that of Ireland. As a result of this, and the rewards at stake, there are
entire server farms devoted to Bitcoin mining.
Currently, the test blockchain we created is tiny, and a few well-intentioned people might be
happy to mine it for free. However, as the blockchain grows, mining will get harder, and the
worthless currency the blockchain rewards miners with won't help them cover the electricity
bills the computation will end up costing them.
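The link between mining difficulty and computational cost can be illustrated with a toy proof-of-work loop. This is a simplified sketch, not Bitcoin's actual algorithm (which uses double SHA-256 over a block header and a different target encoding); the `mine` and `meets_difficulty` functions and the `difficulty_bits` parameter are assumptions made for illustration.

```python
import hashlib
from typing import Tuple


def mine(block_data: bytes, difficulty_bits: int) -> Tuple[int, str]:
    """Try nonces until the SHA-256 hash falls below the target.

    Each extra difficulty bit halves the target, so the expected number
    of attempts doubles: roughly 2 ** difficulty_bits on average.
    """
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).hexdigest()
        if int(digest, 16) < target:
            return nonce, digest
        nonce += 1


def meets_difficulty(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes a single hash, however hard it was to find."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


if __name__ == "__main__":
    # The number of attempts varies from run to run of data, but tends to grow
    # sharply as the difficulty rises.
    for bits in (8, 12, 16):
        nonce, digest = mine(b"example block", bits)
        print(bits, "bits -> found after", nonce + 1, "attempts:", digest[:16])
```

The asymmetry is the point: finding a valid nonce costs real electricity, while verifying it is almost free, which is why a reward of some value is needed to keep miners participating.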
Section summary
Simple prototyping tools are required so that non-technical users
can explore blockchain use cases
Only a very small amount of information can be encoded with
each transaction in current blockchain software (including
Bitcoin)
Distributed search indexes will be required for trusted search of
blockchain data, increasing the space requirements
Conclusions
Data is part of the infrastructure of the modern world. It is essential to the operation of society,
and it is vital that we learn how to build, maintain and strengthen our data infrastructure.
Distributed ledgers are a potentially important technology for enabling a shared data
infrastructure, and are worthy of investigation.
However, new technologies always go through a hype cycle. The challenge at the beginning of
that cycle is to identify the uses and applications that will stand the test of time. Blockchains
are unusual in that mistakes made in early deployment could last a long time, and could cause
significant damage, especially if deployed carelessly with personal data.
Blockchains could be used to build confidence in government services through public
auditability. They also hold great potential for collaborative maintenance of data assets,
enabling widely distributed data collection and publishing for applications such as supply-chain
information. Smart contracts also have promise for the future across many application areas.
However, in our research we have seen many cases where people attempt to bolt old, failed
or impossible policy and business ideas onto the new technology, or to unnecessarily reinvent
things that work perfectly well. Many other cases show familiar organisational models being
rebuilt as permissioned ledgers based on blockchain technologies, but this ignores the core
innovation of the technology and its promised transformation.
We have seen many ideas that would put new personal data into blockchains but learnt that,
if misused, this will create significant new privacy issues. The core problem that blockchain
technologies help to address, that of distributed maintenance by collaborating organisations, is
of growing importance and an area that shows some promise, but few are considering it.
Success in data infrastructure design will come from convening sectors (such as finance,
Recommendations
Organisations must remember to start with user needs, rather
than preselecting technologies that may or may not be appropriate
Organisations considering blockchain technology should be
aware that there are many other distributed technologies
available, and it is important to assess needs properly in order to
arrive at the right technology choice
Organisations should convene across sectors to identify
common data infrastructure needs, and then decide how those
can best be met and which technologies should be used
When experimenting with blockchain technologies, researchers
and developers must be aware of the privacy implications of
storing information in a public immutable database; what is done
cannot be easily undone
Bibliography
Alexander, R. (2014). The First Blockchain Wedding. [Online] Available at: https://bitcoinmagazine.com/articles/first-blockchain-wedding-2-1412544247 [Accessed 2016-05-20].
Borah, P. (2015). DAO Wars. [Online] Available at: https://github.com/consensys/dao-wars [Accessed 2016-05-23].
Buterin, V. (2016). Privacy on the Blockchain. [Online] Available at: https://blog.ethereum.org/2016/01/15/privacy-on-the-blockchain/ [Accessed 2016-05-20].
Hern, A. (2014). Bitcoin currency could have been destroyed by 51% attack. The Guardian [Online] Available at: https://www.theguardian.com/technology/2014/jun/16/bitcoin-currency-destroyed-51-attack-ghash-io [Accessed 2016-05-20].
Hearn, M. (2016). The resolution of the Bitcoin experiment. [Online] Available at: https://medium.com/@octskyward/the-resolution-of-the-bitcoin-experiment-dabb30201f7 [Accessed 2016-05-20].
Gartner. (1995). Hype Cycle Research Methodology. [Online] Available at: http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp [Accessed 2016-05-20].
Greenspan, G. (2015). Avoiding the pointless blockchain project. [Online] Available at: http://www.multichain.com/blog/2015/11/avoiding-pointless-blockchain-project/ [Accessed 2016-05-23].
Mayer, J. & Mutchler, P. (2014). MetaPhone: The Sensitivity of Telephone Metadata. [Online] Available at: http://webpolicy.org/2014/03/12/metaphone-the-sensitivity-of-telephone-metadata/ [Accessed 2016-05-20].
Munroe, R. (2014). Twitter. In: What If? London: John Murray, pp. 217-221.
Neyfakh, L. (2015). California's Sane New Approach to Sex Offenders. [Online] Available at: http://www.slate.com/articles/news_and_politics/crime/2015/04/california_s_sane_new_approach_to_sex_offenders_and_why_no_one_is_following.html [Accessed 2016-05-20].
O'Dwyer, K. and Malone, D. (2014). Bitcoin Mining and its Energy Footprint. [Online] Available at: https://karlodwyer.github.io/publications/pdf/bitcoin_KJOD_2014.pdf [Accessed 2016-05-21].
Onename. (2015). Why Onename is Migrating to the Bitcoin Blockchain. [Online] Available at: http://blog.onename.com/namecoin-to-bitcoin [Accessed 2016-05-20].
Provenance. (2015). Blockchain: the solution for transparency in product supply chains. [Online] Available at: https://www.provenance.org/whitepaper [Accessed 2016-05-20].
Resnikoff, P. (2015). I'm Imogen Heap. And This Is Why I'm Releasing My Music on Blockchain. [Online] Available at: http://www.digitalmusicnews.com/2015/10/05/im-imogen-heap-and-this-is-why-im-releasing-my-music-on-blockchain/ [Accessed 2016-05-20].
Rizzo, P. (2015). Blockchain Land Title Project Stalls in Honduras. [Online] Available at: http://www.coindesk.com/debate-factom-land-title-honduras/ [Accessed 2016-05-23].
Swan, M. (2015). Blockchain: Blueprint for a new economy. 1st ed. O'Reilly Media.
van Wirdum, A. (2015). Honduran Govt to Build Land Registry Initiative on Bitcoin Blockchain. [Online] Available at: http://cointelegraph.com/news/honduran-govt-to-build-land-registry-initiative-on-bitcoin-blockchain [Accessed 2016-05-20].
Woods, T. (2015). This couple got married on the blockchain. [Online] Available at: https://technical.ly/brooklyn/2015/11/11/couple-got-married-blockchain/ [Accessed 2016-05-23].