Open Data Study
New technologies
Becky Hogge
Transparency &
Accountability Initiative
c/o Open Society Foundation
4th floor, Cambridge House
100 Cambridge Grove
London, W6 0LE, UK
Tel: +44 (0)20 7031 0200
www.transparency-initiative.org
Contents
Introduction
The UK - data.gov.uk
The US - data.gov
Civil Society
Top-level drive
International perspectives
Data characteristics
Status of FOI
Potential end-users
Conclusion
Annexes
Annex I: Methodology
Bibliography
Acknowledgements
About the author
Introduction
There are substantial social and economic gains to be made from opening government data to the
public. The combination of geographic, budget, demographic, services, education and other data,
publicly available in an open format on the web, promises to improve services as well as create future
economic growth.
This approach has been recently pioneered by governments
in the United States and the United Kingdom (with the
launch of two web portals, www.data.gov and
www.data.gov.uk respectively), inspired in part by applications
developed by grassroots civil society organisations (CSOs)
ranging from maps of bicycle accidents to sites breaking
down how and where tax money is spent. In the UK, the
data.gov.uk initiative was spearheaded by Tim Berners-Lee,
the inventor of the World Wide Web.
This research, commissioned by a consortium of funders
and NGOs under the umbrella of the Transparency and
Accountability Initiative, seeks to explore the feasibility of
applying this approach to open data in relevant middle
income and developing countries. Its aim is to identify
the strategies used in the US and UK contexts with a view
to building a set of criteria to guide the selection of pilot
countries, which in turn suggests a template strategy to
open government data.
The report finds that in both the US and UK, a three-tiered
drive was at play. The three groups of actors who were
crucial to the projects' success were:
Civil society, and in particular a small and
motivated group of civic hackers;1
An engaged and well-resourced middle layer
of skilled government bureaucrats; and
A top-level mandate, motivated by either an outside
force (in the case of the UK) or a refreshed political
administration hungry for change (in the US).
1 For a definition of civic hackers, see the Civil society section
of this report.
The UK - data.gov.uk
2 Brown, 2009.
3 Berners-Lee, 2009.
4 See: http://www.wheredoesmymoneygo.org/
5 Newbery, Bently and Pollock, 2008.
6 The study found that in most cases, although making data
The US - data.gov
data.gov is a US government web portal providing the public with access to datasets created by the federal government. It was launched in 2009 with two aims: to invite citizen feedback and new ideas, enabling
transparency, participation and collaboration between state and citizen; and to increase efficiency
among government agencies. Most US government agencies already work to codified information
dissemination requirements, and data.gov is conceived as a tool to aid their mission delivery.
8 For the purposes of comparing the UK and US efforts, only
10 Kirkpatrick, 2010.
11 As noted above, for the purposes of comparing the UK and
US efforts, only datasets that fall within the Raw Data Catalog
have been counted.
The three-tiered approach
Civil Society
It's not like mySociety were the only people reusing
data, but we were virtually the only people reusing
it in a sphere that meant that politicians and policy
people paid any attention to who we were.
Tom Steinberg
12 mySociety.
13 Ibid.
19 Where CKAN stands for Comprehensive Knowledge Archive
23 Ibid.
29 Crabtree, 2010. This sentiment was reflected by the reports
Jonathan Gray
John Wonderlich
Top-level drive
John Wonderlich
Tom Steinberg
43 http://data.octo.dc.gov/
Tim Berners-Lee
44 Brown took over the position when he became Labour Party
47 Escher, 2009.
International perspectives
The UK structure and the way this happened in the UK is totally unlikely to
work in the developing world. And I think we should just let go of it.
Ethan Zuckerman
Data characteristics
Key budget documents
No. of Countries Making Available Online
No. of Countries Making Available On Request
No. of Countries Producing the Information but not Publishing It
No. of Countries not Producing the Information
Pre-budget statement
27
29
26
Executive's budget proposal
49
13
23
Citizens' budget
13
67
Enacted budget
68
13
In-year reports
63
13
Mid-year review
18
21
42
Year-end report
50
14
14
Audit report
50
21
the rule of that first six months was to get the low-hanging fruit, to show that online data was valuable
but to do it without attempting anything which
was questionable, like not going anywhere near
personally identifiable information.
Tim Berners-Lee
Other issues around the character of government-maintained data were raised in interviews. The specification
of standard formats for data publication was one such
issue. The format in which a government releases data can
have a positive or negative impact on that data's reuse by
third parties, particularly if the government chooses to
release data in a proprietary, as opposed to an open, format. The
following guidance is taken from a techno-legal manual on
open data in the context of international aid, but applies
universally to open data projects:
If the donor and the ministry are not specifying data
export standards, the idea that this data is open,
what you're going to get instead is someone coming
in and [proposing] a proprietary system, heavy focus
on security, a system where it is almost impossible to
squeeze the data out of it in the end. I mean, I think
where there might be an opportunity in all of this is
thinking about how you construct a data standard
and policy for donors... What I'm not sure about
is how long it might take for international or local
contractors to figure out how to respond to that and
figure out how to build good bids around that.
Ethan Zuckerman
51 http://www.pmg.org.za/
The implications of this for open data portals are unclear, but
the existence of PMG points to a different path being taken
by civil society actors around government data than that
seen in the UK and US. Those planning interventions around
open data in South Africa should consider the impact of such
interventions on existing civil society initiatives.
The issue of data quality was also raised:
Nathaniel Heller
Toby Mendel
Ethan Zuckerman
Complementary strategies for open data
During the second round of interviews, a number of
complementary strategies were suggested for pushing
open data initiatives in developing and middle income
countries. As observed in the UK and US contexts, Obama
and Brown, in their endorsement of data.gov and
data.gov.uk respectively, were operating in distinct political
moments. Toby Mendel drew on his experience of
promoting FOI laws around the world to describe a number
of other political moments worth looking out for.
The first (which is arguably similar to Obama's moment)
was the context of a fresh administration brought in on
a popular mandate to replace a corrupt or otherwise
politically dicey old regime:
52 See http://lists.okfn.org/pipermail/okfn-discuss/2010-
53 See http://opengovernmentdata.org/
An open data strategy checklist
Status of FOI
Potential end-users
How thoroughly does the administration report on
aid spending?
How has the country reacted to previous tied aid?
Is there scope for positive conditionality?
Are there private donors (local or international)
active in the country who could be useful allies?
Conclusion
This report has been produced on the premise that there are substantial social and economic gains
to be made from opening government data to the public, and that the combination of geographic,
budget, demographic, services, education and other data, publicly available in an open format on
the web, promises to improve services as well as create future economic growth. Several interviewees
approached during the research that went into this report have sought to challenge that premise,
and those challenges have been noted. It is beyond the scope of this report to investigate those
challenges in any further detail. However, the researcher suggests that those who wish to take this
research further would do well to monitor the wider impact of data.gov and data.gov.uk, if only to be
able to respond to such challenges in the future.
This research has sought to explore the feasibility of
applying the approach to open data taken by the US and UK
administrations (with the open data catalogues www.data.gov
and www.data.gov.uk respectively) to relevant middle
income and developing countries. It has sought to identify
the strategies used in the US/UK contexts, and it has proposed
a set of criteria to guide the selection of pilot countries,
criteria which in turn suggest a template strategy to open
government data in a middle income or developing country.
The resulting checklist appears to give equal weight to
each factor, and indeed one of the most striking aspects
of the success of the two open data projects studied is
how, in each case, that success was brought about by a
broad range of concurrent factors and events. Nonetheless,
the researcher would like to suggest that one aspect of
the open data strategy deserves special highlighting: the
existence, in both the US and the UK, of highly established
data collection activities operated by a well-resourced,
broadly independent and highly skilled middle layer
of government administrators. This is one aspect of the
trajectory towards data.gov and data.gov.uk that has
perhaps attracted the least attention of onlookers, for
two reasons: the activities of this section of society do
not generally attract attention (at least not when they are
functioning well), nor do its members seek it; and secondly,
the activities of another group that has contributed to the
open data initiatives (the so-called civic hackers) are fresh
and exciting, and therefore likely to be more attention-grabbing. All this is worth highlighting since, as several
regional and domain experts have made clear during
interviews, these middle layer activities (both the data
collection and the political position within which the administrators who
undertake it exist) are potentially weak or absent in
most middle income and developing countries.
This report has focussed in some detail on the difficulty
experienced in the UK of opening up geospatial data
that was at the time commercially licensed in order to aid
cost recovery. Such commercial / cost recovery activities
have also been highlighted in the checklist. However, the
relationship of commercial licensing with the eventual
success of a data catalogue like data.gov.uk strikes the
researcher as complex. It has been suggested in the course
of this report that the barrier these activities imposed
in the UK may have served as a common call to action
among both civil society and the middle-layer government
administrators, which in turn served to strengthen the
crucial communication between these two groups in the
trajectory towards data.gov.uk, and ultimately enrich the
final proposition when compared with data.gov.
Of course, without the intervention of Tim Berners-Lee, it
is unlikely that the UK government would ever have
sanctioned the opening up of geospatial data. But there is
a risk that funders, seeing the impact of this intervention,
Annexes
Annex I: Methodology
Tim Berners-Lee
www.w3.org/People/Berners-Lee/
Domain experts
Regional experts
Nathaniel Heller
Managing Director Global Integrity
www.globalintegrity.org/aboutus/team.cfm#nheller
Tom Steinberg, Director, mySociety
www.mysociety.org/
John Wonderlich, Policy Director,
Sunlight Foundation
sunlightfoundation.com/
Toby Mendel, Executive Director
Center for Law and Democracy
Vivek Ramkumar
International Budget Partnership
www.internationalbudget.org/
Ethan Zuckerman, Senior Researcher
Berkman Center for Internet and Society
http://ethanzuckerman.com/
Dan McQuillan
Social Innovation Camps (CEE)
www.sicamp.org/
Ory Okolloh, co-founder, Mzalendo
co-founder, Ushahidi (Kenya)
www.kenyanpundit.com/
Rakesh Rajani, Founder
Twaweza (Tanzania)
www.twaweza.org/
Bibliography
Acknowledgements