Quantitative Methods I: Reproducible Research and Quantitative Geography
Progress report
Progress in Human Geography
2016, Vol. 40(5) 687–696
© The Author(s) 2015
Chris Brunsdon
National Centre for Geocomputation, Maynooth University, Ireland
Abstract
Reproducible quantitative research is research that has been documented sufficiently rigorously that a third
party can replicate any quantitative results that arise. It is argued here that such a goal is desirable for
quantitative human geography, particularly as trends in this area suggest a turn towards the creation of
algorithms and codes for simulation and the analysis of Big Data. A number of examples of good practice in
this area are considered, spanning a time period from the late 1970s to the present day. Following this,
practical aspects such as tools that enable research to be made reproducible are discussed, and some
beneficial side effects of adopting the practice are identified. The paper concludes by considering some of the
challenges faced by quantitative geographers aspiring to publish reproducible research.
Keywords
Big Data, computational paradigm, geocomputation, programming, reproducibility
Complete details of any reported results, and of the computation used to obtain them, should be available, so that others following the same procedures and using the same data can obtain identical results. This article considers the relevance and implications of this for geographical data analysis and GIS. Although the idea was put forward over two decades ago, the need to adopt reproducible practices is more relevant than ever. It has been argued that in addition to the two 'classical' paradigms of science that were commonly acknowledged at the time of the Claerbout (1992) paper (Hey et al., 2009; Kitchin, 2014b), two further paradigms are emerging: a computational paradigm, centred on simulation, and an exploratory, data-driven paradigm associated with Big Data. Both of these use code (written either by the researcher or a third party) as an enabling technology. In both of the newer paradigms, although important ideas may be articulated in published texts, distinct intellectual contributions are embedded in software code, where the ideas are represented in their most detailed form. Given this, a full critical engagement with researchers working within these paradigms is inhibited if code is not available openly. This is generally the case for quantitative science and social science, and for digital humanities. Here attention will be focused on the implications for quantitative geography, geocomputation and geographical information science.

To illustrate why this matters, consider the following scenarios:

1. You have a data set that you would like to analyse using the same technique as described in a paper recently published by another researcher in your area. In that paper the technique is outlined in prose form, but no explicit algorithm is given. Although you have access to the data used in the paper, and have attempted to recreate the technique, you are unable to reproduce the results reported there.

2. You published a paper five years ago in which an analytical technique was applied to a data set. You now discover an alternative method of analysis, and wish to compare the results.

3. A particular form of analysis was reported in a paper; subsequently it was discovered that one software package offered an implementation of this method that contained errors. You wish to check whether this affects the findings in the paper.

4. A data set used in a reported analysis was subsequently found to contain rogue data, and has now been corrected. You wish to update the analysis with the newer version of the data.

Articles providing precise verbal descriptions of algorithms are useful in these scenarios – as the earlier examples demonstrate – and it is certainly the case that this is a great improvement on vaguer descriptions that provide insufficient information to reproduce initial analyses. However, one could argue that the code itself is a much stronger aid to reproduction – a verbal description being prone both to incorrect interpretation and to omission of necessary detail. In addition, there is the possibility that the code used in an article may contain an error, so that the verbal description is in fact precise only in outlining what the author thinks the code does – only the code itself will reveal what it actually does. In most cases, the omission of such information is not done with malice aforethought on the part of researchers. Until the issue was raised in the article by Claerbout (1992) and those following, providing such detail was not considered standard practice in many disciplines. Indeed, even now, few journals (none in geography, although this could be changing soon) insist that such precise details are provided, and it could perhaps be argued that there is some contributory negligence on their part.

Similarly, although it is usually required that researchers cite the sources of secondary data, such citations often consist of an acknowledgement of the agency that supplied the data, possibly with a link to a general website, rather than an explicit link (or links) to the file (or files) that contained the actual data used in the research, or details of any re-formatting of the data (including code) prior to analysis. However, both pieces of information allow published results to be critically assessed and scrutinized – ultimately leading to more trustworthy research conclusions.
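To make this concrete, the following R sketch shows what such a reproducible data citation might look like in practice. The URL, file name and variable names here are invented for illustration – the point is that the exact file used, and every re-formatting step applied to it, are recorded as code:

```r
# Explicit link to the exact file used in the analysis
# (hypothetical URL and variable names, for illustration only)
data_url <- "http://data.example.org/census/small_area_2011.csv"
download.file(data_url, destfile = "small_area_2011.csv")
census <- read.csv("small_area_2011.csv")

# Re-formatting prior to analysis, documented as code rather than
# performed by hand: drop unpopulated areas and derive a rate
census <- subset(census, population > 0)
census$unemployment_rate <- census$unemployed / census$labour_force
```

A citation of this form – the code together with the archived file – tells a reader not just which agency supplied the data, but precisely which file was analysed and how it was modified before analysis.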
IV The case for reproducible quantitative geography

The above is a general argument for reproducibility. However, one could ask whether this is relevant or practical for applications in quantitative human geography. In terms of relevance, it is worth noting that a great deal of analysis of social and economic data is inherently spatial – whether focusing on the regional, local or street level – and that the results of such analyses are often used to inform policy-makers and are used in decision-making processes. In many cases, the data being analysed is publicly available – for example, the US Census Bureau provides a number of APIs to access official statistics such as economic time series indicators and the decennial census for 1990, 2000 and 2010; the UK provides public access to census and reported crime data; and Ireland provides access to Irish census data.
However, not all reports or articles analysing this and other publicly available data provide precise details of the analysis.

There are a number of arguments as to why such information should be provided. The first is a purely academic one – a useful and informed critical discourse on any analytical work can only take place when full details are provided. When the data analysis is a black box, it is difficult either to uphold or to argue against any conclusions reached. One cannot tell whether the underlying models or techniques are appropriate or, even if they are, whether the underlying code or other computational approach faithfully reflects them.

A second argument is one of accountability. Many quantitative studies inform policy decisions by governments and other institutions – different quantitative analyses with different outcomes could well lead to different policy decisions. Providing information not only about the sources of data used but also about the methods used to analyse the data is a key strategy of open government and democratic decision-making. As suggested earlier, this in turn leads to a more trustworthy approach – although it does not guarantee that an analysis is without error, it provides a mechanism whereby the analysis is open to public scrutiny, so that the probability that any error is identified and corrected is notably increased. Also, relating to the earlier point, it implies that any assumptions made in the analysis are open to scrutiny, so that public discussion and debate regarding the basis of policy decisions is made possible.

A reminder of the relevance of this is provided by the recent controversy surrounding a paper by Reinhart and Rogoff (2010), whose published findings have been widely cited as an argument for fiscal austerity. However, in an article by Herndon, Ash and Pollin (2013), flaws were identified in the data analysis carried out in the paper. Quoting from the abstract of the latter article:

We replicate . . . and find that selective exclusion of available data, coding errors and inappropriate weighting of summary statistics lead to serious miscalculations that inaccurately represent the relationship between public debt and GDP growth among 20 advanced economies . . . Our overall evidence refutes RR's claim that public debt/GDP ratios above 90% consistently reduce a country's GDP growth. (2013: 1)

This arose after a student, Thomas Herndon, unsuccessfully attempted to reproduce the analysis in Reinhart and Rogoff's paper as a coursework exercise. Investigations unearthed that the analysis was flawed – in part due to an error in an Excel spreadsheet. In this case measures were not taken to ensure reproducibility in the original paper, and it took a considerable amount of forensic computing to discover the problem. Following this, an erratum was published (Reinhart and Rogoff, 2013), although Rogoff and Reinhart have defended their conclusions – if not their original analysis. However, the debate continues, as authors of the critique continue to challenge a number of assumptions in the corrected analysis.

Putting aside any criticisms I may have of the original paper, the outcome here is perhaps one of cautious optimism, in that an open debate about the underlying analysis is now taking place – albeit after a great deal of public controversy. Again quoting from Herndon, Ash and Pollin:

Beyond these strictly analytical considerations, we also believe that the debate generated by our critique of RR has produced some forward progress in the sphere of economic policy making. (2013: 279)

However, a reproducible approach here could have resulted in a smoother path to the final situation of public debate and a resolution of the erroneous analysis. Indeed, the spirit of the exercise set to the student was that of reproducing the published analysis.

V Achieving reproducibility

To address these problems, one approach proposed is that of literate programming (Knuth, 1984).
This was initially proposed as a tool for documenting code, where a single file contained both the code documentation and the code itself. This file was used to generate both a human-readable document and computer-readable content from which software was built. The purpose was that the human-readable output provided an explanation of the working of the program (together with neatly printed listings of the code), offering an accessible overview of the program's function. However, such compendium files can also be used in a slightly different way: rather than describing the code, the human-readable output is an article containing some data analysis performed by the incorporated code. Tabulated results, graphs and maps are created by the embedded code. As before, two operations can be applied to the files – document creation and code extraction. The embedded code is also visible in the original file. Thus information about both the reporting and the processing can be contained in a single document – and if this document is shared then a reproducible analysis (together with associated discussion) is achieved.

Examples of this approach are the NOWEB system (Ramsey, 1994) and the Sweave and knitr packages (Leisch, 2002; Xie, 2013). The first of these incorporates code into LaTeX documents using two very simple extensions to the markup language. The latter two are extended implementations of this system using R as the language for the embedded code. knitr also offers the possibility of embedding code into markdown – a simpler markup language than LaTeX – which facilitates very quick production of reproducible documents.
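As a sketch of what this looks like in practice, the following is a minimal knitr document in the markdown format (the data file and variable names are hypothetical):

````
# House prices and floor area

The model reported below is re-estimated from the raw data
each time this document is compiled.

```{r model}
# This chunk is executed when the document is compiled;
# 'house_prices.csv' and its variables are hypothetical
hp <- read.csv("house_prices.csv")
fit <- lm(price ~ floor_area, data = hp)
summary(fit)
```
````

Applying knitr::knit() to such a file executes the chunk and weaves its output into the finished document (document creation), while knitr::purl() extracts the chunk as a plain R script (code extraction) – the two operations described above.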
The fact that R is used in the latter two approaches is encouraging for geographers, since R offers a number of packages for spatial analysis, for geographical data manipulation of the kind provided by geographical information systems, and for spatial statistics (Brunsdon and Comber, 2015). Furthermore, as R is open source software, the code used in any of these packages is also publicly available. Thus it is possible to share not only high-level data analysis operations, but also the code used to build the tools at the higher level.

Another possibility here is an approach using Pweave (Pastell, 2014) – a similar extension of NOWEB that embeds Python code rather than R. Again, Python offers many tools for geographical data analysis, such as the PySAL package (Rey, 2015).

VI Beneficial side effects

Although much of the justification of a reproducible approach given so far has been defensive, the approach also provides a number of benefits. Many of these occur as side effects of using the kinds of approach outlined above. In particular:

Reproducible analyses can be compared: Different analytical approaches attempting to address the same hypothesis can be compared on the same data set, to assess the robustness of any conclusions drawn. In particular, a third party can take an existing reproducible document and add an alternative analysis to it.

Methods are documented: One option with many reproducibility tools is to incorporate the code itself – as well as its outputs – in the documents produced. This allows for transparency in the way that results are obtained.

Methods are portable: Since the code may be extracted from the documents, others may use it and apply it to other data sets, or modify it and combine it with other methods. This allows approaches to be assessed in terms of their generality, and encourages further dialogue on the interpretation of existing data.
Results may be updated: If updated versions of the data used in an analysis are published (for example, new census data), the methods applied to the old data may be re-applied and the updated results compared to the original ones. Also, if the original data required amendment, an updated analysis could easily be carried out.

Reports may have greater impact: Recent work has shown that papers in a number of fields that include reproducible analyses have higher impact and visibility; this is discussed in Vandewalle, Kovačević and Vetterli (2009).

VII Challenges

The above sections argue that reproducible approaches offer a number of benefits. However, their adoption requires challenging changes in current practice. Perhaps one of the most notable is that the knitr, Sweave and Pweave approaches all require the use of code to carry out statistical analysis, visualization and data manipulation, rather than commonly adopted GUI-based tools such as Excel. Unfortunately, this is an inherent characteristic of reproducibility: after a series of point-and-click operations, results are cut and pasted into a Word document (or similar), and the link between the reported result and the analytical procedure is lost. It is perhaps no surprise that the Reinhart and Rogoff affair was seeded by an error in Excel.

Despite this, perhaps it is more realistic to consider ways in which the divide between GUI-based tools and reproducibility could be bridged than to propose that such tools be abandoned. One possibility might be to provide GUI-based software in which every interactive event is echoed by a recorded code equivalent; the recorded code could then be embedded in a document. One such tool, with a web-based interface, is Radiant (Radiant News, 2015). However, it is perhaps also worth noting a general turn towards coding and away from GUI solutions in data analysis, as indicated by the popularity of a number of books such as O'Neil and Schutt (2013) and McKinney (2012) – suggesting that there is a current wave of practitioners for whom the adoption of coding as a tool for data analysis does not imply a change of culture. Recent attendance at GIS conferences by the author would suggest, at least anecdotally, that these trends are reflected in geocomputation and geographical information science.

Other, more minor practical challenges also exist – for example, how can a sequence of random numbers used in a simulation be reproduced? Many of these can be resolved by adopting 'best practice'. In the given example, random sequences may be made reproducible by noting that they are in fact pseudo-random, and by specifying the code used to produce them together with the seed value(s).
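In R, for instance, this amounts to publishing the generator settings and seed alongside the simulation code – a minimal sketch:

```r
# Record the generator and seed so the 'random' sequence is reproducible
RNGkind("Mersenne-Twister")   # state the pseudo-random generator used
set.seed(20150722)            # a fixed, published seed value
sims <- rnorm(1000)           # simulated draws

# Anyone running these lines with the same generator and seed
# obtains an identical sequence, and hence identical results
```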
However, a more significant challenge is created by the so-called 'Data Revolution' (Kitchin, 2014b) and the idea of Big Data, which relate to the new paradigm of exploration and the search for empirical pattern, with its implications of data mining. The term Big Data refers not only to the size of data sets, but also to the diversity of applications, the complexity of data, and the fact that data is produced in a real-time 'firehose' environment in which sensors and other data-gathering devices stream vast quantities of data every second. This is of importance to geographers applying quantitative techniques, since much of this data has a geographical component. The exploratory paradigm is not without controversy – while the computational paradigm could be viewed as working in co-operation with deductive and empirical approaches, some propose the exploration of Big Data as a superior competitor to theory-led approaches (see Mayer-Schonberger and Cukier, 2013, or Anderson, 2008), suggesting that working with near-universal data sets and identifying patterns supplants the need for theory and experiment. The title of the Anderson piece leaves little doubt as to the magnitude of the claim being made!
However, such boosterish claims have not gone unchallenged – notably, in the discipline of geography, by Miller and Goodchild (2014), who argue, among other things, that there is still a need to understand the nature of the data being used and to discriminate between spurious and meaningful patterns. Kitchin (2014) likewise warns of the risks of ignoring contextual knowledge in the analysis of Big Data. Although reproducibility in research involving Big Data analysis would not fully address these issues, it may be argued that it can provide a foothold. Giving precise details of the assumptions made in coding (for example, what kinds of patterns are being sought out by a particular data mining algorithm?) will certainly provide an entry point into dialogues addressing the issues raised above.
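For example, a clustering step whose assumptions are stated in code might look like the following sketch (the data and variables are hypothetical stand-ins):

```r
# The pattern being sought is made explicit: k compact clusters in two
# standardized variables - an analyst's assumption, visible in the code
geo <- data.frame(income = rnorm(500), density = rnorm(500))  # stand-in data

set.seed(1)                                  # k-means results depend on the seed
clusters <- kmeans(scale(geo), centers = 5)  # k = 5 is a stated, contestable choice
table(clusters$cluster)                      # resulting cluster sizes
```

A reader can now see – and dispute – exactly what kind of pattern the algorithm was instructed to find.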
Despite this, currently many examples of reproducible research have used fairly 'traditional' approaches to data analysis, where a data set consists of a static file containing a rectangular table of cases by variables. More complex data poses less of a conceptual problem per se in terms of reproducibility – the challenge here is to devise appropriate analytical methods, but if that can be achieved then code can be created and reproducible research can be carried out in the ways outlined above. Similarly, diversity of applications presents no further conceptual difficulties for reproducibility. However, the real-time aspect does provide some challenges – clearly, even with the same code, two people accessing the same data stream at different points in time will not obtain identical results. One possibility might be to acknowledge that the data used in a given publication is a static entity, consisting of data obtained from a stream at a given point in time – and to time stamp and archive the data obtained and used in the analysis at the moment it was carried out. Although it would be impossible for a third party to obtain identical data from the stream, and consequently impossible to obtain identical analytical results, it would at least be possible to see the code used to access the stream, note the time the stream was accessed, and access a copy of the data obtained at that time. This would also enable scrutiny of the representativeness of the data – one contextual factor that may enable more meaningful analysis of Big Data.
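A minimal sketch of this practice in R (the stream URL is hypothetical) might be:

```r
# Snapshot a live data stream, time stamp it, and archive the snapshot so
# the exact data used in the analysis can be cited and re-examined later
stream_url  <- "http://feeds.example.org/sensors/latest.csv"  # hypothetical
accessed_at <- format(Sys.time(), "%Y-%m-%d_%H%M%S", tz = "UTC")

snapshot_file <- paste0("sensor_snapshot_", accessed_at, ".csv")
download.file(stream_url, destfile = snapshot_file)

# All analysis reads from the archived snapshot, never the live stream,
# and the access time forms part of the published record
sensor_data <- read.csv(snapshot_file)
```

Publishing the snapshot alongside the document then allows a third party to re-run the analysis on exactly the data that was gathered, even though the live stream has since moved on.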
VIII Conclusion

There are strong arguments for reproducibility in the quantitative analysis of human geography data – not just for academics, but also for public agencies and private consultancies charged with analysing data that may influence policy. Achieving this in some situations is clearly within reach, although there are also some challenges ahead as the diversity and volume of geographically referenced information increase. Arguably there is also a role for such methods in addressing the Big Data Revolution. However, the adoption of reproducible approaches does call for some changes in practice from both researchers – in adopting reproducible research practices – and publishers – in providing a medium through which reproducible documents may be easily submitted, handled and distributed.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Anderson C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired. Available at: http://www.wired.com/science/discoveries/magazine/16-07/pb_theory (accessed 22 July 2015).
Ballas D, Clarke G, Dorling D, Eyre H, Thomas B and Rossiter D (2005) SimBritain: A spatial microsimulation approach to population dynamics. Population, Space and Place 11(1): 13–34.
Barni M, Perez-Gonzalez F, Comesaña P and Bartoli G (2007) Putting reproducible signal processing into practice: A case study in watermarking. In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing. Available at: http://gpsc.uvigo.es/sites/default/files/publications/icassp07reproducible.pdf (accessed 22 July 2015).
Bergmann L (2013) Bound by chains of carbon: Ecological-economic geographies of globalization. Annals of the Association of American Geographers 103(6): 1348–1370. DOI: 10.1080/00045608.2013.779547.
Brunsdon C and Comber A (2015) An Introduction to R for Spatial Analysis and Mapping. London: SAGE.
Brunsdon C and Singleton A (2015) Reproducible research: Concepts, techniques and issues. In: Brunsdon C and Singleton A (eds) Geocomputation: A Practical Primer. London: SAGE, 254–264.
Buckheit JB and Donoho DL (1995) WaveLab and Reproducible Research. Tech. Rep. 474, Dept of Statistics, Stanford University.
Claerbout J (1992) Electronic documents give reproducible research a new meaning. In: Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 601–604.
Clarke M and Holm E (1987) Microsimulation methods in spatial analysis and planning. Geografiska Annaler Series B, Human Geography 69(2): 145–164.
Gentleman R and Temple Lang D (2004) Statistical analyses and reproducible research. Bioconductor Project Working Paper 2.
Heppenstall A, Crooks A, See L and Batty M (2012) Agent-Based Models of Geographical Systems. New York: Springer.
Herndon T, Ash M and Pollin R (2013) Does high public debt consistently stifle economic growth? A critique of Reinhart and Rogoff. Cambridge Journal of Economics 38: 257–279.
Hey T, Tansley S and Tolle K (2009) Jim Gray on eScience: A transformed scientific method. In: Hey T, Tansley S and Tolle K (eds) The Fourth Paradigm: Data-Intensive Scientific Discovery. Redmond: Microsoft Research. Available at: http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_jim_gray_transcript.pdf (accessed 22 July 2015).
Kelling S, Hochachka WH, Fink D, Riedewald M, Caruana R, Ballard G and Hooker G (2009) Data-intensive science: A new paradigm for biodiversity studies. BioScience 59(7): 613–620. DOI: 10.1525/bio.2009.59.7.12.
Kitchin R (2014) Big Data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3): 262–267.
Kitchin R (2014a) Big Data, new epistemologies and paradigm shifts. Big Data & Society 1(1). DOI: 10.1177/2053951714528481.
Kitchin R (2014b) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. London: SAGE.
Knuth D (1984) Literate programming. Computer Journal 27(2): 97–111.
Koenker R (1996) Reproducible Econometric Research. Department of Econometrics, University of Illinois.
Leisch F (2002) Dynamic generation of statistical reports using literate data analysis. In: Härdle W and Rönz B (eds) Compstat 2002: Proceedings in Computational Statistics. Heidelberg: Physika Verlag, 575–580.
Lovelace R and Ballas D (2013) 'Truncate, replicate, sample': A method for creating integer weights for spatial microsimulation. Computers, Environment and Urban Systems 41: 1–11.
Mayer-Schonberger V and Cukier K (2013) Big Data: A Revolution That Will Change How We Live, Work and Think. London: John Murray.
McKinney W (2012) Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. New York: O'Reilly.
Miller HJ and Goodchild M (2014) Data-driven geography. GeoJournal. DOI: 10.1007/s10708-014-9602-6.
O'Neil C and Schutt R (2013) Doing Data Science: Straight Talk from the Frontline. New York: O'Reilly.
Openshaw S and Taylor PJ (1979) A million or so correlation coefficients: Three experiments on the modifiable areal unit problem. In: Statistical Applications in the Spatial Sciences. London: Pion, 127–144.
Parker J and Epstein J (2011) A distributed platform for global-scale agent-based models of disease transmission. ACM Transactions on Modeling and Computer Simulation 22(1). DOI: 10.1145/2043635.2043637.
Pastell M (2014) Pweave: Reports from data with Python. Available at: http://mpastell.com/pweave/docs.html (accessed 22 July 2015).
R Core Team (2015) R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available at: http://www.R-project.org/ (accessed 22 July 2015).
Radiant News (2015) Introducing Radiant: A shiny interface for R. Available at: http://www.r-bloggers.com/