Spatial Data Science With Applications in R, 1st Edition

The document is a comprehensive guide to 'Spatial Data Science With Applications in R', published by CRC Press in 2023. It covers various topics including spatial data, coordinates, geometries, and data cubes, along with practical applications using R. The book aims to provide readers with the necessary tools and knowledge to effectively work with spatial data.

Cover artwork by Allison Horst
First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2023 Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and pub-
lisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information stor-
age or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.

ISBN: 978-1-138-31118-3 (hbk)
ISBN: 978-1-032-47392-5 (pbk)
ISBN: 978-0-429-45901-6 (ebk)

DOI: 10.1201/9780429459016

Typeset in Latin Modern font by KnowledgeWorks Global Ltd.

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.

Access the Support Material: https://r-spatial.org/book/.


Table of contents

Preface xi

I Spatial Data 1
1 Getting Started 5
1.1 A first map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Coordinate reference systems . . . . . . . . . . . . . . . . . . . 7
1.3 Raster and vector data . . . . . . . . . . . . . . . . . . . . . 8
1.4 Raster types . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Time series, arrays, data cubes . . . . . . . . . . . . . . . . . . 11
1.6 Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.7 Spatial data science software . . . . . . . . . . . . . . . . . . 13
1.7.1 GDAL . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.7.2 PROJ . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.7.3 GEOS and s2geometry . . . . . . . . . . . . . . . . . . 14
1.7.4 NetCDF, udunits2, liblwgeom . . . . . . . . . . . . . . 14
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2 Coordinates 17
2.1 Quantities, units, datum . . . . . . . . . . . . . . . . . . . . . 17
2.2 Ellipsoidal coordinates . . . . . . . . . . . . . . . . . . . . . 18
2.2.1 Spherical or ellipsoidal coordinates . . . . . . . . . . . 19
2.2.2 Projected coordinates, distances . . . . . . . . . . . . . 21
2.2.3 Bounded and unbounded spaces . . . . . . . . . . . . 22
2.3 Coordinate reference systems . . . . . . . . . . . . . . . . . . 22
2.4 PROJ and mapping accuracy . . . . . . . . . . . . . . . . . . 23
2.5 WKT-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3 Geometries 29
3.1 Simple feature geometries . . . . . . . . . . . . . . . . . . . . 29
3.1.1 The big seven . . . . . . . . . . . . . . . . . . . . . . . 29
3.1.2 Simple and valid geometries, ring direction . . . . . . . 31
3.1.3 Z and M coordinates . . . . . . . . . . . . . . . . . . . . 31
3.1.4 Empty geometries . . . . . . . . . . . . . . . . . . . . 32
3.1.5 Ten further geometry types . . . . . . . . . . . . . . . 32
3.1.6 Text and binary encodings . . . . . . . . . . . . . . . 33

3.2 Operations on geometries . . . . . . . . . . . . . . . . . . . . 34
3.2.1 Unary predicates . . . . . . . . . . . . . . . . . . . . . 34
3.2.2 Binary predicates and DE-9IM . . . . . . . . . . . . . 34
3.2.3 Unary measures . . . . . . . . . . . . . . . . . . . . . 36
3.2.4 Binary measures . . . . . . . . . . . . . . . . . . . . . . 37
3.2.5 Unary transformers . . . . . . . . . . . . . . . . . . . . . 37
3.2.6 Binary transformers . . . . . . . . . . . . . . . . . . . 38
3.2.7 N-ary transformers . . . . . . . . . . . . . . . . . . . . 38
3.3 Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 Coverages: tessellations and rasters . . . . . . . . . . . . . . 40
3.4.1 Topological models . . . . . . . . . . . . . . . . . . . . 40
3.4.2 Raster tessellations . . . . . . . . . . . . . . . . . . . . . 41
3.5 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 Spherical Geometries 45
4.1 Straight lines . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Ring direction and full polygon . . . . . . . . . . . . . . . . . 45
4.3 Bounding box, rectangle, and cap . . . . . . . . . . . . . . . 46
4.4 Validity on the sphere . . . . . . . . . . . . . . . . . . . . . . . 47
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5 Attributes and Support 49


5.1 Attribute-geometry relationships and support . . . . . . . . . 50
5.2 Aggregating and summarising . . . . . . . . . . . . . . . . . 52
5.3 Area-weighted interpolation . . . . . . . . . . . . . . . . . . 54
5.3.1 Spatially extensive and intensive variables . . . . . . . 54
5.3.2 Dasymetric mapping . . . . . . . . . . . . . . . . . . . 55
5.3.3 Support in file formats . . . . . . . . . . . . . . . . . . 55
5.4 Up- and Downscaling . . . . . . . . . . . . . . . . . . . . . . 56
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

6 Data Cubes 59
6.1 A four-dimensional data cube . . . . . . . . . . . . . . . . . 59
6.2 Dimensions, attributes, and support . . . . . . . . . . . . . . 60
6.2.1 Regular dimensions, GDAL’s geotransform . . . . . . 62
6.2.2 Support along cube dimensions . . . . . . . . . . . . . 62
6.3 Operations on data cubes . . . . . . . . . . . . . . . . . . . . 63
6.3.1 Slicing a cube: filter . . . . . . . . . . . . . . . . . . . 63
6.3.2 Applying functions to dimensions . . . . . . . . . . . . 64
6.3.3 Reducing dimensions . . . . . . . . . . . . . . . . . . . 64
6.4 Aggregating raster to vector cubes . . . . . . . . . . . . . . . 65
6.5 Switching dimension with attributes . . . . . . . . . . . . . . . 67
6.6 Other dynamic spatial data . . . . . . . . . . . . . . . . . . . . 67
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

II R for Spatial Data Science 71


7 Introduction to sf and stars 75
7.1 Package sf . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
7.1.1 Creation . . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.1.2 Reading and writing . . . . . . . . . . . . . . . . . . . . 77
7.1.3 Subsetting . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.1.4 Binary predicates . . . . . . . . . . . . . . . . . . . . . 78
7.1.5 tidyverse . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.2 Spatial joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
7.2.1 Sampling, gridding, interpolating . . . . . . . . . . . . 82
7.3 Ellipsoidal coordinates . . . . . . . . . . . . . . . . . . . . . 82
7.4 Package stars . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.4.1 Reading and writing raster data . . . . . . . . . . . . 85
7.4.2 Subsetting stars data cubes . . . . . . . . . . . . . . 86
7.4.3 Cropping . . . . . . . . . . . . . . . . . . . . . . . . . 88
7.4.4 Redimensioning and combining stars objects . . . . . 89
7.4.5 Extracting point samples, aggregating . . . . . . . . . . 91
7.4.6 Predictive models . . . . . . . . . . . . . . . . . . . . . 92
7.4.7 Plotting raster data . . . . . . . . . . . . . . . . . . . 93
7.4.8 Analysing raster data . . . . . . . . . . . . . . . . . . 93
7.4.9 Curvilinear rasters . . . . . . . . . . . . . . . . . . . . 98
7.4.10 GDAL utils . . . . . . . . . . . . . . . . . . . . . . . . 98
7.5 Vector data cube examples . . . . . . . . . . . . . . . . . . . 99
7.5.1 Example: aggregating air quality time series . . . . . . 99
7.5.2 Example: Bristol origin-destination data cube . . . . . 102
7.5.3 Tidy array data . . . . . . . . . . . . . . . . . . . . . . 108
7.5.4 File formats for vector data cubes . . . . . . . . . . . 109
7.6 Raster-to-vector, vector-to-raster . . . . . . . . . . . . . . . . 109
7.6.1 Vector-to-raster . . . . . . . . . . . . . . . . . . . . . . 109
7.7 Coordinate transformations and conversions . . . . . . . . . 110
7.7.1 st_crs . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.7.2 st_transform, sf_project . . . . . . . . . . . . . . . . 111
7.7.3 sf_proj_info . . . . . . . . . . . . . . . . . . . . . . 112
7.7.4 Datum grids, proj.db, cdn.proj.org, local cache . . . . 112
7.7.5 Transformation pipelines . . . . . . . . . . . . . . . . . 113
7.7.6 Axis order and direction . . . . . . . . . . . . . . . . . 115
7.8 Transforming and warping rasters . . . . . . . . . . . . . . . 116
7.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

8 Plotting spatial data 119


8.1 Every plot is a projection . . . . . . . . . . . . . . . . . . . . 119
8.1.1 What is a good projection for my data? . . . . . . . . 120
8.2 Plotting points, lines, polygons, grid cells . . . . . . . . . . . . 121
8.2.1 Colours . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.2.2 Colour breaks: classInt . . . . . . . . . . . . . . . . 122
8.2.3 Graticule and other navigation aids . . . . . . . . . . 123
8.3 Base plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.3.1 Adding to plots with legends . . . . . . . . . . . . . . 123
8.3.2 Projections in base plots . . . . . . . . . . . . . . . . . 125
8.3.3 Colours and colour breaks . . . . . . . . . . . . . . . . 125
8.4 Maps with ggplot2 . . . . . . . . . . . . . . . . . . . . . . . 125
8.5 Maps with tmap . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.6 Interactive maps: leaflet, mapview, tmap . . . . . . . . . . 128
8.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

9 Large data and cloud native 131


9.1 Vector data: sf . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.1.1 Reading from local disk . . . . . . . . . . . . . . . . . 132
9.1.2 Reading from databases, dbplyr . . . . . . . . . . . . 133
9.1.3 Reading from online resources or web services . . . . . 134
9.1.4 APIs, OpenStreetMap . . . . . . . . . . . . . . . . . . 134
9.1.5 GeoParquet and GeoArrow . . . . . . . . . . . . . . . 135
9.2 Raster data: stars . . . . . . . . . . . . . . . . . . . . . . . . 136
9.2.1 stars proxy objects . . . . . . . . . . . . . . . . . . . . 137
9.2.2 Operations on proxy objects . . . . . . . . . . . . . . . 139
9.2.3 Remote raster resources . . . . . . . . . . . . . . . . . 139
9.3 Very large data cubes . . . . . . . . . . . . . . . . . . . . . . 140
9.3.1 Finding and processing assets . . . . . . . . . . . . . . 140
9.3.2 Cloud native storage: Zarr . . . . . . . . . . . . . . . . 140
9.3.3 APIs for data: GEE, openEO . . . . . . . . . . . . . . . 141
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

III Models for Spatial Data 143


10 Statistical modelling of spatial data 147
10.1 Mapping with non-spatial regression and ML models . . . . 148
10.2 Support and statistical modelling . . . . . . . . . . . . . . . 149
10.3 Time in predictive models . . . . . . . . . . . . . . . . . . . 150
10.4 Design-based and model-based inference . . . . . . . . . . . . . 151
10.5 Predictive models with coordinates . . . . . . . . . . . . . . 152
10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

11 Point Pattern Analysis 155


11.1 Observation window . . . . . . . . . . . . . . . . . . . . . . . 156
11.2 Coordinate reference systems . . . . . . . . . . . . . . . . . . 160
11.3 Marked point patterns, points on linear networks . . . . . . . . 161
11.4 Spatial sampling and simulating a point process . . . . . . . 163
11.5 Simulating points on the sphere . . . . . . . . . . . . . . . . 164
11.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

12 Spatial Interpolation 165


12.1 A first dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 165
12.2 Sample variogram . . . . . . . . . . . . . . . . . . . . . . . . . 167
12.3 Fitting variogram models . . . . . . . . . . . . . . . . . . . . 169
12.4 Kriging interpolation . . . . . . . . . . . . . . . . . . . . . . . 171
12.5 Areal means: block kriging . . . . . . . . . . . . . . . . . . . . 171
12.6 Conditional simulation . . . . . . . . . . . . . . . . . . . . . 173
12.7 Trend models . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
12.7.1 A population grid . . . . . . . . . . . . . . . . . . . . 175
12.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

13 Multivariate and Spatiotemporal Geostatistics 181


13.1 Preparing the air quality dataset . . . . . . . . . . . . . . . . . 181
13.2 Multivariable geostatistics . . . . . . . . . . . . . . . . . . . 183
13.3 Spatiotemporal geostatistics . . . . . . . . . . . . . . . . . . 184
13.3.1 A spatiotemporal variogram model . . . . . . . . . . . 184
13.3.2 Irregular space time data . . . . . . . . . . . . . . . . 189
13.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

14 Proximity and Areal Data 191


14.1 Representing proximity in spdep . . . . . . . . . . . . . . . . 192
14.2 Contiguous neighbours . . . . . . . . . . . . . . . . . . . . . 194
14.3 Graph-based neighbours . . . . . . . . . . . . . . . . . . . . . . 197
14.4 Distance-based neighbours . . . . . . . . . . . . . . . . . . . 199
14.5 Weights specification . . . . . . . . . . . . . . . . . . . . . . 204
14.6 Higher order neighbours . . . . . . . . . . . . . . . . . . . . . 206
14.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

15 Measures of Spatial Autocorrelation 209


15.1 Measures and process misspecification . . . . . . . . . . . . . 209
15.2 Global measures . . . . . . . . . . . . . . . . . . . . . . . . . . 211
15.2.1 Join-count tests for categorical data . . . . . . . . . . 212
15.2.2 Moran’s I . . . . . . . . . . . . . . . . . . . . . . . . . 214
15.3 Local measures . . . . . . . . . . . . . . . . . . . . . . . . . . 216
15.3.1 Local Moran’s Ii . . . . . . . . . . . . . . . . . . . . . 218
15.3.2 Local Getis-Ord Gi . . . . . . . . . . . . . . . . . . . . 225
15.3.3 Local Geary’s Ci . . . . . . . . . . . . . . . . . . . . . 226
15.3.4 The rgeoda package . . . . . . . . . . . . . . . . . . . 229
15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

16 Spatial Regression 233


16.1 Markov random field and multilevel models . . . . . . . . . . 233
16.1.1 Boston house value dataset . . . . . . . . . . . . . . . 235
16.2 Multilevel models of the Boston dataset . . . . . . . . . . . . . 237
16.2.1 IID random effects with lme4 . . . . . . . . . . . . . . 238
16.2.2 IID and CAR random effects with hglm . . . . . . . . 238
16.2.3 IID and ICAR random effects with R2BayesX . . . . . 239
16.2.4 IID, ICAR and Leroux random effects with INLA . . 240
16.2.5 ICAR random effects with mgcv::gam() . . . . . . . . . 241
16.2.6 Upper-level random effects: summary . . . . . . . . . . 241
16.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

17 Spatial Econometrics Models 245


17.1 Spatial econometric models: definitions . . . . . . . . . . . . 245
17.2 Maximum likelihood estimation in spatialreg . . . . . . . . 248
17.2.1 Boston house value dataset examples . . . . . . . . . . 249
17.3 Impacts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
17.4 Predictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
17.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

A Older R Spatial Packages 259


A.1 Retiring rgdal and rgeos . . . . . . . . . . . . . . . . . . . . 259
A.2 Links and differences between sf and sp . . . . . . . . . . . . 259
A.3 Migration code and packages . . . . . . . . . . . . . . . . . . 260
A.4 Package raster and terra . . . . . . . . . . . . . . . . . . . . 260

B R Basics 263
B.1 Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
B.2 Data structures . . . . . . . . . . . . . . . . . . . . . . . . . 264
B.2.1 Homogeneous vectors . . . . . . . . . . . . . . . . . . 264
B.2.2 Heterogeneous vectors: list . . . . . . . . . . . . . . . 265
B.2.3 NULL and removing list elements . . . . . . . . . . . . 267
B.2.4 Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . 267
B.2.5 The names attributes . . . . . . . . . . . . . . . . . . . 270
B.2.6 Using structure . . . . . . . . . . . . . . . . . . . . . . 271
B.3 Dissecting a MULTIPOLYGON . . . . . . . . . . . . . . . . . . . 272

References 277

Index 291
Index of functions 299
Preface

Data science is concerned with finding answers to questions on the basis of
available data, and communicating that effort. Besides showing the results,
this communication involves sharing the data used, but also exposing the
path that led to the answers in a comprehensive and reproducible way. It
also acknowledges the fact that available data may not be sufficient to answer
questions, and that any answers are conditional on the data collection or
sampling protocols employed.
This book introduces and explains the concepts underlying spatial data: points,
lines, polygons, rasters, coverages, geometry attributes, data cubes, reference
systems, as well as higher-level concepts including how attributes relate to
geometries and how this affects analysis. The relationship of attributes to
geometries is known as support, and changing support also changes the charac-
teristics of attributes. Some data generation processes are continuous in space,
and may be observed everywhere. Others are discrete, observed in tessellated
containers. In modern spatial data analysis, tessellated methods are often used
for all data, extending across the legacy partition into point process, geosta-
tistical and lattice models. It is support (and the understanding of support)
that underlies the importance of spatial representation. The book aims at data
scientists who want to get a grip on using spatial data in their analysis. To
exemplify how to do things, it uses R. In future editions we hope to extend
this with examples using Python (see, e.g., Bivand 2022b) and Julia.
It is often thought that spatial data boils down to having observations’ longitude
and latitude in a dataset, and treating these just like any other variable. This
carries the risk of missed opportunities and meaningless analyses. For instance,
• coordinate pairs really are pairs, and lose much of their meaning when
treated independently
• rather than having point locations, observations are often associated with
spatial lines, areas, or grid cells
• spatial distances between observations are often not well represented by
straight-line distances, but by great circle distances, distances through
networks, or by measuring the effort it takes getting from A to B
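
The first and third points can be illustrated with a minimal sketch in base R. It is not taken from the book: the function names haversine_km and naive_degrees are hypothetical, and the Earth is assumed to be a sphere with a mean radius of 6371 km, which is only an approximation. The sketch compares two pairs of points that are both 10 degrees of longitude apart, one pair on the equator and one at 60°N.

```r
# Great-circle distance in km between two longitude/latitude points,
# using the haversine formula.
# Assumption: a spherical Earth with mean radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding errors pushing sqrt(a) above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Naive "distance" that treats longitude and latitude as two
# independent planar variables measured in degrees.
naive_degrees <- function(lon1, lat1, lon2, lat2) {
  sqrt((lon2 - lon1)^2 + (lat2 - lat1)^2)
}

# Both pairs are 10 degrees of longitude apart
naive_equator <- naive_degrees(0, 0, 10, 0)
naive_north   <- naive_degrees(0, 60, 10, 60)

d_equator <- haversine_km(0, 0, 10, 0)    # about 1112 km
d_north   <- haversine_km(0, 60, 10, 60)  # about 555 km

print(c(naive_equator = naive_equator, naive_north = naive_north))
print(c(d_equator = d_equator, d_north = d_north))
```

The naive measure is identical for both pairs, while the great-circle distance at 60°N is roughly half of that on the equator. For real analyses, an ellipsoidal or spherical computation from an established package is likely to be preferable to this hand-written formula.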
We introduce the concepts behind spatial data, coordinate reference systems,
spatial analysis, and introduce a number of packages, including sf (Pebesma
2018, 2022b), stars (Pebesma 2022d), s2 (Dunnington, Pebesma, and Rubak
2023) and lwgeom (Pebesma 2023), as well as a number of spatial tidyverse
(Wickham et al. 2019; Wickham 2022) extensions, and a number of spatial
analysis and visualisation packages that can be used with these packages,
including gstat (Pebesma 2004; Pebesma and Graeler 2022), spdep (Bivand
2022c), spatialreg (Bivand and Piras 2022), spatstat (Baddeley, Rubak, and
Turner 2015; Baddeley, Turner, and Rubak 2022), tmap (Tennekes 2018, 2022)
and mapview (Appelhans et al. 2022).
Like data science, spatial data science seems to be a field that arises bottom-
up in and from many existing scientific disciplines and industrial activities
concerned with application of spatial data, rather than being a sub-discipline
of an existing scientific discipline. Although there are various activities trying
to scope it through focused conferences, symposia, chairs and study programs,
we believe that the versatility of spatial data applications and questions will
render such activity hard. Giving this book the title “spatial data science”
is not another attempt to define the bounds of this field but rather an at-
tempt to contribute to it from our 3-4 decades of experience working with
researchers from various fields willing to publicly share research questions,
data, and attempts to solve these questions with software. As a consequence,
the selection of topics found in this book has a certain bias towards our own
areas of research interest and experience. Platforms that have helped create
an open research community include the ai-geostats and r-sig-geo mailing lists,
sourceforge, r-forge, GitHub, and the OpenGeoHub summer schools organized
yearly since 2006. The current possibility and willingness to cross data science
language barriers opens a new and very exciting perspective. Our motivation
to contribute to this field is a belief that open science leads to better science,
and that better science might contribute to a more sustainable world.

Acknowledgements
We are grateful to the entire r-spatial community, especially those who
• developed r-spatial packages or contributed to their development
• contributed to discussions on twitter #rspatial or GitHub
• brought comments or asked questions in courses, summer schools, or
conferences.
We are in particular grateful to Dewey Dunnington for implementing the s2
package, and for active contributions from Sahil Bhandari, Jonathan Bahlmann
for preparing the figures in Chapter 6, Claus Wilke, Jakub Nowosad, the
“Spatial Data Science with R” classes of 2021 and 2022, and to those who
actively contributed with GitHub issues, pull requests, or discussions:
• to the book repository (Nowosad, jonathom, JaFro96, singhkpratham,
liuyadong, hurielreichel, PPaccioretti, Robinlovelace, Syverpet, jonas-hurst,
angela-li, ALanguillaume, florisvdh, ismailsunni, andronaco),
• to the sf repository (aecoleman, agila5, andycraig, angela-li, ateucher, barry-
rowlingson, bbest, BenGraeler, bhaskarvk, Bisaloo, bkmgit, christophertull,
chrisyeh96, cmcaine, cpsievert, daissi, dankelley, DavisVaughan, dbaston,
dblodgett-usgs, dcooley, demorenoc, dpprdan, drkrynstrng, etiennebr, fa-
muvie, fdetsch, florisvdh, gregleleu, hadley, hughjonesd, huizezhang-sherry,
jeffreyhanson, jeroen, jlacko, joethorley, joheisig, JoshOBrien, jwolfson,
kadyb, karldw, kendonB, khondula, KHwong12, krlmlr, lambdamoses,
lbusett, lcgodoy, lionel-, loicdtx, marwahaha, MatthieuStigler, mdsumner,
MichaelChirico, microly, mpadge, mtennekes, nikolai-b, noerw, Nowosad,
oliverbeagley, Pakillo, paleolimbot, pat-s, PPaccioretti, prdm0, ranghetti,
rCarto, renejuan, rhijmans, rhurlin, rnuske, Robinlovelace, robitalec, rubak,
rundel, statnmap, thomasp85, tim-salabim, tyluRp, uribo, Valexandre,
wibeasley, wittja01, yutannihilation, Zedseayou),
• to the stars repository (a-benini, ailich, ateucher, btupper, dblodgett-usgs,
djnavarro, ErickChacon, ethanwhite, etiennebr, flahn, floriandeboissieu,
gavg712, gdkrmr, jannes-m, jeroen, JoshOBrien, kadyb, kendonB, md-
sumner, michaeldorman, mtennekes, Nowosad, pat-s, PPaccioretti, przell,
qdread, Rekyt, rhijmans, rubak, rushgeo, statnmap, uribo, yutannihilation),
• to the s2 repository (kylebutts, spiry34, jeroen, eddelbuettel).
Part I

Spatial Data

The first part of this book introduces concepts of spatial data science: maps,
projections, vector and raster data structures, software, attributes and support,
and data cubes. This part uses R only to generate text output or figures. The R
code for this is not shown or explained, as it would distract from the message:
Part II focuses on the use of R. The online version of this book, found at
https://r-spatial.org/book/, contains the R code at the place where it is used in
hidden sections that can be unfolded on demand and copied to the clipboard
for execution and experimenting. Output from R code uses code font and has
lines starting with a #, as in
# Linking to GEOS 3.11.1, GDAL 3.6.2, PROJ 9.1.1; sf_use_s2()
# is TRUE
More detailed explanation of R code to solve spatial data science problems
starts in the second part of this book. Appendix B contains a short, elementary
explanation of R data structures; Wickham (2014a) gives a more extensive
treatment of this topic.
1
Getting Started

This chapter introduces a number of concepts associated with handling spatial
and spatiotemporal data, pointing forward to later chapters where these
concepts are discussed in more detail. It also introduces a number of open
source technologies that form the foundation of all spatial data science language
implementations.

1.1 A first map


The typical way to graph spatial data is by creating a map. Let us consider a
simple map, shown in Figure 1.1.

[Figure: map of the variable BIR74, with a graticule from 34°N to 37°N and from 84°W to 76°W, and a colour legend ranging from 5000 to 20000]

Figure 1.1: A first map: birth counts 1974-78, North Carolina counties
