Preview (2023) Introduction To Environmental Data Science in R 33p

This document provides an introduction to the textbook "Introduction to Environmental Data Science". It focuses on applying data science methods in R to environmental research, including exploratory data analysis, spatial data analysis, statistics and modeling, time series analysis, and communication. The textbook is designed for undergraduate to graduate students in environmental fields but can also serve as a reference for environmental professionals. It gives thorough consideration to spatial and temporal data needs in environmental research and features examples applying to field-collected and government data. Each chapter includes exercises to help students learn the concepts.


Introduction to Environmental Data Science
Introduction to Environmental Data Science focuses on data science methods in the R language
applied to environmental research, with sections on exploratory data analysis in R, including data
abstraction, transformation, and visualization; spatial data analysis in vector and raster models;
statistics and modeling, ranging from exploratory methods through confirmatory statistics to
machine learning models; time series analysis, focusing especially on carbon and
micrometeorological flux; and communication. Introduction to Environmental Data Science is an
ideal textbook for teaching undergraduate to graduate level students in environmental science,
environmental studies, geography, earth science, and biology, but can also serve as a reference for
environmental professionals working in consulting, NGOs, and government agencies at the local,
state, federal, and international levels.

Features

• Gives thorough consideration to the needs of environmental research in both spatial and
temporal domains.
• Features examples of applications involving field-collected data ranging from individual
observations to data logging.
• Also includes examples of applications involving government and NGO sources, ranging
from satellite imagery to environmental data collected by regulators such as the EPA.
• Contains class-tested exercises in all chapters other than case studies. A solutions manual is
available for instructors.
• All examples and exercises make use of a GitHub package for functions and especially data.
Taylor & Francis
Taylor & Francis Group
http://taylorandfrancis.com
Introduction to Environmental Data Science

Jerry D. Davis
Designed cover image: By Anna Studwell and Jerry D. Davis

First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2023 Jerry D. Davis

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright
holders if permission to publish in this form has not been obtained. If any copyright material has not been
acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including
photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the
Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are
not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for
identification and explanation without intent to infringe.

ISBN: 978-1-032-32218-6 (hbk)
ISBN: 978-1-032-33034-1 (pbk)
ISBN: 978-1-003-31782-1 (ebk)

DOI: 10.1201/9781003317821

Typeset in LM Roman
by KnowledgeWorks Global Ltd.

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
“Dandelion fluff – Ephemeral stalk sheds seeds to the universe” by Anna Studwell
Contents

Author/editor biographies xiii

List of Figures xv

1 Background, Goals and Data 1


1.1 Environmental Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Environmental Data and Methods . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3.1 Some definitions: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Exploratory Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Software and Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

I Exploratory Data Analysis 11


2 Introduction to R 13
2.1 Data Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.1 Scalars and assignment . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Expressions and Statements . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Data Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.1 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Rectangular Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6 Data Structures in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6.2 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.3 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.4 Data frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6.5 Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.7 Accessors and Subsetting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.7.1 [] Subsetting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.7.2 [[]] The mysterious double bracket . . . . . . . . . . . . . . . . . . 41
2.7.3 $ Accessing a vector from a data frame . . . . . . . . . . . . . . . . . 42
2.8 Programming scripts in RStudio . . . . . . . . . . . . . . . . . . . . . . . . 42
2.8.1 function : creating your own . . . . . . . . . . . . . . . . . . . . . . 43
2.8.2 if : conditional operations . . . . . . . . . . . . . . . . . . . . . . . . 44
2.8.3 for loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.8.4 Subsetting with logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.8.5 Apply functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.9 RStudio projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9.1 R Markdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.10 Exercises: Introduction to R . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3 Data Abstraction 55
3.1 The Tidyverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2 Tibbles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2.1 Building a tibble from vectors . . . . . . . . . . . . . . . . . . . . . . 57
3.2.2 tribble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.2.3 read_csv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3 Summarizing variable distributions . . . . . . . . . . . . . . . . . . . . . . 60
3.3.1 Stratifying variables by site using a Tukey box plot . . . . . . . . . . 62
3.4 Database operations with dplyr . . . . . . . . . . . . . . . . . . . . . . . . 63
3.4.1 Select, mutate, and the pipe . . . . . . . . . . . . . . . . . . . . . . . 63
3.4.2 filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4.3 Writing a data frame to a csv . . . . . . . . . . . . . . . . . . . . . . 67
3.4.4 Summarize by group . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.5 Count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.6 Sorting after summarizing . . . . . . . . . . . . . . . . . . . . . . . . 69
3.4.7 The dot operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.5 String abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.5.1 Detecting matches . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.5.2 Subsetting strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.3 String length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.4 Replacing substrings with other text (“mutating” strings) . . . . . . 73
3.5.5 Concatenating and splitting . . . . . . . . . . . . . . . . . . . . . . . 74
3.6 Dates and times with lubridate . . . . . . . . . . . . . . . . . . . . . . . . 76
3.7 Calling functions explicitly with :: . . . . . . . . . . . . . . . . . . . . . . 77
3.8 Exercises: Data Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4 Visualization 79
4.1 plot in base R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2 ggplot2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.3 Plotting one variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3.1 Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.3.2 Density plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.3.3 Boxplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.4 Plotting Two Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.1 Two continuous variables . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.2 Two variables, one discrete . . . . . . . . . . . . . . . . . . . . . . . 92
4.4.3 Color systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4.4 Trend line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.5 General Symbology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5.1 Categorical symbology . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.5.2 Log scales instead of transform . . . . . . . . . . . . . . . . . . . . . 99
4.6 Graphs from Grouped Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.6.1 Faceted graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.7 Titles and Subtitles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.8 Pairs Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.9 Exercises: Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

5 Data Transformation 107


5.1 Data joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.2 Set operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109


5.3 Binding rows and columns . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4 Pivoting data frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.1 pivot_longer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.2 pivot_wider . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.4.3 A free_y faceted graph using a pivot . . . . . . . . . . . . . . . . . . 116
5.5 Exercise: Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

II Spatial 121
6 Spatial Data and Maps 123
6.1 Spatial Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.1.1 Simple geometry building in sf . . . . . . . . . . . . . . . . . . . . . 125
6.1.2 Building points from a data frame . . . . . . . . . . . . . . . . . . . 128
6.1.3 SpatVectors in terra . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.1.4 Creating features from shapefiles . . . . . . . . . . . . . . . . . . . . 133
6.2 Coordinate Referencing Systems . . . . . . . . . . . . . . . . . . . . . . . . 135
6.3 Creating sf Data from Data Frames . . . . . . . . . . . . . . . . . . . . . . 137
6.3.1 Removing geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.4 Base R’s plot() with terra . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.4.1 Using maptiles to create a basemap . . . . . . . . . . . . . . . . . . 139
6.5 Raster data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5.1 Building rasters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5.2 Vector to raster conversion . . . . . . . . . . . . . . . . . . . . . . . 143
6.6 ggplot2 for Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.6.1 Rasters in ggplot2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
6.7 tmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.8 Interactive Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.8.1 Leaflet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6.8.2 Mapview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.8.3 tmap (view mode) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.8.4 Interactive mapping of individual penguins abstracted from a big dataset . . . . . . . . . 157
6.9 Exercises: Spatial Data and Maps . . . . . . . . . . . . . . . . . . . . . . . 159
6.9.1 Project preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

7 Spatial Analysis 163


7.1 Data Frame Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.1.1 Using grouped summaries, and filtering by a selection . . . . . . . . 165
7.2 Spatial Analysis Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 168
7.2.1 Using topology to subset . . . . . . . . . . . . . . . . . . . . . . . . . 168
7.2.2 Centroid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.2.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
7.2.4 Buffers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.2.5 Spatial overlay: union and intersection . . . . . . . . . . . . . . . . . 179
7.2.6 Clip with st_crop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.2.7 Spatial join with st_join . . . . . . . . . . . . . . . . . . . . . . . . 183
7.2.8 Further exploration of spatial analysis . . . . . . . . . . . . . . . . . 184
7.3 Exercises: Spatial Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

8 Raster Spatial Analysis 187


8.1 Terrain functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187


8.2 Map Algebra in terra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
8.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
8.4 Extracting Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
8.5 Focal Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.6 Zonal Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.7 Exercises: Raster Spatial Analysis . . . . . . . . . . . . . . . . . . . . . . . 203

9 Spatial Interpolation 205


9.1 Null Model of the Original Data . . . . . . . . . . . . . . . . . . . . . . . . 205
9.2 Voronoi Polygon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
9.2.1 Cross-validation and relative performance . . . . . . . . . . . . . . . 209
9.3 Nearest Neighbor Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . 210
9.3.1 Cross-validation and relative performance of the nearest neighbor model . . . . . . . . . . 211
9.4 Inverse Distance Weighted (IDW) . . . . . . . . . . . . . . . . . . . . . . . 211
9.4.1 Using cross-validation and relative performance to guide inverse-distance weight choice . . . 212
9.4.2 IDW: trying other inverse distance powers . . . . . . . . . . . . . . . 213
9.5 Polynomials and Trend Surfaces . . . . . . . . . . . . . . . . . . . . . . . . 214
9.6 Kriging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
9.6.1 Create a variogram. . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.6.2 Fit the variogram based on visual interpretation . . . . . . . . . . . 220
9.6.3 Ordinary Kriging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.7 Exercises: Spatial Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . 223

III Statistics and Modeling 225


10 Statistical Summaries and Tests 227
10.1 Goals of Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 227
10.2 Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
10.2.1 Summarize by group: stratifying a summary . . . . . . . . . . . . . . 229
10.2.2 Boxplot for visualizing distributions by group . . . . . . . . . . . . . 230
10.2.3 Generating pseudorandom numbers . . . . . . . . . . . . . . . . . . . 230
10.3 Correlation r and Coefficient of Determination r2 . . . . . . . . . . . . . . . 233
10.3.1 Displaying correlation in a pairs plot . . . . . . . . . . . . . . . . . . 236
10.4 Statistical Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.4.1 Comparing samples and groupings with a t test and a non-parametric Kruskal-Wallis Rank Sum test . . . 237
10.4.2 Analysis of variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
10.4.3 Testing a correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
10.5 Exercises: Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251

11 Modeling 253
11.1 Some Common Statistical Models . . . . . . . . . . . . . . . . . . . . . . . 253
11.2 Linear Model (lm) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
11.3 Spatial Influences on Statistical Analysis . . . . . . . . . . . . . . . . . . . 256
11.3.1 Mapping residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.4 Analysis of Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
11.5 Generalized linear model (GLM) . . . . . . . . . . . . . . . . . . . . . . . . 266
11.5.1 Binomial family: logistic GLM with streams . . . . . . . . . . . . . . 266
11.5.2 Logistic landslide model . . . . . . . . . . . . . . . . . . . . . . . . . 269


11.5.3 Poisson regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
11.5.4 Models employing machine learning . . . . . . . . . . . . . . . . . . 278
11.6 Exercises: Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

12 Imagery and Classification Models 281


12.1 Reading and Displaying Sentinel-2 Imagery . . . . . . . . . . . . . . . . . . 281
12.1.1 Individual bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
12.1.2 Spectral subsets to create three-band R-G-B and NIR-R-G for visualization . . . . . . . . . 284
12.1.3 Crop to study area extent . . . . . . . . . . . . . . . . . . . . . . . . 284
12.1.4 Saving results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.1.5 Band scatter plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.2 Spectral Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.3 Map Algebra and Vegetation Indices . . . . . . . . . . . . . . . . . . . . . . 290
12.3.1 Vegetation indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
12.3.2 Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
12.3.3 Other vegetation indices . . . . . . . . . . . . . . . . . . . . . . . . . 291
12.4 Unsupervised Classification with k-means . . . . . . . . . . . . . . . . . . . 293
12.5 Machine Learning Classification of Imagery . . . . . . . . . . . . . . . . . . 295
12.5.1 Read imagery and training data and extract sample values for training 296
12.5.2 Training the CART model . . . . . . . . . . . . . . . . . . . . . . . . 297
12.5.3 Prediction using the CART model . . . . . . . . . . . . . . . . . . . 298
12.5.4 Validating the model . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
12.6 Classifying with 10 m Sentinel-2 Imagery . . . . . . . . . . . . . . . . . . . 303
12.6.1 Subset bands (10 m) . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
12.6.2 Crop to RCV extent and extract pixel values . . . . . . . . . . . . . 304
12.6.3 Training the CART model (10 m) and plot the tree . . . . . . . . . 304
12.6.4 Prediction using the CART model (10 m) . . . . . . . . . . . . . . . 305
12.7 Classification Using Multiple Images Capturing Phenology . . . . . . . . . 308
12.7.1 Create a 10-band stack from both images . . . . . . . . . . . . . . . 309
12.7.2 Extract the training data (10 m spring + summer) . . . . . . . . . . 309
12.7.3 CART model and prediction (10 m spring + summer) . . . . . . . . 310
12.8 Conclusions and Next Steps for Imagery Classification . . . . . . . . . . . . 315
12.9 Exercises: Imagery Analysis and Classification Models . . . . . . . . . . . . 316

IV Time Series 317


13 Time Series Visualization and Analysis 319
13.1 Structure, Seasonality, and Decomposition of Time Series . . . . . . . . . . 321
13.2 Creation of Time Series (ts) Data . . . . . . . . . . . . . . . . . . . . . . . 323
13.2.1 Frequency, start, and end parameters for ts() . . . . . . . . . . . . . 324
13.2.2 Associating times with time series . . . . . . . . . . . . . . . . . . . 325
13.2.3 Subsetting time series by times . . . . . . . . . . . . . . . . . . . . . 325
13.2.4 Changing the frequency to use a different period . . . . . . . . . . . 327
13.2.5 Time stamps and extensible time series . . . . . . . . . . . . . . . . 328
13.3 Data smoothing: moving average (ma) . . . . . . . . . . . . . . . . . . . . . 332
13.4 Decomposition of data logger data: Marble Mountains . . . . . . . . . . . . 335
13.5 Facet Graphs for Comparing Variables over Time . . . . . . . . . . . . . . 338
13.6 Lag Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
13.6.1 The lag regression, using a lag function in a linear model . . . . . . 343
13.7 Ensemble Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . . 345


13.8 Learning more about Time Series in R . . . . . . . . . . . . . . . . . . . . 347
13.9 Exercises: Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347

V Communication and References 349


14 Communication with Shiny 351
14.1 Shiny Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
14.1.1 Input and output objects in the Old Faithful Eruptions document . 353
14.1.2 Input widgets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
14.1.3 Other input widgets . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.2 A Shiny App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
14.2.1 A brief note on reactivity . . . . . . . . . . . . . . . . . . . . . . . . 359
14.3 Shiny App I/O Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
14.3.1 Data tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
14.3.2 Text as character: renderPrint() and verbatimTextOutput() . . . . . 360
14.3.3 Formatted text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
14.3.4 Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
14.4 Shiny App in a Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
14.5 Components of a Shiny App (sierra) . . . . . . . . . . . . . . . . . . . . . . 363
14.5.1 Initial data setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
14.5.2 The ui section, with a tabsetPanel structure . . . . . . . . . . . . . . 364
14.5.3 The server section, including reactive elements . . . . . . . . . . . . 365
14.5.4 Calling shinyApp with the ui and server function results . . . . . . . 367
14.6 A MODIS Fire App with Web Scraping and observe with leafletProxy . . 367
14.6.1 Setup code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
14.6.2 ui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
14.6.3 Using observe and leafletProxy to allow changing the date while retaining the map zoom . . . 369
14.7 Learn More about Shiny Apps . . . . . . . . . . . . . . . . . . . . . . . . . 370
14.8 Exercises: Shiny . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371

References 373

Index 377
Author/editor biographies

Jerry Douglas Davis is a Professor of Geography & Environment (https://geog.sfsu.edu/)
and the Director of the Institute for Geographic Information Science (https://gis.sfsu.edu/)
at San Francisco State University, and borrows heavily from his and his students’ field-based
environmental research for examples in the book.

List of Figures

1.1 Environmental data science . . . . . . . . . . . . . . . . . . . . . . . . . . 1


1.2 California counties simple features data in igisci package . . . . . . . . . . 7

2.1 Variables, observations, and values in rectangular data . . . . . . . . . . . 22


2.2 Temperature plotted by index (left) and elevation (right) . . . . . . . . . . 28
2.3 The three penguin species in palmerpenguins. Photos by KB Gorman. Used with permission . . . 32
2.4 Diagram of penguin head with indication of bill length and bill depth (from Horst, Hill, and Gorman (2020), used with permission) . . . 32
2.5 Temperature and elevation scatter plot . . . . . . . . . . . . . . . . . . . . 35
2.6 TRI dataframe – DT datatable output . . . . . . . . . . . . . . . . . . . . 35
2.7 Crude river map using x y coordinates . . . . . . . . . . . . . . . . . . . . 46
2.8 Longitudinal profile built from cumulative distances and elevation . . . . . 48

3.1 Visualization of some abstracted data from the EPA Toxic Release Inventory 55
3.2 Euc-Oak paired plot runoff and erosion study (Thompson, Davis, and Oliphant (2016)) . . . . . 60
3.3 Eucalyptus/Oak paired site locations . . . . . . . . . . . . . . . . . . . . . 62
3.4 Tukey boxplot of runoff under eucalyptus canopy . . . . . . . . . . . . . . 62

4.1 Flipper length by mass and by species, base plot system. The Antarctic peninsula penguin data set is from @palmer. . . . 80
4.2 Simple bar graph of meadow vegetation samples . . . . . . . . . . . . . . . 81
4.3 Distribution of NDVI, Knuthson Meadow . . . . . . . . . . . . . . . . . . . 83
4.4 Distribution of Average Monthly Temperatures, Sierra Nevada . . . . . . . 83
4.5 Cumulative Distribution of Average Monthly Temperatures, Sierra Nevada 84
4.6 Density plot of NDVI, Knuthson Meadow . . . . . . . . . . . . . . . . . . . 85
4.7 Comparative density plot using alpha setting . . . . . . . . . . . . . . . . . 85
4.8 Runoff under eucalyptus and oak in Bay Area sites . . . . . . . . . . . . . 86
4.9 Boxplot of runoff by site . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.10 Runoff at Bay Area Sites, colored as eucalyptus and oak . . . . . . . . . . 87
4.11 Marble Valley, Marble Mountains Wilderness, California . . . . . . . . . . 88
4.12 Marble Mountains soil gas sampling sites, with surface topographic features and cave passages . . . 89
4.13 Visualizing soil CO2 data with a Tukey box plot . . . . . . . . . . . . . . . 89
4.14 Scatter plot of discharge (Q) and specific electrical conductance (EC) for Sagehen Creek, California . . . 90
4.15 Q and EC for Sagehen Creek, using log10 scaling on both axes . . . . . . . 91
4.16 Setting one color for all points . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.17 Two variables, one discrete . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.18 Using aesthetics settings for both points and lines . . . . . . . . . . . . . . 93
4.19 Color set within aes() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

4.20 Streamflow (Q) and specific electrical conductance (EC) for Sagehen Creek, colored by temperature . . . 95
4.21 Channel slope as range from green to red, vertices sized by elevation . . . 96
4.22 Channel slope as range of line colors on a longitudinal profile . . . . . . . . 96
4.23 Channel slope by longitudinal distance as scatter points colored by slope . 97
4.24 Trend line with a linear model . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.25 EPA TRI, categorical symbology for industry sector . . . . . . . . . . . . . 99
4.26 Using log scales instead of transforming . . . . . . . . . . . . . . . . . . . . 100
4.27 NDVI symbolized by vegetation in two seasons . . . . . . . . . . . . . . . . 101
4.28 Eucalyptus and oak: rainfall and runoff . . . . . . . . . . . . . . . . . . . . 101
4.29 Faceted graph alternative to color grouping (note that the y scale is the same for each) . . . 102
4.30 Titles added . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.31 Pairs plot for Sierra Nevada stations variables . . . . . . . . . . . . . . . . 104
4.32 Enhanced GGally pairs plot for palmerpenguin data . . . . . . . . . . . . . 104

5.1 Color classified by phenology, data created by a pivot . . . . . . . . . . . . 113


5.2 Euc vs oak graphs created using a pivot . . . . . . . . . . . . . . . . . . . 114
5.3 Runoff/rainfall scatterplot colored by tree, created by pivot and binding rows . . . . . . . . 114
5.4 Flux tower installed at Loney Meadow, 2016. Photo credit: Darren Blackburn . . . 117
5.5 free-y facet graph supported by pivot (note the y axis scaling varies among
variables) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.6 Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

6.1 A simple ggplot2 map built from scratch with hard-coded data as simple
feature columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2 Using an sf class to build a map in ggplot2, displaying an attribute . . . . 127
6.3 Base R plot of one attribute from two states . . . . . . . . . . . . . . . . . 128
6.4 Points created from a dataframe with Simple Features . . . . . . . . . . . 129
6.5 Simple plot of SpatVector point data with labels (note that overlapping
labels may result, as seen here) . . . . . . . . . . . . . . . . . . . . . . . . 131
6.6 ggplot of twostates and stations . . . . . . . . . . . . . . . . . . . . . . . . 132
6.7 Base R plot of twostates and stations SpatVectors . . . . . . . . . . . . . . 133
6.8 A simple plot of polygon data by default shows all variables . . . . . . . . 134
6.9 A single map with a legend is produced when a variable is specified . . . . 134
6.10 Points created from data frame with coordinate variables . . . . . . . . . . 137
6.11 Plotting SpatVector data with base R plot system . . . . . . . . . . . . . . 138
6.12 Features added to the map using the base R plot system . . . . . . . . . . 139
6.13 Using maptiles for a base map . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.14 Converted sf data for map with tiles . . . . . . . . . . . . . . . . . . . . . 141
6.15 Simple plot of a worldwide SpatRaster of 30-degree cells, with SpatVector
of CA and NV added . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.16 Stream raster converted from stream features, with 30 m cells from an
elevation raster template . . . . . . . . . . . . . . . . . . . . . . . . . . 144
6.17 Shuttle Radar Topography Mission (SRTM) image of Virgin River Canyon
area, southern Utah . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.18 simple ggplot map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.19 labels added . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.20 repositioned legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

6.21 Using bbox to zoom into two counties . . . . . . . . . . . . . . . . . . . . . 149


6.22 Rasters displayed in ggplot by converting to points . . . . . . . . . . . . . 150
6.23 tmap of the world . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.24 tmap fill colored by variable . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.25 hillshade, borders and point symbols in tmap . . . . . . . . . . . . . . . . 152
6.26 Two western states with a basemap in tmap . . . . . . . . . . . . . . . . . 153
6.27 Leaflet map showing the location of the SFSU Institute for Geographic
Information Science with choices of basemaps . . . . . . . . . . . . . . . . 155
6.28 View (interactive) mode of tmap with selection of basemaps . . . . . . . . 157
6.29 Observations of Adélie penguin migration from a 5-season study of a large
colony at Ross Island in the SW Ross Sea, Antarctica; and an individual
(H36CROZ0708) from season 0708. Data source: Ballard et al. (2019). Fine-scale
oceanographic features characterizing successful Adélie penguin foraging in the
SW Ross Sea. Marine Ecology Progress Series 608:263-277. . . . . . . . . . 158
6.30 tmap View mode (goal) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

7.1 Plotting filtered data: above 2,000 m and 38°N latitude with a basemap . . 165
7.2 A Bodie scene, from Bodie State Historic Park (https://www.parks.ca.gov/) 165
7.3 Sierra data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.4 Northern Sierra stations and places . . . . . . . . . . . . . . . . . . . . . . 169
7.5 California county centroids . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7.6 Map scaled to cover Bay Area tracts using a bbox . . . . . . . . . . . . . . 171
7.7 Nile River points, colored by channel slope . . . . . . . . . . . . . . . . . . 173
7.8 Nile River channel slope as range of colors from green to red, with great
circle channel distances derived using the haversine method . . . . . . . . 173
7.9 Selection of soil CO2 sampling sites, July 1995 . . . . . . . . . . . . . . . . 174
7.10 Selection of soil CO2 and in-cave water samples . . . . . . . . . . . . . . . 176
7.11 Distance from CO2 samples to closest streams (not including lakes) . . . . 177
7.12 Distance to towns (places) from weather stations . . . . . . . . . . . . . . 178
7.13 100 m trail buffer, Marble Mountains . . . . . . . . . . . . . . . . . . . . . 179
7.14 Unioned trail buffer, dissolving boundaries . . . . . . . . . . . . . . . . . . 180
7.15 Intersection of trail and stream buffers . . . . . . . . . . . . . . . . . . . . 181
7.16 Union of two sets of buffer polygons . . . . . . . . . . . . . . . . . . . . . . 181
7.17 Cropping with specified x and y limits . . . . . . . . . . . . . . . . . . . . 182
7.18 TRI points with census variables added via a spatial join . . . . . . . . . . 183
7.19 Transect Buffers (goal) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

8.1 Marble Mountains (California) elevation . . . . . . . . . . . . . . . . . . . 188


8.2 Slope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
8.3 Aspect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
8.4 Classified slopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
8.5 Hillshade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
8.6 Map algebra conversion of elevations from metres to feet . . . . . . . . . . 191
8.7 Boolean: slope > 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8.8 Boolean intersection: (slope > 20) * (elev > 2000) . . . . . . . . . . . . . . 192
8.9 Stream distance raster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
8.10 Random points in the Marble Valley area, Marble Mountains, California . 196
8.11 Points colored by geology extracted from raster . . . . . . . . . . . . . . . 197
8.12 Elevation by stream distance, colored by geology, random point extraction 197

8.13 Dissolved calcium carbonate grouped by geology extracted at water sample
points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
8.14 Slope by elevation colored by extracted geology . . . . . . . . . . . . . . . 198
8.15 Logarithm of calcium carbonate total hardness at sample points, showing
geologic units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
8.16 9x9 focal mean of elevation . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.17 Hillshade of 9x9 focal mean of elevation . . . . . . . . . . . . . . . . . . . . 201
8.18 Marble Mountains geology raster . . . . . . . . . . . . . . . . . . . . . . . 202
8.19 Modal geology in 9 by 9 neighborhoods . . . . . . . . . . . . . . . . . . . . 202
8.20 Geology and elevation by stream and trail distance (goal) . . . . . . . . . . 204

9.1 Precipitation map in Teale Albers in Sierra counties . . . . . . . . . . . . . 206


9.2 Voronoi polygons around Sierra stations . . . . . . . . . . . . . . . . . . . 207
9.3 Precipitation mapped by Voronoi polygon . . . . . . . . . . . . . . . . . . 208
9.4 Rasterized Voronoi polygons . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.5 Nearest neighbor interpolation of precipitation . . . . . . . . . . . . . . . . 210
9.6 IDW interpolation, power = 2 . . . . . . . . . . . . . . . . . . . . . . . . . 212
9.7 IDW interpolation, power = 1 . . . . . . . . . . . . . . . . . . . . . . . . . 213
9.8 Linear trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9.9 2nd order polynomial, precipitation . . . . . . . . . . . . . . . . . . . . . . 216
9.10 Third order polynomial, temperature . . . . . . . . . . . . . . . . . . . . . 217
9.11 Third order polynomial with extremes flattened . . . . . . . . . . . . . . . 217
9.12 Third order local polynomial, precipitation . . . . . . . . . . . . . . . . . . 218
9.13 Variogram of precipitation at Sierra weather stations . . . . . . . . . . . . 219
9.14 Fitted variogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.15 Spherical fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
9.16 Exponential model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.17 Ordinary Kriging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.18 Voronoi polygons of precipitation (goal) . . . . . . . . . . . . . . . . . . . 224
9.19 IDW (goal) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

10.1 Tukey boxplot by group . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231


10.2 Marble Mountains average soil carbon dioxide per site . . . . . . . . . . . 231
10.3 Random uniform histogram . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.4 Random normal histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.5 Random normal density plot . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.6 Random normal plotted against random uniform . . . . . . . . . . . . . . 234
10.7 Scatter plot illustrating negative correlation . . . . . . . . . . . . . . . . . 235
10.8 Pairs plot with r values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.9 NDVI by phenology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
10.10 Runoff under eucalyptus and oak in Bay Area sites . . . . . . . . . . . . . 241
10.11 Runoff at various sites contrasting euc and oak . . . . . . . . . . . . . . . . 241
10.12 East Bay sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
10.13 Eucalyptus and oak sediment runoff box plots . . . . . . . . . . . . . . . . 243
10.14 Facet density plot of eucalyptus and oak sediment runoff . . . . . . . . . . 244
10.15 Water sampling in varying lithologies in a karst area . . . . . . . . . . . . 247
10.16 Total hardness from dissolved carbonates at water sampling sites in Upper
Sinking Cove, TN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
10.17 Sinking Cove dissolved carbonates as total hardness by lithology . . . . . . 248
10.18 Upper Sinking Cove (Tennessee) stratigraphy . . . . . . . . . . . . . . . . 250
10.19 Sinking Cove dissolved carbonates as TH and elevation by lithology . . . . 250

11.1 Original February temperature data . . . . . . . . . . . . . . . . . . . . . . 258


11.2 Temperature predicted by elevation model . . . . . . . . . . . . . . . . . . 258
11.3 Temperature predicted by elevation raster . . . . . . . . . . . . . . . . . . 259
11.4 Residuals of temperature from model predictions by elevation . . . . . . . 260
11.5 Meandering river . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
11.6 Braided river . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
11.7 Anastomosed river . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.8 Q vs S with stream type . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.9 Landslide in San Pedro Creek watershed . . . . . . . . . . . . . . . . . . . 270
11.10 Landslides in San Pedro Creek watershed . . . . . . . . . . . . . . . . . . . 271
11.11 Sediment source analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
11.12 Raw random points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
11.13 Landslides and buffers to exclude from random points . . . . . . . . . . . . 273
11.14 Landslides and random points (excluded from slide buffers) . . . . . . . . . 273
11.15 Logistic model prediction of 1983 landslide probability . . . . . . . . . . . 276
11.16 Black-footed albatross counts, July 2006 . . . . . . . . . . . . . . . . . . . 277
11.17 Prediction of temperature from elevation (one of two goals) . . . . . . . . 279
11.18 Prediction raster (goal) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280

12.1 Four bands of a Sentinel-2 scene from 20210628. . . . . . . . . . . . . . . . 283


12.2 R-G-B image from Sentinel-2 scene 20210628. . . . . . . . . . . . . . . . . 284
12.3 Color image from Sentinel-2 of Red Clover Valley, 20210628. . . . . . . . . 285
12.4 NIR-R-G image from Sentinel-2 of Red Clover Valley, 20210628. . . . . . . 285
12.5 Relations between Red and NIR bands, Red Clover Valley Sentinel-2 image,
20210628 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.6 Spectral signature of nine-level training polygons, 20 m Sentinel-2 imagery
from 20210628. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
12.7 Spectral signature of seven-level training polygons, 20 m Sentinel-2 imagery
from 20210628. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
12.8 Spectral signature of six-level training polygons, 20 m Sentinel-2 imagery
from 20210628 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.9 NDVI from Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . . . . . 290
12.10 NDVI histogram, Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . . 291
12.11 NDMI from Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . . . . . 292
12.12 NDMI histogram, Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . 292
12.13 NDGI from Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . . . . . 293
12.14 NDGI histogram, Sentinel-2 image, 20210628 . . . . . . . . . . . . . . . . . 294
12.15 Unsupervised k-means classification, Red Clover Valley, Sentinel-2,
20210628 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
12.16 Training samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
12.17 CART Decision Tree, Sentinel-2 20 m, date 20210628 . . . . . . . . . . . . 297
12.18 CART classification, probabilities of each class, Sentinel-2 20 m 20210628 . 298
12.19 CART classification, highest probability class, Sentinel-2 20 m 20210628 . 299
12.20 10 m CART regression tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
12.21 CART classification, probabilities of each class, Sentinel-2 10 m 20210628 . 305
12.22 CART classification, highest probability class, Sentinel-2 10 m 20210628 . 306
12.23 CART decision tree, Sentinel 10-m, spring and summer 2021 images . . . . 310
12.24 CART classification, probabilities of each class, Sentinel-2 10 m, 2021 spring
and summer phenology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
12.25 CART classification, highest probability class, Sentinel-2 10 m, 2021 spring
and summer phenology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

12.26 Classification of Sentinel-2 20 m image . . . . . . . . . . . . . . . . . . . . 313


12.27 Classification of Sentinel-2 10 m spring and summer images . . . . . . . . 314

13.1 Red Clover Valley eddy covariance flux tower installation . . . . . . . . . . 319
13.2 Loney Meadow net ecosystem exchange (NEE) results (Blackburn et al.
2021) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.3 Time series of Nile River flows . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.4 Decomposition of Mauna Loa CO2 data . . . . . . . . . . . . . . . . . . . 322
13.5 Seasonal decomposition of time series using loess (stl) applied to CO2 . . . 322
13.6 San Francisco monthly highs and lows as time series . . . . . . . . . . . . . 323
13.7 SF data with yearly period . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
13.8 Greenhouse gases with 20 year observations, so 0.05 annual frequency . . . 325
13.9 Monthly sunspot activity from 1749 to 2013 . . . . . . . . . . . . . . . . . 326
13.10 Monthly sunspot activity from 1940 to 1970 . . . . . . . . . . . . . . . . . 326
13.11 Sunspots of the first 20 years of data . . . . . . . . . . . . . . . . . . . . . 327
13.12 11-year sunspot cycle decomposition . . . . . . . . . . . . . . . . . . . . . 328
13.13 San Pedro Creek E. coli time series . . . . . . . . . . . . . . . . . . . . . . 331
13.14 Decomposition of weekly E. coli data, annual period (frequency 52) . . . . 331
13.15 Moving average (order=15) of E. coli data . . . . . . . . . . . . . . . . . . 333
13.16 GHG CO2 time series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
13.17 Moving average (order=7) of CO2 time series . . . . . . . . . . . . . . . . 334
13.18 Random variation seen by subtracting moving average . . . . . . . . . . . 335
13.19 Decomposition using stl of a 15th-order moving average of E. coli data . . 336
13.20 Marble Mountains resurgence data logger design . . . . . . . . . . . . . . . 336
13.21 Marble Mountains resurgence data logger equipment . . . . . . . . . . . . 337
13.22 Data logger data from the Marbles resurgence . . . . . . . . . . . . . . . . 338
13.23 stl decomposition of Marbles water level time series . . . . . . . . . . . . . 339
13.24 Flux tower installed at Loney Meadow, 2016. Photo credit: Darren Blackburn . . . 339
13.25 Facet plot with free y scale of Loney flux tower parameters . . . . . . . . . 341
13.26 Scatter plot of Bugac solar radiation and air temperature . . . . . . . . . . 342
13.27 Solstice 8-day time series of solar radiation and temperature . . . . . . . . 343
13.28 Bugac solar radiation and temperature . . . . . . . . . . . . . . . . . . . . 345
13.29 Manaus ensemble averages with error bars . . . . . . . . . . . . . . . . . . 346
13.30 Facet graph of Marble Mountains resurgence data (goal) . . . . . . . . . . 348

14.1 New Shiny Document dialog . . . . . . . . . . . . . . . . . . . . . . . . . . 352


14.2 Shiny Document Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
14.3 Old Faithful geyser eruptions Shiny interface . . . . . . . . . . . . . . . . . 353
14.4 numericInput and renderPrint code . . . . . . . . . . . . . . . . . . . . . . 354
14.5 Numeric and slider inputs and print outputs . . . . . . . . . . . . . . . . . 355
14.6 Plot modified by input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.7 Radio buttons and check boxes . . . . . . . . . . . . . . . . . . . . . . . . 356
14.8 Simple Inline app . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
14.9 Simple inline coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
14.10 Rendered data table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
14.11 Text entry and rendered text . . . . . . . . . . . . . . . . . . . . . . . . . . 361
14.12 Rendered box plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
14.13 Shiny app of Sierra climate data, with multiple tabs available . . . . . . . 362
14.14 MODIS fire detection Shiny app . . . . . . . . . . . . . . . . . . . . . . . . 368
1
Background, Goals and Data

1.1 Environmental Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms
and systems to extract knowledge and insights from noisy, structured and unstructured data
(Wikipedia). A data science approach is especially suitable for applications involving large
and complex data sets, and environmental data is a prime example, with rapidly growing
collections from automated sensors in space and time domains.

Environmental data science is data science applied to environmental science research. In
general, data science can be seen as being the intersection of math and statistics, computer
science/IT, and some research domain, and in this case it’s environmental (Figure 1.1).

FIGURE 1.1 Environmental data science

1.2 Environmental Data and Methods

The methods needed for environmental research are wide-ranging because environmental
data are so varied, including environmental measurements in space and time domains.
These methods include:


• data analysis and transformation methods


– importing and other methods to create data frames
– reorganization and creation of fields
– filtering observations
– data joins
– reorganizing data, including pivots
• visualization
– graphics
– maps
– imagery
• spatial analysis
– vector and raster spatial analysis
∗ spatial joins
∗ distance analysis
∗ overlay analysis
∗ terrain modeling
– spatial statistics
– image analysis
• statistical summaries, tests and models
– statistical summaries and visualization
– stratified/grouped summaries
– confirmatory statistical tests
– physical, statistical and machine learning models
– classification models
• temporal data and time series
– analyzing and visualizing long-term environmental data
– analyzing and visualizing high-frequency data from loggers

1.3 Goals

While the methodological reach of data science is very great, and the spectrum of
environmental data is as well, our goal is to lay the foundation and provide useful
introductory methods in the areas outlined above. As a “live” book, it can also extend into
more advanced methods and provide a growing suite of research examples with associated
data sets. We’ll briefly explore some data mining methods that can be applied to so-called
“big data” challenges, but our focus is on exploratory data analysis in general, applied to
environmental data in space and time domains. For clarity in understanding the methods and
products, much of our data will in fact be quite small, derived from field-based
environmental measurements where we can best understand how the data were collected, but
these methods extend to much larger data sets. It will primarily be in the areas of time
series and imagery, where automated data capture and machine learning are employed, that
we’ll dip our toes into big data.

1.3.1 Some definitions:

Machine Learning: building a model using training data in order to make predictions
without being explicitly programmed to do so. Related to artificial intelligence methods.
Used in:

• image and imagery classification, including computer vision methods


• statistical modeling
• data mining
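To make the machine learning definition concrete, here is a minimal sketch of the train-then-predict idea using only base R and the built-in airquality data. The choice of a simple linear model, the 70/30 split, and the variable names are all assumptions made for illustration; they are not methods taken from later chapters.

```r
# Assumed example: predict ozone from temperature with a model "trained" on
# one part of the data and evaluated on the rest (base R only).
aq <- na.omit(airquality[, c("Ozone", "Temp")])   # keep complete cases

set.seed(1)                                       # repeatable random split
n         <- nrow(aq)
train_idx <- sample(n, size = round(0.7 * n))     # about 70% for training
train     <- aq[train_idx, ]
test      <- aq[-train_idx, ]

# Build the model from the training data only
fit <- lm(Ozone ~ Temp, data = train)

# Make predictions for observations the model has not seen
pred <- predict(fit, newdata = test)

# Root-mean-square error on the held-out rows as a simple skill measure
rmse <- sqrt(mean((test$Ozone - pred)^2))
rmse
```

Holding back part of the data is what lets us judge predictions rather than just the fit; more flexible models would follow the same pattern.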

Data Mining: discovering patterns in large data sets

• databases collected by government agencies


• imagery data from satellite, aerial (including drone) sensors
• time-series data from long-term data records or high-frequency data loggers
• methods may involve machine learning, artificial intelligence and computer vision

Big Data: data having a size or complexity too big to be processed effectively by traditional
software

• data with many cases or dimensions (including imagery)


• many applications in environmental science due to the great expansion of automated
environmental data capture in space and time domains
• big data challenges exist across the spectrum of the environmental research process,
from data capture and storage to sharing, visualization and querying

Exploratory Data Analysis: procedures for analyzing data, techniques for interpreting
the results of such procedures, ways of structuring data to make its analysis easier

• summarizing
• restructuring
• visualization

1.4 Exploratory Data Analysis

Just as exploration is a part of what National Geographic has long covered, it’s an impor-
tant part of geographic and environmental science research. Exploratory data analysis
is exploration applied to data, and has grown as an alternative approach to traditional
statistical analysis. This basic approach perhaps dates back to the work of Thomas Bayes
in the eighteenth century, but Tukey (1962) may have best articulated the basic goals of
this approach in defining the “data analysis” methods he was promoting: “Procedures for
analyzing data, techniques for interpreting the results of such procedures, ways of planning
the gathering of data to make its analysis easier, more precise or more accurate, and all the
machinery and results of (mathematical) statistics which apply to analyzing data.” Some
years later Tukey (1977) followed up with Exploratory Data Analysis.

Exploratory data analysis (EDA) is an approach to analyzing data via summaries and graph-
ics. The key word is exploratory, and while one might view this in contrast to confirmatory
statistics, in fact they are highly complementary. The objectives of EDA include (a) suggest-
ing hypotheses; (b) assessing assumptions on which inferences will be based; (c) selecting
appropriate statistical tools; and (d) guiding further data collection. This philosophy led to
the development of S at Bell Labs (led by John Chambers, 1976), then to R.
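As a rough illustration of EDA via summaries and graphics, the following sketch uses the airquality data set that ships with R (daily New York measurements, May to September 1973). It is only an assumed example of the general approach, chosen because it runs in a base installation.

```r
# Summaries: structure of the data and the distribution of one variable
str(airquality)
summary(airquality$Ozone)            # includes a count of missing values

# Grouped summary: mean ozone by month (rows with missing ozone are dropped)
monthly <- aggregate(Ozone ~ Month, data = airquality, FUN = mean)
monthly

# Graphics: distributions by group, then a relationship worth a hypothesis
boxplot(Ozone ~ Month, data = airquality,
        xlab = "Month", ylab = "Ozone (ppb)")
plot(Ozone ~ Temp, data = airquality,
     xlab = "Temperature (degrees F)", ylab = "Ozone (ppb)")
```

The scatter plot is the kind of product that serves objective (a): it suggests a hypothesis about ozone and temperature that a confirmatory test or model could then examine.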

1.5 Software and Data

First, we’re going to use the R language, designed for statistical computing and graphics. It’s
not the only way to do data analysis – Python is another important data science language
– but R with its statistical foundation is an important language for academic research,
especially in the environmental sciences.

## [1] "This book was produced in RStudio using R version 4.2.1 (2022-06-23 ucrt)"

For a start, you’ll need to have R and RStudio installed, then you’ll need to install various
packages to support specific chapters and sections.

• In Introduction to R (Chapter 2), we will mostly use the base installation of R, with
a few packages to provide data and enhanced table displays:
– igisci
– palmerpenguins
– DT
– knitr
• In Abstraction (Chapter 3) and Transformation (Chapter 5), we’ll start making a
lot of use of tidyverse (Section 3.1) packages such as:
– ggplot2
– dplyr
– stringr
– tidyr
– lubridate
• In Visualization (Chapter 4), we’ll mostly use ggplot2, but also some specialized visu-
alization packages such as:
– GGally
• In Spatial (starting with Chapter 6), we’ll add some spatial data, analysis and mapping
packages:
– sf
– terra
– tmap
– leaflet
• In Statistics and Modeling (starting with Chapter 10), no additional packages are
needed, as we can rely on base R’s rich statistical methods and ggplot2’s visualization.

• In Time Series (Chapter 13), we’ll find a few other packages handy:
– xts (Extensible Time Series)
– forecast (for a few useful functions like a moving average)

There will certainly be other packages we’ll explore along the way, so you’ll want to
install them when you first need them. That will typically be when you first see a library()
call in the code, when a function is prefaced with the package name (something like
dplyr::select()), or when R raises an error that it can’t find a function you’ve called or
that the package isn’t installed. One of the earliest we’ll need is the suite of packages in
the “tidyverse” (Wickham and Grolemund 2016), which includes some of the ones listed
above: ggplot2, dplyr, stringr, and tidyr. You can install these individually, or all at
once with:

`install.packages("tidyverse")`

This is usually done from the console in RStudio and not included in an R script or mark-
down document, since you don’t want to be installing the package over and over again. You
can also respond to a prompt from RStudio when it detects a package called in a script you
open that you don’t have installed.
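If you do want a script to take care of its own requirements without reinstalling every time, one possible pattern is to check first and install only what is missing. The helper name missing_packages() below is hypothetical (it is not part of any package), and the package names are placeholders so the sketch runs on a base installation.

```r
# Hypothetical helper: return the names of packages that are not installed.
# requireNamespace() returns FALSE (quietly) when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("stats", "utils")      # placeholders: list what your script uses
to_get <- missing_packages(needed)

# Install only what is missing, so re-running the script does not reinstall
if (length(to_get) > 0) {
  install.packages(to_get)
}
```

With this guard, running the script a second time does nothing for packages that are already in place.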

From time to time, you’ll want to update your installed packages. That usually happens
when something doesn’t work, perhaps because a change in one package has broken another
package that depends on it. Fortunately, in the R world, especially at the main
repository at CRAN, there’s a lot of effort put into making sure packages work together, so
usually there are no surprises if you’re using the most current versions. There can be
exceptions: occasionally new package versions create problems with other packages due to
inter-package dependencies or the introduction of functions whose names duplicate those in
other packages. The packages installed for this book were current as of the version of R
noted above (4.2.1), but newer package versions may occasionally introduce errors.
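When something stops working, it helps to know which versions you actually have. The sketch below uses the base stats package as a stand-in for whatever package you are checking, and the minimum version shown is an arbitrary assumption for illustration.

```r
# Report the running R version and the version of an installed package
R.version.string
pkg_version <- packageVersion("stats")   # substitute the package of interest
pkg_version

# Version objects can be compared against a required minimum (assumed here)
pkg_version >= "4.0.0"

# Checking for and installing updates needs a network connection, so these
# are left commented out; run them from the console when you choose to update.
# old.packages()
# update.packages(ask = FALSE)
```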

Once a package like dplyr is installed, you can access all of its functions and data by adding
a library call, like …

library(dplyr)

… which you will want to include in your code, or to provide access to multiple libraries in
the tidyverse, you can use library(tidyverse). Alternatively, if you’re only using maybe one
function out of an installed package, you can call that function with the :: separator, like
dplyr::select(). This method has another advantage in avoiding problems with duplicate
names – and for instance we’ll generally call dplyr::select() this way.
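The duplicate-name problem can be seen without installing anything. In this assumed example we deliberately create our own function called sd, which masks the one in the stats package, and then use the :: separator to reach the original.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# A user-defined function that happens to reuse the name of stats::sd()
sd <- function(x) "not the standard deviation"

sd(x)          # the masking definition in the workspace is found first
stats::sd(x)   # the package-qualified call is unambiguous

# Data sets in a package can be reached the same way, without library()
mean(datasets::airquality$Temp)

rm(sd)         # remove the masking definition so sd() means stats::sd() again
```

The same thing happens between packages: whichever was loaded last wins for an unqualified name, which is why a qualified call such as dplyr::select() is the safer habit.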

1.5.1 Data

We’ll be using data from various sources, including data packages on CRAN, which you
install the same way as the code packages above, for example with
install.packages("palmerpenguins").

We’ve also created a repository on GitHub that includes data we’ve developed in the Insti-
tute for Geographic Information Science (iGISc) at SFSU, and you’ll need to install that
package a slightly different way.

GitHub packages require a bit more work on the user’s part since we need to first install
remotes,¹ then use that to install the GitHub data package:

install.packages("remotes")
remotes::install_github("iGISc/igisci")

Then you can access it just like other built-in data by including:

library(igisci)

To see what’s in it, list the various datasets with:

data(package="igisci")

For instance, Figure 1.2 is a map of California counties using the CA_counties sf feature
data. We’ll be looking at the sf (Simple Features) package later in the Spatial section of the
book, but since the code below calls library(sf), this is one place where you’d need to have
installed another package, with install.packages("sf").

library(tidyverse); library(igisci); library(sf)


ggplot(data=CA_counties) + geom_sf()

The package datasets can be used directly as sf data or data frames. And similarly to
functions, you can access the (previously installed) data set by prefacing with igisci:: this
way, without having to load the library. This might be useful in a one-off operation:

mean(igisci::sierraFeb$LATITUDE)

## [1] 38.3192

Raw data such as .csv files can also be read from the extdata folder that is installed on
your computer when you install the package, using code such as:

csvPath <- system.file("extdata","TRI/TRI_1987_BaySites.csv", package="igisci")
TRI87 <- read_csv(csvPath)

¹ Note: you can also use devtools instead of remotes if you have that installed. They do the
same thing here; remotes is a subset of devtools. If you see a message about Rtools, you can
ignore it, since Rtools is only needed for building packages from source code such as C++.
FIGURE 1.2 California counties simple features data in igisci package

or something similar for shapefiles, such as:

shpPath <- system.file("extdata","marbles/trails.shp", package="igisci")
trails <- st_read(shpPath)

And we’ll find that including most of the above arcanity in a function will help. We’ll look
at functions later, but here’s a function that we’ll use a lot for setting up reading data from
the extdata folder:

ex <- function(dta){system.file("extdata",dta,package="igisci")}

And this ex() function is needed so often that it’s included in the igisci package, so if you
have library(igisci) in effect, you can just use it like this:

trails <- st_read(ex("marbles/trails.shp"))

But how do we see what’s in the extdata folder? We can’t use the data() function, so we
would have to dig for the folder where the igisci package gets installed, which is buried
pretty deeply in your user profile. So I wrote another function, exfiles(), that creates a
data frame showing all of the files and the paths to use. In RStudio you could access it
with View(exfiles()), or we could use a datatable (you’ll need to have installed “DT”).
You can pass any of those paths to the ex() function inside any function that needs it to
read data, like read.csv(ex('CA/CA_ClimateNormals.csv')), or just enter that ex() call in the
console, like ex('CA/CA_ClimateNormals.csv'), to display where on your computer the installed
data reside.

DT::datatable(exfiles(), options=list(scrollX=TRUE), rownames=FALSE)
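
To give a sense of what a function like exfiles() has to do, here is a minimal sketch built on system.file() and list.files(). The name list_extdata() and its column names are hypothetical, and the real exfiles() in igisci may well be written differently:

```r
# Hypothetical stand-in for exfiles(); not the igisci implementation
list_extdata <- function(package = "igisci") {
  # Locate the installed extdata folder; "" if the package or folder is missing
  root <- system.file("extdata", package = package)
  if (root == "") {
    return(data.frame(file = character(0), path = character(0)))
  }
  # recursive = TRUE walks subfolders and returns paths relative to extdata
  files <- list.files(root, recursive = TRUE)
  data.frame(file = basename(files), path = files)
}

head(list_extdata())
```

The path column holds paths relative to extdata, which is the form the ex() function expects.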



1.6 Acknowledgements

This book was immensely aided by extensive testing by students in San Francisco State’s
GEOG 604/704 Environmental Data Science class, including specific methodological
contributions from some of the students and a contributed data wrangling exercise in
Chapter 5 by one from the first offering (Josh von Nonn). Thanks to Andrew Oliphant, Chair
of the Department of Geography and Environment, for supporting the class (as long as I
included time series) and then coming through with some great data sets from eddy
covariance flux towers as well as guest lectures. Many thanks to Adam Davis, California
Energy Commission, for suggestions on R spatial methods and package development, among
other things in the R world. Thanks to Anna Studwell, recent Associate Director of the
IGISc, for ideas on statistical modeling of birds and marine environments, and the nice
watercolor for the front cover. And a lot of thanks goes to Nancy Wilkinson, who put up
with my obsessing on R coding puzzles at all hours and pretended to be impressed with
what you can do with R Markdown.

Introduction to Environmental Data Science © Jerry D. Davis, ORCID 0000-0002-5369-1197,
Institute for Geographic Information Science, San Francisco State University, all rights
reserved.

Introduction to Environmental Data Science by Jerry Davis is licensed under a Creative
Commons Attribution 4.0 International License.

Cover art “Dandelion fluff – Ephemeral stalk sheds seeds to the universe” by Anna Studwell.
Taylor & Francis Group, http://taylorandfrancis.com
