Chapter 2-Data Preparation

Uploaded by

Alexsandra Tosta

CHAPTER 2

Data preparation

Preparing data before analysis

Before ecological data can be analysed, they need to be prepared and put into the right format. Data that are entered in the wrong format cannot be analysed or will yield wrong results.
Different statistical programs require data in different formats. You should consult the manual of the statistical software to find out how data need to be prepared. Alternatively, you could check example datasets. An example of data preparation for the R package is presented at the end of this chapter.
Before you embark on the data analysis, it is essential to check for mistakes in data entry. If you detect mistakes later in the analysis, you will need to start the analysis again and could lose considerable time. Mistakes in data entry can often be detected as exceptional values. The best procedure for analysing your data is therefore to start by checking the data.

An example of species survey data

Imagine that you are interested in investigating the hypothesis that soil depth influences tree species diversity. The data that will allow you to test this hypothesis are data on soil depth and data on diversity collected for a series of sample plots. We will see in a later chapter that diversity can be estimated from information on the species identity of every tree. Figure 2.1 shows species and soil depth data for the first four sample plots that were inventoried (to test the hypothesis, we need several sample plots that span the range from shallow to deep soils). For site A, three species were recorded (S1, S2 and S3) and a soil depth of 1 m. For site B, only two species were recorded (S1 with four trees and S3 with one tree) and a soil depth of 2 m.

Figure 2.1 A simplified example of information recorded on species and environmental data.


The species information from Figure 2.1 can be recorded as follows:

Site  Species S1 (count)  Species S2 (count)  Species S3 (count)
A     1                   1                   1
B     4                   0                   1
C     2                   2                   0
D     0                   1                   2

The environmental information from Figure 2.1 can be recorded in a similar fashion:

Site  Soil depth (m)
A     1.0
B     2.0
C     0.5
D     1.5

This chapter deals with the preparation of data matrices such as the two matrices given above. Note that the example of Figure 2.1 is simplified: typical species matrices have more than 100 rows and more than 100 columns. These matrices can be used as input for the analyses shown in the following chapters. They can be generated by a decent data management system. These matrices are usually not the ideal method of capturing, entering and storing data. Recording species data in the field is typically done with data collection forms that are filled for each site separately and that contain tables with a single column for the species name and a single column for the abundance. This is also the ideal method of storing species data.

A general format for species survey data

As seen above, all information can be recorded in the form of data matrices. All the types of data that are described in this manual can be prepared as two matrices: the species matrix and the environmental matrix. Table 2.1 shows a part of the species matrix for a well-studied dataset in community ecology, the dune meadow dataset. This dataset contains 30 species of which only 13 are presented. The data were collected on the vegetation of meadows on the Dutch island of Terschelling (Jongman et al. 1995). Table 2.2 shows the environmental data for this dataset.
Notice that the rows of both matrices have the same names – they reflect the data that were collected for each site or sample unit. Sites could be sample plots, sample sites, farms, biogeographical provinces, or other identities. Sites are defined as the areas from which data were collected during a specific time period. We will use the term “site” further on in this manual. Sites will always refer to the rows of the datasets.
Some studies involve more than one type of sampling unit, often arranged hierarchically, for example villages, farms within a village and plots within a farm. Sites of different types (such as plots, villages and districts) should not be mixed within the same data matrix. Each site of the matrix should be of the same type of sampling unit.
The columns of the matrices indicate the variables that were measured for each site. The cells of the matrices contain observations – bits of data recorded for a specific site and a specific variable. We prefer using rows to represent samples and columns to represent variables over the alternative form where rows represent variables. Our preference is simply based on the fact that some general statistical packages use this format. Data can equally be presented with rows and columns swapped, since the contents of the data remain the same.
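The step from field records (one line per site and species) to the two matrices can be sketched in R. This is only a minimal illustration that uses the counts and soil depths recorded for Figure 2.1; the object names `records`, `species` and `env` are hypothetical and chosen for this example.

```r
# Field records in "long" form: one row per site-species observation
# (counts taken from the species table for Figure 2.1).
records <- data.frame(
  site    = c("A", "A", "A", "B", "B", "C", "C", "D", "D"),
  species = c("S1", "S2", "S3", "S1", "S3", "S1", "S2", "S2", "S3"),
  count   = c(1, 1, 1, 4, 1, 2, 2, 1, 2)
)

# Species matrix: sites as rows, species as columns.
# Site-species combinations that were never recorded become explicit zeros.
species <- as.data.frame.matrix(xtabs(count ~ site + species, data = records))

# Environmental matrix with the same row names as the species matrix.
env <- data.frame(soil.depth = c(1.0, 2.0, 0.5, 1.5),
                  row.names  = c("A", "B", "C", "D"))

# Both matrices must describe the same sites in the same order.
stopifnot(identical(rownames(species), rownames(env)))

print(species)

# Swapping rows and columns changes the layout, not the contents.
print(t(species))
```

Storing the records in the long form and deriving the matrix when needed follows the advice above that the matrix itself is rarely the best format for data entry.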
Table 2.1 An example of a species matrix, where rows correspond to sites, columns correspond to species and cell entries are the abundance of the
species at a particular site

Site Achmil Agrsto Airpra Alogen Antodo Belper Brarut Brohor Calcus Chealb Cirarv Elepal Elyrep …
X1 1 0 0 0 0 0 0 0 0 0 0 0 4 …
X2 3 0 0 2 0 3 0 4 0 0 0 0 4 …
X3 0 4 0 7 0 2 2 0 0 0 0 0 4 …
X4 0 8 0 2 0 2 2 3 0 0 2 0 4 …
X5 2 0 0 0 4 2 2 2 0 0 0 0 4 …
X6 2 0 0 0 3 0 6 0 0 0 0 0 0 …
X7 2 0 0 0 2 0 2 2 0 0 0 0 0 …
X8 0 4 0 5 0 0 2 0 0 0 0 4 0 …
X9 0 3 0 3 0 0 2 0 0 0 0 0 6 …
X10 4 0 0 0 4 2 2 4 0 0 0 0 0 …
X11 0 0 0 0 0 0 4 0 0 0 0 0 0 …
X12 0 4 0 8 0 0 4 0 0 0 0 0 0 …
X13 0 5 0 5 0 0 0 0 0 1 0 0 0 …
X14 0 4 0 0 0 0 0 0 4 0 0 4 0 …
X15 0 4 0 0 0 0 4 0 0 0 0 5 0 …
X16 0 7 0 4 0 0 4 0 3 0 0 8 0 …
X17 2 0 2 0 4 0 0 0 0 0 0 0 0 …
X18 0 0 0 0 0 2 6 0 0 0 0 0 0 …
X19 0 0 3 0 4 0 3 0 0 0 0 0 0 …
X20 0 5 0 0 0 0 4 0 3 0 0 4 0 …

Table 2.2 An example of an environmental matrix, where rows correspond to sites and columns correspond to variables (columns: Site, A1, Moisture, Management, Use, Manure)

The species matrix

The species data are included in the species matrix. This matrix shows the values for each species and for each site (see data collection for various types of samples). For example, the value of 5 was recorded for species Agrostis stolonifera (coded as Agrsto) and for site 13. Another name for this matrix is the community matrix.
The species matrix often contains abundance values – the number of individuals that were counted for each species. Sometimes species data reflect the biomass recorded for each species. Biomass can be approximated by percentage cover (typical for surveys of grasslands) or by cross-sectional area (the surface area of the stem, typical for forest surveys). Some survey methods do not collect precise values but collect values that indicate a range of possible values, so that data collection can proceed faster. For instance, the value of 5 recorded for species Agrostis stolonifera and for site 13 indicates a range of 5-12.5% in cover percentage. The species matrix should not contain a range of values in a single cell, but a single number (the database can contain the range that is used to calculate the coding for the range). An extreme method of collecting data that only reflect a range of values is the presence-absence scale, where a value of 0 indicates that the species was not observed and a value of 1 shows that the species was observed.
A site will often only contain a small subset of all the species that were observed in the whole survey. Species distribution is often patchy. Species data will thus typically contain many zeros. Some statistical packages require that you are explicit that a value of zero was collected – otherwise the software could interpret an empty cell in a species matrix as a missing value. Such a missing value will not be used for the analysis, so you could obtain erroneous results if the data were recorded as zero but treated as missing.
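The difference between a recorded zero and a missing value can be made concrete with a small R sketch. It assumes the counts of species S2 from Figure 2.1 and pretends that the zero for site B was left blank at data entry; the object names are hypothetical.

```r
# Counts of species S2 at sites A-D (from Figure 2.1), with the zero for
# site B left blank at data entry and therefore read as missing (NA).
s2.blank <- c(A = 1, B = NA, C = 2, D = 1)

# With the blank treated as missing, site B is dropped from the calculation.
mean.missing <- mean(s2.blank, na.rm = TRUE)   # average over 3 sites

# Making the zero explicit restores the intended data.
s2.zero <- s2.blank
s2.zero[is.na(s2.zero)] <- 0
mean.zero <- mean(s2.zero)                     # average over all 4 sites

print(c(missing = mean.missing, zero = mean.zero))

# Presence-absence scale: 1 if the species was observed, 0 otherwise.
s2.pa <- as.numeric(s2.zero > 0)
print(s2.pa)
```

Replacing missing values by zero is only justified when a blank really means that the species was looked for and not found.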

The environmental matrix

The environmental dataset is more typical of the type of dataset that a statistical package normally handles. The columns in the environmental dataset contain the various environmental variables. The rows indicate the sites for which the values were recorded. The environmental variables can be referred to as explanatory variables for the types of analysis that we describe in this manual. Some people prefer to call these variables independent variables, and others prefer the term x variables. For instance, the information on the thickness of the A1 horizon of the dune meadow dataset shown in Table 2.2 can be used as an explanatory variable in a model that explains where species Agrostis stolonifera occurs. The research hypotheses will have indicated which explanatory variables were recorded, since an infinite number of environmental variables could be recorded at each site.
The environmental dataset will often contain two types of variables: quantitative variables and categorical variables.
Quantitative variables such as the thickness of the A1 horizon of Table 2.2 contain observations that are measured quantities. The observation for the A1 horizon of site 1 was for example recorded by the number 2.8. Various statistics can be calculated for quantitative variables that cannot be calculated for categorical variables. These include:
• The mean or average value
• The standard deviation (this value indicates how close the values are to the mean)
• The median value (the middle value when values are sorted from low to high) (synonyms for this value are the 50% quantile or 2nd quartile)
• The 25% and 75% quantiles = 1st and 3rd quartiles (the values for which 25% or 75% of values are smaller when values are sorted from low to high)
• The minimum value
• The maximum value

For the thickness of the A1 horizon of Table 2.2, we obtain the following summary statistics.

Min.    1st Qu.  Median  Mean   3rd Qu.  Max.
2.800   3.500    4.200   4.850  5.725    11.500

These statistics summarize the values that were obtained for the quantitative variable. Another method by which the values for a quantitative variable can be summarized is a boxplot graph as shown in Figure 2.2. The whiskers show the minimum and maximum of the dataset, except if some values lie farther than 1.5 × the interquartile range (the difference between the 1st and 3rd quartile) from the box, that is, below the 1st or above the 3rd quartile. Such values are plotted as separate points. Note that various software packages, or options within such packages, will result in different statistics being portrayed in boxplot graphs – you may want to check the documentation of your particular software package. An important feature of Figure 2.2 is that it shows that there are some outliers in the dataset. If your data are normally distributed, then you would only rarely (less than 1% of the time) expect to observe an outlier. If the boxplot indicates outliers, check whether you entered the data correctly (see the section on checking for exceptional observations).
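The statistics listed above can be computed in a few lines of R. The sketch below is an illustration under one loud assumption: the 20 values of `A1` were typed in by hand as recalled from the `dune.env` data distributed with the vegan package, so check them against Table 2.2 before relying on them. The 1.5 × interquartile range rule is the default convention of R's `boxplot()`, which uses hinges that can differ slightly from the quartiles computed here.

```r
# ASSUMPTION: A1 horizon thickness for the 20 sites, typed in by hand as
# recalled from vegan's dune.env dataset; verify against Table 2.2.
A1 <- c(2.8, 3.5, 4.3, 4.2, 6.3, 4.3, 2.8, 4.2, 3.7, 3.3,
        3.5, 5.8, 6.0, 9.3, 11.5, 5.7, 4.0, 4.6, 3.7, 3.5)

print(summary(A1))   # minimum, quartiles, median, mean and maximum
print(sd(A1))        # standard deviation

# Values further than 1.5 x the interquartile range beyond the quartiles
# are the ones a boxplot draws as separate points.
quartiles   <- quantile(A1, c(0.25, 0.75))
iqr         <- quartiles[2] - quartiles[1]
lower.fence <- quartiles[1] - 1.5 * iqr
upper.fence <- quartiles[2] + 1.5 * iqr
outliers    <- A1[A1 < lower.fence | A1 > upper.fence]
print(outliers)

boxplot(A1, ylab = "Thickness of A1 horizon")
```

Computing the fences by hand makes it explicit which observations the boxplot singles out, which is helpful when you need to trace them back to the raw data.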

Figure 2.2 Summary of a quantitative variable as a boxplot. The variable that is summarized is the thickness of the
A1 horizon of Table 2.2.

Figure 2.3 Summary of a quantitative variable as a Q-Q plot. The variable that is summarized is the thickness of the
A1 horizon of Table 2.2. The two outliers (upper right-hand side) correspond to the outliers of Figure 2.2.
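A Q-Q plot such as the one in Figure 2.3, and a histogram of the same variable, could be drawn with base R functions as sketched below. As an assumption for this illustration, the `A1` values are typed in by hand as recalled from vegan's `dune.env` dataset and should be verified against Table 2.2.

```r
# ASSUMPTION: A1 horizon thickness as recalled from vegan's dune.env
# dataset; verify against Table 2.2.
A1 <- c(2.8, 3.5, 4.3, 4.2, 6.3, 4.3, 2.8, 4.2, 3.7, 3.3,
        3.5, 5.8, 6.0, 9.3, 11.5, 5.7, 4.0, 4.6, 3.7, 3.5)

# Normal Q-Q plot: observed values against the quantiles expected under a
# normal distribution, with a reference line through the quartiles.
qq <- qqnorm(A1, main = "Q-Q plot of A1 horizon thickness")
qqline(A1)

# The site with the largest theoretical quantile is the most extreme
# observation on the upper right-hand side of the plot.
most.extreme <- which.max(qq$x)
print(most.extreme)

# A histogram gives another view of the same exceptional observations.
hist(A1, main = "A1 horizon thickness", xlab = "Thickness")
```

The value returned by `which.max()` is the row number of the site, which makes it easy to look up the corresponding field record.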

There are other graphical methods for checking for outliers for quantitative variables. One of these methods is the Q-Q plot. When data are normally distributed, all observations should be plotted roughly along a straight line. Outliers will be plotted further away from the line. Figure 2.3 gives an example. Another method to check for outliers is to plot a histogram. The key point is to check for the exceptional observations.
Categorical variables (or qualitative variables) are variables that contain information on data categories. The observations for the type of management for the dune meadow dataset (presented in Table 2.2) have four values: “standard farming”, “biological farming”, “hobby farming” and “nature conservation management”. The observation for the type of management is thus not a number. In statistical textbooks, categorical variables are also referred to as factors. Factors can only contain a limited number of factor levels.
The only way by which categorical variables can be summarized is by listing the number of observations or frequency of each category. For instance, the summary for the management variable of Table 2.2 could be presented as:

Category      BF  HF  NM  SF
observations   3   5   6   6

Graphically, the summary can be represented as a barplot. Figure 2.4 shows an example for the management of Table 2.2.

Figure 2.4 Summary of a categorical variable by a bar plot. The management of Table 2.2 is summarized.

Some researchers record observations of categorical variables as a number, where the number represents the code for a specific type of value – for instance code “1” could indicate “standard farming”. We do not encourage the usage of numbers to code for factor levels since statistical software and analysts can confuse the variable with a quantitative variable. The statistical software could report erroneously that the average management type is 2.55, which does not make sense. It would definitely be wrong to conclude that the average management type would be 3 (the integer value closest to 2.55) and thus be hobby farming. A better way of recording categorical variables is to include characters. You are then specific that the value is a factor level – you could for instance use the format of “c1”, “c2”, “c3” and “c4” to code for the four management regimes. Even better techniques are to use meaningful abbreviations for the factor levels – or to just use the entire description of the factor level, since most software will not have any problems with long descriptions and you will avoid confusing collaborators or even yourself at later stages.
Ordinal variables are somewhere between quantitative and categorical variables. The manure variable of the dune meadow dataset is an ordinal variable. Ordinal variables are not measured on a quantitative scale but the order of the values is informative. This means for manure that progressively more manure is used from manure class 0 to class 4. However, since the scale is not quantitative, a value of 4 does not mean that four times more manure is used than for value 1 (if it did, then we would have a quantitative variable). For the same reason manure class 3 is not the average of manure class 2 and 4.
You can actually choose whether you treat ordinal variables as quantitative or categorical variables in the statistical analysis. In many statistical packages, when the observations of a variable only contain numbers, the package will assume that the variable is a quantitative variable. If you want the variable to be treated as a categorical variable, you will need to inform the statistical package about this (for example by using a non-numerical coding system). If you are comfortable assuming for the analysis that the ordinal variables were measured on a quantitative scale, then it is better to treat them as quantitative variables. Some special methods for ordinal data are also available.

Checking for exceptional observations that could be mistakes

The methods of summarizing quantitative and categorical data that were described in the previous section can be used to check for exceptional data. Maximum or minimum values that do not correspond to the expectations will easily be spotted. Figure 2.5 for instance shows a boxplot for the A1 horizon that contained a data entry error for site 3 as the value 43 was entered instead of 4.3. Compare with Figure 2.2. You should be aware of the likely ranges of all quantitative variables.
Some mistakes for categorical data can easily be spotted by calculating the frequencies of observations for each factor level. If you had entered “NN” instead of “NM” for one management observation in the dune meadow dataset, then a table with the number of observations for each management type would easily reveal that mistake. This method is especially useful when the number of observations is fixed for each level. If you designed your survey so that each type of management should have 5 observations, then spotting one type of management with 4 observations and one type with 1 observation would reveal a data entry error.
Some exceptional observations will only be spotted when you plot variables against each other as part of exploratory analysis, or even later when you start conducting some statistical analysis. Figure 2.6 shows a plot of all possible pairs of the environmental variables of the dune meadow dataset. You can notice the two outliers for the thickness of the A1 horizon, which occur at moisture category 4 and manure category 1, for instance.
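The coding advice for categorical and ordinal variables, and the two checks described in this section, can be sketched in R as follows. Several things here are assumptions made for illustration: the order of the management observations is arbitrary (only the frequencies 3, 5, 6 and 6 come from the text), the numeric codes for BF and NM are inferred so that the mean equals the 2.55 quoted above, the manure classes are made-up values, the `A1` values are recalled from vegan's `dune.env` dataset, and the plausible upper limit of 15 is hypothetical.

```r
# Management types with the frequencies given in the text.
management <- rep(c("BF", "HF", "NM", "SF"), times = c(3, 5, 6, 6))

# Numeric coding (SF = 1 and HF = 3 follow the text; BF = 2 and NM = 4 are
# inferred) allows a meaningless "average management type" to be computed.
codes <- c(SF = 1, BF = 2, HF = 3, NM = 4)
mean.code <- mean(codes[management])
print(mean.code)

# Stored as a factor, the variable is summarized by frequencies instead.
print(table(factor(management)))

# An ordinal variable can be declared as an ordered factor
# (illustrative manure classes, not the real data).
manure <- factor(c(0, 1, 2, 4, 3, 2), levels = 0:4, ordered = TRUE)
print(manure[4] > manure[2])   # the order of the classes is meaningful

# Check 1: a mistyped level shows up as an unexpected category.
entered <- management
entered[10] <- "NN"            # simulate the typing error from the text
print(table(entered))
unexpected <- setdiff(unique(entered), c("BF", "HF", "NM", "SF"))

# Check 2: values outside the plausible range of a quantitative variable.
# ASSUMPTION: A1 values as recalled from vegan's dune.env dataset.
A1 <- c(2.8, 3.5, 4.3, 4.2, 6.3, 4.3, 2.8, 4.2, 3.7, 3.3,
        3.5, 5.8, 6.0, 9.3, 11.5, 5.7, 4.0, 4.6, 3.7, 3.5)
A1.entered <- A1
A1.entered[3] <- 43            # 43 entered instead of 4.3 for site 3
suspect <- which(A1.entered < 0 | A1.entered > 15)   # hypothetical limits
print(suspect)
```

Range limits such as the one used in the second check should come from your knowledge of the likely values of each variable, not from the data themselves.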

Figure 2.5 Checking for exceptional observations.


After having spotted a potential mistake, you need to record immediately where the potential mistake occurred, especially if you do not have time to directly check the raw data. You can include a text file where you record potential mistakes in the folder where you keep your data. Alternatively, you could give the cell in the spreadsheet where you keep a copy of the data a bright colour. Yet another method is to add an extra variable in your dataset where comments on potential mistakes are listed. However, the best method is to directly check and change your raw data (if a mistake is found). Always record the changes that you have made and the reasons for them. Note that an observation that looks odd but which cannot be traced to a mistake should not be changed or assumed to be missing. If it is clearly a nonsense value, but no explanation can be found, then it should be omitted. If it is just a strange value then various courses are open to you. You can try analysing the data with and without the observation to check if it makes a big difference to results. You might have to go back to the field and take the measurement again, finding a field explanation if the odd value is repeated.
Do not get confused when you have various datasets in various stages of correction. Commonly scientists end up with several versions of each data file and lose track of which is which. The best method is to have only one dataset, of which you make regular backups.
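Two of the suggestions above, keeping comments in an extra variable and analysing the data with and without an odd observation, can be sketched in R. The `A1` values are an assumption (recalled from vegan's `dune.env` dataset), and the cut-off of 9 used to flag observations is a hypothetical choice made only to pick out the two largest values.

```r
# ASSUMPTION: A1 values as recalled from vegan's dune.env dataset.
A1 <- c(2.8, 3.5, 4.3, 4.2, 6.3, 4.3, 2.8, 4.2, 3.7, 3.3,
        3.5, 5.8, 6.0, 9.3, 11.5, 5.7, 4.0, 4.6, 3.7, 3.5)

# Keep comments on potential mistakes in an extra variable instead of
# altering or deleting the observations.
env <- data.frame(A1 = A1, comment = "", stringsAsFactors = FALSE)
odd <- env$A1 > 9    # hypothetical cut-off for "odd" values
env$comment[odd] <- "unusually thick A1 horizon, check field form"

# Summaries with and without the flagged observations.
with.all    <- c(mean = mean(env$A1),       median = median(env$A1))
without.odd <- c(mean = mean(env$A1[!odd]), median = median(env$A1[!odd]))
print(rbind(with.all, without.odd))
```

If the two rows differ little, the odd observations have limited influence on that summary; a large difference is a reason to investigate them further before drawing conclusions.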

Figure 2.6 Checking for exceptional data by pairwise comparisons of the variables of Table 2.2.

Methods of transforming the values in the matrices

There are many ways in which the values of the species and environmental matrices can be transformed. Some methods were developed to make data conform more closely to the normal distribution. What transformation you use will depend on your objectives and what you want to assume about the data. For several types of analysis described in later chapters you do not need to transform the species matrix, and most analyses do not actually require the explanatory variables to be normally distributed. It is therefore not good practice to always transform explanatory variables to be normally distributed. Moreover, in many cases it will not be possible to find a transformation that will result in normally distributed data.
We recommend only transforming variables if you have a good reason to investigate a particular pattern that will be revealed by the transformation. For example, an extreme way of transforming the species matrix is to change the values to 1 if the species is present and 0 if the species is absent. The subsequent analysis will thus not be influenced by differences in species’ abundances. By comparing the results of the analysis of the original data with the results from the transformed data, you can get an idea of the influence of differences in abundance on the results. If one species dominates and the ordination results are only influenced by that one species, then you could use a logarithmic or square-root transformation to diminish the influence of the dominant species – again this means that there is a good reason for the transformation, and it should not be a standard approach. The fact that the results are influenced by the dominant species is actually a clear demonstration of an important pattern in your dataset.
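The transformations mentioned above can be tried on the small species matrix of Figure 2.1 using base R only. This is a sketch for illustration; the object names are hypothetical, and the last step (proportions of the site total) is added as one further common transformation.

```r
# Species matrix of Figure 2.1 (sites as rows, species as columns).
species <- matrix(c(1, 1, 1,
                    4, 0, 1,
                    2, 2, 0,
                    0, 1, 2),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(c("A", "B", "C", "D"),
                                  c("S1", "S2", "S3")))

# Presence-absence: 1 if the species is present, 0 if it is absent.
presence.absence <- (species > 0) * 1

# Logarithmic and square-root transformations reduce the weight of
# dominant species; log(x + 1) keeps zeros at zero.
log.transformed  <- log(species + 1)
sqrt.transformed <- sqrt(species)

# Proportion of each site's total abundance contributed by each species.
profiles <- species / rowSums(species)

print(presence.absence)
print(round(profiles, 2))
```

Running the same analysis on `species` and on one of the transformed matrices shows how much the result depends on differences in abundance.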

Examples of the analysis with the menu options of Biodiversity.R

See in chapter 3 how data can be loaded from an external file:
Data > Import data > from text file…
Enter name for dataset: data (choose any name)
Click “OK”
Browse for the file and click on it

To save data to an external file:

Data > Active Dataset > export active dataset…
File name: export.txt (choose any name)

Select the species and environmental matrices:

Biodiversity > Environmental Matrix > Select environmental matrix
Select the dune.env dataset
Biodiversity > Community matrix > Select community matrix
Select the dune dataset

To summarize the data and check for exceptional cases:

Biodiversity > Environmental Matrix > Summary…
Select variable: A1
Click “OK”
Click “Plot”

Examples of the analysis with the command options of Biodiversity.R

To load data from an external file:
# Read a file by its path (adjust the path to your own file)
data <- read.table(file="D:/my files/data.txt")
# Or choose the file interactively
data <- read.table(file.choose())

To save data to an external file:

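# Write to a file by its path (adjust the path to your own file)
write.table(data, file="D:/my files/data.txt")
# Or choose an existing file interactively
write.table(data, file.choose())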

To summarize the data and check for exceptional cases:

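# The dune meadow datasets are distributed with the vegan package
library(vegan)
data(dune)
data(dune.env)
summary(dune.env)
boxplot(dune.env$A1)
points(mean(dune.env$A1),cex=1.5)   # add the mean to the boxplot
table(dune.env$Management)
plot(dune.env$Management)           # bar plot of a categorical variable
pairs(dune.env)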

To transform the data:

dune.ln.transformed <- log(dune+1)
dune.squareroot.transformed <- dune^0.5
# decostand() is provided by the vegan package
dune.speciesprofile <- decostand(dune,"total")
dune.env$A1.standard <- scale(dune.env$A1)

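Checking whether data are normally distributed:

library(car)   # qqPlot() is the current name of the older qq.plot()
qqPlot(dune.env$A1)
shapiro.test(dune.env$A1)
# Compare with a normal distribution that has the sample mean and standard deviation
ks.test(dune.env$A1, "pnorm", mean(dune.env$A1), sd(dune.env$A1))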
