
Palaeontological community and diversity analysis

– brief notes

Oyvind Hammer
Paläontologisches Institut und Museum, Zürich
[email protected]

Zürich, June 3, 2002


Contents

1 Introduction

2 The basics of palaeontological community analysis

3 Comparing samples

4 Cluster analysis

5 Ordination and gradient analysis

6 Diversity

7 Curve fitting

8 Time series analysis

Bibliography

Chapter 1

Introduction

Palaeontology is becoming a quantitative subject, like other sciences such as biology and geology.
Palaeontologists are increasingly expected to support their conclusions with statistics and large databases.
Palaeontology is also moving in the direction of becoming more analytical, in the sense that fossil
material is used to answer questions about environment, evolution and ecology. A quick survey of recent
issues of journals such as Paleobiology or Lethaia will show this very clearly indeed.
In addition to hypothesis testing, quantitative analysis serves other important purposes. One of them
is sometimes referred to as ’fishing’ (or data mining), that is searching for unknown patterns in the data
that may give us new ideas. Finally, quantitative methods will often indicate that the material is not
sufficient, and can give information about where we should collect more data.
This short text will describe some methods for quantitative treatment of palaeontological material
with respect to community analysis, biogeography and biodiversity. In practice, this means using com-
puters. It goes without saying that all the methods rely on good data. With incomplete or inaccurate data,
almost any result can be achieved. Quantitative analysis has not made ’old fashioned’ data collection
redundant - quite the opposite is true.
In this little course we cannot go into any depth about the mathematical and statistical basis for the
methods - the aim is to give a practical overview. But if such methods are used in work that is to be
published, it is important that you know more about the underlying assumptions and the possible pitfalls.
For such information I can only refer to the literature.

The PAST software


In this course we will use a simple, but relatively comprehensive computer program for Windows, called
PAST (PAlaeontological STatistics). This program has been designed for teaching, and contains a col-
lection of data analysis methods that are commonly used by palaeontologists in a user-friendly package.
It can also be used for ’real’ work. PAST is free, and can be downloaded to your own computer from this
address:
http://folk.uio.no/ohammer/past/

Here you will also find the manual for the program, and a number of ’case studies’ demonstrating
the different methods. Some of these examples will be used in the course.

Chapter 2

The basics of palaeontological community analysis

The starting point for most community analysis is the occurrence matrix. This table consists of rows
representing samples, and columns representing taxa (often species). The occurrences of the taxa in the
different samples can be in the simple form of presence or absence, conventionally coded with a 1 or a
0, or they can be given in terms of specimen counts (abundance data). Whether to use presence/absence
or abundance depends on the material and the aims of the analysis, but it is generally preferable to try
to collect abundance data if possible. Abundance can easily be converted to presence/absence, but the
converse is impossible!

              A. expansus   P. ridiculos   M. limbata

Quarry A          143             13            0
Outcrop X          12             56            3
Outcrop Y           9             73           14
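As an illustration, here is a minimal Python sketch (using the numbers from the table above) of how such an occurrence matrix can be stored and reduced from abundance to presence/absence; the reverse conversion is of course not possible.

import numpy as np

# Abundance (specimen count) occurrence matrix: rows are samples, columns are taxa.
samples = ["Quarry A", "Outcrop X", "Outcrop Y"]
taxa = ["A. expansus", "P. ridiculos", "M. limbata"]
abundance = np.array([
    [143, 13,  0],
    [ 12, 56,  3],
    [  9, 73, 14],
])

# Abundance data can always be reduced to presence/absence (1/0),
# but presence/absence cannot be converted back to abundances.
presence = (abundance > 0).astype(int)
print(presence)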

The samples (rows) may come from different localities that are supposed to be of the same age, or
they may come from different levels in a single or composite section, in which case the rows should
be arranged in stratigraphical order. The former type of data forms the basis of biogeographical and
ecological studies, while the latter involves geological time and is unique to palaeontology. The analysis
of stratigraphically ordered samples borders upon biostratigraphy, but pure biostratigraphical analysis
with the purpose of correlation will not be treated in this text.
In practice, most occurrence matrices have a number of features that can cause problems for analysis.
They are normally sparse, meaning that they have many zero entries. They are almost always noisy,
meaning that if there is some structure present it will normally be degraded by errors, taxonomical
confusion, missing data and other ’random’ variation. And they are commonly redundant, meaning that
many samples will have similar taxon composition and many taxa will have similar distributions on
samples.
Ideally, a sample should represent an unbiased random selection of individuals that actually lived to-
gether in the same place and at the same time (a census). This is rarely the case in palaeontology, where
post-mortem transportation and time-averaging due to slow sedimentation and bioturbation cause mixing


of fossil communities both in space and time. In addition, sorting and differential preservation potential
can severely bias both presence-absence data and (even more) abundance data. This can invalidate some
assumptions of some statistical tests, but it does not invalidate the whole field of palaeontological com-
munity analysis. In many cases the samples probably do represent unbiased selections within a fossil
group at least. Unless there has been some very selective and serious hydrodynamical sorting, we can
hope that for example a sample of gastropod shells within a limited size range is relatively unbiased.
Time-averaging on the order of a few hundred or perhaps even a few thousand years is not necessarily
detrimental if the communities were reasonably stable throughout this time. Still, we need to always
keep these potential problems in mind.
The analysis of occurrence matrices can take many different directions. We may simply want to
compare samples in a pairwise manner, to test statistically whether two samples should be considered
to have different compositions. Large numbers of samples may be divided into groups according to
similarity (cluster analysis), and these groups may be interpreted in terms of biogeographical regions or
facies. Samples may also be ordered in a continuum according to their taxon content (ordination), which
can be interpreted in terms of an environmental gradient.
So far we have discussed similarities between samples, which we can refer to as sample-centered
analysis (also known as Q-mode analysis). We could also compare taxa, and look at what species tend
to co-occur. This can be called taxon-centered (or R-mode) analysis.
The occurrence matrix represents a multivariate data set, where each data point (sample) is described
using a number of values (taxon occurrences). Analysis is much simplified if we can reduce this data
set by extracting a single parameter for each sample, describing some aspect of its taxon composition.
Many such parameters are used by ecologists, attempting to measure qualities such as species richness
or dominance (the numerical dominance of one or a few species). When such a parameter, for example
number of species, is extracted for a number of samples in stratigraphical order, we have a univariate
time series which can be analyzed in order to detect trends or cycles, perhaps associated with changes in
climate or sea level.
All these methods will be covered in the course, with examples from the ’real world’.
Chapter 3

Comparing samples

The comparison of samples from different localities or stratigraphical levels forms the basis of much of
community analysis. Such comparison can be done within a stringent statistical framework by using the
Chi-squared test, or we can use a ’heuristic’ distance measure.

The Chi-squared test (counted data in categories)


The Chi-squared (χ²) test is designed for comparing two samples consisting of the number of occur-
rences within a set of categories. This makes it an appropriate method for testing whether two samples
with taxon abundance data are likely to have been taken from the same community. The test gives the
probability that the samples are the same, and if this value is very low (say p < 0.05) we can say that the
null hypothesis of the samples being taken from the same community is rejected at a significance level
of 0.05. As with most statistical tests we can sometimes reject the null hypothesis of equality, but we can
never confirm it with statistical significance.
The Chi-squared test assumes that all taxon abundance values within a sample are larger than 5. A
proper execution of this test may therefore require the removal of rare taxa or fossil-poor samples.
Another detail concerns the number of constraints, which needs to be set by the user. This would
normally be left at 0, but should be set to 1 if the abundance data have been normalized to percentage
values.
As an example, consider the data from the Llanvirnian (Ordovician) as presented in Case Study
8. Abundance data have been collected at ten levels in a section. For statistically testing whether the
samples come from the same population using the Chi-squared test, we first remove the rarest taxa, but
the nature of this data set is such that we must retain some abundance values with less than 5 specimens.
This possible source of error must be kept in mind, but the Chi-square test should be reasonably robust to
this invalidation of the assumptions. Comparing successive samples, we find that the null hypothesis of
the samples coming from the same population can be rejected at p < 0.05 for all the successive sample
pairs from sample 3 to sample 8. The community thus seems to be changing throughout this interval.
The null hypothesis cannot be rejected for sample pairs 1-2, 2-3, 8-9 and 9-10, meaning that we cannot
assume that these sample pairs are from different populations.
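As a rough illustration of this kind of test (not the Case Study 8 data themselves, which are not reproduced here), the following Python sketch compares two hypothetical abundance samples with scipy's chi-squared contingency test.

import numpy as np
from scipy.stats import chi2_contingency

# Two samples (rows) with counts of the same taxa (columns); the values are hypothetical.
sample_1 = [34, 12,  8, 21]
sample_2 = [40,  5, 15, 18]

# chi2_contingency treats the 2 x n table as a contingency table and returns
# the chi-squared statistic, the p value, the degrees of freedom and the expected counts.
chi2, p, dof, expected = chi2_contingency(np.array([sample_1, sample_2]))
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")

# Reject the null hypothesis of a common parent community if p < 0.05.
# Note that the expected counts should not be much smaller than 5.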


Sample similarity measures for presence/absence data


A large number of heuristic indices have been defined for measuring the distance between two samples
containing taxon occurrences. They can be divided into two groups: those using presence/absence data
and those using abundance data. For presence/absence data the following distance measures should be
mentioned:


Jaccard similarity. A match is counted for all taxa with presences in both samples. Using M for
the number of matches and N for the total number of taxa with presences in just one sample,
we have
Jaccard similarity = M / (M+N)

Dice (Sorensen) coefficient. Puts more weight on joint occurrences (M) than on mismatches.
Dice similarity = 2M / (2M+N)

The Simpson similarity is defined as

Simpson similarity = M / Nmin

where Nmin is the smaller of the numbers of presences in the two samples. This index treats two
associations as identical if one is a subset of the other, making it useful for fragmentary data.

Raup-Crick index for absence-presence data. This index (Raup & Crick 1979) uses a random-
ization ("Monte Carlo") procedure, comparing the observed number of species ocurring in both
samples with the distribution of co-occurrences in 200 pairs of random replicates of the pooled
sample. It is an example of a more general class of similarity index based on bootstrapping (see
the chapter on Diversity).

All these indices range from 0 (no similarity) to 1 (identity). Further information can be found in
Krebs (1989), Magurran (1988) and Ludwig & Reynolds (1988).
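A minimal Python sketch of the three presence/absence indices defined above (the two vectors are hypothetical examples):

import numpy as np

def jaccard(a, b):
    """Jaccard similarity for two presence/absence (0/1) vectors."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    m = np.sum(a & b)              # matches: taxa present in both samples
    n = np.sum(a ^ b)              # taxa present in only one of the samples
    return m / (m + n)

def dice(a, b):
    """Dice (Sorensen) coefficient: joint occurrences weighted double."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    m = np.sum(a & b)
    n = np.sum(a ^ b)
    return 2 * m / (2 * m + n)

def simpson(a, b):
    """Simpson similarity: matches divided by the smaller number of presences."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    m = np.sum(a & b)
    return m / min(a.sum(), b.sum())

x = [1, 1, 1, 0, 0, 1]
y = [1, 1, 0, 0, 1, 1]
print(jaccard(x, y), dice(x, y), simpson(x, y))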

Sample similarity measures for abundance data


The Euclidean distance:

d_jk = sqrt( Σ_i (x_ji - x_ki)² )

where x_ji is the abundance of taxon i in sample j.

Correlation (of the variables along rows) using Pearson's r.

Correlation using Spearman's rho (basically the r value of the ranks).

Bray-Curtis distance measure, sensitive to absolute abundances:

d_jk = Σ_i |x_ji - x_ki| / Σ_i (x_ji + x_ki)

Chord distance for abundance data. This index is sensitive to species proportions and not to abso-
lute abundances. It projects the two multivariate sample vectors onto a hypersphere and measures
the distance between these points, thus normalizing abundances to 1:

d_jk = sqrt( Σ_i (x_ji/|x_j| - x_ki/|x_k|)² ),  where |x_j| = sqrt( Σ_i x_ji² )

Morisita's similarity index for abundance data:

λ_1 = Σ_i x_1i(x_1i - 1) / (N_1(N_1 - 1)),   λ_2 = Σ_i x_2i(x_2i - 1) / (N_2(N_2 - 1))    (3.1)

C_M = 2 Σ_i x_1i x_2i / ((λ_1 + λ_2) N_1 N_2)

where N_j is the total number of individuals in sample j.
This index was recommended by Krebs (1989).

The existence of all these indices is highly confusing. The Euclidean index is often used, but the
Chord distance and the Morisita index may perform better for community analysis. See also Krebs
(1989).

If your samples are characterized by high dominance (overwhelming numerical abundance of one or
a few species), you may choose to take the logarithm of all abundance values before measuring distance.
This will put a smaller weight on the dominant taxa, allowing the rarer taxa to contribute more to the
distance value.
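The following Python sketch implements the abundance-based measures listed above in their standard formulations; the two abundance vectors are simply the example counts from the table in Chapter 2.

import numpy as np

def euclidean(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sqrt(np.sum((x - y) ** 2))

def bray_curtis(x, y):
    """Bray-Curtis distance; sensitive to absolute abundances."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sum(np.abs(x - y)) / np.sum(x + y)

def chord(x, y):
    """Chord distance: Euclidean distance between the abundance vectors
    after normalising each to unit length (sensitive to proportions only)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return euclidean(x / np.linalg.norm(x), y / np.linalg.norm(y))

def morisita(x, y):
    """Morisita's similarity index for counts (1 = identical composition)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n1, n2 = x.sum(), y.sum()
    lam1 = np.sum(x * (x - 1)) / (n1 * (n1 - 1))
    lam2 = np.sum(y * (y - 1)) / (n2 * (n2 - 1))
    return 2 * np.sum(x * y) / ((lam1 + lam2) * n1 * n2)

a = [143, 13, 0]
b = [ 12, 56, 3]
print(euclidean(a, b), bray_curtis(a, b), chord(a, b), morisita(a, b))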
Chapter 4

Cluster analysis

Cluster analysis means finding groupings of samples (or taxa), based on an appropriate distance measure.
Such groups can then be interpreted in terms of biogeography, environment and evolution. Hierarchi-
cal cluster analysis will produce a so-called dendrogram, where similar samples are grouped together.
Similar groups are further combined in ’superclusters’, etc. (fig. 4.1).

Figure 4.1: Dendrogram. The vertical axis is in units of group similarity.

Using cluster analysis of samples, we can for example see whether limestone samples group together
with shale samples, or if samples from Germany group together with those from France or those from
England. For a stratigraphic sequence of samples, we can detect turnover events in the composition of
communities.
We can also cluster taxa (R mode). In this way we can detect associations (or ’guilds’) of taxa,
for example whether a certain brachiopod is usually found together with a certain crinoid. Many of
the distance measures described above for comparing samples can also be used when comparing the
distributions of taxa.
There are several algorithms available for hierarchical clustering. Most of these algorithms are ag-


glomerative, meaning that they cluster the most similar items first, and then proceed by grouping the
most similar clusters until we are left with a single, connected supercluster. In PAST, the following
algorithms are implemented:

Mean linkage, also known as UPGMA (Unweighted Pair-Group Method using Arithmetic averages). Clusters are
joined based on the average distance between all members in the two groups.

Single linkage (nearest neighbour). Clusters are joined based on the smallest distance between the
two groups.

Ward’s method. Clusters are joined such that increase in within-group variance is minimized.
Being based on variance, this method makes most sense using the Euclidean distance measure.

For community analysis, the UPGMA algorithm is recommended. Ward’s method seems to perform
better when the Euclidean distance measure is chosen, but this is not the best distance measure for
community analysis. It may however be useful to compare the dendrograms given by the different
algorithms and different distance measures in order to informally assess the robustness of the groupings.
If a grouping is changed when trying another algorithm, that grouping should perhaps not be trusted.
It must be emphasized that cluster analysis by itself is not a statistical method, in the sense that no
significance values are given. Whether a cluster is ’real’ or not must be more or less informally decided
on the basis of how well it is separated from other clusters (fig. 4.2). One approach may be to decide
a priori on a cut-off value for the across-cluster similarity. More formal tests of significance exist as
extensions to the basic clustering algorithms, but they are not in common use. Significance values based
on testing whether two clusters could have been taken from the same population are not valid, because
these clusters have already been constructed precisely in order to maximize the distance between them.
This would be circular reasoning. Investigating the robustness of the clusters after random perturbations
of the data might be a somewhat more fruitful approach.
More information on cluster analysis can be found in Krebs (1989), Ludwig & Reynolds (1988) and
Jongman et al. (1995).
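As an illustration of how such an analysis might be run outside PAST, here is a hedged Python sketch using scipy: Bray-Curtis distances between samples, followed by UPGMA (average linkage) clustering. The abundance values and the fourth sample ('Quarry B') are hypothetical.

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Rows are samples, columns are taxon abundances (hypothetical values).
labels = ["Quarry A", "Outcrop X", "Outcrop Y", "Quarry B"]
abundance = np.array([
    [143, 13,  0,  2],
    [ 12, 56,  3,  0],
    [  9, 73, 14,  1],
    [130, 20,  1,  4],
])

# Bray-Curtis distances between all pairs of samples, then UPGMA
# ('average' linkage) agglomerative clustering.
distances = pdist(abundance, metric="braycurtis")
tree = linkage(distances, method="average")

# Each row of the merge table lists the two clusters joined, their distance and the new cluster size.
print(labels)
print(tree)
# A dendrogram can be drawn with scipy.cluster.hierarchy.dendrogram(tree, labels=labels).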

Figure 4.2: Dendrogram A shows two well separated clusters, while dendrogram B (with the same
branching topology) is quite unresolved. The groups in dendrogram B must be interpreted with great
caution.

Figure 4.3: Clustering of Ordovician trilobite families, with a distance measure based on the correlation
of their generic diversities in four intervals. From Adrain et al. (1998). This analysis has been part
of the foundation for splitting the Ordovician trilobites into two major ’evolutionary faunas’ (Ibex and
Whiterock). The two clusters are however not very well separated.
Chapter 5

Ordination and gradient analysis

Ordination means ordering the samples or the taxa along a line or placing them in a low-dimensional
space in such a way that distances between them are preserved as far as possible. Concentrating on
samples, each original sample is a data point in a high-dimensional space, with a number of variables
equal to the number of taxa. Ordination means projection of this very complicated data set onto a low-
dimensional space, be it 3D space, a plane or a line. If the variation in the original data set is mostly
controlled by a single environmental gradient, we might be able to find a way of optimally projecting
the points onto a line such that distances between samples are to a large degree preserved. This will
simplify our study of the data, and the line (axis) found by the algorithm may be given an ecological
interpretation.
There are two main types of ecological gradient analysis. Indirect gradient analysis proceeds as
described above, where the gradient axis is found from the data in such a way that distances along the
axis are preserved as much as possible. Direct gradient analysis means analysing the samples in terms of
a gradient that was known a priori, such as a measured temperature or depth gradient. This latter type of
analysis is rarely possible in palaeontology, and we will therefore concentrate on indirect methods.
A thorough introduction to ordination is given by Jongman et al. (1995).

Principal components analysis


Principal components analysis (PCA) is a method that produces hypothetical variables (components),
accounting for as much of the variation in the data as possible. The components are linear combinations
of the original variables. This is a method of data reduction that in well-behaved cases makes it possible
to present the most important aspects of a multivariate data set in two dimensions, in a coordinate system
with axes that correspond to the two most important (principal) components. In addition, these principal
components may be interpreted as reflecting real, underlying environmental variables (fig. 5.1).
PCA is tricky to grasp in the beginning. What is the meaning of those abstract components? Consider
another example, this time from morphometry. We have measured shell size, shell thickness and a
colour index on 1000 foraminiferans of the same species but from different climatic zones. From
these three variables the PCA analysis produces three components. We are told that the first of these
(component A) can explain 73 percent of the variation in the data, the second (B) explains 24 percent, while
the last (C) explains 3 percent. We then assume that component A represents an important hypothetical
variable which may be related to environment.

Figure 5.1: Hypothetical example of PCA. 12 communities have been sampled from the Barents Sea.
Only two species are included (polar bear and walrus). The 12 samples are plotted according to their
species compositions. PCA implies constructing a new coordinate system with the sample centroid at
the origin and with axes normal to each other such that the first axis explains as much of the variation in
the data as possible. In this case, we might for example interpret axis 1 in terms of temperature.


The program also presents the 'loadings' of component A, that is how much each original variable
contributes to the component. Here, A is given as a weighted sum of the three original variables, with a
strongly negative weight on shell size, a positive weight on shell thickness and a weight close to zero on
the colour index. This tells us that A is a hypothetical variable that reduces sharply as shell size
increases, but increases when shell thickness increases. The colour index has almost no correlation with A. We
guess that A is an indicator of temperature. When temperature increases, shell size diminishes (organ-
isms are often larger in colder water), but shell thickness increases (it is easier to precipitate carbonate in
warm water). Plotting the individual specimens in a coordinate system spanned by the first two compo-
nents supports this interpretation: We find specimens collected in cold water far to the left in the diagram
(small A), while specimens from warm water are found to the right (large A).
It is sometimes argued that PCA assumes some statistical properties of the data set such as mul-
tivariate normality and uncorrelated samples. While it is true that violation of these assumptions may
degrade the explanatory strength of the axes, this is not a major worry. PCA, like other indirect ordination
methods, is a descriptive method without statistical significance anyway.
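For readers who want to see the mechanics, here is a minimal Python sketch of PCA by singular value decomposition of the mean-centred data matrix; this is one standard way of computing the components, not necessarily the exact algorithm used in PAST. The data are randomly generated stand-ins for an abundance matrix.

import numpy as np

def pca(data):
    """Principal components analysis by singular value decomposition
    of the mean-centred data matrix (rows = samples, columns = variables)."""
    centred = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u * s                      # sample coordinates on the components
    loadings = vt.T                     # contribution of each variable to each component
    explained = s**2 / np.sum(s**2)     # proportion of variance per component
    return scores, loadings, explained

# Hypothetical data: rows are samples, columns are taxon abundances.
rng = np.random.default_rng(0)
data = rng.poisson(lam=[20, 5, 1, 8], size=(12, 4)).astype(float)

scores, loadings, explained = pca(data)
print("variance explained:", np.round(explained, 2))
print("loadings of component 1:", np.round(loadings[:, 0], 2))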
Ecological ordination involves sorting samples along one, two or sometimes more axes that are ex-
pected to relate to underlying environmental parameters or geography. PCA is a good method for or-
dination, but as we have seen it assumes linear relationships between the components and the original
variables (fig. 5.2). This may sometimes indeed hold true to some extent, but it is also common that an
original variable, such as the number of individuals of a species, displays a peak for a certain value of
the environmental parameter. This value is then referred to as the optimum for the species. For example,
the species may prefer a certain temperature, and become rarer for both higher and lower temperatures
(fig. 5.3).


Figure 5.2: Hypothetical abundance of four species (A-D) along an environmental gradient. Each species
has a linear dependence on the environmental parameter. B is indifferent with respect to the parameter.
Such a linear abundance pattern is assumed by PCA. A figure like this (and the one in fig. 5.3) is called
a coenocline.

Correspondence analysis
Correspondence analysis (CA) is a method for ordination which has been constructed specifically for
situations where different taxa have localized optimal positions on the gradients (fig. 5.3). Like in
PCA, ’hypothetical variables’ are constructed (in decreasing order of importance) which the original data
points can be plotted against. CA can also produce diagrams showing both taxon-oriented (R-mode) and
sample-oriented (Q-mode) ordination simultaneously.
Instead of maximizing the amount of variance along the axes as in PCA, CA maximizes the corre-
spondence between species scores (positions along the gradient axis) and sample scores. To understand
this, it may help to consider one of the possible algorithms for correspondence analysis, known as recip-
rocal averaging. We start with the species in a random order along the ordination axis. The samples are
placed along the axis at positions decided by a weighted mean of the scores of the species they contain.
The species scores are then updated to weighted means of the scores of the samples in which they are
found. In this way, the algorithm goes back and forth between species scores and sample scores until
they have stabilized. It can be shown that this will lead to optimum correspondence between species
scores and sample scores whatever the initial random ordering.
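A bare-bones Python sketch of the reciprocal averaging iteration described above (a simplified illustration only; a full correspondence analysis weights the standardization by the row and column totals and explicitly removes the trivial axis):

import numpy as np

def reciprocal_averaging(abundance, iterations=100):
    """Approximate first correspondence analysis axis by reciprocal averaging.
    abundance: matrix with rows = samples, columns = species (floats).
    Returns (sample_scores, species_scores)."""
    rng = np.random.default_rng(1)
    species = rng.random(abundance.shape[1])         # random initial species scores
    for _ in range(iterations):
        # Sample scores: abundance-weighted means of the species scores.
        samples = abundance @ species / abundance.sum(axis=1)
        # Species scores: abundance-weighted means of the sample scores.
        species = abundance.T @ samples / abundance.sum(axis=0)
        # Standardise to keep the scores from collapsing towards a constant.
        species = (species - species.mean()) / species.std()
    return abundance @ species / abundance.sum(axis=1), species

rng = np.random.default_rng(0)
abund = rng.poisson(lam=5, size=(8, 6)).astype(float) + 1.0   # avoid empty rows/columns
sample_scores, species_scores = reciprocal_averaging(abund)
print(np.round(sample_scores, 2))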
Correspondence analysis can often give diagrams where the data points are organized in a horseshoe-
like shape (the ’arch effect’), and where points towards the edges of the plot are compressed together.
This is to some extent an artefact of the mathematical method, and many practitioners prefer to ’detrend’
and ’rescale’ the result of the CA such that these effects disappear. This is called Detrended Correspon-
dence Analysis (DCA), and is presently the most popular type of ecological ordination (Hill & Gauch
1989). An interesting effect of the rescaling is that the average width of each species response along the
gradient (tolerance) becomes 1. We can then use the total length of an axis to say something about how
well the species are spread out along that gradient (beta diversity). If for example an axis has length 5, it
means that species at one end of the gradient have little or no overlap with those at the other end.


Figure 5.3: Hypothetical abundance of five species (A-E) along an environmental gradient. Each species
has an abundance peak for its optimal living conditions. C has a wide distribution (high tolerance for
variation in the environmental parameter), while E is a specialist with a narrow range. Correspondence
analysis is suitable for this situation.

Correspondence analysis without detrending is also used as the basis for a divisive clustering method
known as TWINSPAN. The first ordination axis is divided in two, and the species/samples on the dif-
ferent sides of the dividing line are assigned to two clusters. This is continued until all clusters are
subdivided into single species/samples.

Other ordination methods


Seriation
Seriation was developed by archeologists, and can be regarded as a simple ordination method for pres-
ence/absence data. Rows and columns in the matrix are moved around in such a way that presences are
concentrated along the diagonal in the matrix. This diagonal can be regarded as an ordination axis, along
which samples (rows) and taxa (columns) are sequenced.

Principal coordinates analysis


Principal coordinates (PCO) analysis starts from distance values between all pairs of data points using
any distance (or similarity) measure. The points are then placed in a low-dimensional space such that
the distances are preserved as far as possible. PCO is also known as metric multidimensional scaling
(MDS).
PCO is attractive because it allows the use of any distance measure, including those based on pres-
ence/absence data. For some reason it is not used much in ecology, and its behaviour is not very well
studied. Quite often it suffers from arch effects similar to those of correspondence analysis.

Figure 5.4: Detrended Correspondence Analysis of five samples from the Silurian of Wales (Case Study
9). The horizontal ordering corresponds to the presumed distance from the coastline, and we therefore
interpret Axis 1 as an onshore-offshore gradient.

Figure 5.5: Detrended Correspondence Analysis of plant fossil communities from the Permian. Sample
ordination to the left, taxon ordination to the right. Axis 1 correlates well with latitude. In the sample
ordination, open symbols are low latitude (China, Euramerica, North Africa and northern South Amer-
ica), filled squares are high southern latitude (Gondwana), and filled triangles and circles are mid- to
high-latitude northern (Russia and Mongolia). From Rees et al. (2002).

Non-parametric multidimensional scaling


Non-parametric multidimensional scaling (NMDS, Kruskal 1964) starts from a ranking (ordering) of
distance values between all pairs of data points using any distance measure. These ranks are then used
in an iterative procedure in order to try to place the points in a low-dimensional space such that ranked
distances are preserved. One of the problems with this procedure is technical: Available algorithms do
not guarantee an optimal ordination solution.
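A hedged Python sketch of non-metric MDS using scikit-learn on a Bray-Curtis distance matrix (the abundance data are randomly generated; several random starts are used because the optimisation may end in a local optimum):

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

# Hypothetical abundance matrix: rows = samples, columns = taxa.
rng = np.random.default_rng(2)
abundance = rng.poisson(lam=[15, 6, 2, 9, 1], size=(10, 5)).astype(float)

# Any distance measure can be used; here Bray-Curtis.
distances = squareform(pdist(abundance, metric="braycurtis"))

# Non-metric MDS: only the rank order of the distances is used.
nmds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
           n_init=10, random_state=0)
coordinates = nmds.fit_transform(distances)
print(coordinates)
print("stress:", nmds.stress_)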

Examples
’Case study’ nr. 7, 9 and 10.

Figure 5.6: Result of seriation. Taxa in rows, samples in columns. Black square means presence.
Chapter 6

Diversity

Diversity is roughly the same as species richness (sometimes the former is used in a general sense, while
the latter refers to the number of species). We can measure diversity in different ways. The simplest
approach is of course simply to count the number of species, but often we would like to include the
distribution of numbers of individuals of the different species. Such diversity indices will vary over time
and space, and can be important environmental indicators.
The somewhat confusing concepts of alpha, beta and gamma diversity need to be briefly explained:

alpha diversity is the local diversity (diversity of one community).

beta diversity is the rate of change in species composition along a gradient.

gamma diversity is the diversity of a region.

Diversity indices
These diversity indices can be calculated in PAST:

Number of taxa (S).

Total number of individuals (n).

Dominance = 1 - Simpson index = Σ_i (n_i/n)², where n_i is the number of individuals of taxon i.
Ranges from 0 (all taxa are equally present) to 1 (one taxon dominates the community completely).

Simpson index = 1 - dominance. Measures ’evenness’ of the community from 0 to 1. Note the
confusion in the literature: Dominance and Simpson indices are often interchanged, and sometimes
they are defined as being reciprocal (Simpson=1/Dominance).

Shannon index (entropy): H = -Σ_i (n_i/n) ln(n_i/n). A diversity index, taking into account the number
of individuals as well as the number of taxa. Varies from 0 for communities with only a single taxon to
high values (up to about 5.0) for communities with many taxa, each with few individuals.

Menhinick’s richness index: S/sqrt(n) - the ratio of the number of taxa to the square root of sample size.
This is an attempt to correct for sample size - larger samples will normally contain more taxa.

Margalef’s richness index: (S - 1)/ln(n), where S is the number of taxa and n is the number of
individuals.

Equitability. Shannon diversity divided by the logarithm of the number of taxa: H/ln(S). This measures
the evenness with which individuals are divided among the taxa present.

Fisher’s alpha - a diversity index, defined implicitly by the formula S = a ln(1 + n/a), where S is the
number of taxa, n is the number of individuals and a is Fisher’s alpha. This index refers to a
parameter in a logarithmic abundance model (see below), and is thus only applicable to samples
where such a model fits.
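A small Python sketch computing most of the indices above for a single sample of abundances (Fisher's alpha is omitted, since it requires solving its implicit equation numerically); the counts are hypothetical:

import numpy as np

def diversity_indices(counts):
    """Common diversity indices for one sample of taxon abundances."""
    counts = np.asarray([c for c in counts if c > 0], dtype=float)
    n = counts.sum()          # total number of individuals
    s = len(counts)           # number of taxa
    p = counts / n            # proportional abundances
    dominance = np.sum(p ** 2)
    shannon = -np.sum(p * np.log(p))
    return {
        "taxa (S)": s,
        "individuals (n)": n,
        "dominance": dominance,
        "Simpson": 1 - dominance,
        "Shannon": shannon,
        "Menhinick": s / np.sqrt(n),
        "Margalef": (s - 1) / np.log(n),
        "equitability": shannon / np.log(s),
    }

print(diversity_indices([143, 13, 7, 2, 1]))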

Discussions of these and other diversity indices are found in Magurran (1988) and Krebs (1989).
The confusing multitude of indices can be approached pragmatically: Use the index you like best
(the one that best supports your theory!), but also check some other indices to see if your conclusions
will change according to the index used. This approach has been formalized by Tothmeresz (1995), who
suggested using a family of diversity indices dependent upon a single continuous parameter. One
example is the so-called Renyi family, which is dependent upon a parameter α as follows:

H_α = ln( Σ_i p_i^α ) / (1 - α)

Here, S is the number of species and p_i is the proportional abundance of species i (the sum runs over
the S species). It can be shown that this index gives the logarithm of the number of species for α = 0, the
Shannon index for α = 1, and a number behaving like the Simpson index for α = 2. We can then plot a
diversity profile for a single sample, letting α vary from say 0 to 4. For comparing the diversities in two
samples, we can plot their diversity profiles in the same figure. If the curves cross, the ordering of the two
samples according to diversity is dependent upon α. The diversities are then said to be non-comparable.
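A minimal Python sketch of such a diversity profile, using the Renyi formula given above (the two samples are hypothetical):

import numpy as np

def renyi_profile(counts, alphas):
    """Renyi diversity H_alpha = ln(sum p_i^alpha) / (1 - alpha) for a set
    of alpha values (alpha = 1 is handled as the Shannon limit)."""
    p = np.asarray(counts, float)
    p = p[p > 0] / p.sum()
    profile = []
    for a in alphas:
        if np.isclose(a, 1.0):
            profile.append(-np.sum(p * np.log(p)))      # Shannon limit
        else:
            profile.append(np.log(np.sum(p ** a)) / (1 - a))
    return np.array(profile)

alphas = np.linspace(0, 4, 17)
sample_1 = [50, 30, 10, 5, 3, 2]
sample_2 = [70, 10, 8, 6, 4, 2]
# If the two profiles cross, the diversity ordering depends on alpha
# and the samples are non-comparable.
print(renyi_profile(sample_1, alphas))
print(renyi_profile(sample_2, alphas))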

A word on bootstrapping
The diversity indices above may be practically useful for comparing diversity in different samples, but
they have little statistical value. If we are told that one community has Shannon index 2.0 and another
has index 2.5, is the latter significantly more diverse? This is like asking whether 7 is close to 8 - it’s
a meaningless question unless we know the variances of the parent populations. What we need is some
kind of idea about how the diversity index would vary when taking repeated samples from the same two
populations. If these variances are small relative to the difference between the populations, the difference
is statistically significant.
So how can we estimate confidence intervals for diversity parameters? One possible method is
bootstrapping. This is a general and very simple way of estimating confidence intervals for almost any
type of statistical problem, and has become extremely popular in ecological data analysis, morphometry
and systematics. The basic idea is to use the sample we have (or preferably several samples which we
hope are from the same population) as an estimate of the statistical distribution in the parent population.

This is of course only an approximation, and sometimes a very bad one, but often it’s the best we can
do. We then ask a computer to produce a large number (for example 1000) of random simulated samples
from the estimated parent population, and see what range of variation we get in this set of samples. This
variation is used as an estimate of the ’real’ variance.
To make this more concrete, we can take the example of diversity indices. Say that we have collected
abundance data for 273 individuals of 12 species in one sample, and calculated a Shannon index of 2.5.
We want to know what range of Shannon indices we might expect if we had collected many samples
with the same total number of individuals from the same parent population. We proceed as follows.
First, take all the individual fossils we have collected and put them in a hat (it might be more practical
to make one piece of paper for each fossil, with the species name). Assume, or rather hope, that the
relative abundances represent a reasonable approximation to the ’real’ distribution of abundances in the
field. Then, pick a fossil 273 times from the hat with replacement, meaning that you put each fossil
back into the hat, and calculate the Shannon index for this random sample. Repeat this whole procedure
1000 times, producing a set of 1000 Shannon indices. The mean represents an estimate of the mean
of Shannon indices from the parent population. Then disregard the 25 smallest and 25 largest indices,
leaving 950 indices with a range corresponding to a 95 percent confidence interval.
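A Python sketch of this 'hat' procedure, for a hypothetical sample with 12 species and 273 individuals as in the example above (the individual species counts are invented):

import numpy as np

def shannon(counts):
    p = np.asarray(counts, float)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log(p))

def bootstrap_shannon(counts, replicates=1000, seed=0):
    """Bootstrap the Shannon index: resample the individuals with
    replacement from the observed sample and recompute the index."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    individuals = np.repeat(np.arange(len(counts)), counts)   # the 'hat' of fossils
    n = individuals.size
    values = []
    for _ in range(replicates):
        draw = rng.choice(individuals, size=n, replace=True)
        values.append(shannon(np.bincount(draw, minlength=len(counts))))
    values = np.sort(values)
    lower, upper = np.percentile(values, [2.5, 97.5])          # 95 percent interval
    return values.mean(), (lower, upper)

# Hypothetical sample: 12 species, 273 individuals in total.
sample = [80, 50, 40, 30, 25, 15, 12, 8, 5, 4, 2, 2]
print(bootstrap_shannon(sample))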
A similar approach is useful for comparing the diversity indices from two samples. We first pool
the samples, meaning that we put all the specimens from both (or more) samples into the same hat. A
number of random replicate pairs of samples are then made, and the diversities compared for each pair.
If we rarely observe a difference in diversity between the replicates as large as the difference between
the original samples, we conclude that the difference is significant. The same method can of course be
used for estimating significance values for any community similarity measure, not only from differences
in diversity indices. This gives an alternative to the Chi-squared test, and can be used also for presence-
absence data. A special case of this approach, using number of shared taxa as the similarity measure, is
known as ’Raup-Crick similarity’ (Raup & Crick 1979).

Abundance plots
A useful way of summarizing the distribution of abundances on the members of a community is to
plot species abundances in descending order. This is called an abundance plot (fig. 6.1). If the curve
drops very rapidly and then levels off, we have a community dominated by a few taxa. It is quite
commonly seen, in particular for species-poor communities, that the curve drops exponentially so that
plotting logarithms (’Whittaker plot’) produces a straight descending line. This type of curve, known as a
geometric series or geometric distribution, is sometimes seen in ’severe’ environments or in early stages
of a succession. Another type of common abundance pattern, especially in species-rich communities, fits
the log-normal model where many taxa have a certain abundance and fewer taxa have lower or higher
abundance. This produces a Whittaker plot with a plateau in the middle. This is sometimes taken as an
indication of a situation where many independent random factors decide the abundance of the taxa, and
is expected in environments which are randomly fluctuating (fig. 6.2).
The significance of the fit to a specific abundance model can be approximated with specially designed
Chi-squared tests.
All the common species abundance models (geometric, log-series, log-normal and broken stick)
refer to some simple theory of how the ecospace is divided into niches, occupied by the different species,

Figure 6.1: Ranked abundance plot for horizon 8 in Case Study 8 (Ordovician of Wales), showing the
number of specimens (vertical axis) of the different species (horizontal axis). The function is close to
negative exponential, such that taking the logarithms of abundances would have produced an almost
straight descending line.

Figure 6.2: Ranked log-abundance (Whittaker) plot for three contemporary communities from the Sil-
urian Waldron Shale, Indiana. The Biohermal community, above storm wave base, approximates to a
log-normal distribution. The Inter-reef community, below storm wave base, follows a geometric distri-
bution (or perhaps rather a so-called log series distribution which flattens off for the rarest species). The
Deeper Platform community approximates to a so-called broken stick model, typical of stable environ-
ments with strong inter-species interactions. From Peters & Bork (1999).

under different models of competition. A new, comprehensive model, covering many aspects such as
immigration, speciation and extinction, has been put forward by Hubbell (2001). Known as the ’neutral’
or ’ecological drift’ model, it is a null hypothesis with random drift of abundances, much like the genetic
drift model in population genetics. This model predicts a certain shape of abundance plots somewhat
like the log-normal model but with a larger number of rare species, which seems to fit the communities
studied so far better than any previous model. Being a theory which makes very few assumptions and
which incorporates evolutionary aspects, it should be of great interest to paleontologists.

Rarefaction
It is unfortunately the case that the number of taxa (diversity) in a sample increases with sample size.
We find more conodont species in a 10 kilo sample than in a 100 gram sample. To compare the number
of taxa in samples of different sizes we must therefore try to compensate for this effect. Some of the
diversity indices described above try to account for sample size, but rarefaction (e.g. Krebs 1989) is
a much more precise method. The rarefaction program must be told how many specimens we have of
each taxon in the largest sample we have got. The program then computes how many taxa we would
expect to find in samples containing smaller numbers of specimens, with standard deviations (fig. 6.3).
Technically this can be done using bootstrapping, or with a faster ’direct’ method (Krebs 1989). These
numbers can then be compared with the number of taxa in real samples of corresponding sizes. Another

way of using rarefaction curves, which may be less sensitive to differences in compositions between the
samples, is to perform the rarefaction on each sample separately. Normalized diversities can then be
found by standardizing on a small sample size and reading the corresponding expected taxon count from
each rarefaction curve.
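A simple Python sketch of rarefaction by repeated random subsampling without replacement (the abundances are hypothetical, chosen to give 7 species and 57 specimens as in fig. 6.3; the 'direct' analytical method should give essentially the same expectations without the simulation):

import numpy as np

def rarefaction_curve(counts, step=5, replicates=500, seed=0):
    """Expected number of taxa (with standard deviation) in random
    subsamples of increasing size, drawn without replacement."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    pool = np.repeat(np.arange(len(counts)), counts)    # one entry per specimen
    sizes = np.arange(step, pool.size + 1, step)
    means, sds = [], []
    for n in sizes:
        richness = [
            len(np.unique(rng.choice(pool, size=n, replace=False)))
            for _ in range(replicates)
        ]
        means.append(np.mean(richness))
        sds.append(np.std(richness))
    return sizes, np.array(means), np.array(sds)

# Hypothetical sample with 7 species and 57 specimens.
sample = [25, 12, 8, 6, 3, 2, 1]
sizes, expected_taxa, sd = rarefaction_curve(sample)
print(list(zip(sizes, np.round(expected_taxa, 2))))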

Figure 6.3: Rarefaction on a sample from the Ordovician (sample 9 in the data set from Case Study
8) with 7 species and 57 specimens. By extrapolation, the curve indicates that further sampling would
have increased the number of taxa. The curve also shows how many taxa we would expect to find if the
number of specimens in the sample were lower. Standard deviations are not shown.

Diversity curves
Curves showing diversity as a function of time have become popular in studies of the history of life.
Such curves may (or may not!) be correlated with environmental parameters, and can show interesting
phenomena such as adaptive radiations and mass extinctions.
The compilation of diversity curves from the fossil record is not as easy as just counting taxa. First,
we have to decide on a taxonomic level for our study. Some classical diversity curves have been based on
counts of families or genera, but it must always be remembered that these taxonomic units are the results
of quite arbitrary decisions, and that they are influenced as much by disparity (levels of morphological
difference) as by diversity. A consensus is now emerging that diversity curves should ideally be based
on species counts. However, this leads to other problems. How do we delineate the species? How do we
deal with synonyms? Any diversity study has to consider taxonomical issues very carefully in order to
make reasonable species counts.
The second major problem is that of incompleteness of the data. The fossil record itself is rela-
tively sparse, and even worse, the completeness varies wildly through the stratigraphic column either
because of preservational factors or because of different intensities of collection. Ideally one should try
to compensate for this. One method is based on rarefaction, where samples with abundances have been
collected. The sample size is standardized using the smallest sample, and rarefaction is used to answer
the question of how many taxa we would have seen in each larger sample if it had been as small as the
standard size. Another, similar method involves randomized resampling (bootstrapping) in order to see
how sampling intensity and structure influences diversity. This can be done even with presence/absence

data.

Figure 6.4: Diversity curve for the Ordovician of Norway (upper curve), produced from a large database
of published first and last occurrences at a number of localities. Diversities are counted within 1 million
years intervals. The lower curves show the upper and lower limits of the 90 percent confidence interval
resulting from random resampling of localities with replacement. The curve correlates well with sea
level, with low diversity at highstands.

A third problem involves imprecise stratigraphical correlation, which will invariably add noise to the
diversity curve. In order to reduce this problem, and also to simplify data collection, diversity is simply
counted within each stratigraphical unit (often on the stage level), in the hope that the unit boundaries
are reasonably well correlated. However, this reduces time resolution, and it forces us to define standing
diversity more carefully. Should we correct for the time duration of the unit? It is obvious that if species
longevity is very short compared with unit duration, there will be many more species within the unit than
there ever were at any particular point in time. We should then divide taxon count by unit length in order
to get a standardized standing diversity estimate. A related issue is illustrated by the fact that if two units
of equal duration have different turnover rates, they will have different taxon counts even if standing
diversity was in reality the same, resulting in artificial diversity spikes in units containing turnover events
(fig. 6.5). This can to some extent be corrected by letting taxa that originate or disappear within a unit
count as 1/2 instead of 1. In addition one may choose to let a taxon that exists only within the unit count
as 1/3. This reflects mean longevity of a taxon within the time unit in the case of uniform distribution of
first and last appearances.
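A minimal Python sketch of this counting scheme, applied to a list of (hypothetical) first and last appearance dates:

def standing_diversity(ranges, interval):
    """Estimated mean standing diversity within one time interval.
    ranges: list of (first, last) appearance times for each taxon.
    interval: (start, end) of the time unit, with start < end.
    Taxa ranging through count 1; taxa that originate or disappear
    inside the unit count 1/2; taxa confined to the unit count 1/3."""
    start, end = interval
    diversity = 0.0
    for first, last in ranges:
        if last < start or first > end:
            continue                      # taxon not present in the interval
        starts_inside = first > start
        ends_inside = last < end
        if starts_inside and ends_inside:
            diversity += 1.0 / 3.0        # confined to the interval
        elif starts_inside or ends_inside:
            diversity += 0.5              # originates or disappears inside it
        else:
            diversity += 1.0              # ranges through the whole interval
    return diversity

# Hypothetical ranges, given as ages in Ma written as negative numbers.
ranges = [(-480, -455), (-470, -462), (-468, -466), (-490, -440)]
print(standing_diversity(ranges, (-470, -465)))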
We are usually making the ’range-through assumption’, meaning that a taxon is supposed to have
been present from its first to its last appearance. Gaps in the record are disregarded. This means that the
diversity curves will usually be artificially depressed near the beginning and end, due to gaps in these
regions not being filled in by assuming range-through from possible occurrences outside the time period
we are studying (’Lazarus taxa’). This so-called edge effect is more serious when taxon longevities are


Figure 6.5: Range chart of seven species. The diversity count in interval A is unquestionably 4. In
interval C there are altogether 3 species, but mean standing diversity is perhaps closer to 2.3 due to the
species which disappears in the interval. Interval B has high turnover. The total species count (7) in this
interval is much higher than the maximum mean standing diversity of 4. By letting species that appear
or disappear in the interval count as 0.5 units, we get estimated mean standing diversities of A=4, B=3.5,
C=2.5.

long (meaning that species counts are less sensitive than genus counts) and sampling is incomplete.
In spite of all these problems, it has been shown theoretically and by computer simulation that the
inaccuracies mentioned above are not necessarily serious as long as they are unsystematic. They may
add noise and obscure patterns, but they will rarely produce false, strong signals, at least as long as parts
of the biotas are at all preserved. A further comfort comes from the fact that in the few cases where
published diversity curves have been tested by others using different (improved) data sets and methods,
they normally turn out to be robust except in details. However, the question of the reliability of diversity
curves is still being debated.

Testing for extinction and radiation


Given a stratigraphically ordered sequence of diversity estimates, we may note some points of sudden
decrease or increase, and wonder whether these represent extinction or radiation ’events’. It has turned
out to be rather difficult to test this statistically, for many reasons. First, there is always minor extinction
going on (’background extinction’). The event must be significantly more severe than the background
extinction in order to classify as a mass extinction. As a first approach, a bootstrap test may be attempted
to investigate this. But such a test can only show that the event is large relative to the background
extinction, and whether it should be called an extinction event is a matter of definition.
Another problem is the so-called Signor-Lipps effect, which will always bias the signal towards
gradual rather than sudden extinction. This is related to the edge effect mentioned above, and comes
about because even if a sudden extinction of many taxa took place at some boundary, it is unlikely that
we will find all these taxa in the very small volume of rock just below that level. In fact, the probability
of finding a taxon above a certain level drops inversely with the distance below the boundary, producing
an artificial gradual decline. To some extent, one can try to correct for this effect.
In recent years, there has been interest in testing palaeontological time series (whether from morphol-
ogy or community analysis) against the null hypothesis of a so-called random walk, where the positive or

negative change from each time step to the next is randomly distributed. Such random walks can display
both gradual and sudden patterns which might well be mis-interpreted as meaningful if observed in the
fossil record.
Statistical testing for extinction in the fossil record is much debated right now, and one should
refer to recent literature for possible methods.
Chapter 7

Curve fitting

Many data sets consist of pairs of measurements. Examples are lengths and thicknesses of a number of
bones, grain sizes at a number of given levels in a section, and the number of species at different points
in geological time. Such data sets can be plotted with points in a coordinate system (scatter plot). Often
we wish to see if we can fit the points to a mathematical function (straight line, exponential function
etc.), perhaps because we have a theory about an underlying mechanism which is expected to bring the
observations into conformation with such a function. Most curve fitting methods are based on least
squares, meaning that the computer finds the parameters that give the smallest possible sum of squared
error between the curve and the data points.

Fitting to a straight line


The most common type of curve fitting consists in finding the parameters a and b that give the best
possible fit to a straight line:

y = ax + b

There are two forms of linear curve fitting. The most common type is regression, which assumes
that the given x values are exact and independent from y, such that the measurement error or random
deviation is only found in the y values. In the example of grain sizes we can perhaps assume that the
level in meters is a given, almost exact, strictly increasing value, while the grain size is a more randomly
varying variable.
The other form of linear curve fitting, called RMA (Reduced Major Axis), is to be preferred in the
example of lengths and thicknesses of bones. The x and y values are here of more comparable nature,
and errors in x and y both contribute to the total squared error.
Regression and RMA can often give quite similar results, but in some cases the difference may be
substantial.
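To make the difference concrete, here is a small Python sketch computing both the ordinary least-squares fit and the RMA fit (RMA slope = ratio of the standard deviations, carrying the sign of the correlation coefficient) for some invented bone measurements:

import numpy as np

def least_squares(x, y):
    """Ordinary least-squares regression y = a*x + b (error only in y)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a, b = np.polyfit(x, y, 1)
    return a, b

def rma(x, y):
    """Reduced Major Axis fit y = a*x + b (error in both x and y)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    r = np.corrcoef(x, y)[0, 1]
    a = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    b = np.mean(y) - a * np.mean(x)
    return a, b

# Hypothetical bone measurements: lengths and thicknesses in mm.
length = [52.0, 61.5, 70.2, 75.8, 83.1, 90.4]
thickness = [6.1, 7.0, 8.4, 8.9, 10.2, 10.8]
print("regression:", least_squares(length, thickness))
print("RMA:       ", rma(length, thickness))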

Correlation, significance, error estimates


A linear regression or an RMA analysis will produce some numbers indicating the degree of fit. It should
be noted that the significance value p and the estimation of standard errors on the parameters depend


upon several assumptions, including normal distribution of the residual (distances from data points to the
fitted line) and independence of the residual upon the independent variable. Least-squares curve fitting
as such is perfectly valid even if these assumptions do not hold, but significance values can then not be
trusted.
PAST produces the following values:

The probability p that x and y are uncorrelated. If p is small (say p < 0.05), you can use the
values below.

The correlation coefficient r. This value shows the strength of the correlation. When x and y are
increasing together, and are placed perfectly on a straight line, we have r = 1.

The standard error in a.

The standard error in b.

Figure 7.1: Example of linear regression. Note that in this case, the assumption of independence of the
standard deviation of the residual upon the independent variable does not seem to hold well (the points
scatter more for larger x).
Log and log-log transformation

We can use linear regression also for fitting the data points to an exponential curve. This is done simply
by fitting a straight line to the logarithms of the y values (taking the logarithm transforms an exponential
function to a straight line). If we use the natural logarithm, the parameters a and b from the regression
are to be interpreted as follows:

y = e^b e^(ax)

In PAST there is also a function for taking the base-10 logarithms of both the x and the y values. The
data points are then fitted to the power function

y = 10^b x^a

For the special case a = 1 we are back to a linear function.

Examples
’Case study’ nr. 1 (last part), 3 (first part) and 5.

Fitting to periodic functions


Some geological and palaeontological phenomena are periodic, meaning that they vary in a cyclical
pattern. Examples are climatic cycles (Milankovitch), annual cycles in isotope data from belemnites and
ammonites, and possibly periodic mass extinctions. We may try to fit such data to a sinusoid (fig. 7.2):

y = A cos(2πx/T - φ)    (7.1)

Here we have three parameters:

A (amplitude)

T (period): Decides the duration of each cycle

φ (phase): Translates the curve left or right

In PAST the user must set an assumed T. The machine then optimizes the values of A and φ to give
the best fit.

Example
’Case study’ nr. 11.

Nonlinear curve fitting


All the functions above are linear in the parameters, or they can be linearized. It is more tricky for the
computer to fit data to nonlinear functions. One example is the logistic curve

y = a / (1 + b exp(-cx))

The logistic curve is often used to describe growth with saturation (fig. 7.3). It was used as a model
for the marine Palaeozoic diversity curve by Sepkoski (1984).
The question of whether, for example, a logistic curve fits a data set better than a straight line does is
difficult to answer. We can always produce better fits by using models with more parameters. If Mr. A
has a theory, and Mrs. B also has a theory but with more parameters, who shall we believe? There are
formal ways of attacking this problem, but they will not be described here.
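As an aside, nonlinear fits such as the logistic curve can be obtained by iterative least squares; the following Python sketch uses scipy's curve_fit on synthetic data generated from the curve of fig. 7.3 plus noise (the initial guess p0 is arbitrary):

import numpy as np
from scipy.optimize import curve_fit

def logistic(x, a, b, c):
    """Logistic curve y = a / (1 + b*exp(-c*x)): growth with saturation."""
    return a / (1.0 + b * np.exp(-c * x))

# Synthetic, noisy data following the curve of fig. 7.3 (a=3, b=30, c=7).
rng = np.random.default_rng(3)
x = np.linspace(0, 1, 40)
y = logistic(x, 3, 30, 7) + rng.normal(0, 0.05, x.size)

# Nonlinear least squares; an initial guess (p0) helps convergence.
params, covariance = curve_fit(logistic, x, y, p0=[2.0, 10.0, 5.0])
print("fitted a, b, c:", np.round(params, 2))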

Figure 7.2: Sinusoid y = 4 cos(2πx/5 - π/4): A = 4, T = 5, φ = π/4.

Figure 7.3: Logistic curve y = 3/(1 + 30 exp(-7x)): a = 3, b = 30, c = 7.


Chapter 8

Time series analysis

Data sets where we have measured a quantity at a sequence of points in time are called time series. Such
data sets can be studied with curve fitting as described above, but there are also analysis methods that
have been specifically constructed for time series.

Spectral analysis
Spectral analysis involves searching for periodicities in the time series, preferably with a measure of
statistical significance. Such periodicities may be difficult to spot by eye in the original data set, but the
spectral analysis may bring them out clearly. The analysis consists in calculating how much ’energy’ we
have at different frequencies, that is how strongly the different sinusoidal components are present at
the different frequencies.
There are many different methods for spectral analysis. Some of them involve the use of the Fourier
Transform, which is simply correlation of the signal with a harmonic series of sine and cosine functions.
One spectral analysis method which I would like to promote is the Lomb periodogram (Press et al. 1992).
This method has the advantage of being able to handle data points that are not evenly spaced.
It is important to understand that such spectral analysis only attempts to detect sinusoidal period-
icities. Other periodic functions, for example a ’sawtooth curve’, will appear in the spectrogram as a
’fundamental’ with ’harmonics’ at whole number multiples of the fundamental frequency. The function
is thus decomposed into its sinusoidal parts.
Spectrograms such as the one in fig. 8.2 must be interpreted correctly. There are a number of pitfalls
to consider, most of them having to do with the fact that it is impossible for the analysis to increase the
information content of the signal. The following check list applies to the Fourier Transform, but similar
limitations exist for the unevenly spaced case and for other algorithms:

- The highest frequency that can be studied (the Nyquist frequency) is the one corresponding to a
period of two sample intervals.

- The lowest frequency inspected by the algorithm is the one corresponding to the period given
by the total length of the analyzed time series. However, effects such as spectral leakage (see
below) cause the lowest trustworthy frequency channel to be the one corresponding to four periods
over the duration of the time series. In other words, you need four full cycles to be able to detect
periodicity with confidence (this rule of thumb is rather conservative, and some people would push
the number down to maybe three).

- The frequency resolution is limited by the total number of samples in the signal, so that the number
of analysis channels is half the number of samples.

- The use of a finite-length time series implies a truncation of the infinitely long signal expected by
the Fourier transform. This leads to so-called spectral leakage, limiting the frequency resolution
further and potentially producing spurious low-amplitude peaks in the spectrogram.
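
To make these limits concrete, here is a small sketch (with made-up numbers in the spirit of the core
example below) that works out the Nyquist frequency, the lowest trustworthy frequency and the frequency
resolution for an evenly spaced series:

```python
# Rule-of-thumb frequency limits for an evenly spaced series (sketch only)
n = 334          # number of samples (e.g. ~1 Myr of data every 0.003 Myr)
dt = 0.003       # sample spacing in Myr

nyquist = 1.0 / (2 * dt)              # highest usable frequency (cycles/Myr)
length = n * dt                       # total duration of the series
f_lowest = 1.0 / length               # lowest frequency formally present
f_trustworthy = 4.0 / length          # ~4 full cycles needed for confidence
n_channels = n // 2                   # number of frequency channels
df = nyquist / n_channels             # frequency resolution

print(nyquist, f_trustworthy, df)     # ~167, ~4, ~1 cycles per Myr
```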

A simple test of statistical significance involves comparing the strength of the spectral peak with
the distribution of peaks expected from a random signal (’white noise’). A similar test involves random
reordering of the sample points in order to remove their temporal relationships. If the original spectral
peaks are not much stronger than the peaks observed in the ’shuffled’ spectrum, we have a low signifi-
cance.
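
A rough sketch of such a shuffling (Monte Carlo) test is given below; the exact procedure used in PAST
may differ, and the function shuffle_test is only a name for this example.

```python
import numpy as np
from scipy.signal import lombscargle

def shuffle_test(t, y, freqs, n_shuffles=999, seed=0):
    """Compare the strongest Lomb periodogram peak with peaks from shuffled data.

    Returns the original peak height and a rough p-value: the fraction of
    shuffled series whose strongest peak is at least as high."""
    rng = np.random.default_rng(seed)
    w = 2 * np.pi * np.asarray(freqs)
    peak = lombscargle(t, y - y.mean(), w).max()
    count = 0
    for _ in range(n_shuffles):
        ys = rng.permutation(y)          # destroys the temporal ordering
        if lombscargle(t, ys - ys.mean(), w).max() >= peak:
            count += 1
    return peak, (count + 1) / (n_shuffles + 1)

# e.g. shuffle_test(t, y, freqs) with the arrays from the periodogram sketch above
```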

Autocorrelation
Autocorrelation is a simple form of time series analysis which in some cases may show periodicities
more clearly than spectral analysis. As the name indicates, the time series is correlated with a copy
of itself. At zero lag this of course gives perfect correlation (value 1). Then the copy is translated by a small time
difference, called lag time, and we get a new (lower) correlation value. This is repeated for increasing lag
times, and we get a diagram showing correlation as a function of lag time. If the time series is periodic,
we will get high correlation for lag times corresponding to the period, which will show up as peaks in
the autocorrelogram (fig. 8.3).
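
For an evenly spaced series, an autocorrelogram can be sketched in a few lines of Python (assuming the
data have already been interpolated to equal spacing):

```python
import numpy as np

def autocorrelogram(y, max_lag):
    """Correlation of the series with itself shifted by 0..max_lag samples."""
    y = np.asarray(y, dtype=float)
    return np.array([1.0 if lag == 0
                     else np.corrcoef(y[:-lag], y[lag:])[0, 1]
                     for lag in range(max_lag + 1)])

# A series with a period of 14 samples gives autocorrelation peaks at lags 14, 28, ...
y = np.sin(2 * np.pi * np.arange(300) / 14)
r = autocorrelogram(y, 40)
print(r[14], r[28], r[7])   # ~1, ~1, ~-1
```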

Wavelets
Wavelet analysis is a new type of time series analysis that has lately become popular in geophysics
and petrology, but it should also have potential in palaeontology. Using the so-called quasi-continuous
wavelet transform we can study a time series on several different scales simultaneously. This is done
by correlating the time series against a particular, short-duration time series (’mother wavelet’) with all
possible locations in time, and scaled (compressed) to different extents. We can say that the wavelet
function is like a magnifying glass that we use to observe the time series at all points in time, and the
analysis also continuously adjusts the magnification so that we can see the time series at different scales.
In this way we can see both long-term trends and short-term details.
Wavelet analysis was used by Prokoph et al. (2000) to illustrate a 30-million year cycle in diversity
curves for planktic foraminifera.
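
A minimal sketch of a quasi-continuous wavelet transform, here using the third-party PyWavelets package
(pywt) with a Morlet mother wavelet rather than PAST’s own implementation, could look as follows; the
series is synthetic.

```python
import numpy as np
import pywt

# Synthetic series: 41 kyr and 100 kyr cycles sampled every 3 kyr, one Myr long
dt = 3.0                                  # sample spacing in kyr
t = np.arange(0, 1000, dt)
y = np.sin(2 * np.pi * t / 41) + 0.3 * np.sin(2 * np.pi * t / 100)

# Scales from 2 to 128 samples, i.e. roughly 6 to 384 kyr
scales = np.arange(2, 129)
coeffs, freqs = pywt.cwt(y, scales, 'morl', sampling_period=dt)

# coeffs has shape (len(scales), len(t)); plotting abs(coeffs) against time
# and log2(scale) gives a diagram of the kind shown in fig. 8.4
print(coeffs.shape, (1.0 / freqs)[:3])    # periods in kyr for the smallest scales
```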

Example: Neogene isotope data (case study 13)


Oxygen isotope data from foraminiferan shells in core samples give a good indication of temperature
change through time. In this example we shall look at such an oxygen isotope log (Shackleton et al.
1990). The data have already been fitted to an age model, so that we can treat the data set as a time series
(fig. 8.1).

Figure 8.1: Oxygen isotope data (d18O against age in million years BP) from core sample, one million
years back in time (present time to the left).

We can first try to find sinusoidal periodicities using spectral analysis. Figure 8.2 shows the Lomb
periodogram, where the x axis shows cycles per million years and the y axis shows the strength of the
sinusoidal components. The peaks around 8 and 11 cycles per million years correspond to periods of
about 0.122 and 0.094 million years, respectively. These periods fit well with the 100,000 years
Milankovitch cycle connected with orbital eccentricity, or alternatively with periodicity in orbital inclination.
The peak at 24 cycles per million years indicates a 41,000 years cycle (axial obliquity), while the peak
at 43 cycles per million years indicates a 23,000 years cycle (precession). We see that the Milankovitch
cycles show up very prominently with this type of analysis.
The autocorrelogram (fig. 8.3) indicates periodicities of 14, 30 and 39 samples. The samples in the
time series are placed with a distance of 3000 years, so this corresponds to periodicities of 42, 90 and
131 thousand years, in reasonable accordance with the Milankovitch cycles. In this case the periodicities
are better shown with spectral analysis than with autocorrelation, partly because the sinusoidal nature
of the cycles is well suited to spectral methods.
Finally we can study the time series at different scales using the continuous wavelet transform (fig.
8.4). The horizontal axis shows samples in units of 3000 years, while the vertical axis shows the two-
logarithm of the number of samples for the scale at which the time series is observed. Thus, the value
3 on the vertical axis means that at this horizontal level in the diagram, the signal is observed at a scale
of 2^3 = 8 samples, or 24000 years. We can glimpse periodicities at scales of roughly 32 samples (96000
years), 15 samples (45000 years) and 10 samples (29000 years), in relatively good accordance with the
Milankovitch periodicities.

Figure 8.2: Spectral analysis of isotope data from core sample (1 million years BP to the present). The
peaks in the spectrum indicate strong periodicities. The frequency axis is in units of cycles per million
years.

An advantage of wavelet analysis over spectral analysis is that we can see how periodicities change
over time. The spectral analysis considers the time series as a whole, and does not give any information
localized in time.

Figure 8.3: Autocorrelogram of isotope data from core. The peaks in the curve indicate periodicities.
The x axis shows lag time in units of 3000 years.

Figure 8.4: Quasi-continuous wavelet diagram of isotope data from core. The x axis shows time in units
of 3000 years, while the y axis shows the scale at which the time series is observed, from about 380,000
years (top) down to 6,000 years (bottom). We can glimpse periodicities at three different levels.

Bibliography

[1] Adrain, J.M., Fortey, R.A. & Westrop, S.R. 1998. Post-Cambrian Trilobite Diversity and Evolu-
tionary Faunas. Science 280:1809.

[2] Harper, D.A.T. (ed.). 1999. Numerical Palaeobiology. John Wiley & Sons.

[3] Hill, M.O. & H.G. Gauch Jr. 1980. Detrended Correspondence analysis: an improved ordination
technique. Vegetatio 42:47-58.

[4] Hubbell, S.P. 2001. The Unified Neutral Theory of Biodiversity and Biogeography. Princeton
University Press.

[5] Jongman, R.H.G, ter Braak, C.J.F. & van Tongeren, O.F.R. (eds.). 1995. Data Analysis in Com-
munity and Landscape Ecology. Cambridge University Press.

[6] Krebs, C.J. 1989. Ecological Methodology. Harper & Row, New York.

[7] Kruskal, J.B. 1964. Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric
Hypothesis. Psychometrika 29:1-27.

[8] Ludwig, J.A. & Reynolds, J.F. 1988. Statistical Ecology. A primer on methods and computing.
John Wiley & Sons.

[9] Magurran, A.E. 1988. Ecological Diversity and its Measurement. Princeton University Press.

[10] Peters, S.E. & Bork, K.B. 1999. Species-abundance Models: An Ecological Approach to Inferring
Paleoenvironment and Resolving Paleoecological Change in the Waldron Shale (Silurian). Palaios
14:234-245.

[11] Press, W.H., S.A. Teukolsky, W.T. Vetterling & B.P. Flannery. 1992. Numerical Recipes in C.
Cambridge University Press.

[12] Prokoph, A., Fowler, A.D. & Patterson, R.T. 2000. Evidence for periodicity and nonlinearity in a
high-resolution fossil record of long-term evolution. Geology 28:867-870.

[13] Raup, D. & R.E. Crick. 1979. Measurement of faunal similarity in paleontology. Journal of Pale-
ontology 53:1213-1227.

[14] Rees, P.M., Ziegler, A.M., Gibbs, M.T., Kutzbach, J.E., Behling, P.J. & Rowley, D.B. 2002. Permian
phytogeographic patterns and climate data/model comparisons. Journal of Geology 110:1-31.

34
BIBLIOGRAPHY 35

[15] Sepkoski, J.J. 1984. A kinetic model of Phanerozoic taxonomic diversity. Paleobiology 10:246-
267.

[16] Shackleton, N.J., A. Berger & W.R. Peltier. 1990. An alternative astronomical calibration of the
lower Pleistocene timescale based on ODP Site 677. Transactions of the Royal Society of Edin-
burgh: Earth Sciences 81:251-261.

[17] Tothmeresz, B. 1995. Comparison of different methods for diversity ordering. Journal of Vegeta-
tion Science 6:283-290.
