DIGGING NUMBERS
ELEMENTARY STATISTICS FOR
ARCHAEOLOGISTS
(Second Edition)

Mike Fletcher and Gary Lock

Oxford University Committee for Archaeology


2005
Published by
Oxford University School of Archaeology
Institute of Archaeology
Beaumont Street
Oxford

© Mike Fletcher and Gary Lock 2005

First published 1991
reprinted 1994, 2001, 2004
Second edition 2005

ISBN 0 947816 69 0

A CIP record for this book is available from the British Library

This book is available direct from
Oxbow Books, Park End Place, Oxford OX1 1HN
(Phone: 01865-241249; Fax: 01865-794449)

and

The David Brown Book Company
PO Box 511, Oakville, CT 06779, USA
(Phone: 860-945-9329; Fax: 860-945-9468)

and

via our website

www.oxbowbooks.com

Printed in Great Britain by
Antony Rowe Ltd, Chippenham, Wiltshire


CONTENTS

Preface

Section 1: Techniques for describing and presenting archaeological data

1. An introduction to data
   1.1 The example data set
   1.2 Levels of measurement
   1.3 Coding
   1.4 Transforming variables

2. A statistical approach - signposting the way

3. Tabular and pictorial display
   3.1 Basic aims and rules
   3.2 Tabulating measurements
   3.3 Tabulating frequencies
       3.3.1 One variable
       3.3.2 Two variables
   3.4 Pictorial displays for nominal and ordinal data
       3.4.1 The bar chart
       3.4.2 The pie chart
   3.5 Pictorial displays for continuous data
       3.5.1 The histogram
       3.5.2 The stem-and-leaf plot - an alternative histogram?
       3.5.3 The ogive - the total so far
       3.5.4 The scatterplot - displaying two variables

4. Measures of position - the average
   4.1 Introduction
   4.2 The mode
   4.3 The median
   4.4 The mean
   4.5 Comparing the mode, median and mean

5. Measures of variability - the spread
   5.1 Introduction
   5.2 The range
   5.3 The quartiles
   5.4 The mean deviation
   5.5 The standard deviation
   5.6 The coefficient of variation
   5.7 Standardisation
   5.8 Boxplots

Section 2: Techniques for drawing inferences from archaeological data

6. An introduction to probability and inference - drawing conclusions
   6.1 Introduction
   6.2 Probability - measuring chance and risk
       6.2.1 The concept of probability
       6.2.2 The concept of independence - are two events related?
   6.3 Probability distributions - predicting results
   6.4 The logic of hypothesis testing - is it significant?

7. Sampling theory and sampling design
   7.1 Introduction
   7.2 Sampling strategies - which measurements to take
   7.3 A statistical background to sampling
       7.3.1 The central-limit theorem - the law of averages
       7.3.2 Confidence limits - the reliability of results
   7.4 Conclusions

8. Tests of difference
   8.1 Introduction
   8.2 One sample tests - comparing an observed measurement with an expected measurement
       8.2.1 Test for sample mean
       8.2.2 Test for median
       8.2.3 Test for proportions
   8.3 Two sample tests - comparing two observed measurements
       8.3.1 Test for variation
       8.3.2 Difference in means for paired data - assuming a normal distribution
       8.3.3 Difference in means for paired data - no assumption of normality
       8.3.4 Difference in means for two independent samples - assuming a normal distribution
       8.3.5 Difference in means for two independent samples - no assumption of normality
       8.3.6 Difference of two proportions

9. Tests of distribution
   9.1 Introduction
   9.2 Tests for randomness
   9.3 Tests for normality
   9.4 Tests between two distributions

10. Measures of association for continuous or ordinal data - are two variables related?
   10.1 Introduction
   10.2 Product-moment correlation coefficient
       10.2.1 Testing the significance of the product-moment correlation coefficient
   10.3 Spearman's rank correlation coefficient
       10.3.1 Testing the significance of Spearman's rank correlation coefficient
   10.4 Predicting using regression

11. Measures of association for categorical data - are two characteristics related?
   11.1 Introduction
   11.2 The Chi-squared test
   11.3 Guttman's lambda
   11.4 Kendall's tau

12. An introduction to multivariate analysis
   12.1 Reduction and grouping
       12.1.1 Cluster Analysis
       12.1.2 Correspondence Analysis
   12.2 Prediction
       12.2.1 Multiple Regression
       12.2.2 Discriminant Analysis
Section 3: Books and software

13. A few recommended books

14. SPSS for Windows

Appendix. Statistical tables
   Table A. Random digits from a uniform distribution
   Table B. Percentage points of the t-distribution
   Table C. 5% points of the F distribution
   Table D. Kolmogorov-Smirnov single sample test
            (uniform and other completely specified distributions)
   Table E. Kolmogorov-Smirnov single sample test
            (normal distribution)
   Table F. Kolmogorov-Smirnov two sample test
   Table G. Critical values for correlation coefficient (pmcc)
   Table H. Critical values for Spearman's rank correlation coefficient
   Table I. Percentage points of the χ² distribution

Index


Preface (First Edition)

Note on the contents list: we have intentionally tried to produce user friendly headings
to try and overcome the problems inherent in statistical beginners being faced with a
list of technical names. This has resulted in a considerable amount of simplification
which may offend some statistical purists. We beg understanding in advance.

Digging Numbers comprises four sections;

Section 1. Simple techniques for describing and presenting archaeological data,
Section 2. Techniques for drawing inferences from archaeological data,
Section 3. An introduction to statistical computing,
Section 4. A catalogue of selected statistical packages.

The first two sections are sequential in the sense that Section 2 assumes familiarity
with the concepts and techniques covered in Section 1.

Section 1 starts with a discussion of the structure and organisation of archaeological
data-sets which are suitable for statistical analysis. It introduces a hypothetical data-set
which describes measurements and other aspects of forty bronze and iron spearheads.
This data-set is used throughout Sections 1, 2 and 3 to demonstrate the different
statistical techniques and concepts. Chapter 2 outlines a statistical approach to
analyzing such a data-set. It assumes familiarity with Chapter 1 and is meant to act as
a guide through the rest of Sections 1 and 2.

The rest of Section 1 is concerned with what are usually called Descriptive Statistics.
These include several methods of displaying the distribution of a single variable in
tabular and pictorial form as well as simple ways of displaying the relationship
between two variables. Measures of position (usually thought of as 'the average') and
measures of dispersion or variation (the 'spread' around the average) are also
described. All of these are applied to the spearhead data-set.

Section 2 outlines the main types of Inferential Statistics. These involve the concepts
of Sampling, Probability, Hypothesis Testing and Statistical Significance. Some of the
more commonly used Tests of Difference, Tests of Distribution and Tests of
Association are described and illustrated with examples from the spearhead data-set.

In Sections 1 and 2 statistical formulae are stated and used without derivation or
proofs. This is due to limited space and mainly to the fact that this book is aimed at
people with little or no statistical knowledge. It is felt that most users will be prepared
to accept a formula as stated. If statistical derivation is required appropriate books are
recommended in Chapter 12. References throughout the text are to the few
recommended books described in Chapter 12; there is no formal bibliography.

The emphasis here is on using the appropriate technique and understanding the results
in both statistical and archaeological terms. Where applicable, this includes working
through examples by hand (with a calculator). This may seem a little old-fashioned in
today's world of computers but we feel that the benefits in understanding are well
worth the effort.

Even so, many people will have access to a computer and this is where Sections 3 and
4 come in. Most of the techniques described in Sections 1 and 2 have one or two
corresponding computer programs listed in Section 3. These are written in SPSS and
Minitab.

Section 4 is a catalogue of commercially available statistical packages. This gives
details of hardware requirements, the software's contents and availability and includes
general packages as well as specific archaeological software.

This combination should allow the user of Digging Numbers to approach statistical
analysis either by calculation by hand, or by using a commercial software package.
Although the two packages we have chosen to demonstrate are relatively expensive
ones (SPSS and Minitab), many of those listed in Section 4 are inexpensive with some
being available as Shareware.

It is probably already apparent that this book provides only an introduction to the
complex world of statistics. There are whole areas of statistical reasoning and analysis
which are not even mentioned. Many different methods of multivariate analysis, for
example, have proved to be of importance to archaeologists. Even so, statistics seems
to be one of those subjects that can cause instant mental paralysis in many otherwise
competent archaeologists. If this book can give someone enough confidence to
approach a more advanced text then our aim will have been achieved.

Throughout the book three icons are used to quickly highlight either a reference in
Chapter 12, a link with another chapter or a program number from Section 3.

Acknowledgements.
We would like to thank Clive Orton of University College London for his meticulous
reading of an earlier draft of this book. His detailed comments were of great help to
us. Thank you also to Hazel Dodge for being a guinea-pig, to all the suppliers of
software for Section 4 and to H.R. Neave for permission to use some of his statistical
tables. Any mistakes or misunderstandings that remain in the text are the
responsibility of the authors.

Simon Pressey drew the cover illustration.

Technical note: camera-ready copy for this book has been produced by the authors
using Timeworks Desktop Publisher. Figures have been produced using Gem Graph
and Gem Draw and integrated electronically. We would be happy to discuss this
process with any interested parties.

GRL and MF, February 1991.


Preface (Second Edition)

Over the last thirteen years or so we have been pleased by the continuing popularity of
Digging Numbers as an introductory text for archaeologists wanting to get started with
statistics. Several people and organizations have requested that we update it and so,
after a series of reprints of the First Edition, here is the Second Edition.

The underlying philosophy and much of the text remains the same. This is still an
introductory book that is meant to get people doing statistics for themselves within a
basic understanding of the strengths and limitations of various techniques. There are,
however, several important changes within the Second Edition:

1. A new chapter (12) has been added which provides an introduction to
   multivariate techniques. The emphasis of the book is still on descriptive and
   inferential techniques but this new chapter gives a taste of what can be done
   using more than one or two variables.
2. Section 4 of the First Edition, the catalogue of statistical software, has been
   omitted as it is no longer relevant.
3. Chapter 13, recommended books, has been re-written and updated.
4. Chapter 14, computer programs, has been completely re-written. This Second
   Edition uses SPSS for PC and many of the figures within the text are SPSS
   output so that text and figures are more closely linked.

We would like to thank the many people who have contacted us about Digging
Numbers over the years, especially those who have pointed out errors and typos, not
least Philip Balcombe. We have attempted to rectify them all but please do get in
touch if any remain. Thank you also to Barry Cunliffe for encouragement and Val
Lamb of Oxbow Books for guidance.

GRL and MF, January 2005.


CHAPTER 1

AN INTRODUCTION TO DATA

1.1 The Example Data Set


It is important that any data set to be used for statistical analysis be well organised and
properly defined. This often results in a rectangular block of numbers which is called a
data matrix. Table 1.1 is a data matrix with 40 rows and 14 columns, or a 40 by 14
matrix (downloadable from http://www.soc.staffs.ac.uk/mf4/spears.zip).

Table 1.1 describes forty spearheads. Each horizontal row represents one spearhead
and is, therefore, one item in archaeological terms. In statistical terms this is referred
to as one case (often one record in database terminology). Each vertical column
represents one observation on the item and is, archaeologically speaking, one
attribute. In statistical terms this is a variable (equivalent to a field in many database
applications).

There are fourteen variables in Table 1.1. The first one is a label in the form of a
unique number for each case; this is essential for any form of cross-referencing with
other information about the spearheads. Each variable has a variable name which is
displayed at the top of the column.

From now on variable names that refer to the spearhead data-set are enclosed in <>.
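
The idea of a data matrix, with one row per case and one column per variable, can be sketched in a few lines of Python. This is only an illustration and not part of the programs that accompany the book: the three rows are spearheads 2, 3 and 11 as printed in Table 1.1, while the helper names `shape`, `case` and `column` are our own invention.

```python
# Variable names as given in the column headings of Table 1.1.
VARIABLES = ["NUM", "MAT", "CON", "LOO", "PEG", "COND", "DATE",
             "MAXLE", "SOCLE", "MAXWI", "UPSOC", "LOSOC", "MAWIT", "WEIGHT"]

# Three cases copied from Table 1.1 (spearheads 2, 3 and 11).
ROWS = [
    [2, 2, 3, 1, 2, 4, 450, 22.6, 7.8, 4.3, 1.3, 1.6, 11.3, 342.1],
    [3, 2, 3, 1, 2, 4, 400, 17.9, 5.2, 4.1, 1.7, 2.0, 7.5, 322.9],
    [11, 2, 1, 1, 2, 2, 50, 19.1, 4.6, 4.1, 1.5, 1.8, 10.6, 310.1],
]


def shape(rows):
    """Return (cases, variables); raise if the block is not rectangular."""
    n_vars = len(rows[0])
    if any(len(row) != n_vars for row in rows):
        raise ValueError("data matrix is not rectangular")
    return len(rows), n_vars


def case(rows, num):
    """Look up one case (row) by its label and return {variable: value}."""
    for row in rows:
        if row[0] == num:
            return dict(zip(VARIABLES, row))
    raise KeyError(num)


def column(rows, name):
    """Return all values of one variable (column)."""
    index = VARIABLES.index(name)
    return [row[index] for row in rows]
```

Here `shape(ROWS)` gives a 3 by 14 matrix, `case(ROWS, 11)` returns spearhead 11 as a labelled record, and `column(ROWS, "MAXLE")` returns the maximum lengths of the three cases.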

1.2 Levels of measurement


Variables can be measured at one of four levels. This classification was first
introduced in 1946 and has become universally accepted by statisticians. As will
become apparent during this and the next section, it is important to know at what level
variables are measured. Many statistical techniques can only be applied to variables at
a certain level of measurement or higher. The four levels are, in ascending order;
nominal, ordinal, interval and ratio.
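
The phrase 'a certain level of measurement or higher' can be made concrete with a small Python sketch. The levels assigned to the spearhead variables are those given in the rest of this section. The minimum levels attached to the four summary measures are the usual textbook convention and are our assumption here, as are the names `Level`, `MIN_LEVEL` and `applicable`.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, in ascending order."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Levels of the spearhead variables as classified in Section 1.2.
VARIABLE_LEVEL = {
    "MAT": Level.NOMINAL, "CON": Level.NOMINAL,
    "LOO": Level.NOMINAL, "PEG": Level.NOMINAL,
    "COND": Level.ORDINAL,
    "DATE": Level.INTERVAL,
    "MAXLE": Level.RATIO, "SOCLE": Level.RATIO, "MAXWI": Level.RATIO,
    "UPSOC": Level.RATIO, "LOSOC": Level.RATIO, "MAWIT": Level.RATIO,
    "WEIGHT": Level.RATIO,
}

# Assumed (conventional) minimum level needed by a few summary measures.
MIN_LEVEL = {
    "mode": Level.NOMINAL,
    "median": Level.ORDINAL,
    "mean": Level.INTERVAL,
    "coefficient of variation": Level.RATIO,
}


def applicable(technique, variable):
    """True if the variable is measured at the required level or higher."""
    return VARIABLE_LEVEL[variable] >= MIN_LEVEL[technique]
```

With these assumptions, `applicable("median", "MAT")` is false because material is only nominal, whereas `applicable("mean", "DATE")` is true.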

NOMINAL (in name only). Nominal variables consist of categories which have no
inherent ordering or numeric value. Each category is assigned an arbitrary name. In
Table 1.1 the following variables are nominal;

<MAT>, <CON>, <LOO> and <PEG>


M. Fletcher and G. R. Lock An introduction to data

Column  Variable Name  Description  Values
2       <MAT>          Material     1 = Bronze
                                    2 = Iron
3       <CON>          Context      1 = Stray find (inc. hoards)
                                    2 = Settlement
                                    3 = Burial
4       <LOO>          Loop         1 = No
                                    2 = Yes
5       <PEG>          Peghole      1 = No
                                    2 = Yes

For each of these variables there is no significance in the values '1', '2' and '3' that
have been assigned to the categories (i.e. '2' is not twice the value of '1'), any other
numbers or names would do. Note that it is good practice to avoid the use of 1 for
'yes' and 0 for 'no' as this can confuse the distinction which often needs to be made
between 'no' and 'no information' (or 'missing data').

NUM  MAT  CON  LOO  PEG  COND  DATE  MAXLE  SOCLE  MAXWI  UPSOC  LOSOC  MAWIT  WEIGHT
  1    2    ?    1    2     3   300   12.4    3.1    3.6    1.0    1.7    6.2   167.0
  2    2    3    1    2     4   450   22.6    7.8    4.3    1.3    1.6   11.3   342.1
  3    2    3    1    2     4   400   17.9    5.2    4.1    1.7    2.0    7.5   322.9
  4    2    3    1    0     4   350      *      *      *    1.4    2.0      *   154.8
  5    2    ?    1    1     3   350   16.8    6.6    5.7    1.1    1.7    7.0   358.1
  6    2    ?    1    2     3   400   13.3    3.1    4.1    1.6    1.9    5.6   227.9
  7    2    3    1    ?     ?   450   14.1    5.8    5.8    1.2    1.8    6.8   323.8
  8    2    ?    1    2     4   600      *    6.1    5.9    1.3    1.7    7.1   285.2
  9    2    ?    1    2     4   150   22.5    9.2    6.2    1.3    2.0   13.1   613.8
 10    2    1    1    ?     3   300   16.9    4.5    3.6    1.4    1.9      ?   254.3
 11    2    1    1    2     2    50   19.1    4.6    4.1    1.5    1.8   10.6   310.1
 12    ?    1    1    2     3   100   25.8    8.6    4.7    1.4    1.6   12.7   426.8
 13    2    1    1    2     ?   600   22.5    8.4    3.9    1.7    2.7   18.0   521.2
 14    2    1    1    2     3   300   ?7.6    8.7    6.0    1.5    2.?   14.4   765.1
 15    2    1    1    2     2   350   38.0    9.6    5.6    2.0    ?.6   13.6  1217.2
 16    2    1    1    2     2   350   7?.4   14.4    6.4    2.0    2.4   17.6  2446.5
 17    2    1    1    2     2   350   37.5   10.2    3.9    1.8    2.1   14.1   675.7
 18    2    2    1    2     3   450   10.2    3.0    2.7    1.4    1.5    5.8    90.9
 19    2    2    1    2     2   200   11.6    4.6    2.0    0.9    1.7    5.6    86.8
 20    2    2    1    1     3   400   10.8    3.1    2.7    1.9    1.7    5.4   109.1
 21    1    1    2    1     ?   900   11.4    4.2    1.8    0.8    1.5    6.1    67.7
 22    1    1    1    2     2   900   16.6    7.2    2.8    1.6    2.0    9.5   204.5
 23    1    1    ?    1     1  1000   10.2    3.4    3.3    1.9    2.3    5.4   170.3
 24    1    1    2    1     1  1200   18.6    6.6    2.7    1.4    1.6    8.5   176.8
 25    1    1    2    1     2  1200   24.4    7.5    4.4    1.7    2.3   11.3   543.2
 26    1    1    2    1     1  1000   23.5    8.0    4.5    2.0    2.7    8.7   628.2
 27    1    1    2    1     2  1200   24.8    8.1    3.5    2.0    2.1   11.1   401.0
 28    1    1    1    2     1   800   14.1    3.4    3.9    1.7    2.5    6.1   302.4
 29    1    1    1    2     2   800   24.6    6.0    4.8    2.1    2.4    8.6   623.5
 30    1    1    2    1     ?   800   30.9    5.1    6.0    1.5    2.4    8.0   978.9
 31    1    1    1    2     1   700   20.2    5.9    5.7    1.7    2.4    9.4   607.9
 32    1    1    1    2     2   700   12.8    3.5    2.8    1.5    2.1    5.9   165.6
 33    1    1    1    2     1   800   16.9    5.5    3.6    1.6    2.3    8.2   307.9
 34    1    1    1    2     1   800   14.2    4.3    2.8    1.3    2.2    6.0   192.4
 35    1    ?    1    2     2   700   18.0    4.5    5.3    1.6    2.5    9.9   524.7
 36    1    1    2    1     ?  1000   11.7    3.6    2.4    2.2    1.8    6.6   111.2
 37    1    1    1    2     1   800   14.1    5.4    2.4    1.5    2.4    8.4   118.1
 38    1    1    2    1     ?   ?00   17.7    4.8    3.9    1.2    1.8    9.6   273.4
 39    1    1    2    2     ?  1200   36.6   13.5    6.0    1.6    2.7   18.?  1304.4
 40    1    1    2    1     ?   800   12.3    2.4    5.4    1.1    1.6    7.2   233.8

Table 1.1. The spearhead data-set. (* = missing value; ? = character illegible in this copy.)

ORDINAL (forming a sequence). Ordinal variables also consist of categories but this
time they have an inherent ordering or ranking. There is, however, no fixed distance
between the categories. The only ordinal variable in Table 1.1 is column 6 <COND>
which has the following values;

1 = Excellent    2 = Good    3 = Fair    4 = Poor.

Here we can state the relationship of '2' as being between '1' and '3' but it is wrong to
assume equal distance between categories as is implied by the numeric values.

It is possible that a nominal variable could become ordinal if an ordering is imposed
by a typology although this will be based on some external criteria and is not inherent
within the data.

Some statistical tests will accept a dichotomous nominal variable (one with only two
categories) as being ordinal. Many dichotomous variables are presence/absence
variables; they record whether the attribute is there or not. Care must be taken when
dealing with missing values which can be frequent in archaeological data. Spearhead
Number 4 has missing values (indicated by *) for variables 8, 9, 10 and 13 because it
is badly damaged and those measurements can not be taken. It can be confusing to
represent missing values with a numeric value such as 0.0 or 99.9, choose something
obvious such as *. Missing values can cause complications in presence/absence data.
The value 'absent' is different to 'not known' (if the relevant piece of information can
not be measured) and a third category may have to be introduced; Present, Absent and
Missing.

INTERVAL (a sequence with fixed distances). An interval variable has the properties
of an ordinal variable with the added property that the distances between the values
can be interpreted. A popular way of explaining this concept is to look at methods of
measuring temperature. The values 'hot', 'warm', 'cool', and 'cold' are ordinal
because the difference between 'hot' and 'warm' and the difference between 'warm'
and 'cool' are not defined. A temperature of 30°C is not only higher than one of 20°C
but it is 10°C higher. The interval is meaningful, therefore temperature Celsius is an
interval scale.

The only interval variable in Table 1.1 is column 7 <DATE>. If we take spearhead
numbers 9, 10 and 18 they have the dates 150BC, 300BC and 450BC respectively.
The difference in years between Number 18 and Number 10 is the same as between
Number 10 and Number 9. It is obviously incorrect, however, to interpret Number 10
as being twice as old as Number 9 even though this is implied by the numeric values
of '300' and '150'. With interval variables there is no meaningful datum or zero.

RATIO (fixed distances with a datum point). This is the highest level of measurement
with the properties of interval data plus a fixed zero point. If the dates mentioned
above were converted to a new variable <AGE> so that a value of '1,000' was twice
as old as '500' this would then be a ratio variable. Returning to the measuring of
temperature, 20°C is not twice as hot as 10°C because 0°C is not a datum point, it does
not imply no heat. Temperature in degrees Kelvin, on the other hand, are measured on
a ratio scale because 0°K does mean no heat and 20°K is twice as hot as 10°K.

In Table 1.1 columns 8 to 14 inclusive are all ratio variables. They are metric
measurements as shown in Figure 1.1.

 8. Maximum length (cm) <MAXLE>
 9. Length of socket (cm) <SOCLE>
10. Maximum width (cm) <MAXWI>
11. Width of upper socket (cm) <UPSOC>
12. Width of lower socket (cm) <LOSOC>
13. Distance between maximum width and lower socket (cm) <MAWIT>
14. Weight (g) <WEIGHT>

Figure 1.1 The seven quantitative variables.

It is also quite common to refer to nominal and ordinal variables as categorical (or
discrete) variables and to interval and ratio as continuous variables. The variable
values of categorical variables are usually chosen by the analyst and because this can
be a fairly arbitrary process these are sometimes referred to as qualitative variables.
The values of continuous variables tend to be more objectively arrived at and these are
sometimes called quantitative variables.

Just because nominal variables are classified as the lowest level of measurement their
importance within archaeology must not be underestimated. Some fundamental
archaeological concepts involve the use of nominal data, the processes of
classification and typology are important examples.

All observations involve a level of accuracy, especially on continuous variables. The
level of accuracy decided on must be adequate as a basis for sensible decisions and
interpretations during analysis. Variables 8 to 13 in Table 1.1 are all recorded to the
nearest millimetre, to be any more accurate is unnecessary although not physically
impossible. Once data have been collected no amount of statistical manipulation will
improve their accuracy.

1.3 Coding

With categorical variables it is necessary to represent the values of the categories in a
standardised way by using a coding system. It is common in statistical analysis to use
a numeric coding system, in fact, using letters rather than numbers can cause problems
with some statistical software. All of the categorical variables in Table 1.1 have values
represented by a unique integer number. This is easy to process but is obscure because
the meanings of the code have to be remembered or looked up, if the data set is large
and/or complex this can be very time consuming and become a major drawback with
numeric coding. Another problem with this method is the potential for a higher error
rate in the data and the associated problem of error coding.
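
The 'looking up' of numeric codes can be left to the computer by keeping a code book alongside the data. The Python sketch below uses the codes listed in Section 1.2 for <MAT>, <CON>, <LOO>, <PEG> and <COND>, and treats * as a third category, 'Missing', rather than as 'No'. The names `CODEBOOK` and `decode` are hypothetical and chosen only for this illustration.

```python
MISSING = "*"  # the missing-value marker used in Table 1.1

# Numeric codes and their meanings, as listed in Section 1.2.
CODEBOOK = {
    "MAT": {1: "Bronze", 2: "Iron"},
    "CON": {1: "Stray find (inc. hoards)", 2: "Settlement", 3: "Burial"},
    "LOO": {1: "No", 2: "Yes"},
    "PEG": {1: "No", 2: "Yes"},
    "COND": {1: "Excellent", 2: "Good", 3: "Fair", 4: "Poor"},
}


def decode(variable, code):
    """Translate a numeric code into its meaning.

    A missing value is reported as 'Missing' so that it is never
    confused with a genuine 'No'; an unknown code is an error.
    """
    if code == MISSING:
        return "Missing"
    meanings = CODEBOOK[variable]
    if code not in meanings:
        raise ValueError(f"{code!r} is not a valid code for <{variable}>")
    return meanings[code]
```

For example, `decode("CON", 3)` returns 'Burial', and an impossible code such as `decode("MAT", 7)` is rejected rather than silently accepted, which helps with the error-rate problem noted above.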

Obscurity (and thus many errors!) can be reduced by using an abbreviated keyword
coding system. In such a system the values of the variable <CON>, for example, could
be represented by the code 'str', 'set' and 'bur'. For complete clarity a full keyword
code would use the values 'stray find', 'settlement' and 'burial'. Both keyword
systems can create more work during data recording although the extra time spent
typing can be offset by not having to look up codes. Codes containing letters
(alphanumeric) can cause problems with some software; make sure to check first!

Whatever coding system is used it must be exhaustive and exclusive. Exhaustive in
that every possible data value is catered for and exclusive because every value will
only fit into one category. Each observation must fit into one and only one category of
the coding system (even if it is a category called 'miscellaneous' for those values that
don't fit elsewhere).

1.4 Transforming variables

Table 1.1 shows the observations as recorded, these are the raw data. It is sometimes
useful to transform one or more of the original variables to create new variables for
analysis. Transformations can involve a single variable or be a relationship between
two variables.

GROUPING. Values of a continuous variable can be grouped to create a new
categorical variable. The variable <DATE> could be chopped up into the three values
'1200 to 650', '649 to 100' and 'after 99' to create the new variable <PERIOD>. The
values of <PERIOD> would be 'Later Bronze Age', 'Earlier Iron Age' and 'Late Iron
Age' and could be used for the basis of establishing changes in the spearheads through
time. Performing statistical analyses on each of the three groups of <PERIOD> could
identify temporal trends.

The grouping of continuous variables is flexible in that new groups can be created to
suit a particular analysis. This is a useful technique for exploring a data-set. If data for
many more spearheads became available it may prove interesting to divide
<PERIOD> into more than three categories for finer temporal investigations.

Although grouping of continuous variables can be very useful it must be remembered
that it involves a loss of information. It is always better to record data as a continuous
variable and then group, rather than to record initially as a categorical variable.

RATIOS. Sometimes the relationship between the values of two variables can express
a new attribute of interest. By performing a calculation on the two values the new
attribute can be stored as an extra variable. This usually applies to continuous
variables.

As an example, the ratio between the two variables <MAXLE> and <MAXWI> will
express something of the overall shape of the spearhead. Short, wide spearheads will
have a different value to long narrow ones. The ratio can be calculated by dividing
<MAXLE> by <MAXWI> as follows;

Spearhead   <MAXLE>   <MAXWI>   Ratio
number                          <LE/WIRAT>
 1          12.4      3.6       3.4
39          36.6      6.0       6.1

The difference in overall 'shape' is expressed in the two values of the ratio for
spearheads 1 and 39.

It is now possible to use the two new variables <PERIOD> and <LE/WIRAT> to
investigate temporal trends in the shape of spearheads. It is often the case that as
exploration of a data-set progresses so new ways of expanding the original variables
by creating new ones are thought of. It can be informative to 'play' with the data, to
explore relationships and see if the results are interesting.

Another measure of some aspect of shape could be a proportion stated as a
percentage. A good example is to take the length of socket as a proportion of the
maximum length by dividing <SOCLE> by <MAXLE> and multiplying by 100 as
follows:

Spearhead   <MAXLE>   <SOCLE>   Proportion
number
 1          12.4      3.1       0.25 (25%)
22          16.6      7.2       0.43 (43%)

Percentages are often used to measure frequencies or counts but can be deceptive
unless the raw counts are also given.

Points to remember:

However a variable is measured, mm, g, %, years etc. it is essential to state clearly the
units used for this measurement and, as far as possible, to use a consistent set of units.
Do not mix mm with inches!

Always keep a copy of the original data. As an analysis progresses the data being used
can change in form. If a computer is being used it is very easy to overwrite old
versions of data with new versions.
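
One way of following this advice on a computer is to derive new variables into a copy of each case and leave the recorded values untouched. The Python sketch below applies the three transformations of Section 1.4 in that way. It assumes that <DATE> is held as a whole number of years BC and that 'after 99' means 99 BC or later. The function names and the variable name SOCPROP are hypothetical.

```python
MISSING = "*"  # the missing-value marker used in Table 1.1


def period(date_bc):
    """Group <DATE> (assumed to be years BC) into the three <PERIOD> values."""
    if 650 <= date_bc <= 1200:
        return "Later Bronze Age"
    if 100 <= date_bc <= 649:
        return "Earlier Iron Age"
    if 0 <= date_bc <= 99:
        return "Late Iron Age"
    raise ValueError(f"date {date_bc} BC is outside the three groups")


def derive(case):
    """Return a copy of the case with derived variables added.

    The recorded case is not modified, so the raw data are preserved.
    """
    derived = dict(case)
    maxle, maxwi, socle = case["MAXLE"], case["MAXWI"], case["SOCLE"]
    derived["PERIOD"] = period(case["DATE"])
    # Length/width ratio, to one decimal place as in the worked example.
    derived["LE/WIRAT"] = (MISSING if MISSING in (maxle, maxwi)
                           else round(maxle / maxwi, 1))
    # Socket length as a proportion of maximum length (two decimal places).
    derived["SOCPROP"] = (MISSING if MISSING in (socle, maxle)
                          else round(socle / maxle, 2))
    return derived
```

Applied to spearhead 1 (<DATE> 300, <MAXLE> 12.4, <SOCLE> 3.1, <MAXWI> 3.6) this reproduces the ratio of 3.4 and the proportion of 0.25 worked by hand above. A damaged spearhead such as Number 4 keeps * for any derived value that cannot be calculated.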


Keep a record of any changes made to data. It is very easy to lose track of how an
analysis has developed. If the results are to be published it is important for other
workers to have access to the original data and to be aware of how the data have been
altered.


CHAPTER 2

A STATISTICAL APPROACH - SIGNPOSTING THE WAY

We are now in a position to be able to record data in a suitable format for statistical
analysis. This chapter outlines a general statistical approach which can be applied to
any data-set while, at the same time, it attempts to guide the reader through the
following chapters. It is useful to preserve the two stages implied by the structuring of
this book: the initial descriptive and exploratory stage and then the inferential stage
when hypotheses can be formally tested. Going beyond these relatively simple
techniques it may then be suitable to apply multivariate techniques to try and
understand more complex patterns within the data.

The descriptive and exploratory stage (Chapters 1, 3, 4 and 5).


The suggested approach is meant to emphasise the exploratory nature of statistical
analysis. The aim is not to perform 'an analysis' to produce 'the answer' but rather to
execute successive passes through the data gradually identifying trends and patterns that
look interesting and can be followed up by further investigation. A series of sequential
steps can be recommended, of which the first two have already been described.

Step 1 (Chapter 1).


Establish the structure of the data.
- Assign variable names, identify the level of measurement for each variable.
- Assign a case identifier if there is not one.
- Decide on the coding of nominal and ordinal variables.
- Decide how to code missing values.
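
One way to picture Step 1 is as a small coding scheme written down before any data are
entered. The sketch below is illustrative only: the codes 1 to 4 for <COND> follow the
condition categories tabulated in Chapter 3, while the numeric codes for <MAT>, the
missing-value code -1 and the single case shown are assumptions.

```python
# Assumed missing-value code; any value that cannot occur in real data will do.
MISSING = -1

# Level of measurement for each variable (names follow the text; the list is illustrative).
levels = {"ID": "identifier", "MAT": "nominal", "COND": "ordinal", "MAXLE": "ratio"}

# Coding schemes: COND codes 1-4 follow Chapter 3; the MAT codes are hypothetical.
cond_codes = {1: "Excellent", 2: "Good", 3: "Fair", 4: "Poor"}
mat_codes = {1: "Bronze", 2: "Iron"}


def decode(value, codes):
    """Return the label for a coded value, or None if the value is missing."""
    if value == MISSING:
        return None
    return codes[value]


# One hypothetical case with a missing condition.
case = {"ID": 17, "MAT": 2, "COND": MISSING, "MAXLE": 24.1}
print(decode(case["MAT"], mat_codes), decode(case["COND"], cond_codes))
```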

Step 2 (Chapter 1).


- Produce a rectangular data matrix aligning the columns.
- Visually scan the matrix for any obvious errors.

Step 3 (Chapter 3).


- Investigate the gross values of each variable individually (i.e. univariate analysis).
This is still primarily screening for errors. It is important to be sure that the data
are absolutely error free.
- The minimum and maximum values of each variable can be initially important in
identifying possible errors. For categorical variables using a numeric or
alphanumeric code this can show cases of gross misclassification. For
continuous variables this can show errors of measurement (although it could be
a genuine outlier).
- Correct any errors and repeat this step.
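
The screening in Step 3 can be sketched in a few lines. This is a minimal illustration
under stated assumptions: the records are invented, the valid codes for <COND> are taken
to be 1 to 4, and the plausible range for <MAXLE> (0 to 100 cm) is a hypothetical limit
that an analyst would set from knowledge of the material.

```python
# Hypothetical records; case 3 has an impossible condition code and case 4 a suspect length.
cases = [
    {"ID": 1, "COND": 2, "MAXLE": 12.4},
    {"ID": 2, "COND": 1, "MAXLE": 22.6},
    {"ID": 3, "COND": 7, "MAXLE": 18.0},
    {"ID": 4, "COND": 3, "MAXLE": 240.0},
]

VALID_COND = {1, 2, 3, 4}    # assumed coding for the ordinal variable
MAXLE_RANGE = (0.0, 100.0)   # assumed plausible range in cm


def screen(cases):
    """Return the minimum and maximum of MAXLE and the IDs of cases worth checking."""
    lengths = [c["MAXLE"] for c in cases]
    suspects = []
    for c in cases:
        bad_code = c["COND"] not in VALID_COND
        out_of_range = not (MAXLE_RANGE[0] < c["MAXLE"] <= MAXLE_RANGE[1])
        if bad_code or out_of_range:
            suspects.append(c["ID"])
    return min(lengths), max(lengths), suspects


low, high, suspects = screen(cases)
print(low, high, suspects)
```

A flagged value is only a candidate error. An extreme measurement may be a genuine
outlier, so it should be checked against the original record before anything is changed.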


Step 4 (Chapters 3, 4 and 5)
Investigate the distribution and parameters of each variable (still univariate) using the
full range of descriptive statistics.

- For categorical variables the most useful will be frequency tables, bar charts and
the modal value(s).
- For continuous variables the mean, median, range and standard deviation together
with histograms, stem-and-leaf plots, boxplots and ogives will probably be the
most productive.
- Use pictures and graphical techniques wherever possible, these can be much more
informative than numbers alone.
- Investigate the same variable several times over - don't just produce one result
and claim it is 'the answer'. For example, if a continuous variable is being
analyzed by a histogram or stem-and-leaf plot, use several values for class
intervals and midpoints and compare the results.
- Create new variables by transformations (Chapter 1) and repeat step 4.
- Anomalies and errors in the data can still be identified at this stage. Correct any
and return to step 3.

This is the end of the basic analysis and error checking procedures.

Step 5 (Chapters 3, 4 and 5)
Certain simple, albeit often important, archaeological questions will have been
answered during step 4, these will have been univariate in nature i.e. concerning the
distribution and other characteristics of a single variable. The minimum, maximum
and average weight of spearheads, the numbers of spearheads from different context
types are such questions. The next stage of archaeological questioning will involve
some kind of comparison of two variables: bivariate analysis.

It is here that the intuitive nature of statistical analysis becomes more important
because control is in the hands of the analyst: the analysis should be archaeologically
driven. On the one hand statistics are just a tool capable of providing answers to
archaeological questions but the real power of statistics is that they can be more than
that - statistics can trigger new approaches to a data-set, generate new questions, and
it is this that makes the intuitive, iterative nature of a statistical analysis important.

Bivariate questions involve comparison of some kind and form the basis of much
archaeological analysis. Comparisons will probably be one or more of the following:

- Comparison using two categorical variables (including grouped continuous
variables). This is a contingency table approach (Chapter 3), an example being
material of spearhead by find context.
- Comparison using one categorical variable (including grouped continuous
variables) and one continuous variable (Chapters 3, 4 and 5). This approach
produces statistics for the continuous variable, using the techniques as in Step 4,
for each category of the categorical variable and compares them. A simple
example would be a histogram, mean and standard deviation for the maximum
length of spearheads from each category of find context - how do they
compare?

It is quite common in both of the above comparisons for one of the categorical
variables to be either time or position related. This results in the investigation of
temporal and spatial trends respectively - the two most important lines of enquiry in
archaeology.

- Comparison using two continuous variables. A scatterplot of the maximum
length by the maximum width of the spearheads is an example.

It is important to realise that:

- All three methods of comparison could include data from another data-set -
comparing data from two different sites or areas for example. How does the
maximum length of our spearheads compare to the maximum length of those
from a different area?
- All three methods of comparison can be developed to include techniques of
formal inference and hypothesis testing. Is there a statistically significant
association between the material of the spearheads and their find context or
could it have happened by chance? Is the relationship between the maximum
length and width of the spearheads significant?

The answering of such questions involves the concepts of probability theory and
statistical significance and move us into the second stage.

The inferential stage (Chapters 6 to 11)
Chapters 6 and 7 provide the underlying theory for the techniques involved in drawing
inferences from the data. Both should be read before attempting anything described in
Chapters 8 to 11.

It is important to realise that moving into the inferential stage is not an essential step,
the methods described above form the basis of many an excavation report or research
paper. The difficult part of a statistical analysis is often the initial posing of the


archaeological question in statistical terms. This has been compared with translating
between two different languages: archaeology has its own theoretical language and
statistics has an operational language. Once the translation has been done, and it is
clear just what relationship between which variables represents the archaeological
question to be answered together with which statistical technique is needed, it could be
that one of the descriptive methods will provide enough information.

There is a general move in statistics away from rigid confirmatory approaches (i.e. one
analysis produces 'the answer') towards a much more flexible exploratory approach,
this applies to inferential techniques as well as descriptive methods.

It has already been stated that some of the descriptive techniques mentioned above
form the basis for inferential statistics. Distribution characteristics such as the variance
and the mean can be tested (Chapters 8 and 9), as can relationships displayed by
scatterplots (Chapter 10) and contingency tables (Chapter 11). In every situation,
however, it is important to remember just what it is that is being tested, i.e. it is the
statistical significance. This is a very different thing to archaeological significance
and the two should not become conflated. We may identify patterns within the data
that are statistically significant at the 95% level but unless this can be translated back
into the theoretical language of archaeology, and be given meaning in archaeological
terms, it will not be archaeologically significant. Another problem, which again can
only be answered in archaeological terms, is that of which level of statistical
significance is really meaningful. If something is statistically significant at the 90%
level but not at the 95% level what does this mean in archaeological terms? Is it
important? Statistical analysis of archaeological data should not be reduced to a search
for statistical significance (see Chapter 6 for more on this).

Statistical significance, then, is formally defined and involves testing that is repeatable
(i.e. any two people could apply the same test to the same data and get the same
result). Archaeological significance is much more difficult to pin down. Almost any
identifiable pattern within a data-set can be subjectively analyzed and declared to be
significant either because it is the same as some existing pattern or because it is
different to some existing pattern, this involves testing that is often not repeatable.

While an increasing use and understanding of statistical techniques by archaeologists
will not close this rift between the two different methodologies, it should provide
alternative ways of approaching data.

Multivariate analysis (Chapter 12)

Some archaeological questions can be answered (and many more thought about) by
using the relatively simple univariate and bivariate techniques described above. For
many people and for many analyses these will be adequate. It has to be acknowledged,
however, that human beings and their resulting material and social worlds are
multi-dimensional and complex. That complexity can not always be reduced to single
variables or the relationship between two variables and this has resulted in a long
history of applying multivariate statistical techniques in archaeology.

We would still suggest that, as for the simpler techniques, multivariates are used in an
exploratory way. Many multivariate techniques produce some kind of graphical output
(together with statistics) which is descriptive in the sense that it simplifies and
presents patterns within the data. Chapter 12 offers a simple introduction to the two
main areas of multivariate techniques that have been used in archaeology. The first is
the general theme of clustering or grouping, the techniques of Principal Components
and Factor Analysis, Correspondence Analysis and Cluster Analysis. Given several
measurements on each of a set of objects can the objects be placed in groups so that
within each group the objects are similar but between the groups there are
interpretable differences. Secondly, given several measurements on a set of objects is
it possible to predict a variable of interest from the others, and if so which variables
are important in this prediction. These are the techniques of Multiple Regression and
Discriminant Analysis.

Our argument for multivariate techniques being used in an exploratory way is a simple
one. Because the statistics underlying these techniques are more complex than for
descriptive and inferential techniques they are in more danger of being seen as a
'black box'. It is essential to use a computer and 'answers' are always provided
whether or not you understand the manipulations being performed on the input data
you have provided. In one sense the process is 'objective' in that the same result will
always be attained from the same data whomever performs the analysis. In reality,
however, it is a deeply 'subjective' process because firstly, we decide on which
characteristics to measure and input as variables and, secondly, all of these techniques
involve making decisions during the process. For example, there are several different
methods of cluster analysis involving different ways of measuring the 'similarity'
between objects and then displaying them. So, just as when using a histogram it can be
enlightening to change the interval width and centre points, when using cluster
analysis it can be interesting to experiment with different methods and settings.


CHAPTER 3

TABULAR AND PICTORIAL DISPLAY

3.1 Basic aims and rules.
Descriptive statistics involve the display and summary of data. Tables, diagrams and
individual summary statistics enable a rapid understanding of the main characteristics
of a raw data set. The parameters of individual variables, different relationships
between two variables and trends and peaks within the data can all be recognised and
quantified with these simple techniques.

This chapter describes tabular and pictorial descriptive statistics. The next two
concentrate on the individual summary statistics usually classified as measures of
central tendency and measures of dispersion. The techniques in all three chapters are
exploratory in nature. They can be used together, several times over in different
combinations, to draw out salient points from a data set.

For a table or picture to convey the maximum information, in a clear and
unambiguous way, several simple rules should be followed:

1. Include a title.
2. All units of measurement must be clearly stated. If percentages are used try and
include actual counts as well, or at least a total so that counts can be calculated.
3. State the source of the data if it is not obvious.
4. Use footnotes to define ambiguous or non-standard terms and to help clarification
generally.
5. Use a key for all symbols and shadings.
6. Keep it simple so that the important information is not swamped by unnecessary
detail.
7. Diagrams must be sufficiently large for any detail to be clear.

3.2 Tabulating measurements.
The starting point for any statistical analysis is a data set. This will consist of a table of
observations which will be measurements at different levels as defined in Chapter 1. A
table is made up of horizontal rows and vertical columns with a cell at each
intersection which contains a value. Each row represents an item, sometimes called a
case, and each column is an attribute of the item, usually called a variable. It is
obvious that in Table 1.1 each row is one spearhead and each column is a variable as
described in Chapter 1.

Table 1.1 shows typical archaeological data in the form of tabulated measurements. It
consists of a mixture of categorical and continuous variables.

It is important that each row has a unique identifier. If this is not included within the
list of variables (some kind of catalogue number, for example) then a new variable
<ROW NUMBER> should be created. Each column should also be labelled with a
<VARIABLE NAME>. Try to use meaningful variable names, even if abbreviated,
rather than something like V1, V2, V3 etc.

The standardisation of units within a table can avoid confusion. This applies to all
measurements for a single variable and to all variables within a table. All of the
observations on the variable <MAXLE>, for example, are in centimetres. It would be
unacceptable to have some recorded in centimetres and others in inches, or even in
millimetres. All six variables in Table 1.1 that record a distance measurement are in
centimetres. Again, it would be confusing if different units were used for different
variables.

All six are also recorded to one decimal place with the decimal points aligned
vertically. It is advisable to standardise the number of decimal places, certainly within
the values of one variable and, if possible, within the whole table. This not only makes
the table easier to understand visually but can also simplify future analysis.

3.3 Tabulating frequencies.
3.3.1 One variable.
It is usually the case in archaeology that a data set consists of a large number of items
(rows). The tabulation of measurements, therefore, is of little use in trying to analyze
the whole data set at any level. The usual way around this is to work with frequencies
instead of measurements. A frequency (usually abbreviated to f) is the number of
times a particular value (measurement) occurs, these are displayed in a frequency
table.

If the variable is categorical, a convenient way of building up a frequency table is to
use a system of tally marks. It can be seen from Table 3.1 that a tally mark is made
for each measurement alongside the category into which it falls.

Condition    Tally                  Frequency
1            ++++ |||                   8
2            ++++ ++++ ++++ |||        18
3            ++++ ||||                  9
4            ++++                       5
Total                                  40 spearheads

Table 3.1. A univariate frequency table showing the condition of spearheads,
<COND>.
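
The tallying behind Table 3.1 can be mimicked in a few lines. The sketch below is an
illustration rather than part of the original analysis: it rebuilds a list of coded
observations from the category totals in Table 3.1 (8, 18, 9 and 5), counts them, and
adds the percentages and cumulative figures that a frequency table often carries.

```python
from collections import Counter

# Coded <COND> observations rebuilt from the totals in Table 3.1.
observations = [1] * 8 + [2] * 18 + [3] * 9 + [4] * 5


def frequency_table(values):
    """Return rows of (category, f, percent, cumulative f, cumulative percent)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, running = [], 0
    for category in sorted(counts):
        f = counts[category]
        running += f
        rows.append((category, f, 100 * f / total, running, 100 * running / total))
    return rows


table = frequency_table(observations)
for category, f, pct, cum_f, cum_pct in table:
    print(f"{category}  {f:3d}  {pct:5.1f}  {cum_f:3d}  {cum_pct:5.1f}")
```

Keeping the counts alongside the percentages follows rule 2 above: a reader can always
recover the actual numbers behind a proportion.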


The tally marks are bundled into fives using the 'five-bar gate' method. At the end of
each row is a row total and at the bottom is a table total. In this example the variable is
<COND>; it is ordinal with four categories. Because the table only describes one
variable it is a univariate frequency table.

It is often of interest in archaeology to comment on proportions as well as counts
hence it is common to convert the figures to percentages. Table 3.2 shows the same
data as the last table but in a slightly different form.

Condition    Frequency    Percent    Cumulative    Cumulative
                                     Frequency     Percent
Excellent        8          20.0          8           20.0
Good            18          45.0         26           65.0
Fair             9          22.5         35           87.5
Poor             5          12.5         40          100.0
Total           40         100.0

Table 3.2. A univariate frequency table showing the condition of spearheads,
<COND>.

Column one has category labels (sometimes called 'value labels') rather than the
meaningless values 1, 2, 3 and 4. The second column shows category frequencies and
the third shows category percentages. The fourth and fifth columns are cumulative
frequencies and percentages established by adding consecutive category values. The
cumulative figures can be of value if the table has many rows or if categories need to
be combined. From the table above, for example, we could deduce that 65% (26) of
the spearhead sample were in at least good condition and 35% (14) were less than
good.

If the variable of interest is a continuous rather than a categorical variable a grouped
frequency table should be used. This involves dividing the range of the values for the
variable into classes and then proceeding as above (treating the variable as
categorical).

There are no firm rules about how many classes to use although it is normal to have
between five and fifteen of equal size. Less than five would lose too much information
and more than fifteen would make the table too complicated. Classes of equal size
give a better idea of the distribution of the variable, as shown in Table 3.3.

Interval (cm)    Frequency (f)
1.25 - 1.75           0
1.75 - 2.25           2
2.25 - 2.75           5
2.75 - 3.25           3
3.25 - 3.75           5
3.75 - 4.25           7
4.25 - 4.75           4
4.75 - 5.25           1
5.25 - 5.75           5
5.75 - 6.25           6
6.25 - 6.75           1
Total                39

Table 3.3. A grouped univariate frequency table for maximum width, <MAXWI>.

Table 3.3 shows the continuous variable <MAXWI> grouped into classes of 0.50 cm.
It is important that class intervals do not overlap and have no gaps that could contain a
value. With these data, recorded to one decimal place, an interval 1.3 to 1.7 would
contain all true measurements from 1.250000..... to 1.749999..... This is replaced by
the interval 1.25 to 1.75 as in Table 3.3. Although the value 1.75 occurs twice (in two
different intervals) this will not cause a problem since the actual data value of 1.75
will never be recorded because of the accuracy of the data (it will be either 1.7 or 1.8).
The precision of the class intervals will depend on the level of accuracy of the
variable. If the coding is such that 'boundary values' do occur (values that could be
assigned to two intervals) it must be decided whether to always put them into the
lower or higher interval.

3.3.2 Two variables.
It is also possible to produce a frequency table for two variables at once. This is a
bivariate frequency table, more usually called a two-way contingency table.
Contingency tables are the basis of a group of statistical tests of significance which are
described in Chapter 11. They are, however, also important in their own right as a
means of rapidly assessing the relationship between two variables as shown in Table
3.4. Here, each spearhead has been assigned to one of the six cells according to its
values on the two categorical variables <MAT> and <CON>. A tallying procedure
similar to that described above is used to produce the six cell frequencies.
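
Cross-tabulating two categorical variables, as in Table 3.4, is the same tallying carried
out on pairs of values. The sketch below is a hedged illustration: the paired
observations are rebuilt from the cell counts printed in Table 3.4, the category labels
are spelled out in full rather than coded, and the function name `percentages` is
hypothetical. Each cell is expressed as a percentage of its material total, its context
total and the table total.

```python
from collections import Counter

# Paired observations (material, context) rebuilt from the cell counts in Table 3.4.
cell_counts = {
    ("Bronze", "Stray find"): 19, ("Bronze", "Settlement"): 1,
    ("Iron", "Stray find"): 8, ("Iron", "Settlement"): 5, ("Iron", "Burial"): 7,
}
pairs = [cell for cell, n in cell_counts.items() for _ in range(n)]

# Tally the cells and the marginal frequencies.
cells = Counter(pairs)
by_material = Counter(material for material, _ in pairs)
by_context = Counter(context for _, context in pairs)
total = len(pairs)


def percentages(material, context):
    """Return the cell count as a percentage of its material, its context and the table."""
    f = cells[(material, context)]
    return (round(100 * f / by_material[material], 1),
            round(100 * f / by_context[context], 1),
            round(100 * f / total, 1))


print(by_context["Stray find"], by_material["Bronze"], total)
print(percentages("Bronze", "Stray find"))
```

The second line of output gives the three percentages for bronze stray finds: 95.0% of
the bronze spearheads, 70.4% of the stray finds and 47.5% of the whole sample.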


                     Context
                     Stray find    Settlement    Burial    Group Total
Material    Bronze       19             1                      20
            Iron          8             5           7          20
Group Total              27             6           7          40

Table 3.4. A bivariate frequency table (two-way contingency table), context, <CON>,
by material, <MAT>.

If the variables are nominal or ordinal then the categories to be used will be their
values (as in this case). If one or both of the variables are on a continuous scale then
decisions about grouping the values will have to be taken. Sometimes the grouping of
a variable will produce a contingency table with many empty cells (a sparse table).
Such a sparse table can cause problems if statistical tests are to be performed so a
common solution is to redefine the grouping to produce fewer groups with higher
frequencies. This is discussed in more detail in Chapter 11.

Table 3.4 shows a 2 by 3 contingency table since it has 2 rows and 3 columns. The
row and column sub-totals (20, 20, 27, 6 and 7) are called the marginal frequencies
and the table total (40) is also shown. As with univariate frequency tables,
proportions can be shown by converting the frequencies to percentages. Three
different percentages can be calculated as shown in Table 3.5.

                               Context                                Group
                               Stray find    Settlement    Burial     Total
Material    Bronze    Count        19             1                     20
                      Row%        70.4%         16.7%                  50.0%
                      Col%        95.0%          5.0%                 100.0%
                      Table%      47.5%          2.5%                  50.0%
            Iron      Count         8             5           7         20
                      Row%        29.6%         83.3%       100.0%     50.0%
                      Col%        40.0%         25.0%        35.0%    100.0%
                      Table%      20.0%         12.5%        17.5%     50.0%
Group                 Count        27             6           7         40
Total                 Row%       100.0%        100.0%       100.0%    100.0%
                      Col%        67.5%         15.0%        17.5%    100.0%
                      Table%      67.5%         15.0%        17.5%    100.0%

Table 3.5. A contingency table showing row, column and table percentages.

Each cell contains its frequency as a percentage of the row, the column and the whole
table as indicated by the 'within cell order'. This shows the power of contingency
tables in being able to present a lot of information quickly and simply. For example,
Table 3.5 shows us amongst many other things that 67.5% of all our spearheads are
stray finds, 70.4% of all stray finds are bronze and that 95% of all bronze spearheads
are stray finds!

3.4 Pictorial displays for nominal and ordinal data.
3.4.1 The bar chart.
The bar chart is the most popular method of representing categorical data, it is
sometimes called a 'bar diagram' or a 'block diagram'. The categories of the variable
are positioned along the horizontal axis and a measure of popularity is the scale of the
vertical axis.

A bar chart is a graphic version of a frequency table. The vertical scale can be in
frequencies or percentages (in which case it is a percentage bar chart). If percentages
are used the frequency for each bar should also be shown.

[Figure 3.1: bar chart; horizontal axis: Condition (Excellent, Good, Fair, Poor);
vertical axis scaled from 0 to 50; each bar labelled with its frequency.]

Figure 3.1. A vertical percentage bar chart for condition, <COND>.

The bars should be of the same width with each one separated by a gap to show that
the variable is categorical and not continuous. It is quite acceptable to reverse the two


axes and produce a horizontal bar chart with the bars horizontal. Figure 3.1 shows a
vertical percentage bar chart with frequencies stated.

There are two variations of bar charts that allow the representation of two variables in
one diagram. These are graphical equivalents to bivariate frequency tables. If we
wanted to see how the condition of spearheads varied according to material we could
use either a multiple bar chart or a compound bar chart.

Figure 3.2 shows a percentage multiple bar chart. Notice that the bars for the two
categories of <MAT> are drawn together for each category of <COND>.

[Figure 3.2: multiple bar chart; horizontal axis: Condition (Excellent, Good, Fair,
Poor); vertical axis: Percent; key: Material (Bronze, Iron).]

Figure 3.2. A vertical percentage multiple bar chart for condition, <COND> and
material, <MAT>.

Figure 3.3 shows a compound bar chart. The bar for each category of <COND> is a
total proportion (the same as Figure 3.1) divided according to the values of <MAT>.
Both multiple and compound bar charts require some form of shading to represent the
categories of the second variable together with an appropriate key.

If there are many categories (more than three or four) for the second variable,
compound bar charts can become difficult to interpret and the multiple version may
well be superior. Both Figures 3.2 and 3.3 are relatively simple and show the better
preservation of bronze spearheads compared to iron.

[Figure 3.3: compound bar chart; horizontal axis: Condition (Excellent, Good, Fair,
Poor); vertical axis: Percent; key: Material (Iron, Bronze).]

Figure 3.3. A vertical percentage compound bar chart for condition, <COND> and
material, <MAT>.

3.4.2 The pie chart.
A pie chart is a circular diagram divided into sectors where each sector represents a
value of a categorical variable. Each sector is proportional in size corresponding to the
frequency (or percentage) value of that category. Figure 3.4a shows a percentage pie
chart for the condition of the spearheads (equivalent to Figure 3.1).

Calculation:
When drawing a pie chart it is the angle at the centre which is proportional for each
category. The proportion of 360 degrees can be calculated in the following way using
the figures for Figure 3.4;


Excellent = 20%   = (20/100) x 360   = 72 degrees.
Good      = 45%   = (45/100) x 360   = 162 degrees.
Fair      = 22.5% = (22.5/100) x 360 = 81 degrees.
Poor      = 12.5% = (12.5/100) x 360 = 45 degrees.

A protractor can then be used to draw the pie chart. If some of the sectors are so small
in size that the labelling will not fit within them a system of shading and a key can be
used.

[Figure 3.4: pie chart; sectors labelled with category, frequency and percentage, for
example Good, 18, 45.0%.]

Figure 3.4. A pie chart showing proportion categories of condition, <COND>,
exploded to emphasise 'excellent'.

If the purpose of the pie chart is to emphasise one particular sector this can be
achieved by pulling out or 'exploding'. Figure 3.4 shows an exploded pie chart
which focuses attention on the proportion of spearheads in excellent condition.

3.5 Pictorial displays for continuous data.
3.5.1 The histogram.
A histogram is the pictorial equivalent of the grouped frequency table; it displays a
continuous variable that has been divided into classes. As with bar charts, histograms
can be horizontal or vertical although the latter is much more usual (as in Figures 3.5
to 3.7).

There is a fundamental difference between a bar chart and a histogram. In a bar chart
each block represents a category and block widths are equal so that frequency is
measured by the height of each block. In a histogram the width of each block is
proportional to the class interval (which need not be constant) and it is the area of each
block that measures the frequency. It is usual to have equal class intervals but the
choice of width can affect the appearance.

[Figure 3.5: histogram; horizontal axis: Length of socket (cm), class midpoints marked
from 2.0 to 15.0; vertical axis: frequency.]

Figure 3.5. A histogram of socket length, <SOCLE>, with a 1 cm interval.

Figure 3.5 shows a histogram of <SOCLE> with a 1.0 cm class width and class
midpoints marked. Notice that adjacent blocks touch to indicate a continuous variable.

Notice also the relationship between the accuracy of measurement and class width.
The interval 4.6 to 5.5, for example, is strictly from 4.55 to 5.55 since any socket
length whose true value is 5.53 would have been recorded as 5.5 and a true value of
4.56 as 4.6. Because of this the class width is 1.0 exactly.
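
How much a histogram depends on the analyst's choices can be seen by counting the same
values into classes of different widths. The sketch below is a hedged illustration: the
socket lengths are invented (they are not the <SOCLE> column of Table 1.1) and the helper
`class_frequencies` is a hypothetical name. Class boundaries are placed at values such as
1.55 that cannot occur in data recorded to one decimal place, so no value can fall on a
boundary.

```python
def class_frequencies(values, start, width):
    """Count values into equal classes; start and width are in cm.

    Work in whole hundredths of a centimetre so that comparisons with
    boundaries such as 1.55 are exact.
    """
    start_h, width_h = round(start * 100), round(width * 100)
    counts = {}
    for v in values:
        index = (round(v * 100) - start_h) // width_h
        lower = (start_h + index * width_h) / 100
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Invented socket lengths in cm, recorded to one decimal place.
lengths = [2.4, 3.1, 3.5, 3.6, 4.5, 4.6, 5.5, 5.8, 6.6, 7.8, 9.2, 13.5]

narrow = class_frequencies(lengths, start=1.55, width=1.0)
wide = class_frequencies(lengths, start=1.55, width=3.0)
print(narrow)
print(wide)
```

Each result maps the lower boundary of a class to its frequency, and empty classes are
simply absent. Comparing the two results shows how a wider class hides detail that the
narrower one reveals.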


Figure 3.6 shows the same data as Figure 3.5 but with the class widths changed so that
they are each of width 4 cm. Some of the details have been hidden but the overall
shape is still clear.

[Figure 3.6: histogram; horizontal axis: Length of socket (cm), class midpoints 2.0,
6.0, 10.0 and 14.0; vertical axis scaled from 0 to 30.]

Figure 3.6. Figure 3.5 redrawn with different class intervals.

Because the values of the class width and the class midpoint for a histogram are under
the control of the analyst, the use of histograms must be approached with caution.
They are an exploratory tool which can produce many different results from the same
data set simply by varying the class midpoint and/or the class width.

Figure 3.7 shows a histogram of <SOCLE> as in Figure 3.5 but with a class width of
3.0 cm instead of 1.0 cm and different midpoints as marked.

There are differences between Figures 3.5, 3.6 and 3.7 reinforcing the exploratory
nature of histograms. It is probably a little naive and can certainly be misleading to
produce just one histogram and accept it as the only interpretation of the data.

[Figure 3.7: histogram; horizontal axis: Length of socket (cm), class midpoints 1.5,
4.5, 7.5 and 10.5 marked; vertical axis scaled from 0 to 30.]

Figure 3.7. Figure 3.5 redrawn again with different class intervals and different
midpoints.

3.5.2 The stem-and-leaf plot - an alternative histogram?
The stem-and-leaf plot (or stem-and-leaf display or stemplot) is a relatively new type
of diagram which forms part of the approach known as Exploratory Data Analysis
(EDA). It is similar in many ways to the histogram but has one important advantage.
The stem-and-leaf plot displays the actual data values whereas a histogram displays
only the frequencies of each class. Stem-and-leaf plots are designed for interval and
ratio data.

Using the same data as Figure 3.5 (<SOCLE> which is column 9 in Table 1.1) we can
see in Figure 3.8 how a stem-and-leaf plot is built up.

Each data value is split into two parts: a stem and a leaf, in this case the digits before
the decimal point are the stem and those after the point are the leaf. The stem values
are listed once only to the left of the vertical line and the leaves are added to their


appropriate stems. Figure 3.8 shows the first ten values in column 9 of Table 1.1
plotted as stems and leaves.

stem | leaves
   2 |
   3 | 11
   4 | 56
   5 | 28
   6 | 61
   7 | 8
   8 |
   9 | 2
  10 |
  11 |
  12 |
  13 |
  14 |

Figure 3.8. The beginnings of a stem-and-leaf plot for socket length, <SOCLE>.

Figure 3.9 shows the diagram completed with all 39 values plotted. Notice that the leaf
values have also been ordered within each stem to convey extra information.

stem | leaves
   2 | 4
   3 | 01114456
   4 | 2355668
   5 | 124589
   6 | 0166
   7 | 258
   8 | 01467
   9 | 26
  10 | 2
  11 |
  12 |
  13 | 5
  14 | 4

Figure 3.9. The completed stem-and-leaf plot for socket length, <SOCLE>.

The advantage over histograms is immediately apparent because the original data
values are recoverable from the stem-and-leaf plot. It can be seen that the distribution
is biased towards the lower end of the scale, the distribution has gaps (stems without
leaves) and that the two values of 13.5 and 14.4 are high outliers.

One decision to be made when constructing a stem-and-leaf plot is the size of the leaf
unit. In Figures 3.8 and 3.9 the leaf unit is 0.1 and this indicates the units of the data
values. Decimal points are not used in stem-and-leaf plots which means that the
numbers 3500, 350, 35, 3.5 and 0.35 would all be split into a stem of 3 and a leaf of 5.
The differences are indicated in the Leaf Unit statement as follows;

3,500    Leaf Unit = 100
350      Leaf Unit = 10
35       Leaf Unit = 1
3.5      Leaf Unit = 0.1
0.35     Leaf Unit = 0.01

The choice of leaf unit will depend on the particular application and on the range of
values to be displayed. Complications can arise if a leaf contains more than two digits.
The number 583, for example, may end up as a stem = 5 and a leaf = 8 with the three
being dropped (leaf unit = 10).

Stem-and-leaf plots can become unwieldy with large data sets although it is possible
to increase the number of horizontal lines per stem. For example, one stem value could
have the leaf values 0 to 4 and 5 to 9 on a line each.

3.5.3 The ogive - the total so far.
The ogive (or cumulative frequency graph) is a graphical technique for showing
cumulative frequencies as described in section 3.3.1. An ogive takes the form of a
graph with the values on the horizontal axis representing the stated value and all
values below. The vertical axis can be scaled in frequencies or percentages (or both).
Ogives can be used for grouped data and for actual values of continuous data.

Figure 3.10 shows the same data as used in Figure 3.5, socket length, <SOCLE>, with
frequencies as follows:

Interval (cm)      Frequency    Cumulative
                                Frequency
 1.55 -  2.55          1             1
 2.55 -  3.55          6             7
 3.55 -  4.55          4            11
 4.55 -  5.55          8            19

26 27
M. Fletcher and G.R. Lock Tabular and pictorial display

5.55 - 6.55        5            24
6.55 - 7.55        3            27
7.55 - 8.55        5            32
8.55 - 9.55        3            35
9.55 - 10.55       2            37
10.55 - 11.55      0            37
11.55 - 12.55      0            37
12.55 - 13.55      0            37
13.55 - 14.55      2            39

For any value along the horizontal axis the corresponding point on the vertical axis
shows how many are less than or equal to that value. In Figure 3.10, for example, 7
spearheads have a socket length of less than 3.55 cm and 37 less than 11.55 cm.

[Figure 3.10: ogive. Vertical axis: Cumulative frequency (0 to 40); horizontal axis: Socket length (cm) (0 to 16).]

Figure 3.10. An ogive of socket length, <SOCLE>.

The shape of the curve betrays the nature of the distribution of the variable. The form
of an ogive is that the curve is always increasing upwards. Large differences in
consecutive class frequencies will produce a steep section of curve whereas small
accumulations will result in a flat curve. It is convention to join the points of the curve
on an ogive with straight lines.

The ogive allows the rapid assessment of some useful characteristics of a distribution.
In Figure 3.10, for example, we can see that 50% of all the spearheads have a socket
length between 2.0 and 5.0 cm and the other 50% between 5.0 and 14.0 cm. More
formally, the Median, Quartiles and Percentiles of a distribution can be calculated
from the ogive, these are described in Chapters 4 and 5.

3.5.4 The scatterplot - displaying two variables.
Most of the techniques described so far in this chapter refer to the display of a single
variable, the exceptions are the bivariate frequency table and the multiple and
compound bar charts. The scatterplot (or scattergram, scattergraph or scatter diagram)
allows the plotting of the values of one variable against another variable.

Scatterplots provide a quick and easy visual estimate of the relationship (or
correlation) between the two variables. It is essential that the two variables are paired;
they must be two attributes of the same item or case. We can go further than this and
state that they must be paired in an archaeologically meaningful way. For spearheads,
the <MAXLE> and <AGE OF FINDER> (if available!) are paired variables which
may be correlated although the relationship would be difficult to explain in
archaeological terms, whilst <MAXLE> and <MAXWI> could well have a
meaningful relationship.

Scatterplots can be drawn for variables measured at the ordinal, interval and ratio
levels. A scatterplot takes the form of a graph where a horizontal (x) axis and a
vertical (y) axis define an area of two-dimensional space. The axes are scaled
according to the range of values for the variable each represents. It is standard practice
for the points of lowest measurement to meet in the bottom left-hand corner. Each axis
should be labelled with the variable name and unit of measurement.

As the two variables are paired the items will be positioned in the two dimensional
space according to their values on the two axes. Points are marked with an appropriate
symbol and not joined. Figure 3.11 shows a scatterplot of <MAXLE> and <MAXWI>
from Table 1.1. This scatterplot shows a positive association or correlation between
the width and length of spearheads suggesting that longer ones tend to be wider.

M. Fletcher and G.R. Lock
Tabular and pictorial display

[Figure 3.11: scatterplot. Vertical axis: maximum width; horizontal axis: Maximum length (cm) (0 to 80). Different symbols distinguish the two materials, Iron and Bronze.]

Figure 3.11. Scatterplot of maximum width, <MAXWI>, and maximum length,
<MAXLE>, by material, <MAT>.

As in Figure 3.11, it is possible to introduce a third variable into a scatterplot. This
must be categorical so that each category can be represented by a different symbol, in
Figure 3.11 the two categories of <MAT> are shown. If too many different symbols
are displayed on the same plot it can become confusing and difficult to interpret, three
or four is the maximum. If two points fall on exactly the same position they are shown
by a 2 on the diagram (or a 3 for three points etc.).

It is also possible to label each item on the plot for identification. In Figure 3.11 the
unique value in column 1 of Table 1.1 could be displayed next to each point. Again,
though, care must be taken not to overcrowd the diagram.

From a scatterplot it is possible to get a quick visual estimate of the correlation
between the two variables displayed. This could be a positive or negative linear
correlation, a non-linear correlation or a zero correlation. This visual estimate often
forms the first stage of a more formal test of correlation and significance using one of
the correlation coefficients that are available. These are explained in detail in Chapter
10.

Outliers such as the one large bronze spearhead in Figure 3.11 are immediately visible
in a scatterplot.

The distributions could break down into different size groups which will often show as
clusters of points in a scatterplot suggesting a classificatory line of enquiry. If we
decided that the maximum width and the ratio of socket length to maximum length
were significant enough variables to base a simple typology of spearheads on, a
clustered result from a scatterplot would indicate classes or 'types'.

Points to remember:
Methods of tabular and pictorial display are some of the most important ways of
presenting archaeological data and results, but only if they are capable of
interpretation by the reader! Keep them simple, clear and uncluttered. Include
information on the raw data where possible.

All of these methods are EXPLORATORY in nature. Use them in different ways on
different variables to extract information from the data which could be of interest. It is
often dangerous to just do one analysis and present the result as 'THE ANSWER'.
CHAPTER 4

MEASURES OF POSITION - THE AVERAGE

4.1 Introduction.
One of the less contentious uses of statistics is to condense and describe large bodies
of data in a precise manner. Looking at the raw data in Table 1.1 it is impossible to get
an immediate understanding of the spearheads because there is too much detail. What
is an average spearhead? How many are larger or smaller than average? The tabular
and pictorial displays of the last chapter go some way towards summarising and
making sense of the data-set but it is possible to do more, and to be yet more precise.

Although the term 'average' is often used it is, in fact, very imprecise. When most
people talk of the average (add up all the values and divide by the number of values)
they are actually referring to the mean. There are two other common measures of
position or average which are useful in archaeology: the mode and the median. It is
important to use the correct term for the particular type of 'average' being used. All
three measures have different advantages and disadvantages, the most suitable can
depend on the level of measurement of the variable being used (see Chapter 1).

4.2 The Mode.
The mode is the only measure of position that can be used for nominal data. It can be
used for variables measured at any level although interval and ratio variables are
usually grouped.

The mode of a distribution is that value that occurs the most, i.e. it is the most popular,
the most fashionable, it has the highest frequency.

Figure 4.1 shows a barchart of the ordinal variable <COND>, there are 8 spearheads in
excellent condition, 18 are good, 9 fair and 5 poor. Value 2 (good) is the modal class,
it is simply the most popular.

[Figure 4.1: bar chart. Vertical axis: Percent (up to 50.0%); horizontal axis: Condition (Excellent, Good, Fair, Poor).]

Figure 4.1. The condition, <COND>, of the spearheads showing the modal class.

Figure 4.2 shows a histogram of the ratio variable <LOSOC>. The values have been
grouped with a class interval of 0.1 cm.

Note that there are two classes with the highest frequency of 5, 1.65 to 1.75 and 2.35
to 2.45. This distribution is, therefore, bimodal and the two modes can be estimated to
be 1.7 and 2.4. If there had been three modes it would be trimodal, etc.

Because the mode is a relatively simple statistic there are problems with it. It is an
unstable measure and can swing wildly by the alteration of only a few values. Figure
4.3 shows a histogram of <DATE>: 300 to 400 BC is the modal class with a frequency
of 8. It would only take one more in the 800-900 group to make the mode very
different. Also, the mode is not sensitive to frequencies in any of the other class
intervals. They could all have values of 1 or they could all have values of 7, the mode
would not alter.

It must be remembered when using a grouped interval or ratio variable that class
intervals and midpoints can drastically influence the mode.

Despite these problems with the mode it is still often useful to know the 'typical' or
'most popular' value in a distribution. If the variable is nominal then there is no
alternative, to speak of the 'average' is to use the mode.

The mode is also a useful measure if the distribution is asymmetrical (skewed) rather
than symmetrical (see section 4.5).
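
As an illustration only (it is not part of the original text), the counting behind a modal class can be sketched in Python using the standard library. The <COND> counts are those quoted above and the 40 <DATE> values are those listed in section 4.3; the helper name modal_classes and the choice of 100-year classes (300 standing for 300 to 399) are assumptions made to mirror Figure 4.3.

```python
from collections import Counter

# <COND> frequencies as quoted in the text.
cond_counts = Counter({"excellent": 8, "good": 18, "fair": 9, "poor": 5})

# The 40 <DATE> values (years BC) as listed in section 4.3.
dates = ([50, 100, 150, 200] + [300] * 3 + [350] * 5 + [400] * 3 + [450] * 3
         + [600] * 2 + [700] * 3 + [800] * 7 + [900] * 2 + [1000] * 3 + [1200] * 5)


def modal_classes(counts):
    """Return every class sharing the highest frequency (hypothetical helper)."""
    top = max(counts.values())
    return sorted(k for k, v in counts.items() if v == top)


# Assumed grouping: 100-year classes labelled by their lower limit.
date_classes = Counter((d // 100) * 100 for d in dates)

cond_mode = modal_classes(cond_counts)   # the modal condition
date_mode = modal_classes(date_classes)  # the modal 100-year class
print(cond_mode, date_mode, date_classes[300], date_classes[800])
```

Because the helper returns every class that ties for the highest frequency, adding a single value to the 800 class would make it report two modal classes, which is the instability described above.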

[Figure 4.2: histogram. Vertical axis: Frequency; horizontal axis: Width of lower socket (cm) (1.6 to 2.8).]

Figure 4.2. The bimodal distribution of socket width, <LOSOC>.

4.3 The median.
The median (from the Latin for 'middle') of a distribution is that value which cuts the
distribution in half. One half of the values will be larger than the median and the other
half smaller.

The median can be calculated for variables that are ordinal or higher but not for
nominal variables. It is most suitable for ordinal variables.

Calculation:
List the values in order, for example the variable <DATE> as shown in Figure 4.3:

50, 100, 150, 200, 300, 300, 300, 350, 350, 350, 350, 350, 400, 400, 400, 450, 450,
450, 600, 600, 700, 700, 700, 800, 800, 800, 800, 800, 800, 800, 900, 900, 1000, 1000,
1000, 1200, 1200, 1200, 1200, 1200.

[Figure 4.3: histogram with the modal class marked. Vertical axis: Frequency; horizontal axis: Date BC (0 to 1200).]

Figure 4.3. A histogram of date, <DATE>, showing the modal class.

If the number of values is even (40 in this case) the median (abbreviated to Md) is
halfway between the middle two,

\[ \mathrm{Md} = \frac{\text{20th value} + \text{21st value}}{2} = \frac{600 + 700}{2} = \frac{1300}{2} = 650 \]

If the number of values is odd the median will be an actual value. Suppose the first
spear in the list was not dated leaving only 39 values, the median is now the 20th
value, 600. There are 19 values above and 19 below the 20th value.

Thirty-eight of the spearheads have a measurement for the variable <MAXLE>, the
median is 17.8. It will be seen from Table 1.1 that although 17.8 is not an actual value
there are 19 above and 19 below it.
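
The listing method just described can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the book; the function name median is simply a convenient label and the data are the 40 <DATE> values given above.

```python
def median(values):
    """Median by the listing method: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is an actual value.
        return ordered[mid]
    # Even number of values: halfway between the middle two.
    return (ordered[mid - 1] + ordered[mid]) / 2


# The 40 <DATE> values (years BC) as listed in the text.
dates = ([50, 100, 150, 200] + [300] * 3 + [350] * 5 + [400] * 3 + [450] * 3
         + [600] * 2 + [700] * 3 + [800] * 7 + [900] * 2 + [1000] * 3 + [1200] * 5)

date_median = median(dates)
print(date_median)  # 650.0, halfway between the 20th value (600) and the 21st (700)
```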

The median can also be calculated from the ogive (see Chapter 3.5.3) by reading off
the value of the variable (horizontal axis) that corresponds to half of the total
frequency (vertical axis).

As with the mode, changes in just one or two values can have an effect on the median.
In the <DATE> example above changing just the 20th value to 700 would cause the
median to also change to 700.

The median, however, has the advantage of not being sensitive to occasional extreme
values (outliers) which can seriously influence the mean (see below).

4.4 The mean.
Strictly speaking the mean described here is the arithmetic mean. There are other
means such as the 'harmonic mean' and the 'geometric mean' but they are
infrequently used in the social sciences and will not be detailed here.

The mean is the most common form of average and can be used on interval or ratio
data but not nominal or ordinal.

The mean is the most important measure of position because a lot of further statistical
analyses are based on it. Much standard statistical theory is devoted to testing means
and the variation about the mean.

Calculation:
The usual notation for the mean of a variable x is x̄ (x bar).
Sum the values and divide by the number of values.

Formula:

\[ \bar{x} = \frac{\sum x}{n} \]

where:
Σ (sigma) = the sum of
x = the individual value
n = the total number of values.

Using the variable x = <MAXLE>:

\[ \bar{x} = 785.46 / 38 = 20.67 \]

Note that calculating a mean usually produces an answer to several decimal places,
especially when using a calculator. Always round the answer to a sensible level
of accuracy when quoting it. In this instance the level of accuracy is meaningful but it
may not always be so. Values that are recorded solely as integers could produce means
to two or three decimal places; check that this is archaeologically sensible.

The mean is truly representative of a distribution if the values are grouped closely
around a central value. It is sensitive to all values in the distribution, however, and can
be very misleading. If the distribution is widely spread, unevenly distributed, has
groups towards the extremes or even just occasional outliers, the mean on its own may
not be a good measure of position or average.

4.5 Comparing the mode, median and mean.
It is important to understand that the mode, median and mean are three quite different
measures of position which can give three different values when applied to the same
data-set. The logic behind their calculation is different as they are measuring different
qualities of the same distribution.

[Figure 4.4: three hypothetical distributions, a) symmetrical, b) positive skew and c) negative skew, with the positions of the mode, median and mean marked.]

Figure 4.4. Symmetrical and asymmetrical distributions.
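
To see the three measures disagree on one data-set, the following Python sketch (an illustration, not the authors' own calculation) applies them to the 40 <DATE> values listed in section 4.3. It relies on the standard statistics module; the mean is written out as sum divided by n to match the formula in section 4.4. Note that the mode here is taken from the ungrouped values, so it need not agree with the modal class of the grouped histogram in Figure 4.3.

```python
from statistics import median, multimode

# The 40 <DATE> values (years BC) as listed in section 4.3.
dates = ([50, 100, 150, 200] + [300] * 3 + [350] * 5 + [400] * 3 + [450] * 3
         + [600] * 2 + [700] * 3 + [800] * 7 + [900] * 2 + [1000] * 3 + [1200] * 5)

date_modes = multimode(dates)          # most frequent ungrouped value(s)
date_median = median(dates)            # halfway between the 20th and 21st values
date_mean = sum(dates) / len(dates)    # sum of the values divided by n

print(date_modes, date_median, date_mean)  # [800] 650.0 635.0

# The <MAXLE> mean quoted in the text, rounded to a sensible level of accuracy.
maxle_mean = round(785.46 / 38, 2)
print(maxle_mean)
```

The ungrouped mode (800) differs from the modal class of 300 to 400 BC reported for the grouped data, which underlines the earlier warning that class intervals can drastically influence the mode.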

All three measurements are sensitive to the symmetry (or skewness) of the
distribution. Figure 4.4 shows three hypothetical distributions, a) is symmetrical
whereas both b) and c) are asymmetrical; b) is positively skewed and c) is negatively
skewed.

All three measurements are read from the horizontal axis.

In the symmetrical distribution the mode, median and the mean all have the same
value. Note that in both of the skewed distributions the three values are different with
the mode at the 'highest' point, the mean towards the tail of the distribution and the
median in between.

The mean can be affected by a few low scores in a negative skew or by a few high
scores in a positive skew. In both cases it is not a good measure of position and if used
alone would not accurately describe the distribution. For skewed distributions it is
advisable to use all three measures as shown in Figures 4.5 and 4.6.

Width of lower socket

Class interval (cm)    f     cf
1.5-1.7                11    11
1.7-1.9                6     17
1.9-2.1                4     21
2.1-2.3                8     29
2.3-2.5                5     34
2.5-2.7                3     37
2.7-2.9                3     40

[Figure 4.5: histogram with the mode marked, and ogive. Horizontal axis: Width of lower socket (cm) (1.4 to 2.8).]

Figure 4.5. Frequency table, histogram and ogive for socket width, <LOSOC>.

Figure 4.5 shows the frequency table, histogram and ogive for the variable <LOSOC>
with a class interval of 0.2 cm (1.5-1.7 is really 1.5-1.699 etc.).

The distribution is fairly symmetrical with the following measures:

modal class    1.5 to 1.7 (really 1.5-1.699)
mode           1.64
median         2.05
mean           2.05

The modal class is 1.5 to 1.7 and the mode could be taken to be the middle of this
interval. A more accurate estimate of the mode is obtained by using the simple 'cross'
construction shown in Figures 4.5 and 4.6.

Maximum length

Class interval (cm)    f     cf
10-20                  23    23
20-30                  10    33
30-40                  4     37
40-50                  0     37
50-60                  0     37
60-70                  0     37
70-80                  1     38

[Figure 4.6: histogram with the mode marked, and ogive. Horizontal axis: Maximum length (cm) (20 to 80).]

Figure 4.6. Frequency table, histogram and ogive for maximum length, <MAXLE>.

Notice that when <LOSOC> was discussed earlier (Figure 4.2) an interval of 0.1 cm
was used and the distribution was bimodal. There are no absolutely definitively correct
methods of describing data, different approaches may produce different results, which
is why it is important to always state or define the techniques being used.

In contrast to Figure 4.5, Figure 4.6 shows a histogram of the <MAXLE> of the
spearheads which is strongly positively skewed.

There is one very high value and several quite high values which are stretching the
distribution in one direction to produce the following measures:

modal class    10-20 (really 10-19.999)
mode           16.0
median         17.8
mean           20.7

In these circumstances it could be misleading to quote just one measure as the
'average' length of the spearheads, all three together give a more accurate description.

Points to remember:
Be precise about which 'average' you are using. Depending on the level of
measurement of the variable under investigation, try as many of the three methods as
possible. Compare the results.

It is always useful to 'visualise' the data by using the simple graphical methods, as
demonstrated here, rather than just looking at numbers.

CHAPTER 5

MEASURES OF VARIABILITY - THE SPREAD

5.1 Introduction.
Using methods described in Chapters 3 and 4 we can now display the distribution of a
variable and give a measure of its central tendency in the form of a single value, its
'average' value. These alone are not enough to adequately describe a distribution as is
shown in Figure 5.1:

[Figure 5.1: two hypothetical distributions with the mean marked on each, one widely spread and one closely clustered.]

Figure 5.1. The spread of a distribution.

These two hypothetical distributions both have the same mean (and median and mode)
but it is immediately obvious that they are very different. One has a wide spread of
values while the other has values which are much more clustered around the mean.
This chapter describes the main ways of quantifying the spread, or variation of a
distribution, called measures of dispersion or measures of variability.

Measures of dispersion only apply to interval or ratio data.

5.2 The range.
The range measures the total spread of the distribution. It is a simple measure and is of
limited use.

Calculation:
The range is calculated by subtracting the minimum value from the maximum value.

Example:
The variable <MAXLE> has a maximum value of 72.4, a minimum of 10.2 and a
range of 62.2.

Because the range is such a simple measure there are problems with it. Like the mean,
it is seriously affected by outliers (single extreme values). The <MAXLE> of the
spearheads has an outlier with a value of 72.4 (look back to Figure 4.6). If this one
value was removed the range now becomes 38.0 - 10.2 = 27.8. The removal of this
one value has altered the range from 62.2 to 27.8, a drop of 34.4 points.

The range can clearly only be used as a sensible measure of dispersion when all the
values are clustered together. It gives the impression of an evenly spread distribution
despite the presence of outliers.

5.3 The quartiles.
Another form of range, and one that eliminates the problems associated with outliers,
is the inter-quartile range and its associated statistic the quartile deviation.

The quartiles are the three values in a distribution that partition it into four parts with
an equal number of values in each part. They are usually referred to as Q1, Q2 and Q3
so that 25% of the values are less than Q1, 50% are less than Q2 and 75% are less than
Q3 (Q2 is also the median).

Using this same concept, the points that divide a distribution up into one hundred
equal parts are called the percentiles. If a value falls on the 73rd percentile, for
example, we know that 73% of the distribution is less than that value. During the
discussion of probability and hypothesis testing in Chapters 6 and 7 the importance of
the 5th and 95th percentiles will be shown. Occasionally deciles are used, with an
obvious interpretation. The median, Q2, 50th percentile and 5th decile are all different
ways of describing the same value.

The inter-quartile range is the difference between Q1 and Q3 and the quartile deviation
is half of this, i.e. the deviation around the median.

Calculation:
The quartiles can be calculated in two ways:

1. Referring back to the method for calculating the median (Chapter 4.3), the data
values are listed in increasing order and the list is then quartered. For the variable
<DATE> this will produce the following quartiles:

Q1 = 350
Q2 = 650 (the median)
Q3 = 850

Interquartile range = 850 - 350 = 500

Quartile deviation = 500 / 2 = 250

2. Draw an ogive (method as in Chapter 3.5.3). Draw horizontal lines from the 25%,
50% and 75% points on the vertical axis, when they hit the ogive line drop vertically
and read off the values on the horizontal axis. Figure 5.2 shows the same ogive as in
Figure 3.10 with quartiles calculated.

[Figure 5.2: ogive. Vertical axis: Cumulative frequency (0 to 40); horizontal axis: Socket length (cm) (0 to 16). Quartiles read from the curve: Q1 = 4.3, Q2 = 5.6, Q3 = 8.0.]

Figure 5.2. Calculating quartiles from an ogive for socket length, <SOCLE>.

The inter-quartile range is then found by subtracting Q1 from Q3. The quartile
deviation is found by subtracting Q1 from Q3 and dividing by 2.
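
The first method, quartering the ordered list, can be sketched in Python as follows. This is one reading of 'quartering' and is offered as an assumption rather than as the book's exact procedure: the ordered list is split at the median and the median of each half is taken (for an odd number of values the middle value is left out of both halves). The function name quartiles_by_quartering is hypothetical, and the data are the 40 <DATE> values from section 4.3. Statistical packages often interpolate quartiles differently and may give slightly different answers.

```python
from statistics import median


def quartiles_by_quartering(values):
    """Q1, Q2, Q3 as the medians of the lower half, the whole list and the upper half."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # values below the median position
    upper = ordered[(n + 1) // 2:]     # values above the median position
    return median(lower), median(ordered), median(upper)


# The 40 <DATE> values (years BC) as listed in section 4.3.
dates = ([50, 100, 150, 200] + [300] * 3 + [350] * 5 + [400] * 3 + [450] * 3
         + [600] * 2 + [700] * 3 + [800] * 7 + [900] * 2 + [1000] * 3 + [1200] * 5)

q1, q2, q3 = quartiles_by_quartering(dates)
inter_quartile_range = q3 - q1
quartile_deviation = inter_quartile_range / 2
print(q1, q2, q3, inter_quartile_range, quartile_deviation)
```

On these data the sketch reproduces the quartiles of 350, 650 and 850 quoted above, with an inter-quartile range of 500 and a quartile deviation of 250.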

Formulae:

\[ \text{Inter-quartile range} = Q_3 - Q_1 \]

\[ \text{Quartile deviation} = \frac{Q_3 - Q_1}{2} \]

The relationship between the quartiles and the range is summarised in Figure 5.3.

[Figure 5.3: the ranked values from Minimum to Maximum, divided at Q1, Q2 (the median) and Q3 into four groups each holding 25% of the values; the range spans the whole.]

Figure 5.3. The quartiles and the range.

Example:
If we calculate the above for <MAXLE> of the spearheads categorised by the two
values of <MAT> we get the following:

           Min     Max     Range    Q1       Q3       Int-Quart Range    Quart Dev
Bronze:    10.2    36.6    26.4     13.12    24.17    11.05              5.53
Iron:      10.2    72.4    62.2     13.07    26.25    13.18              6.59

The effects of the one large iron spearhead are obvious when the two values for the
range are compared. The similarity between the two inter-quartile ranges, however,
shows that there is not much difference between the variation in the maximum length
of bronze and iron spearheads when all of the values are considered rather than just the
two extremes. Figure 5.4 shows histograms of the two distributions with class
intervals of 5 cm and class midpoints as marked.

[Figure 5.4: two histograms, Iron and Bronze. Vertical axis: Frequency; horizontal axis: Maximum length (cm) (10 to 80).]

Figure 5.4. Distributions of the maximum length, <MAXLE>, by the two categories of
material, <MAT>.

5.4 The mean deviation.
Another measure of dispersion is the mean deviation. This is also more reliable than
the range because it is calculated using every value rather than just the two extremes.
It also differs from the inter-quartile range and the quartile deviation because it uses
every value rather than just the values at certain rank positions.

The mean deviation is a measure of how much each value deviates from the mean; it
would, in fact, be more accurate to call it the mean of the deviations from the mean.

Calculation:
Calculate the difference between each value and the mean (ignoring the sign of the
difference). Total the differences and divide by the number of values. This gives the
mean of the differences which is the Mean Deviation.

Formula:

\[ \text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n} \]

where:
x = the individual value.
| | = absolute value, i.e. ignore minus signs so that |3| = |-3| = 3.
x̄ = the mean of all the values.
Σ = the sum of.
n = the number of values in the list.

Example:
For the two material categories of <MAXLE> of the spearheads the mean deviations
are:

For Bronze:                  For Iron:
|11.4 - 18.68| = 7.28        |12.4 - 22.89| = 10.49
|16.6 - 18.68| = 2.08        |22.6 - 22.89| = 0.29
|10.2 - 18.68| = 8.48        |17.9 - 22.89| = 4.99
|18.6 - 18.68| = 0.08        |16.8 - 22.89| = 6.09
|24.4 - 18.68| = 5.72        |13.3 - 22.89| = 9.59
Etc. for all 20 values       Etc. for all 18 values

Total of differences         Total of differences
= 108.52                     = 173.72

Mean deviation               Mean deviation
= 108.52 / 20                = 173.72 / 18
= 5.43                       = 9.65

Comparing these results with the range and the inter-quartile range above it is the most
sensitive of the three if all the values are to be taken into account.

5.5 The standard deviation.
You may have realised that in calculating the Mean Deviation each deviation from the
mean is treated as being positive even though half of them are negative! This is rather
inelegant and is overcome in the calculation of the Standard Deviation by squaring
each deviation from the mean. This also has the effect of weighting in favour of the
larger deviations thus giving a more realistic measure of the dispersion.

The Standard Deviation is the most used measure of dispersion. It is important, as is
the mean, because it forms the basis of further statistical tests. This also applies to the
variance (the Standard Deviation squared) although this measure tends not to be used
as a measure of dispersion because it can be a very large number.

Calculation:
The Standard Deviation can be abbreviated to: S.D., s, or σ (sigma, small s in Greek).
Calculate the difference between each value and the mean. Square each difference.
Total the squared differences to obtain the sum of squares. Divide the sum of squares
by the number of values to obtain the variance. Square root the variance to find the
Standard Deviation.

Formula:

\[ \text{S.D.}\ (s) = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]

The variance is usually called s² and has the following formula:

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \]

where:
x = the individual value.
Σ = the sum of.
x̄ = the mean of the values.
n = the number of values.
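
The calculation steps for the mean deviation, the variance and the Standard Deviation can be sketched together in Python. This is an illustrative sketch only: the function name dispersion is hypothetical, and because the full <MAXLE> lists are not reproduced here it uses the 40 <DATE> values from section 4.3 instead. Following the formulae above, the sum of squares is divided by n; many statistical packages divide by n - 1 by default and will therefore report slightly larger values.

```python
from math import sqrt


def dispersion(values):
    """Return (mean deviation, variance, standard deviation), dividing by n."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    # Mean deviation: ignore the sign of each deviation, then average.
    mean_deviation = sum(abs(d) for d in deviations) / n
    # Variance: square each deviation, total them, divide by n.
    variance = sum(d * d for d in deviations) / n
    # Standard deviation: square root of the variance.
    return mean_deviation, variance, sqrt(variance)


# The 40 <DATE> values (years BC) as listed in section 4.3.
dates = ([50, 100, 150, 200] + [300] * 3 + [350] * 5 + [400] * 3 + [450] * 3
         + [600] * 2 + [700] * 3 + [800] * 7 + [900] * 2 + [1000] * 3 + [1200] * 5)

mean_dev, variance, sd = dispersion(dates)
print(round(mean_dev, 2), round(variance, 2), round(sd, 2))
```

The variance comes out several hundred times larger than the Standard Deviation on these data, which illustrates why it is rarely quoted as a measure of dispersion in its own right.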

Example:

For Bronze <MAXLE>:

(x̄ - x)                    (x̄ - x)²
18.68 - 11.4 = 7.28         52.998 (7.28 squared)
18.68 - 16.6 = 2.08         4.326 (etc.)
18.68 - 10.2 = 8.48         71.910
18.68 - 18.6 = 0.08         0.006
18.68 - 24.4 = -5.72        32.718
Etc. for all 20 values      Etc.

Total of the squared differences = 968.832

Variance = 968.832 / 20 = 48.442

S.D. = √48.442 = 6.96

If this is applied to both of the material categories of <MAXLE> we get the following:

S.D. (Bronze) = 6.96
S.D. (Iron) = 14.85

Compared to the Mean Deviations calculated in the last section the Standard
Deviations are quite different. The main difference is in the much greater value of the
Standard Deviation for iron spearheads. This is because of the weighting that the
Standard Deviation gives to values with larger deviations from the mean. The three
large iron spearheads indicated in the histogram of Figure 5.4 are responsible for most
of the variation in the maximum length of all iron spearheads.

It is worth remembering that the Standard Deviation never approaches anywhere near
the range. As a rough rule of thumb when n = 10 the Standard Deviation will be about
one third of the range and when n = 100 it will drop to about one fifth.

5.6 The coefficient of variation.
It is sometimes difficult to compare the spread of two or more distributions by just
looking at the means and standard deviations. The actual values of these statistics
could be of very different orders of magnitude. The coefficient of variation provides a
comparative measure on a scale that usually runs from 0 to 1 (values remain positive,
although V can exceed 1 when the standard deviation is larger than the mean) where:

towards 0    a very narrow spread (small S.D.), and
towards 1    a very wide spread (large S.D.)

Calculation:
The coefficient of variation is usually denoted as V. It is found by dividing the
Standard Deviation by the mean.

Formula:

\[ V = \frac{\text{S.D.}}{\bar{x}} \]

Example:
The following table shows the means, standard deviations and coefficients of variation
for the two variables <MAXLE> and <MAWIT> by material category:

          Maximum length <MAXLE>       Socket end to maximum width <MAWIT>
          Mean     S.D.     V          Mean     S.D.     V
Bronze    18.68    6.96     0.37       8.63     2.82     0.33
Iron      22.89    14.85    0.65       9.87     4.34     0.44

One standard deviation of <MAXLE> for bronze spearheads is approximately one
third of the mean (reflected in V = 0.37) whereas for iron spearheads the relative
spread is greater at over one half (V = 0.65). The coefficient of variation, therefore, is a
convenient way of expressing this comparison.

For iron spearheads the difference between the variability in the distance from the end
of the socket to the maximum width and the variability in the maximum length is
noticeable compared to the figures for bronze (V = 0.44 and V = 0.65, V = 0.33 and
V = 0.37). This suggests that iron spearheads have similar sized socket lengths regardless
of their overall length, the half of the blade towards the tip is responsible for most of
the variation in maximum length.

5.7 Standardisation.
Standardising values in a distribution is a similar concept to that just introduced with
the coefficient of variation. It allows the comparison of values for different variables
on a fixed scale. This is done by converting any value to a z-value (or z-score).

Consider a comparison between <MAXLE> and <MAWIT> for a bronze spearhead
with a maximum length 20.0 cm and socket end to maximum width 13.0 cm.
Comparing these values with the results given in the table above it can be seen that the
20.0 cm is a fairly typical length while 13.0 cm is an unusually high measurement. An
objective way of measuring the 'typicality' of the two measurements is to convert
them to z-scores. It is convenient to think of z-scores as units of Standard Deviation in
relation to the position of the mean. A z-score of 1.0, therefore, is one S.D. away from
the mean. For most distributions an interval of 3 S.D.s each side of the mean contains
nearly all of the values, so z-scores usually fall within the range -3.0 to +3.0.

Formula:

\[ z\text{-value} = \frac{x - \bar{x}}{\sigma} \]

where x is the raw or unstandardised value with mean x̄ and standard deviation σ.
Thus the standardised value for a maximum length of 20.0 is

\[ z = \frac{20.00 - 18.68}{6.96} = 0.19 \]

while for a socket end to maximum width of 13.0

\[ z = \frac{13.0 - 8.63}{2.82} = 1.55 \]

These results show that the value of 13.0 is 'relatively' larger or more extreme than
the value of 20.0. The latter is very close to the mean of the distribution whereas the
former, 13.0, is over one and a half S.D.s away from the mean of 8.63.

5.8 Boxplots.
Boxplots (sometimes called Box-and-Whisker plots) are a graphical representation of
a distribution using some of the concepts described above. A boxplot divides a
distribution according to the value of the inter-quartile range. Figure 5.5 illustrates a
hypothetical boxplot with the appropriate terminology.

Calculation:
On an appropriate scale for the distribution the median is plotted as are Q1 and Q3.
These are called the lower hinge and upper hinge respectively and the difference
between them is the h-spread, they also form the box. 50% of the values are within
the box. The whiskers run from each side of the box to the minimum and maximum
values as defined below.

[Figure 5.5: a hypothetical boxplot drawn against a scale, with the Minimum, Q1, Median, Q3 and Maximum labelled.]

Figure 5.5. A boxplot.

At a distance of 1.5 x h-spread above and below the edges of the box are the inner
fences. At a distance of 3 x h-spread either side of the edges of the box are the outer
fences.

Any values that fall between the inner and outer fences are considered to be possible
outliers, and any values that fall beyond the outer fences are probable outliers. The
minimum and maximum points are found by ignoring any outliers of either type.

Example:
Figure 5.6 shows boxplots for the variable <MAXLE> categorised by material.

The boxplots show the larger spread of the iron spearheads (longer whiskers) despite
the similarity in the size and position of the central parts of the two distributions (the
median and the boxes which contain 50% of the values). The distribution of bronze
spearheads is fairly symmetrical as shown by the central position of the median within
the box. The asymmetry of the iron distribution is shown by the off-centre position of
the median within the box. Remember that the S.D.s are 6.96 and 14.85 for bronze and
iron respectively and the difference between these is reflected in the boxplots. Note
that the iron distribution has one probable outlier at the higher end.
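
The fence rules can be checked numerically with a short Python sketch. It is an illustration under stated assumptions: the function names fences and classify are hypothetical, and the quartiles are those tabulated for <MAXLE> in section 5.3 (iron: Q1 = 13.07, Q3 = 26.25; bronze: Q1 = 13.12, Q3 = 24.17).

```python
def fences(q1, q3):
    """Inner and outer fences at 1.5 and 3 h-spreads beyond the hinges."""
    h_spread = q3 - q1
    inner = (q1 - 1.5 * h_spread, q3 + 1.5 * h_spread)
    outer = (q1 - 3.0 * h_spread, q3 + 3.0 * h_spread)
    return inner, outer


def classify(value, q1, q3):
    """Label a value according to where it falls relative to the fences."""
    inner, outer = fences(q1, q3)
    if value < outer[0] or value > outer[1]:
        return "probable outlier"
    if value < inner[0] or value > inner[1]:
        return "possible outlier"
    return "within fences"


# Iron <MAXLE>: the largest value and the next largest quoted in section 5.2.
iron_largest = classify(72.4, 13.07, 26.25)
iron_next = classify(38.0, 13.07, 26.25)
# Bronze <MAXLE>: the maximum quoted in section 5.3.
bronze_largest = classify(36.6, 13.12, 24.17)
print(iron_largest, iron_next, bronze_largest)
```

For iron the upper outer fence falls at 26.25 + 3 x 13.18 = 65.79, so the 72.4 cm spearhead lies beyond it, in agreement with the single probable outlier noted above.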

50 51
M. Fletcher and G.R. Lock

CHAPTER6

AN INTRODUCTION TO PROBABILITY AND INFERENCE ORA WING


CONCLUSIONS

6.1 Introduction.
iron
*
16
Section I was concerned with methods of descriptive statistics; describing, presenting
and condensing data. These alone will often be enough to isolate trends and patterns
within the data enabling certain archaeological questions to be answered and
generating new questions to be asked.

This section goes one step further and introduces methods of formally testing patterns
within data. These statistical tests are referred to as 'inferential statistics' because they
are performed within a framework of hypothesis (or theory) testing and something is
inferred from the result. In a nutshell, a certain pattern within the data is tested and
found to be significant or not. Of course there are many different tests which can be
applied and different levels of significance.

Important note: This chapter, together with the next, covers the three important areas
of probability, inference and sampling. Both inference and sampling depend upon the
concepts of probability and although sampling comes before inference in practice,
here we discuss probability and inference together because of their close logical
relationship.

Figure 5.6. Boxplots for maximum length, <MAXLE> by material, <MAT>.

Without a proper investigation of the dispersion or variability of a distribution no
meaningful comparisons or inferences can be made. Of all the different measures of
variation the standard deviation is certainly the most used and together with its close
relative the variance (S.D. = √variance) forms the basis for a great deal of statistical
inference (see Section 2). We must repeat the emphasis on exploring the data: use
different techniques and compare the results, and use graphical representations whenever
possible.

At the heart of all inferential statistics are the concepts of randomness and probability.
Many natural and artificial phenomena (including many archaeological data) are
random in the sense that they are not predictable in advance although they do exhibit
long term patterning. It is the study of these patterns (statistical distributions) which
will involve probabilistic (also called stochastic or random) models.

The mathematical theory of probability was started by the two French mathematicians
Blaise Pascal (1623-1662) and Pierre Fermat (1601-1665). Probability theory also
owes a great deal to the work in 1933 of the modern day Russian A.N. Kolmogorov.
6.2 Probability - measuring chance and risk
6.2.1 The concept of probability

Three important definitions:

Definition 1. Often called 'a priori' or Classical.


For a complete set of n equally likely outcomes of which r are favourable, the
probability of a favourable outcome is:


P(Favourable outcome) = r/n

Note the notation: P( ) simply means the probability of whatever is in the brackets; r
and n are as defined above.

Definition 2. Often called 'a posteriori' or Frequentist.
For n past outcomes of which r were favourable, the probability of a favourable
outcome is

P(Favourable outcome) = r/n

Both Definitions 1 and 2 will clearly produce a probability which is a measure within
the range 0.0 to 1.0 inclusive. This is the standard way of stating a probability, so that:

a probability of 0.0 implies the event is impossible (eg. the probability of an iron
spearhead being made in the Neolithic)

a probability of 1.0 implies the event is certain (eg. the probability of finding
something important protruding from the baulk of an excavation on the last afternoon.
JOKE!)

a probability of 0.3 implies the event has a reasonable chance of occurring but is not as
likely as an event with a probability of 0.8.

Definition 3. Often called Subjective.
For a particular event, give a personal estimate of its probability using a scale of 0.0 to
1.0.

All three definitions have their place in archaeological analysis although Definition 2
is the most often used, assigning probabilities to past events. If a particular
archaeological theory is to be tested (compared to the observed data) it is often
necessary to use statistical theory based on Definition 1. In some cases, when
investigating a new phenomenon or characteristic, Definition 3 is used to provide a
first estimate of the probabilities.

Example 1
In the spear data-set of 40 cases (observations, trials or experiments using statistical
language) there are 27 with a peg hole (see Table 1.1, variable number 5). Using
Definition 2 with n = 39 (spearhead number 4 is not counted as it has missing values)
and r = 27 we can conclude that the probability of a randomly chosen spear from
among the 39 having a peg hole is:

P(peg hole) = r/n
            = 27/39
            = 0.692 (this is the same as 69.2%)

Note that probabilities are often stated as percentages. If a new source of spears is
found and the assumption is that they are the same as the existing group, 69.2% of the
new spears will have a peg hole. If this turns out not to be true the difference could
yield interesting archaeological conclusions (testing such differences is discussed in
Chapter 8).

Example 2
Of the 40 spears, only one has a socket length, <SOCLE>, less than 3cm, and so:

P(socl<3cm) = 1/40
            = 0.025

Any probability less than 0.05 (5%) is usually considered to be so low that, in this
instance, it is improbable that a spear chosen at random would have a socket length of
less than 3cm. Equally, any probability greater than 0.95 (95%) is usually considered
to be very high (this introduces the concept of significance and is discussed below in
section 6.4). It is highly probable, therefore, that a spear chosen at random will have a
socket length of 3cm or more.

6.2.2 The concept of independence - are two events related?
For a spear chosen at random the probability that it is made of bronze and iron is
clearly 0.0 since all of these spears are made of either bronze or iron but not both. The
two events 'choose a bronze spear' and 'choose an iron spear' are said to be Mutually
Exclusive (M.E.) - they cannot occur together. Other examples of such a dichotomy
are Male/Female or Pig/Sheep etc.

A very important concept in probability theory, both generally and in archaeology, is
that of independence. To illustrate this consider the classification of the 40 spearheads
according to their material and whether or not they have a peg hole. We have already
seen that

P(PH) = 27/39        P(not PH) = 12/39

where:
PH denotes does have a peg hole, and
not PH denotes does not have a peg hole.

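The arithmetic of Examples 1 and 2 can be checked with a short Python sketch. It simply applies Definition 2 (r favourable outcomes out of n) to the counts quoted in the text; the function name is hypothetical and exact fractions are used only to avoid rounding surprises.

```python
from fractions import Fraction

def relative_frequency(r, n):
    """Probability of a favourable outcome: r favourable out of n outcomes."""
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need n > 0 and 0 <= r <= n")
    return Fraction(r, n)

p_peg_hole = relative_frequency(27, 39)      # Example 1: 27 of 39 have a peg hole
p_no_peg_hole = 1 - p_peg_hole               # complement, the same as 12/39
p_short_socket = relative_frequency(1, 40)   # Example 2: 1 of 40 has socket < 3cm

print(f"P(peg hole)      = {float(p_peg_hole):.3f}")
print(f"P(no peg hole)   = {float(p_no_peg_hole):.3f}")
print(f"P(socket < 3cm)  = {float(p_short_socket):.3f}")
```

The two peg hole probabilities add to exactly 1, and the socket length probability falls below the 0.05 threshold mentioned in Example 2.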

(Notice that P(PH) + P(not PH) = 1.0 as expected).

Since 20 out of the 40 are made of Iron and 20 out of the 40 are made of Bronze, we
also have

P(I) = 20/40        P(B) = 20/40

If a spear has a peg hole, what is the probability that it is made of bronze? Does having
a peg hole make it more likely or less likely to be made of bronze? If having a peg
hole makes no difference to the probability of the material then the two variables (Peg
Hole v Material) are independent, otherwise they are dependent. The following
contingency table (see Chapter 3.3.2 for an introduction to contingency tables)
illustrates these ideas:

Material        Peg Hole
              Yes     No    Total
Iron           17      2       19
Bronze         10     10       20
Total          27     12       39

Using this table the following probabilities can be found:

P(Bronze and Peg hole)      = 10/39       = 0.256
P(Iron or Peg hole or both) = (27 + 2)/39 = 0.744
P(Iron and no Peg hole)     = 2/39        = 0.051

Returning to the earlier question, if a spear has a peg hole what is the probability that
it is made of Bronze? The above shows that the proportion of all spearheads that have
a peg hole and are made of bronze is 0.256 but the following calculates the proportion
of those spearheads which have a peg hole which are made of bronze. Of the 27 spears
that have a peg hole, 10 are bronze, so:

P(Bronze given it has a peg hole) = 10/27 = 0.37

whilst:

P(Bronze) = 20/40 = 0.50

Since these two probabilities are not equal we can conclude that Material and Peg
Hole are dependent. In fact, if a spearhead has a peg hole it is less likely to be made of
bronze and so more likely to be of iron.

These ideas are often referred to as conditional probability and the example above
would be written as:

P(B/PH) = 0.37
P(B)    = 0.50

where:
B/PH = bronze given it has a peg hole
B    = bronze.

Two important rules.
There are two simple rules which are fundamental to the understanding of
manipulating probabilities:

Rule 1: If the two events A and B are mutually exclusive, then

P(A or B) = P(A) + P(B)

ie. when using 'or' the probabilities are added.

Example. Since 8 spearheads are classified as Condition 1 and 18 as Condition 2
(Table 1.1, variable number 6),

P(C1) = 8/40  = 0.20
P(C2) = 18/40 = 0.45

and so:
P(C1 or C2) = 8/40 + 18/40 = 26/40 = 0.65

Rule 2: If the two events A and B are independent, then

P(A and B) = P(A) x P(B)

ie. when using 'and' the probabilities are multiplied.

Example. If we assume that half of all spears are Bronze, then the probability of
choosing two spears and both of them being bronze is,

P(1st B and 2nd B) = P(B) x P(B)
                   = 0.5 x 0.5
                   = 0.25 (or 25%)
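The peg hole calculations above can be reproduced from the cell counts of the contingency table. The following Python sketch is illustrative only: the dictionary layout and the helper name `p` are assumptions made for the example. Note that it works throughout with the 39 spearheads in the table, so its P(Bronze) is 20/39, whereas the text quotes 20/40 using all 40 spearheads.

```python
# Peg hole by material, cell counts as given in the contingency table (39 spearheads).
counts = {
    ("Iron", "Yes"): 17, ("Iron", "No"): 2,
    ("Bronze", "Yes"): 10, ("Bronze", "No"): 10,
}
total = sum(counts.values())

def p(predicate):
    """Probability that a randomly chosen spearhead satisfies predicate(material, peg_hole)."""
    favourable = sum(c for (m, ph), c in counts.items() if predicate(m, ph))
    return favourable / total

p_bronze_and_ph = p(lambda m, ph: m == "Bronze" and ph == "Yes")
p_iron_or_ph = p(lambda m, ph: m == "Iron" or ph == "Yes")
p_ph = p(lambda m, ph: ph == "Yes")
p_bronze = p(lambda m, ph: m == "Bronze")      # 20/39 here; the text uses 20/40

# Conditional probability: restrict attention to spearheads that have a peg hole.
p_bronze_given_ph = p_bronze_and_ph / p_ph

# If Material and Peg Hole were independent, the joint probability
# would equal the product of the two separate probabilities.
expected_if_independent = p_bronze * p_ph

print(f"P(Bronze and Peg hole)      = {p_bronze_and_ph:.3f}")
print(f"P(Iron or Peg hole or both) = {p_iron_or_ph:.3f}")
print(f"P(Bronze given Peg hole)    = {p_bronze_given_ph:.3f}")
print(f"P(Bronze) x P(Peg hole)     = {expected_if_independent:.3f}")
```

The gap between the observed joint probability and the product of the separate probabilities is the numerical sign of dependence between the two variables.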


Note: To be strictly accurate the result of the Rule 2 example (0.25) should be
(20/40) x (19/39) = 0.244 since having chosen one bronze spearhead from our forty
there now remain only 39 of which 19 are bronze. These ideas are better illustrated
with a second contingency table, this time showing the relationship between Condition
and Material.

Material           Condition
              1      2      3      4    Total
Iron          8     11      1      0       20
Bronze        0      7      8      5       20
Total         8     18      9      5       40

Since Condition 3 and Condition 4 are M.E., it is true that:

P(C3 or C4) = P(C3) + P(C4)
            = 9/40 + 5/40
            = 14/40
            = 0.35

However, Iron and Condition 2 are not M.E. since it is possible for an iron spearhead
to be in good condition. From the table we have:

P(I or C2 or Both) = (8 + 11 + 1 + 0 + 7)/40
                   = 27/40
                   = 0.675

Compare this result with the incorrect (although easily done) way of doing it

P(I or C2 or Both) = P(I) + P(C2)
                   = 20/40 + 18/40
                   = 38/40
                   = 0.95 (WRONG!)

This result is wrong because the score of 11 in the Iron/C2 cell of the table has been
counted twice.

Notice that the table also yields:

P(I and C2) = 11/40
            = 0.275

whilst:

P(I) x P(C2) = 20/40 x 18/40
             = 0.5 x 0.45
             = 0.225

Since these two results are not the same, Rule 2 does not hold, Material and Condition
are not independent. This means that the material of the spearhead does have an effect
on its condition.

6.3 Probability distributions - predicting results
The data on spearhead condition, <COND>, can be transformed from a simple
frequency count into a probability distribution by replacing each frequency with a
probability, as shown below:

Condition    Frequency    Probability
                          (or relative frequency)
1                8          0.200
2               18          0.450
3                9          0.225
4                5          0.125
Total           40          1.000

The variable <COND> is discrete (a condition of 2.3 is meaningless) and ordinal
(condition 2 is better than condition 4). There are many different models (or
theoretical distributions) which can be suggested as fits for the distributions of discrete
random variables. The commonest two are the Binomial and Poisson distributions
which provide good models for answering such questions as:

1. Of all graves in a cemetery, 23% contain beads. What is the probability that in a
random sample of 12 graves more than five will contain beads? (Use the Binomial
distribution).

2. The average number of sherds per sq.m. is three. What is the probability that in an
area of 4 sq.m. there are fewer than five sherds? (Use the Poisson distribution).

Further discussion of these and other distributions is beyond the scope of this book,
but see Chapter 13 for further reading.

For continuous random variables which are measured on a ratio, interval or ordinal
scale (see Chapter 1.2 for levels of measurement) the most important model is the
Normal distribution. It has many applications in archaeology and also plays a very


important role in sampling theory. The normal distribution (recognised by the
bell-shaped curve) is the most useful of all distributions because many naturally occurring
distributions are very similar to it. Its mathematical derivation was first presented by
De Moivre in 1733 but it is often referred to as the Gaussian distribution after Carl
Gauss (1777-1855) who also derived its equation from a study of errors in repeated
measures of the same quantity.
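Before looking at the normal distribution in detail, the two discrete questions posed earlier in this section (beads in graves, sherds per square metre) can be worked through numerically. The text leaves these distributions to further reading, so the following Python sketch is only an illustration using the standard Binomial and Poisson probability formulas. The function names are hypothetical, and scaling the Poisson mean from 3 per sq.m. to 12 for 4 sq.m. assumes that sherds are scattered independently at a uniform density.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)

# Question 1: 23% of graves contain beads; random sample of 12 graves;
# probability that more than five (ie. 6 to 12) contain beads.
p_more_than_five = sum(binomial_pmf(k, 12, 0.23) for k in range(6, 13))

# Question 2: mean of 3 sherds per sq.m., so an assumed mean of 3 x 4 = 12
# for 4 sq.m.; probability of fewer than five (ie. 0 to 4) sherds.
p_fewer_than_five = sum(poisson_pmf(k, 12.0) for k in range(0, 5))

print(f"P(more than five graves with beads) = {p_more_than_five:.4f}")
print(f"P(fewer than five sherds in 4 sq.m.) = {p_fewer_than_five:.4f}")
```

Both answers are sums over the relevant counts, which is the general pattern for questions of the form "more than" or "fewer than" with a discrete distribution.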
Consider Figure 6.1 which shows the distribution of the upper socket width,
<UPSOC> for the 40 spearheads. The mean and standard deviation for this variable
are 1.535cm and 0.331cm respectively (see Chapters 4.4 and 5.5 for calculating the
mean and standard deviation). It can be seen that the widths vary more or less
symmetrically about the mean with the more extreme values being less probable. This
is typical of a sample from the normal distribution. Using statistical theory and tables
(which are not relevant here) it is possible to produce a curve showing what an exact
normal distribution with this mean and standard deviation would look like. This is
shown as the smooth line in Figure 6.1. Figure 6.2 shows a normal distribution with a
mean of 0 and a standard deviation of 1.

Figure 6.1. The distribution of the upper socket width, <UPSOC>.
(Horizontal axis: Width of upper socket (cm).)

Figure 6.2. A normal distribution with mean 0 and standard deviation 1.

It can be shown that:

(i) 50% of the values are less than 0
(ii) 50% of the values are more than 0
(iii) approximately 68% are between -1.0 and +1.0
(iv) approximately 95% are between -2.0 and +2.0
(v) exactly 95% are between -1.96 and +1.96
(vi) exactly 90% are between -1.645 and +1.645
(vii) exactly 99% are between -2.576 and +2.576

These results allow the following statements to be made for any variable with a
normal or near normal distribution:

(a) approximately 95% of all values should be within two standard deviations of the
mean
(b) practically all values should be within 3 SD of the mean

Example:
For the variable upper socket width, <UPSOC>, (Table 1.1, variable number 11)

Mean = 1.535, S.D. = 0.331

Therefore, we would expect about 95% of the values to be within 1.535 ± 2(0.331)
ie. within the limits (0.873, 2.197)

In fact there are two (5%) outside these limits with values of 0.8 and 2.2.
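The proportions quoted for the normal distribution, and the two S.D. limits for <UPSOC>, can be checked without tables. The sketch below is a minimal illustration that computes the normal cumulative probability from the error function in Python's standard library; the function names are hypothetical. The 'exactly' figures in the list are tied to z values that have themselves been rounded, so the computed coverage agrees with them only to about three decimal places.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def coverage(z):
    """Proportion of a normal distribution within z standard deviations of the mean."""
    return normal_cdf(z) - normal_cdf(-z)

for z in (1.0, 1.645, 1.96, 2.0, 2.576, 3.0):
    print(f"within {z:5.3f} S.D. of the mean: {coverage(z):.4f}")

# Two S.D. limits for upper socket width, using the mean and S.D. quoted in the text.
mean, sd = 1.535, 0.331
lower, upper = mean - 2 * sd, mean + 2 * sd
print(f"expected 95% limits: ({lower:.3f}, {upper:.3f})")
```

The same `coverage` function gives the "practically all" statement for 3 S.D. as a number rather than a phrase.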


The variable maximum length, <MAXLE>, on the other hand, has a distribution
which is clearly not symmetrical (Figure 6.3) and here the normal distribution is a
poor model.

Figure 6.3. The asymmetric distribution of the maximum length, <MAXLE>.
(Horizontal axis: Maximum length (cm).)

The smooth curve shown in Figure 6.3 indicates the theoretical or expected shape of a
normal distribution which has the same mean and standard deviation as those for
<MAXLE>. If the underlying distribution of <MAXLE> really was normal, we would
still expect small discrepancies between the expected and the actual (or observed)
results from a sample, with such differences getting smaller for larger samples. In
Figure 6.3 the differences between the observed frequencies and those expected from a
normal distribution (or a model which assumes a normal distribution) are large, and so
the normal distribution can be considered a poor model. There are formal ways of
testing whether a normal distribution is a good model which are discussed in Chapter
9.

It is important here to mention the Central Limit Theorem (this is explained in detail
in Chapter 7) because many statistical applications rely upon it. The Central Limit
Theorem provides a rationale for the use of the Normal distribution which is why it is
the most important of all distributional models.

6.4 The logic of hypothesis testing - is it significant?
Most of the rest of this section, and indeed most applications of inferential statistics in
archaeology, are concerned with hypothesis testing (sometimes called tests of
significance).

It is important to understand just what hypothesis testing in this formal context means.
Theories of one kind or another abound in archaeology although many of them cannot
be tested in any way let alone in the formal way to be described below. A hypothesis,
therefore, must represent a quantifiable relationship and it is this relationship which is
tested formally. We could say that all hypotheses are theories whereas not all theories
are hypotheses.

In order to illustrate the logic of a hypothesis test consider testing the hypothesis that
at least 40% of all bronze spearheads come from burials (this is the quantifiable
association between the number of bronze spearheads and the variable <CON>).

The first step:
is to formulate two hypotheses, one is called the null hypothesis (denoted by H0) and
the other is the alternative hypothesis (H1). This must be done so that one and only
one must be true. In this case we would have:

H0: Proportion of bronze spearheads from burials is 40% or more
H1: Proportion of bronze spearheads from burials is less than 40%.

The second step:
is to take suitable measurements or observations from which a test statistic and its
associated probability (described in step 3) can be calculated. Here we have a sample
of twenty bronze spearheads seven of which have been found in burials (this is the
observed result).

So far so good!

The third step:
is more difficult. Calculate a test statistic which can then be tested for significance in
step 4. The test statistic allows for the calculation of the probability of the observed
result which is often called the p-value. If H0 is true and at least 40% of all bronze
spearheads do come from burials what is the probability of a sample of 20 containing
seven from burials? Using the ideas from Section 6.2.2 of this chapter we have:

P(Burial) = 0.40 and so P(Not burial) = 0.60
P(Not burial for 1st and 2nd) = (0.60)(0.60) = (0.60)^2
Hence P(Not burial for 13) = (0.60)^13 = 0.0013


The p-value (probability of the observed result) is 0.0013 or 0.13% (very small!)

Step 4 (and last!):
Remember that the null hypothesis is being tested. The significance of the test statistic
will determine whether the Null Hypothesis is accepted or rejected. There are set
conventions for significance testing which allow this decision to be made.

The p-value calculated above is very small which means either H0 is true and we have
been very unlucky in drawing an unrepresentative sample, or else H0 is false. The
convention for significance testing is as follows:

If p<0.05 (5%) reject H0 at the 5% level and conclude that there is significant evidence
to show that the percentage of bronze spearheads from burials is less than 40% (in
other words if H0 is rejected H1 must be accepted). It is also valid to conclude that we
are 95% certain that the percentage of bronze spearheads from burials is less than
40%.

If the p-value had turned out to be greater than 0.05, the conclusion would have been
that there is insufficient evidence to reject H0 at the 5% level and so H0 is accepted. A
somewhat philosophical point here: it is impossible to prove a hypothesis in this
formal way, it is either accepted or rejected at certain levels of significance. In a way
this is in line with the falsificationist views of Karl Popper and has interesting
implications for archaeology by suggesting that the advancement of knowledge is not
based on proving things but on disproving things. If, for example, we reconstruct an
Iron Age roundhouse and it falls down the first time the wind blows, it is reasonable to
argue that this disproves that building hypothesis. If, however, it stands for many years
it doesn't prove that that is how roundhouses were built in the Iron Age. Which
advances knowledge more?

Although the 5% significance level has been used above other levels can also be
applied, those often used in the social sciences are:

p<0.10 reject at the 10% level
p<0.05 reject at the 5% level
p<0.01 reject at the 1% level
p<0.001 reject at the 0.1% level.

Important note:
having said this about significance levels it is important to emphasise the arbitrary
nature of the whole concept of significance (we thank Clive Orton for forcing this
issue). We have explained the statistical reasoning behind significance levels but it is
up to the archaeologist to justify the choice of a certain significance level in
archaeological terms. Is the decision to accept or reject the null hypothesis important
enough to warrant a 1% level or will 5% do, or why not 7%? What are the
archaeological costs involved in accepting or rejecting the null hypothesis? If lives
depended on the outcome, as in testing drugs or parts of aircraft, we could easily
justify using the 0.1% level but the situation is not so clear cut in archaeology.

Returning to the p-value of 0.0013 we can now see that H0 can be rejected at the 1%
level concluding that we are at least 99% certain (this is not absolutely true but will
do in simple terms) that the percentage of spearheads from burials is less than 40% (to
be precise we are 99.87% certain but it is convention to stick to the 10%, 5%, 1% and
0.1% levels).

In the four steps described above it is the calculation of the test statistic and its
associated probability that can be difficult. In various situations and under various
assumptions there are a number of accepted methods of doing this; Chapters 8 to 11
describe some of them.

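The four steps of the burial example in Section 6.4 can be followed through in a short Python sketch. It reproduces the simplified calculation used in that section, (0.60) raised to the power 13, and then compares the result with the conventional 10%, 5%, 1% and 0.1% levels. The function names are hypothetical, and the sketch is an illustration of the decision logic rather than a general purpose test.

```python
def text_p_value(p_not_burial=0.60, not_from_burials=13):
    """Simplified p-value as calculated in Section 6.4: (0.60) ** 13."""
    return p_not_burial ** not_from_burials

def strictest_level_rejected(p, levels=(0.001, 0.01, 0.05, 0.10)):
    """Return the smallest conventional level at which H0 is rejected, or None.

    Levels are tried from the strictest (0.1%) to the most lenient (10%).
    """
    for level in levels:
        if p < level:
            return level
    return None

p = text_p_value()
level = strictest_level_rejected(p)

print(f"p-value = {p:.4f}")
if level is None:
    print("insufficient evidence to reject H0 at the 10% level")
else:
    print(f"reject H0 at the {level:.1%} level")
```

Changing the `levels` argument is where the archaeologist's own choice of significance level would enter.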

CHAPTER 7

SAMPLING THEORY AND SAMPLE DESIGN

7.1 Introduction.

Archaeologists are permanently working with samples. An area of excavation is a
sample of the complete site which in itself is a sample of all sites of that type. The
same goes for artefact assemblages which represent samples of a larger, often
unknown, group. For many years such samples have been selected by a variety of ad
hoc methods and have served archaeology well, indeed, virtually all of our current
archaeological knowledge rests upon the results from thousands of judgement
samples, so called because they are chosen in a mathematically non-rigorous manner.

This chapter is about a different type of sampling, usually called random sampling in
the UK or probability sampling in North America because it uses ideas from
probability theory. Strictly speaking, judgement samples do not allow any statements
to be made about the material that was not included in the sample (although
archaeologists do this all the time) whereas random sampling provides a means of
making such statements within a stated confidence limit. If a 20% sample of the
interior of a hillfort was excavated, and the sample was a random one, it would be
possible to estimate the number of sherds, pits or any other characteristic for the whole
interior within stated confidence limits. If the sample was a judgement sample, as
most excavations are, any statements made concerning the whole would be informed
guesswork.

The essence of all sampling is to gain the maximum amount of information by
measuring or testing just a part of the available material. Because of the ability of
random sampling to enable extrapolation from the sample taken it can often provide
more information than a judgement sample although, of course, the procedure of
random sampling is more difficult to set up and perform. Before going any further,
some formal definitions need to be established:

Population: the whole group or set of objects (or other items) about which inference
is to be made. This could be all Bronze Age spearheads or all spearheads in some
sub-set (e.g. a particular county).

Sampling frame: a list of the items, units or objects that could be sampled. Often, and
ideally, the sampling frame will contain all of the population, but it need not.

Variable: a characteristic which is to be measured for the units, such as weight of
spearheads.

Sample: the subset or part of the population that is selected.

Sample size (n): the number in the sample. A sample size of 5 is considered small,
while, formally, a sample size of 50 is large. The sample size may be stated as a
percentage of the sampling frame, e.g. a 10% sample.

Clearly, the larger the sample size, the more reliable will be any inferences made from
the sample (and, usually, the more costly in time and resources it will be to collect). A
smaller sample will be less expensive although the resulting information will be less
reliable. Faced with a cemetery of 127 graves, excavating a sample of 100 should
allow reliable inferences about the whole group, whereas inferences from a sample of
only 10 would be very unreliable. As with judgement sampling, the size of the sample
is often a product of many constraining factors.

The rest of this chapter is concerned with two important aspects of sampling:

(i) How to make the sample as representative of the population as possible so that it
yields the maximum information.

(ii) Having taken a sample, how to measure the precision or reliability of the results it
produces.

7.2 Sampling strategies - which measurements to take.
The following are the more common and useful sampling strategies for drawing a
random sample. It should be emphasised, however, that it is often difficult to apply
these rigidly in many archaeological situations. They are not claimed to be a substitute
for archaeological intuition and experience, but a useful tool to be used when and
where appropriate.

Each method will be applied to designing a sample of 10 spearheads from our
population of 40 (a 25% sample) in order to estimate the mean weight. We know that
the mean weight of all 40 is 442.4 g and the S.D. 436.0 g.

A Simple random sample.
Each unit in the population has the same chance of being selected as any other unit. To
take a simple random sample of 10 spearheads we could:

(a) Stick a pin into the list ten times without looking (not professional and open to
abuse, fiddling and criticism!)


(b) Put the 40 spearheads into a large box, shake and withdraw 10 (not a good idea,
think of at least three faults!)

(c) Number the spearheads 1 to 40 and using random number tables identify 10 by
choosing numbers from the tables. This is the usual method which is sometimes
improved upon by using a computer to select the random numbers. Table A in the
Appendix is a typical table of random digits between 0 and 9. They are grouped into
blocks of 5 just for ease of reading. If we were to start reading at the top left and read
across (we could start reading anywhere but should then read from the table in a
steady sequence either down, up or across), the first number would be 72, the second
31, then 02, 85, 27 etc. This will give a sample of 10 spears to be numbers:

31, 2, 27, 33, 8, 26, 23, 29, 22, 21.

Notice that 00 and any number larger than 40 has been ignored, and if we had the
same number twice this would also have been ignored. Using this sample of 10 the
mean weight is 349.51 g which is rather low compared to the true mean of 442.4 g.

A Systematic sample.
To take a systematic sample of 10 from our population of 40 spearheads take every
fourth one. Although in this example we start with number one, strictly speaking the
start number (between one and four) should be chosen at random. This method has the
advantage of being easy to design although if the units have inherent patterning in
their ordering systematic sampling could be inappropriate. Our sample starts at
number one and ends with number 37 (numbers 1, 5, 9, 13, 17, 21, 25, 29, 33 and 37)
producing a sample mean weight of 405.68 g (true mean is 442.4 g).

A Stratified sample.
In order to get a representative sample the structure of that sample should reflect the
structure of the population in terms of characteristics that are thought to be influential.
For example, the spearhead population consists of 20 iron and 20 bronze items which
could influence the weight (if one group was a lot heavier than the other); in order to
reflect this our sample should contain 5 iron and 5 bronze spearheads. Using the
random number tables again (Table A in the Appendix), we select numbers in the
range 1 to 40 in order to give the first 5 iron and 5 bronze:

Iron: 2, 8, 13, 17, 4
Bronze: 31, 27, 33, 26, 23

These give a sample mean of 409.43 g; still a little low. Perhaps there is no real
relationship or association between material and weight, in which case stratifying
using material as a stratum has no advantage (to test this claim see the measures of
association for categorical data described in Chapter 11).

If the context the spearhead was found in is suspected to have a relationship to weight,
then we should stratify our sample using context as a stratum. The 40 spearheads are
spread between the three contexts as follows:

Context                    1        2        3
Number of spearheads      27        6        7
Percentage of total     67.5%    15.0%    17.5%

This means that to design a sample of 10 that is stratified proportionally according to
context, we would need to take 67.5% of the 10 from context 1 etc. Using nearest
whole numbers (in order to save having to saw spearheads into bits!) the sample of 10
would consist of the following:

context 1         context 2        context 3
67.5%             15%              17.5%
7 spearheads      1 spearhead      2 spearheads

Again using the random number table and only taking numbers between 1 and 40, the
following proportional stratified random sample is taken (starting top left in the table
and going down in pairs):

Context 1: 38, 14, 26, 31, 40, 13, 34
Context 2: 8
Context 3: 7, 4

The mean weight of this sample is 398.6 g, again a little low. By designing such a
stratified sample there is no guarantee that the results will be more accurate or better
but we are less likely to produce unrepresentative results based on some bias in the
sample.

A Cluster sample.
Rather than select individual items, select clusters or groups of items that are close
together. This would be better illustrated by a spatial application where, for example, a
survey of a large area is being conducted. To save on travelling time, groups of sites
are selected at random; if each group is then sampled individually this could be called
a two stage sampling design.

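The simple random, systematic and stratified designs described in this section can be drawn by computer rather than from a random number table. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, spearheads are assumed to be numbered 1 to 40 with numbers 1 to 20 iron and 21 to 40 bronze (as the stratified example suggests), and whole-number allocations are made by largest-remainder rounding, which the text does not specify beyond 'nearest whole numbers'.

```python
import random

def simple_random_sample(units, n, rng):
    """Every unit has the same chance of selection; no unit is chosen twice."""
    return sorted(rng.sample(units, n))

def systematic_sample(units, n, rng):
    """Take every k-th unit after a random start within the first k units."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def proportional_allocation(stratum_sizes, n):
    """Share n between strata in proportion to their size, in whole numbers.

    Largest-remainder rounding is an assumption made for this sketch.
    """
    total = sum(stratum_sizes.values())
    exact = {s: n * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def stratified_sample(strata, n, rng):
    """Draw a simple random sample of the allocated size within each stratum."""
    sizes = {s: len(units) for s, units in strata.items()}
    alloc = proportional_allocation(sizes, n)
    return {s: sorted(rng.sample(units, alloc[s])) for s, units in strata.items()}

rng = random.Random(1)                 # fixed seed so the sketch is repeatable
spearheads = list(range(1, 41))

srs = simple_random_sample(spearheads, 10, rng)
syst = systematic_sample(spearheads, 10, rng)
by_material = stratified_sample(
    {"iron": list(range(1, 21)), "bronze": list(range(21, 41))}, 10, rng)
context_alloc = proportional_allocation(
    {"context 1": 27, "context 2": 6, "context 3": 7}, 10)

print("simple random:", srs)
print("systematic:   ", syst)
print("stratified:   ", by_material)
print("allocation by context:", context_alloc)
```

The allocation by context reproduces the 7, 1 and 2 spearheads used in the stratified example; the particular spearheads drawn will of course differ from those read off Table A.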

A Convenience Sample.
This sampling design does not use random methods to select the sample, and
consequently can produce poor results. If, for example, the first 10 spearheads were
taken as the sample (simply because this is convenient) the mean would be 304.99 g
which is very low. If the next 10 are taken their mean is 664.94 g which is very high.

There are other sampling strategies, although the first three described above are the
main ones and certainly adequate for most archaeological applications.

7.3 A statistical background to sampling.

Once a sample has been taken various statistics can be calculated from it (using
methods described in Section 1 of the book). The commonest of these sample statistics
are:

x̄, the sample mean
s, the sample standard deviation, and
p, the sample proportion, i.e. the proportion of the sample having a particular
characteristic.

The true or population values for these statistics are usually unknown, and formally
denoted by Greek letters so that:

x̄, the sample mean, is an estimate of μ, the population mean
s, the sample standard deviation, is an estimate of σ, the population standard deviation
p, the sample proportion, is an estimate of π, the population proportion.

7.3.1 The central-limit theorem - the law of averages.

In order to comment on how good an estimate the sample statistics are, the nature of
their distribution needs to be known. To illustrate this, consider trying to estimate the
mean weight of the spearheads from a sample of only 9 randomly chosen ones. We
know that for the population of 40, μ = 442.4 and σ = 436.0. Figure 7.1 shows the
distribution of weight for the whole population which is not symmetrical (it is skewed)
with a large amount of variation (σ = 436.0).

Figure 7.1. The distribution of weight, <WEIGHT>.
(Horizontal axis: Weight (g).)

If a simple random sample of 9 spearheads is taken, and the sample mean calculated,
we would expect it to be a reasonable estimate of the truth (the population mean). If
many such random samples are taken each will produce an estimate of the population
mean and we would expect most of them to be near to the truth. Figure 7.2 shows the
distribution of 40 sample means each one calculated from a separate random sample of
9 spearheads. (We have taken a sample size of 9 for reasons that will become clear
later, but these ideas hold whatever the sample size).

Comparing the two distributions shown in Figures 7.1 and 7.2, original weights and
sample mean weights, the differences in shape can be seen. This demonstrates two
important concepts of sampling which together form the Central Limit Theorem (here
this theorem is only demonstrated then stated but it can be proved formally).

(i) The distribution of the sample means is reasonably symmetrical and can be well
modelled by the normal distribution (see Chapter 6 for the importance of this). As the
sample size increases so the closeness of the approximation to the normal distribution
increases. For most distributions a sample size as low as four will produce a fair
approximation, while a sample of 20 or 30 will give a very good approximation (this
is why a sample of 50 is considered to be large). The sequence in Figure 7.3 (a to e)
shows how the distribution of sample means becomes more symmetrical as the sample
size increases.

[Figure 7.2. The distribution of mean weight, <WEIGHT>, for samples of size nine. Axes: Sample mean weight (n=9) against Frequency.]

(ii) These distributions also show that the variation is reducing as the sample size
increases. Variation is usually measured by the standard deviation and the following
table shows how the SD changes as the sample size increases from 1 to 25 individuals:

Sample size (n)    Standard Deviation
1                  436.0
4                  182.6
9                  129.7
16                 75.9
25                 54.1

[Figure 7.3. The change in the distribution of sample means as sample size increases. Panels (a to e): Weight (g); Sample mean weight (n=4); (n=9); (n=16); (n=25).]

It can be proved that if the standard deviation of the individuals in a population is $\sigma$,
then the standard deviation of the means of samples of size $n$, often called the
standard error of the mean (or simply the SE of the mean), is

$$SE(\text{mean}) = \frac{\sigma}{\sqrt{n}}$$

The results in the table above confirm this since by taking samples of size 4, the
original SD has been approximately halved and for n = 9 the SD is reduced to
approximately one third.

Before discussing the implications of this any further, the Central Limit Theorem can
be stated (without proof):

If a random sample of size $n$ is taken from a population with mean $\mu$ and SD $\sigma$, then:

(1) the sample means are approximately normally distributed with this approximation
becoming closer as $n$ increases,

(2) the mean of the sampling distribution is $\mu$

(3) the SD of the sampling distribution is $\sigma/\sqrt{n}$

The important consequence of this theorem is that whenever a population mean is to
be estimated we are able to use the known characteristics of the normal distribution
(see Chapter 6) to evaluate how precise or reliable our estimate is. For example, if we
had been trying to estimate the population mean for weight from a sample of just 9
spearheads, by looking at Figure 7.3(c) we can see that the truth is between
approximately 300 g and 700 g. If we had a larger sample of 25 we could estimate the
true mean to be between 400 and 500 g (Figure 7.3 (e)).

7.3.2 Confidence limits - the reliability of results.
Although the estimates for the true mean get more accurate as the sample size
increases they are still rather vague. It is statistical convention to calculate a
confidence interval (or confidence limits) for an estimate based on a particular
probability value, usually 90%, 95% or 99%. This is a formal way of stating
confidence in the estimate. Given a sample of $n$ observations which lead to a sample
mean of $\bar{x}$, from a population with unknown mean but standard deviation $\sigma$,
confidence limits are calculated as below.

In practice, when trying to estimate the mean, the standard deviation is also unknown
but for large samples the calculated standard deviation is often taken to be the actual
true standard deviation. When the sample size is small so that the sample standard
deviation is a poor estimate for the true standard deviation, the following theory
should be modified by using the Student's t-distribution to calculate the confidence
limits. Confidence intervals are calculated using:

Confidence level    Confidence intervals
90%                 $\bar{x} \pm 1.645\,\sigma/\sqrt{n}$
95%                 $\bar{x} \pm 1.960\,\sigma/\sqrt{n}$
99%                 $\bar{x} \pm 2.576\,\sigma/\sqrt{n}$

If we had taken a sample of 16 spearheads and calculated $\bar{x} = 402.1$, then using
$\sigma = 436.0$ would give 95% confidence limits of:

$$402.1 \pm 1.960\,\frac{436.0}{\sqrt{16}}$$
$$= 402.1 \pm 1.960(109.0)$$
$$= 402.1 \pm 213.64$$

i.e. from 402.1 - 213.64 to 402.1 + 213.64
from 188.46 to 615.74

We could conclude from this that it is 95% certain that the true mean is between
188.46 g and 615.74 g. If these limits are too wide, as they probably are, then clearly a
larger sample needs to be taken to reduce them (the importance of $\sqrt{n}$ in the
equation). In fact these very broad limits reflect the high variability (standard
deviation) in the weight of the spearheads. If we consider our 40 spearheads as a
sample of some larger population (all spearheads perhaps), then we can calculate the
95% Confidence Limit for the mean weight to be:

$$442.4 \pm 1.96\,\frac{436.0}{\sqrt{40}}$$
$$= 442.4 \pm 135.1$$

i.e. we are 95% sure that the mean weight of all spearheads is between 307.3 g and
577.5 g. This is still rather a poor estimate because of the large standard deviation
which we can see is mainly due to one spearhead having an unusually high weight
(number 16 in Table 1.1, at 2446.5 g). We could, of course, omit this one spearhead
from our calculations considering it to be an unrepresentative outlier or freak. This
would produce a better (narrower) confidence interval but such editing of the data is
archaeologically dangerous without sound justification.

If, instead of estimating the population mean, we were interested in estimating the
proportion of the population with a particular characteristic, then similar ideas are used
to establish confidence limits for this proportion. If a sample of size $n$ gives a
proportion of size $p$, then the various confidence limits are given by:

Confidence level    Confidence limits
90%                 $p \pm 1.645\sqrt{p(1-p)/n}$
95%                 $p \pm 1.960\sqrt{p(1-p)/n}$
99%                 $p \pm 2.576\sqrt{p(1-p)/n}$

Consider that our 40 spearheads are a sample from a much larger unknown population,
and we need to estimate the true proportion of bronze spearheads that have loops. Of
the 20 bronze spearheads in our sample, 9 have loops giving a sample proportion of 9
in 20, 9/20 = 0.45 or 45%.

If we wanted to compare this proportion to another population (perhaps another site,
area or assemblage) we need to know just how reliable this figure of 45% is. Given
that n = 20 and p = 0.45, the 95% confidence limits for the true proportion are:

$$0.45 \pm 1.96\sqrt{\frac{(0.45)(0.55)}{20}}$$
$$= 0.45 \pm 1.96\sqrt{0.012375}$$
$$= 0.45 \pm 0.218$$

i.e. from 0.232 to 0.668

This means that with a sample of 20 we can be 95% certain that the true proportion of
bronze spearheads with loops is somewhere between 0.232 and 0.668 (23% and 67%).
Again this is rather vague, but this in itself serves to emphasise the need to quote
confidence limits to show the uncertainty.

The previous formulae for confidence limits can also be used to estimate the required
sample size to obtain a given level of reliability. Suppose that we really wanted to
estimate the true proportion of bronze spearheads with loops to within ± 0.05 (± 5%)
with 95% certainty. Given that p = 0.45, what sample size is needed?

The 95% confidence limits are:

$$0.45 \pm 1.96\,\frac{\sqrt{(0.45)(1-0.45)}}{\sqrt{n}}$$

i.e.

$$0.45 \pm 1.96\,\frac{\sqrt{(0.45)(0.55)}}{\sqrt{n}}$$

We need to substitute the 'error' part of the above with the required level of ± 0.05,
so:

$$0.05 = \frac{1.96\sqrt{(0.45)(0.55)}}{\sqrt{n}}$$

i.e.

$$\sqrt{n} = \frac{1.96\sqrt{(0.45)(0.55)}}{0.05}$$
$$\sqrt{n} = 19.5$$
$$n = 19.5^2$$
$$n = 380.3$$

Hence we need a sample of 380 bronze spearheads to give an estimate of the
proportion with loops to within ± 0.05 with a confidence level of 95%. To save doing
the previous analysis there are simpler formulae to find the required sample size, $n$,
to estimate either the mean or proportion to within ± D (here with 95% confidence):

$$\text{Mean:}\quad n = \left[\frac{1.96\,\sigma}{D}\right]^2$$

$$\text{Proportion:}\quad n = \left[\frac{1.96}{D}\right]^2 p(1-p)$$

So, for the previous example where D = 0.05 and p = 0.45:

$$n = \left[\frac{1.96}{0.05}\right]^2 (0.45)(1-0.45) = 380.3$$

To estimate the true mean weight to within 10 g with a probability of 95%, would
require a sample size of $n$, where:

$$n = \left[\frac{1.96 \times 436.0}{10}\right]^2 = 7303$$

What a big sample!

7.4 Conclusions
The techniques discussed in this chapter are about making generalising statements
describing an unknown population based on information gleaned from a sample of that
population. Providing that the sample is collected using one of several mathematically
random strategies the formulae given above can be used. The reliability of estimates of
characteristics of the population will vary according to the size of the sample and the
variability within the sample. A small sample containing a lot of variability will never
yield precise estimates of the population.

These concepts are of potential interest to archaeologists. If for example an excavation
produced a very large assemblage of flints and available resources did not allow for a
full analysis, a randomly drawn sample would allow general statements to be made
about characteristics such as means of measurements and proportions of types etc. It is
obvious that the emphasis of this approach is on establishing general parameters
describing the population and not on identifying and describing unusual and 'special'
elements within the population. Of course there is no reason why judgement samples
containing obviously special cases cannot be taken as well as the random sample so
long as it is realised that the aims of the two are different.

It is also not difficult to interpret sampling strategies spatially. An area to be excavated
could be gridded into sampling units and the appropriate number selected randomly.
Areas of different interest could be built into a stratified design. General parameters of
certain aspects of the site can be established using random strategies and this does not
preclude the use of the much favoured judgement sampling of areas which look
particularly interesting for some reason.
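
The behaviour described in Section 7.3.1 can be checked by simulation. The following is only a minimal sketch: the spearhead weights themselves are not reproduced here, so it draws from a hypothetical skewed population (an exponential distribution with mean near 440), and the function names `sd_of_sample_means` and `simulate` are illustrative assumptions rather than anything from the book. It compares the SD of many sample means with $\sigma/\sqrt{n}$.

```python
import random
import statistics


def sd_of_sample_means(population, n, repeats, rng):
    """Return the SD of the means of `repeats` simple random samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.pstdev(means)


def simulate(seed=7, sizes=(1, 4, 9, 16, 25), repeats=2000):
    """Map each sample size to (observed SD of sample means, sigma / sqrt(n))."""
    rng = random.Random(seed)
    # Hypothetical skewed population (assumption): NOT the spearhead weights.
    population = [rng.expovariate(1 / 440.0) for _ in range(20000)]
    sigma = statistics.pstdev(population)
    results = {}
    for n in sizes:
        observed = sd_of_sample_means(population, n, repeats, rng)
        results[n] = (observed, sigma / n ** 0.5)
    return results


if __name__ == "__main__":
    for n, (observed, predicted) in simulate().items():
        print(f"n={n:2d}  SD of sample means={observed:6.1f}  sigma/sqrt(n)={predicted:6.1f}")
```

Because the simulated population is large, the observed values should sit close to $\sigma/\sqrt{n}$. A small finite population sampled without replacement, such as 40 spearheads, would be expected to give somewhat smaller SDs than the formula predicts.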

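
The confidence limits and sample-size formulae of Section 7.3.2 are simple enough to sketch in a few lines. The multipliers 1.645, 1.960 and 2.576 and all input figures are taken from the worked examples in that section; the function names are illustrative assumptions, and the sketch assumes (as the text does) that the population standard deviation is known.

```python
import math

# Normal-distribution multipliers quoted in Section 7.3.2.
Z = {90: 1.645, 95: 1.960, 99: 2.576}


def mean_limits(xbar, sigma, n, level=95):
    """Confidence limits for a mean: xbar +/- z * sigma / sqrt(n)."""
    half_width = Z[level] * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_limits(p, n, level=95):
    """Confidence limits for a proportion: p +/- z * sqrt(p(1-p)/n)."""
    half_width = Z[level] * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def n_for_mean(sigma, d, level=95):
    """Sample size needed to estimate a mean to within +/- d."""
    return (Z[level] * sigma / d) ** 2


def n_for_proportion(p, d, level=95):
    """Sample size needed to estimate a proportion to within +/- d."""
    return (Z[level] / d) ** 2 * p * (1 - p)


if __name__ == "__main__":
    print(mean_limits(402.1, 436.0, 16))      # 16 spearheads
    print(mean_limits(442.4, 436.0, 40))      # all 40 spearheads
    print(proportion_limits(0.45, 20))        # 9 of 20 bronze spearheads with loops
    print(n_for_proportion(0.45, 0.05))       # about 380
    print(n_for_mean(436.0, 10))              # about 7303
```

Rounded to the precision used in the text, these calls reproduce the limits 188.46 to 615.74, 307.3 to 577.5 and 0.232 to 0.668, and the sample sizes 380.3 and 7303.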

CHAPTER 8

TESTS OF DIFFERENCE

8.1 Introduction.
This chapter is concerned with what are formally called tests of hypothesis for the
common statistical parameters. These provide formal methods of answering such
archaeological questions as:

(i) Do the two populations from which these two samples are drawn have
significantly different variances (i.e. is there a significant difference in the
variability of the weight of sherds of two different pot types?).

(ii) Could this sample of sherds come from a population that has a mean
weight of 250 g?

(iii) Is the proportion of graves containing a particular type of artefact different
between two cemeteries?

The parameters used are calculated from the sample or samples taken and are: the
mean, median, variance or standard deviation, proportion and rankings.

In all cases the logic of the test is as explained in Chapter 6. There are many different
tests that can be used depending upon the hypothesis to be tested, the assumptions the
archaeologist is prepared to make about the data and the size of the samples. Each test
is designed for a particular set of assumptions and it can be more difficult choosing the
test than actually performing it!

The most important assumption is whether the parent population has a normal
distribution or not. If the assumption of normality holds (or can be assumed to hold by
virtue of the central-limit theorem (see Chapter 7.3.1)) it is almost certain that a
parametric test is needed, whilst if it doesn't hold a non-parametric test will be
appropriate. Parametric tests are generally more powerful because they use the
mathematical properties of the normal distribution in distinguishing sets of data. The
sample size is another important determinant of the proper test; this is discussed in
more detail where appropriate. The examples that follow in this chapter cover both
parametric and non-parametric tests for different sample sizes although for certain
extreme cases, such as small sample size and non-normality, references are given for
more suitable tests.

The first test is described in detail and the rest only briefly since the basic ideas are
similar. Whichever test is to be used, therefore, it is recommended that Section 8.2.1 is
worked through carefully as it contains many new ideas that are relevant to the other
tests.

8.2 One sample tests comparing an observed measurement with an expected
measurement
When a single sample is taken, one of the following two questions is likely to be
asked:

(i) Is the average (mean or median) of the sample value significantly different from
some fixed value (18 for example).

Note: If it can be assumed that the population from which the sample is taken has a
normal distribution, go to Section 8.2.1. If the population is clearly not normal (it is
skewed or bi-modal for example), then go to 8.2.2.

(ii) Is the proportion of objects with a certain characteristic significantly different from
some fixed value (25% for example). See section 8.2.3.

8.2.1 Test for sample mean

Assumptions: The population has a normal distribution.

Sample statistics used:
$n$ sample size
$\bar{x}$ sample mean
$s$ sample standard deviation.

Step 1:
The hypothesis to be tested is often of the type: We have a mean of 18 Type Z sherds
per pit from site A, is our sample from site B significantly different? It is usual to
denote the fixed value of the mean (18 in this case) as $\mu_0$. The formal statement of
both the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) is:

$H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$

Step 2:
From the sample of $n$ values calculate the sample mean, $\bar{x}$, and the sample standard
deviation $s$. If $\bar{x}$ is very close to $\mu_0$ we will accept the claim that the true mean is $\mu_0$
(accept $H_0$) but if $\bar{x}$ is very different from $\mu_0$ we should be suspicious and so reject
$H_0$. The obvious question is, how far can $\bar{x}$ be from $\mu_0$ before we reject $H_0$? The
answer depends upon how large the sample is and how variable the data are.

If the sample variance has been calculated to be $s^2$, it can be shown that the best
estimate for the population variance $\sigma^2$ is not $s^2$ but a slightly modified value (there is
not space here to show proofs of this). This best estimate for the population variance is
denoted by $\hat{s}^2$ (pronounced s hat squared!) and calculated using:

$$\hat{s}^2 = \frac{n}{n-1}\,s^2$$

So, if a sample of 10 values gave $s^2 = 1.392$, a better estimate for the population
variance would be:

$$\hat{s}^2 = (10/9) \times 1.392 = 1.547$$

This means that if we needed an estimate for the population standard deviation we
should use:

$$\hat{s} = \sqrt{1.547} = 1.244$$

Clearly if the sample is large, there is little difference between $\hat{s}$ and $s$ (they have
similar standard deviations), but for small samples this difference is important.

Many calculators with statistical calculation now give both values for standard
deviation. Try with the following data:

Sample: 3, 7, 5, 2

Key on calculator:    $n$            $\bar{x}$    $\sigma_n$    $\sigma_{n-1}$
Interpretation:       sample size    mean         $s$           $\hat{s}$
Contents:             4              4.250        1.920         2.217

In all of the following tests, if these data were to be used, 2.217 would be the
appropriate value.

Having gone into some detail on the importance of $\hat{s}$, we can now state the test
statistic (TS) to be used:

$$TS = \frac{\bar{x} - \mu_0}{\hat{s}/\sqrt{n}}$$

Step 3:
If TS is very large we should reject $H_0$, but accept it if TS is small. We need to define
'large' here. Because this is a Two-Tailed Test (see below for explanation) we are
talking about absolute values: a TS of -3.9 is just as extreme as a value of +3.9.

To calculate the probability of a TS as large or larger than the value we have arrived at
in Step 2 it is necessary to use an appropriate table which has been produced based
upon the underlying assumptions. In this case the theory was first developed by W.
Gosset in the early 20th century and published under the pseudonym of 'Student'
using the symbol t for the test statistic. This is now referred to as the Student's t-test or
t-distribution, and the associated tables are called t-tables.

Table B in the Appendix is a t-table and it is used frequently later in this chapter. The
use of a t-table needs further explanation and a little practice.

Using t-tables:
If we were testing the pair of hypotheses:

$H_0: \mu = 18$ vs $H_1: \mu \neq 18$

and a sample size of 6 gave $\bar{x} = 21.9$ and $\hat{s} = 3.0$, then the test statistic would be

$$TS = \frac{21.9 - 18}{3.0/\sqrt{6}}$$
$$TS = 3.18$$

A standard deviation estimated from a sample size $n$ is said to be based upon $n-1$
degrees of freedom (df) and the appropriate point of entry in the t-table is the row
corresponding to the correct df. In this case n = 6 so df = 5, indicating the row to use
in the table (the reasoning behind dfs is involved and not important here). Table B
shows:

df    10%     5%      2%      1%      0.1%    df
5     2.01    2.57    3.36    4.03    6.86    5

For 5 df the probability of a TS more extreme than 2.01 is 10%, the probability of a
TS more extreme than 2.57 is 5% and more extreme than 3.36 is 2%. Since we have a
calculated TS of 3.18, this result is said to be significant at the 5% level but not at the
2% level as it falls between those two values in the table. We can reject $H_0$ at the 5%
level and be 95% confident that the true mean is not 18.

In the example above a TS of -3.18 would lead us to exactly the same conclusion
since this would still contradict the assumption that the mean is 18. If the sample mean
had been 14.1 rather than 21.9 (3.9 smaller than 18 instead of 3.9 bigger) the TS
would be -3.18 rather than 3.18.

This is an example of a Two-sided Test or a Two Tailed Test (TTT): we are testing
for variation both sides of the true mean, and the logic of this is shown in the
following diagram:

[Diagram: REJECT, ACCEPT and REJECT regions on either side of the claimed mean.]

There are many claims that are only one sided, such as "The population mean is not
more than 200", giving:

$H_0: \mu \leq 200$ vs $H_1: \mu > 200$

The logic of this test is:

[Diagram: ACCEPT region up to and around the claimed mean, REJECT region above it.]

This is an example of a One-sided or One Tailed Test (OTT) and with such tests
care must be taken when using the t-table. If a sample of 13 is taken to test the claim
that the population mean is not more than 200 (or is 200 or less), and a TS of 2.91 is
calculated, the TS should be compared with those values in the Table B corresponding
to one tail having probabilities of 5% and 1%. Since the df = 13 - 1 = 12, we should
compare 2.91 with 1.78 (5%) and 2.68 (1%) and so conclude that $H_0$ can be rejected at
the 1% level. We are 99% confident that the mean is more than 200.

Example:
Suppose we were interested in the sockets of spearheads and a previous study had
shown that the mean width at the top of the socket was 1.3 cm. Because the original
study was carried out a long time ago and the data are not now available, it is of
interest to know whether our sample of 40 spearheads are likely to have come from the
same population as the originals or not (in other words, are they similar or
significantly different). Any variable of interest (on a continuous level of
measurement) could be used but in this case we are testing the hypothesis that our
sample comes from a population with a mean upper socket width of 1.3 cm. In order
to use a t-test we need to assume that the population from which the sample came has
a normal distribution. This is a fair claim judging from Figure 6.1. The hypotheses to
be tested are:

$H_0: \mu = 1.3$ vs $H_1: \mu \neq 1.3$

[Note: the following is a Test Summary for this test. This format will be followed for
all subsequent tests; note that Assumptions is used to include pre-requisites such as
minimum sample size.]

One sample test for mean

$H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$ (TTT)

Assumption: population normally distributed.

Sample statistics needed: $n$, $\bar{x}$, $\hat{s}$

Test statistic: $TS = \dfrac{\bar{x} - \mu_0}{\hat{s}/\sqrt{n}}$

Table: t-table (Table B) with df = n-1.

This is a two tailed test.

The sample mean and standard deviation for <UPSOC> are calculated to be (using the
methods in Chapters 3.4 and 4.5) $\bar{x} = 1.4850$ and $s = 0.3079$.

So,

$$\hat{s} = \sqrt{\frac{40}{39}(0.3079)^2} = 0.3118$$

(this is usually given by $\sigma_{n-1}$ on a calculator)

and the test statistic:

$$TS = \frac{1.4850 - 1.3}{0.3118/\sqrt{40}} = 3.753$$

(3 decimal places are enough for any test statistic).

Table B shows that for 39 df for a two tailed test the 5% value is 2.02 and the 1%
value is 2.70. Since 3.753 > 2.70, we can reject $H_0$ at the 1% level and conclude that
we have strong evidence (at least 99% certain) that our 40 spearheads come from a
population with <UPSOC> different from 1.3 cm. In other words, looking just at this
single variable our spearheads are significantly different from those of the earlier
study.

8.2.2 Test for median
If the population from which the sample is taken does not have a normal distribution,
if it is skewed for example (see Chapter 4.5), this test on the median can be used as a
substitute for the test in 8.2.1. Now, however, the key parameter is the median,
denoted by $\theta$, rather than the mean.

Note: If n is less than 20, this test is still appropriate although the Binomial tables
should be used.

Example:
For the sample of 20 bronze spearheads, the maximum length has a distribution which
is not symmetrical as can be seen from the following stem-and-leaf plot:

1    01122444
1    66788
2    03444
2
3    0
3    6

To test the claim that this sample came from a population with a median of 20
(perhaps this figure has been stated in another report but without the actual data, a
comparison would be useful):

One sample test for median

$H_0: \theta = \theta_0$ vs $H_1: \theta \neq \theta_0$ (TTT)
$H_0: \theta \leq \theta_0$ vs $H_1: \theta > \theta_0$ (OTT)

Assumption: parent population may not be normal, n > 20.

Sample statistics needed: $n$, $T$ (where $T$ is the number of the $n$ observations which are
greater than $\theta_0$).

Test statistic: $TS = \dfrac{2T - n}{\sqrt{n}}$

Table: t-table (Table B) with df = ∞ (infinity).

$H_0: \theta = 20$ vs $H_1: \theta \neq 20$

Out of the 20 spearheads it can be seen that 6 have a length of greater than 20 and 14
equal to or less than 20, so T = 6. If the null hypothesis were true we would expect 10
below and 10 above the value of 20, so here we are asking if 6 above the value of 20 is
significantly different to 10 above:

$$TS = \frac{2(6) - 20}{\sqrt{20}} = -1.789$$

Looking at Table B, our result of 1.789 does not reach the 5% value of 1.960 and so
is not significant and $H_0$ cannot be rejected. We should conclude that this sample does
not provide enough evidence to reject the claim that it came from a population with a
median of 20.

8.2.3 Test for proportions
This test should be used when the proportion of items in a sample with a particular
characteristic is under scrutiny. If just one sample is taken it is common to test
whether the proportion of interest is different to a particular proportion claimed (larger
than or smaller than). This could be applicable to questions such as, "judging from our
sample of Wiltshire hillforts are more than 80% of Wiltshire hillforts likely to be
multivallate?"

The symbol used to represent proportion is usually $\pi$, with $\pi_0$ being the claimed
proportion and $p$ the observed proportion from the sample.

One sample test for proportions

$H_0: \pi = \pi_0$ vs $H_1: \pi \neq \pi_0$ (TTT)

Assumption: n > 25.

Sample statistics needed: $n$, $p$.

Test statistic: TS

Table: t-table (Table B) with df = ∞

Notes:
1) If $\pi_0$ is approximately 50%, then this test is valid for samples smaller than 25.
2) If the assumptions are not met ($\pi_0$ is not near 50% and n < 25), the Binomial test
should be used.

Example:
To test the claim that the proportion of spearheads with a peg hole is at most 0.5 (or
50%).
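
The arithmetic of the one sample test for the mean (Section 8.2.1) can be sketched as follows. The data and figures are those quoted in the text; the function names `s_hat_from_s` and `one_sample_ts` are illustrative assumptions, and the sketch only computes the test statistic, leaving the comparison with Table B to the reader. Python's `statistics.pstdev` uses the divisor n (the calculator key $\sigma_n$) and `statistics.stdev` uses n-1 (the key $\sigma_{n-1}$).

```python
import math
import statistics


def s_hat_from_s(s, n):
    """Convert the sample SD s (divisor n) to the estimate with divisor n-1."""
    return s * math.sqrt(n / (n - 1))


def one_sample_ts(xbar, mu0, s_hat, n):
    """Test statistic (xbar - mu0) / (s_hat / sqrt(n)), with n-1 degrees of freedom."""
    return (xbar - mu0) / (s_hat / math.sqrt(n))


if __name__ == "__main__":
    data = [3, 7, 5, 2]                      # calculator example
    print(statistics.fmean(data))            # mean
    print(statistics.pstdev(data))           # s, divisor n
    print(statistics.stdev(data))            # s hat, divisor n-1

    print(one_sample_ts(21.9, 18, 3.0, 6))   # t-table example, df = 5

    s_hat = s_hat_from_s(0.3079, 40)         # <UPSOC> example, df = 39
    print(s_hat, one_sample_ts(1.4850, 1.3, s_hat, 40))
```

Working from the unrounded value of s hat gives a test statistic of about 3.752 for the socket example, against 3.753 when the rounded 0.3118 is used; the conclusion is the same either way.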

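
The one sample test for the median (Section 8.2.2) needs only a count and a square root, as this minimal sketch shows. The lengths are read off the stem-and-leaf plot on the assumption that stems are tens and leaves are units; the function name `median_ts` is hypothetical. Evaluating the formula directly gives about -1.789.

```python
import math


def median_ts(values, theta0):
    """Return (T, TS) where T counts values above theta0 and TS = (2T - n) / sqrt(n)."""
    n = len(values)
    t = sum(1 for v in values if v > theta0)
    return t, (2 * t - n) / math.sqrt(n)


# Maximum lengths read from the stem-and-leaf plot (assumed: stem = tens, leaf = units).
lengths = [10, 11, 11, 12, 12, 14, 14, 14,
           16, 16, 17, 18, 18,
           20, 23, 24, 24, 24,
           30, 36]

if __name__ == "__main__":
    t, ts = median_ts(lengths, 20)
    print(t, round(ts, 3))
    # Two-tailed 5% value quoted in the text for df = infinity.
    print("reject H0" if abs(ts) > 1.960 else "cannot reject H0")
```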
And, while he shines, all other, lesser lights
May wane and fade unnoticed from the sky.
But more than friend, e’en he can never be:
[Heaves a deep sigh.
That thought is sorrowful, but yet I’ll hope.
What is my rival? Nought but a weak girl,
Ungifted with the state and majesty
That mark superior minds. Her eyes gleam not
Like windows to a soul of loftiness;
She hath not raven locks that lightly wave
Over a brow whose calm placidity
Might emulate the white and polished marble.
[A white dove flutters by.
Ha! what art thou, fair creature? It hath vanished
Down that long vista of low-drooping trees.
How gracefully its pinions waved! Methinks
It was the spirit of this solitude.
List! I hear footsteps; and the rustling leaves
Proclaim the approach of some corporeal being.
[A young girl advances up the vista, dressed in green, with a
garland of flowers wreathed in the curls of her hazel hair.
She comes towards Lady Zenobia, and says:
Girl.
Lady, methinks I erst have seen thy face.
Art thou not that Zenobia, she whose name
Renown hath come e’en to this fair retreat?
Lady Ellrington.
Aye, maiden, thou hast rightly guessed. But how
Didst recognise me?
Girl.
In Verreopolis
I saw thee walking in those gardens fair
That like a rich, embroidered belt surround
That mighty city; and one bade me look
At her whose genius had illumined bright
Her age, and country, with undying splendour.
The majesty of thy imperial form,
The fire and sweetness of thy radiant eye,
Alike conspired to impress thine image
Upon my memory; and thus it is
That now I know thee as thou sittest there
Queen-like, beneath the over-shadowing boughs
Of that huge oak-tree, monarch of this wood.

Lady Ellrington [smiling graciously].


Who art thou, maiden?

Girl.
Marian is my name.

Lady Ellrington [starting up: aside].


Ha! my rival! [Sternly] What dost thou here alone?

Marian [aside].
How her tone changed! [Aloud] My favourite cushat-dove,
Whose plumes are whiter than new-fallen snow,
Hath wandered, heedless, from my vigilant care.
I saw it gleaming through these dusky trees,
Fair as a star, while soft it glided by:
So have I come to find and lure it back.

Lady Ellrington.
Are all thy affections centred in a bird?
For thus thou speakest, as though nought were worthy
Of thought or care saving a silly dove!
Marian.
Nay, lady, I’ve a father, and mayhap
Others whom gratitude or tenderest ties,
If such there be, bind my heart closer to.

Lady Ellrington.
But birds and flowers and such trifles vain
Seem most to attract thy love, if I may form
A judgment from thy locks elaborate curled
And wreathed around with woven garlandry,
And from thy whining speech, all redolent
With tone of most affected sentiment.
[She seizes Marian, and exclaims with a violent
gesture:
Wretch, I could kill thee!

Marian.
Why, what have I done?
How have I wronged thee? Surely thou ’rt distraught!
Lady Ellrington.
How hast thou wronged me? Where didst weave the net
Whose cunning meshes have entangled round
The mightiest heart that e’er in mortal breast
Did beat responsive unto human feeling?

Marian.
The net? What net? I wove no net; she’s frantic!

Lady Ellrington.
Dull, simple creature! Canst not understand?
Marian.
Truly, I cannot. ’Tis to me a problem,
An unsolved riddle, an enigma dark.

Lady Ellrington.
I’ll tell thee, then. But, hark! What voice is that?

Voice [from the forest].


Marian, where art thou? I have found a rose
Fair as thyself. Come hither, and I’ll place it
With the blue violets on thine ivory brow.

Marian.
He calls me; I must go; restrain me not.

Lady Ellrington.
Nay! I will hold thee firmly as grim death.
Thou need’st not struggle, for my grasp is strong.
Thou shalt not go: Lord Arthur shall come here,
And I will gain the rose despite of thee!
Now for my hour of triumph: here he comes.
[Lord Arthur advances from among the trees, exclaiming on
seeing Lady Ellrington.

Lord Arthur.
Zenobia! How com’st thou here? What ails thee?
Thy cheek is flushed as with a fever glow;
Thine eyes flash strangest radiance; and thy frame
Trembles like to the wind-stirred aspen-tree!

Lady Ellrington.
Give me the rose, Lord Arthur, for methinks
I merit it more than my girlish rival;
I pray thee now grant my request, and place
That rose upon my forehead, not on hers;
Then will I serve thee all my after-days
As thy poor handmaid, as thy humblest slave,
Happy to kiss the dust beneath thy tread,
To kneel submissive in thy lordly presence.
Oh! turn thine eyes from her and look on me
As I kneel here imploring at thy feet,
Supremely blest if but a single glance
Could tell me that thou art not wholly deaf
To my petition, earnestly preferred.

Lord Arthur.
Lady, thou’rt surely mad! Depart, and hush
These importunate cries. They are not worthy
Of the great name which thou hast fairly earned.

Lady Ellrington.
Give me that rose, and I to thee will cleave
Till death. Hear me, and give it me, Lord Arthur!

Lord Arthur [after a few minutes’ deliberation].


Here, take the flower, and keep it for my sake.
[Marian utters a suppressed scream, and sinks to the
ground.

Lady Ellrington [assisting her to rise].


Now I have triumphed! But I’ll not exult;
Yet know, henceforth, I’m thy superior.
Farewell, my lord; I thank thee for thy preference!
[She plunges into the wood and disappears.
Lord Arthur.
Fear nothing, Marian, for a fading flower
Is not symbolical of constancy.
But take this sign; [Gives her his diamond ring] enduring adamant
Betokens well affection that will live
Long as life animates my faithful heart.
Now let us go; for, see, the deepening shades
Of twilight darken our lone forest-path;
And, lo! thy dove comes gliding through the murk,
Fair wanderer, back to its loved mistress’ care!
Luna will light us on our journey home:
For, see, her lamp shines radiant in the sky,
And her bright beams will pierce the thickest boughs.
[Exeunt, and curtain falls.

From an unpublished manuscript by Charlotte Brontë, entitled ‘Visits in Verreopolis,’ vol. i., completed December 11th, 1830.
THE FAIRY GIFT
Under the title of ‘The Four Wishes’ this story was first printed by Mr.
Clement Shorter in April 1918, in an edition limited to twenty copies
for private circulation only.
It was published, with three illustrations, in the Strand Magazine,
December 1918, pp. 461-466.
The title of ‘The Fairy Gift’ was given to the story by Charlotte
Brontë.
C. W. H.
THE FAIRY GIFT

One cold evening in December 17—, while I was yet but a day
labourer, though not even at that time wholly without some
aspirations after fame and some intimations of future greatness, I
was sitting alone by my cottage fire engaged in ambitious reveries of
l’avenir, and amusing myself with wild and extravagant imaginations.
A thousand evanescent wishes flitted through my mind, one of which
was scarcely formed when another succeeded it; then a third,
equally transitory, and so on.
While I was thus employed with building castles in the air my frail
edifices were suddenly dissipated by an emphatic ‘Hem!’ I started,
and raised my head. Nothing was visible, and, after a few minutes,
supposing it to be only fancy, I resumed my occupation of weaving
the web of waking visions. Again the ‘Hem!’ was heard; again I
looked up, when lo! sitting in the opposite chair I beheld the
diminutive figure of a man dressed all in green. With a pretty
considerable fluster I demanded his business, and how he had
contrived to enter the house without my knowledge.
‘I am a fairy,’ he replied, in a shrill voice; ‘but fear nothing; my
intentions are not mischievous. On the contrary, I intend to gift you
with the power of obtaining four wishes, provided that you wish
them at different times; and if you should happen to find the fruition
of my theme not equal to your anticipations, still you are at liberty to
cast it aside, which you must do before another wish is granted.’
When he had concluded this information he gave me a ring, telling
me that by the potency of the spell with which it was invested my
desires would prove immediately successful.
I expressed my gratitude for this gift in the warmest terms, and
then inquired how I should dispose of the ring when I had four times
arrived at the possession of that which I might wish.
‘Come with it at midnight to the little valley in the uplands, a mile
hence,’ said he, ‘and there you will be rid of it when it becomes
useless.’
With these words he vanished from my sight. I stood for some
minutes incredulous of the reality of that which I had witnessed,
until at last I was convinced by the green-coloured ring set in gold
that sparkled in my hand.
By some strange influence I had been preserved from any feeling
of fear during my conversation with the fairy, but now I began to
feel certain doubts and misgivings as to the propriety of having any
dealings with supernatural beings. These, however, I soon quelled,
and began forthwith to consider what should be the nature of my
first wish. After some deliberation I found the desire for beauty was
uppermost in my mind, and therefore formed a wish that next
morning when I arose I should find myself possessed of surpassing
loveliness.
That night my dreams were filled with anticipations of future
grandeur, but the gay visions which my sleeping fancy called into
being were dispelled by the first sounds of morning.
I awoke lightsome and refreshed, and springing out of bed
glanced half-doubtingly into the small looking-glass which decorated
the wall of my apartment, to ascertain if any change for the better
had been wrought in me since the preceding night.
Never shall I forget the thrill of delighted surprise which passed
through me when I beheld my altered appearance. There I stood,
tall, slender, and graceful as a young poplar tree, all my limbs
moulded in the most perfect and elegant symmetry, my complexion
of the purest red and white, my eyes blue and brilliant, swimming in
liquid radiance under the narrow dark arches of two exquisitely-
formed eyebrows, my mouth of winning sweetness, and, lastly, my
hair clustering in rich black curls over a forehead smooth as ivory.
In short, I have never yet heard or read of any beauty that could
at all equal the splendour of comeliness with which I was at that
moment invested.
I stood for a long time gazing at myself in a trance of admiration
while happiness such as I had never known before overflowed my
heart. That day happened to be Sunday, and accordingly I put on my
best clothes and proceeded forthwith towards the church. The
service had just commenced when I arrived, and as I walked up the
aisle to my pew I felt that the eyes of the whole assembly were
upon me, and that proud consciousness gave an elasticity to my gait
which added stateliness and majesty to my other innumerable
graces. Among those who viewed me most attentively was Lady
Beatrice Ducie. This personage was the widow of Lord Ducie, owner
of the chief part of the village where I resided and nearly all the
surrounding land for many miles, who, when he died, left her the
whole of his immense estates. She was without children and
perfectly at liberty to marry whomsoever she might chance to fix her
heart on, and therefore, though her ladyship had passed the
meridian of life, was besides fat and ugly, and into the bargain had
the reputation of being a witch, I cherished hopes that she might
take a liking for me, seeing I was so very handsome; and by making
me her spouse raise me at once from indigence to the highest pitch
of luxury and affluence.
These were my ambitious meditations as I slowly retraced my
steps homeward.
In the afternoon I again attended church, and again Lady Ducie
favoured me with many smiles and glances expressive of her
admiration. At length my approaching good fortune was placed
beyond a doubt, for while I was standing in the porch after service
was over she happened to pass, and inclining her head towards me,
said: ‘Come to my house to-morrow at four o’clock.’ I only
answered by a low bow and then hastened back to my cottage.
On Monday afternoon I dressed myself in my best, and putting a
Christmas rose in the buttonhole of my coat, hastened to the
appointed rendezvous.
When I entered the avenue of Ducie Castle a footman in rich livery
stopped me and requested me to follow him. I complied, and we
proceeded down a long walk to a bower of evergreens, where sat
her ladyship in a pensive posture. Her stout, lusty figure was arrayed
in a robe of purest white muslin, elegantly embroidered. On her
head she wore an elaborately curled wig, among which borrowed
tresses was twined a wreath of artificial flowers, and her brawny
shoulders were enveloped in a costly Indian shawl. At my approach
she arose and saluted me. I returned the compliment, and when we
were seated, and the footman had withdrawn, business summarily
commenced by her tendering me the possession of her hand and
heart, both which offers, of course, I willingly accepted.
Three weeks after, we were married in the parish church by
special licence, amidst the rejoicings of her numerous tenantry, to
whom a sumptuous entertainment was that day given.
I now entered upon a new scene of life. Every object which met
my eyes spoke of opulence and grandeur. Every meal of which I
partook seemed to me a luxurious feast. As I wandered through the
vast halls and magnificent apartments of my new residence I felt my
heart dilating with gratified pride at the thought that they were my
own.
Towards the obsequious domestics that thronged around me I
behaved with the utmost respect and deference, being impelled
thereto by a feeling of awe inspired by their superior breeding and
splendid appearance.
I was now constantly encompassed by visitors from among those
who moved in the highest circles of society. My time was passed in
the enjoyment of all sorts of pleasures; balls, concerts, and dinners
were given almost every day at the castle in honour of our wedding.
My evenings were spent in hearing music, or seeing dancing and
gormandising; my days in excursions over the country, either on
horseback or in a carriage.
Yet, notwithstanding all this, I was not happy. The rooms were so
numerous that I was often lost in my own house, and sometimes got
into awkward predicaments in attempting to find some particular
apartment. Our high-bred guests despised me for my clownish
manners and deportment. I was forced to bear patiently the most
humiliating jokes and sneers from noble lips. My own servants
insulted me with impunity; and, finally, my wife’s temper showed
itself every day more and more in the most hideous light. She
became terribly jealous, and would hardly suffer me to go out of her
sight a moment. In short, before the end of three months I sincerely
wished myself separated from her and reduced again to the situation
of a plain and coarse but honest and contented ploughboy.
This separation was occasioned by the following incident sooner
than I expected. At a party which we gave one evening there
chanced to be present a young lady named Cecilia Standon. She
possessed no mean share of beauty, and had besides the most
graceful demeanour I ever saw. Her manner was kind, gentle, and
obliging, without any of that haughty superciliousness which so
annoyed me in others of my fashionable acquaintances. If I made a
foolish observation or transgressed against the rules of politeness
she did not give vent to her contempt in a laugh or suppressed titter,
but informed me in a whisper what I ought to have done, and
instructed me how to do it.
When she was gone I remarked to my wife what a kind and
excellent lady Miss Cecilia Standon was. ‘Yes,’ exclaimed she,
reddening, ‘every one can please you but me. Don’t think to elude
my vigilance, I saw you talking and laughing with her, you low-born
creature whom I raised from obscurity to splendour. And yet not one
spark of gratitude do you feel towards me. But I will have my
revenge.’ So saying she left me to meditate alone on what that
revenge might be.
The same night, as I lay in bed restless, I heard suddenly a noise
of footsteps outside the chamber door. Compelled by irresistible
curiosity, I rose and opened it without making any sound. My
surprise was great on beholding the figure of my wife stealing along
on tiptoe with her back towards me, and a lighted candle in her
hand. Anxious to know what could be her motive for walking about
the house at this time of night I followed softly, taking care to time
my steps so as to coincide with hers.
After proceeding along many passages and galleries which I had
never before seen, we descended a very long staircase that led us
underneath the coal and wine cellars to a damp, subterraneous
vault. Here she stopped and deposited the candle on the ground. I
shrank instinctively, for the purpose of concealment, behind a
massive stone pillar which upheld the arched roof on one side.
The rumours which I had often heard of her being a witch passed
with painful distinctness across my mind, and I trembled violently.
Presently she knelt with folded hands and began to mutter some
indistinguishable words in a strange tone. Flames now darted out of
the earth, and huge smouldering clouds of smoke rolled over the
slimy walls, concealing their hideousness from the eye.
At length the dead silence that had hitherto reigned unbroken was
dissipated by a tremendous cry which shook the house to its centre,
and I saw six black, indefinable figures gliding through the darkness
bearing a funeral bier on which lay arranged, as I had seen her the
previous evening, the form of Cecilia Standon. Her dark eyes were
closed, and their long lashes lay motionless on a cheek pale as
marble. She was quite stiff and dead.
At this appalling sight I could restrain myself no longer, and
uttering a loud shriek I sprang from behind the pillar. My wife saw
me. She started from her kneeling position, and rushed furiously
towards where I stood, exclaiming in tones rendered tremulous by
excessive fury: ‘Wretch, wretch, what demon has lured thee hither
to thy fate?’ With these words she seized me by the throat and
attempted to strangle me.
I screamed and struggled in vain. Life was ebbing apace when
suddenly she loosened her grasp, tottered, and fell dead.
When I was sufficiently recovered from the effects of her infernal
grip to look around I saw by the light of the candle a little man in a
green coat striding over her and flourishing a bloody dagger in the
air. In his sharp, wild physiognomy I immediately recognised the
fairy who six months ago had given me the ring.
That was the occasion of my present situation. He had stabbed my
wife through the heart, and thus afforded me opportune relief at the
moment when I so much needed it.
After tendering him my most ardent thanks for his kindness I
ventured to ask what we should do with the dead body.
‘Leave that to me,’ he replied. ‘But now as the day is dawning, and
I must soon be gone, do you wish to return to your former rank of a
happy, honest labourer, being deprived of the beauty which has been
the source of so much trouble to you, or will you remain as you are?
Decide quickly, for my time is limited.’
I replied unhesitatingly, ‘Let me return to my former rank,’ and no
sooner were the words out of my mouth than I found myself
standing alone at the porch of my humble cottage, plain and coarse
as ever, without any remains of the extreme comeliness with which I
had been so lately invested.
I cast a glance at the tall towers of Ducie Castle which appeared
in the distance faintly illuminated by the light reflected from rosy
clouds hovering over the eastern horizon, and then, stooping as I
passed beneath the lowly lintel, once more crossed the threshold of
my parental hut.
A day or two after, while I was sitting at breakfast, a neighbour
entered and, after inquiring how I did, etc., asked me where I had
been for the last half year. Seeing it necessary to dissemble, I
answered that I had been on a visit to a relation who lived at a great
distance. This satisfied him, and I then inquired if anything had
happened in the village since my departure.
‘Yes,’ said he, ‘a little while after you were gone Lady Ducie
married the handsomest young man that was ever seen, but nobody
knew where he came from, and most people thought he was a fairy;
and now about four days ago Lady Ducie, her husband, and Lord
Standon’s eldest daughter all vanished in the same night and have
never been heard of since, though the strictest search has been
made after them. Yesterday her ladyship’s brother came and took
possession of the estate, and he is trying to hush up the matter as
much as he can.’
This intelligence gave me no small degree of satisfaction, as I was
now certain that none of the villagers had any suspicion of my
dealings with the fairy.
But to proceed. I had yet liberty to make three more wishes; and,
after much consideration, being convinced of the vanity of desiring
such a transitory thing as my first, I fixed upon ‘superior talent’ as
the aim of my second wish; and no sooner had I done so than I felt
an expansion, as it were, of soul within me.
Everything appeared to my mental vision in a new light. High
thoughts elevated my mind, and abstruse meditations racked my
brain continually. But you shall presently hear the upshot of this
sudden éclaircissement.
One day I was sent to a neighbouring market town, by one Mr.
Tenderden, a gentleman of some consequence in our village, for the
purpose of buying several articles in glass and china.
When I had made my purchases I directed them to be packed up
in straw, and then with the basket on my back trudged off
homeward. But ere I was half-way night overtook me. There was no
moon, and the darkness was also much increased by a small
mizzling rain. Cold and drenched to the skin, I arrived at The Rising
Sun, a little wayside inn, which lay in my route.
On opening the door my eyes were agreeably saluted by the light
of a bright warm fire, round which sat about half a dozen of my
acquaintance.
After calling for a drop of something to warm me, and carefully
depositing the basket of glass on the ground, I seated myself
amongst them. They were engaged in a discussion as to whether a
monarchical or republican form of government was the best. The
chief champion of the republican side was Bob Sylvester, a
blacksmith by trade, and of the largest loquacity of any man I ever
saw. He was proud of his argumentative talents, but by dint of my
fairy gift I soon silenced him, amid cheers from both sides of the
house.
Bob was a man of hot temper, and not calculated for lying down
quietly under a defeat. He therefore rose and challenged me to
single combat. I accepted, and a regular battle ensued. After some
hard hits he closed in furiously, and dealt me a tremendous
left-handed blow. I staggered, reeled, and fell insensible. The last thing I
remember was a horrible crash as if the house was tumbling in
about my ears.
When I recovered my senses I was laid in bed in my own house,
all cut, bruised, and bloody. I was soon given to understand that the
basket of glass was broken, and Mr. Tenderden, being a miserly,
hard-hearted man, made me stand to the loss, which was upwards
of five pounds.
When I was able to walk about again I determined to get rid of
my ring forthwith in the manner the fairy had pointed out, seeing
that it brought me nothing but ill-luck.
It was a fine clear night in October when I reached the little valley
in the uplands before mentioned. There was a gentle frost, and the
stars were twinkling with the lustre of diamonds in a sky of deep and
cloudless azure. A chill breeze whistled dreamily in the gusty passes
of the hills that surrounded the vale, but I wrapped my cloak around
me and standing in a sheltered nook boldly awaited the event.
After about half an hour of dead silence I heard a sound as of
many voices weeping and lamenting at a distance. This continued
for some time until it was interrupted by another voice, seemingly
close at hand. I started at the contiguity of the sound, and looked on
every side, but nothing was visible. Still the strain kept rising and
drawing nearer. At length the following words, sung in a melancholy
though harmonious tone, became distinctly audible:—
Hearken, O Mortal! to the wail
Which round the wandering night-winds fling,
Soft-sighing ’neath the moonbeams pale,
How low! how old! its murmuring!

No other voice, no other tone,
Disturbs the silence deep;
All, saving that prophetic moan,
Are hushed in quiet sleep.

The moon and each small lustrous star,
That journey through the boundless sky,
Seem, as their radiance from afar
Falls on the still earth silently,

To weep the fresh descending dew
That decks with gems the world:
Sweet teardrops of the glorious blue
Above us wide unfurled.

But, hark! again the sighing wail
Upon the rising breeze doth swell.
Oh! hasten from this haunted vale,
Mournful as a funeral knell!

For here, when gloomy midnight reigns,
The fairies form their ring,
And, unto wild unearthly strains,
In measured cadence sing.

No human eye their sports may see,
No human tongue their deeds reveal;
The sweetness of their melody
The ear of man may never feel.

But now the elfin horn resounds,
No longer mayst thou stay;
Near and more near the music sounds,
Then, Mortal, haste away!
Here I certainly heard the music of a very sweet and mellow horn.
At that instant the ring which I held in my hand melted and became
like a drop of dew, which trickled down my fingers and falling on the
dead leaves spread around, vanished.
Having now no further business I immediately quitted the valley
and returned home…
Being very tired and sleepy I retired to bed. As I have no doubt
my reader is by this time in much the same state, I bid him good-
bye.
Charlotte Brontë,
December 18th, 1830.

From Visits in Verreopolis, vol. II. chap. ii., by the Honourable Charles Albert Florian, Lord
Wellesley, aged ten years. Published by Sergeant Bud. The tale is related by, and is a
passage from the early life of, Captain Bud, the father of the fictitious publisher.—C. W. H.
LOVE AND JEALOUSY
No title was given by Charlotte Brontë to this story, which was
probably intended as a sequel to the short drama printed on pp. 95-
104.
The original manuscript has been divided into two parts, one sheet
of four pages having been removed and certain words erased (see
footnotes on pages 126 and 129), apparently in an attempt to make
it appear as two separate and complete manuscripts. The missing
words have been obtained from a transcript made before the
manuscript was mutilated.
C. W. H.
LOVE AND JEALOUSY

In the autumn of the year 1831, being weary of study, and the
melancholy solitude of the vast streets and mighty commercial marts
of our great Babel, and being fatigued with the ever-resounding
thunder of the sea, with the din of a thousand self-moving engines,
with the dissonant cries of all nations, kindreds, and tongues,
congregated together in the gigantic emporium of commerce, of
arts, of God-like wisdom, of boundless learning, and of superhuman
knowledge; being dazzled with continually beholding the glory, the
power, the riches, dominion, and radiant beauty of the city which
sitteth like a queen upon the waters; in one word, being tired of
Verdopolis and all its magnificence, I determined on a trip into the
country.
Accordingly, the day after this resolution was formed, I rose with
the sun, collected a few essential articles of dress, etc., packed them
neatly in a light knapsack, arranged my apartment, partook of a
wholesome repast, and then, after locking the door and delivering
the key to my landlady, I set out with a light heart and joyous step.
After three days of continued travel I arrived on the banks of a
wide and profound river winding through a vast valley embosomed
in hills whose robe of rich and flowery verdure was broken only by
the long shadow of groves, and here and there by clustering herds
and flocks lying, white as snow, in the green hollows between the
mountains. It was the evening of a calm summer day when I
reached this enchanting spot. The only sounds now audible were the
songs of shepherds, swelling and dying at intervals, and the murmur
of gliding waves. I neither knew nor cared where I was. My bodily
faculties of eye and ear were absorbed in the contemplation of this
delightful scene, and, wandering unheedingly along, I left the
guidance of the river and entered a wood, invited by the warbling of
a hundred forest minstrels. Soon I perceived the narrow, tangled
woodpath to widen, and gradually it assumed the appearance of a
green shady alley. Occasionally bowers of roses and myrtles
appeared by the pathside, with soft banks of moss for the weary to
repose on. Notwithstanding these indications of individual property,
curiosity and the allurements of music and cool shade led me
forwards. At length I entered a glade in the wood, in the midst of
which was a small but exquisitely beautiful marble edifice of pure
and dazzling whiteness. On the broad steps of the portico two
figures were reclining, at sight of whom I instantly stepped behind a
low, wide-spreading fig-tree, where I could hear and see all that
passed without fear of detection. One was a youth of lofty stature
and remarkably graceful demeanour, attired in a rich purple vest and
mantle, with closely fitting pantaloons of white woven silk,
displaying to advantage the magnificent proportions of his form. A
richly adorned belt was girt tightly round his waist from which
depended a scimitar whose golden hilt, and scabbard of the finest
Damascus steel, glittered with gems of inestimable value. His steel-
barred cap, crested with tall, snowy plumes, lay beside him, its
absence revealing more clearly the rich curls of dark, glossy hair
clustering round a countenance distinguished by the noble beauty of
its features, but still more by the radiant fire of genius and intellect
visible in the intense brightness of his large, dark, and lustrous eyes.
The other form was that of a very young and slender girl, whose
complexion was delicately, almost transparently, fair. Her cheeks
were tinted with a rich, soft crimson, her features moulded in the
utmost perfection of loveliness; while the clear light of her brilliant
hazel eyes, and the soft waving of her auburn ringlets, gave
additional charms to what seemed already infinitely too beautiful for
this earth. Her dress was a white robe of the finest texture the
Indian loom can produce. The only ornaments she wore were a long
chain which encircled her neck twice and hung lower than her waist,
composed of alternate beads of the finest emeralds and gold; and a
slight gold ring on the third finger of her left hand, which, together
with a small crescent of pearls glistening on her forehead (which is
always worn by the noble matrons of Verdopolis), betokened that
she had entered the path of wedded life. With a sweet vivacity in her
look and manner the young bride was addressing her lord thus when
I first came in sight of the peerless pair:
‘No, no, my lord; if I sing the song you shall choose it. Now, once
more, what shall I sing? The moon is risen, and, if your decision is
not prompt, I will not sing at all!’
To this he answered: ‘Well, if I am threatened with the entire loss
of the pleasure if I defer my choice, I will have that sweet song
which I overheard you singing the evening before I left Scotland.’*
With a smiling blush she took a little ivory lyre, and, in a voice of
the most touching melody, sang the following stanzas:—
He is gone, and all grandeur has fled from the mountain;
All beauty departed from stream and from fountain;
A dark veil is hung
O’er the bright sky of gladness,
And, where birds sweetly sung,
There’s a murmur of sadness;
The wind sings with a warning tone
Through many a shadowy tree;
I hear, in every passing moan,
The voice of destiny.

Then, O Lord of the Waters! the Great and All-seeing!
Preserve in Thy mercy his safety and being;
May he trust in Thy might
When the dark storm is howling,
And the blackness of night
Over Heaven is scowling;
But may the sea flow glidingly
With gentle summer waves;
And silent may all tempests lie
Chained in Æolian caves!

Yet, though ere he returnest long years will have vanished,
Sweet hope from my bosom shall never be banished:
I will think of the time
When his step, lightly bounding,
Shall be heard on the rock
Where the cataract is sounding;
When the banner of his father’s host
Shall be unfurled on high,
To welcome back the pride and boast
Of England’s chivalry!

Yet tears will flow forth while of hope I am singing;
Still despair her dark shadow is over me flinging;
But, when he’s far away,
I will pluck the wild flower
On bank and on brae
At the still, moonlight hour;
And I will twine for him a wreath
Low in the fairy’s dell;
Methought I heard the night-wind breathe
That solemn word: ‘Farewell!’*
When the lady had concluded her song I stepped from my place
of concealment, and was instantly perceived by the noble youth
(whom, of course, every reader will have recognised as the Marquis
of Douro).
He gave me a courteous welcome, and invited me to proceed with
him to his country palace, as it was now wearing late. I willingly
accepted the invitation, and, in a short time, we arrived there.
It is a truly noble structure, built in the purest style of Grecian
architecture, situated in the midst of a vast park, embosomed in
richly wooded hills, perfumed with orange and citron groves, and
watered by a branch of the Gambia, almost equal in size to the
parent stream.
The magnificence of the interior is equal to that of the outside.
There is an air of regal state and splendour throughout all the lofty
domed apartments which strikes the spectator with awe for the lord
of so imposing a residence. The marquis has a particular pride in the
knowledge that he is the owner of one of the most splendid, select,
and extensive libraries now in the possession of any individual. His
picture and statue galleries likewise contain many of the finest
works, both of the ancient and modern masters, particularly the
latter, of whom the marquis is a most generous and munificent
patron. In his cabinet of curiosities I observed a beautiful casket of
wrought gold. At my request he opened it and produced the
contents, viz. a manuscript copy of that rare work, ‘The
Autobiography of Captain Leaf.’ It was written on a roll of vellum, but
much discoloured and rendered nearly illegible by time. To my eager
inquiries respecting the manner in which he had obtained so
inestimable a treasure, he replied, with a smile:
‘That question I must decline to answer. It is a secret with which I
alone am acquainted.’
I likewise noticed a brace of pistols, most exquisitely wrought and
highly finished. He told me they were the chef-d’œuvre of Darrow,
the best manufacturer of firearms in the universe. I counted one
hundred gold and silver medals, which had been presented to this
youthful but all-accomplished nobleman by different literary and
scientific establishments. They were all contained in a truly splendid
gold vase awarded to him last year by the Academy of Modern
Athenians (as that learned body somewhat presumptuously chooses
to style itself) as being the composer of the best epigram in Greek.
Above this was suspended a silver bow and quiver, the first prize
given by the Royal Society of Archers, together with a bit, bridle,
spurs, and stirrups, all of fine gold, obtained from the Honourable
Community of Equestrians. Near these lay several withered wreaths
of myrtle, laurel, etc., etc., won by him as conqueror in the great
African Biennial Games. On a rich stand of polished ebony were
ranged twenty-three beautiful vases of marble, alabaster, etc., all
richly carved in basso-relievo, remarkable for classic elegance of
form, design, and execution. Some of these were filled with cameos,
others with ancient coins, and others again bore branches of scarlet
and white coral, pearls, gems of various sorts, fossils, etc. But what
interested me more than all these trophies of victory and specimens
of art and nature, costly, beautiful, and almost invaluable as they
were, was a little figure of Apollo, about six inches in height,
curiously carved in white agate, holding a lyre in his hand, and
placed on a pedestal of the same valuable material, on which was
the following inscription:—
In our day we beheld the god of Archery, Eloquence, and Verse, shrined in an
infinitely fairer form than that worn by the ancient Apollo, and giving far more
glorious proofs of his divinity than the day-god ever vouchsafed to the
inhabitants of the old Pagan world. Zenobia Ellrington implores Arthur Augustus
Wellesley to accept this small memorial, and consider it as a token that, though
forsaken and despised by him whose good opinion and friendship she valued
more than life, she yet bears no malice.
There was a secret contained in this inscription which I could not
fathom. I had never before heard of any misunderstanding between
his lordship and Lady Zenobia, nor did public appearances warrant a
suspicion of its existence. Long after, however, the following
circumstances came to my knowledge. The channel through which
they reached me cannot be doubted, but I am not at liberty to
mention names.
*
One evening about dusk, as the Marquis of Douro was returning
from a shooting excursion into the country, he heard suddenly a
rustling noise in a deep ditch on the roadside. He was preparing his
fowling-piece for a shot when the form of Lady Ellrington started up
before him. Her head was bare, her tall person was enveloped in the
tattered remnants of a dark velvet mantle. Her dishevelled hair hung
in wild elf-locks over her face, neck, and shoulders, almost
concealing her features, which were emaciated and pale as death.
He stepped back a few paces, startled at the sudden and ghastly
apparition. She threw herself on her knees before him, exclaiming in
wild, maniacal accents:
‘My lord, tell me truly, sincerely, ingenuously, where you have
been. I heard that you had left Verdopolis, and I followed you on
foot five hundred miles. Then my strength failed me, and I lay down
in this place, as I thought, to die. But it was doomed I should see
you once more before I became an inhabitant of the grave. Answer
me, my lord: Have you seen that wretch Marian Hume? Have you
spoken to her? Viper! Viper! Oh, that I could sheathe this weapon in
her heart!’
Here she stopped for want of breath, and, drawing a long, sharp,
glittering knife from under her cloak, brandished it wildly in the air.
The marquis looked at her steadily, and, without attempting to
disarm her, answered with great composure:
‘You have asked me a strange question, Lady Zenobia; but, before
I attempt to answer, you had better come with me to our
encampment. I will order a tent to be prepared for you where you
may pass the night in safety, and, to-morrow, when you are a little
recruited by rest and refreshment, we will discuss this matter
soberly.’*
Her rage was now exhausted by its own vehemence, and she
replied with more calmness than she had hitherto evinced:
‘My lord, believe me, I am deputed by Heaven to warn you of a
great danger into which you are about to fall. If you persist in your
intention of uniting yourself to Marian Hume you will become a
murderer and a suicide. I cannot explain myself more clearly; but
ponder carefully on my words until I see you again.’
Then, bowing her forehead to the ground in an attitude of
adoration, she kissed his feet, muttering at the same time some
unintelligible words. At that moment a loud rushing, like the sound
of a whirlpool, became audible, and Lady Zenobia was swept away
by some invisible power before the marquis could extend his arms to
arrest her progress, or frame an answer to her mysterious address.
He paced slowly forward, lost in deep reflection on what he had
heard and seen. The moon had risen over the black, barren
mountains ere he reached the camp. He gazed for awhile on her
pure, undimmed lustre, comparing it to the loveliness of one far
away, and then, entering his tent, wrapped himself in his hunter’s
cloak, and lay down to unquiet sleep.
Months rolled away, and the mystery remained unravelled. Lady
Zenobia Ellrington appeared as usual in that dazzling circle of which
she was ever a distinguished ornament. There was no trace of
wandering fire in her eyes which might lead a careful observer to
imagine that her mind was unsteady. Her voice was more subdued
and her looks pale, and it was remarked by some that she avoided
all (even the most commonplace) conversation with the marquis.
In the meantime the Duke of Wellington had consented to his
son’s union with the beautiful, virtuous, and accomplished, but
untitled, Marian Hume.
Vast and splendid preparations were making for the approaching
bridal, when just at this critical juncture news arrived of the Great
Rebellion headed by Alexander Rogue. The intelligence fell with the
suddenness and violence of a thunderbolt. Unequivocal symptoms of
dissatisfaction began to appear at the same time among the lower
orders in Verdopolis. The workmen at the principal mills and
furnaces struck for an advance of wages, and, the masters refusing
to comply with their exorbitant demands, they all turned out
simultaneously. Shortly after, Colonel Grenville, one of the great
millowners, was shot. His assassins being quickly discovered and
delivered up to justice were interrogated by torture, but they
remained inflexible, not a single satisfactory answer being elicited
from them. The police were now doubled. Bands of soldiers were
stationed in the more suspicious parts of the city, and orders were
issued that no citizen should walk abroad unarmed. In this state of
affairs Parliament was summoned to consult on the best measures
to be taken. On the first night of its sitting the house was crowded
to excess. All the members attended, and above a thousand ladies of
the first rank appeared in the gallery. A settled expression of gloom
and anxiety was visible in every countenance. They sat for some
time gazing at each other in the silence of seeming despair. At
length the Marquis of Douro rose and ascended the tribune. It was
on this memorable night he pronounced that celebrated oration
which will be delivered to posterity as a finished specimen of the
sublimest eloquence. The souls of all who heard him were thrilled
with conflicting emotions. Some of the ladies in the gallery fainted
and were carried out. My limits will not permit me to transcribe the
whole of this speech, and to attempt an abridgment would be
profanation. I will, however, present the reader with the conclusion.
It was as follows:—
I will call upon you, my countrymen, to rouse yourselves to action. There is a
latent flame of rebellion smouldering in our city, which blood alone can quench:
the hot blood of ourselves and our enemies freely poured forth! We daily see in
our streets men whose brows were once open as the day, but which are now
wrinkled with dark dissatisfaction, and the light of whose eyes, formerly free as
sunshine, is now dimmed by restless suspicion. Our upright merchants are ever
threatened with fears of assassination from those dependants who, in time past,
loved, honoured, and reverenced them as fathers. Our peaceful citizens cannot
pass their thresholds in safety unless laden with weapons of war, the continual
dread of death haunting their footsteps wherever they turn. And who has
produced this awful change? What agency of hell has effected, what master-
spirit of crime, what prince of sin, what Beelzebub of black iniquity, has been at
work in this Kingdom? I will answer that fearful question: Alexander Rogue! Arm
for the battle, then, fellow-countrymen; be not faint-hearted, but trust in the
justice of your cause as your banner of protection, and let your war-shout in the
onslaught ever be: ‘God defend the right!’
When the marquis had concluded this harangue, he left the house
amidst long and loud thunders of applause, and proceeded to one of
the shady groves planted on the banks of the Guadima. Here he
walked for some time inhaling the fresh night-wind, which acquired
additional coolness as it swept over the broad rapid river, and was
just beginning to recover from the strong excitement into which his
enthusiasm had thrown him when he felt his arm suddenly grasped
from behind, and turning round beheld Lady Zenobia Ellrington
standing beside him, with the same wild, unnatural expression of
countenance which had before convulsed her features among the
dark hills of Gibbel Kumri.
‘My lord,’ she muttered, in a low, energetic tone, ‘your eloquence,
your noble genius has again driven me to desperation. I am no
longer mistress of myself, and if you do not consent to be mine, and
mine alone, I will kill myself where I stand.’
‘Lady Ellrington,’ said the marquis coldly, withdrawing his hand
from her grasp, ‘this conduct is unworthy of your character. I must
beg that you will cease to use the language of a madwoman, for I
do assure you, my lady, these deep stratagems will have no effect
upon me.’
She now threw herself at his feet, exclaiming in a voice almost
stifled with ungovernable emotion:
‘Oh! do not kill me with such cold, cruel disdain. Only consent to
follow me, and you will be convinced that you ought not to be united
to one so utterly unworthy of you as Marian Hume.’
The marquis, moved by her tears and entreaties, at length
consented to accompany her. She led him a considerable distance
from the city to a subterranean grotto, where was a fire burning on
a brazen altar. She threw a certain powder into the flame, and
immediately they were transported through the air to an apartment
at the summit of a lofty tower. At one end of this room was a vast
mirror, and at the other a drawn curtain, behind which a most
brilliant light was visible.
‘You are now,’ said Lady Ellrington, ‘in the sacred presence of one
whose counsel, I am sure, you, my lord, will never slight.’
At this moment the curtain was removed, and the astonished
marquis beheld Crashie, the divine and infallible, seated on his
golden throne, and surrounded by those mysterious rays of light
which ever emanate from him.
‘My son,’ said he, with an august smile, and in a voice of awful
harmony, ‘fate and inexorable destiny have decreed that in the hour
you are united to the maiden of your choice, the angel Azazel shall
smite you both, and convey your disembodied souls over the swift-
flowing and impassable river of death. Hearken to the counsels of
wisdom, and do not, in the madness of self-will, destroy yourself and
Marian Hume by refusing the offered hand of one who, from the
moment of your birth, was doomed by the prophetic stars of heaven
to be your partner and support through the dark, unexplored
wilderness of future life.’
He ceased. The combat betwixt true love and duty raged for a few
seconds in the marquis’s heart, and sent his life-blood in a tumult of
agony and despair burning to his cheek and brow. At length duty
prevailed, and, with a strong effort, he said in a firm, unfaltering
voice:
‘Son of Wisdom! I will war no longer against the high decree of
heaven, and here I swear by the eternal—’
The rash oath was checked in the moment of its utterance by
some friendly spirit who whispered in his ear:
‘There is magic. Beware!’
At the same time Crashie’s venerable form faded away, and in its
stead appeared the evil genius, Danhasch,* in all the naked
hideousness of his real deformity. The demon soon vanished with a
wild howl of rage, and the marquis found himself again in the grove
with Lady Ellrington.
