04 - Area Data Analysis II PDF

This document summarizes key concepts about spatial autocorrelation and methods for measuring it. It discusses how spatial autocorrelation involves correlation between values at different locations. Join-count statistics, Moran's I, and Getis-Ord General G are introduced as common global indices used to test for spatial autocorrelation. Local indices like Local Moran's I and Getis statistic are also mentioned. Examples are provided to demonstrate calculating join-counts, Moran's I, and assessing their significance.


EARTHSC/ENVIRSC/GEOG 4GA3

Esri International User Conference, San Diego, CA | Technical Workshops

Applied Spatial Statistics
Area Data Analysis IV

Patrick DeLuca, MA, GISP
November 3, 2015

Exploring 2nd Order Effects

Do attributes in neighbouring zones show spatial dependency? i.e. Do they covary?

Spatial Autocorrelation
  Involves correlation between values of the same variable at different spatial locations
  It is conceptually and empirically the two-dimensional equivalent of redundancy.

Patrick DeLuca, Applied Spatial Statistics

Spatial Autocorrelation

Null hypothesis: No spatial autocorrelation
  Values observed at a location do not depend on values observed at neighbouring locations
  Observed spatial pattern of values is equally likely as any other spatial pattern

Alternative Hypotheses of SA

Positive Spatial Autocorrelation
  Like values tend to cluster in space
  Neighbours are similar

Negative Spatial Autocorrelation
  Neighbours are dissimilar
  Checkerboard pattern

Spatial Autocorrelation

Why is spatial autocorrelation important?
  Most statistics are based on the assumption that the values of observations in each sample are independent
  If the observations, however, are spatially correlated in some way, the estimates obtained will be biased and overly precise.
    Biased: the areas with a higher concentration of events will have a greater impact on the model estimate
    Overestimated precision: since events tend to be concentrated, there are actually fewer independent observations than are being assumed

Indices of Spatial Correlation

Most common global approaches
  Join Count Statistics
  Moran's I
  Getis-Ord General G

Local Approaches
  Local Moran's I
  Local Getis Statistic

Join Count Statistics

Applied to binary variables mapped as two colours (Black and White) such that a join, or edge, is classified as either WW (0-0), BB (1-1) or BW (1-0)
Interested in the number of occurrences of each possible join between neighbouring cells
Can show
  Positive spatial autocorrelation (clustering) if the number of BW joins is significantly lower than what we would expect by chance
  Negative spatial autocorrelation (dispersion) if the number of BW joins is significantly higher than what we would expect by chance
  Null spatial autocorrelation if the number of BW joins is the same as expected
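The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `join_counts`, the 4 x 4 example grids and the choice of rook contiguity are all assumptions made for the example.

```python
def join_counts(grid):
    """Count BB, WW and BW joins on a binary grid (1 = black, 0 = white), rook's case."""
    rows, cols = len(grid), len(grid[0])
    counts = {"BB": 0, "WW": 0, "BW": 0}
    for r in range(rows):
        for c in range(cols):
            # Look only right and down so that each join is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    a, b = grid[r][c], grid[rr][cc]
                    if a != b:
                        counts["BW"] += 1
                    elif a == 1:
                        counts["BB"] += 1
                    else:
                        counts["WW"] += 1
    return counts

# Hypothetical example maps: a checkerboard and a map split into two halves.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
halves = [[1, 1, 0, 0] for _ in range(4)]
print(join_counts(checkerboard))  # {'BB': 0, 'WW': 0, 'BW': 24}
print(join_counts(halves))        # {'BB': 10, 'WW': 10, 'BW': 4}
```

The checkerboard has only BW joins (dispersion), while the split map has very few (clustering).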

Join Count Statistics

[Figure: example grids with their counts of BB, WW and BW joins and the total number of joins]

Key is the BW count
  O_BW = E_BW: random
  O_BW ≠ E_BW: not random
  O_BW > E_BW: more dispersed
  O_BW < E_BW: more clustered

Join Count Statistics

Expected values under free sampling
  Free (or normal) sampling is used when you can determine the probability of an area being black or white

  $E(J_{BB}) = k p_B^2 = 6$
  $E(J_{WW}) = k p_W^2 = 6$
  $E(J_{BW}) = 2 k p_B p_W = 12$

  k = total number of joins (= 24)
  p_B = probability of being coded black (= 0.5)
  p_W = probability of being coded white (= 0.5)

Join Count Statistics

Standard Deviations
  Need to compute the total set of all possibilities
  Given by

  $m = \frac{1}{2} \sum_{i=1}^{n} k_i (k_i - 1) = 52$

  $\sigma_{BB} = \sqrt{k p_B^2 + 2 m p_B^3 - (k + 2m) p_B^4} = 3.32$

  $\sigma_{BW} = \sqrt{2 (k + m) p_B p_W - 4 (k + 2m) p_B^2 p_W^2} = 2.45$
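The free-sampling quantities can be checked numerically. The sketch below assumes a 4 x 4 grid with rook contiguity, which is an inference rather than something the slides state: that layout happens to reproduce k = 24 and m = 52. Function names are illustrative.

```python
import math

def rook_degrees(rows, cols):
    """Number of rook neighbours k_i for every cell of a rows x cols grid."""
    degs = []
    for r in range(rows):
        for c in range(cols):
            degs.append((r > 0) + (r < rows - 1) + (c > 0) + (c < cols - 1))
    return degs

def free_sampling_moments(degrees, p_b):
    """Expected join counts and standard deviations under free sampling."""
    p_w = 1.0 - p_b
    k = sum(degrees) / 2                      # every join is shared by two cells
    m = 0.5 * sum(d * (d - 1) for d in degrees)
    return {
        "k": k,
        "m": m,
        "E_BB": k * p_b ** 2,
        "E_BW": 2 * k * p_b * p_w,
        "sd_BB": math.sqrt(k * p_b**2 + 2 * m * p_b**3 - (k + 2 * m) * p_b**4),
        "sd_BW": math.sqrt(2 * (k + m) * p_b * p_w
                           - 4 * (k + 2 * m) * p_b**2 * p_w**2),
    }

mom = free_sampling_moments(rook_degrees(4, 4), p_b=0.5)
# Z score for a checkerboard, where all 24 joins are BW (assumed example).
z_bw = (24 - mom["E_BW"]) / mom["sd_BW"]
print(mom)
print(round(z_bw, 2))
```

Under these assumptions the standard deviations round to 3.32 and 2.45, matching the values quoted on the slide.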

Join Count Statistics

Z Scores (one column per example grid)

Join Type    Grid 1    Grid 2    Grid 3
BB           -1.81     -0.30      1.20
WW           -1.81      0.30      1.20
BW            4.90      0.00     -3.27


Join Count Statistics

Contiguity Matrix
  Used Rook's Case
    Total Joins = 214/2 = 107
  This one here is Queen's Case
    Total Joins = 218/2 = 109

(O'Sullivan and Unwin, 2003)

Join Count Statistics

[Map legend: Obama won / Romney won]

Obama
  59,647,121 votes, p(Obama) = 0.511
  303 electoral votes, p(Obama)e = 0.595
Romney
  57,022,021 votes, p(Romney) = 0.489
  206 electoral votes, p(Romney)e = 0.405

k = 107
m = 421

Join Count Statistics

Obama-Obama joins = 33.5
Romney-Romney joins = 40
Obama-Romney joins = 33.5

Join Count Statistics

P based on Votes

Join Type        Measured   Estimated   Std. Dev   Z Score
Obama-Obama      33.5       27.94       8.694       0.640
Romney-Romney    40         25.586      8.353       1.726
Obama-Romney     33.5       53.474      3.066      -6.514

P based on Electoral votes

Join Type        Measured   Estimated   Std. Dev   Z Score
Obama-Obama      33.5       37.881      9.813      -0.446
Romney-Romney    40         17.551      6.925       3.242
Obama-Romney     33.5       51.569      15.592     -1.133

Join Count Statistics

US Election using Non-Free Sampling
  Used when there is no a priori knowledge of what should be B or W.
  Different method to compute $E_{BW}$ and $\sigma_{BW}$

  $E_{BW} = \frac{2 J B W}{N (N - 1)}$

  J = total number of joins, B = # Black, W = # White, N = B + W = total number of areas

  $\sigma_{BW} = \sqrt{E_{BW} + \frac{2J(2J - 1)BW}{N(N + 1)} + \frac{4[J(J - 1) - 2J(2J - 1)]\,B(B - 1)W(W - 1)}{N(N - 1)(N - 2)(N - 3)} - E_{BW}^2}$

  $Z_b = \frac{O_{BW} - E_{BW}}{\sigma_{BW}} = \frac{33.5 - 53.5}{5.888} = -3.4$

Join Count Statistics

Limitations
  The join count statistic can only be used on binary data.
    But a lot of data can be transformed into binary.
    e.g. rainfall data can easily be converted into "wet" or "dry" regions by determining those areas above or below the mean.
  Equations for determining the standard deviations are reasonably complex
    Easy to make a mistake when implementing them

Moran's I

One of the oldest indicators of spatial autocorrelation (Moran, 1950). The de facto standard for determining spatial autocorrelation
Applied to zones or points with continuous variables associated with them.
Compares the value of the variable at any one location with the value at all other locations, for a spatial weights matrix W

Moran's I

Behaves in a manner similar to Pearson's correlation coefficient

  $I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (z_i - \bar{z})(z_j - \bar{z})}{\sum_{i=1}^{n} (z_i - \bar{z})^2 \sum_{i \neq j} w_{ij}}$

Values typically range from -1 to +1
  -ve has a checkerboard pattern
  0 is uncorrelated
  +ve is clustered (no distinction between high and low values)
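As a worked illustration of the formula, the sketch below computes Moran's I for six zones arranged in a row. The zone layout, the binary contiguity weights and the function names are assumptions chosen to keep the arithmetic easy to follow by hand.

```python
def morans_i(values, w):
    """Global Moran's I for a list of values and a square weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights between distinct zones.
    s0 = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    # Weighted cross-products of deviations from the mean.
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n) if i != j)
    return n * cross / (sum(d * d for d in dev) * s0)

def chain_weights(n):
    """Binary contiguity for n zones in a row (each zone touches the next)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(6)
i_clustered = morans_i([1, 1, 1, 0, 0, 0], w)    # like values side by side
i_alternating = morans_i([1, 0, 1, 0, 1, 0], w)  # checkerboard in one dimension
print(i_clustered, i_alternating)
```

The clustered arrangement gives a positive value and the alternating arrangement a negative one, in line with the interpretation above.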

Moran's I

The expected value of I under the null hypothesis is -1/(n-1), which approaches 0 as n grows.

Positive Spatial Autocorrelation
  I > -1/(n-1)
  Spatial clustering of high and/or low values

Negative Spatial Autocorrelation
  I < -1/(n-1)
  Checkerboard pattern

Moran's I

Assessing significance using the Normal approximation method
  Null hypothesis states that the values represent one of many possible samples of values
  If you randomly select values to distribute across your study area, most of the time it would produce a pattern and distribution of values that would not be markedly different from the observed pattern
  Assuming that your data and its arrangement are one of many, many, possible random samples

Moran's I

Assessing significance using the Normal approximation method
  Empirical distribution can be compared to the theoretical distribution through a Z test

  $Z(I) = \frac{I - E(I)}{SE(I)}$

  $SE(I) = \sqrt{\frac{N^2 \sum_{ij} w_{ij}^2 + 3 \left(\sum_{ij} w_{ij}\right)^2 - N \sum_{i} \left(\sum_{j} w_{ij}\right)^2}{(N^2 - 1) \left(\sum_{ij} w_{ij}\right)^2}}$

Moran's I

[Figure: Normal approximation example]

Moran's I

Assessing significance using random permutations
  Suppose we have n values y_i relating to our study area
  Then n! permutations of the map are possible, each corresponding to a different arrangement of the n data values
  The value of I can be calculated for any of these permutations, so we can create an empirical distribution for possible values of I under random permutations of the n data values.
  Plot the distribution of these permutations and compare our observed value to the distribution
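The permutation approach can be sketched as follows. Because n! is usually far too large to enumerate, the sketch draws a random sample of permutations instead; the 999 draws, the one-sided pseudo p-value and the ten-zone example are assumptions made for illustration, not details taken from the slides.

```python
import random

def morans_i(values, w):
    """Global Moran's I for a list of values and a square weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n) if i != j)
    return n * cross / (sum(d * d for d in dev) * s0)

def permutation_test(values, w, n_perm=999, seed=0):
    """Compare the observed I with I computed on randomly rearranged maps."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = morans_i(values, w)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # one random rearrangement of the values
        sims.append(morans_i(shuffled, w))
    # Share of rearrangements with I at least as large as the observed I.
    n_extreme = sum(1 for s in sims if s >= observed)
    pseudo_p = (n_extreme + 1) / (n_perm + 1)
    return observed, sims, pseudo_p

n = 10
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
values = list(range(1, n + 1))         # a smooth gradient across ten zones in a row
observed, sims, pseudo_p = permutation_test(values, w)
print(round(observed, 3), pseudo_p)
```

A histogram of `sims` with the observed value marked on it gives the plot described in the last bullet.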

Moran's I

[Figure: Random permutation example]

Moran Scatterplots

Linear association between the value at i and the weighted average of its neighbours
Four quadrants
  High-High, Low-Low = spatial clusters
  High-Low, Low-High = spatial outliers
What can be read off of this graph
  Slope = Moran's I
  Outliers
  High leverage points
  Spatial regimes
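The claim that the slope equals Moran's I can be checked directly. The sketch assumes row-standardized weights (each zone's neighbour weights sum to 1), which is the setting in which the equality holds exactly; the data values and function names are made up for the example.

```python
def row_standardize(w):
    """Scale each row of w to sum to 1 (assumes every zone has a neighbour)."""
    return [[x / sum(row) for x in row] for row in w]

def moran_scatter(values, w_std):
    """Return mean-centred values and their spatial lags (neighbour averages)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    lag = [sum(w_std[i][j] * z[j] for j in range(n)) for i in range(n)]
    return z, lag

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def quadrant(zi, lagi):
    """Label a point by its own value and its neighbours' average."""
    return ("High" if zi >= 0 else "Low") + "-" + ("High" if lagi >= 0 else "Low")

n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
values = [2, 4, 3, 8, 9, 7]            # hypothetical data for six zones in a row
z, lag = moran_scatter(values, row_standardize(w))
slope = ols_slope(z, lag)
labels = [quadrant(a, b) for a, b in zip(z, lag)]
print(round(slope, 3), labels)
```

Points labelled High-High or Low-Low fall in the cluster quadrants; High-Low and Low-High points are candidate spatial outliers.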

Correlograms

Using Moran's I to produce a correlogram
  Use proximity matrix W_k, where k is the lag
Visualization
  Spatial autocorrelation statistics for increasing lag
Interpretation
  Identification of spatial process
  Range of association
  Possible indication of misspecified spatial weights

Correlograms

[Figure from Malczewski, 2009]

Correlograms

[Figure: Correlogram of Respiratory Disease, Hamilton, 2008. Vertical axis: Moran's I (0 to 0.7); horizontal axis: Spatial Lag]

Correlograms

[Figure from Malczewski, 2009]

Correlograms

[Figure: Correlogram & Partial Correlogram of Respiratory Disease. Vertical axis: Moran's I (-0.2 to 0.7); horizontal axis: Spatial Lag; series: Correlogram, Partial Correlogram]

Getis-Ord General G

Measures how concentrated the high or low values are for a given study area.
The null hypothesis: "there is no spatial clustering of the values".
  Significant and positive Z scores indicate high values cluster
  Significant and negative Z scores indicate low values cluster
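The slides do not give the formula, so the sketch below uses the commonly cited definition of General G: the weighted sum of cross-products x_i x_j over all pairs of distinct zones, divided by the unweighted sum, with expected value S0 / (n(n - 1)) under the null. Treat this as an assumption, along with the example data; the statistic is meant for positive values.

```python
def general_g(x, w):
    """Return (G, E[G]) for positive values x and a binary weights matrix w."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    num = sum(w[i][j] * x[i] * x[j] for i, j in pairs)   # neighbouring pairs only
    den = sum(x[i] * x[j] for i, j in pairs)             # all pairs
    s0 = sum(w[i][j] for i, j in pairs)
    return num / den, s0 / (n * (n - 1))

n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
g_high, e_g = general_g([10, 10, 10, 1, 1, 1], w)   # high values side by side
g_disp, _ = general_g([10, 1, 10, 1, 10, 1], w)     # high values kept apart
print(round(g_high, 3), round(g_disp, 3), round(e_g, 3))
```

G above its expected value points to clustering of high values. Turning the difference into a Z score also needs the variance of G, which is omitted here.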

Global and Local Autocorrelation

Global
  One statistic to summarize pattern
  Informs if clustering exists in the data

Local
  Location-specific statistics
  Shows us where the clusters are located

LISA Statistics

Local Indicator of Spatial Association
Satisfies two requirements:
  Indicates significant spatial clustering for each location
  Sum of LISA is proportional to a global indicator of spatial association

LISA forms of global statistics
  Local Moran's I
  Local Getis-Ord Gi*
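The second requirement can be illustrated with the local Moran statistic. The sketch assumes the usual definition I_i = (z_i / m2) * sum_j w_ij z_j, with m2 the mean squared deviation; under that definition the local values sum to S0 times the global Moran's I. The six-zone example is hypothetical.

```python
def local_morans(values, w):
    """Local Moran's I_i for each zone, using deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n                 # mean squared deviation
    return [z[i] / m2 * sum(w[i][j] * z[j] for j in range(n) if j != i)
            for i in range(n)]

n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
values = [1, 1, 1, 0, 0, 0]
local = local_morans(values, w)
s0 = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
# Dividing the sum of the local statistics by S0 recovers the global Moran's I.
global_from_local = sum(local) / s0
print(local, global_from_local)
```

Zones deep inside a run of like values get the largest local statistics, while the two zones at the boundary between the runs get zero.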

LISA Statistics

Use:
  Identify hot spots
    Significant local clusters in absence of global association
  Significant local outliers
    High surrounded by low and vice versa
  Indicate local instability
    Local deviations from global pattern of spatial association


Local Moran

Interpretation
  Assessing lack of spatial randomness
  Suggests significant spatial structure
  Suggests interesting locations
  Does not explain them

Getis-Ord Gi*

Local version of Getis-Ord G
  The local sum is compared proportionally to the sum of all features
  When the local sum is very different from the expected local sum, and that difference is too large to be the result of random chance, a statistically significant Z score results.
    Significant +ve Z scores: the larger the Z score is, the more intense the clustering of high values
    Significant negative Z scores: the smaller the Z score is, the more intense the clustering of low values.
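The comparison of local sum and expected local sum can be sketched as below. The slides do not show the formula, so this uses the widely documented z-score form of Gi* in which each zone is included in its own neighbourhood; that form, the eight-zone example and the function name are assumptions.

```python
import math

def getis_ord_gi_star(x, w):
    """Gi* z-scores; w[i][i] must be 1 so that each zone counts itself."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum(v * v for v in x) / n - mean ** 2)
    scores = []
    for i in range(n):
        sw = sum(w[i])                          # sum of weights around zone i
        sw2 = sum(v * v for v in w[i])          # sum of squared weights
        local_sum = sum(w[i][j] * x[j] for j in range(n))
        expected = mean * sw                    # expected local sum under the null
        spread = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append((local_sum - expected) / spread)
    return scores

n = 8
# Zones in a row; each neighbourhood is the zone itself plus adjacent zones.
w = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
x = [10, 9, 8, 1, 1, 1, 1, 1]
gi = getis_ord_gi_star(x, w)
print([round(g, 2) for g in gi])
```

Zones at the high end of the row get large positive scores (a hot spot), and zones at the low end get negative scores.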


Modeling Area Data

Consider a multiple regression equation

  $y_i = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + e_i$

  y_i = dependent variable
  x_1, x_2 ... x_n = independent variables
  a = constant (intercept)
  b_1, b_2 ... b_n = regression coefficients
  e_i = error term (residual, or difference between predicted and observed values of y_i)

Regression Analysis: Assumptions

Multicollinearity: there is no inter-correlation of independent variables
Normality: error terms (e_i) are normally distributed, and the mean of the error term is 0
Homoskedasticity (equal variance): the residuals are dispersed randomly throughout the range of the estimated dependent variable
Spatial independence: there is no spatial autocorrelation of the residuals

Example: Average Age of Death

What explains average age of death?
Variables that were statistically significant in a bivariate regression

Variable    t-test
Dwell       5.04
MedInc      4.41
LICO_All    3.97
NoEdu       5.31
Univ        5.53
DropOut     4.02
Seniors     6.47

Example: Cardiac Admissions

[Figure: cardiac admissions example]

Analysis of Residuals

Multicollinearity: condition number > 30 is problematic
Jarque-Bera: tests the joint hypothesis of skewness = 0 and (excess) kurtosis = 0
  Is the data consistent with having skewness and excess kurtosis equal to 0?
  When p > 0.05 it is consistent with 0 skew and 0 excess kurtosis
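A minimal sketch of the Jarque-Bera statistic, assuming the standard form JB = n/6 * (S^2 + K^2/4), where S is the sample skewness and K the excess kurtosis, compared against a chi-square distribution with 2 degrees of freedom. The two sets of residuals are invented for the example, and the chi-square approximation is rough for samples this small.

```python
import math

def jarque_bera(resid):
    """Return (JB statistic, p-value) for a list of residuals."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((r - mean) ** 2 for r in resid) / n
    m3 = sum((r - mean) ** 3 for r in resid) / n
    m4 = sum((r - mean) ** 4 for r in resid) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # For a chi-square with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

symmetric = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
skewed = [-1] * 18 + [9] * 2           # a few large positive residuals
jb_sym, p_sym = jarque_bera(symmetric)
jb_skew, p_skew = jarque_bera(skewed)
print(round(jb_sym, 3), round(p_sym, 3))
print(round(jb_skew, 3), p_skew)
```

The symmetric residuals give p > 0.05 (consistent with normality), while the skewed residuals give a p-value far below 0.05.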

Analysis of Residuals

Breusch-Pagan
  Tests the null hypothesis that the error variances are all equal vs the alternative that the error variances are a multiplicative function of one or more variables
  Alt. hyp. states that the error variances increase or decrease as the predicted values of y increase or decrease
  P < 0.05 indicates heteroskedasticity
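The idea behind the test can be sketched with Koenker's studentized variant of Breusch-Pagan, which is a swapped-in simplification rather than the exact statistic a given package reports. With a single explanatory variable, the statistic is n times the R-squared from regressing the squared OLS residuals on x, compared against a chi-square with 1 degree of freedom. The data are synthetic.

```python
import math

def ols_fit(x, y):
    """Simple linear regression; returns intercept, slope and residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    a0 = my - b * mx
    resid = [c - (a0 + b * a) for a, c in zip(x, y)]
    return a0, b, resid

def r_squared(x, y):
    """Share of the variance of y explained by a straight line in x."""
    _, _, resid = ols_fit(x, y)
    my = sum(y) / len(y)
    return 1.0 - sum(r * r for r in resid) / sum((c - my) ** 2 for c in y)

def koenker_bp(x, y):
    """Return (LM statistic, p-value) for one explanatory variable."""
    _, _, resid = ols_fit(x, y)
    squared = [r * r for r in resid]
    lm = max(len(x) * r_squared(x, squared), 0.0)
    # For a chi-square with 1 degree of freedom, P(X > v) = erfc(sqrt(v / 2)).
    return lm, math.erfc(math.sqrt(lm / 2.0))

x = list(range(1, 31))
# Error size grows with x in the first series and stays constant in the second.
y_het = [2 + 0.5 * xi + (-1) ** xi * 0.1 * xi for xi in x]
y_hom = [2 + 0.5 * xi + (-1) ** xi * 1.0 for xi in x]
lm_het, p_het = koenker_bp(x, y_het)
lm_hom, p_hom = koenker_bp(x, y_hom)
print(round(lm_het, 2), p_het)
print(round(lm_hom, 2), p_hom)
```

The series whose error spread grows with x is flagged (p < 0.05), and the constant-spread series is not.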


Analysis of Residuals

Are errors independent?

[Figure: Map of residuals]


Myocardial Infarction Example

[Figure: myocardial infarction example]

OLS Output

[Figure: OLS regression output with three annotations]
  Suggests non-normality
  Suggests heteroskedasticity
  Suggests spatial autocorrelation

Modeling Area Data

What is the conclusion regarding this model?
  Model shows error autocorrelation

Two types of models possible based on the two primary types of spatial dependence
  Spatial Error Model
  Spatial Lag Model

Modeling Spatial Dependence

Spatial error
  Observations are interdependent through unmeasured variables that are correlated across space OR measurement error that is correlated with space
    Arises because we cannot model all the facets of a geographical region that may influence all nearby locations
    May also arise from boundaries that are not perfect measures

[Diagram relating X_i, X_j, Y_i and Y_j]

Modeling Spatial Dependence

Spatial error
  Theoretically possible to eliminate this type of spatial dependence with proper explanatory variables and correct boundaries of observations
  Space matters only in the error process, not in the substantive portion of the model
  Assumption of uncorrelated error terms is violated
    Indicative of omitted (spatially correlated) covariates

Modeling Spatial Dependence

Spatial Lag
  Dependent variable is affected by the values of the dependent variable in nearby places
    E.g. land value in a census tract (CT) is a function of land value in nearby CTs, not just related to common unmeasured variables
  Assumption of uncorrelated error terms is violated
  Assumption of independent observations is violated

[Diagram relating X_i, X_j, Y_i and Y_j]

Analysis of Residuals

LM-Lag and Robust LM-Lag
  Pertain to the Spatial Lag model as the alternative
  Robust: tests for lag dependence in the presence of a missing error term

LM-Error and Robust LM-Error
  Pertain to the Spatial Error model as the alternative
  Robust: tests for error dependence in the presence of a missing lag

Analysis of Residuals

[Figure from Anselin, 2005]

Spatial Lag Model Results

[Figure: spatial lag model output]


Regression Diagnostics

No obvious pattern in residuals
No funnel-like pattern, no increase/decrease, suggests homoskedasticity

Final Lag Model Specification

Spatial Lag Model in notation form

  $y = a + \rho W y + X \beta + \varepsilon$

  Myocardial ~ 0.86 + 0.69 W(Myocardial) + 0.31 (JarmanScore) + ε

Hypothetical Error Model Specification

Spatial Error Model in notation form

  $y = a + X \beta + \lambda W u + \varepsilon$

  Myocardial ~ 75.99 + 0.25 (JarmanScore) + 0.71 W u + ε

Summary of Steps for Modeling

Exploration
  Aspatial: examine dependent variable for normality
    Histogram
    Boxplot
    Normality statistics
  Spatial
    Compute the Moran coefficient scatterplot and Moran's I to search for evidence of the presence of spatial and/or aspatial outliers
    Can also examine on a local level

Summary of Steps for Modeling

Compute OLS results
Using OLS residuals, compute Moran's I
  If significant autocorrelation is detected in the residuals, then rerun the model with a spatial model and estimate the respective parameters
Continue with other regression diagnostics
  Non-normality (Jarque-Bera Test)
  Heteroskedasticity (Breusch-Pagan, Koenker-Bassett)
  Multicollinearity (Condition Number)
  Moran's I for spatial dependence of residuals
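The first two steps can be sketched together: fit an OLS model, then compute Moran's I on its residuals. Everything in the example is hypothetical: twelve zones in a row, one explanatory variable, and an omitted variable that differs between the two halves of the study area, which is what leaves the residuals spatially clustered. Testing significance would additionally need a permutation or normal-approximation step.

```python
def ols_residuals(x, y):
    """Residuals from a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    a0 = my - b * mx
    return [c - (a0 + b * a) for a, c in zip(x, y)]

def morans_i(values, w):
    """Global Moran's I for a list of values and a square weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n) if i != j)
    return n * cross / (sum(d * d for d in dev) * s0)

n = 12
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
# Omitted variable: +3 in the first six zones, -3 in the last six.
omitted = [3 if i < 6 else -3 for i in range(n)]
y = [1 + 2 * xi + ui for xi, ui in zip(x, omitted)]

resid = ols_residuals(x, y)
i_resid = morans_i(resid, w)
print(round(i_resid, 3))
```

A clearly positive Moran's I on the residuals is the signal that a spatial model may be warranted.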

Summary of Steps for Modeling

Fit a spatial model only if warranted
Use theory if possible to decide which model to fit; if not possible, use the diagnostics
