CrimeStat Chapter 8
Chapter 8
Kernel Density Interpolation
In this chapter, we discuss tools aimed at interpolating incidents, using the kernel density approach. Interpolation is a technique for generalizing incident locations to an entire area. Whereas the spatial distribution and hot spot statistics provide statistical summaries for the data incidents themselves, interpolation techniques generalize those data incidents to the entire region. In particular, they provide density estimates for all parts of a region (i.e., at any location). The density estimate is an intensity variable, a Z-value, that is estimated at a particular location. Consequently, it can be displayed by either surface maps or contour maps that show the intensity at all locations.

There are many interpolation techniques, such as Kriging, trend surfaces, local regression models (e.g., Loess, splines), and Dirichlet tessellations (Anselin, 1992; Cleveland, Grosse and Shyu, 1993; Venables and Ripley, 1997). Most of these require a variable that is being estimated as a function of location. However, kernel density estimation is an interpolation technique that is appropriate for individual point locations (Silverman, 1986; Härdle, 1991; Bailey and Gatrell, 1995; Burt and Barber, 1996; Bowman and Azzalini, 1997).
Kernel Density Estimation

Kernel density estimation involves placing a symmetrical surface over each point, evaluating the distance from the point to a reference location based on a mathematical function, and summing the value of all the surfaces for that reference location. This procedure is repeated for all reference locations. It is a technique that was developed in the late 1950s as an alternative method for estimating the density of a histogram (Rosenblatt, 1956; Whittle, 1958; Parzen, 1962). A histogram is a graphic representation of a frequency distribution. A continuous variable is divided into intervals of size, s (the interval or bin width), and the number of cases in each interval (bin) are counted and displayed as block diagrams. The histogram is assumed to represent a smooth, underlying distribution (a density function). However, in order to estimate a smooth density function from the histogram, traditionally researchers have linked adjacent variable intervals by connecting the midpoints of the intervals with a series of lines (Figure 8.1).
Unfortunately, doing this causes three statistical problems (Bowman and Azzalini, 1997):

1. Information is discarded because all cases within an interval are assigned to the midpoint. The wider the interval, the greater the information loss.
2.
[Figure 8.1: vertical axis: Frequency]
\[
g(x_j) = \sum_{i} \left\{ [W_i \cdot I_i] \cdot \frac{1}{h^2 \cdot 2\pi} \cdot e^{-\left[ \frac{d_{ij}^2}{2 \cdot h^2} \right]} \right\} \tag{8.1}
\]
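As a minimal sketch of equation 8.1, the following Python function evaluates the normal kernel at a set of reference locations and sums the contributions from every incident. It reads d_ij as the distance between incident i and reference location j, h as the bandwidth, and W_i and I_i as the weight and intensity of incident i. The function name, the sample coordinates, and the default of 1.0 for missing weights and intensities are assumptions made for illustration; this is not CrimeStat's own code.

```python
import math

def normal_kernel_density(ref_points, incidents, h, weights=None, intensities=None):
    """Evaluate the normal kernel sum (equation 8.1 form) at each reference location."""
    n = len(incidents)
    # Assumption: an unweighted file behaves as if every weight and intensity were 1.
    weights = weights if weights is not None else [1.0] * n
    intensities = intensities if intensities is not None else [1.0] * n
    densities = []
    for rx, ry in ref_points:
        total = 0.0
        for (px, py), w, inten in zip(incidents, weights, intensities):
            d2 = (rx - px) ** 2 + (ry - py) ** 2  # squared distance d_ij^2
            total += w * inten * math.exp(-d2 / (2.0 * h * h)) / (h * h * 2.0 * math.pi)
        densities.append(total)
    return densities

# Hypothetical incident and reference coordinates (arbitrary units).
incidents = [(1.0, 1.0), (1.5, 1.2), (4.0, 4.0)]
ref_points = [(1.0, 1.0), (4.0, 4.0), (10.0, 10.0)]
densities = normal_kernel_density(ref_points, incidents, h=1.0)
```

The reference location closest to two incidents receives the highest value, and a location far from all incidents still receives a small positive value, since the normal kernel is not cut off at a radius.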
[Figure 8.2: vertical axis: Density; horizontal axis: Relative Location]
[Figure 8.3: Kernel density estimate; vertical axis: Density; horizontal axis: Relative Location]
[Figure 8.4: Kernel density estimate; vertical axis: Density; horizontal axis: Relative Location]
[Figure 8.5: Quartic functions over individual points; vertical axis: Density; horizontal axis: Relative Location]
2.
(8.2)
\[
g(x_j) = \sum_{i} \left\{ [W_i \cdot I_i] \cdot \left[ \frac{3}{h^2 \cdot \pi} \right] \cdot \left[ 1 - \frac{d_{ij}^2}{h^2} \right]^2 \right\} \tag{8.3}
\]
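The quartic form in equation 8.3 can be sketched in the same way. The example below assumes that incidents at or beyond the radius h contribute nothing, consistent with the circumscribed radius described in this chapter; the function name and sample points are hypothetical.

```python
import math

def quartic_kernel_density(ref_point, incidents, h, weights=None, intensities=None):
    """Sum quartic kernel contributions (equation 8.3 form) at one reference location."""
    n = len(incidents)
    weights = weights if weights is not None else [1.0] * n
    intensities = intensities if intensities is not None else [1.0] * n
    rx, ry = ref_point
    total = 0.0
    for (px, py), w, inten in zip(incidents, weights, intensities):
        d2 = (rx - px) ** 2 + (ry - py) ** 2
        if d2 >= h * h:
            continue  # assumption: no contribution outside the radius h
        total += w * inten * (3.0 / (h * h * math.pi)) * (1.0 - d2 / (h * h)) ** 2
    return total

# Hypothetical incidents: two within one unit of the origin, one well outside.
incidents = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
at_origin = quartic_kernel_density((0.0, 0.0), incidents, h=1.0)
far_away = quartic_kernel_density((10.0, 10.0), incidents, h=1.0)
```

Unlike the normal kernel, the estimate at a location farther than h from every incident is exactly zero.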
2.
(8.4)
\[
g(x_j) = \sum_{i} \left[ K - \frac{K}{h} \right] \cdot d_{ij} \tag{8.5}
\]
where K is a constant. In CrimeStat, the constant K is initially set to 0.25 and then rescaled to ensure that either the densities or probabilities sum to their appropriate values (i.e., N for densities and 1.00 for probabilities).

The negative exponential (or peaked) distribution falls off very rapidly with distance up to the circumscribed radius. Its functional form is:
1.
2.
(8.6)
\[
g(x_j) = \sum_{i} A \cdot e^{-K \cdot d_{ij}} \tag{8.7}
\]
2.
(8.8)
\[
g(x_j) = \sum_{i} K \tag{8.9}
\]
[Figure 8.6]
[Figure 8.7: Interpolation Screen]
[Figure 8.8: map of Baltimore County and the City of Baltimore; labels: Coordinate, Lower-left Coordinate; scale in miles]
Single Density Estimates

The single kernel density routine in CrimeStat is applied to a distribution of point locations, such as crime incidents. It can be used with either a primary file or a secondary file; the primary file is the default. For example, the primary file can be the location of motor vehicle thefts. The points can also have a weighting or an associated intensity variable (or both). For example, the points could represent the location of police stations while the weights (or intensities) represent the number of calls for service. Again, the user must be careful in having both a weighting variable and an intensity variable as the routine will use both variables in calculating densities; this could lead to double weighting.

Having defined the file on the primary (or secondary) file tabs, the user indicates the routine by checking the Single box. Also, it is necessary to define a reference file, either an existing file or one generated by CrimeStat (see chapter 3). There are other parameters that must be defined.

File to be Interpolated

The user must indicate whether the primary file or the secondary file (if used) is to be interpolated.

Method of Interpolation

The user must indicate the method of interpolation. Five types of kernel density estimators are used:
1.
2.
3.
4.
5.
In our experience, there are advantages to each. The normal distribution produces an estimate over the entire region whereas the other four produce estimates only for the circumscribed bandwidth radius. If the distribution of points is sparse towards the outer parts of the region, then the four circumscribed functions will not produce estimates for those areas, whereas the normal will. Conversely, the normal distribution can cause some edge effects to occur (e.g., spikes at the edge of the reference grid), particularly if there are many points near one of the boundaries of the study area. The four circumscribed functions will produce less of a problem at the edges, although they still can produce some spikes. Within the four circumscribed functions, the uniform and quartic tend to smooth the data more whereas the triangular and negative exponential tend to emphasize peaks and valleys. The differences between these different kernel functions are small, however. The user should probably start with the default normal function and adjust accordingly to how the surface or contour looks.
Output Units

The user must indicate the measurement units for the density estimate in points per squared miles, squared nautical miles, squared feet, squared kilometers, or squared meters. The default is points per square mile.

Intensity or Weighting Variables

If an intensity or weighting variable is used, these boxes must be checked. Be careful about using both an intensity and a weighting variable to avoid double weighting.

Density Calculations

The user must indicate the type of output for the density estimates. There are three types of calculation that can be conducted with the kernel density routine. The calculations are applied to each reference cell:
1.
2. The kernel estimates can be calculated as relative density estimates. These divide the absolute densities by the area of the grid cell. This has the advantage of expressing the density in terms that are familiar. Thus, instead of a density estimate represented by points per grid cell, the relative density will convert this to points per, say, square mile.
3.
Since the three types of calculation are directly interrelated, the output surface will not differ in its variability. The choice would depend on whether the calculations are used to estimate absolute densities, relative densities, or probabilities. For comparisons between different types of crime or between the same type of crime and different time periods, usually absolute densities are the unit of choice (i.e., incidents per grid cell). However, to express the output as a probability, that is, the likelihood that an incident would occur at any one location, then outputting the results as probabilities would make more sense. For display purposes, however, it makes no difference as both look the same.
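The relationship between the three types of calculation can be illustrated with a short sketch. It assumes, following the text, that absolute densities are rescaled to sum to the number of incidents N, that relative densities divide the absolute values by the grid cell area, and that probabilities sum to 1.00. The function names, the three-cell grid, and the cell area are hypothetical.

```python
def rescale_to_total(raw_values, total):
    """Rescale unnormalized kernel sums so that they add up to `total`."""
    raw_sum = sum(raw_values)
    return [value * total / raw_sum for value in raw_values]

def relative_densities(absolute, cell_area):
    """Divide absolute densities (points per cell) by the area of a grid cell."""
    return [value / cell_area for value in absolute]

def probabilities(absolute):
    """Divide absolute densities by their sum so that the results add up to 1."""
    absolute_sum = sum(absolute)
    return [value / absolute_sum for value in absolute]

raw = [0.2, 0.6, 1.2]                                    # hypothetical kernel sums for three cells
absolute = rescale_to_total(raw, total=50)               # sums to N = 50 incidents
relative = relative_densities(absolute, cell_area=0.25)  # e.g., points per square mile
probs = probabilities(absolute)
```

Because each output is a constant multiple of the others, the cell-to-cell ratios are identical, which is why the displayed surface looks the same under all three choices.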
Output Files

Finally, the results can be displayed in an output table or can be output into two formats: 1) Raster grid formats for display in a surface mapping program - Surfer for Windows .dat format (Golden Software, 1994) or ArcView Spatial Analyst asc format (ESRI, 1998); or 2) Polygon grids in ArcView .shp, MapInfo .mif or Atlas*GIS .bna formats.² However, all but Surfer for Windows require that the reference grid be created by CrimeStat.
Example 1: Kernel Density Estimate of Street Robberies

An example can illustrate the use of the single kernel density routine. Figure 8.9 shows a Surfer for Windows output of the 1180 street robberies for 1996 in Baltimore County. The reference grid was generated by CrimeStat and had 100 columns and 108 rows. Thus, the routine calculated the distance between each of the 10,800 reference cells and the 1180 robbery incident locations, evaluated the kernel function for each measured distance, and summed the results for each reference cell. The normal distribution kernel function was selected for the kernel estimator and an adaptive bandwidth with a minimum sample size of 100 was chosen as the parameters.
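To give a sense of the amount of computation involved, the sketch below builds a rectangular reference grid of 100 columns by 108 rows and counts the cell-to-incident distance evaluations. The bounding coordinates are made up for illustration, and representing each cell by its center point is an assumption rather than a documented CrimeStat convention.

```python
def reference_grid(lower_left, upper_right, n_columns, n_rows):
    """Return the center coordinates of every cell in a rectangular grid."""
    x0, y0 = lower_left
    x1, y1 = upper_right
    dx = (x1 - x0) / n_columns  # cell width
    dy = (y1 - y0) / n_rows     # cell height
    return [(x0 + (col + 0.5) * dx, y0 + (row + 0.5) * dy)
            for row in range(n_rows) for col in range(n_columns)]

# Hypothetical lower-left and upper-right coordinates (longitude, latitude).
cells = reference_grid((-76.90, 39.19), (-76.32, 39.73), n_columns=100, n_rows=108)
n_incidents = 1180
n_distance_evaluations = len(cells) * n_incidents
```

With 10,800 cells and 1180 incidents, the routine evaluates the kernel function 12,744,000 times for a single surface.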
There are three views in the figure: 1) a map view showing the location of the incidents; 2) a surface view showing a three-dimensional interpolation of robbery density; and 3) a contour view showing contours of high robbery density. The surface and contour views provide different perspectives. The surface shows the peaks very clearly and the relative density of the peaks. As can be seen, the peak for robberies on the eastern part of the County is much higher than the two peaks in the central and western parts of the County. The contour view can show where these peaks are located; it is difficult to identify location clearly from a three-dimensional surface map. Highways and streets could be overlaid on top of the contour view to identify more precisely where these peaks are located.

Figure 8.10 shows an ArcView Spatial Analyst map of the robbery density with the robbery incident locations overlaid on top of the density contours. Here, we can see quite clearly that there are three strong concentrations of incidents, one spreading over a distance of several miles on the west side, one on the northern border between Baltimore City and Baltimore County, and one on the east side; there is also one smaller peak in the southeast corner of the County.

From a statistical perspective, the kernel estimate is a better hot spot identifier than the cluster analysis routines discussed in chapter 6. Cluster routines group incidents into clusters and distinguish between incidents which belong to the cluster and those which do not belong. Depending on which mathematical algorithms are used, different clustering routines will return differing allocations of incidents to clusters. The kernel estimate, on the other hand, is a continuous surface; the densities are calculated at all locations; thus, the user can visually inspect the variability in density and decide what to call a hot spot without having to define arbitrarily where to cut off the hot spot zone.
[Figure 8.9: Contour View and Surface View; axes: longitude and latitude]
[Figure 8.10: Baltimore County Street Robberies: 1996, Kernel Density Estimate; legend: Baltimore County, City of Baltimore, Baltimore Beltway, Major Road, Robbery Density (Low to High); scale in miles]
[Figure 8.11: panels Fixed/h=0.5 mi, Fixed/h=2.0 mi, Adaptive/n=50, Adaptive/n=100]
[Sidebar figures: modified from Gajewski, Viau, Sawada et al. 2001, Global Biogeochemical Cycles]
[Sidebar figures: I-5 corridor in King County, kernel density interpolation; panels 9:00 AM, 1:00 PM, 7:00 PM, 11:00 PM]

Going back to the Surfer for Windows output, figure 8.11 shows the effects of varying the bandwidth parameters. There are three fixed bandwidth intervals (0.5, 1, and 2 miles respectively) and there are two adaptive bandwidth intervals (a minimum of 25 and 100 points respectively). As can be seen, the fineness of the interpolation is affected by the bandwidth choice. For the three fixed intervals, an interval of 0.5 miles produces a finer mesh interpolation than an interval of 2 miles, which tends to oversmooth the distribution. Perhaps the intermediate interval of 1 mile gives the best balance between fineness and generality. For the two adaptive intervals, the minimum sample size of 25 gives some very specific peak locations whereas the adaptive interval with a minimum sample size of 100 gives a smoother distribution.
Which of these is the best choice depends on how much confidence the analyst has in the results. A key question is whether the peaks are real or merely byproducts of small sample sizes. The best choice would be to produce an interpolation that fits the experience of the department and officers who travel an area. Again, experimentation and discussions with beat officers will be necessary to establish which bandwidth choice should be used in future interpolations.
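The difference between a fixed and an adaptive interval can be sketched as follows. The example assumes that an adaptive bandwidth is found, for each reference location, by widening the search radius until it captures the minimum number of points, which is equivalent to taking the distance to the k-th nearest incident. The function name and coordinates are hypothetical, and whether CrimeStat applies the rule in exactly this way is not confirmed here.

```python
import math

def adaptive_bandwidth(ref_point, incidents, min_points):
    """Smallest radius around ref_point that captures min_points incidents."""
    if min_points > len(incidents):
        raise ValueError("min_points exceeds the number of incidents")
    rx, ry = ref_point
    distances = sorted(math.hypot(px - rx, py - ry) for px, py in incidents)
    return distances[min_points - 1]  # distance to the k-th nearest incident

# Hypothetical incidents at distances 1, 2, 3, 4 and 5 from the origin.
incidents = [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (0.0, 4.0), (5.0, 0.0)]
h_small_sample = adaptive_bandwidth((0.0, 0.0), incidents, min_points=2)
h_large_sample = adaptive_bandwidth((0.0, 0.0), incidents, min_points=4)
```

A larger minimum sample size forces a wider bandwidth at the same location, which is why the surface with a minimum of 100 points is smoother than the one with a minimum of 25.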
Note that in all five of the interpolations, there is some bias at the edges with the City of Baltimore (the three-sided area in the central southern part of the map). Although the primary file only included incidents for the County, the interpolation has nevertheless estimated some likelihood at the edges; these are edge biases and need to be ignored or removed with an ASCII editor.³ Further, the wider the interval chosen, the more bias is produced at the edge.
Dual Kernel Estimates

The dual kernel density routine in CrimeStat is applied to two distributions of point locations. For example, the primary file could be the location of auto thefts while the secondary file could be the centroids of census tracts, with the population of the census tract being an intensity variable. The dual routine must be used with both a primary file and a secondary file. Also, it is necessary to define a reference file, either an existing file or one generated by CrimeStat (see chapter 3). Several parameters need to be defined.

File to be Interpolated

The user must indicate the order of the interpolation. The routine uses the terms "first file" and "second file" in making the comparison (e.g., dividing the first file by the second; adding the first file to the second). The user must indicate which is the first file, the primary or the secondary. The default is that the primary file is the first file.

Method of Interpolation

The user must indicate the type of kernel estimator. As with the single kernel density routine, five types of kernel density estimators are used:
1.
2.
3.
4.
5.
In our experience, there are advantages to each. The normal distribution produces an estimate over the entire region whereas the other four produce estimates only for the circumscribed bandwidth radius. If the distribution of points is sparse towards the outer parts of the region, then the four circumscribed functions will not produce estimates for those areas, whereas the normal will. Conversely, the normal distribution can cause some edge effects to occur (e.g., spikes at the edge of the reference grid), particularly if there are many points near one of the boundaries of the study area. The four circumscribed functions will produce less of a problem at the edges, although they still can produce some spikes. Within the four circumscribed functions, the uniform and quartic tend to smooth the data more whereas the triangular and negative exponential tend to emphasize peaks and valleys. The differences between these different kernel functions are small, however. The user should probably start with the default normal function and adjust accordingly to how the surface or contour looks.
Choice of Bandwidth

The user must define the bandwidth parameter. There are three types of bandwidths for the dual kernel density routine - fixed interval, variable interval, or adaptive interval.

Fixed interval

With a fixed bandwidth, the user must specify the interval to be used and the units of measurement (squared miles, squared nautical miles, squared feet, squared kilometers, or squared meters). Depending on the type of kernel estimate used, this interval has a slightly different meaning. For the normal kernel function, the bandwidth is the standard deviation of the normal distribution. For the uniform, quartic, triangular, or negative exponential kernels, the bandwidth is the radius of the search area to be interpolated. Since there are two files being compared, the fixed interval is applied both to the first file and the second file.
Variable interval

With a variable interval, each file (the first and the second) has a different interval. For both, the units of measurement must be specified (squared miles, squared nautical miles, squared feet, squared kilometers, or squared meters). There is a good reason why a user might want variable intervals. In comparing two kernel estimates, the most common comparison is to divide one by the other. However, if the density estimate for a particular cell in the denominator approaches zero, then the ratio will blow up and become a very large number. Visually, this will be seen as spikes in the distribution, the result, usually, of too few cases. In this case, the user might decide to smooth the denominator more than the numerator in order to reduce these spikes. For example, the interval for the first file (the numerator) could be 1 mile whereas the interval for the second file (the denominator) could
1.
2.
(8.10)
where g(x_j) is the density estimate for the first file and g(y_j) is the density estimate for the second file. For a variable that has a spatially skewed distribution, such that most reference cells have very low density estimates, but a few have very high density estimates, converting the ratio into a log function will tend to mute the spikes that occur. This measure has been used in studies of risk (Kelsall and Diggle, 1995b).
3.
4.
relative density. This can be useful in calculating changes between two time periods, for example in calculating a change in relative density between two censuses or a change in the crime density between two time periods.
5.
There is the sum of the densities, that is, the density estimate for the first file plus the density estimate for the second file. Again, this is applied to each reference cell at a time. A possible use of the sum operation is to combine two different density surfaces, for example the density of robberies plus the density of assaults;
6.
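The cell-by-cell comparisons in this list (dividing, taking the log of the ratio, subtracting, and summing the two density estimates) can be sketched as below. The operation names, the use of the natural logarithm, and the choice to return NaN where the denominator is zero are assumptions made for the example; they are not CrimeStat option names or documented behavior.

```python
import math

def combine_densities(first, second, operation):
    """Combine two density surfaces cell by cell (first file vs. second file)."""
    results = []
    for a, b in zip(first, second):
        if operation == "ratio":
            results.append(a / b if b > 0 else math.nan)
        elif operation == "log_ratio":
            # Assumption: natural log; undefined cells are marked as NaN.
            results.append(math.log(a / b) if a > 0 and b > 0 else math.nan)
        elif operation == "difference":
            results.append(a - b)
        elif operation == "sum":
            results.append(a + b)
        else:
            raise ValueError("unknown operation: " + operation)
    return results

# Hypothetical densities; the second cell has a denominator close to zero.
first = [4.0, 1.0, 0.5]
second = [2.0, 0.001, 0.5]
ratio = combine_densities(first, second, "ratio")
log_ratio = combine_densities(first, second, "log_ratio")
total = combine_densities(first, second, "sum")
```

In this small example the near-zero denominator produces a ratio of about 1000 in the second cell, while the log of the ratio is only about 6.9, which illustrates how the log form mutes spikes.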
Output Files

Finally, the user must specify the file formats for the output. The results can be output in three forms. First, the results are displayed in an output table. Second, the results can be output into two raster grid formats for display in a surface mapping program: Surfer for Windows format as a .dat file (Golden Software, 1994) and ArcView Spatial Analyst format as a .asc file (ESRI, 1998). Third, the results can be output as polygon grids into ArcView .shp, MapInfo .mif and Atlas*GIS .bna format (see footnote 1). All but Surfer for Windows require that the reference grid be created by CrimeStat.
Example 2: Kernel Density Estimates of Vehicle Thefts Relative to Population

As an example of the use of the dual kernel density routine, the dual routine is applied in both the City of Baltimore and the County of Baltimore to 14,853 motor vehicle theft locations for 1996 relative to the 1990 population of census block groups. Again, a reference grid of 100 columns by 108 rows was generated by CrimeStat.

Figure 8.12 shows the resulting density estimate as a Surfer for Windows output; again, there is a map view, a surface view, and a contour view. The normal kernel function was used and an adaptive bandwidth of 100 points was selected. As seen, there is a very high concentration of auto theft incidents within the central part of the metropolitan area. The contour view suggests five or six peak areas that are close to each other.
[Figure 8.12: Contour View and Surface View; axes: longitude and latitude]
Much of this concentration, however, is produced by high population density in the metropolitan center. Figure 8.13, for example, shows the kernel estimate for 1349 census block groups for both the City of Baltimore and the County of Baltimore with the 1990 population assigned as the intensity variable. Again, the normal kernel function was used with an adaptive bandwidth of 100 points being selected. The map shows three views: 1) a surface view; 2) a contour view; and 3) a ground level view looking directly north. The distribution of population is, of course, also highly concentrated in the metropolitan center with two peaks, quite close to each other, with several smaller peaks.
When these two kernel estimates are compared using the dual kernel density routine, a more complicated picture emerges (figure 8.14). This routine has conducted three operations: 1) it calculated the distance between each of the 10,800 reference cells and the 14,853 auto theft locations, evaluated the kernel function for each measured distance, and summed the results for each reference cell; 2) it calculated the distance between each of the 10,800 reference cells and the 1349 census block groups with population as an intensity variable, evaluated the kernel function for each intensity-weighted distance, and summed the results for each reference cell; and 3) divided the kernel density estimate for auto thefts by the kernel density estimate for population for each reference cell location.
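The three operations can be sketched on a toy data set. This is a simplified illustration, not the routine itself: it uses the normal kernel form of equation 8.1 with fixed bandwidths (the example in the text used an adaptive bandwidth), smooths the denominator with a wider bandwidth than the numerator, and relies on made-up coordinates and population counts.

```python
import math

def normal_density(ref_point, points, h, intensities=None):
    """Sum normal kernel contributions (equation 8.1 form) at one reference cell."""
    if intensities is None:
        intensities = [1.0] * len(points)
    rx, ry = ref_point
    total = 0.0
    for (px, py), inten in zip(points, intensities):
        d2 = (rx - px) ** 2 + (ry - py) ** 2
        total += inten * math.exp(-d2 / (2.0 * h * h)) / (h * h * 2.0 * math.pi)
    return total

# Hypothetical data: theft locations, block-group centroids and their populations.
thefts = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)]
block_groups = [(0.0, 0.0), (5.0, 5.0)]
population = [500.0, 5000.0]
ref_cells = [(0.0, 0.0), (5.0, 5.0)]

# 1) theft density, 2) population density with population as intensity, 3) their ratio.
theft_density = [normal_density(cell, thefts, h=1.0) for cell in ref_cells]
pop_density = [normal_density(cell, block_groups, h=2.0, intensities=population)
               for cell in ref_cells]
risk = [t / p for t, p in zip(theft_density, pop_density)]
```

In this toy case the first cell has both more thefts and fewer residents nearby, so its ratio is the larger of the two; with real data the two surfaces can diverge in less obvious ways.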
While the concentration of motor vehicle thefts relative to population (motor vehicle theft risk) is still high in the metropolitan center, there are bands of high risk that spread outward, particularly along major arterials. There are now many hot spot areas which have a high distribution of motor vehicle thefts relative to the residential population. We could, of course, refine this analysis further by taking, for example, employment as a baseline variable rather than population; employment is a better indicator for the daytime population distribution whereas the residential population is a better indicator for nighttime population distribution (Levine, Kim, and Nitz, 1995a; 1995b).
Example 3: Kernel Density Estimates and Risk-adjusted Clustering of Robberies Relative to Population

The final example shows how the dual kernel interpolation compares with the risk-adjusted nearest neighbor clustering, discussed in chapter 6. Figure 8.15 shows 7 first-order risk-adjusted clusters overlaid on the dual kernel estimate of 1996 robberies relative to 1990 population.⁴ As seen, there is a correspondence between the identified risk-adjusted clusters and the dual kernel interpolation of the ratio of robberies to population. For a broad regional perspective, the interpolation produces an adequate model of where there is a high robbery risk. At the neighborhood level, however, the risk-adjusted clusters are more specific and would be preferable for use by police in identifying high-risk locations.
[Figure 8.13: Contour View and Surface View; axes: longitude and latitude]
[Figure 8.14: Surface View; axes: longitude and latitude]
[Figure 8.15: Risk-adjusted Robbery Clusters and Interpolated Robbery Risk, 1996 Robberies Relative to 1990 Population; legend: Robbery locations, 1st-order robbery risk clusters, City of Baltimore, Baltimore County, Robberies Per 1000 Population (Low to High); scale in miles]

The advantage of a dual kernel density interpolation routine is that two variables can be related together. By interpolating one variable to a reference grid and then interpolating a second variable to the same reference grid, the two variables have been interpolated to the same geographical units. The two interpolations can then be related, by dividing, subtracting, or summing. As has been mentioned throughout this manual, one of the problems with techniques that depend on the concentration of incidents is that they ignore the underlying population-at-risk. With the dual routine, however, we can start to examine the risk and not just the concentration.
Conclusion
Kernel density estimation is one of the modern spatial statistical techniques. There is currently research on the use of this technique both in statistical theory and in developing applications. For crime analysis, the technique represents a powerful way both of conducting hot spot analysis and of linking the hot spots to an underlying population-at-risk. It can be used both for police deployment, by targeting areas of high concentration of incidents, and for prevention, by targeting areas with high risk. It can also be used as a research tool for analyzing two or more distributions. More development of this approach can be expected in the next few years.
[Sidebar figure: two map panels, Frontera Density and Frontera Risk; legend: Low, Medium, High, No Data]
Both images are quite different, suggesting varying policing strategies. For
example, though there are two well-defined hot spot areas in the Province (one in
the north, the other in the south), the high levels of risk detected in the southern
areas came as a complete surprise. The northern area has a higher crime rate than
the southern area, hence a high police deployment. However, the levels of
confrontation are approximately equal between the two areas.
Endnotes to Chapter 8
1.
theory (Bailey and Gatrell, 1995). Increasing the bandwidth until a fixed number of points is counted ensures that the level of precision is constant throughout the region. As with all sampling, the standard error of the estimate is a function of the sample size; a larger sample leads to smaller error. In general, if there was independent sampling, the 95% confidence interval of a bandwidth for a normal kernel could be approximated by

\[
\text{95\% C.I.} = \text{Mean}(Z) \pm 1.96 \cdot \frac{0.5}{N(h)^{1/2}} \cdot \text{sd}(Z)
\]
where N(h) is the adaptive sample size (the number of points counted within the bandwidth for the adaptive kernel). This assumes that a point has an equal likelihood of falling within the bandwidth of one cell compared to an adjacent cell (i.e., it sits on the boundary of the bandwidth circle). The adaptive bandwidth criteria requires that the bandwidth be increased until it captures the specified number of points. On average, if there are N points in a region of area, A, and if the adaptive sample size is N(p), then the average area required to capture N(p) points is

\[
A(p) = \frac{N(p) \cdot A}{N}
\]

and the average bandwidth, Mean(h), is

\[
\text{Mean}(h) = \sqrt{\frac{A(p)}{\pi}} = \sqrt{\frac{N(p) \cdot A}{N \cdot \pi}}
\]
Each of these provides different criteria for the bandwidth size, with the adaptive being the most conservative. For example, for a standardized distribution with 1000 data points, a standardized mean of Z of 0 and a standardized standard deviation of 1, the Silverman criteria would produce a bandwidth of 0.2663; the Bowman and Azzalini criteria would produce a bandwidth of 0.2661; the Scott criteria would produce a bandwidth of 0.2874 and the Bailey and Gatrell criteria would produce a bandwidth of 0.1708. For the adaptive interval, if the required adaptive sample size is 25, then the average bandwidth would be approximately 0.3162 (this assumes that the area is a circle with a radius of 2 standardized standard deviations).
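The adaptive figure quoted in this note can be checked with a short calculation based on the two formulas for A(p) and Mean(h). The function and variable names below are chosen for illustration only.

```python
import math

def mean_adaptive_bandwidth(n_points, area, adaptive_sample_size):
    """Average bandwidth needed to capture adaptive_sample_size points."""
    captured_area = adaptive_sample_size * area / n_points  # A(p) = N(p) * A / N
    return math.sqrt(captured_area / math.pi)               # Mean(h) = sqrt(A(p) / pi)

# Region assumed to be a circle with a radius of 2 standardized standard deviations.
region_area = math.pi * 2.0 ** 2
mean_h = mean_adaptive_bandwidth(n_points=1000, area=region_area, adaptive_sample_size=25)
```

With N = 1000, N(p) = 25 and A = 4π, the captured area is 0.1π and the average bandwidth is the square root of 0.1, or about 0.3162, which agrees with the value given in the text.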
2.
4.