

International Journal of

Geo-Information

Article
Crime Prediction and Monitoring in Porto, Portugal, Using
Machine Learning, Spatial and Text Analytics
Miguel Saraiva 1, * , Irina Matijošaitienė 2 , Saloni Mishra 2 and Ana Amante 1

1 CEGOT—Centre of Studies in Geography and Spatial Planning, Faculty of Arts and Humanities,
University of Porto, Via Panorâmica s/n, 4150-564 Porto, Portugal; [email protected]
2 Data Science Institute, Saint Peter’s University, Jersey City, NJ 07306, USA;
[email protected] (I.M.); [email protected] (S.M.)
* Correspondence: [email protected]; Tel.: +351-226-077-100

Abstract: Crimes are a common societal concern impacting quality of life and economic growth.
Despite the global decrease in crime statistics, specific types of crime, and feelings of insecurity, have
often increased, leaving safety and security agencies with the need to apply novel approaches and
advanced systems to better predict and prevent occurrences. The use of geospatial technologies,
combined with data mining and machine learning techniques, allows for significant advances in the
criminology of place. In this study, official police data from Porto, Portugal, between 2016 and 2018
were georeferenced and treated using spatial analysis methods, which allowed the identification of
spatial patterns and relevant hotspots. Then, machine learning processes were applied for space-time
pattern mining. Using lasso regression analysis, significant crime variables were identified, with
random forest and decision tree supporting the selection of important variables. Lastly, tweets related to
insecurity were collected, and topic modeling and sentiment analysis were performed. Together, these
methods assist the interpretation of patterns, prediction and, ultimately, the performance of both police and
planning professionals.
Keywords: spatial analysis; machine learning; criminology of place; sentiment analysis; topic
modeling; Portugal

Citation: Saraiva, M.; Matijošaitienė, I.; Mishra, S.; Amante, A. Crime Prediction and Monitoring in Porto,
Portugal, Using Machine Learning, Spatial and Text Analytics. ISPRS Int. J. Geo-Inf. 2022, 11, 400.
https://doi.org/10.3390/ijgi11070400

Academic Editors: Jamal Jokar Arsanjani and Wolfgang Kainz
Received: 13 May 2022
Accepted: 12 July 2022
Published: 14 July 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article
distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license
(https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Crime is defined as any act that is unlawful. The existence of crime, and more importantly
the feelings of insecurity that may stem directly from it, affects quality of life and the
sustainability of societies. Relevant policy and planning agendas such as the UN’s Sustainable
Development Goals, UN Habitat’s Safer Cities Program, OECD’s well-being index [1] or the
EU’s Cohesion Reports [2] clearly stress the need to create urban spaces where inhabitants
feel safe and secure. In that sense, it has long been established that traditional crime fighting
responses are not, in themselves, enough [3]. Already since the 1970s, but particularly in
the last two decades, policing paradigms have shifted from reaction to prevention, and from
analyzing just the perpetrator and contextual social factors, to take into account urban factors
associated to space, time and the generation of opportunities.
Environmental criminology principles [4–6] are thus based on three main ideas. First
that criminal behavior is significantly influenced by the contextual nature of the environment
it occurs in, i.e., place matters [5], because it possesses individual characteristics that
potentiate or mitigate crime. Second, the distribution of crime patterns is not random,
because it is a consequence of such territorial conditions that vary in space and time. Third,
by changing the characteristics and also by channeling resources (of police, of urban design
or of social or cultural intervention) to these hot-spot locations, a significant reduction in
insecurity can be obtained.
The proliferation of computer modelling, geographical information systems and geospatial
technologies [7–9] has allowed for significant advances in crime georeferencing, mapping
and hot-spotting. Such use of spatial data and analytics to improve performance and prevention
has been dubbed as hot-spot policing [10,11], place-based policing [12] or even forensics
GIS [13], part of what Couldren et al. [14] have called the new paradigm of “smart policing”,
which also urges for greater integration and knowledge sharing between police organizations
and research institutes, such as universities. On the one hand, increasingly advanced methods such as
Space Syntax [15], as well as data mining and machine learning algorithms, are being used to
understand spatial patterns and even predict occurrences, using linear methods or Bayesian
models [16–18]. These include, but are not limited to, the random forest algorithm (RF) [19],
decision trees [20,21], K-nearest neighbors (KNN) [22,23], support vector machines (SVM) [24]
and artificial neural networks (ANN) [25]. On the other hand, authors such as Bannister et al. [26]
have recently cautioned against the increasing dependence on results derived from Big Data and
modelling algorithms, where “causation is dead, correlation is king” [26] (p. 323), because
they privilege “method over meaning by adopting a non-critical approach to the spatial and
temporal features of the data” [26] (p. 323). Furthermore, it is clear from the literature that the
use of these techniques is more prevalent in certain countries, whereas other countries are still
in the early stages of place-based policing, with a low academic and institutional culture of
crime mapping or even crime georeferencing [27].
Consequently, these advances need to be properly framed and understood in local
contexts. First, the impact that new technologies and this unprecedented capacity for
data management and spatial analysis can have on evidence-based policing needs to be
addressed. Second, it must be asked how they can go beyond computation to make a more holistic
contribution to decision support, in line with the sharing and shifting of responsibilities promoted by
new models of policing [28]. Third, as Andresen and Weisburd [12] suggest, it remains to be seen how such
theories, methods and models behave outside the locations where most of them have been
developed and tested, namely outside large metropolises and in peripheral countries.
In this paper, these questions are addressed through a case study in Porto, Portugal. At
the western edge of Europe and having recently overcome a deep financial crisis, Portugal is
considered one of the safest countries in the world, ranking fourth worldwide
in the Global Peace Index [29] and presenting one of the lowest victimization rates in
Europe [30], as well as a medium-threat status [31]. At the same time, it presents a high fear
of crime [32], which may be reflected in the fact that it has one of the highest
rates of police officers per inhabitant in Europe [33]. Furthermore, there is still a weak culture of
crime mapping, georeferencing and spatial analysis in the country [27], and very few
examples of crime modelling using space-based algorithms exist [34–36].
Using official registries of crime data from Porto’s Public Security Police for the
pre-pandemic period from January 2016 to December 2018, this paper aims to contribute
to the current literature on geospatial crime modelling by combining spatial analysis with
machine learning to create an experimental predictive model. More than the use of the
techniques themselves, the production of evidence- and space-based knowledge for urban
safety is deemed crucial at a time when often scarce local resources need to be properly
managed and integrated with planning and territorial agendas, catering for quality of life
and sustainability.

2. Machine Learning, Sentiment Analysis and Topic Modelling in Crime Hot-Spotting and Prediction
The recent popularity of Criminology of Place research, combined with the technological
advances of the 21st century, has allowed for a “nascent literature of algorithmic approaches
to time- and place-specific crime hot spot prediction” [26] (p. 323), where Big Data should
be recognized as “profound new instruments of social perception” [37] (p. 7). In the last
few years this has become even more pressing. Machine learning approaches have been
widely applied in different fields, such as urban science, transport and pedestrian flow
prediction, healthcare, biology, archeology, finance and even arts [38,39]. They have been
used to monitor illegal activities [40,41] and to model and predict crime, with authors often
comparing various methods [42–46].

For example, Lin et al. [42], working in Taiwan, proposed a data-driven method
based on the broken windows theory to predict emerging crime hotspots, improving
model performance by accumulating data with different time scales. Of all methods tested,
deep learning algorithms, random forest and naïve Bayes provided better predictions.
For Zhang et al. [43], however, results based on historical crime data, using
built environment points of interest and urban road network density as covariates to
improve performance, suggested that the deep learning long short-term memory (LSTM)
model outperformed others. In another recent study on the space-time patterns of theft
in Manhattan, where an application prototype for finding safer parking was developed,
Matijosaitiene et al. [44] found that linear models performed better. Comparing
five boroughs of New York City, Pinto et al. [45] also found that multivariate linear
regression yielded better accuracy at predicting the type of crime, whereas decision
trees were best at predicting the borough where the crime occurred.
Such findings imply that place-specific conditions, rather than universal
computation (a one-method-fits-all approach), should guide the use of these algorithms. Indeed,
authors have applied machine learning methods to extract knowledge and predict crime
data trends from underlying place-based social, urban and economic factors. Mittal et al. [46],
for example, used machine learning in an Indian context to examine the causal relationship between
crime rates (such as theft, robbery and burglary) and economic indicators, observing, in
that case, that unemployment was the greatest explanatory variable.
Recurrent as well is the integration of these models with spatial analysis using geo-
graphical information systems (GIS), as a way to clarify space-time patterns, uncover spatial
determinants and overall improve the geographical hot-spot and place-based approach of
modern-day Criminology of Place. For example, Bogomolov et al. [47] used aggregated
behavioral Big Data derived from mobile phones, in combination with basic demographic
information, to predict whether areas in London were prone to being crime hotspots, arriving
at an accuracy of 70%. The experiments of Zhou et al. [48] arrive at similar conclusions,
uncovering high efficiency and accuracy rates using a combined approach of a non-linear
algorithm, gradient boosting decision trees (GBDT), and GIS models to assess the influence
of over one thousand factors spanning demographics, housing, education, the economy,
social conditions and city planning. In this case, GBDT performed better than other methods such as
logistic regression (LR), support vector machines (SVM), artificial neural networks (ANN)
and random forest (RF).
Such area-specific crime prediction models, as Boni et al. [49] named them, should
recognize the geographical non-homogeneity of crime patterns, something which fits with
Weisburd’s Law of Crime Concentration [50]. In the case of Boni, hierarchical and multi-task
statistical learning was used to predict crimes at ZIP code level, through localized models
where sparseness was mitigated by sharing information across areas. Spatial-temporal
prediction through the encoding of area-specific crime incidents was also applied, for
example, by Zhang et al. [51] and Bappee et al. [52], showing results in compliance with
Weisburd’s Law. The first used histogram-based statistical methods, linear discriminant analysis
(LDA), and K-nearest neighbors (KNN), comparing patterns with neighborhood features
and the temporal distance to important holidays, and noting better performance as the
temporal data became more fine-grained. The second used hierarchical density-based spatial
clustering of applications with noise (HDBSCAN) to extract hotpoints from crime hotspots
for different categories of crime, then computing a spatial distance between the cluster
centroids (i.e., hotpoints of crime hotspots) as a feature for classifiers. In this case LR and
SVM displayed more accuracy than RF. Like those of spatial analysis, these area-specific and
space-based results of machine learning can, to a certain extent, be displayed, interpreted
and shared in web GIS applications, to assist in decision support of local authorities but
also citizens [53].
Another point of debate is how to include non-structured data, related to perceptions,
routines and overall sentiments of city dwellers. Moving beyond surveys, research has
increasingly looked into mobile phone data as a proxy for activity patterns [47,54] but
also extensively at social media, constructing sentiment analyses, i.e., analyses based on emotions
derived from the study of individual messages. Many of these have used Twitter data
as a source, due to its substantial use in many countries, the free availability of data and
the fact that tweets are often associated with spatial and temporal coordinates [55–59]. In
the United States, Gerber [55] showed how the use of Twitter data, through linguistic
analysis and statistical topic modeling, improved the performance of prediction models
for 19 of 25 types of crime, in comparison with a standard interpolation approach based on
kernel density estimation. In India, Thanh et al. [56] found that sentiment analysis based on
Twitter data led to results which matched real crime rate data, whereas Wang et al. [57]
showed how a model including the automatic semantic analysis of Twitter posts, combined
with dimensionality reduction and prediction via linear modeling, outperformed baseline
models. Using data from tweets posted near a criminal occurrence, Siriaraya et al. [58] also used
sentiment analysis to uncover the negative characteristics of spatial areas related to different
crimes, again emphasizing the relevance of a geographical baseline in such analysis.
Contrary to sentiment analysis, not many examples are found that have used topic
modelling on crime-related data [60,61]. This method uses statistical machine learning
techniques to identify patterns (as a verbal description) in a corpus or large amount of
unstructured text. For example, Pandey et al. [60] analyzed crime reports from Los Angeles,
evaluating topic coherence against spatial concentration, in a test of the Law of Crime
Concentration. Their findings suggest that latent Dirichlet allocation (LDA) generated
crime-related topics with higher coherence and crime concentration, whereas non-negative
matrix factorization (NMF) improved the coherence but yielded a lower spatial
concentration.
As Bannister et al. [26] suggest, studies like these all have data limitations related to
the representativeness of the social media data but also in connection with the accuracy of
the geographical and temporal crime data used [62]. More research is needed into models
that can cross detailed spatial analysis using GIS and official geo-temporal crime data, with
the advances in machine learning and data-mining techniques.

3. Data and Methods

3.1. Case-Study Context
The case-study of this research is the city of Porto, in Portugal. The second city in
the country, after the capital Lisbon, Porto is home to around 240,000 inhabitants [63].
Recent diagnoses have placed Porto as one of the cities with the highest reported levels of
criminality in Portugal, registering particularly crimes against property (e.g., auto-thefts,
pickpocketing, robbery of buildings); against people (notably physical integrity but also
domestic violence, threat or coercion); crimes against society (such as forgery or drunk
driving) and miscellaneous crimes (such as narcotics trafficking) [27] (p. 64). As a prime tourist
destination in Europe, it is also prone to rises in non-violent street crime in the summer
months [31]. The total number of registered crimes per year has been decreasing somewhat
over the last decade in Porto (from around 16 to 14 thousand), yet the city has also lost
inhabitants to peripheral suburbs, leading to a more or less steady number of 65 criminal
occurrences per thousand inhabitants [64].

3.2. Data Sources

The crime data used in this study are confidential data purposely supplied to the
research team by the Public Safety Police of Porto, as the only publicly available crime data
in Portugal are the totals by municipality. This restricted, non-georeferenced dataset
consisted of a spreadsheet, compiled by the police, containing the date, the hour, the
typology, the parish and the street name of all reported crimes occurring inside the city
limits between January 2016 and December 2018, amounting to around 42 thousand entries.
Only 4% of the entries lacked enough information to be georeferenced. The remainder, after
extensive cleaning of the database (mainly street names, which were not standardized), was
georeferenced by the research team at street segments, considering parish divisions.

Other datasets included census data, obtained from the Portuguese Institute of National
Statistics [63], drawn from the last population census or from more recent data, when
available. These consisted of over 150 indicators at city block level, related to building
data (such as building type, age and type of use); dwelling data (such as size, typology,
conditions and occupancy); population data (such as age, gender or education); family data
(types, size, number of children) and employment data. Urban and land-use data were
retrieved either from official sources of Porto’s Municipality or from OpenStreetMap when
the former were unavailable. These include land-use and points-of-interest; connectivity, road
network and traffic signal data; as well as the location of police stations and CCTV cameras.
Tweets for topic modeling and sentiment analysis were extracted using Snscrape [65].
Tweets within a radius of 1 km of any crime data point were considered, and a
list of over fifty crime-related terms in English and Portuguese, prepared from the
literature analysis, was searched.
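The tweet selection rule (a crime-related term plus a location within 1 km of a crime point) can be illustrated with a minimal sketch. It assumes tweets have already been collected as dictionaries with hypothetical `text`, `lat` and `lon` keys, and it applies the radius as a post-hoc great-circle filter; it does not reproduce how the Snscrape query itself was built, and the coordinates and terms below are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def tweets_near_crimes(tweets, crime_points, terms, radius_m=1000.0):
    """Keep tweets that mention a search term and lie within radius_m of any crime point."""
    terms = [t.lower() for t in terms]
    kept = []
    for tweet in tweets:
        text = tweet["text"].lower()
        if not any(term in text for term in terms):
            continue  # no crime-related term in the message
        if any(haversine_m(tweet["lat"], tweet["lon"], lat, lon) <= radius_m
               for lat, lon in crime_points):
            kept.append(tweet)
    return kept


# Hypothetical inputs, for illustration only.
crime_points = [(41.1496, -8.6109)]
terms = ["assalto", "roubo", "robbery"]
tweets = [
    {"text": "Assalto na baixa ontem", "lat": 41.1500, "lon": -8.6100},  # near, has term
    {"text": "Roubo no centro comercial", "lat": 41.2000, "lon": -8.7000},  # has term, too far
    {"text": "Bom dia Porto", "lat": 41.1497, "lon": -8.6108},  # near, no term
]
kept = tweets_near_crimes(tweets, crime_points, terms)
```

Only the first tweet satisfies both conditions. For tens of thousands of crime points, a spatial index would be preferable to this brute-force loop.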

3.3. Methodology
Three types of methodological procedures were used to identify the crime pattern in
the city, forecast crime rates and then predict whether or not crime occurs. These were
geospatial analysis, machine learning modeling and natural language processing (NLP).
To understand crime patterns, spatial analysis tools were applied to the dataset,
using ArcGIS 10.7.1. After all datasets were merged, preprocessed and cleaned,
crime entries were georeferenced considering street coordinates,
and then plotted with a kernel density estimation (KDE), an interpolation technique often
used in crime analysis, as it presents more precise results and is easily understood by
stakeholders [66,67]. Although there is no consensus regarding which parameters to
use [68], authors have advocated that it is a very useful methodology to describe small
local changes [69]. For that reason, and also given the smaller size of Portuguese cities,
a cell size of 50 m was tested. This is smaller than those recently used in the crime mapping
literature, for example 63 m [67], 90 m [70] or 100 m [71], but consistent with previous
research for Portugal [72]. Results were validated with officers from the Public Safety
Police of Porto. Further, emerging hot-spot analysis [73] was performed, a data-mining
technique which reveals which hot and cold spots have been maintained or changed over
space and time. A fishnet grid was used, taking into account a larger cell size.
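As an illustration of how a KDE surface is evaluated on a regular grid, the following is a minimal sketch assuming projected coordinates in metres and a quartic kernel. The kernel choice and the bandwidth are assumptions made here for illustration (the text specifies only the 50 m cell size), and the function name is hypothetical; this is not the ArcGIS implementation.

```python
import math


def quartic_kde(points, x_min, y_min, n_cols, n_rows, cell_size=50.0, bandwidth=200.0):
    """Return an n_rows x n_cols grid of density values (events per square metre).

    Each cell centre receives, from every event within `bandwidth`, the quartic
    kernel weight (3/pi) * (1 - u^2)^2, where u is distance / bandwidth.
    """
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size  # y of the cell centre
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size  # x of the cell centre
            total = 0.0
            for px, py in points:
                u2 = ((px - cx) ** 2 + (py - cy) ** 2) / bandwidth ** 2
                if u2 < 1.0:  # events beyond the bandwidth contribute nothing
                    total += (3.0 / math.pi) * (1.0 - u2) ** 2
            grid[row][col] = total / bandwidth ** 2
    return grid


# Hypothetical event coordinates (metres), for illustration only.
crimes = [(125.0, 125.0), (130.0, 120.0), (400.0, 400.0)]
surface = quartic_kde(crimes, 0.0, 0.0, n_cols=10, n_rows=10, cell_size=50.0, bandwidth=100.0)
```

A smaller cell size only refines where the surface is sampled; the smoothness of the result is governed by the bandwidth, which is why the two parameters are usually reported together.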
Considering this information, a random forest algorithm was used to predict the
values of each location of a space-time cube. A “windowing” technique is used, in which
two random forest regression models are built for each location of the cube and then
used to forecast the values of future time steps. Each model uses the actual and then the
forecasted values to forecast the values for subsequent time steps. The fit of a model is
determined by the value of the forecast root mean square error (RMSE), and the model with
the smaller RMSE is selected as the best-fit model for each location of the space-time cube.
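The windowing and RMSE-based selection described above can be sketched for a single location's time series. This is an illustration under stated assumptions, not the ArcGIS tool: the random forest regressor is replaced by a deliberately simple stand-in (a nearest-window lookup) so that the example stays self-contained, and the two candidate models compared are two hypothetical window lengths. All function names are illustrative.

```python
import math


def make_windows(series, window):
    """Turn a series into (previous `window` values, next value) training pairs."""
    n = len(series) - window
    X = [series[i:i + window] for i in range(n)]
    y = [series[i + window] for i in range(n)]
    return X, y


def fit_nearest_window(X, y):
    """Stand-in regressor (NOT a random forest): return the target of the closest training window."""
    def predict(window_values):
        best = min(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], window_values)))
        return y[best]
    return predict


def forecast(series, window, steps, predict):
    """Forecast recursively: each forecast is appended and feeds the following windows."""
    history = list(series)
    out = []
    for _ in range(steps):
        nxt = predict(history[-window:])
        out.append(nxt)
        history.append(nxt)
    return out


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def validation_rmse(series, window, holdout):
    """Withhold the last `holdout` steps, forecast them from the rest, and score with RMSE."""
    train, held_out = series[:-holdout], series[-holdout:]
    predict = fit_nearest_window(*make_windows(train, window))
    return rmse(held_out, forecast(train, window, holdout, predict))


# Hypothetical counts for one space-time cube location.
counts = [1, 2, 1, 3, 1, 2, 1, 3, 1, 2, 1, 3]
best_window = min([1, 2], key=lambda w: validation_rmse(counts, w, holdout=4))
```

In this toy series, a window of two steps reproduces the withheld values exactly while a window of one does not, so the longer window is selected. Swapping the stand-in for an actual random forest regressor would leave the windowing, the recursive forecasting and the RMSE comparison unchanged.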
After understanding the point pattern of registered crimes, various machine learning
analyses based on supervised methods were performed to determine the influence of
contextual urban, morphological and socio-economic factors. Variable selection, in order
to pick the most appropriate subset of predictors for the model, thus avoiding noise,
complexity and multicollinearity issues, was carried out using LASSO regression (least
absolute shrinkage and selection operator) [74]. Then, for the crime modeling, crime
rates were converted into a binary target (0 if no crime occurred and 1 if at least one crime
occurred), and four different classification methods were applied to predict crime classes 0 “No
Crime Will Occur” or 1 “At Least One Crime Will Occur”. First, logistic regression, where
the sigmoid function is used to map the predictions to probabilities and an L1 penalty
is added to perform variable selection (i.e., to select only the variables most important for crime
out of a large number of initial variables) by shrinking the coefficients of the
less contributive variables to zero. Second, decision trees, a non-parametric supervised
learning method where a model is built by splitting the data records until all or most of
the records are classified into their respective class labels 0 “No Crime Will Occur” or 1 “At
Least One Crime Will Occur”. Decision trees are applied with the “pruning” of leaves and
branches responsible for classification [75] to prevent the tree-based model from overfitting.
Overfitting happens when the model learns the patterns in the training data too well and
therefore demonstrates high performance on the training data but is
unable to generalize the learned patterns to new data. Third, random forest, where a
large number of individual decision trees, constructed from samples taken from the training
set, are considered, with each predicting a class and an ensemble method then selecting
the class with the most votes as the prediction of the model [76,77]. To build and train the
random forest model, a random split on the features is also performed, in addition to a
random selection of bootstrap samples. Fourth, support vector machine (SVM), which aims
to find the hyperplane that best separates the data points of the two classes, i.e., the one with the
greatest margin between the data points of both groups [78].
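For reference, the lasso and the L1-penalized logistic regression mentioned above can be written in their standard textbook form. The notation is an assumption introduced here, not taken from the study: $x_i$ is the vector of predictors for observation $i$, $y_i$ its response (the crime rate for the lasso, the binary class for the logistic model), $n$ the number of observations and $\lambda \ge 0$ the penalty strength.

```latex
% Lasso: squared-error loss plus an L1 penalty on the coefficients
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
  + \lambda \lVert \beta \rVert_1

% Logistic regression: the sigmoid maps the linear predictor to a probability
p_i = \sigma\bigl(\beta_0 + x_i^{\top}\beta\bigr)
    = \frac{1}{1 + \exp\bigl(-(\beta_0 + x_i^{\top}\beta)\bigr)}

% L1-penalized logistic regression: negative log-likelihood plus the same penalty
\hat{\beta}^{\text{logit}} = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\bigl[y_i \log p_i + (1 - y_i)\log(1 - p_i)\bigr]
  + \lambda \lVert \beta \rVert_1
```

Because the penalty $\lVert \beta \rVert_1 = \sum_j \lvert \beta_j \rvert$ is not differentiable at zero, increasing $\lambda$ drives the coefficients of weakly contributing variables exactly to zero, which is what makes both models usable for variable selection.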
Lastly, two natural language processing methods, topic modeling and sentiment
analysis, were used. The first, through latent Dirichlet allocation (LDA) [79], assigns text
in a document to a particular topic. For each document d, it processes each word w and
computes p (topic t|document d), i.e., the proportion of words in document d that are
assigned to topic t, and then p (word w|topic t), i.e., the proportion of assignments to topic
t over all documents that come from the word w. Sentiment analysis, on the other hand,
mines the text to identify and extract subjective information related, for example, to positive
or negative sentiments [80]. One approach is to use machine learning and different functions
to construct a classifier that can recognize sentimental text. Another, which does not require
training data, is lexicon-based and uses a variety of terms annotated with a polarity score.
Both approaches can be merged into a third, hybrid approach. In this research, however,
the two methods, LDA and sentiment analysis, are used separately, as valuable complements to
each other.
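The two proportions used by LDA can be made concrete with a small sketch. It assumes that every word already carries a topic assignment, represented as hypothetical `(word, topic)` pairs, and simply counts; it illustrates the two quantities only and is not a full LDA fitting procedure.

```python
from collections import Counter


def topic_given_document(doc, topic):
    """p(topic t | document d): share of the document's words assigned to topic t."""
    return sum(1 for _, t in doc if t == topic) / len(doc)


def word_given_topic(docs, word, topic):
    """p(word w | topic t): share of all assignments to topic t that come from word w."""
    counts = Counter(w for doc in docs for w, t in doc if t == topic)
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


# Hypothetical tokenised tweets with topic assignments (0 or 1), for illustration only.
docs = [
    [("assalto", 0), ("rua", 1), ("carro", 0), ("assalto", 0)],
    [("rua", 1), ("medo", 1), ("assalto", 0)],
]
p_t0_d0 = topic_given_document(docs[0], 0)       # 3 of 4 words in the first document
p_assalto_t0 = word_given_topic(docs, "assalto", 0)  # 3 of 4 assignments to topic 0
p_rua_t1 = word_given_topic(docs, "rua", 1)      # 2 of 3 assignments to topic 1
```

During fitting, the assignments themselves are revised repeatedly, so these proportions change from one pass to the next until the topics stabilise.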

4. Porto’s Crime Pattern between 2016 and 2018

4.1. Statistical Pattern
Between 2016 and 2018, official police records contain a little over 42 thousand entries,
of which around 1600 (3.8%) cannot be georeferenced at street level, due to lack of
information in the registry or because the victim was unable to know the
exact location of the crime (e.g., a wallet theft). The total number of registered crimes has been
increasing slightly, from around 13 thousand in 2016 to 14 thousand in 2017 and to around 14,500
in 2018. Consistent with national tendencies reported elsewhere [27], in Porto the most
common types of crime are crimes against heritage/property (64%; with auto theft and wallet
theft as the main subcategories) and crimes against people (18%; including offenses
against physical integrity, domestic violence or threats and coercion). These are followed by
crimes against life in society (such as drunk driving or gun trafficking) and miscellaneous crimes
(such as drug trafficking or driving without a license), with around 8% each. Other types of crimes,
against cultural identity, against pets or against the state, account for less than 2%.
During the day (Figure 1a), crime occurrences gradually increase from 8 a.m. onward,
peaking between 6 p.m. and 8 p.m., then gradually decreasing again, which indicates
that the evenings are more crime-prone than any other time of the day. During the year
(Figure 1b), the number of overall registered crimes per month is relatively steady (between
3200 and 3700), with the highest numbers occurring between May and September,
which corresponds to previous country assessments [31]. The days with the
fewest reported crimes are associated with Christmas and New Year festivities (20, 25 and
31 December and 2 January), while the largest numbers of reported crimes are associated with
other holidays: 24 June, the day of Porto’s municipal holiday (celebrated on the night of
the 23rd), or 1 November, a religious holiday.

Figure 1. Porto’s registered crime occurrence between 2016 and 2018: (a) by hour; (b) by month and
day (source: authors, based on data reports of Porto’s Public Safety Police).

4.2. Spatial and Temporal Pattern
Figure 2 shows a KDE for Porto, based on the values of street segments. The Law of
Crime Concentration is confirmed, as specific segments and areas of the city are more
prone to criminal occurrences than others. This happens particularly in the downtown
area (the greatest concentration) in and around the main pedestrian/shopping street of the
city, Santa Catarina Street, and the main square where the City Hall is located (Aliados
Avenue), both close to the city’s nighttime district. Elsewhere, noticeable concentrations
also occur on the northern edge of the city, where the largest university campus and the
city’s main hospital are located, and in other main avenues as Boavista Avenue (to the city’s
west), Campo Alegre Street (west of the city center), Constituição Avenue (north of the city
center), Costa Cabral or Fernando Magalhães Street (to the northeast).

Figure 2. Porto’s kernel density estimation of reported crimes from 2016 to 2018 by street segment
(source: authors, based on data reports of Porto’s Public Safety Police).

Emerging hot-spot analysis was performed considering a space-time bin of 3 months
(Figure 3). The downtown is confirmed as the most statistically significant hotspot of
the city, being a hotspot for ninety percent or more of the time steps, including the final
time step (intensifying and persistent hotspot, respectively). The Boavista and Campo
Alegre areas have locations of consecutive hotspots (a single uninterrupted run of statistically
significant hotspot bins in the final time steps) or sporadic hotspots (locations that are
on-again, off-again hotspots). A small persistent hotspot pattern is witnessed up north around
the Hospital/University campus. Noteworthy are the sporadic and particularly the new
hotspot area (i.e., a location that is a statistically significant hotspot for the final time step
and has never been a statistically significant hotspot before) west of the city center around
the middle of Boavista Avenue.
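
The hotspot categories named above can be illustrated with a small sketch. It assumes that each location already has one boolean per 3-month bin saying whether that bin was a statistically significant hotspot. The function name `classify_hotspot` and the simplified rules are hypothetical: they follow only the definitions quoted in the text, omit the trend test that separates persistent from intensifying hotspots, and ignore cold spots.

```python
def classify_hotspot(significant):
    """Label one location from its per-bin hotspot flags (oldest bin first).

    Simplified illustration: no trend test and no cold-spot handling.
    """
    n = len(significant)
    if n == 0 or not any(significant):
        return "no pattern"
    share = sum(significant) / n
    last = significant[-1]
    # Length of the uninterrupted run of hotspot bins ending at the final step.
    run = 0
    for flag in reversed(significant):
        if not flag:
            break
        run += 1
    earlier = significant[: n - run]
    if last and run == 1 and not any(earlier):
        return "new"  # hotspot only in the final time step
    if share >= 0.9 and last:
        return "persistent or intensifying"  # hotspot in >= 90% of bins
    if last and run > 1 and not any(earlier):
        return "consecutive"  # single final run, never a hotspot before it
    return "sporadic"  # on-again, off-again


# Twelve 3-month bins would cover the three years 2016 to 2018.
new_spot = [False] * 11 + [True]
downtown_like = [True] * 12
final_run = [False] * 9 + [True] * 3
on_off = [True, False] * 6
```

Here `classify_hotspot(new_spot)` yields `"new"`, while the alternating series is labeled `"sporadic"`; the example series are invented for illustration and are not taken from the Porto data.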

Figure 3. Emerging hotspot analysis for reported crimes (source: authors, based on data reports of
Porto’s Public Safety Police).

4.3. Forecasting
Using clustering, an unsupervised machine learning tool, it is possible to identify
natural patterns of clusters in the data. To obtain spatial clusters of crime in regard to
census data, the latter was merged with crime data using a spatial join technique (i.e., a
merged crime-census data is used as input into the clustering algorithm). Then DBSCAN
clustering analysis was performed, where epsilon = 533 m (optimum radius for cluster
analysis) was defined by the “elbow” method while plotting cluster distance against a wide
range of possible epsilon values.
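
One common way to apply an “elbow” criterion to the DBSCAN radius is to sort the distance from every point to its k-th nearest neighbor and look for the sharpest bend in that curve. The sketch below illustrates that idea with the standard library only; whether it matches the exact procedure behind epsilon = 533 m is an assumption, and the function names, the value of `k` and the toy coordinates (in meters) are hypothetical.

```python
import math


def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def elbow_index(values):
    """Index of the value farthest from the straight line joining both ends."""
    n = len(values)
    x0, y0, x1, y1 = 0, values[0], n - 1, values[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    best_i, best_d = 0, -1.0
    for i, y in enumerate(values):
        d = abs((y1 - y0) * (i - x0) - (x1 - x0) * (y - y0)) / length
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Two tight toy clusters plus two isolated points (coordinates in meters).
points = [
    (0, 0), (100, 0), (0, 100), (100, 100), (50, 50),
    (5000, 5000), (5100, 5000), (5000, 5100), (5100, 5100), (5050, 5050),
    (2500, 9000), (9000, 2500),
]
curve = k_distances(points, k=3)
eps_candidate = curve[elbow_index(curve)]
```

On this toy layout the bend sits at the last clustered point, so `eps_candidate` is the within-cluster neighbor distance rather than the much larger distance of the isolated points. The brute-force distance computation is quadratic and is only meant for small illustrations.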
Then, to forecast the crime counts, the random forest forecast tool within ArcGIS was used.
Using Breiman’s [81] extension of the random forest algorithm, the model forecasts the
values of each space-time cube location, in this case performed on a cell size of 500 m. The
forecasting of crimes was performed for the twelve months after the dataset, from January
2019 to December 2019. Figure 4 demonstrates forecasted crime counts on the unseen test
data set. The forecasted crime counts vary from 0 to 746 per square, with the highest crime
density areas being in the center of the city and then along the main axes as previously
identified. A new hotspot location has also been indicated.

Figure 4. Crime forecast in Porto based on 2016 to 2018 data (source: authors, based on data reports
of Porto’s Public Safety Police).

5. Machine Learning for Crime Prediction
To apply machine learning methods for crime prediction, all datasets were spatially
joined: crime data, census data about buildings, dwellings, population, family and
employment data, urban and land-use data with points-of-interest, connectivity, road network
and traffic signals, locations of police stations and CCTV cameras.
5.1. Feature Selection with Lasso Regression
Lasso regression was applied to Porto’s crime data to select a subset of predictors that
are the most important in terms of crime. Having fewer predictors that have a stronger
predictive power decreases the prediction error and minimizes the computational time
and resources, as well as prevents the prediction model from overfitting. Lasso regression
uses an L1 penalty that allows regression coefficients for unimportant and less important
predictors to shrink to zero. The proportions of the training and test sets used for the Lasso
regression were 67% and 33%, respectively. A positive regression coefficient indicates that as
the value of the predictor variable increases, the value of the response variable also tends
to increase, whereas a negative regression coefficient suggests that as the predictor variable
increases, the response variable tends to decrease. Variables “Population with a low level
of schooling” and “Percentage of youngsters” have positive coefficients, and therefore, with
the increase of these variables, crime rates tend to increase. In contrast, variables “Population
with a higher education (university degree)”, “Institutional families”, “Present population
(male)”, “Classic family dwellings of usual residence with 1 or 2 rooms”, “Mainly residential
buildings” and the presence of CCTV have negative coefficients, and therefore, with the
increase of these variables, crime rates tend to decrease.
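
How an L1 penalty pushes the coefficient of a weak predictor to exactly zero can be sketched with a few lines of coordinate descent. This is a generic textbook formulation, not the implementation used in the study; the function names, the penalty value and the toy data are hypothetical, and the sketch assumes centered predictors, no intercept and no all-zero columns.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; exactly zero inside [-alpha, alpha]."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / 2n) * sum((y - Xw)^2) + alpha * sum(|w|), no intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n)
    return w


# Toy data: y depends strongly on the first column and weakly on the second.
X = [[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]]
y = [2.0 * a + 0.2 * b for a, b in X]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
```

With this penalty the weak second predictor receives a coefficient of exactly zero, while the strong first predictor is kept but shrunk below its unpenalized value of 2. This is the selection behavior described above.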
5.2. Classification
Classification is a machine learning task that classifies records into classes by predicting
and assigning them labels. There are many classification methods; in this study,
several classification algorithms were applied. For classification purposes, the target, i.e.,
crime rate, is transformed into a binary variable, where 0 means “No Crime Will Occur”
and 1 means “At Least One Crime Will Occur”.

First, logistic regression with an L1 penalty was applied to identify variables that are
associated with crime as a binary target. To train and test the logistic regression model,
records in the data set were divided into 70% train and 30% test sets. A grid
search with cross-validation over a range of hyper-parameters allowed us to tune
the best alpha = 0.151 for the L1 penalty, which selected the most important variables for the
presence of reported crime. “Buildings with a wall structure in masonry with plate”,
“Buildings built before 1919”, “Present population (male)”, “Buildings built between 1946
and 1960”, “Buildings built between 2006 and 2011” and CCTV have negative coefficients
and, therefore, make crime less likely to occur, whereas “Classic family dwellings of usual
residence with 1 or 2 rooms”, “Population with a low level of schooling” and “Buildings with 5
or more floors” have positive coefficients and, therefore, make crime more likely to occur.
To build the SVM classification model, the grid search with cross-fold validation
over a range of hyper-parameters allowed us to tune the best kernel = rbf, regularization
parameter C = 1 and gamma parameter = 0.1.
Crime prediction models were also built using decision tree and random forest classifiers, again
tuning the hyper-parameters with grid search and cross-validation, as for the support
vector machine. Decision tree and random forest identified the following important variables:
“Buildings (classic)”, “Residents with the 1st cycle of basic education” and “Present population
(male)”. The model comparison in Table 1 indicates that the random forest has the best model
performance, with accuracy = 0.832, recall = 0.99, precision = 0.79 and F1 score = 0.89. Random
forest also provides a set of variables that are important for crime. The logistic regression model,
in turn, provides a detailed set of variables that are important for crime, together with the direction
of their impact (positive or negative), although it underperforms on the precision metric.

Table 1. Comparison of machine learning classification model performance.

Model                                                                     Accuracy  Recall  Precision  F1 Score
Logistic Regression (L1 penalty = 0.151)                                  0.65      0.84    0.64       0.72
Decision Tree (criterion = entropy, max depth = 3)                        0.61      0.56    0.70       0.63
Random Forest (max. features = 2, number of trees = 100, max depth = 5)   0.83      0.99    0.79       0.89
SVM (kernel = rbf, C = 1, gamma = 0.1)                                    0.80      0.87    0.82       0.91
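
The four metrics in Table 1 follow from the confusion matrix of the binary target, and the F1 score is the harmonic mean of precision and recall. The sketch below assumes the usual positive-class definitions; whether the study used these or an averaged variant is not stated. It recomputes F1 from the rounded precision and recall printed in the table. The helper names are hypothetical.

```python
def to_binary(crime_counts):
    """Map crime counts to the binary target: 0 = no crime, 1 = at least one."""
    return [1 if count >= 1 else 0 for count in crime_counts]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred):
    """Accuracy, recall, precision and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "precision": precision,
        "f1": f1_from(precision, recall),
    }


# Recall and precision as printed in Table 1 (rounded to two decimals).
TABLE_1 = {
    "Logistic Regression": {"recall": 0.84, "precision": 0.64},
    "Decision Tree": {"recall": 0.56, "precision": 0.70},
    "Random Forest": {"recall": 0.99, "precision": 0.79},
    "SVM": {"recall": 0.87, "precision": 0.82},
}
recomputed_f1 = {
    model: round(f1_from(row["precision"], row["recall"]), 3)
    for model, row in TABLE_1.items()
}
```

Under this definition the recomputed values for logistic regression, decision tree and random forest fall within roughly 0.01 of the reported F1 scores, which is compatible with rounding. For the SVM row the recomputed value is about 0.84 against the reported 0.91; this may reflect a different averaging scheme or rounding of the inputs, and the table itself is left as published.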

5.3. Natural Language Processing (NLP)

To analyze the social activity and opinion dimension in regard to crime, tweets from
Twitter were collected by using the snscrape library, a social networking service scraper in
Python. The longitude and latitude of crime data points have been used to extract tweets
within a 1 km radius around crime locations. To try and relate to the crime pattern, in a first
experimental iteration, tweets associated with words such as theft, burglary, arson, vandalism,
violence, etc., in English and Portuguese were searched. These represented only a small
amount of the total number of tweets in existence in this area, which may indicate that
users do not log in to report on crime-related subjects. In this case, around 1300 tweets
were collected, with most of them actually associated with media sources, in particular the
user “JornalNoticias” (literal translation: Newspaper of News), a Porto-based national
Portuguese newspaper.
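
The 1 km criterion can be illustrated as a great-circle distance check between tweet and crime coordinates. The sketch below is a stand-in filter applied to coordinates that are already collected, not the snscrape query used in the study. The function names, the dictionary layout and the approximate Porto coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tweets_near_crimes(tweets, crimes, radius_km=1.0):
    """Keep tweets lying within radius_km of at least one crime location."""
    return [
        t for t in tweets
        if any(haversine_km(t["lat"], t["lon"], lat, lon) <= radius_km for lat, lon in crimes)
    ]


# Illustrative coordinates only (roughly central Porto), not study data.
crimes = [(41.1496, -8.6109)]
tweets = [
    {"id": "near", "lat": 41.1500, "lon": -8.6100},
    {"id": "far", "lat": 41.1800, "lon": -8.6109},
]
kept = tweets_near_crimes(tweets, crimes)
```

The first toy tweet lies about 0.1 km from the crime point and is kept; the second lies more than 3 km to the north and is dropped.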
In Figure 5, these tweets are spatially plotted, and it can be seen that the biggest number of
tweets are in and around downtown and, particularly, further south in the nightlife district of
Ribeira, consistent with the persistent and intensifying hotspots of reported crime previously
identified, as well as the areas where the forecasting was highest. Noticeable also are the
concentrations in Campo Alegre (west of the city center) and in the Cerco social neighborhood
(east of the city center), which are not temporal hotspot locations but have significant crime densities.

Figure 5. Spatial distribution of tweets collected (source: authors, based on Twitter data).

5.3.1. Topic Modeling (LDA)
Topic modeling is a type of statistical modeling that identifies the “topics” that occur
in a collection of documents. Latent Dirichlet allocation (LDA) is the method of topic
modeling used in this research study. After cleaning the data (stemming, lemmatization
and vectorization) and tuning the hyper-parameters using grid search and cross-validation,
the LDA model was run, and a log likelihood of −56,491 and a perplexity of
134.68 were computed. Topics with different weights of tweets were computed (Figure 6),
and from these topics, concerns of dwellers may be understood. The higher the weight,
the bigger the word in the word cloud. As seen above, ordinary people may not directly
tweet about crime; newspapers seem mostly to do that in Porto. So, words such as theft,
burglary, battery and violence are not very common in the topics. On the contrary, other words
more related to the sense of insecurity, including crime, police, arrest, prison, murder,
influence, people or injury, have high weight in their respective topics. (Some non-topics
such as “thcmbzzbo” or “mgruq” appear in this figure. This can be derived from incorrect
spellings or a “personal language” used in the tweets. If a term does not make sense and is
not a known abbreviation or slang, it was removed during text preprocessing.)
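
Log likelihood and perplexity are usually two views of the same fit: under the common definition, perplexity is the exponential of the negative log likelihood per token. The sketch below assumes that definition; whether the software used in the study normalizes in exactly this way is not stated. It shows the token count that would make the two reported values consistent. The function names are hypothetical.

```python
import math


def perplexity(log_likelihood, n_tokens):
    """exp(-log likelihood per token); lower values indicate a better fit."""
    return math.exp(-log_likelihood / n_tokens)


def implied_token_count(log_likelihood, perplexity_value):
    """Token count for which the two quantities agree under this definition."""
    return -log_likelihood / math.log(perplexity_value)


# Values reported in the text above.
n_tokens = implied_token_count(-56491, 134.68)
round_trip = perplexity(-56491, n_tokens)
```

Under this assumption the two figures would correspond to a corpus on the order of 11,500 tokens after preprocessing, which is plausible for roughly 1300 short tweets. This is a consistency check only, not a reported result.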

Figure 6. Five topics resulting from the LDA modeling (source: authors, based on Twitter data).

5.3.2. Sentiment Analysis

Sentiment analysis is the mining of text which identifies and extracts subjective
information of sentiment/opinion that can be positive or negative. For this analysis, the
AFINN lexicon-based method was used. AFINN is a list of words rated for valence with
an integer between minus five (negative) and plus five (positive). Figure 7 presents the
word clouds of positive and negative sentiments found in the Twitter analysis. Tweets
including words such as love, god, win, book or awesome have high frequency in the most
positive sentiments, whereas tweets including words such as prison, sentenced and killed, as well
as profane terms, make up the most negative sentiments.
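
A lexicon-based score of this kind can be sketched by looking up each word of a tweet in a valence table and adding up the values. The small lexicon below is a hypothetical placeholder written in the AFINN style (integers between −5 and 5); its values are not copied from the AFINN list. Summing is only one possible aggregation, and the study does not state which one was used.

```python
import re

# Placeholder valences for illustration only; not the actual AFINN ratings.
TOY_LEXICON = {
    "love": 3,
    "awesome": 4,
    "win": 4,
    "prison": -2,
    "sentenced": -2,
    "killed": -3,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words count as zero."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


positive = sentiment_score("Awesome win, love it!")
negative = sentiment_score("He was sentenced and sent to prison")
neutral = sentiment_score("Walking along the avenue")
```

A tweet with several rated words can exceed the ±5 range of a single word when values are summed, so a bounded map scale would require a different aggregation, such as an average per rated word.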

Figure 7. Word clouds of the most positive and most negative sentiments based on the sentiment
analysis of tweets (source: authors, based on Twitter data).
Figure 8 demonstrates that the tweets have mostly a negative sentiment (negative
values), in line with what was discussed above. The most negative segments are actually tweeted
a little outside the main registered crime hotspots of the city center, to the southeast
(in the Fontainhas and Campo 24 de Agosto neighborhoods) and to the northwest (around
the main football stadium of the city). Negative sentiments are also seen in the middle
of Boavista Avenue, to the west, the new hotspot. On the contrary, the most positive
sentiments (2 being the maximum found in the [−5; 5] scale) are located in non-crime areas,
such as Lordelo parish, the commercial/industrial area northwest and around the Oriental
City Park, at the eastern edge of the city.

Figure 8. Spatial distribution of sentiment scores (source: authors, based on Twitter data).

6. Discussion and Conclusions

The continuous evolution in the last 20 years of the mapping and modelling capacity
of geospatial technologies has allowed for an unprecedented ability to understand the
relationships between crime and place. This has definitely underlined the relevance of
environmental criminology as a discipline and of its immediate contributions to decision
making in terms of prevention, city management and support of cohesion and quality of
life policies (in a general sense), as well as in terms of policing and micro-scale planning
(in a more specific sense). It has become widely accepted that data-driven methods [26,46]
contribute effectively to the reduction of (real and perceived) insecurity, and that, within
these, the geographical perception of patterns is paramount [82].
On one hand, crime does display concentrated and generally stable patterns over time,
confirming for Porto the postulates of Weisburd’s Law of Crime Concentration [50] and
the spatial principles of Environmental Criminology. The main concentration occurs in
the downtown area, which is divided into persistent, consecutive but also intensifying
hotspot areas, whereas other smaller concentrations have also been pinpointed, including
a new hotspot location. The forecasting of crime counts allows following this trend, by
showing these axes as those with higher potential for occurrences but also uncovering other
locations that may display a rising trend. Along with a temporal perception (peaks in the
late afternoon, and a May–September rise) this can be very relevant in the allocation of
resources and in establishing prevention programs.
Obviously, this analysis was performed using pre-pandemic data, the only kind
available at the time of writing, so crime forecasting will be performed at a later stage and
compared with actual values in order to further evaluate this model’s efficiency. However,
as explained, the model was validated by using 30% of untouched data to compare to the
basic reality, and it seems to fit with both the expected trends and the police stakeholders’
views of the territory, which have been consulted throughout this research. Additionally,
the data are biased by the reporting of crime itself (not all crimes are reported), and all
crimes of all typologies were considered in the forecasting and in the hotspotting, so
fine-graining the analysis by crime categories would also be of importance to cater to
different planning and prevention necessities. As discussed by other authors [83], the
geographical analysis of crime patterns is conditioned by the level of geography used and
how the spatial crime information has been supplied, in this case only by street segments,
which have also been shown, in some locations, to perform worse than natural streets in
the explanation of crime events [15]. Indeed, Space Syntax has often been used in crime
prediction and could, in future research, be used to further test or enhance the results here
presented. Furthermore, the visual representation of crime patterns, for example in kernel
density estimation, is also very sensitive to parameter settings, such as cell size and distance band.
However, the initial iteration performed in this paper has revealed the importance of statistical
and spatial modeling, as it is based on know-how often not possessed by institutions, but at the
same time produces results that easily connect with, are understood and can be validated by
stakeholders. It is proven that trans-disciplinary partnerships with universities and research
centers can be the cornerstone for intelligence and place-based policing.
Nonetheless, on the other hand, although crime mapping supported by a combina-
tion of geospatial and statistical analysis is essential [84,85], authors call for a smarter
aggregation of data [86], i.e., an integrated and holistic approach that includes additional,
sometimes non-structured data sources reflecting the economic, morphological, social,
perceptual or cultural context of urban areas to better optimize prevention, planning and
cohesion policies [87–89]. In this research, machine learning methods, such as decision tree
and random forest, aligned with the Lasso regression, plotted these dimensions together
and revealed variables that, spatially and statistically, appear to have greater affinity with
the increase in reported crime rates. These include the percentage of population with low
level of schooling and the percentage of youngsters. On the contrary, places that have
higher rates of population with a university degree, more CCTV and more males present in
the population appear to relate less to crime rates. Building density and concentration of
dwellings can appear as a catalyst for and against crime rates, depending on the method.
However, even though the random forest prediction model demonstrated the best performance
results (recall = 0.99 and precision = 0.79), we suggest applying results derived from
the logistic regression, as it provides a broader set of variables that are important for crime,
together with the direction of their effect on crime (positive or negative) and the size of that effect.
Overall, these results align with previous research. Higher density, walkable neighborhoods,
a higher education and being male are associated with lower fear of crime, whereas
house characteristics do not display an unequivocal relationship [90]. Street population,
particularly the female street population, is strongly and positively related to crime, as are
concentrated disadvantage at the community level [89] and the presence of high-risk juveniles [91].
These studies also call attention to variables of collective efficacy. This was not directly approached in
this research, but the topic modeling (LDA) of the Twitter data, although these data are
also restricted in terms of users, themes and size of information (and hence cannot be
deemed an overall substitute for surveys, interviews and workshops with residents),
was able to provide an expeditious way to make a first iteration of how inhabitants feel about
the city. As was to be expected, sentiments are mostly negative in discussing insecurity,
close to the areas with higher rates of reported crime (the city center and Boavista) but also
areas that are highly stigmatized and command media attention (such as the Cerco social
neighborhood). Words such as “police”, “murder”, “injury” or “killed” reveal negative
sentiments in these locations, while there is a close association of areas with low crime
rates, such as green parks, with positive sentiments and words.
Such findings clearly reveal the importance of explanatory and predictive models
in decision support and may steer the definition of place-specific policies but should
be approached with caution. The capacity for pattern analysis is insightful and should
definitely be a part of area diagnosis and monitoring. However, research should not end
there, and the dependency on Big Data also hides great “dangers”, if meaning is lost [26].
First, because correlation does not mean causality, and second, because, as discussed above,
since micro-scale locations are complex urban and social systems, important variables
related to personal and perceptual issues (for example those related to collective efficacy or
defensible space) may be lost in computation or not computed at all. Universal algorithms
and methods should be replaced by a deeper modelling and spatial understanding, and
model outcomes should be the object of critique. After the identification of hotspots, a
second stage of analysis should delve deeper into urban space, looking for the tangible and
the intangible, understanding how quantifiable variables correlate at the micro-scale but
also investigating the not immediately quantifiable, as community policing or CPTED teams
have been doing in recent decades. This way, spatial analysis and machine learning
methods can effectively be used to properly frame these interventions, and more research and
discussion in the scientific literature is required to raise awareness, increase know-how and
avoid the fallacy of the “model for the model” or of the “model without meaning”.

Author Contributions: Conceptualization, Miguel Saraiva and Irina Matijošaitienė; methodology,
Miguel Saraiva and Irina Matijošaitienė; software, Miguel Saraiva, Irina Matijošaitienė, Saloni Mishra
and Ana Amante; validation, Miguel Saraiva and Irina Matijošaitienė; investigation, Miguel Saraiva,
Irina Matijošaitienė, Saloni Mishra and Ana Amante; data curation, Miguel Saraiva, Irina
Matijošaitienė, Saloni Mishra and Ana Amante; writing (original draft preparation), Miguel Saraiva
and Saloni Mishra; writing (review and editing), Miguel Saraiva and Irina Matijošaitienė; project
administration, Miguel Saraiva and Irina Matijošaitienė; funding acquisition, Miguel Saraiva. All
authors have read and agreed to the published version of the manuscript.
Funding: This research, as part of project CANVAS—Towards Safer and Attractive Cities: Crime
and Violence Prevention through Smart Planning and Artistic Resistance, was supported by the
European Regional Development Funds, through the COMPETE 2020—Operational Programme
‘Competitiveness and Internationalization’, under Grant POCI-01-0145-FEDER-030748. In addition,
as part of the Centre of Studies on Geography and Spatial Planning (CEGOT) of the University of
Porto, this work was partially supported by National Funds through the Portuguese Foundation for
Science and Technology (FCT) under Grant UIDB/04084/2020.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Organisation for Economic Co-operation and Development. How’s Life? OECD Publishing: Paris, France, 2020.
2. My Region, My Europe, Our Future—Seventh Report on Economic, Social and Territorial Cohesion; European Commission:
Luxembourg, 2017.
3. Brantingham, P.L.; Brantingham, P.J. Situational crime prevention in practice. Can. J. Criminol. 1990, 32, 17. [CrossRef]
4. Andresen, M.A. Environmental Criminology: Evolution, Theory, and Practice; Routledge: New York, NY, USA, 2014.
5. Weisburd, D.; Eck, J.; Braga, A.; Telep, C.W.; Cave, B. Place Matters: Criminology for the Twenty-First Century; Cambridge University
Press: New York, NY, USA, 2016.
6. Wortley, R.; Townsley, M. Environmental Criminology and Crime Analysis; Routledge: New York, NY, USA, 2016.
7. Leitner, M. Crime Modeling and Mapping Using Geospatial Technologies; Springer Science & Business Media: Berlin, Germany, 2013;
Volume 8.
8. Chainey, S.; Ratcliffe, J. GIS and Crime Mapping; John Wiley & Sons: Hoboken, NJ, USA, 2013.
9. Kannan, M.; Singh, M. Geographical Information System and Crime Mapping; CRC Press: Boca Raton, FL, USA, 2020.
10. Braga, A.; Papachristos, A.; Hureau, D. Hot spots policing effects on crime. Campbell Syst. Rev. 2012, 8, 1–96. [CrossRef]
11. Weisburd, D.; Telep, C.W. Hot spots policing: What we know and what we need to know. J. Contemp. Crim. Justice 2014,
30, 200–220. [CrossRef]
12. Andresen, M.A.; Weisburd, D. Place-based policing: New directions, new challenges. Polic. Int. J. 2018, 41, 310–313. [CrossRef]
13. Elmes, G.A.; Roedl, G.; Conley, J. Forensic GIS: The Role of Geospatial Technologies for Investigating Crime and Providing Evidence;
Springer: Dordrecht, The Netherlands, 2014; Volume 11.
14. Coldren, J.R.; Huntoon, A.; Medaris, M. Introducing smart policing: Foundations, principles, and practice. Police Q. 2013,
16, 275–286. [CrossRef]
15. Attig, S. The Organic Pattern of Space: A Space Syntax Analysis of Natural Streets and Street Segments for Measuring Crime and
Traffic Accidents (Dissertation). 2019. Available online: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264938 (accessed on
1 April 2022).
16. Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260. [CrossRef]
[PubMed]
17. Zhao, X.; Tang, J. Modeling temporal-spatial correlations for crime prediction. In Proceedings of the 2017 ACM on Conference on
Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 497–506.
18. Babakura, A.; Sulaiman, M.N.; Yusuf, M.A. Improved method of classification algorithms for crime prediction. In Proceedings of
the 2014 International Symposium on Biometrics and Security Technologies (ISBAST), Kuala Lumpur, Malaysia, 26 August 2014;
IEEE: Piscataway, NJ, USA; pp. 250–255.
19. Alves, L.G.; Ribeiro, H.V.; Rodrigues, F.A. Crime prediction through urban metrics and statistical learning. Phys. A Stat. Mech. Its
Appl. 2018, 505, 435–443. [CrossRef]
20. Ivan, N.; Ahishakiye, E.; Omulo, E.O.; Taremwa, D. Crime Prediction Using Decision Tree (J48) Classification Algorithm. Int. J.
Comput. Inf. Technol. 2017, 6, 188–195.
21. Nasridinov, A.; Ihm, S.Y.; Park, Y.H. A decision tree-based classification model for crime prediction. In Information Technology
Convergence; Springer: Dordrecht, The Netherlands, 2013; pp. 531–538.
22. Tayal, D.K.; Jain, A.; Arora, S.; Agarwal, S.; Gupta, T.; Tyagi, N. Crime detection and criminal identification in India using data
mining techniques. AI Soc. 2015, 30, 117–127. [CrossRef]
23. Sivaranjani, S.; Sivakumari, S.; Aasha, M. Crime prediction and forecasting in Tamilnadu using clustering approaches. In
Proceedings of the 2016 International Conference on Emerging Technological Trends (ICETT), Kollam, India, 21–22 October 2016;
IEEE: Piscataway, NJ, USA; pp. 1–6.
24. Kianmehr, K.; Alhajj, R. Effectiveness of support vector machine for crime hot-spots prediction. Appl. Artif. Intell. 2008,
22, 433–458. [CrossRef]
25. Memon, Q.A.; Mehboob, S. Crime investigation and analysis using neural nets. In Proceedings of the 7th International Multi
Topic Conference, 2003. INMIC 2003, Islamabad, Pakistan, 8–9 December 2003; IEEE: Piscataway, NJ, USA; pp. 346–350.
26. Bannister, J.; O’Sullivan, A.; Bates, E. Place and time in the Criminology of Place. Theor. Criminol. 2019, 23, 315–332. [CrossRef]
27. Saraiva, M.; Amante, A.; Marques, T.; Ferreira, M.; Maia, C. Perfis territoriais de criminalidade em Portugal (2009–2019). Finisterra
2021, 56, 49–73. [CrossRef]
28. Freilich, J.D.; Newman, G.R. Situational Crime Prevention Oxford Research Encyclopedia of Criminology and Criminal Justice; Oxford
University Press: Oxford, UK, 2017.
29. Institute for Economics and Peace. Global Peace Index 2021: Measuring Peace in a Complex World. 2021. Available online:
https://www.visionofhumanity.org/wp-content/uploads/2021/06/GPI-2021-web-1.pdf (accessed on 1 April 2022).
30. Grangeia, H.; Cruz, O.; Teixeira, R.; Alves, P. Vulnerabilidades urbanas: O caso da criminalidade associada às ourivesarias na
cidade do Porto. Rev. Latit. 2013, 7, 69–89.
31. Country Security Report. 2020. Available online: https://www.osac.gov/Country/Portugal/Content/Detail/Report/3e50b674-78b2-4997-8950-188df6d2cadf
(accessed on 1 April 2022).
32. Tulumello, S. Segurança urbana: Tendências globais, contradições portuguesas e tempos de crise. Cid. Em Reconstrução. Leituras
Críticas 2018, 2008–2018, 73–80.
33. Eurostat. Crime and Criminal Justice Statistics. 2016. Available online:
http://ec.europa.eu/eurostat/statistics-explained/index.php/MainPage (accessed on 1 April 2022).
34. Ferreira, J.; João, P.; Martins, J. GIS for crime analysis-geography for predictive models. Electron. J. Inf. Syst. Eval. 2012, 15, 36–49.
35. João, P. Modelo Preditivo de Criminalidade: Georeferenciação ao Concelho de Lisboa. Master’s Thesis, Universidade Nova de
Lisboa, Lisboa, Portugal, 2009.
36. Rodrigues, T.M.F.; Inácio, A.A.; Araújo, D.; Painho, M.; Henriques, R.; Cabral, P.d.C.B.; Oliveira, T.H.; Neto, M.d.C.
SIM4SECURITY. In V Congresso Português de Demografia; A forecast and spatial analysis model for homeland security. Portugal
2030; Fundação Calouste Gulbenkian: Lisbon, Portugal, 2016.
37. Innes, M.; Roberts, C.; Preece, A.; Rogers, D. Ten “Rs” of social reaction: Using social media to analyse the “post-event” impacts
of the murder of Lee Rigby. Terror. Political Violence 2018, 30, 454–474. [CrossRef]
38. Hu, S.; Gao, S.; Wu, L.; Xu, Y.; Zhang, Z.; Cui, H.; Gong, X. Urban function classification at road segment level using taxi trajectory
data: A graph convolutional neural network approach. Comput. Environ. Urban Syst. 2021, 87, 101619. [CrossRef]
39. Wu, H.; Lin, A.; Xing, X.; Song, D.; Li, Y. Identifying core driving factors of urban land use change from global land cover products
and POI data using the random forest method. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 102475. [CrossRef]
40. Abouheaf, M.; Qu, S.; Gueaieb, W.; Abielmona, R.; Harb, M. Responding to illegal activities along the Canadian coastlines using
reinforcement learning. In Proceedings of the IEEE Instrumentation & Measurement Magazine, Catania, Italy, 12 April 2021;
Volume 24, pp. 118–126. [CrossRef]
41. Petrossian, G.A. Preventing illegal, unreported and unregulated (IUU) fishing: A situational approach. Biol. Conserv. 2015,
189, 39–48. [CrossRef]
42. Lin, Y.L.; Chen, T.Y.; Yu, L.C. Using machine learning to assist crime prevention. In Proceedings of the 2017 6th IIAI
International Congress on Advanced Applied Informatics (IIAI-AAI), Hamamatsu, Japan, 9–13 July 2017; IEEE: Piscataway, NJ, USA;
pp. 1029–1030.
43. Zhang, X.; Liu, L.; Xiao, L.; Ji, J. Comparison of machine learning algorithms for predicting crime hotspots. IEEE Access 2020,
8, 181302–181310. [CrossRef]
44. Matijosaitiene, I.; McDowald, A.; Juneja, V. Predicting safe parking spaces: A machine learning approach to geospatial urban and
crime data. Sustainability 2019, 11, 2848. [CrossRef]
45. Pinto, M.; Wei, H.; Konate, K.; Touray, I. Delving into factors influencing New York crime data with the tools of machine learning.
J. Comput. Sci. Coll. 2020, 36, 61–70.
46. Mittal, M.; Goyal, L.M.; Sethi, J.K.; Hemanth, D.J. Monitoring the impact of economic crisis on crime in India using machine
learning. Comput. Econ. 2019, 53, 1467–1485. [CrossRef]
47. Bogomolov, A.; Lepri, B.; Staiano, J.; Oliver, N.; Pianesi, F.; Pentland, A. Once upon a crime: Towards crime prediction from
demographics and mobile data. In Proceedings of the 16th International Conference on Multimodal Interaction, Istanbul, Türkiye,
12–16 November 2014; pp. 427–434.
48. Zhou, J.; Li, Z.; Ma, J.J.; Jiang, F. Exploration of the hidden influential factors on crime activities: A big data approach. IEEE Access
2020, 8, 141033–141045. [CrossRef]
49. Al Boni, M.; Gerber, M.S. Area-specific crime prediction models. In Proceedings of the 2016 15th IEEE International Conference
on Machine Learning and Applications (ICMLA), Anaheim, CA, USA, 18–20 December 2016; IEEE: Piscataway, NJ, USA;
pp. 671–676.
50. Weisburd, D. The law of crime concentration and the criminology of place. Criminology 2015, 53, 133–157. [CrossRef]
51. Zhang, Q.; Yuan, P.; Zhou, Q.; Yang, Z. Mixed spatial-temporal characteristics based crime hot spots prediction. In Proceedings of
the 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Nanchang, China,
4–6 May 2016; IEEE: Piscataway, NJ, USA; pp. 97–101.
52. Bappee, F.K.; Junior, A.S.; Matwin, S. Predicting crime using spatial features. In Proceedings of the Canadian Conference on
Artificial Intelligence, Toronto, Canada, 8–11 May 2018; Springer: Cham, Switzerland; pp. 367–373.
53. Chen, Y. Crime Mapping Powered by Machine Learning and Web GIS. Ph.D. Thesis, California State University, Northridge, CA,
USA, 2019.
54. He, L.; Páez, A.; Jiao, J.; An, P.; Lu, C.; Mao, W.; Long, D. Ambient population and larceny-theft: A spatial analysis using mobile
phone data. ISPRS Int. J. Geo-Inf. 2020, 9, 342. [CrossRef]
55. Gerber, M. Predicting crime using Twitter and kernel density estimation. Decis. Support Syst. 2014, 61, 115–125. [CrossRef]
56. Vo, T.; Sharma, R.; Kumar, R.; Son, L.H.; Pham, B.T.; Tien Bui, D.; Priyadarshini, I.; Sarkar, M.; Le, T. Crime rate detection using
social media of different crime locations and Twitter part-of-speech tagger with Brown clustering. J. Intell. Fuzzy Syst. 2020,
38, 4287–4299, (Preprint). [CrossRef]
57. Wang, X.; Gerber, M.S.; Brown, D.E. Automatic crime prediction using events extracted from twitter posts. In Proceedings of
the International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, College Park, MD, USA,
3–5 April 2012; Springer: Berlin, Heidelberg; pp. 231–238.
58. Siriaraya, P.; Zhang, Y.; Wang, Y.; Kawai, Y.; Mittal, M.; Jeszenszky, P.; Jatowt, A. Witnessing crime through Tweets: A crime
investigation tool based on social media. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in
Geographic Information Systems, Chicago, IL, USA, 5–8 November 2019; pp. 568–571.
59. El Hannach, H.; Benkhalifa, M. WordNet based implicit aspect sentiment analysis for crime identification from twitter. Int. J. Adv.
Comput. Sci. Appl. 2018, 9, 150–159. [CrossRef]
60. Pandey, R.; Mohler, G.O. Evaluation of crime topic models: Topic coherence vs. spatial crime concentration. In Proceedings of the
2018 IEEE International Conference on Intelligence and Security Informatics (ISI), Miami, FL, USA, 9–11 November 2018; IEEE:
Piscataway, NJ, USA; pp. 76–78.
61. Kuang, D.; Brantingham, P.J.; Bertozzi, A.L. Crime topic modeling. Crime Sci. 2017, 6, 12. [CrossRef]
62. Tompson, L.; Johnson, S.; Ashby, M.; Perkins, C.; Edwards, P. UK open source crime data: Accuracy and possibilities for research.
Cartogr. Geogr. Inf. Sci. 2015, 42, 97–111. [CrossRef]
63. Instituto Nacional de Estatística. Main Indicators. Instituto Nacional de Estatística (INE), Lisbon, Portugal. 2012. Available online:
http://www.ine.pt/xportal/xmain?xpid=INE&xpgid=inemain (accessed on 1 April 2022).
64. Saraiva, M.; Amante, A. Geografia do bem-estar: Insegurança: O caso dos crimes contra as pessoas no Grande Porto. In Geografia
do Porto; Fernandes, R., Ed.; Book Cover: Porto, Portugal, 2020; pp. 202–211. ISBN 9789898898517.
65. GitHub—JustAnotherArchivist/Snscrape: A Social Website. Available online: www.github.com/JustAnotherArchivist/snscrape
(accessed on 1 April 2022).
66. Chainey, S.; Tompson, L.; Uhlig, S. The utility of hotspot mapping for predicting spatial patterns of crime. Secur. J. 2008, 21, 4–28.
[CrossRef]
67. Kalinic, M.; Krisp, J.M. Kernel density estimation (KDE) vs. hot-spot analysis–detecting criminal hot spots in the city of San
Francisco. In Proceedings of the 21st Conference on Geo-Information Science, Lund, Sweden, 12–15 June 2018.
68. Eck, J.; Chainey, S.; Cameron, J.; Wilson, R. Mapping Crime: Understanding Hotspots; U.S. Department of Justice Office of Justice
Programs: Washington, DC, USA, 2005.
69. Jansenberger, E.M.; Staufer-Steinnocher, P. Dual kernel density estimation as a method for describing spatio-temporal changes
in the upper Austrian food retailing market. In Proceedings of the 7th AGILE Conference on Geographic Information Science,
Heraklion, Crete, Greece, 29 April – 1 May 2004.
70. Chainey, S.P. Examining the influence of cell size and bandwidth size on kernel density estimation crime hotspot maps for
predicting spatial patterns of crime. Bull. Geogr. Soc. Liege 2013, 60, 7–19.
71. Hu, Y.; Wang, F.; Guin, C.; Zhu, H. A spatio-temporal kernel density estimation framework for predictive crime hotspot mapping
and evaluation. Appl. Geogr. 2018, 99, 89–97. [CrossRef]
72. Meneses, B.M.; Reis, E.; Reis, R.; Vale, M.J. The effects of land use and land cover geoinformation raster generalization in the
analysis of LUCC in Portugal. ISPRS Int. J. Geo-Inf. 2018, 7, 390. [CrossRef]
73. Ord, J.K.; Getis, A. Local spatial autocorrelation statistics: Distribution issues and an application. Geogr. Anal. 1995, 27, 286–306.
[CrossRef]
74. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 1996, 58, 267–288. [CrossRef]
75. Du, W.; Zhan, Z. Building Decision Tree Classifier on Private Data. Electrical Engineering and Computer Science. 2002. Available
online: https://surface.syr.edu/eecs/8 (accessed on 1 April 2022).
76. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition,
Montreal, QC, Canada, 14–16 August 1995; IEEE: Piscataway, NJ, USA; Volume 1, pp. 278–282.
77. Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
78. Wang, L. (Ed.) Support Vector Machines: Theory and Applications; Springer Science & Business Media: Berlin, Germany, 2005;
Volume 177.
79. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
80. Liu, B. Sentiment analysis and subjectivity. Handb. Nat. Lang. Processing 2010, 2, 627–666.
81. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [CrossRef]
82. Lasierra, F.G. Detecting and tackling the different levels of subjective security. In The Dimensions of Insecurity in Urban Areas;
Barabás, A.T., Ed.; National Institute of Budapest: Budapest, Hungary, 2018.
83. Solymosi, R.; Bowers, K.; Fujiyama, T. Mapping fear of crime as a context-dependent everyday experience that varies in space
and time. Leg. Criminol. Psychol. 2015, 20, 193–211. [CrossRef]
84. LeBeau, J.L.; Leitner, M. Introduction: Progress in research on the geography of crime. Prof. Geogr. 2011, 63, 161–173. [CrossRef]
85. Bunting, R.J.; Chang, O.Y.; Cowen, C.; Hankins, R.; Langston, S.; Warner, A.; Yang, X.; Louderback, E.R.; Roy, S.S. Spatial patterns
of larceny and aggravated assault in Miami–Dade County, 2007–2015. Prof. Geogr. 2018, 70, 34–46. [CrossRef]
86. Hunt, P.; Kilmer, B.; Rubin, J. Development of a European Crime Report: Improving Safety and Justice with Existing Crime and Criminal
Justice Data; RAND Europe: Cambridge, UK, 2011.
87. Partnership on Security in Public Spaces (PSPS). Action Plan Urban Agenda Partnership Security in Public Spaces. 2021. Available
online: https://ec.europa.eu/futurium/en/system/files/ged/final_action_plan_security_in_public_spaces.pdf (accessed on
1 April 2022).
88. Weisburd, D.; White, C.; Wooditch, A. Does collective efficacy matter at the micro geographic level?: Findings from a study of
street segments. Br. J. Criminol. 2020, 60, 873–891. [CrossRef] [PubMed]
89. Weisburd, D.; White, C.; Wire, S.; Wilson, D.B. Enhancing informal social controls to reduce crime: Evidence from a study of
crime hot spots. Prev. Sci. 2021, 22, 509–522. [CrossRef]
90. Foster, S.; Giles-Corti, B.; Knuiman, M. Neighbourhood design and fear of crime: A social-ecological examination of the correlates
of residents’ fear in new suburban housing developments. Health Place 2010, 16, 1156–1165. [CrossRef]
91. Weisburd, D.; Groff, E.R.; Yang, S.M. Understanding and controlling hot spots of crime: The importance of formal and informal
social controls. Prev. Sci. 2014, 15, 31–43. [CrossRef]

ISPRS Int. J. Geo-Inf. 2022, 11, 400. https://doi.org/10.3390/ijgi11070400 https://www.mdpi.com/journal/ijgi

Author Contributions: Conceptualization, Miguel Saraiva and Irina Matijošaitienė; methodology,

Institutional Review Board Statement: Not applicable.
