Applications of Artificial Intelligence and Machine Learning in Geospatial Data

Chapter · April 2023


DOI: 10.4018/978-1-6684-7319-1.ch010

Chapter 10
Applications of Artificial Intelligence and Machine Learning in Geospatial Data
Nishi Srivastava
Birla Institute of Technology, Ranchi, India

Nisheeth Saxena
Birla Institute of Technology, Ranchi, India

ABSTRACT
With rapid development in new technologies, our expertise in artificial intelligence (AI) has increased
significantly. Intelligent machines are increasingly preferred, which motivates the adoption of highly
sophisticated technologies. Geographic analysis for environmental applications has advanced recently,
owing to the explosion of geospatial data, the accessibility of powerful computing resources, and
advances in AI. Geospatial analytics at high resolution is now possible because AI is reshaping the
research environment. High-resolution satellite imagery used in geospatial analysis typically
constitutes big data; thus, methods other than traditional data-processing applications are needed to
deal with these large datasets. In recent decades, AI has become such an alternative method for
handling big data. Geospatial information from high-resolution remote sensing and other environmental
sensors generates enormous volumes of data. AI makes processing more effective and makes it possible
to derive deeper understanding and information from the data.

Copyright © 2023, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

1. INTRODUCTION

Artificial intelligence (AI) is the term used to describe the intelligence exhibited by a human-made
system. The weak AI view holds that it is not possible to create intelligent machines that can
genuinely reason and solve problems. Strong AI theories foresee the creation of intelligent machines
with the ability to reason and solve problems, which would be perceived as responsive and
self-sufficient. Expert systems, machine learning, language recognition, computer vision,
recommendation systems, the medical field, and geosciences are prime areas of AI research (Zhu et al.,
2020; Jordan & Mitchell, 2015). Machine learning is a subset of AI, and deep learning is a subset of
machine learning. Machine learning algorithms study how computers can mimic or apply human learning
behavior, characterized by the capacity to improve automatically with practice and to extend what a
learning algorithm already knows with new skills. Deep learning is a crucial area of AI development
that addresses basic obstacles in the evolution of machine learning models (Mitchell, 2006). With
recent developments in AI techniques, various research fields are using AI as a research tool, and the
geoscience and big data fields are among them.
Geoscience/geospatial big data has been used increasingly in environmental studies to find and
understand spatial trends in various meteorological and ecological parameters. Geospatial analysis
and modeling for environmental applications benefit significantly from the advantages offered by AI
techniques, especially machine learning and deep learning. These include their capacity for handling
substantial amounts of geographical and temporal data from multiple sources. AI techniques are
computationally efficient and can scale to represent additional environmental factors and regional
activities. AI can also contribute real-time analytics and forecasting capabilities that can be applied
in various scenarios (e.g., predicting urban development from satellite images, or mapping land cover
and land use).
This chapter aims to give a general overview of the key ideas in the rapidly developing application of
artificial intelligence in the geosciences, especially in handling big data. First, we focus on the
basic structure and techniques of machine learning, deep learning, and data mining; on big data and
its properties; on geospatial data and its characteristics; and on the application of machine learning
in geoscience research. We then discuss the various geoscience research applications of AI/deep
learning. The chapter showcases cutting-edge AI approaches and techniques for geospatial investigation
and their applications in geoscience studies. It also provides an overview of machine learning/deep
learning applications for urbanization, smart city planning, weather and air pollution, geospatial
knowledge graphs, and disaster response, as well as a brief overview of the use of quantum computing
in geoscience data handling.

2. ARTIFICIAL INTELLIGENCE

Since the advent of the digital computer in the 1940s, it has been shown that computers can be instructed
to perform incredibly complicated tasks, such as discovering proofs for mathematical theorems or
playing Go and chess, with great proficiency. Artificial intelligence refers to the intelligence
exhibited by a system made by humans.
According to Copeland (2022):

Artificial intelligence is the ability of a computer, or a robot controlled by a computer, to do tasks that
humans usually do because they require human intelligence and discernment. Although no AIs can
perform the wide variety of tasks an ordinary human can, some AIs can match humans in specific tasks.

AI is the capacity of a digital computer or computer-controlled device, such as a robot, to carry out
activities commonly associated with intelligent beings. The phrase is widely used in reference to the
effort to create systems that exhibit human-like cognitive abilities, including the capacity for
reasoning, discovering meaning, generalization, and learning from experience.

2.1 Machine Learning

Machine learning (ML) is a subset of AI that focuses on computer-aided prediction. Mathematical
optimization supplies the field with techniques, theory, and application domains. ML trains a computer
to learn from its inputs without explicit programming. Arthur Samuel pioneered the field and coined
the term "machine learning" in 1959. ML employs algorithms to evaluate and understand data, learn from
it, and then make the best judgments based on that learning (Li et al., 2020).
According to Arthur Samuel, Machine Learning is a:

“Field of study that gives computers the ability to learn without being explicitly programmed.”

ML algorithms study how computers can mimic or apply human learning behavior, characterized by the
capacity to improve automatically with practice and to extend existing knowledge with new skills. The
main goals of ML are classification and regression, using characteristics learned from training data.
Designing software that can learn rules from data, adapt to changes, and improve with practice falls
within the purview of machine learning.
Carnegie Mellon University Professor Tom M. Mitchell (1997) has defined ML as:

A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

This means that a machine can be said to learn if it gains experience by performing a specific activity
and improves its performance on related activities in the future. The experience reaches the machine
as input data from some source. Machine learning algorithms are primarily grouped into supervised,
unsupervised, and reinforcement learning categories.

2.2 Supervised Learning

Supervised learning employs an algorithm that requires a labeled dataset. The labeled input data is
divided into two datasets: training and testing. Supervised algorithms attempt to learn patterns from
the training dataset and confirm these learned patterns using the testing dataset, which provides an
estimate of prediction accuracy.
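
The train/test workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny labeled dataset, the class names, the `split` rule, and the choice of a 1-nearest-neighbour classifier are all hypothetical and are not taken from the chapter.

```python
import math

# Hypothetical labeled dataset: (feature vector, label). The two features
# stand in for, e.g., two spectral band values; the values are made up.
labeled = [
    ((0.10, 0.20), "water"), ((0.15, 0.25), "water"),
    ((0.20, 0.10), "water"), ((0.12, 0.18), "water"),
    ((0.80, 0.90), "forest"), ((0.85, 0.80), "forest"),
    ((0.90, 0.85), "forest"), ((0.75, 0.95), "forest"),
]

def split(data, test_every=4):
    """Hold out every `test_every`-th sample for testing; train on the rest."""
    train = [d for i, d in enumerate(data) if (i + 1) % test_every != 0]
    test = [d for i, d in enumerate(data) if (i + 1) % test_every == 0]
    return train, test

def predict(train, x):
    """1-nearest-neighbour: return the label of the closest training sample."""
    nearest = min(train, key=lambda sample: math.dist(sample[0], x))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(predict(train, x) == y for x, y in test)
    return correct / len(test)

train_set, test_set = split(labeled)
test_accuracy = accuracy(train_set, test_set)
```

The accuracy is computed only on samples the classifier never saw during training, which is what makes it a usable estimate of predictive performance.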

2.3 Unsupervised Learning

Unsupervised learning deals with unlabelled datasets. The algorithm takes the input dataset and
generates clusters based on its features. It then uses the learned properties to identify the class or
cluster to which new data belongs.
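
A minimal sketch of clustering unlabeled data follows. It assumes k-means (Lloyd's algorithm) in one dimension as the clustering method, which the chapter does not name; the function names, the sample values, and the initial centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension with user-supplied initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

def assign(value, centers):
    """Return the index of the learned cluster nearest to a new value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

# Hypothetical unlabeled observations (e.g. surface temperatures in deg C).
temps = [1.0, 2.0, 3.0, 21.0, 22.0, 23.0]
centers, clusters = kmeans_1d(temps, centers=[0.0, 10.0])
```

No labels are supplied at any point; the two groups emerge from the values alone, and `assign` then places a new observation in the nearest learned cluster.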

2.4 Reinforcement Learning

Reinforcement learning is action-based learning with a reward and punishment system. Actions are chosen
through reward-based decisions, so that outcomes move toward the desired favorable situation (Sutton,
1992). The learner's decision or action affects both the present and future situations.
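
The reward-and-punishment idea can be illustrated with a small epsilon-greedy sketch on a two-action bandit. This is a deliberately simplified assumption: a bandit has no state, so it does not capture the effect of actions on future situations mentioned above. The function name, reward values, and parameters are hypothetical.

```python
import random

def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy action-value learning on a bandit with fixed rewards."""
    rng = random.Random(seed)
    n = len(rewards)
    q = [0.0] * n        # estimated value of each action
    counts = [0] * n     # how often each action was taken
    for t in range(steps):
        if t < n:
            action = t                      # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(n)       # explore
        else:
            action = max(range(n), key=lambda a: q[a])  # exploit
        reward = rewards[action]  # positive = reward, negative = punishment
        counts[action] += 1
        # Incremental average: move the estimate toward the observed reward.
        q[action] += (reward - q[action]) / counts[action]
    return q, counts

# Hypothetical environment: action 0 is punished, action 1 is rewarded.
q_values, pulls = run_bandit(rewards=[-1.0, 1.0])
```

After both actions have been tried, the greedy choice favors the rewarded action, while occasional exploration keeps the value estimate of the other action up to date.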

ML, a subset of AI, has been actively used in developing various new technologies. It makes a quick,
valuable, and ongoing contribution. It has produced ground-breaking applications in numerous sectors,
sped up numerous breakthroughs, and, hopefully, will continue to advance technology.

2.5 Deep Learning

Deep learning (DL) is a crucial area of AI development that addresses obstacles encountered in the
evolution of ML, such as the multiplicity of model types, the need for thorough training, difficulty
in estimating parameter weights, and the large number of parameters (Mitchell, 2006). Deep learning is
a subfield of machine learning that uses deep neural networks to carry out the learning process.
Figure 1 depicts the connection between artificial intelligence, machine learning, and deep learning.

Figure 1. Connection between the artificial intelligence, machine learning and deep learning

DL enables autonomous learning of input features and their representations at many layers in a
hierarchical fashion. This makes DL robust compared with typical ML models; in essence, the whole DL
architecture is employed for feature extraction and transformation. The initial layers process incoming
data simply or learn simple features; their output then passes to the later layers, which learn more
complicated features. As a result, DL is well suited to dealing with large amounts of complex data
(Zhang et al., 2017). DL approaches do not require hand-designed features; instead, they are robust
because they learn and represent the optimal features for each task automatically.
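
The layer-by-layer flow described above can be sketched as a forward pass through a tiny fully connected network. This is an illustration only: the weights are hand-picked rather than learned, and the function names and the two-layer shape are assumptions made for the example.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the input through each (weights, biases) layer in order."""
    activation = inputs
    for weights, biases in layers:
        activation = relu(dense(activation, weights, biases))
    return activation

# Hypothetical hand-picked weights; a real network would learn them.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # layer 1: simple differences
    ([[1.0, 1.0]], [0.0]),                     # layer 2: combines layer-1 outputs
]
output = forward([3.0, 1.0], layers)
```

Here the first layer extracts two simple features (the signed differences of the inputs), and the second layer combines them into a more complex one, the absolute difference.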
In recent years, a new era of neural networks, termed deep learning, has emerged. The resurgence of
neural networks began around 2005 with researchers such as Andrew Ng, Hinton, Bengio, LeCun, and
others (LeCun et al., 1998). The first computing model based on neurons had been introduced by
McCulloch and Pitts in 1943 (McCulloch et al., 1943; Rosenblatt, 1958).
DL methods are general, meaning that the same DL method may be applied to multiple datasets or in
different applications, a process known as transfer learning. This is useful when there is
insufficient data to solve the problem directly (Ackley et al., 1985; He et al., 2016). Scalability
and the capacity to generate essential data where appropriate information is not available for system
learning are significant challenges faced by DL approaches. Expanded chip processing capabilities,
such as graphics processing units (GPUs), affordable computing hardware, and recent developments in
machine learning are the main drivers of deep learning adoption today (Marko, 2012; Minsky & Papert,
2017). Deep learning has wide applications across research areas, including computer, physical,
chemical, biological, geoscience, remote sensing, and medical sciences.
ML and DL techniques are used in various research fields besides the geosciences. During the recent
pandemic, these techniques were used to find efficient solutions to various medical problems. DL has
been used to detect COVID-19 infection status from chest X-ray images using CNN-based architectures
and a transfer-learning-driven approach (Ghose et al., 2021, 2022, 2022a). ML and DL techniques are
also used in cancer detection; Ghose et al. (2022b) used a grid-search-integrated optimized support
vector machine model for breast cancer detection.

3. BIG DATA

Data quality and quantity have improved significantly as our knowledge and technical capabilities have
advanced. Big data usually describes data sets too large or complex for conventional data-processing
application software to handle. The rate at which data arrives from sources such as corporate
processes, application logs, networks, social media websites, sensors, and mobile devices is referred
to as big data velocity. Big data processing solutions must be innovative, cost-effective, and
efficient. Big data theory may be able to solve some of the problems currently facing the geoscience
and geoengineering fields (Hilbert, 2016). In the very near future, extensive, thorough,
multidirectional, and multifield geotechnical monitoring will be a reality. Specialists in geoscience
and geoengineering should therefore give big data research more attention, foster an environment in
which data can be used to advance these fields, and encourage cooperation with data analysts from
other domains (Manyika et al., 2011).

3.1 Types of Big Data

Big data can be classified into the following categories based on how it is stored, accessed, and
processed:

1. Structured
2. Unstructured
3. Semi-structured

Any data that can be stored, retrieved, and processed in a fixed format is considered structured data.
Because the format is fully known beforehand, computer science has developed mature ways to manage
this kind of data and derive value from it.
Any data whose organization or form is unknown is considered unstructured. The amount of unstructured
data is immense, and processing challenges must be addressed to derive value from it. Unstructured
data is typically found in heterogeneous data sources that mix plain text files, images, videos, and
other data types.
Semi-structured data combines features of both kinds. It can appear structured, but it is not defined
by a fixed schema such as a table in a relational database management system.
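
As a rough illustration of the three categories, the sketch below handles one small example of each using Python's standard library. The station names, field names, and text are hypothetical, and CSV, JSON, and free text are chosen here only as familiar representatives of structured, semi-structured, and unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns known in advance (hypothetical station table).
csv_text = "station,temp_c\nA1,21.5\nB2,19.0\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
mean_temp = sum(float(r["temp_c"]) for r in rows) / len(rows)

# Semi-structured: self-describing keys, but records may differ in shape.
json_text = '[{"station": "A1", "temp_c": 21.5}, {"station": "C3", "note": "offline"}]'
records = json.loads(json_text)
with_temp = [r["station"] for r in records if "temp_c" in r]

# Unstructured: free text with no schema; even a simple question needs
# ad hoc processing such as keyword search.
report = "Station B2 reported heavy fog; station A1 was clear."
mentions_fog = "fog" in report.lower()
```

The structured table can be aggregated directly by column name, the semi-structured records need a check for which keys are present, and the free text offers no fields at all.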
Depending on how big data technology is used, it can also be divided into two main classes:

1. Operational Big Data Technologies
2. Analytical Big Data Technologies

All the data we produce through routine activities, such as online purchases, social media use, or the
records of a specific company, is called operational big data. This information serves as the raw data
for subsequent big data analysis.
Analytical big data technologies are more complex than operational ones and can be seen as an advanced
form of big data technology. Analytical big data is typically used when performance indicators are
involved and important business decisions must be made based on reports derived from operational big
data. This kind of big data technology is therefore concerned with analyzing massive data that matters
for business decisions.

3.2 Characteristics of Big Data

The qualities that can be used to define big data can be listed as follows:

• Volume
• Variety
• Velocity
• Variability

Volume: As the name suggests, big data refers to something enormous. The amount of data is a crucial
consideration when assessing its value, and it also determines whether a particular data set qualifies
as big data. Volume is therefore a factor that must be considered when working with big data solutions.
Variety: Variety refers to the range of structured and unstructured data sources and types. Most
applications once considered only databases and spreadsheets as data sources. Today's analytical
software also considers data from monitoring devices, emails, pictures, videos, PDFs, audio, and other
sources. These kinds of unstructured data present mining, storage, and analysis difficulties.
Velocity: Velocity refers to the rate at which data is created. How quickly information is collected
and processed to meet demand determines the data's true potential. The speed at which data arrives
from sources such as corporate processes, application logs, networks, social media websites, sensors,
and mobile devices is referred to as big data velocity.
Variability: Variability refers to the inconsistency of the data, which can occasionally make it
challenging to handle and manage effectively.

3.3 Advantages of Big Data

As the ability and techniques to handle and manage big data sets have advanced, various benefits of
big data have become apparent. The advantages of big data sets can be listed as follows:

• Because data from search engines and social networking sites is available, businesses can now
fine-tune their business strategies.
• Big data technologies have replaced outdated customer feedback systems. These tools use big data
and natural language processing to read and evaluate client feedback.
• Big data sets also allow early detection of risks to products or services.
• Big data technologies can provide a staging area or landing zone for new data before deciding which
data should be moved to the data warehouse. By combining big data technology with data warehouses, a
business can offload infrequently accessed data.
• Further benefits of big data include better decision-making, improved customer service, and
improved operational efficiency.
• Numerous applications in geosciences now have new approaches and potential because of big data
and artificial intelligence.

4. GEOSCIENCE AND GEOSCIENCE DATA

Geoscience is the study of the Earth system and its various components, including the atmosphere,
lithosphere, cryosphere, hydrosphere, and biosphere. It covers a wide range of interactions between
living things, including people, and the Earth. Geoscientists gather data in two primary ways: by
collecting it themselves in the field and by evaluating previously available data (Thompson et al.,
2001; Zhang et al., 2015). Fieldwork entails gathering original data in support of the research
objectives.

4.1 GeoScience Data Type

Geoscience data and collections support industry programs to find and develop domestic natural
resources. The geoscience community has accumulated a vast amount of data, most of which has potential
utility and would be expensive to replace; a large portion of these collections is irreplaceable.
These data and collections are becoming more varied and substantial, which has increased the space
required for their preservation and availability. To maintain energy and mass budgets, the states of
the Earth's primary interacting components are constantly changing in space and time (Stillings, 2012;
Carbone et al., 2016). The Earth's primary components are the lithosphere, biosphere, hydrosphere, and
atmosphere (Fig. 2).
The Earth system's components interact through complex, dynamic, and interlinked geoscience processes.
Geoscience data come from two broad types of sources, observations and model simulations, which are
briefly described below (Karpatne & Liess, 2015).

Figure 2. Major interacting components of the Earth system

4.1.1 Geoscience Observations (Remote Sensing and In-Situ Sensor Data)

Different collection techniques are used to gather data on the Earth system at various spatial and
temporal scales and for various geoscience goals. For instance, a constellation of Earth observation
satellites in orbit keeps track of geoscience parameters such as surface temperature, aerosol
properties, wind, sea surface temperature, sea surface height, relative humidity, surface reflectance,
atmospheric composition, and several others. Several institutions work together to contribute to the
vast amount and diversity of remote sensing data, much of which is made available to the general
public.
Since the beginning of the 1970s, remote sensing techniques have improved, and the resulting data
provide a long-term, worldwide view of the evolution of geoscience variables at regular time intervals
(daily to monthly) and at fine spatial resolutions ranging from 1 km to a few meters (NASA & USGS,
2017). Geoscience information can also be collected using airborne sensors for targeted studies in
particular geographic regions of interest (Frankenberg et al., 2016). In-situ sensors also collect
various weather, climate, and geoscience data. They are primarily installed on weather stations,
radiosonde balloons, ships, and ocean buoys. Water bodies are another significant source of geoscience
observations. Sensor-based observations of geoscience processes are typically available at irregular
time intervals, on non-uniform spatial grids, and occasionally from moving platforms such as balloons,
ships, or buoys. Government organizations such as NASA and NOAA actively maintain these sensors, and
the data collected represent one of the most reliable sources of information about the Earth and
climate systems (NOAA, 2017; Kalnay, 1996; NSF, 2017; WMO, 2017).

4.1.2 Model Simulation Data

Geoscience processes are distinctive in that the interactions between components and the evolution of
system states are firmly grounded in laws and principles of physics that researchers have established
through long and rigorous investigation. Although these physics-based equations can occasionally be
solved in closed form for small-scale experiments, it is frequently difficult to find exact solutions
for the complex real-world systems found in the geosciences. However, numerical models known as
"physics-based models" can still be used to simulate the evolution of the states of the Earth system
from the underlying physical principles. Physics-based models produce large amounts of simulation data
on the many parts of the Earth system, and these data can be employed in data-driven analysis. The
models are created and maintained by various centers and international research groups (World Climate
Research Programme, 2017; NCAR, 2017).
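
The contrast between a closed-form solution and a numerical simulation can be shown on a toy problem. The sketch below assumes Newton's law of cooling, dT/dt = -k (T - T_env), purely as a simple stand-in for a physical law; the parameter values, function names, and the explicit Euler scheme are illustrative choices, not methods described in the chapter.

```python
import math

def cooling_exact(t, t0=30.0, t_env=10.0, k=0.5):
    """Closed-form solution of dT/dt = -k (T - T_env)."""
    return t_env + (t0 - t_env) * math.exp(-k * t)

def cooling_euler(t_end, dt=0.001, t0=30.0, t_env=10.0, k=0.5):
    """Explicit Euler time stepping of the same equation."""
    temp = t0
    for _ in range(round(t_end / dt)):
        temp += dt * (-k * (temp - t_env))
    return temp

exact = cooling_exact(2.0)
simulated = cooling_euler(2.0)
```

For this equation the two answers can be compared directly. Real Earth system models step far more complex coupled equations forward in the same spirit, but without an exact solution to check against, and every intermediate state they produce becomes simulation data.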

4.2 Properties of Geoscience Data

The general characteristics of geoscience data are as follows:

4.2.1 Spatial and Temporal Structure

Geoscience observations are typically autocorrelated in space and time when sampled at suitable
spatial and temporal resolutions. For example, when the land cover at a specific site changes, the
change typically persists for some time. Although spatio-temporal autocorrelation implies stronger
relationships between observations that are adjacent in space and time, geoscience systems can also
show long-range spatial dependency. Geoscience processes can show long-memory properties as well, for
example in the connections between worldwide floods, Indian droughts, and forest fires and climate
indices such as the El Niño-Southern Oscillation, the Atlantic Multidecadal Oscillation (AMO), and the
Indian Ocean Dipole (Ward et al., 2014; Wunsch, 2006). The intrinsic spatio-temporal structure of
geoscience data affects ML techniques in several ways.
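
Autocorrelation in time can be quantified with the sample autocorrelation at a given lag. The sketch below uses the standard estimator (lagged covariance divided by the total sum of squared deviations); the two input series are hypothetical and chosen only to contrast a persistent signal with one that has no persistence.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance

# Hypothetical series: a smooth, slowly varying signal versus an
# alternating one with no persistence.
smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r_smooth = autocorrelation(smooth)
r_alternating = autocorrelation(alternating)
```

A strongly positive value, as for the smooth series, means neighbouring samples carry overlapping information, which is one reason a random train/test split can overstate the accuracy of a model trained on such data.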

4.2.2 Spatio-Temporal Heterogeneity

Geoscience processes vary considerably in space and time, which results in rich spatial and temporal
heterogeneity in geoscience data. For instance, geoscience properties vary significantly from one
place to another because of the diverse geography, terrain, vegetation types, and climatic conditions
in different areas of the Earth. This variability makes it challenging to study the global
distribution of geoscience variables and, as a result, to train ML models that perform well across
both spatial and temporal domains.

4.2.3 High Dimensionality

Robust and detailed identification of land cover changes at any location requires the analysis of
numerous remote sensing parameters, such as the normalized difference vegetation index (NDVI), the
proportion of vegetation, and land surface temperature. Geoscience data is high dimensional because it
captures the effects of several variables at fine spatial and temporal resolutions. The dimensionality
is further increased by the fact that geoscience phenomena extend beyond the Earth's surface, across
several layers in the vertical direction (i.e., in the atmosphere or below the Earth's surface), so
the data are resolved in three spatial dimensions. It is therefore essential to scale up existing ML
approaches to handle many dimensions in order to analyze geoscience phenomena thoroughly.
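
NDVI, one of the parameters mentioned above, is computed per pixel from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The sketch below applies that standard formula; the reflectance values, the pixel labels, and the handling of an all-zero pixel are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat an all-zero pixel as NDVI 0
    return (nir - red) / total

# Hypothetical reflectance values for three pixels: (NIR, red).
pixels = {
    "dense vegetation": (0.50, 0.10),
    "bare soil": (0.30, 0.25),
    "water": (0.02, 0.05),
}
index = {name: ndvi(nir, red) for name, (nir, red) in pixels.items()}
```

Each such derived index adds one more variable per pixel, per vertical level, and per time step, which is how the dimensionality of a geoscience data set grows so quickly.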

4.2.4 Multi-Source Multi-Resolution Data

Geoscience data sets are frequently available from various sources, covering different geographical
locations and with varying spatial and temporal resolutions. The prime sources of geoscience data are
satellites, in-situ observations, and simulation results from various numerical models. For ecological
processes such as forest fires, high-resolution data may come from aerial images taken by planes
flying over the area of interest, combined with satellite images that have a coarser resolution but
are updated often. Analyzing multi-resolution geoscience data sets helps us better understand
processes at various space and time scales. Furthermore, we must develop algorithms that recognize the
different geographical patterns present at various resolutions.
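
One simple way to bring a fine-resolution grid onto a coarser one, so that the two sources can be compared, is block averaging. The sketch below is a minimal version of that idea; the function name, the 4 x 4 field, and the factor of 2 are hypothetical, and real workflows would also need to handle map projections and partially overlapping footprints.

```python
def coarsen(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse

# Hypothetical 4 x 4 fine-resolution field (e.g. from an aerial survey).
fine = [
    [1.0, 3.0, 10.0, 10.0],
    [5.0, 7.0, 10.0, 10.0],
    [0.0, 0.0, 2.0, 4.0],
    [0.0, 0.0, 6.0, 8.0],
]
coarse = coarsen(fine, factor=2)
```

The coarse grid keeps the block means but loses the variation inside each block, which is why patterns visible at one resolution may not appear at another.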

4.2.5 Data Quality

Many geoscience data sets, such as those gathered by sensors on Earth-monitoring satellites, contain a
lot of noise. Sensors may also stop working temporarily because of faults or harsh weather, leaving
data gaps. Applying a consistent analysis technique across several time periods can be difficult
because of changes in measuring equipment, such as the replacement of a malfunctioning sensor or the
switch from one generation of satellite to another. The signal of interest in many geoscience
applications may be weak compared with the noise level. Many geoscience variables are difficult to
observe directly and are instead derived from measurements or simulation results. Even data produced
from model outputs carry uncertainty, owing to uncertainties in the initial and boundary conditions of
the system and to the model's limited ability to represent the underlying processes.
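
Data gaps of the kind described above are often patched before analysis. The sketch below assumes linear interpolation between the nearest valid neighbours, which is only one of many possible gap-filling choices and is not prescribed by the chapter; the function name and the sample record are hypothetical.

```python
def fill_gaps(series):
    """Linearly interpolate interior runs of None; leave edge gaps untouched."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        # Constant increment between the two valid neighbours.
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled

# Hypothetical sensor record with a two-sample outage and a trailing gap.
record = [10.0, None, None, 16.0, 18.0, None]
repaired = fill_gaps(record)
```

The trailing gap is deliberately left as `None`, since there is no later observation to interpolate toward; filled values should also be flagged in practice so that they are not mistaken for measurements.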

5. GEOSPATIAL ARTIFICIAL INTELLIGENCE (GEOAI)

Geospatial artificial intelligence (geoAI), an evolving field of study, combines advances in spatial
science with AI techniques in machine learning, data mining, and high-performance computing to extract
useful information from geoscience big data. GeoAI is highly interdisciplinary, bridging fields such
as engineering, computer science, spatial science, and statistics. Its development is driven partly by
the need to solve real-world problems. Environmental exposure modeling is one area where researchers
have started applying geoAI technologies (Lin et al., 2017). GeoAI is integrated with environmental
epidemiology to carry out more precise modeling of environmental exposures. This leads to more
accurate estimates of the environmental factors to which we are exposed, and hence to a better
understanding of the possible links between exposures and disease in epidemiologic studies. GeoAI is
an applied branch of spatial data science that focuses primarily on using AI technologies to analyze
spatial big data. It can also be regarded as a dedicated discipline within spatial science, because
spatial technologies such as geographic information systems (GIS) must be employed to assess and
examine spatial data. Applications of geoAI that have been highlighted include semantic similarity
detection (Majic et al., 2017), deep learning algorithms for feature recognition in maps (Duan et al.,
2017), and multi-sensor remote sensing image resolution enhancement (Collins et al., 2017).
Environmental epidemiologists use direct methods (such as biomonitoring) and indirect methods (such as
exposure modeling) of exposure assessment to identify the factors to which humans may be exposed and
which may subsequently affect health. Exposure modeling involves building a model that represents a
specific environmental factor using multiple data inputs and statistical techniques (Nieuwenhuijsen,
2015). In the past two decades, spatial science has played a significant role in exposure modeling for
epidemiologic studies. It has enabled environmental epidemiologists to use geographic information
system (GIS) technology to build exposure models and link them to health outcome data through
geoscience parameters, for example to study the association between air pollution and various
cardiovascular and respiratory diseases.
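
A very simple spatial exposure model estimates the concentration at a residence from nearby monitors. The sketch below assumes inverse distance weighting (IDW), a common GIS interpolation method that the chapter does not itself name; the monitor coordinates, readings, and the `power` parameter are hypothetical.

```python
import math

def idw_estimate(monitors, location, power=2.0):
    """Inverse-distance-weighted average of monitor readings at a location."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in monitors:
        distance = math.hypot(x - location[0], y - location[1])
        if distance == 0.0:
            return value  # location coincides with a monitor
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical air quality monitors: ((x_km, y_km), concentration reading).
monitors = [((0.0, 0.0), 10.0), ((2.0, 0.0), 30.0)]
midpoint = idw_estimate(monitors, (1.0, 0.0))
near_first = idw_estimate(monitors, (0.5, 0.0))
```

Closer monitors dominate the estimate. GeoAI approaches aim to improve on such purely geometric models by learning from additional inputs such as land use, traffic, and satellite observations.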

6. APPLICATION OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN BIG DATA AND GEOSCIENCE

Geoscience is a discipline with significant societal implications that must address several pressing
issues affecting humans and ecosystems. ML has been enormously successful in commercial fields and
holds great promise for solving problems in the geosciences as the discipline enters the era of big
data. However, geoscience problems pose particular difficulties that are uncommon in traditional
applications and call for new problem formulations and machine learning approaches (Huang & Liu,
2016). With improvements in instrument capacity and techniques, geoscience data now falls into the big
data category. Numerous applications in geoscience have new approaches and potential because of big
data and artificial intelligence/machine learning (Wang et al., 2014; Twarakavi et al., 2006; Gonbadi
et al., 2015; Hamilton, 1978; Siegal, 1980). However, big data and AI/ML-based geoscience applications
are still in their infancy and lack a coherent theoretical and application framework, leaving
methodologies and aims fragmented.
In various areas of geoscience (land, ocean, and atmosphere) and beyond, AI and ML have proven
helpful for an extensive range of applications (Lary et al., 2004, 2009; Brown et al., 2008; Azamathulla,
2012; Zahabiyoun et al., 2013; Madadi et al., 2015; Braga & Logan, 2017). Even so, only a few
recent techniques have been used in geosciences and remote sensing. ML algorithms are universal
approximators: they use a collection of training data to learn the fundamental behavior
of a system. Another intriguing aspect of ML-based techniques is that they do not require prior
knowledge of the nature of the relationships within the data. ML is useful in two situations in
particular. When a deterministic model of the system exists but is computationally expensive, ML can be
used as a code accelerator. When no deterministic model exists, an empirical ML-based model can be
built from the available data. ML-based models are also applied to classification problems.
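
The code-accelerator idea can be illustrated with a minimal sketch. Everything in it is an assumption
made for illustration: `expensive_model` is a hypothetical stand-in for a costly deterministic
simulation, and the surrogate is a simple two-nearest-neighbour regressor, not a method taken from the
cited studies.

```python
import math


def expensive_model(x):
    """Hypothetical stand-in for a costly deterministic simulation."""
    return math.sin(x) + 0.1 * x * x


def fit_knn_surrogate(xs, ys, k=2):
    """Return a cheap predictor learned from (input, output) samples.

    The predictor averages the k nearest training outputs,
    weighted by inverse distance to the query point.
    """
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        weights = [1.0 / (abs(px - x) + 1e-12) for px, _ in nearest]
        weighted = sum(w * py for w, (_, py) in zip(weights, nearest))
        return weighted / sum(weights)

    return predict


# Run the "expensive" model once on a training grid covering [0, 5] ...
train_x = [i * 0.1 for i in range(51)]
train_y = [expensive_model(x) for x in train_x]
surrogate = fit_knn_surrogate(train_x, train_y)

# ... then answer new queries with the surrogate instead of the full model.
test_x = [0.05 + i * 0.37 for i in range(13)]
max_err = max(abs(surrogate(x) - expensive_model(x)) for x in test_x)
print("largest surrogate error on test points:", max_err)
```

The same pattern applies when no deterministic model exists: the training pairs then come from
observations rather than from model runs.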
The most popular ML methods for solving geophysical problems are artificial neural networks
(ANNs) and support vector machines (SVMs). Genetic programming (GP) is another important subset of ML,
but its applications in the geoscience and remote sensing domain are relatively recent and limited
to a few fields. ANNs, SVMs, and many other ML techniques perform very well, yet they are
still viewed as “black-box” models. In other words, they do not produce usable prediction equations.
GP is regarded as an effective strategy for handling this problem. GP creates computer programs
to solve problems by following the principle of Darwinian natural selection. The encoded solutions
(individuals) in GP are computer programs rather than binary strings, which makes GP a specialization
of genetic algorithms (GA) (Alavi & Gandomi, 2011). The ability to generate prediction equations without
assuming the form of the existing relationship is a noteworthy characteristic of GP and its variants
(Alavi et al., 2011a; Gandomi & Alavi, 2011).
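
How GP arrives at an explicit prediction equation can be sketched in a few lines. The following is a
deliberately simplified illustration and not the procedure of any cited study: it evolves expression
trees by selection and mutation only (practical GP systems typically also use crossover), and the
target data, operator set, and all parameter values are assumptions chosen for the example.

```python
import random

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def random_tree(rng, depth):
    """Grow a random expression tree over the variable x and constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2, 2), 1)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree at the value x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def error(tree, data):
    """Sum of squared errors of the tree on (x, y) pairs."""
    try:
        return sum((evaluate(tree, x) - y) ** 2 for x, y in data)
    except OverflowError:
        return float("inf")


def mutate(rng, tree):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left), right)
    return (op, left, mutate(rng, right))


def to_equation(tree):
    """Render a tree as a human-readable equation string."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({to_equation(left)} {op} {to_equation(right)})"
    return str(tree)


def evolve(data, generations=40, pop_size=60, seed=0):
    """Keep the fittest quarter each generation and refill by mutation."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    initial_best = min(error(t, data) for t in population)
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        survivors = population[: pop_size // 4]
        children = [mutate(rng, rng.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: error(t, data))
    return best, error(best, data), initial_best


# Toy target relationship y = x^2 + x, sampled on [-1, 1] (illustrative only).
data = [(i / 4, (i / 4) ** 2 + i / 4) for i in range(-4, 5)]
best, final_error, initial_error = evolve(data)
print("y =", to_equation(best), "| squared error:", round(final_error, 4))
```

Unlike a trained ANN or SVM, the result is an equation that can be read, checked, and reused directly,
which is the property the text highlights.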
Geoscience and geo-engineering face more severe uncertainties than other branches of civil and
mechanical engineering because of the nature of geological materials. Additionally, geotechnical
engineering has a wealth of monitoring and site investigation data to which data analytic techniques
can be applied. As a result, applying ML to geotechnical engineering challenges can be an appropriate
and successful solution. Big data and machine learning may be able to solve previously unsolved
geotechnical issues in novel ways. Several studies have used AI/ML techniques to advance and improve
the quality of their research and results (Shi & Wang, 2021; Wang et al., 2020; Zhang et al., 2020a,
2020b, 2020c; Ray et al., 2020; Pan et al., 2021; Shen et al., 2021; Zheng et al., 2020; Olierook et
al., 2020).

7. SPECIFIC APPLICATION OF ARTIFICIAL INTELLIGENCE/MACHINE LEARNING IN THE GEOSCIENCE FIELD

This section discusses a few specific applications of artificial intelligence/machine learning
in the geoscience field.

7.1 Atmospheric Aerosols and Particulate Matter Characterization

Climate change and its impact on the whole ecosystem is one of the most serious issues of our time. The
radiative forcing caused by atmospheric aerosols is one of the most significant sources of uncertainty
in climate change prediction (IPCC, 2013). Beyond their climatic impact, these particles are also an
important component of air pollution and a major cause of premature deaths worldwide (WHO, 2014; Pope
et al., 2009; Boldo et al., 2011). By merging many remote sensing and modeling products using
multivariate ML, a virtual sensor can produce an accurate boundary-layer atmospheric aerosol product
together with its associated uncertainty. The method is advantageous for creating data products that are
relevant to society. Using ML with existing remote sensing data to predict the daily global abundance of
PM2.5 makes it possible to address this need effectively (Lee et al., 2011a, b; Liu & Harrison, 2011).
A multivariate ML technique appears well suited to capturing the link between remotely sensed products
and surface PM2.5. ML has performed exceptionally well in delivering a new PM2.5 product. Accurate
PM2.5 simulations require meticulous attention to detail. When setting up a model to predict
information from satellite data, the training data set must be selected carefully. Satellite data and
ground-based observations that are coincident in space and time should be used as the training dataset.
A large training dataset is required to develop a robust ML model (Srivastava et al., 2021). It is
advisable to use the full set of training parameters that accurately describe the local environment.
These parameters are relative humidity, wind speed and direction, air temperature, boundary layer
height, surface pressure, and sunshine hours.
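
The selection of coincident satellite and ground records can be sketched as follows. This is only an
illustration under stated assumptions: the record fields, the 10 km and 1 hour matching thresholds, and
all numerical values are hypothetical and are not taken from the cited work.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def collocate(satellite, ground, max_km=10.0, max_hours=1.0):
    """Pair satellite and ground records that are close in space and time.

    Returns (features, target) tuples: satellite AOD plus local
    meteorology as features, ground-measured PM2.5 as the target.
    """
    pairs = []
    for s in satellite:
        for g in ground:
            close_in_time = abs(s["hour"] - g["hour"]) <= max_hours
            close_in_space = haversine_km(
                s["lat"], s["lon"], g["lat"], g["lon"]) <= max_km
            if close_in_time and close_in_space:
                features = [s["aod"], g["rh"], g["wind_speed"], g["temp"]]
                pairs.append((features, g["pm25"]))
    return pairs


# Made-up example records (hour of day in decimal hours, degrees lat/lon).
satellite = [
    {"hour": 10.5, "lat": 23.41, "lon": 85.44, "aod": 0.62},
    {"hour": 10.5, "lat": 25.00, "lon": 85.00, "aod": 0.30},
]
ground = [
    {"hour": 10.0, "lat": 23.40, "lon": 85.43, "pm25": 85.0,
     "rh": 55.0, "wind_speed": 2.1, "temp": 27.5},
    {"hour": 15.0, "lat": 23.40, "lon": 85.43, "pm25": 60.0,
     "rh": 40.0, "wind_speed": 3.4, "temp": 31.0},
]
training = collocate(satellite, ground)
print(len(training), "coincident training pair(s)")
```

Only the first satellite record and the first ground record fall within both thresholds, so a single
training pair results; in practice the thresholds would be tuned to the sensor footprint and overpass
time.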

7.2 Dust Detection

There are numerous types of dust sources found worldwide. Automated identification of dust sources and,
by extension, their description in global models have thus far generally failed to capture the
extremely localized nature of dust sources realistically, in contrast to what we often observe when
viewing satellite images of the entire planet.
Identifying dust sources is a crucial but difficult challenge for the realistic modeling of atmospheric
particle distributions (Ginoux et al., 2001; Prospero et al., 2002). Realistic atmospheric particle
distributions are essential for air quality and climate change studies. Creating real-time visibility
forecasts also poses significant practical difficulties, and several techniques are used to overcome
them. In this connection, ML has been used to identify the primary dust sources (Walker et al.,
2009). Self-organizing maps (SOMs) are a good fit for this classification problem. SOMs were
developed by Kohonen (1982) for unsupervised classification and data visualization. They use
self-organizing neural networks to reduce the dimensionality of the data, which helps with the problem
that humans cannot readily visualize high-dimensional data. SOMs learn the topology and distribution of
the input vectors used in their training, and in this way they can reveal similarities while reducing
dimensionality. The capacity of a SOM to express non-linear functions or mappings is a notable
improvement over principal component analysis. Researchers have used the ML approach to identify dust
plumes and have shown its effectiveness (Lary et al., 2016). Lary et al. (2016) described their
approach as novel because it could distinguish between different types of dust sources while also
operating at very high spatial resolution.
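
The way a SOM learns the distribution of its input vectors can be shown with a minimal sketch. It is
not the configuration used in the cited dust-source studies: the one-dimensional map of four nodes, the
decay schedules, and the six three-component input vectors are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, vec):
    """Index of the node whose weight vector is closest to vec."""
    dists = [sum((w[d] - vec[d]) ** 2 for d in range(len(vec)))
             for w in weights]
    return dists.index(min(dists))


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, radius0=1.5, seed=0):
    """Train a one-dimensional self-organizing map.

    For every input vector, the best matching unit and its neighbours
    on the map are pulled toward the input. The learning rate and the
    neighbourhood radius shrink as training proceeds.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        radius = max(radius0 * (1 - frac), 0.5)
        for vec in data:
            bmu = best_matching_unit(weights, vec)
            for j, w in enumerate(weights):
                influence = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * influence * (vec[d] - w[d])
    return weights


# Two made-up groups of input vectors with components scaled to [0, 1].
data = [
    [0.10, 0.10, 0.10], [0.15, 0.10, 0.05], [0.05, 0.12, 0.10],
    [0.90, 0.90, 0.90], [0.85, 0.95, 0.90], [0.92, 0.88, 0.95],
]
weights = train_som(data)
assignments = [best_matching_unit(weights, vec) for vec in data]
print("map node assigned to each input:", assignments)
```

Each input is reduced from three components to a single map index, which is the dimensionality
reduction the text describes; inputs that resemble each other tend to land on the same or neighbouring
nodes.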

7.3 Characterization of Rock Mass

The behavioral characterization of rock mass is the main focus of the bulk of GP applications. A few
other studies employ the GP technique to decode information from remote sensing. Additional research
has focused on using GP to examine geotechnical engineering issues such as the liquefaction phenomenon,
ground movement patterns, and parameters (Gandomi et al., 2011; Gandomi & Alavi, 2013). Among the first
studies in the area was that of Baykasoglu et al. (2008), who used GP-based techniques to predict
limestone strength. The models were built on experimental data and were reasonably accurate, with
coefficients of determination (R²) of 0.76 for tensile strength and 0.95 for uniaxial compressive
strength. Beiki et al. (2010) used GP to establish new relationships for estimating the deformation
modulus, with several other parameters as predictor variables. These parameters included the modulus of
elasticity, compression ratio, rock mass quality designation, porosity, rock density, and geological
strength index. This study showed that the machine learning-based model performed better than existing
empirical models. Karakus (2011) used GP to analyze the strength and elasticity modulus of granitic
rocks. The outcomes made it quite evident that GP is a viable tool for estimating the strength and
elasticity modulus of granitic rocks.
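
For reference, the coefficient of determination quoted above is commonly computed as one minus the
ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard
definition; the observed and predicted values are made up for illustration and are not data from the
cited studies, which may have used a different variant of R².

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up strength values (arbitrary units) and model predictions.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2 = r_squared(observed, predicted)
print("R^2 =", round(r2, 3))
```

A value close to 1 means the model explains most of the variance in the measurements, while a value
near 0 means it does little better than predicting the mean.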

7.4 Contribution of Artificial Intelligence to Smart Cities

Urbanization has gained prominence over the past few decades as a fundamental issue that is inextricably
tied to society’s growth. Poor urban planning can result in unsustainable urbanization and impede
the advancement of society and of the technology that powers the services offered to inhabitants.
Cutting-edge approaches to artificial intelligence in smart cities deliver new, powerful tools for
addressing the issues associated with urban growth. In this context, it is crucial to create technical
solutions that help cities use the resources at their disposal efficiently and that enhance information
management and sharing among the many actors involved. In particular, geospatial technologies have the
potential to improve services, rationalize costs, increase the sustainability of urban development,
lower emissions and water use, and shorten travel times. Further potential applications include systems
for tracking air pollution, adjusting energy costs according to demand, and managing the satellite data
used to monitor physical characteristics. All of these technologies have the potential to boost and
sustain economic growth, which will raise residents’ standards of living and allow for the creation of
services that are more effective and affordable.
Indian cities are predicted to hold 40% of the country’s population and produce more than 70% of its
GDP in the future (MoHUA, 2015). To offer the urban population a high standard of living and viable
economic opportunities, cities must be equipped with the necessary physical and social infrastructure.
With a network of video cameras, sensors, traffic management systems, smart meters, automobiles, IoT
devices, and mobile phones, cities today are sitting on a goldmine of data that they generate in massive
volumes every day (MoHUA, 2015). With the help of this data, emerging technologies such as AI have the
potential to completely change how cities deal with the problems brought on by rapid urbanization. AI
has many applications, from preserving a healthier environment to improving public transportation
and safety. To build smarter cities, governments worldwide are actively promoting the development and
implementation of AI-enabled services. Prime applications can be seen in traffic control, the health
sector (for example, in tracking the spread of diseases), crowd management, biometric applications,
public services, education, and transport and road safety.
Using AI systems in cities will have an internal impact on process efficiency and an external effect
on how well services are provided to the general populace. AI and other city systems will work together
to open avenues for equitable economic growth, promote sustainability in cities, and enhance the quality
of life for citizens. In light of this, governments can use AI to customize public services for
individuals, gain actionable insights into policy choices, anticipate future trends, simulate the
adoption of alternative policy options, and identify unintended repercussions before enacting new
legislation (MoHUA, 2015). This will make cities more sustainable and the government more responsive
and effective.

7.5 Oil and Gas Industry

Neural networks and other deep learning architectures have seen a resurgence since the mid-2000s. These
developments have enabled deep learning techniques to be applied successfully across numerous fields.
However, the oil and gas sector has not yet fully exploited these techniques. Many academics and
engineers have worked hard to close this gap and to develop a wide range of contemporary ML applications
in this area, including geosciences (Hiren et al., 2018). Hiren et al. (2018) examined the application
of ML techniques in the geoscience domains, with a primary focus on complex geoscientific issues,
including log correlation and seismic fault interpretation. For a given basin, geologists must spend a
significant amount of time identifying and classifying tops in multiple well log files. Advanced ML
algorithms can be trained on a small sample of well logs labeled by geologists to predict and propagate
the tops in the remaining logs. This workflow aims to automatically identify and categorize tops using
triple-combo log files. The method would significantly reduce the effort required of geologists (Xuan
and Murphy, 2007).
For fault interpretation, they use a paradigm based on the so-called local model (Hiren et al., 2018).
An expert interpreter selects small subsets of each new survey, qualitatively classifies them for
supervised model building, and ensures that the sample is representative of all sample properties. The
task is complete once the model has been built and applied to the remaining uninterpreted survey.
The goal is to reduce the overall interpretation effort drastically. So far, across various datasets,
the interpreter has often needed to label only a small portion of a survey to train the model, and the
trained model has identified most of the unlabeled faults. Researchers have demonstrated the
effectiveness of these techniques in locating boundaries in well logs and faults in seismic surveys.
Thanks to applications that use machine learning in the oil and gas industry, computers can now analyze
vast volumes of data quickly and accurately. Faults and stratigraphically complex areas can be mapped
in detail and with accuracy (Algorithmxlab, 2020; Anirbid et al., 2021).

7.6 Seismological Applications

Seismology examines earthquakes at multiple scales using a vast amount of measurement data, with a
focus on determining how the natural disaster may affect civil infrastructure systems (Jiao & Alavi,
2020). Seismology has developed quickly in recent years thanks to advances in sensing, processing, and
analysis techniques, and especially thanks to the increased computational power available for
processing large amounts of seismic data. While traditional data mining techniques were mostly used in
prior studies, AI now offers excellent techniques for extracting relevant information from raw data to
make accurate predictions in seismology. Since the visual analysis of recorded data from various phases
by seismic experts is ineffective, it is desirable to anticipate earthquakes in a logical and
trustworthy manner using data gathered in real time (Lary et al., 2016). Methods were developed to
group seismic phases together in order to determine whether an earthquake has occurred (Shahin, 2016).
These conventional methods, however, fail to identify more frequent, smaller seismic events. AI tools
are used to handle such complex scenarios in earthquake prediction. ANNs, GP, self-organizing maps
(SOMs), SVMs, and decision trees (DTs) are examples of ML techniques that are trained to make
predictions about seismic events. Deep learning, one of the most advanced techniques in the ML space,
builds on the ANN concept to learn generalized representations of data sets from many domains and to
capture complex nonlinear interactions between variables. ANNs are the most popular ML techniques in
this field. Recently, interest in using deep learning methods in seismology has been rising (Jiao &
Alavi, 2020).

8. QUANTUM COMPUTING IN BIG DATA AND GEOSCIENCE

Quantum computing is regarded as a future technology for information processing. It is a computing
paradigm that explicitly uses the quantum mechanical properties of matter to perform calculations. A
quantum bit, also known as a qubit, is the fundamental building block of a quantum computing system.
Considerable progress toward the realization of quantum computing systems has been achieved over the
past few decades by physically implementing qubits in many different systems. Quantum technology is
advancing quickly across various cutting-edge fields, with implications for space applications. Some
quantum technology, such as quantum key distribution, is already in space. Recent research findings
support the viability of practical applications. Many additional resources with previously unrealized
potential, such as quantum simulation, computing, imaging, sensing, metrology, optimization, and machine
learning, are on the cusp of development. We are in the early stages of a paradigm change that primarily
affects Earth observation (EO). Early quantum computers have been made openly accessible, which offers
a significant opportunity to find novel solutions and expand the scope of EO’s potential applications.
Quantum sensing and imaging make possible new concepts for measuring and observing physical quantities
and achieve accuracy levels beyond what is possible with traditional methods. Quantum simulations can
improve our ability to predict climate change and other planetary events, and quantum computing can help
us find creative data science and AI solutions to the problems presented by big EO data. These
advancements present numerous opportunities to contribute to the engineering and research sides of
quantum technologies.
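
What distinguishes a qubit from a classical bit can be shown with a small classical simulation. The
sketch below is an illustration only: it represents a single qubit as a pair of complex amplitudes and
applies the Hadamard gate, a standard textbook gate that is not discussed in this chapter, to create an
equal superposition.

```python
import math


def apply_gate(gate, state):
    """Multiply a 2x2 gate matrix by a two-component state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def measurement_probabilities(state):
    """Probability of measuring 0 or 1: squared magnitude of each amplitude."""
    return [abs(amplitude) ** 2 for amplitude in state]


# The basis state |0>: measuring it always gives 0.
ZERO = [1 + 0j, 0 + 0j]

# Hadamard gate: turns |0> into an equal superposition of |0> and |1>.
h = 1 / math.sqrt(2)
HADAMARD = [[h, h], [h, -h]]

superposed = apply_gate(HADAMARD, ZERO)
probs = measurement_probabilities(superposed)
print("P(0), P(1) after one Hadamard:", probs)

# Applying the gate twice brings the qubit back to |0>.
restored = measurement_probabilities(apply_gate(HADAMARD, superposed))
print("P(0), P(1) after two Hadamards:", restored)
```

A classical bit is either 0 or 1, whereas the simulated qubit carries amplitudes for both outcomes until
it is measured. Simulating n qubits this way needs 2^n amplitudes, which is why classical simulation
quickly becomes impractical.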
The field of EO quickly established itself as a driving force for processing, storing, and transmitting
data. Over the past few decades, satellite and aerial sensors have been collecting several terabytes of
data each day and transmitting them to receiving stations. As spatial resolution increases, the amount
of visual information explodes along with the data volume. Making the millions of EO images collected
and kept in archives more accessible to a broader user community has become a challenge for data
exploitation and information dissemination approaches.
Interdisciplinary activities have grown at the intersection of quantum and Earth system sciences due
to improvements in computational power and expertise. These activities range from fundamental research
and algorithm development to cutting-edge product development and institutional services like remote
sensing, big data analytics, risk assessment, dynamic prediction, and decision support.
Quantum computational techniques need to be developed to enable a quick but thorough approach to these
problems, given the increasingly difficult forecasting and prediction challenges pertaining to the
Earth system, particularly the computational overstretch that results from coupling highly detailed
spatiotemporal resolution to long-range prediction.
In conjunction with advances in quantum technologies, more advanced, phenomenologically aware
data analytics and model design tools are being developed. This gives data analysts and modelers a
critical opportunity to reflect on and reconnect with the fundamental principles at work in the
processes, and it offers great potential for cross-fertilization and co-construction.
We have reached a point where top-notch weather centers are struggling to keep up with the massive
volumes and complexity of information, to the point of exhaustion and frustration. It is time to make a
quantum leap into a new era of Earth system modeling across multiple spatiotemporal domains and scales,
as traditional computational geosciences reach their limits and high-resolution long-term prediction
appears to be out of reach, especially given the Earth system’s inherent complexity and predictability
challenges. Smarter computing, in which basic science is once again used to stimulate our minds and
reacquaint us with nature, is more important than simply faster computing.
As classical binary computing approaches its performance limits, quantum computing is expected to
address the big data challenges of the future, and it is becoming one of the fastest-growing digital
technologies.

9. CONCLUSION

This chapter discussed the essential structure and methodologies of machine learning, deep learning, and
data mining and their use in geoAI research applications. It focused on the innovative use of
artificial intelligence, notably machine learning and deep learning, in processing geographical data.
Various AI/ML/DL applications across geoscience disciplines were covered, and AI applications in
time series analysis, geospatial data analysis, geospatial text analysis, and sensor data analysis were
reviewed. The chapter also dealt with machine learning and deep learning applications to urbanization,
smart city planning, weather and air pollution, geospatial knowledge graphs, and disaster response, and
it presented cutting-edge AI methods and tools for geospatial analysis and their uses in geoscience
research. The application of quantum computing to processing geoscience data is mentioned briefly at
the end of the chapter because it is a new area of research.

REFERENCES

Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines.
Cognitive Science, 9(1), 147–169. doi:10.1207/s15516709cog0901_7
Alavi, A. H., Ameri, M., Gandomi, A. H., & Mirzahosseini, M. R. (2011a). Formulation of flow number
of asphalt mixes using a hybrid computational method. Construction and Building Materials, 25(3),
1338–1355.
Alavi, A. H., & Gandomi, A. H. (2011). A robust data mining approach for formulation of geotechnical
engineering systems. Engineering Computations, 28(3), 242–274.
Anirbid, S., Kriti, Y., Kamakshi, R., Namrata, B., & Hemangi, O. (2021). Application of machine
learning and artificial intelligence in oil and gas industry. Petroleum Research, 6(4), 379–391.
doi:10.1016/j.ptlrs.2021.05.009
Azamathulla, H. M. (2012). Linear programming for irrigation scheduling – a case study (Book Chapter).
In: Linear Programming: New Frontiers in Theory and Applications, 174–192.
Baykasoglu, A., Hamza, G., Hanifi, Ç., & Lale, Ö. (2008). Prediction of compressive and tensile
strength of limestone via genetic programming. Expert Systems with Applications, 35(1–2), 111–123.
doi:10.1016/j.eswa.2007.06.006
Beiki, M., Bashari, A., & Majdi, A. (2010). Genetic programming approach for estimating the
deformation modulus of rock mass using sensitivity analysis by neural network. International Journal
of Rock Mechanics and Mining Sciences, 47, 1091–1103.

Boldo, E., Linares, C., Lumbreras, J., Borge, R., Narros, A., Garcia-Perez, J., Fernandez-Navarro, P.,
Perez-Gomez, B., Aragones, N., Ramis, R., Pollan, M., Moreno, T., Karanasiou, A., & Lopez-Abente,
G. (2011). Health impact assessment of a reduction in ambient PM2.5 levels in Spain. Environment
International, 37(2), 342–348. doi:10.1016/j.envint.2010.10.004
Braga, A., & Logan, R. (2017). The Emperor of Strong AI Has No Clothes: Limits to Artificial
Intelligence. Information (Basel), 8(4), 156. doi:10.3390/info8040156
Brown, M. E., Lary, D. J., Vrieling, A., Stathakis, D., & Mussa, H. (2008). Neural networks as a tool
for constructing continuous NDVI time series from AVHRR and MODIS. International Journal of Remote
Sensing, 29(24), 7141–7158.
Carbone, A., Jensen, M., & Sato, A. H. (2016). Challenges in data science: A complex systems
perspective. Chaos, Solitons, and Fractals, 90, 1–7. doi:10.1016/j.chaos.2016.04.020
Collins, C. B., Beck, J. M., Bridges, S. M., Rushing, J. A., & Graves, S. J. (2017). Deep learning for
multisensor image resolution enhancement. In: 25th ACM SIGSPATIAL international conference on
advances in geographic information systems. Los Angeles, California. doi:10.1145/3149808.3149815
Copeland, B. J. (2022). artificial intelligence. Encyclopedia Britannica.
https://www.britannica.com/technology/artificial-intelligence
Duan, W., Chiang, Y. Y., Knoblock, C. A., Jain, V., Feldman, D., Uhl, J. H., & Leyk, S. (2017). Automatic
alignment of geographic features in contemporary vector data and historical maps. In: Proceedings of
the 25th ACM SIGSPATIAL international conference on advances in geographic information systems.
ACM. doi:10.1145/3149808.3149816
Frankenberg, C., Thorpe, A. K., Thompson, D. R., Hulley, G., Kort, E. A., Vance, N., Borchardt, J.,
Krings, T., Gerilowski, K., & Sweeney, C. (2016). Airborne methane remote measurements reveal heavy
tail flux distribution in four corners region. Proceedings of the National Academy of Sciences. PNAS.
doi:10.1073/pnas.1605617113
Gandomi, A. H., & Alavi, A. H. (2011). Multi-stage genetic programming: a new strategy to nonlinear
system modeling. Information Sciences, 181(23), 5227–5239.
Gandomi, A. H., & Alavi, A. H. (2013). Hybridizing genetic programming with orthogonal least squares
for modeling of soil liquefaction. International Journal of Earthquake Engineering and Hazard
Mitigation, 1(1), 1–8.
Gandomi, A. H., Alavi, A. H., Mousavi, M., & Tabatabaei, S. M. (2011). A hybrid computational approach
to derive new ground-motion prediction equations. Engineering Applications of Artificial Intelligence,
24(4), 717–732.
Ghose, P., Acharjee, U. K., Islam, M. A., Sharmin, S., & Uddin, M. A. (2021). Deep Viewing for
Covid-19 Detection from X-Ray Using CNN Based Architecture. 2021 8th International Conference on
Electrical Engineering, Computer Science and Informatics (EECSI), Semarang, Indonesia.
doi:10.23919/EECSI53397.2021.9624257

Ghose, P., Md, A. U., Uzzal, K. A., & Selina, S. (2022a). Deep viewing for the identification of
Covid-19 infection status from chest X-Ray image using CNN based architecture. Intelligent Systems with
Applications, 16, 200130. doi:10.1016/j.iswa.2022.200130
Ghose, P., Muhaddid, A., Mehnaz, T., Md. Ashraf, U., Milon, B., Kawsher, M., Gaur, L., Saurav, M., &
Zhongming, Z. (2022). Detecting COVID-19 infection status from chest X-ray and CT scan via single
transfer learning-driven approach. Frontiers in Genetics, 13 (Sec. Computational Genomics).
doi:10.3389/fgene.2022.980338
Ghose, P., Sharmin, S., Gaur, L., & Zhao, Z. (2022b). Grid-Search Integrated Optimized Support
Vector Machine Model for Breast Cancer Detection. IEEE International Conference on Bioinformatics and
Biomedicine (BIBM), Las Vegas, NV, USA. doi:10.1109/BIBM55620.2022.9995703
Ginoux, P., Chin, M., Tegen, I., Prospero, J. M., Holben, B., Dubovik, O., & Lin, S. J. (2001). Sources
and distributions of dust aerosols simulated with the GOCART model. Journal of Geophysical
Research-Atmospheres, 106(D17), 20255–20273.
Gonbadi, A. M., Tabatabaei, S. H., & Carranza, E. J. M. (2015). Supervised geochemical anomaly
detection by pattern recognition. Journal of Geochemical Exploration, 157, 81–91.
doi:10.1016/j.gexplo.2015.06.001
Hamilton, E. I. (1978). Principles of isotope geology. Earth-Science Reviews, 14(2), 190–191.
doi:10.1016/0012-8252(78)90027-2
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In:
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA,
770–778.
Hilbert, M. (2016). Big Data for Development: A Review of Promises and Challenges. Development
Policy Review, 34(1), 135–174. doi:10.1111/dpr.12142
Hiren, M., Srikanth, R., Mandar, S. K., Aria, A., & Schlumberger. (2018). Machine learning
methods in Geoscience. SEG International Exposition and 88th Annual Meeting. IEEE.
doi:10.1190/segam2018-2997218.1
Huang, S. F., & Liu, X. H. (2016). Thinking about the application of geological big data and geological
information development. China Mining Magazine, 25(08), 166–170.
IPCC. (2013). Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the
Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press.
Jiao, P., & Alavi, A. H. (2020). Artificial intelligence in seismology: Advent, performance and future
trends. Geoscience Frontiers, 11(3), 739–744. doi:10.1016/j.gsf.2019.10.004
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science,
349(6245), 255–260. doi:10.1126/science.aaa8415 PMID:26185243

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White,
G., Woollen, J., Zhu, Y., Leetmaa, A., Reynolds, R., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak,
J., Mo, K. C., Ropelewski, C., & Joseph, D. (1996). The NCEP/NCAR 40-year reanalysis project. Bulletin
of the American Meteorological Society, 77(3), 437–471.
doi:10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2
Karakus, M. (2011). Function identification for the intrinsic strength and elastic properties of granitic
rocks via genetic programming (GP). Computers & Geosciences, 37, 1318–1323.
Karpatne, A., & Liess, S. (2015). A guide to earth science data: Summary and research challenges.
Computing in Science & Engineering, 17(6), 14–18. doi:10.1109/MCSE.2015.127
Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological
Cybernetics, 43(1), 59–69.
Lary, D. J., Alavi, A. H., Gandomi, A. H., & Walker, A. L. (2016). Machine learning in geosciences and
remote sensing. Geoscience Frontiers, 7(1), 3–10. doi:10.1016/j.gsf.2015.07.003
Lary, D. J., Muller, M. D., & Mussa, H. Y. (2004). Using neural networks to describe tracer correlations.
Atmospheric Chemistry and Physics, 4, 143–146.
Lary, D. J., Remer, L. A., MacNeill, D., Roscoe, B., & Paradise, S. (2009). Machine learning and bias
correction of MODIS aerosol optical depth. IEEE Geoscience and Remote Sensing Letters, 6(4), 694–698.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document
recognition. Proceedings of the IEEE, 86(11), 2278–2324. doi:10.1109/5.726791
Lee, H. J., Liu, Y., Coull, B., Schwartz, J., & Koutrakis, P. (2011a). PM2.5 prediction modelling using
MODIS AOD and its implications for health effect studies. Epidemiology (Cambridge, Mass.), 22(1).
Lee, H. J., Liu, Y., Coull, B. A., Schwartz, J., & Koutrakis, P. (2011b). A novel calibration approach
of MODIS AOD data to predict PM2.5 concentrations. Atmospheric Chemistry and Physics, 11(15),
7991–8002. doi:10.5194/acp-11-7991-2011
Li, M., Zhang, Z., Jiang, S., Liu, Q., Chen, C., Zhang, Y., et al. (2020). Predicting the epidemic trend
of COVID-19 in China and across the world using the machine learning approach. medRxiv.
doi:10.1101/2020.03.18.20038117
Lin, Y., Chiang, Y. Y., Pan, F., Stripelis, D., Ambite, J. L., Eckel, S. P., & Habre, R. (2017). Mining
public datasets for modeling intra-city PM2.5 concentrations at a fine spatial resolution. In: Proceedings
of the 25th ACM SIGSPATIAL international conference on advances in geographic information systems.
Los Angeles area, CA: ACM, 1–10. doi:10.1145/3139958.3140013
Liu, J. Y., & Harrison, R. M. (2011). Properties of coarse particles in the atmosphere of the United
Kingdom. Atmospheric Environment, 45, 3267–3276.
Madadi, M. R., Azamathulla, H. Md., & Yakhkeshi, M. (2015). Application of Google Earth to investigate
the change of flood inundation area due to flood detention dam. Earth Science Informatics, 8(3),
627–638. doi:10.1007/s12145-014-0197-8

Majic, I., Winter, S., & Tomko, M. (2017). Finding equivalent keys in OpenStreetMap: Semantic
similarity computation based on extensional definitions. In: Proceedings of the 25th ACM SIGSPATIAL
international conference on advances in geographic information systems. Los Angeles, California.
doi:10.1145/3149808.3149813
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big Data:
The Next Frontier For Innovation, Competition, And Productivity. Analytics. McKinsey & Company.
Markoff, J. (2012). Scientists See Promise in Deep-Learning Programs. The New York Times.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The
Bulletin of Mathematical Biophysics, 5(4), 115–133. doi:10.1007/BF02478259
Minsky, M., & Papert, S. A. (2017). Perceptrons: An Introduction to Computational Geometry. MIT
Press. doi:10.7551/mitpress/11301.001.0001
Mitchell, M. (2006). Complex systems: Network thinking. Artificial Intelligence, 170(18), 1194–1212.
doi:10.1016/j.artint.2006.10.002
Mitchell, T. M. (1997). Machine Learning (1st ed.). McGraw-Hill, Inc.
MoHUA. (2015). Smart Cities Mission Statement & Guidelines, MoHUA, June 2015; Future of India the
winning leap. PwC. https://www.pwc.in/assets/pdfs/future-of-india/future-of-india-the-winning-leap.pdf
NASA and USGS. (2017). Landsat Data Archive. Landsat. https://landsat.gsfc.nasa.gov/data/
National Center for Atmospheric Research. (2017). Community Land Model. NCAR.
https://www.cesm.ucar.edu/models/clm/
Nieuwenhuijsen, M. J. (2015). Exposure assessment in environmental epidemiology (2nd ed.). Oxford
University Press. doi:10.1093/med/9780199378784.001.0001
NOAA (National Oceanic and Atmospheric Administration). (2017). National Centers for Environmental
Information. NOAA. https://www.ncdc.noaa.gov/
National Science Foundation. (2017). EarthScope. NSF. https://www.earthscope.org/
Olierook, H. K. H., Scalzo, R., Kohn, D., Chandra, R., Farahbakhsh, E., Clark, C., Reddy, S., & Müller,
D. (2020). Bayesian geological and geophysical data fusion for the construction and uncertainty
quantification of 3D geological models. Geoscience Frontiers, 12(1), 479–493. doi:10.1016/j.gsf.2020.04.015
Pan, Q., Leung, Y. F., & Hsu, S. (2021). Stochastic seismic slope stability assessment using polynomial
chaos expansions combined with relevance vector machine. Geoscience Frontiers, 12(1), 405–414.
doi:10.1016/j.gsf.2020.03.016
Pope, C. A., III, Burnett, R. T., Krewski, D., Jerrett, M., Shi, Y., Calle, E. E., & Thun, M. J. (2009).
Cardiovascular mortality and exposure to airborne fine particulate matter and cigarette smoke: Shape
of the exposure-response relationship. Circulation, 120(11), 941–948.
doi:10.1161/CIRCULATIONAHA.109.857888

Prospero, J. M., Ginoux, P., Torres, O., Nicholson, S. E., & Gill, T. E. (2002). Environmental characterization
of global sources of atmospheric soil dust identified with the Nimbus 7 Total Ozone Mapping Spectrometer
(TOMS) absorbing aerosol product. Reviews of Geophysics, 40(1), 31. doi:10.1029/2000RG000095
Ray, R., Kumar, D., Samui, P., Roy, L. B., Goh, A. T. C., & Zhang, W. (2020). Application of soft
computing techniques for shallow foundation reliability in geotechnical engineering. Geoscience Frontiers.
doi:10.1016/j.gsf.2020.05.003
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization
in the brain. Psychological Review, 65(6), 386–408. doi:10.1037/h0042519 PMID:13602029
Samuel, A. L. (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal
of Research and Development, 44(1.2), 206–227. doi:10.1147/rd.441.0206
Shahin, M. A. (2016). State-of-the-art review of some artificial intelligence applications in pile
foundations. Geoscience Frontiers, 7(1), 33–44. doi:10.1016/j.gsf.2014.10.002
Shen, H., Li, J. H., Wang, S. X., & Xie, Z. W. (2021). Prediction of load-displacement performance of
grouted anchors in weathered granites using FastICA-MARS as a novel model. Geoscience Frontiers,
12(1), 415–423. doi:10.1016/j.gsf.2020.05.004
Shi, C., & Wang, Y. (2021). Non-parametric machine learning methods for interpolation of spatially
varying non-stationary and non-Gaussian geotechnical properties. Geoscience Frontiers, 12(1), 339–350.
doi:10.1016/j.gsf.2020.01.011
Siegal, B. S., & Gillespie, A. R. (1980). Remote sensing in geology. Wiley.
Srivastava, N., Vignesh, D., & Saxena, N. (2021). Investigation of artificial neural network performance in
the aerosol properties retrieval. Journal of Water and Climate Change, 12(6), 2814–2834.
doi:10.2166/wcc.2021.336
Stillings, N. (2012). Complex systems in the geosciences and in geoscience learning (Vol. 486).
Geological Society of America Special Papers. doi:10.1130/2012.2486(17)
Sutton, R. S. (1992). Introduction: the challenge of reinforcement learning. In Machine Learning, 8 (pp.
225–227). Kluwer Academic Publishers.
Thompson, S., Fueten, F., & Bockus, D. (2001). Mineral identification using artificial neural networks
and the rotating polarizer stage. Computers & Geosciences, 27(9), 1081–1089.
doi:10.1016/S0098-3004(00)00153-9
Twarakavi, N. K. C., Misra, D., & Bandopadhyay, S. (2006). Prediction of Arsenic in Bedrock Derived
Stream Sediments at a Gold Mine Site Under Conditions of Sparse Data. Natural Resources Research,
15(1), 15–26. doi:10.1007/s11053-006-9013-6
Walker, A. L., Liu, M., Miller, S. D., Richardson, K. A., & Westphal, D. L. (2009). Development of a
dust source database for mesoscale forecasting in southwest Asia. Journal of Geophysical Research,
114(D18), D18207. doi:10.1029/2008JD011541

Wang, H. J., Zhang, L. M., Xiao, T., Zhang, L. L., & Li, J. H. (2020). Landslide identification using
machine learning. Geoscience Frontiers. doi:10.1016/j.gsf.2020.02.012
Wang, X. S., Zhao, B. L., Dong, S. T., Zhang, Y., Yi, W. Q., & Xu, G. C. (2014). Challenges and Strategies
for Large Seismic Exploration Data of Oil and Gas Industry. Zhongguo Shiyou Kantan, 19(04), 43–47.
Ward, P. J., Jongman, B., Kummu, M., Dettinger, M. D., Weiland, F. C. S., & Winsemius, H. C. (2014).
Strong influence of El Niño Southern Oscillation on flood risk around the world. Proceedings of the
National Academy of Sciences, 111(44), 15659–15664. doi:10.1073/pnas.1409822111
WHO. (2014). 7 million Premature Deaths Annually Linked to Air Pollution. WHO.
https://www.who.int/mediacentre/news/releases/2014/air-pollution/en/
WMO (World Meteorological Organisation). (2017). Global Runoff Data Centre. WMO.
http://www.bafg.de/GRDC/
World Climate Research Programme. (2017). Coupled Model Intercomparison Project.
https://cmip-pcmdi.llnl.gov/
Wunsch, C. (2006). Discrete Inverse and State Estimation Problems With Geophysical Fluid
Applications. Cambridge University Press. doi:10.1017/CBO9780511535949
Xuan, X., & Murphy, K. (2007). Modeling changing dependency structure in multivariate time
series. In: Proceedings of the 24th International Conference on Machine Learning, 1055–1062.
doi:10.1145/1273496.1273629
Zahabiyoun, B., Goodarzi, M. R., Bavani, A. R. M., & Azamathulla, H. M. (2013). Assessment of climate
change impact on the Gharesou river Basin using SWAT Hydrological model. CLEAN – Soil, Air, Water,
41(6), 601–609.
Zhang, L., Wang, S., & Liu, B. (2017). Deep Learning for Sentiment Analysis: A Survey. National
Science Foundation (NSF). Huawei Technologies Co. Ltd.
Zhang, P., Yin, Z. Y., Jin, Y. F., Chan, T. H. T., & Gao, F. P. (2020a). Intelligent modelling of clay
compressibility using hybrid meta-heuristic and machine learning algorithms. Geoscience Frontiers.
doi:10.1016/j.gsf.2020.02.014
Zhang, Q., Jia, X. Y., Wu, Z., Wang, J. R., Jiao, S. T., & Chen, W. F. (2015). Big data will lead to a
great change in Geological Science Research. China Geoscience union annual meeting, Beijing, China.
Zhang, R. H., Wu, C. Z., Goh, A. T. C., Böhlke, T., & Zhang, W. G. (2020c). Estimation of diaphragm
wall deflections for deep braced excavation in anisotropic clays using ensemble learning. Geoscience Frontiers.
doi:10.1016/j.gsf.2020.03.003
Zhang, W. G., Wu, C. Z., Zhong, H. Y., Li, Y. Q., & Wang, L. (2020b). Prediction of undrained shear strength
using extreme gradient boosting and random forest based on Bayesian optimization. Geoscience Frontiers.
doi:10.1016/j.gsf.2020.03.007

Zheng, S., Zhu, Y. X., Li, D. Q., Cao, Z. J., Deng, Q. X., & Phoon, K. K. (2020). Probabilistic outlier
detection for sparse multivariate geotechnical site investigation data using Bayesian learning. Geoscience
Frontiers. doi:10.1016/j.gsf.2020.03.017
Zhu, Y., Gao, T., Fan, L., Huang, S., Edmonds, M., Liu, H., Gao, F., Zhang, C., Qi, S., Wu, Y. N.,
Tenenbaum, J. B., & Zhu, S.-C. (2020). Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with
Humanlike Common Sense. Engineering, 6(3), 310–345. doi:10.1016/j.eng.2020.01.011
