


Sentinel-2 Active Fire Segmentation: Analyzing Convolutional and Transformer Architectures, Knowledge Transfer, Fine-Tuning and Seam-Lines

Andre M. Fusioka, Gabriel H. D. A. Pereira, Bogdan T. Nassu and Rodrigo Minetto, Member, IEEE

IEEE Geoscience and Remote Sensing Letters, 2024. DOI: 10.1109/LGRS.2024.3443775

Abstract—Active fire segmentation in satellite imagery is a critical remote sensing task, providing essential support for planning, decision-making, and policy development. Several techniques have been proposed for this problem over the years, generally based on specific equations and thresholds, which are sometimes empirically chosen. Some satellites, such as MODIS and Landsat-8, have consolidated algorithms for this task. However, for other important satellites such as Sentinel-2, this is still an open problem. In this paper, we explore the possibility of using transfer learning to train convolutional and transformer-based deep architectures (U-Net, DeepLabV3+ and SegFormer) for active fire segmentation. We pre-train these architectures on Landsat-8 images and automatically labeled samples, and fine-tune them to Sentinel-2 images. Experiments show that the proposed method achieves F1-scores of up to 88.4% for Sentinel-2 images, outperforming three threshold-based algorithms by at least 19%, while maintaining a low demand for manually labeled samples. We also address detection over seam-line regions, which present a particular challenge for existing methods. The source code and trained models are available on GitHub¹.

Index Terms—active fire segmentation, transfer learning, fine-tuning, Sentinel-2 imagery, seam-lines.

Andre M. Fusioka, Gabriel H. de A. Pereira, Bogdan T. Nassu and Rodrigo Minetto are with the Federal University of Technology - Parana (UTFPR), Brazil. E-mails: [email protected], [email protected], {rminetto,bogdan}@utfpr.edu.br. We would like to thank CNPq (Grant 312815/2023-9), CAPES, FAPESP and Fundação Araucária. Manuscript received ? ?, ?; revised ? ?, ?.

¹https://github.com/Minoro/l8tos2-transf-seamlines

I. INTRODUCTION

The automatic segmentation of active fire in satellite imagery is fundamental for environmental monitoring, providing invaluable data for firefighters, researchers, and policymakers to assess the extent and intensity of wildfires, thus contributing to the sustainable development goals defined by the United Nations, such as responsible production (as fire is often used to prepare land for agriculture), climate action, and life on land (impact on biodiversity). This led to a number of techniques being proposed for this problem, usually based on specific equations and thresholds derived from statistical properties observed in some satellite bands and how they relate. Some sensors, such as MODIS (onboard the Aqua and Terra satellites), VIIRS (onboard the NPP and NOAA-20 satellites), and OLI (onboard the Landsat-8 and Landsat-9 satellites) [1], [2], [3], have consolidated algorithms, which achieve very high-quality results. However, for other important satellites such as Sentinel-2, this is still an open problem, with several candidate solutions being proposed [2], [4], [5].

A recent trend in the field is attempting to improve fire segmentation results by exploring computational models based on deep networks. The problem has been addressed in several manners, from binary classification (i.e., presence or absence of fire in an image) [6], [7], to the semantic segmentation of individual pixels [8], [9]. Remote sensing images from a number of sensors have been used, such as GOES-16, Himawari-8, VIIRS, CBERS, Landsat-8 and Sentinel-2, also including multisensor approaches. In general, authors report superior results when using machine learning approaches over traditional threshold-based methods for active fire segmentation.

Regarding the Sentinel-2 satellite, Zhang et al. [10] introduced a framework for the acquisition and segmentation of active fire images. The study was limited to the United States and Australia. They utilized the equations proposed by [11], adjusted for the collected images, to generate masks for training a deep network. The authors of [9] segment both active fires and burnt areas, combining data from the Sentinel-1 and 2 satellites, as well as MODIS fire products, Google Earth images, and field observation data to establish the ground truth masks. The large number of burnt area pixels in these masks, compared to active fire pixels, may favor metrics focused on the former, while hiding a worse performance on the latter. That study focused on the Mozambique region.

One of the major challenges when training deep networks is the need for a large amount of labeled data — for active fire segmentation, in the form of many multispectral images with their corresponding fire masks. The usual way of obtaining such data would involve the effort of human specialists analyzing and annotating by hand each fire pixel on thousands of images, a costly and time-consuming task. This problem was addressed by Pereira et al. [8] by creating masks (which were later made public) based on combinations of the segmentation results produced by several Landsat-8 algorithms. However, that was only possible due to the maturity and the general nature of these algorithms. For Sentinel-2, the lack of algorithms with the same qualities prevents one from using the same approach to obtain high-quality, massive labeled data.

The great demand for labeled data for training deep networks is a known issue that may affect applications from many domains. One very popular way of dealing with it is transfer learning: taking a model pre-trained on a large benchmark dataset, such as ImageNet or COCO, and fine-tuning it on a smaller, task-specific dataset. Although this approach is effective for many types of data, satellite imagery poses one additional challenge: each satellite produces multispectral
images with distinct wavelengths and spatial resolutions for each band, making pre-training on a general dataset unfeasible.

In this paper, we investigate three deep segmentation architectures — U-Net [12], a popular model for segmentation tasks, DeepLabV3+ [13], which has been previously used for segmentation through transfer learning [14], and SegFormer [15], a transformer-based model that has achieved state-of-the-art results — for segmenting active fires in Sentinel-2 images without relying on massive hand-labeled data. Although it is not possible to perform transfer learning from a general dataset, Sentinel-2 and Landsat-8 bands share many similarities, as shown in Fig. 1. Note that the bands do not align perfectly and have different spatial resolutions, but the similarities are enough to allow pre-training on Landsat-8 images (i.e., taking models trained the same way as done by Pereira et al. [8]) and fine-tuning the models on a smaller amount of Sentinel-2 images. Furthermore, we analyze the impact of seam-lines, a visual artifact that appears on Sentinel-2 images but not on Landsat-8 images, on the segmentation.

Fig. 1. Representation of the similarity of wavelengths observed by each band of Sentinel-2 and Landsat-8. Adapted from: USGS [16].

In summary, the main contributions of this paper are: (i) we evaluate three popular deep network architectures for semantic segmentation (U-Net [12], DeepLabV3+ [13] and SegFormer-B0 [15]), taking pre-trained models for active fire segmentation on Landsat-8 images and fine-tuning them for Sentinel-2 images; (ii) we show that the described method can successfully produce active fire segmentation models for Sentinel-2 considering three deep network architectures, without the need for massive hand-labeled data, with reduced training time, and with results that surpass those obtained by three traditional threshold-based methods; and (iii) we point out how seam-lines are less prone to affect fine-tuned deep learning models than non-fine-tuned models or thresholding techniques.

II. METHODOLOGY

The methodology for active fire segmentation on Sentinel-2 images involves fine-tuning models trained on Landsat-8 images through a transfer learning process. The scheme for transfer learning is summarized in Fig. 2. The left and right pipeline flows refer, respectively, to the Landsat-8 pre-training and the Sentinel-2 transfer learning stages, and are further detailed below. We also describe the dataset and the threshold-based algorithms considered in our comparison. All implementations were in Python, using the TensorFlow library.

[Figure 2: pipeline diagram — Landsat-8 bands 7, 6 and 5 (22,097 three-channel 256 × 256 patches, 27 GB) with voting-based active fire masks feed the pre-training of DeepLabV3+, SegFormer-B0 and U-Net; transfer learning and fine-tuning then specialize the same architectures on Sentinel-2 bands 12, 11 and 8A (7,345 patches, 8.5 GB) with reference masks.]
Fig. 2. Transfer learning scheme between Landsat-8 and Sentinel-2 satellites for active fire segmentation.

A. Landsat-8 pre-training

For the Landsat-8 pre-training, shown on the left side of Fig. 2, we used 22,097 256 × 256-pixel image patches from the dataset compiled by Pereira et al. [8]. We selected non-overlapping patches covering the entire globe and including many large and small wildfire events in areas such as the Amazon region, Africa, Australia and the United States. Each patch has an associated fire mask, obtained by combining the masks produced by three well-known sets of conditions: Schroeder et al. [1], Murphy et al. [2], and Kumar-Roy et al. [3]. The masks were combined through a pixelwise majority voting process — a pixel is set in the active fire mask if it is set by at least two condition sets.

We performed pre-training on the Landsat-8 images using three networks: besides the U-Net [12], used by [8], we also trained DeepLabV3+ [13] (with a ResNet50 backbone) and SegFormer-B0 [15] models, allowing us to compare different architectures for transfer learning. Furthermore, differently from [8], who trained the models on Landsat-8 bands 7, 6 and 2, we trained our models on bands 7, 6 and 5, which roughly correspond to Sentinel-2 bands 12, 11 and 8A. Each network was trained for up to 50 epochs, halting the training if the validation loss did not improve for 5 epochs, with a learning rate of 0.001, using the Adam optimizer, applying vertical and horizontal flips for data augmentation, with 8 images per batch. Note that we did not evaluate the performance of the trained models on Landsat-8 images, since our aim is fine-tuning these models for Sentinel-2 images through a transfer learning procedure.
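To make the pre-training recipe concrete, the sketch below combines the two steps described above: the pixelwise majority vote used to build the Landsat-8 masks, and the training configuration (Adam, learning rate 0.001, early stopping with patience 5, batches of 8 flipped patches). It is a minimal illustration under stated assumptions, not the authors' released code; `build_unet`, `train_ds` and `val_ds` are placeholders.

```python
import numpy as np
import tensorflow as tf

def majority_vote_mask(schroeder, murphy, kumar_roy):
    """Pixelwise majority voting: a pixel is marked as active fire
    if at least two of the three condition sets marked it."""
    votes = (schroeder.astype(np.uint8) + murphy.astype(np.uint8)
             + kumar_roy.astype(np.uint8))
    return (votes >= 2).astype(np.float32)

def random_flips(patch, mask):
    """Vertical and horizontal flips, applied identically to the
    3-band patch (Landsat-8 bands 7, 6 and 5) and its mask."""
    if tf.random.uniform(()) > 0.5:
        patch = tf.image.flip_left_right(patch)
        mask = tf.image.flip_left_right(mask)
    if tf.random.uniform(()) > 0.5:
        patch = tf.image.flip_up_down(patch)
        mask = tf.image.flip_up_down(mask)
    return patch, mask

# Assumed: `train_ds` / `val_ds` are tf.data.Datasets yielding
# ((256, 256, 3) patch, (256, 256, 1) voting mask) pairs, and
# `build_unet` stands in for any of the three architectures.
train_ds = train_ds.map(random_flips).batch(8)
val_ds = val_ds.batch(8)

model = build_unet(input_shape=(256, 256, 3))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy")
model.fit(train_ds, validation_data=val_ds, epochs=50,
          callbacks=[tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                                      patience=5)])
```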
B. Fine-tuning Landsat-8 models for Sentinel-2 data

The Sentinel-2 pipeline flow is shown on the right side of Fig. 2. We take the models trained for Landsat-8 images and specialize them for Sentinel-2 images, leveraging the available Landsat-8 algorithms: while the Landsat-8 models were pre-trained on 22,097 automatically labeled patches (27 GB), the fine-tuning process uses only 7,345 patches labeled by hand (8.5 GB). We trained models on 3-channel patches containing bands 12, 11 and 8A, the same bands used by the three Sentinel-2 active fire detection algorithms we compared our models to, allowing a fairer comparison. The Sentinel-2 images are distributed with a quantification value of 10,000, so the patches were divided by this value for scale adjustment before feeding the networks.

The knowledge transfer from Landsat-8 to Sentinel-2, represented by a connection between their deep networks in Fig. 2, consists of using the pre-trained model parameters from a Landsat-8 network (DeepLabV3+, SegFormer-B0 or U-Net) as a starting point for the same architecture for Sentinel-2. Then, the representations learned from the source task (Landsat-8) are fine-tuned to the target task (Sentinel-2), as shown in Fig. 3. Each model was trained for up to 40 epochs, with a learning rate of 0.0001 and all weights unfrozen, with early stopping after 5 epochs without improvements on the validation loss, and image flipping for data augmentation.

[Figure 3: fine-tuning diagram — Sentinel-2 bands {12, 11, 8A} undergo scale adjustment and feed the deep network, which starts from (1) the initial network weights from Landsat-8; the output segmentation mask is compared against the reference mask to (2) compute the loss and adjust the weights.]
Fig. 3. Sentinel-2 fine-tuning: the pre-trained weights from Landsat-8, used as a starting point, are further refined by using similar wavelength bands and reference masks. The numbers indicate the order of events. The reference masks were fully annotated by a remote sensing specialist, with part of this set being used for training and the remaining for testing.
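A minimal sketch of this fine-tuning stage is shown below, assuming the Landsat-8 checkpoint is available as a Keras model. The checkpoint name, loss and batch size are placeholders (the paper specifies only the learning rate, the early-stopping patience, the epoch budget and the flip augmentation); `random_flips` is the augmentation helper from the pre-training sketch.

```python
import tensorflow as tf

QUANTIFICATION_VALUE = 10000.0  # Sentinel-2 reflectance scaling factor

def scale_sentinel2(patch, mask):
    # Divide bands 12, 11 and 8A by the quantification value so the
    # inputs reach the network in a reflectance-like [0, 1] range.
    return patch / QUANTIFICATION_VALUE, mask

# Start from the weights of the same architecture pre-trained on
# Landsat-8 ("landsat8_unet.h5" is a hypothetical checkpoint name).
model = tf.keras.models.load_model("landsat8_unet.h5")
model.trainable = True  # all weights unfrozen

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy")

# Assumed: `s2_train` / `s2_val` yield (patch, reference-mask) pairs.
train_ds = s2_train.map(scale_sentinel2).map(random_flips).batch(8)
val_ds = s2_val.map(scale_sentinel2).batch(8)

model.fit(train_ds, validation_data=val_ds, epochs=40,
          callbacks=[tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                                      patience=5)])
```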
C. Sentinel-2 dataset

Fig. 4 shows the world locations of the Sentinel-2 image samples used in our tests. Our Sentinel-2 dataset is divided into three subsets: one with patches with the presence of active fire (387 patches), another without active fire spots (10,577 patches), and the last one with 1,278 image patches, all with the presence of seam-lines — artifacts caused by the composition of multiple captures, a known issue that results in misaligned bands visible in regions with clouds, and that affects active fire recognition [17]. It is important to stress that the images with and without fire pixels also contain a small proportion of samples with seam-lines, which is exactly what led us to identify this problem and create a specific subset with just this type of image for later analysis.

[Figure 4: world map with the locations of the image patches marked as red dots.]
Fig. 4. Geographical distribution of Sentinel-2 image patches (red dots), 256 × 256 pixels each, used in our experiments.

For our experiments, we used 5-fold cross-validation, with 3 folds being used for training, 1 for validation, and 1 for testing (training and test patches do not overlap). The folds include all 12,242 patches. All patches were fully annotated by a remote sensing specialist using the QGIS software. The total number of annotated active fire pixels is 19,274.
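One possible realization of this 3/1/1 fold assignment is sketched below, rotating the folds so that each serves once as the test set. The paper does not spell out the exact rotation, so this scheme is an assumption for illustration only.

```python
import numpy as np

def five_fold_splits(patch_ids, seed=0):
    """Yield (train, val, test) patch-id splits: 3 folds for training,
    1 for validation and 1 for testing, rotated over 5 runs."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(patch_ids), 5)
    for k in range(5):
        test = folds[k]
        val = folds[(k + 1) % 5]
        train = np.concatenate([folds[i] for i in range(5)
                                if i != k and i != (k + 1) % 5])
        yield train, val, test
```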
D. Sentinel-2: threshold-based algorithms

We compared the models obtained through transfer learning against three threshold-based algorithms for active fire segmentation in Sentinel-2 images, shown in Table I. It is worth noting that there are few methods specifically designed for active fire detection on Sentinel-2 images. Note that all the listed methods use the same bands for detecting unambiguous fire pixels; the difference lies in the thresholds and the contextual analysis. Also, Murphy et al.'s method was originally designed for Landsat-8, and its initial purpose, as for the Kato-Nakamura method, is to detect thermal anomalies (such as volcanic activity). After an initial segmentation, both Liu et al.'s and Murphy et al.'s methods apply a contextual analysis, capturing potential fires close to unambiguous fire pixels to reduce the number of false negatives. In our implementation, Murphy et al.'s method was the only one to apply solar zenith angle correction, since this was not specified by the others.

TABLE I
Summary of the bands used by threshold-based active fire segmentation methods. The Unambiguous Fire row presents the bands used to identify active fire; the Potential Fire row presents the bands used to identify pixels not captured as unambiguous fires, generally applied to the neighborhood of these pixels. The False Alarm Control row shows the bands used to discard common sources of confusion, such as water bodies and shadows.

                      Kato-Nakamura   Liu et al.      Murphy et al.
Unambiguous fire      b12, b11, b8A   b12, b11, b8A   b12, b11, b8A
Potential fire        —               b12, b11        b12, b11, b8A
False alarm control   —               b12, b8A        —
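The sketch below illustrates the two-stage structure shared by these methods — fixed thresholds for unambiguous fire pixels, followed by a relaxed test in the neighborhood of those detections. The threshold values and the specific band tests are illustrative placeholders only, not the published constants of any of the three methods.

```python
import numpy as np
from scipy import ndimage

def two_stage_fire_mask(b12, b11, b8a, t_unambiguous=1.2, t_potential=0.8):
    """Generic two-stage threshold segmentation (illustrative only)."""
    eps = 1e-6
    ratio = b12 / np.maximum(b11, eps)
    # Stage 1: unambiguous fire pixels from fixed band tests.
    unambiguous = (ratio > t_unambiguous) & (b12 > b8a)
    # Stage 2: contextual analysis — a relaxed threshold is applied
    # only near unambiguous detections, reducing false negatives.
    near_fire = ndimage.binary_dilation(unambiguous, iterations=1)
    potential = (ratio > t_potential) & near_fire
    return unambiguous | potential
```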
III. RESULTS AND DISCUSSION

In this section, we report performances according to the well-known metrics precision (P = tp/(tp + fp)) and recall (R = tp/(tp + fn)), where the numbers of true positive (tp), false positive (fp), and false negative (fn) pixels were accumulated over all images, and the metrics were computed for the entire test set. We also report the F1-score (F1 = 2/(1/P + 1/R)) and IoU metrics, which are commonly applied to segmentation problems.
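Since tp, fp and fn are accumulated over the whole test set before the metrics are computed, rather than averaging per-image scores, the evaluation reduces to a few lines. A minimal sketch:

```python
import numpy as np

def accumulate_counts(pred_masks, ref_masks):
    """Accumulate true positive, false positive and false negative
    pixel counts over all test images."""
    tp = fp = fn = 0
    for pred, ref in zip(pred_masks, ref_masks):
        pred, ref = pred.astype(bool), ref.astype(bool)
        tp += int(np.sum(pred & ref))
        fp += int(np.sum(pred & ~ref))
        fn += int(np.sum(~pred & ref))
    return tp, fp, fn

def metrics(tp, fp, fn):
    p = tp / (tp + fp)            # precision
    r = tp / (tp + fn)            # recall
    f1 = 2 / (1 / p + 1 / r)      # harmonic mean of P and R
    iou = tp / (tp + fp + fn)     # intersection over union
    return p, r, f1, iou
```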
A. General performance

Table II presents the results achieved by the three threshold-based methods, as well as different versions of the deep models, with 5-fold cross-validation on the entire dataset (including images with and without active fires and seam-lines). For reference, besides the fine-tuned models, we show results obtained by the original pre-trained Landsat-8 networks without fine-tuning. The worst performance among the threshold-based algorithms was obtained by the Kato-Nakamura method [4], which displayed an under-segmentation (restrictive) behavior, resulting in the best precision (≈ 97%) but the worst recall (≈ 24%). Fig. 5 shows that this method failed to identify several active fire regions. Conversely, the Liu et al. [17] and Murphy et al. [2] methods have an over-segmentation behavior (as can be seen in Fig. 5), with a high recall rate (above 81%) but a low precision (equal to or below 59%).

TABLE II
Active fire segmentation performances considering three handcrafted threshold-based algorithms for Sentinel-2 images (Murphy et al., Kato-Nakamura and Liu et al.), and three deep architectures (DeepLabV3+, SegFormer-B0 and U-Net), with transfer learning but no fine-tuning (TL, no FT), and with transfer learning and fine-tuning (TL + FT). The results are the mean and standard deviation over 5 folds. Boldface values correspond to the best performance in each column.

Method                  P (%)       R (%)       F1-score (%)  mIoU (%)
Kato-Nakamura           97.1 (±1)   24.0 (±3)   38.3 (±4)     23.7 (±3)
Liu et al.              45.2 (±5)   81.3 (±5)   58.0 (±6)     41.0 (±5)
Murphy et al.           59.0 (±6)   84.0 (±4)   69.2 (±5)     53.0 (±5)
SegFormer (TL w/o FT)   24.5 (±6)   85.8 (±4)   37.9 (±8)     23.6 (±6)
DeepLab (TL w/o FT)     49.8 (±11)  77.6 (±7)   60.4 (±10)    43.9 (±10)
U-Net (TL w/o FT)       55.2 (±9)   90.1 (±3)   68.2 (±8)     52.2 (±9)
SegFormer (TL w/ FT)    84.2 (±5)   80.5 (±4)   82.2 (±2)     69.8 (±3)
DeepLab (TL w/ FT)      83.1 (±4)   68.3 (±9)   74.8 (±6)     60.0 (±8)
U-Net (TL w/ FT)        91.6 (±1)   85.6 (±5)   88.4 (±3)     79.3 (±5)

[Figure 5: qualitative results — a cropped Sentinel-2 image (region R) and its manual annotation, followed by rows of segmentation outputs labeled Threshold-based (Kato-Nakamura, Liu et al., Murphy et al.) and Deep learning (SegFormer, DeepLab and U-Net, each with TL w/o FT and TL w/ FT).]
Fig. 5. Active fire segmentation results. Region R (cropped and highlighted in the top left) shows an active fire region with different levels of temperature. It can be seen that the threshold-based methods do not correctly segment the region core with higher fire intensities.
Regarding the models pre-trained for Landsat-8 but not fine-tuned for Sentinel-2, it can be seen in Table II that all architectures reported a high recall (above 77%), which indicates a high recognition rate of active fire spots, as shown in Fig. 5. However, precision was below 56% for the three models, with many wildfire regions being over-segmented.

The best trade-off between precision and recall in Table II was achieved by using transfer learning from Landsat-8 and fine-tuning for Sentinel-2. Observe in Fig. 5 that with this approach the fire regions are more accurately delineated, since the models were properly refined for the new satellite resolution and also for Sentinel-2 artifacts. Despite its simplicity, the U-Net proved still competitive, outperforming the more recent SegFormer and DeepLab architectures.

B. Performance over seam-lines

Seam-lines are a common source of errors for active fire recognition algorithms because they sometimes produce patterns similar to active fire. Seam-lines appear at the boundaries between the sensor's detectors within a tile. They are caused by the detectors' orientation and staggered capture, combined with the image registration being aligned to the ground: since cloud tops sit at a different altitude than the ground, this registration produces a visible misregistration at cloud level. The possible locations for seam-lines are indicated by the available Sentinel-2 metadata, as shown in Fig. 6. Simply discarding these regions is not a solution, because not all joining positions have visible seam-lines, and an active fire area can appear in those regions. Furthermore, seam-line regions are not fixed over time. In our experiments, seam-lines led threshold-based methods to produce many false positives, as shown in Fig. 6. Although the deep learning models did not eliminate the problem, the figure shows that the occurrences are reduced considerably.

To further understand the impact of these seam-lines on the results, we performed tests using the subset of 1,278 image patches with seam-line artifacts. We removed these patches from the original folds and retrained the models without them. The entire seam-line subset was then added to all the test folds, without division. The threshold-based methods were also tested on these sets. Fig. 7 shows the performance drop in the F1-score for each model in this experiment, compared to the original training and test folds. The worst impact is for the threshold-based methods of Liu et al. and Murphy et al., which are now being evaluated on a much larger test set,
and proved especially sensitive to the seam-lines, which we observed tend to be detected as false positives. The Kato-Nakamura method was not affected in the same way, as it is already very restrictive. Models without fine-tuning also show a significant performance drop, as they are often confused by these artifacts, which do not appear in the Landsat-8 images they were trained on. Fine-tuned models were still affected, but on a much smaller scale. A few patches with seam-lines are present in the training set, and these were seemingly enough to allow these models to avoid the problem in most cases.

[Figure 6: a seam-line region (zoom), the corresponding Sentinel-2 patch and footprint metadata, and the false-positive maps of each method.]
Fig. 6. Presence of seam-lines in a Sentinel-2 image patch, as indicated by the footprint metadata; false positive errors for Kato-Nakamura (1 pixel), Liu et al. (92 pixels), Murphy et al. (88 pixels); and networks without fine-tuning: DeepLab (51 pixels), SegFormer (94 pixels) and U-Net (83 pixels). After fine-tuning, the three architectures did not report any errors (0 pixels).

[Figure 7: bar chart of the F1-score drop (%) per method, ranging from −0.3 to −18.4; the largest drops are for the Liu et al. and Murphy et al. methods, and the smallest for the fine-tuned models.]
Fig. 7. Performance drop in F1-score when removing most patches containing seam-lines from the training set, and adding them to all the test folds. Fine-tuning, even on a very small number of patches containing seam-lines, made the models more robust to their presence.

IV. CONCLUSIONS

The similarities between the Landsat-8 and Sentinel-2 bands allowed transfer learning from the former to the latter using a few labeled samples for three different types of deep architectures (SegFormer, DeepLab and U-Net). Not only that, but the fine-tuned models outperform the thresholding techniques available for Sentinel-2. The proposed models were also able to reduce the number of false positives generated by the segmentation of seam-lines, which are common in Sentinel-2 images. Moreover, since only a few images were used, fine-tuning the models to the new satellite was fast: while the base Landsat-8 networks took up to 2.5 hours per epoch to be trained on an Nvidia Titan Xp (12 GB), fine-tuning took less than 3 minutes per epoch for DeepLab, 6 minutes for U-Net, and 7 minutes for SegFormer (918 batches per epoch).

Training a network from scratch with Sentinel-2 using masks built by the thresholding algorithms, the same way as done for Landsat-8, would carry the bias of those masks: seam-lines could still be segmented as fire, since they are a problem for the Liu et al. and Murphy et al. methods, while the very restrictive Kato-Nakamura method would lead to many omission errors. In this context, the proposed approach seems to be viable, since it relies on the large source of data provided for Landsat-8 and its consolidated fire segmentation algorithms. In future work, other satellites with similar wavelengths will be studied, along with the use of satellite-specific metadata for enhancing results.

REFERENCES

[1] W. Schroeder, P. Oliva, L. Giglio, B. Quayle, E. Lorenz, and F. Morelli, "Active fire detection using Landsat-8/OLI data," Remote Sensing of Environment, vol. 185, pp. 210–220, Nov. 2016.
[2] S. W. Murphy, C. R. de Souza Filho, R. Wright, G. Sabatino, and R. Correa Pabon, "HOTMAP: Global hot target detection at moderate spatial resolution," Remote Sensing of Environment, vol. 177, pp. 78–88, 2016.
[3] S. S. Kumar and D. P. Roy, "Global operational land imager Landsat-8 reflectance-based active fire detection algorithm," International Journal of Digital Earth, vol. 11, no. 2, pp. 154–178, 2018.
[4] S. Kato and R. Nakamura, "Detection of thermal anomaly using Sentinel-2A data," in IEEE International Geoscience and Remote Sensing Symposium, Jul. 2017, pp. 831–833.
[5] Y. Liu, W. Zhi, B. Xu, W. Xu, and W. Wu, "High-temperature anomalies from Sentinel-2 MSI images," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 177, pp. 174–193, 2021.
[6] Y. Kang, T. Sung, and J. Im, "Toward an adaptable deep-learning model for satellite-based wildfire monitoring with consideration of environmental conditions," Remote Sensing of Environment, vol. 298, 2023.
[7] Z. Hong, Z. Tang, H. Pan, Y. Zhang, Z. Zheng, R. Zhou, Z. Ma, Y. Zhang, Y. Han, J. Wang, and S. Yang, "Active fire detection using a novel convolutional neural network based on Himawari-8 satellite images," Frontiers in Environmental Science, vol. 10, 2022.
[8] G. H. Pereira, A. M. Fusioka, B. T. Nassu, and R. Minetto, "Active fire detection in Landsat-8 imagery: A large-scale dataset and a deep-learning study," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 178, 2021.
[9] Z. Shirvani, O. Abdi, and R. C. Goodman, "High-resolution semantic segmentation of woodland fires using residual attention UNet and time series of Sentinel-2," Remote Sensing, vol. 15, no. 5, 2023.
[10] Q. Zhang, L. Ge, R. Zhang, G. I. Metternicht, C. Liu, and Z. Du, "Towards a deep-learning-based framework of Sentinel-2 imagery for automated active fire detection," Remote Sensing, vol. 13, 2021.
[11] X. Hu, Y. Ban, and A. Nascetti, "Sentinel-2 MSI data for active fire detection in major fire-prone biomes: A multi-criteria approach," International Journal of Applied Earth Observation and Geoinformation, vol. 101, 2021.
[12] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234–241.
[13] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, "Rethinking atrous convolution for semantic image segmentation," 2017. [Online]. Available: http://arxiv.org/abs/1706.05587
[14] Z. Zhou, F. Zhang, H. Xiao, F. Wang, X. Hong, K. Wu, and J. Zhang, "A novel ground-based cloud image segmentation method by using deep transfer learning," IEEE Geoscience and Remote Sensing Letters, vol. 19, 2022.
[15] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, "SegFormer: Simple and efficient design for semantic segmentation with transformers," May 2021. [Online]. Available: http://arxiv.org/abs/2105.15203
[16] US Geological Survey, "USGS EROS Archive, Comparison of Sentinel-2 and Landsat," 2023. [Online]. Available: https://www.usgs.gov
[17] Y. Liu, B. Xu, W. Zhi, C. Hu, Y. Dong, S. Jin, Y. Lu, T. Chen, W. Xu, Y. Liu, B. Zhao, and W. Lu, "Space eye on flying aircraft: From Sentinel-2 MSI parallax to hybrid computing," Remote Sensing of Environment, vol. 246, Sep. 2020.
