An Interpretable Deep Semantic Segmentation
Method for Earth Observation
1st Ziyang Zhang
School of Computing and Communications
Lancaster University
Lancaster, UK
z.zhang51@lancaster.ac.uk
2nd Plamen Angelov
School of Computing and Communications
Lancaster University
Lancaster, UK
p.angelov@lancaster.ac.uk
3rd Eduardo Soares
School of Computing and Communications
Lancaster University
Lancaster, UK
e.almeidasoares@lancaster.ac.uk
4th Nicolas Longepe
Phi-Lab Explore Office
European Space Agency
Frascati, Italy
nicolas.longepe@esa.int
5th Pierre Philippe Mathieu
Phi-Lab Explore Office
European Space Agency
Frascati, Italy
pierre.philippe.mathieu@esa.int
Abstract—Earth observation is fundamental for a range of
human activities including flood response as it offers vital
information to decision makers. Semantic segmentation plays a
key role in mapping the raw hyper-spectral data coming from the
satellites into a human understandable form assigning class labels
to each pixel. Traditionally, water index based methods have been
used for detecting water pixels. More recently, deep learning
techniques such as U-Net started to gain attention offering
significantly higher accuracy. However, the latter are hard to
interpret by humans and use dozens of millions of abstract
parameters that are not directly related to the physical nature
of the problem being modelled. They are also hungry for labelled
data and computational power. At the same time, data transmission
capability on small nanosatellites is limited in terms of power
and bandwidth, yet constellations of such small nanosatellites are
preferable, because they reduce the revisit time in disaster areas
from days to hours. Therefore, achieving models as accurate
as deep learning (e.g. U-Net), or even surpassing them in
terms of accuracy, without the need to rely on huge amounts
of labelled training data, computational power, or abstract
coefficients offers potentially game-changing capabilities
for EO (Earth observation) and flood detection, in particular. In
this paper, we introduce a prototype-based interpretable deep
semantic segmentation (IDSS) method, which is highly accurate
as well as interpretable. It uses orders of magnitude fewer
parameters than deep networks such as U-Net, and these
parameters are clearly interpretable by humans. The IDSS
proposed here offers a transparent structure that allows
users to inspect and audit the algorithm’s decision. Results
have demonstrated that IDSS could surpass other algorithms,
including U-Net, in terms of IoU (Intersection over Union) and
Recall for total water. We used the WorldFloods data set
for our experiments and plan to use the semantic segmentation
results combined with masks for permanent water to detect flood
events.
Index Terms—Earth observation, semantic segmentation, flood
detection, interpretable deep learning, prototype-based classifiers,
U-Net, WorldFloods
The project is co-funded by ESA (the European Space Agency).
I. INTRODUCTION
Flooding is one of the most catastrophic weather events, having
caused about 7 million fatalities in the twentieth century [1].
The average annual loss generated by floods is estimated at
over US$100 billion (2015) [2]. Recent findings claim that
exposure to floods is expected to triple by 2050
due to increases in population and economic assets in flood-
prone areas [3]. Depending on the socio-economic scenario,
human losses from flooding are projected to rise by 70–83%
and direct flood damage by 160–240% relative to 1976–2005
[4]. Detecting flooding and its associated impacts is critical to
effective risk reduction [5], [6].
The advent of Copernicus of ESA (European Space Agency)
and the launch of several Sentinels satellites have provided
massive amounts of data for a range of Earth observation
missions, including flood detection [7]. SAR (Synthetic Aper-
ture Radar) images and optical remote sensing images are
often used for flood mapping, but optical remote sensing
images are preferable due to their mature processing and analysis
techniques, while SAR images suffer from noise and information
loss problems [8]. The Sentinel-2 satellite constellation is
often used for flood detection due to the availability of Multi-
Spectral Imagers, shorter revisit times (5 days), and higher
spatial resolution (10m for some bands) [9].
Flood mapping traditionally relies on water indices, such as
NDWI [10] and MNDWI [11]. However, these methods are
threshold-based, which requires expert knowledge. In recent
years, machine learning algorithms have also been applied to
flood detection. The work in [12] uses a clustering algorithm
and a combination of several water indices to identify water
areas. However, coastal water, cloud, shadow and snow pixels
had to be removed in advance, which greatly increases the
workload.
978-1-6654-5656-2/22/$31.00 ©2022
arXiv:2210.12820v1 [cs.CV] 23 Oct 2022
The recent work from [13] investigates how a constellation
of small nanosatellites assembled from commercial off-the-shelf
(COTS) hardware, also known as CubeSats, could be
used for disaster response in the case of floods. The authors
proposed the use of deep learning algorithms to produce
multi-class segmentation with high accuracy on-board of a
very cheap satellite hardware. Although the deep learning
approaches proposed for this challenging task have produced
great results in terms of accuracy, they are black-box and are
extremely difficult to audit [14].
In this paper, a new prototype-based approach is proposed
called interpretable deep semantic segmentation (IDSS), which
extends the recently introduced explainable Deep Neural Net-
work (xDNN) [15] through a new clustering and decision
mechanism. The prototype-based nature of IDSS allows clear
interpretability of its decision mechanism [16]. This is im-
portant for the human-in-the-loop process involved in this
application. Results demonstrate that IDSS is able to surpass
xDNN and U-Net in terms of IoU and recall for water
detection.
II. LITERATURE REVIEW
A. Water/Flood mapping
Water/Flood mapping requires semantic segmentation for
allocating a class label to each multidimensional pixel. Tradi-
tionally, different water index-based methods have been used to
determine liquid water. They represent ratios of different spec-
tral bands from the raw satellite signal that can characterize
the water absorption. The rationale behind the usefulness of
the water indices is that the water absorbs energy in the near
infrared (NIR) and short-wave infrared (SWIR) wavelengths,
making it significantly different from other objects on the
ground [17]. The most widely used water index is NDWI
[10]. Other water indices also exist, such as MNDWI [11],
WNDWI [18] and AWEI [19]. However, such methods often
require manual setting of thresholds, which is challenging, and
the choice of threshold influences the result significantly.
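As an illustration of the index-based approach described above, the computations can be sketched as follows. This is a minimal sketch, not a reference implementation: the band arrays, the default threshold of zero, and the small epsilon guard against division by zero are assumptions of the sketch, and in practice the threshold must be tuned, as noted.

```python
import numpy as np

def ndwi(green, nir, eps=1e-9):
    """NDWI (McFeeters): (Green - NIR) / (Green + NIR).

    Water reflects green light but absorbs NIR, so water pixels
    tend towards positive values.
    """
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (green - nir) / (green + nir + eps)

def mndwi(green, swir, eps=1e-9):
    """MNDWI (Xu): (Green - SWIR) / (Green + SWIR)."""
    green = np.asarray(green, dtype=np.float64)
    swir = np.asarray(swir, dtype=np.float64)
    return (green - swir) / (green + swir + eps)

def water_mask(index, threshold=0.0):
    """Threshold an index image: pixels above the threshold are water."""
    return index > threshold
```

For Sentinel-2, the green, NIR and SWIR inputs would typically be taken from bands B3, B8 and B11 respectively, though the exact band choice is a per-sensor decision.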
Machine learning methods were also applied for flood map-
ping. For example, K-means was used to perform clustering
based on Synthetic Aperture Radar (SAR) optical features and
thresholds were applied to further perform classification by
[20]. The SVM classifier was used as a reference label since
it is considered to provide better results than the thresholding
methods [21]. KNN was used to perform per-pixel classifica-
tion based on the water index by [22].
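In the spirit of the clustering approach of [20], a toy k=2 clustering of per-pixel NDWI values might look like the sketch below. This is illustrative only: the one-dimensional feature, the random initialisation, and the rule that the higher-mean cluster is water are assumptions of this sketch, not details taken from [20].

```python
import numpy as np

def kmeans_water(ndwi_values, iters=20, seed=0):
    """Cluster per-pixel NDWI values into two groups (k=2) and
    label the cluster with the higher mean NDWI as water."""
    rng = np.random.default_rng(seed)
    x = np.asarray(ndwi_values, dtype=np.float64).ravel()
    # Initialise the two centres from two distinct observed values.
    centers = rng.choice(x, size=2, replace=False)
    for _ in range(iters):
        # Assign each pixel to its nearest centre, then update centres.
        assign = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = x[assign == k].mean()
    # Water absorbs NIR, so the water cluster has the higher NDWI mean.
    water_cluster = centers.argmax()
    return (assign == water_cluster).reshape(np.shape(ndwi_values))
```

On well-separated NDWI distributions this converges to the obvious two-way split without any manually chosen threshold, which is the appeal of the clustering route.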
More recently, with the rapid development of deep learning,
convolutional neural networks started to be used more widely
for Earth Observation and flood mapping, in particular. For
example, the fully convolutional neural network Resnet50
trained on Sentinel-1 satellite images was used to segment
permanent water and flood by [23], U-Net and a simple CNN
were trained on Sentinel-2 satellite images and used to perform
the onboard flood segmentation task by [13]. The main advantage
of using deep convolutional networks is their high levels
of accuracy measured primarily by IoU (intersection over
union) and recall characteristics for this particular problem.
They also offer powerful latent features extraction capability.
However, the downside is that they require large amounts of
labeled training data and computational power and most of
all, they lack interpretability. They are often considered as
"black box" models because they have millions of abstract
model parameters (weights) which have no direct physical
meaning and are hard to interpret or check if affected by noise
or adversarial actions especially when transmitted. Attempts
have been made to provide some explainability, but these
are mostly post-hoc partial solutions or surrogate models.
Therefore, currently, there is a powerful trend aiming to
develop explainable or interpretable-by-design alternatives that
are as powerful as such Deep Neural Networks, yet offer
human-intelligible models [24], [15], [14].
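The IoU and recall measures mentioned above can be computed for binary water masks as in the following self-contained sketch; the convention of returning 1.0 when a class is absent from both masks is an assumption of the sketch, not a standard mandated by the papers cited here.

```python
import numpy as np

def iou_recall(pred, truth):
    """Per-class IoU and recall for binary water masks.

    pred, truth: boolean (or 0/1) arrays of identical shape,
    where True marks a water pixel.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()   # correctly detected water
    fp = np.logical_and(pred, ~truth).sum()  # false alarms
    fn = np.logical_and(~pred, truth).sum()  # missed water
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return float(iou), float(recall)
```

Recall is the more safety-relevant of the two for flood response, since a missed water pixel (a false negative) is costlier to responders than a false alarm.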
The method proposed in this paper offers visual and linguistic
forms of interpretation based on prototypes, with layers
that have a clear interpretation as well as linguistic IF...THEN
rules and a clear decision-making process based on similarity.
These forms of representation of the model can be inspected
by a human and have clear meaning. The prototypes can also
be visualized using RGB colours as well as raw features such
as NIR, SWIR, etc. which are easy to interpret by a human.
In the next sub-section, prototype-based machine learning
methods are briefly reviewed because they are underestimated
and often overlooked, but they do offer clear advantages
in terms of interpretability and high levels of accuracy and
performance.
B. Prototype-based models
Prototype-based machine learning methods have similar
concepts as some methods from cognitive psychology and
neuroscience in regards to comparisons of new observations
or stimuli to a set of prototypes [25]. Prototype-based machine
learning models have been attracting much attention due
to their easy to understand decision making processes and
interpretability. K nearest neighbor [26] and K-means [27]
are the most representative algorithms based on prototypes.
The most typical prototype-based neural network algorithm is
the Radial-basis Function (RBF) method [28] which is like
a bridge between neural networks and linguistic IF...THEN
rules. There are also other prototype-based machine learning
algorithms such as fuzzy c-means clustering [29] and Learning
Vector Quantization (LVQ) [30].
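A minimal nearest-prototype classifier in the spirit of the family reviewed above can be sketched as follows. This is an illustrative toy, not the IDSS algorithm itself: using one class-mean prototype per class and Euclidean distance as the (dis)similarity are assumptions of the sketch.

```python
import numpy as np

class NearestPrototype:
    """Toy prototype-based classifier: one prototype (the class mean)
    per class, prediction by nearest prototype.

    Each prototype reads as a linguistic rule:
        IF x is similar to prototype_c THEN class is c
    """

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One prototype per class: the mean of that class's samples.
        self.prototypes_ = np.stack(
            [X[y == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, X):
        # Distance from every sample to every prototype, via broadcasting.
        d = np.linalg.norm(
            X[:, None, :] - self.prototypes_[None, :, :], axis=2
        )
        return self.classes_[d.argmin(axis=1)]
```

Because the prototypes live in the same space as the inputs, each one can be inspected directly by a human, which is exactly the interpretability advantage this sub-section argues for.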
Recently, with the rapid development of deep learning
algorithms, prototype-based models have tended to integrate
with neural networks, as in ProtoPNet [31], xDNN
[15] and a nonparametric segmentation framework [32]. The
IDSS method proposed in this paper differs from these works
because xDNN [15] does not perform clustering to generate
prototypes and has not been applied to flood mapping while
the work by [32] only looks for prototypes in the embedding
space, which greatly reduces the interpretability of the model.
The method proposed in this paper not only outperforms the
state-of-the-art deep convolutional networks such as U-Net and
SCNN as well as various water-indices-based models, but also,
greatly improves the interpretability of the model by further
finding the mean value in the raw features space corresponding