Exploring Self-Attention for Crop-type Classification Explainability

Ivica Obadic (a,d,*), Ribana Roscher (b,1), Dario Augusto Borges Oliveira (a,c), Xiao Xiang Zhu (a,d,*)

a Chair of Data Science in Earth Observation, Technical University of Munich, Arcisstraße 21, Munich, 80333, Germany
b Forschungszentrum Jülich GmbH, Institute of Bio- and Geosciences, Plant Sciences, Wilhelm-Johnen-Straße, Jülich, 52428, Germany
c School of Applied Mathematics, Getulio Vargas Foundation, Praia de Botafogo, 190, Rio de Janeiro, 22250-900, Brazil
d Munich Center for Machine Learning (MCML), Arcisstraße 21, Munich, 80333, Germany
Abstract
Transformer models have become a promising approach for crop-type classification. Although their attention weights can be used to understand the relevant time points for crop disambiguation, the validity of these insights depends on how closely the attention weights approximate the actual workings of these black-box models, which is not always clear. In this paper, we introduce a novel explainability framework that systematically evaluates the explanatory power of the attention weights of a standard transformer encoder for crop-type classification. Our framework first relates the attention weights to domain knowledge about crop phenology to interpret the salient dates and phenological events for the model predictions. Next, we evaluate whether these insights are critical for crop disambiguation and develop a sensitivity analysis approach to understand the capability of attention to reveal crop-specific phenological events. Our results show that attention patterns strongly relate to key dates, which are often associated with critical phenological events for crop-type classification. Further, the sensitivity analysis reveals the limited capability of the attention weights to characterize crop phenology, as the identified phenological events depend on the other crops considered during training. This limitation highlights the relevance of future work towards the development of deep learning approaches capable of automatically learning the temporal vegetation dynamics for accurate crop disambiguation.
Keywords: Crop Type Classification, Explainable Machine Learning, Self-Attention,
Time Series Explainability
1. Introduction
Monitoring crop fields is a vital task for agriculture, and in the European Union (EU), it plays an essential role in the decision process for agricultural subsidization. According to the audit report by the European Court of Auditors (2020), several EU member states already use machine learning algorithms trained on time series of Sentinel observations for crop-type classification and for the detection of various phenological events. In recent years, state-of-the-art approaches for crop-type classification have been proposed using transformer encoder models (Rußwurm and Körner, 2020; Garnot et al., 2020, 2022). Notwithstanding the accurate crop maps produced by these models, their black-box nature prevents a straightforward understanding of the model decision process. Yet, to improve trust and enable the broad adoption of such models for agricultural policy making, it is desirable to connect model decisions to common agricultural knowledge (Campos-Taberner et al., 2020).
At the core of transformer encoders is the self-attention mechanism, which models the temporal dependencies in the data through the attention weights. These weights indicate the relevance of the sequence elements when creating high-level feature representations (Vaswani et al., 2017). Inspecting the attention weights has become a popular interpretability approach for understanding the workings of transformer models in natural language processing (Clark et al., 2019), image classification (Li et al., 2023), and video action recognition (Meng et al., 2019). Although the attention weights can be used to assess the temporal importance of the satellite observations (Rußwurm and Körner, 2020; Xu et al., 2021), their potential for uncovering other important aspects of crop monitoring, such as the identification of phenological events, is far from fully explored. At the same time, the validity of explanations based on attention weights is challenged in several studies, which yield no clear consensus on whether the attention weights faithfully explain the model decisions (Bibal et al., 2022). Therefore, with the growing number of black-box approaches for crop-type classification that rely on self-attention, it becomes essential to:
1. explore the potential of attention weights to reveal the inner workings of these models, and
2. unveil their explanatory power in the context of crop disambiguation.
In this paper, we tackle the above questions by applying a novel explainability framework that leverages the attention weights of a trained transformer encoder model to identify critical insights for agriculture monitoring. Next, we evaluate the relevance of
the attention-weight explanations for crop disambiguation and assess their potential for uncovering detailed events in crop phenology. In summary, our proposed methodology improves the explainability of the transformer encoder model for crop-type classification and uncovers its limitations through the following main contributions:
• Identification of the relevant dates and phenological events for the transformer encoder model by relating the sparse attention patterns with domain knowledge about crop phenology.
• Quantitative evaluation of the attention-weight explanations to assess whether the identified insights are crucial for crop disambiguation.
• Sensitivity analysis to better understand the capabilities of the attention weights for detecting important events in crop phenology.
Our findings show that the self-attention mechanism highlights the key dates for crop disambiguation, as it assigns high attention values to the dates at which crops exhibit distinct spectral reflectance features. Moreover, the attention patterns provide insights into specific and relevant events in crop phenology, such as harvesting and growing, which play a critical role in effectively distinguishing between different crop types. However, our sensitivity analysis indicates that the attention weights do not capture all significant events in crop phenology. The identified phenological events are conditioned on the other crop types present in the dataset, meaning that they are relevant specifically for disambiguating between the considered classes.
2. Related Work
Interpretability studies for crop-type classification typically focus on understanding the importance of the temporal signal and on identifying the relevant spectral bands. For example, Vuolo et al. (2018) show that utilizing Sentinel-2 observations acquired within the growing season improves the accuracy of a random forest model. Further, Campos-Taberner et al. (2020) propose a perturbation method that reveals the high relevance of the Sentinel-2 observations acquired in the summer months, as well as of the red and near-infrared bands, for the predictions of a BiLSTM model. Similar findings are reported by Xu et al. (2021), where a feature importance analysis reveals that attention-based deep learning approaches and a random forest model attribute high importance to the observations acquired in the summer months and to the shortwave infrared band. The growing number of crop-type classification models based on self-attention has led to the use of attention weights for understanding the temporal importance patterns assigned by these models (Rußwurm and Körner, 2020; Garnot et al., 2020; Garnot and Landrieu, 2020; Xu et al., 2021). These studies find that the attention weights are primarily concentrated on short and specific temporal intervals per crop type. Moreover, Rußwurm and Körner (2020) discover that the transformer encoder suppresses cloudy observations and argue that this behavior derives from the sparse attention distribution that neglects the cloudy observations. At the same time, the use of attention weights for model explanation is questioned in several natural language processing studies that investigate how closely the attention weights approximate the inner model workings (Bibal et al., 2022).
For instance, Jain and Wallace (2019) observe a weak correlation between attention and gradient-based feature importance, along with the existence of adversarial attention distributions, while Meister et al. (2021) show that inducing sparsity in the attention distribution does not improve performance on some interpretability benchmarks. Therefore, while the existing works demonstrate that attention-based explanations can offer valuable insights into the behavior of the transformer encoder model for crop-type classification, they do not directly elucidate the impact of the attention weights on the model predictions, nor do they highlight the potential of these weights for identifying significant events in crop phenology. Addressing these crucial aspects is the primary objective of the explainability framework presented in this paper.
3. Methods
3.1. Self-Attention Mechanism
The self-attention mechanism (Vaswani et al., 2017) creates high-level feature representations for the sequence elements by modeling the temporal dependencies in the data. First, the input sequence $X \in \mathbb{R}^{T \times d_{in}}$ is linearly projected into a query matrix $Q \in \mathbb{R}^{T \times d_k}$, a key matrix $K \in \mathbb{R}^{T \times d_k}$, and a value matrix $V \in \mathbb{R}^{T \times d_v}$ with the following operations:

Q = X\theta_Q, \quad K = X\theta_K, \quad V = X\theta_V    (1)

where $T$ is the length of the time series, $d_{in}$ is the embedding dimension of the sequence elements, $d_k$ is the query and key embedding dimension, $d_v$ is the value embedding dimension, and $\theta_Q$, $\theta_K$, and $\theta_V$ are projection matrices optimized jointly with the usual model parameters during the training process.
Next, the "Scaled Dot-Product Attention" combines the keys and the queries into attention weights with the following equation:

A = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)    (2)

$A \in \mathbb{R}^{T \times T}$ is a square matrix containing the attention weights, which model the alignment between the queries and the keys. These weights are used as linear coefficients to relate the different positions in the sequence. Concretely, an entry $a_{ij}$ of the matrix $A$ models the influence of the $j$-th sequence element in creating the high-level feature representation $h_i$ for the $i$-th sequence element with the following equation:

h_i = \sum_{j=1}^{T} a_{ij} v_j    (3)

where $v_j$ is the $j$-th row of the value matrix $V$, representing the value embedding for position $j$.
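To make Equations (1)-(3) concrete, the following minimal NumPy sketch computes single-head scaled dot-product self-attention for one time series. The dimensions and the random projection matrices are illustrative assumptions and do not correspond to the model configuration used in this paper.

import numpy as np

def self_attention(X, theta_Q, theta_K, theta_V):
    # Eq. (1): linear projections into queries, keys, and values.
    Q, K, V = X @ theta_Q, X @ theta_K, X @ theta_V
    d_k = Q.shape[-1]
    # Eq. (2): scaled dot-product scores followed by a row-wise softmax.
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)
    # Eq. (3): each h_i is the attention-weighted sum of the value rows.
    H = A @ V
    return H, A

# Toy example with hypothetical dimensions.
T, d_in, d_k, d_v = 6, 4, 8, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(T, d_in))
H, A = self_attention(X,
                      rng.normal(size=(d_in, d_k)),
                      rng.normal(size=(d_in, d_k)),
                      rng.normal(size=(d_in, d_v)))
print(H.shape, A.shape)  # (6, 8) (6, 6); every row of A sums to 1

Each row of A is the attention distribution of one time point over the whole sequence, which is exactly the quantity that attention-based explanations inspect.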
Vaswani et al. (2017) also introduce the "Multi-Head Attention" approach, which projects non-overlapping subspaces of the input data into queries, keys, and values in parallel and concatenates the outputs of the scaled dot-product attention computed on each projected version. Hence, the number of heads corresponds to the number of applied projections.
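In practice, the attention weights analyzed in this kind of framework can be read out from an off-the-shelf multi-head attention layer. The snippet below is an illustrative PyTorch sketch rather than the authors' implementation; the embedding dimension, number of heads, and sequence length are hypothetical, and by default PyTorch's nn.MultiheadAttention returns the attention weights averaged over the heads.

import torch
import torch.nn as nn

# Hypothetical configuration for illustration only: embedding dimension 64,
# 4 heads, a time series of 70 satellite observations, batch size 1.
d_model, num_heads, T = 64, 4, 70
mha = nn.MultiheadAttention(embed_dim=d_model, num_heads=num_heads, batch_first=True)

x = torch.randn(1, T, d_model)               # one embedded observation sequence
out, attn = mha(x, x, x, need_weights=True)  # self-attention: query = key = value
print(out.shape)   # torch.Size([1, 70, 64])  high-level feature representations
print(attn.shape)  # torch.Size([1, 70, 70])  attention weights, averaged over heads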