Exploring Self-Attention for Crop-type Classification Explainability

Ivica Obadic (a,d,*), Ribana Roscher (b,1), Dario Augusto Borges Oliveira (a,c), Xiao Xiang Zhu (a,d,*)

a Chair of Data Science in Earth Observation, Technical University of Munich, Arcisstraße 21, Munich, 80333, Germany
b Forschungszentrum Jülich GmbH, Institute of Bio- and Geosciences, Plant Sciences, Wilhelm-Johnen-Straße, Jülich, 52428, Germany
c School of Applied Mathematics, Getulio Vargas Foundation, Praia de Botafogo, 190, Rio de Janeiro, 22250-900, Brazil
d Munich Center for Machine Learning (MCML), Arcisstraße 21, Munich, 80333, Germany
Abstract
Transformer models have become a promising approach for crop-type classification. Although their attention weights can be used to understand the relevant time points for crop disambiguation, the validity of these insights depends on how closely the attention weights approximate the actual workings of these black-box models, which is not always clear. In this paper, we introduce a novel explainability framework that systematically evaluates the explanatory power of the attention weights of a standard transformer encoder for crop-type classification. Our framework first relates the attention weights to domain knowledge about crop phenology to interpret the salient dates and phenological events for the model predictions. Next, we evaluate whether these insights are critical for crop disambiguation and develop a sensitivity analysis approach to understand the capability of attention to reveal crop-specific phenological events. Our results show that attention patterns strongly relate to key dates, which are often associated with critical phenological events for crop-type classification. Further, the sensitivity analysis reveals the limited capability of the attention weights to characterize crop phenology, as the identified phenological events depend on the other crops considered during training. This limitation highlights the relevance of future work towards the development of deep learning approaches capable of automatically learning the temporal vegetation dynamics for accurate crop disambiguation.
Keywords: Crop Type Classification, Explainable Machine Learning, Self-Attention,
Time Series Explainability
1. Introduction
Monitoring crop fields is a vital task for agriculture, and in the European Union (EU), it plays an essential role in the decision process for agricultural subsidization. According to the audit report by the European Court of Auditors (2020), several EU member states already use machine learning algorithms trained on time series of Sentinel observations for crop-type classification and for the detection of various phenological events. In recent years, state-of-the-art approaches for crop-type classification have been proposed using transformer encoder models (Rußwurm and Körner, 2020; Garnot et al., 2020, 2022). Notwithstanding the accurate crop maps produced by these models, their black-box nature prevents a straightforward understanding of the model decision process. Yet, to improve trust and enable the broad adoption of such models for agricultural policy making, it is desirable to connect model decisions to common agricultural knowledge (Campos-Taberner et al., 2020).
At the core of transformer encoders is the self-attention mechanism, which models the temporal dependencies in the data through the attention weights. These weights indicate the relevance of the sequence elements when creating high-level feature representations (Vaswani et al., 2017). Inspecting the attention weights has become a popular interpretability approach for understanding the workings of transformer models in natural language processing (Clark et al., 2019), image classification (Li et al., 2023), and video action recognition (Meng et al., 2019). Although the attention weights can be used to assess the temporal importance of the satellite observations (Rußwurm and Körner, 2020; Xu et al., 2021), their potential for uncovering other important aspects of crop monitoring, such as the identification of phenological events, is far from fully explored. At the same time, the validity of explanations based on attention weights is challenged in several studies, which yield no clear consensus on whether the attention weights faithfully explain the model decisions (Bibal et al., 2022). Therefore, with the growing number of black-box approaches for crop-type classification that rely on self-attention, it becomes essential to:
1. explore the potential of attention weights to reveal the inner workings of these models, and
2. unveil their explanatory power in the context of crop disambiguation.
In this paper, we tackle the above questions by applying a novel explainability framework that leverages the attention weights of a trained transformer encoder model to identify critical insights for agriculture monitoring. Next, we evaluate the relevance of
the attention-weight explanations for crop disambiguation and assess their potential for uncovering detailed events in crop phenology. In summary, our proposed methodology improves the explainability of the transformer encoder model for crop-type classification and uncovers its limitations through the following main contributions:
• Identification of the relevant dates and phenological events for the transformer encoder model by relating the sparse attention patterns with domain knowledge about crop phenology.
• Quantitative evaluation of the attention-weight explanations to assess whether the identified insights are crucial for crop disambiguation.
• Sensitivity analysis to better understand the capabilities of the attention weights for detecting important events in crop phenology.
Our findings show that the self-attention mechanism highlights the key dates for crop disambiguation, as it assigns high attention values to the dates at which crops exhibit distinct spectral reflectance features. Moreover, the attention patterns provide insights into specific and relevant events in crop phenology, such as harvesting and growing, which play a critical role in effectively distinguishing between different crop types. However, our sensitivity analysis indicates that the attention weights do not capture all significant events in crop phenology. The identified phenological events are conditioned on the other crop types present in the dataset, meaning that they are relevant specifically for disambiguating between the considered classes.
2. Related Work
Interpretability studies for crop-type classification typically focus on understanding the importance of the temporal signal and on identifying the relevant spectral bands. For example, Vuolo et al. (2018) show that utilizing Sentinel-2 observations acquired within the growing season improves the accuracy of a random forest model. Further, Campos-Taberner et al. (2020) propose a perturbation method that reveals the high relevance of the Sentinel-2 observations acquired in the summer months, as well as of the red and near-infrared bands, for the predictions of a BiLSTM model. Similar findings are reported by Xu et al. (2021), where a feature importance analysis reveals that attention-based deep learning approaches and a random forest model attribute high importance to the observations acquired in the summer months and to the shortwave infrared band. The growing number of crop-type classification models based on self-attention has led to the use of attention weights for understanding the temporal importance patterns assigned by these models (Rußwurm and Körner, 2020; Garnot et al., 2020; Garnot and Landrieu, 2020; Xu et al., 2021). These studies find that the attention weights are primarily concentrated on short and specific temporal intervals per crop type. Moreover, Rußwurm and Körner (2020) discover that the transformer encoder suppresses cloudy observations and argue that this behavior derives from the sparse attention distribution that neglects the cloudy observations. At the same time, the use of attention weights for model explanation is questioned in several natural language processing studies that investigate how closely the attention weights approximate the inner model workings (Bibal et al., 2022).
For instance, Jain and Wallace (2019) observe a weak correlation between attention and gradient-based feature importance, along with the existence of adversarial attention distributions, while Meister et al. (2021) show that inducing sparsity in the attention distribution does not improve performance on some interpretability benchmarks. Therefore, while the existing works demonstrate that attention-based explanations can offer valuable insights into the behavior of the transformer encoder model for crop-type classification, they do not directly elucidate the impact of the attention weights on the model predictions, nor do they highlight the potential of these weights for identifying significant events in crop phenology. Addressing these crucial aspects is the primary objective of the explainability framework presented in this paper.
3. Methods
3.1. Self-Attention Mechanism
The self-attention mechanism (Vaswani et al., 2017) creates high-level feature representations for the sequence elements by modeling the temporal dependencies in the data. First, the input sequence $X \in \mathbb{R}^{T \times d_{in}}$ is linearly projected into a query matrix $Q \in \mathbb{R}^{T \times d_k}$, a key matrix $K \in \mathbb{R}^{T \times d_k}$, and a value matrix $V \in \mathbb{R}^{T \times d_v}$ with the following operations:

Q = X\theta_Q, \quad K = X\theta_K, \quad V = X\theta_V    (1)

where $T$ is the length of the time series, $d_{in}$ is the embedding dimension of the sequence elements, $d_k$ is the query and key embedding dimension, $d_v$ is the value embedding dimension, and $\theta_Q$, $\theta_K$, and $\theta_V$ are projection matrices optimized jointly with the usual model parameters during the training process.
Next, the "Scaled Dot-Product Attention" combines the keys and the queries into attention weights with the following equation:

A = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)    (2)

$A \in \mathbb{R}^{T \times T}$ is a square matrix containing the attention weights, which model the alignment between the queries and the keys. These weights are used as linear coefficients to relate the different positions in the sequence. Concretely, an entry $a_{ij}$ of the matrix $A$ models the influence of the $j$-th sequence element in creating the high-level feature representation $h_i$ for the $i$-th sequence element with the following equation:

h_i = \sum_{j=1}^{T} a_{ij} v_j    (3)

where $v_j$ is the $j$-th row of the value matrix $V$, representing the value embedding for position $j$.
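To make Equations (1)-(3) concrete, the following minimal NumPy sketch computes single-head scaled dot-product self-attention for one time series. The dimensions and the random projection matrices are illustrative assumptions and do not correspond to the model configuration used in this paper.

import numpy as np

def self_attention(X, theta_Q, theta_K, theta_V):
    # Eq. (1): linear projections into queries, keys, and values.
    Q, K, V = X @ theta_Q, X @ theta_K, X @ theta_V
    d_k = Q.shape[-1]
    # Eq. (2): scaled dot-product scores followed by a row-wise softmax.
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)
    # Eq. (3): each h_i is the attention-weighted sum of the value rows.
    H = A @ V
    return H, A

# Toy example with hypothetical dimensions.
T, d_in, d_k, d_v = 6, 4, 8, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(T, d_in))
H, A = self_attention(X,
                      rng.normal(size=(d_in, d_k)),
                      rng.normal(size=(d_in, d_k)),
                      rng.normal(size=(d_in, d_v)))
print(H.shape, A.shape)  # (6, 8) (6, 6); every row of A sums to 1

Each row of A is the attention distribution of one time point over the whole sequence, which is exactly the quantity that attention-based explanations inspect.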
Vaswani et al. (2017) also introduce the "Multi-Head Attention" approach, which projects non-overlapping subspaces of the input data into queries, keys, and values in parallel and concatenates the outputs of the scaled dot-product attention computed on each projected version. Hence, the number of heads corresponds to the number of applied projections.
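In practice, the attention weights analyzed in this kind of framework can be read out from an off-the-shelf multi-head attention layer. The snippet below is an illustrative PyTorch sketch rather than the authors' implementation; the embedding dimension, number of heads, and sequence length are hypothetical, and by default PyTorch's nn.MultiheadAttention returns the attention weights averaged over the heads.

import torch
import torch.nn as nn

# Hypothetical configuration for illustration only: embedding dimension 64,
# 4 heads, a time series of 70 satellite observations, batch size 1.
d_model, num_heads, T = 64, 4, 70
mha = nn.MultiheadAttention(embed_dim=d_model, num_heads=num_heads, batch_first=True)

x = torch.randn(1, T, d_model)               # one embedded observation sequence
out, attn = mha(x, x, x, need_weights=True)  # self-attention: query = key = value
print(out.shape)   # torch.Size([1, 70, 64])  high-level feature representations
print(attn.shape)  # torch.Size([1, 70, 70])  attention weights, averaged over heads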