In this paper, we focus on geospatial data, which typically feature non-linear relationships among observations. In this
scenario, ANNs are good candidate models, since they can handle complex, non-linear relations by
learning adjustable weights and biases from data [8]. In recent years, these methods have been applied in
various ways to geospatial data [9], [10], [11].
The problem with using ANNs on Earth system data is that we often have only relatively short time series to
predict on or a small number of events to learn from. Sophisticated neural networks come with a large number of
trainable parameters, which makes them prone to overfitting. Training these models and preventing them from getting
stuck in local minima of the objective function requires considerable expertise and effort. Common techniques against
overfitting are dropout, early stopping and regularization [12], [13], [14].
In this work we overcome these problems by using Echo State Networks (ESNs) [15]. ESNs are a type of RNN
and have been widely used for time series forecasting [16], [17]. In its basic form, an ESN consists of an input layer and
an output layer, with a reservoir of sparsely connected units in between. The weights and biases connecting the inputs
to the reservoir units, as well as the internal reservoir weights and biases, are randomly initialized. The input length
determines the number of recurrent time steps inside the reservoir. We record the final reservoir states, and only the
output weights and bias are trained. In contrast to other types of neural networks, this training does not involve
gradient descent but is done in closed form: a linear regression of the final reservoir states onto the desired target
values yields the output weights and bias.
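The training procedure described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the tanh activation, the uniform initialization range, the sparsity level, the spectral-radius rescaling and all function names (init_reservoir, final_states, fit_readout) are assumptions chosen for illustration, and the data are random toy values.

```python
import numpy as np

def init_reservoir(n_inputs, n_units, sparsity=0.2, spectral_radius=0.9, seed=0):
    """Randomly initialise input and reservoir weights; these are never trained."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    b = rng.uniform(-0.5, 0.5, size=n_units)
    W_res = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Keep only a sparse subset of the reservoir connections (assumed sparsity level).
    W_res = W_res * (rng.random((n_units, n_units)) < sparsity)
    # Rescale so that the largest absolute eigenvalue equals spectral_radius (assumed value).
    W_res = W_res * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W_res))))
    return W_in, W_res, b

def final_states(X, W_in, W_res, b):
    """Feed samples of shape (n_samples, n_steps, n_inputs) through the reservoir.

    The number of recurrent steps equals the input length; only the final
    reservoir state of each sample is returned.
    """
    states = np.zeros((X.shape[0], W_res.shape[0]))
    for t in range(X.shape[1]):
        states = np.tanh(X[:, t, :] @ W_in.T + states @ W_res.T + b)
    return states

def fit_readout(states, y):
    """Closed-form least-squares fit of the output weights and bias."""
    A = np.hstack([states, np.ones((states.shape[0], 1))])  # extra column for the bias
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

# Toy usage on random data (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 8, 3))   # 40 samples, 8 time steps, 3 input features
y = X[:, -1, 0]                   # hypothetical target: first feature at the last step
W_in, W_res, b = init_reservoir(n_inputs=3, n_units=20)
states = final_states(X, W_in, W_res, b)
w_out, b_out = fit_readout(states, y)
mse = np.mean((states @ w_out + b_out - y) ** 2)
```

In this sketch the only fitted quantities are the 20 output weights and one bias, and the fit is a single least-squares solve rather than an iterative optimization.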
This makes ESN models extremely powerful, since they require only a very small number of trainable parameters (the
output weights and bias). In addition, training an ESN is easy and fast, and leads to stable and reproducible results.
This makes them especially suitable for applications in the domain of climate and ocean research.
But as long as ESNs remain black boxes, trust in the obtained results is low, and domain experts are likely to reject
these kinds of models. This can be overcome by transferring techniques developed in computer
vision for image data to climate data. Layer-wise relevance propagation (LRP) is a technique to trace the
final prediction of a multilayered neural network back through its layers until reaching the input space [18], [19]. When
applied to image classification, this reveals which input pixels have the highest relevance for the
model's decision.
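To make the idea of tracing a prediction back concrete, the following sketch redistributes relevance through a single dense layer using the epsilon rule, one common LRP variant. It is a generic illustration, not the ESN-specific customization developed in this work; the function name lrp_dense, the weight layout and the toy numbers are assumptions.

```python
import numpy as np

def lrp_dense(a, W, b, relevance_out, eps=1e-9):
    """Propagate relevance from the outputs of a dense layer back to its inputs.

    a: input activations, shape (n_in,)
    W: weights, shape (n_in, n_out); b: biases, shape (n_out,)
    relevance_out: relevance assigned to each output unit, shape (n_out,)
    """
    z = a @ W + b                                # pre-activations of the output units
    z = z + eps * np.where(z >= 0, 1.0, -1.0)    # stabiliser to avoid division by zero
    s = relevance_out / z
    return a * (W @ s)                           # each input's share of the relevance

# Toy example: three inputs, two output units, zero bias.
a = np.array([1.0, 2.0, 0.5])
W = np.array([[0.5, -1.0],
              [0.25, 0.5],
              [1.0, 1.0]])
b = np.zeros(2)
z_out = a @ W + b
# Start from the score of output unit 0 and ignore unit 1.
relevance_out = np.array([z_out[0], 0.0])
relevance_in = lrp_dense(a, W, b, relevance_out)
```

With zero bias, the relevance arriving at the inputs sums (up to the stabiliser) to the relevance that was put in at the output, which is the conservation property that makes the resulting input-space maps interpretable.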
Toms, Barnes and Ebert-Uphoff have shown [20] that LRP can be successfully applied to MLPs used for
classification of events related to a well-known mode of Earth system variability: the El Niño Southern Oscillation (ENSO).
This work is inspired by [20] and goes beyond it. We also pick the well-known ENSO problem [21]. ENSO
has a strong zonal structure: it comes with anomalies in the sea surface temperature (SST) in the tropical
Pacific. This phenomenon is confined to a fairly narrow range of latitudes but extends over a wide range of longitudes.
We use ESN models for image classification on SST anomaly fields. We then open the black box and apply LRP to
ESN models, which, to the best of our knowledge, has not been done before.
The SST anomaly fields used in this work are noisy. For this reason we focus on a special flavour of ESNs that
uses a leaky reservoir, since leaky reservoirs are considered more powerful on noisy input data than standard
ESNs [22]. With the help of our LRP application to ESNs, we find the leak rate used in the reservoir state transition to be a
crucial parameter that determines the memory of the reservoir. The leak rate needs to be chosen appropriately to enable ESN
models to reach the desired high level of accuracy.
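The link between leak rate and memory can be sketched as follows. The update rule below assumes the common convention in which the leak rate scales the new activation and one minus the leak rate scales the previous state; the convention, the tanh activation and the function names are assumptions and may differ from the state transition defined in Section 2. The second function is a deliberate simplification: it only tracks how strongly the update computed at each time step is still present in the final state, ignoring the additional memory carried by the recurrent weights.

```python
import numpy as np

def leaky_step(state, u, W_in, W_res, b, leak_rate):
    """One leaky state transition: blend the previous state with the new activation."""
    update = np.tanh(W_in @ u + W_res @ state + b)
    return (1.0 - leak_rate) * state + leak_rate * update

def update_weight_profile(n_steps, leak_rate):
    """Weight with which the update from each time step enters the final state.

    Unrolling the leaky blend gives leak_rate * (1 - leak_rate) ** (n_steps - 1 - t)
    for the update computed at step t (t = 0 is the earliest step).
    """
    t = np.arange(n_steps)
    return leak_rate * (1.0 - leak_rate) ** (n_steps - 1 - t)

# Compare a small and a large leak rate over 10 time steps.
slow = update_weight_profile(10, leak_rate=0.1)  # early steps keep noticeable weight
fast = update_weight_profile(10, leak_rate=0.9)  # final state dominated by the last steps
```

Under these assumptions, a small leak rate spreads the final state over many past time steps (long memory), while a leak rate close to one makes the final state depend almost entirely on the most recent inputs.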
Our models yield competitive results compared to linear regression and MLPs used as baselines. However, ESN models
require significantly fewer trainable parameters and are hence less prone to overfitting. We even find our reservoirs to be
robust against random permutation of input fields, which destroys the zonal structure in the underlying ENSO anomalies.
This opens the door to using ESNs on unsolved problems in climate and ocean science and to applying further
techniques from the xAI toolbox [23].
The rest of this work is structured as follows. In Section 2 we briefly introduce basic ESNs and focus on the reservoir state
transition for leaky reservoirs. We then sketch an efficient way to use ESN models for image classification. Section 3
outlines the concept of LRP in general before we customize LRP for our base ESN models by unfolding the reservoir
recurrence. The classification of ENSO patterns and the application of LRP to ESN models are presented in Section
4. Our models are not only found to be competitive classifiers but also reveal valuable insights into what the models
have learned. We show the robustness of our models on randomly permuted input samples and visualize how the leak rate
determines the reservoir memory. Discussion and conclusions are found in Section 5, followed by technical details on the
ESN and baseline models in the Appendix.