Data-Driven Sea Ice Forecasting Timofey Grigoryev et al.
new routes through the Arctic will increase the risks of ocean and atmospheric pollution, primarily due to
fishing, oil/gas extraction, and transportation. For delivering natural gas and oil to distant destinations,
transport by deep-sea vessels is more economical than offshore pipelines [10]. To decrease the ocean pollution and
the carbon footprint [11, 12] caused by transportation, gas/oil companies must optimize their routes [13] to make
them faster and to reduce the associated ecological risks (for example, by reducing the use of nuclear icebreakers).
Coupled ocean-ice numerical modeling is the obvious source of a reliable forecast of sea ice conditions. The newest
sea ice models, such as NextSIM [14, 9], demonstrate impressive results in representing sea ice concentration,
thickness, and drift vectors when compared with observational data (OSI SAF SSMI-S [15], AMSR2 [16], GlobICE dataset,
http://www.globice.info). NextSIM is a fully Lagrangian finite-element model, which makes it difficult to couple with
Eulerian ocean models. Eulerian sea ice models have been evolving for the last two decades and can
reproduce some aspects of sea ice and its recent changes. However, detailed comparisons of satellite remote
sensing data with Eulerian-model results reveal large differences in certain aspects of the sea ice cover, e.g., in
fracture zones and small-scale dynamic processes [17, 18]. It remains unclear whether the current model physics
(elastic-viscous-plastic rheology) is suitable for reproducing these observed sea ice deformation features
[19, 20, 21] and for providing a reliable forecast. Furthermore, coupled ocean-ice numerical modeling requires
significant computational resources.
Statistical or data-driven machine learning approaches, on the other hand, are more flexible and lightweight. They
do not need a complex physical model of the processes in the ocean and atmosphere. Once trained, such a
model only needs appropriate recent observations and comparatively little computational resources to make a forecast.
However, training such a model is difficult for several reasons. First, most of the input data used for training
(including sea ice concentration, SIC) come as 3D or even 4D spatiotemporal maps with a large number of highly
correlated input channels. Modern convolutional [22, 23, 24, 25, 26], recurrent [27, 28], or attention-based [29, 30]
architectures have been found to overcome the difficulties associated with the exploding number of trainable
parameters and with overfitting. Second, the model's output is expected to be a consistent SIC forecast retaining the
same spatiotemporal nature, which is hard to guarantee when training on a limited amount of data. To overcome
this difficulty, one can train a model not to predict the data itself but to compensate for the errors of simple
baselines, such as the climatological mean, persistence, or a cell-wise linear trend. Finally, operational climate and
sea ice data have their own peculiarities. They usually consist of several patches obtained at different times each
day, which therefore have to be combined and averaged daily. SIC can only be measured at sea, leaving the land cells
blank. Measurements can come from different sources with different biases, which makes the signal-to-noise ratio lower
than expected. Furthermore, the actual changes in sea ice conditions occur within limited periods in fall and spring,
making more than half of the data barely usable. Considering all of the above, one must be careful when designing
training and testing pipelines and choose proper metrics to assess the obtained solutions adequately.
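The idea of learning a correction to a simple baseline can be illustrated with a minimal NumPy sketch. It is not the
pipeline used in this work: the function names, the array layout (time, height, width), and the SIC range [0, 1] are
assumptions made for illustration only. It shows two of the baselines mentioned above (persistence and a cell-wise
linear trend) and the residual that a correction model would be trained to predict.

```python
import numpy as np

def persistence(history, horizon):
    """Repeat the last observed SIC map for every forecast day."""
    return np.repeat(history[-1:], horizon, axis=0)

def linear_trend(history, horizon):
    """Fit a least-squares line per cell over the history and extrapolate it."""
    n = history.shape[0]
    t = np.arange(n, dtype=float)
    t_centered = t - t.mean()
    mean_map = history.mean(axis=0)
    # Per-cell slope: sum(t_c * (y - mean(y))) / sum(t_c ** 2), result has shape (H, W).
    slope = np.tensordot(t_centered, history - mean_map, axes=(0, 0)) / (t_centered ** 2).sum()
    intercept = mean_map - slope * t.mean()
    future_t = np.arange(n, n + horizon, dtype=float)
    forecast = intercept[None] + slope[None] * future_t[:, None, None]
    # Assumption: SIC is stored as a fraction in [0, 1].
    return np.clip(forecast, 0.0, 1.0)

def residual_target(history, future, baseline):
    """Training target for a correction model: observed future minus the baseline forecast."""
    return future - baseline(history, future.shape[0])
```

At inference time, the final forecast under this scheme would be the baseline forecast plus the predicted residual,
clipped back to the valid SIC range. Land cells stored as NaN simply propagate as NaN through all three functions.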
Many works are dedicated to sea ice forecasting in the Arctic region. However, research in this field mainly focuses on
climate studies rather than operational sea ice forecasts for practical use. A fully connected MLP is often used either
as the primary method for predicting monthly-averaged sea ice concentration [31] or as one of the benchmarks [32, 33].
NSIDC Nimbus-7 SMMR and DMSP SSMI/SSMIS data are used there as SIC maps. Other approaches use a CNN
applied to patches cropped out of ice maps [33], or a random forest (RF) with an additional set of weather input
features from ERA-Interim [34]. In these works, deep learning methods are compared with simpler baselines and reported
to perform significantly better on standard metrics, such as RMSE. Works [35, 36] are of particular interest, as they
consider more advanced deep learning models that seem more suitable for sea ice forecasting. In [35], the authors
consider the ConvLSTM [37] model, which can make full use of the spatiotemporal structure of climatological data.
However, they train on weather maps (predictors) from ERA-Interim and ORAS4 NEMO reanalysis data, which limits the
model's applicability to operational sea ice forecasts. The authors evaluate ConvLSTM at weekly-averaged
and monthly-averaged scales and obtain results comparable in terms of RMSE to those of the ECMWF numerical climate
model only for short lead times. The authors of [36] train a U-Net [25] model to predict, for each of the next
6 months, the probability that the monthly-averaged SIC value in each cell belongs to each of three classes: open
water, marginal ice, and packed ice. They thoroughly investigate the model's properties and compare it with SEAS5, a
numerical ocean-ice model with state-of-the-art sea ice prediction skill. However, the paper does not consider
operating at the daily temporal resolution.
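To make the three-class formulation concrete, the following sketch bins a SIC map into open water, marginal ice, and
packed ice. The thresholds of 0.15 and 0.80, the class codes, and the function name are assumptions chosen for
illustration; the text above does not state the class boundaries used in [36].

```python
import numpy as np

OPEN_WATER, MARGINAL_ICE, PACKED_ICE = 0, 1, 2
LAND = -1  # hypothetical code for cells without a SIC measurement (stored as NaN)

def sic_to_classes(sic, low=0.15, high=0.80):
    """Map SIC fractions to class labels; `low` and `high` are assumed thresholds."""
    # np.digitize returns 0 for sic < low, 1 for low <= sic < high, 2 for sic >= high.
    classes = np.digitize(sic, [low, high])
    return np.where(np.isnan(sic), LAND, classes)
```

A classifier in this setting would output one probability per class and cell, and labels such as these would serve as
the training targets.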
In our work, we focus on operational daily sea ice forecasting and impose the corresponding restrictions on the weather
and sea ice data we use. To our knowledge, only a few papers consider this type of setting. However, all these works
either use non-operational reanalysis data or perform experiments with one or two currently outdated machine learning
methods. For example, [38] demonstrates the potential of machine learning in sea ice forecasting by comparing a
numerical ocean-ice model with a simple CNN and a cell-wise k-NN method. Unlike previous works, it focuses on short-term
predictions with a length of 1-4 weeks. [39] assesses the ability of different cell-wise GRU networks equipped with
a feed-forward encoder and decoder to forecast SIC for up to the next 15 days. To overcome limitations of locality in