
In recent years, however, new forecast techniques have become available that use autoregression (AR) methods
to combine the long-term forecasts of mesoscale models with the telemetry data coming from the telescope
sensors. These techniques provide short-term forecasts over a window of 1-4 hours, regularly updated during
the observing night [5].
Short-term AR-aided forecasts provide a large gain, in terms of forecast error, on all the parameters
covered by this kind of prediction [5], although they are limited to a few hours into the future, and they allow
telescope operations to consider a different kind of planning. A first implementation of this strategy is
currently available in the ALTA project.
Other studies have focused on implementing an OT forecast based purely on machine learning (ML), without the
input from an atmospheric model (which is based on physics), relying only on the measurements made available
by the telescope instruments and monitors [9]. These methods seem to be less accurate than the previously
mentioned AR short-term forecasts, and they share the same limitation of a forecast window of very few hours.
However, the implementation of such tools is still very preliminary, and it is worth studying their
performance in order to explore their capabilities. The present paper is dedicated to this latter aspect.
In addition, any knowledge accumulated with these tools could prove instrumental in improving the performance
of the already implemented short-term predictions, with a large benefit for telescope planning and scientific output.
In this contribution we concentrate on the Random Forest (RF) ML algorithm, as it has already been used
for atmospheric forecasts [9]. We investigate the feasibility of forecasting Optical Turbulence
(astroclimatic) parameters (seeing, wavefront coherence time (τ0), isoplanatic angle (θ0) and Ground
Layer Fraction (GLF), that is, the $C_N^2$ fraction at the ground) and atmospheric parameters (temperature,
relative humidity, wind speed and direction) by using only instrument data from the telescope telemetry.
This study focuses on the VLT and makes use of data obtained from the Ambient Condition Database provided
by ESO†, which is one of the most complete collections of telemetry data, covering a wide range of
parameters measured with different sensors; no input from the long-term mesoscale model forecast is used. Specifically,
we are interested in characterizing the behaviour of the ML method with respect to different parameters, such
as the length of the training sample and the temporal sampling frequency of the dataset. We also test two
different applications, a 1-hour and a 2-hour forecast, which are the most relevant for real-time telescope
applications. The aim is to evaluate the reliability of the ML method itself and to pave the way to more complex
applications, which may also make use of input parameters coming from an atmospheric forecast model. We are
interested in identifying and characterizing the constraints imposed by the ML method. For the sake of
simplicity, in this preliminary study we use the root mean square error (RMSE) as the sole indicator of the
forecast performance. Once the methods and the optimal input sets are selected, we will also address the LBT
implementation in future studies. We refer to the ESO database for a detailed explanation of all the parameters treated in this study.
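As an illustration of how such a forecast problem can be framed, the following minimal sketch builds input/target pairs from a single telemetry time series for a chosen forecast horizon, changes the temporal sampling, and scores a forecast with the RMSE. It is not the implementation used in this study: the series is synthetic, all function names and numerical values (10-minute sampling, 6 lagged samples) are assumptions made for the example, and a simple persistence forecast (the last measured value is taken as the prediction) stands in for the RF regressor.

```python
import math
import random


def resample(series, step):
    """Keep every `step`-th sample, i.e. lower the temporal sampling frequency."""
    return series[::step]


def make_supervised(series, n_lags, horizon):
    """Pair the last `n_lags` samples with the value `horizon` steps ahead."""
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        inputs.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return inputs, targets


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_rmse(series, n_lags, horizon):
    """RMSE of a persistence forecast: the prediction is the latest input sample."""
    inputs, targets = make_supervised(series, n_lags, horizon)
    return rmse([window[-1] for window in inputs], targets)


def synthetic_seeing(n_samples, seed=0):
    """Hypothetical seeing-like series (arcsec): a mean-reverting random walk."""
    rng = random.Random(seed)
    values, current = [], 0.8
    for _ in range(n_samples):
        current += 0.1 * (0.8 - current) + rng.gauss(0.0, 0.05)
        values.append(max(current, 0.2))  # keep the toy values positive
    return values


if __name__ == "__main__":
    minutes_per_sample = 10  # assumed sampling after thinning 1-minute data
    series = resample(synthetic_seeing(6000), minutes_per_sample)
    for hours in (1, 2):
        horizon = hours * 60 // minutes_per_sample  # horizon in samples
        score = persistence_rmse(series, n_lags=6, horizon=horizon)
        print(f"{hours}-hour horizon: persistence RMSE = {score:.3f} arcsec")
```

In this framing, the pairs returned by `make_supervised` are what a supervised regressor would be trained on, while the training sample length and the sampling step passed to `resample` are the quantities whose effect on the RMSE is examined.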
2. ALGORITHM AND INPUT SET DEFINITION
ML developed rapidly in the last decades of the 20th century and came into widespread use in the first decades
of the 21st century. The term is used as an umbrella covering different disciplines, from Artificial Intelligence
to Neural Networks and Computational Statistics. In general, we speak of ML techniques when they are based on
algorithms that make use of heterogeneous data to automatically “learn” and build a “model”, which is then used
to produce a desired output by statistical methods. While a general discussion of the several categories of ML
methods is beyond the scope of this paper, in order to perform an atmospheric and OT prediction we are interested
in the broad class of Supervised methods, that is, algorithms that are trained on a given set of inputs and outputs.
Within this general category, algorithms can be divided into Classifiers, that is, methods that predict
categories as an output (e.g. bad/good seeing), and Regressors, which instead produce a real number (e.g. the
value of the seeing). This paper focuses on the Random Forest (RF) algorithm [10], already used in previous
similar studies [9]. The RF algorithm is one of the simplest yet most robust methods that can be implemented, and
†http://archive.eso.org/cms/eso-data/ambient-conditions.html