
2 Nicola Scafetta
CO2 concentration could cause an increase in global surface temperature of about 1◦C. Therefore, only strong positive
climate feedbacks could significantly increase the ECS above such a value, but their existence is still debated.
Constraining the ECS value is an urgent task of climatology. In fact, at least two-thirds of the CMIP6 GCMs could be
severely defective. For example, if models are grouped into low (1.5 < ECS ≤ 3.0 ◦C), medium (3.0 < ECS ≤ 4.5 ◦C) and
high (4.5 < ECS ≤ 6.0 ◦C) sensitivity classes and the actual ECS is, say, less than 3◦C, then the GCMs with ECS > 3◦C should
be ignored. Therefore, it is very important that detailed evaluations of the models are carried out in order to determine
if, where and how the models should improve both on a global scale – as proposed, for example, in this work – and on
regional scales, as done in numerous other studies (e.g.: Heo et al., 2014; Seo et al., 2018, and many others).
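The three-way grouping described above can be sketched in a few lines of Python. This is only an illustration: the function name `ecs_group`, the table `ECS_BINS` and the example ECS values are assumptions introduced here and do not come from any real GCM.

```python
from typing import Optional

# Bin edges follow the text: low (1.5, 3.0], medium (3.0, 4.5], high (4.5, 6.0].
ECS_BINS = [("low", 1.5, 3.0), ("medium", 3.0, 4.5), ("high", 4.5, 6.0)]


def ecs_group(ecs: float) -> Optional[str]:
    """Return the ECS class of a model, or None if ECS lies outside (1.5, 6.0]."""
    for name, lower, upper in ECS_BINS:
        # Each interval is open on the left and closed on the right.
        if lower < ecs <= upper:
            return name
    return None


# Hypothetical ECS values (in degrees C), used only to exercise the function.
example_models = {"model_A": 2.4, "model_B": 3.0, "model_C": 3.9, "model_D": 5.2}
groups = {name: ecs_group(value) for name, value in example_models.items()}
```

Because the intervals are closed on the right, a model with ECS exactly equal to 3.0◦C falls in the low group.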
Constraining ECS also has important policy implications because the expected warming for the 21st century depends
on the value of the model’s ECS (Grose et al., 2017; Scafetta, 2022): the higher the ECS, the greater the expected warming
due to GHG emissions. For example, Huntingford et al. (2020) found that the wide ECS range of CMIP6 GCMs implies
that at thermal equilibrium the global surface temperature could rise by between 1.0◦C and 3.3◦C above the pre-industrial
period (1850-1900) even if anthropogenic emissions ceased today.
Scientists have already wondered whether a strong response to greenhouse gases could be realistic (Voosen, 2019). Indeed,
high ECS CMIP6 models have already been found to perform poorly (e.g.: Ribes et al., 2021; Scafetta, 2022; Tokarska et
al., 2020; Zhu et al., 2020) while the medium and even the low ECS models are being carefully evaluated.
For example, Nijsse et al. (2020) estimated that the most likely ECS interval should be 1.9-3.4◦C, while alternative
studies, often empirically based, have suggested that the actual ECS could be even lower, probably between 1◦C and
2.5◦C (e.g.: Lewis and Curry, 2018; Lindzen and Choi, 2011; Scafetta, 2013; Stefani, 2021; van Wijngaarden and Happer,
2020). Most GCMs seem to overestimate the observed surface warming since 1980 (Scafetta, 2021b, 2022) and also that
observed in the global (McKitrick and Christy, 2020) and tropical troposphere (Mitchell et al., 2020), in particular at
its top (200-300 hPa) where the CMIP6 GCMs predict an unobserved hot spot (McKitrick and Christy, 2018). A similar
situation also occurred with the previous CMIP3 and CMIP5 GCMs (Fu et al., 2011; Scafetta, 2012a, 2013). Actually, as
Knutti et al. (2017) acknowledged, there is a dichotomy between the observed and modeled ECS as GCMs tend to favor
sensitivity values at the top of the probable range, while several studies based on instrumentally recorded warming and
some from paleoclimate favor values in the lower part of the range. Therefore, not only the models with high ECS, but
also those with medium ECS should be and are being seriously questioned.
Scafetta (2021a, 2022) showed that the performance of the GCMs improves as their ECS decreases and, in any case,
the low ECS GCMs appear to be the best performing models. However, even low-ECS GCMs need further evaluation
because biases in some regions (e.g. on land) could be offset by opposite biases in other regions (e.g. on ocean). Further-
more, serious uncertainties remain in the solar forcing and in the temperature records themselves (Connolly et al., 2021;
D’Aleo, 2016). These uncertainties call into question the warming trend reported by the available climate records and, directly or
indirectly, the models themselves. Finally, the climate system seems to be regulated by various natural oscillations, from the
decadal to the millennial scales, which the GCMs are unable to reproduce and whose presence would also imply low
ECS values, probably between 1 and 2◦C (Scafetta, 2012a, 2013, 2021c).
Focusing on the performance of the CMIP6 GCMs, Scafetta (2022) proposed that the probable ECS range could be
constrained by testing statistically which GCM group – low, medium or high ECS – best reproduces the observed
global surface warming between the 1980-1990 and 2011-2021 periods, as reported by ERA5-T2m (Hersbach et al., 2020;
Simmons et al., 2021). The period 1980-2021 was chosen because it is optimally covered by all available climatic tempera-
ture records. Scafetta (2022) analyzed the “average” simulations provided by the Koninklijk Nederlands Meteorologisch
Instituut (KNMI) Climate Explorer (Oldenborgh, 2020) of 38 CMIP6 GCMs with three Shared Socioeconomic Pathway
(SSP) emission scenarios, which also allowed a partial evaluation of the internal variability of the models. The low
ECS GCM group was found to be perfectly compatible, at least on a global scale, with the 2011-2021 warming relative to
the 1980-1990 period. Conversely, both the medium and high ECS GCM groups showed warming trends that were too high.
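The warming metric used in this comparison, the difference between the 2011-2021 and the 1980-1990 mean temperatures, could be computed along the following lines. This is a minimal sketch, not the procedure of Scafetta (2022): the function names are assumptions, the input is assumed to be an annual-mean series, and the temperature series below is synthetic (a plain linear trend) rather than ERA5-T2m or GCM output.

```python
from statistics import mean


def period_mean(years, temps, start, end):
    """Mean of the annual values whose year lies in [start, end], inclusive."""
    values = [t for y, t in zip(years, temps) if start <= y <= end]
    if not values:
        raise ValueError(f"no data between {start} and {end}")
    return mean(values)


def warming_between(years, temps, base=(1980, 1990), recent=(2011, 2021)):
    """Warming of the recent period relative to the base period."""
    return period_mean(years, temps, *recent) - period_mean(years, temps, *base)


# Synthetic annual anomalies: a linear trend of 0.02 C/year starting in 1980.
years = list(range(1980, 2022))
temps = [0.02 * (y - 1980) for y in years]
delta = warming_between(years, temps)  # about 0.62 C for this synthetic trend
```

The same function can be applied to an observational record and to each model simulation, so that the resulting values are directly comparable.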
A possible objection to the analysis proposed in Scafetta (2022) is that temperature records should be compared
with actual members of the CMIP6 GCM ensemble instead of their ensemble averages because the unforced internal
variability of the models produces different results due to uncertainties in the initial conditions as well as in the internal
parameters of the models. This problem will be addressed in this paper considering that:
1. physical models, including the GCMs, should be accurate and precise (see Appendix B);
2. there are still open issues regarding the reliability of the available global surface temperature records.
In fact, theoretical models must reproduce observations within a reasonably small error. In our case, it should be evident
that the poor precision of a GCM cannot be used as a pretext to justify its poor accuracy. For example, a low-precision
model could produce a very wide range of different hindcasts due to its internal variability. In this situation, even if
some of its hindcasts fit the observations, the result should still be considered unsatisfactory if the mean of the GCM set
diverges too much from the actual data. Similarly, if the GCMs of a given ECS group produce a set of hindcasts that encompasses
the observations too sparsely, the ECS values that characterize that group should be considered unrealistic even though
some of the models in the same group might perform better than others. In general, the accuracy, precision and ECS
category of the GCMs must be evaluated simultaneously.
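The distinction between accuracy and precision drawn above can be made concrete with a short numerical sketch. It assumes one common reading of the two terms, which may differ in detail from the definitions in Appendix B: accuracy is measured by the bias of the ensemble mean relative to the observation, and precision by the sample standard deviation of the hindcasts. The function name and all numbers are hypothetical.

```python
from statistics import mean, stdev


def accuracy_precision(hindcasts, observed):
    """Return (bias, spread) for a set of hindcasts of one observed quantity.

    bias   = ensemble mean minus observation (small |bias| = accurate)
    spread = sample standard deviation of the hindcasts (small = precise)
    """
    bias = mean(hindcasts) - observed
    spread = stdev(hindcasts)
    return bias, spread


# Hypothetical low-precision ensemble: warming values in degrees C.
hindcasts = [0.2, 0.6, 1.0, 1.4]
observed = 0.5

bias, spread = accuracy_precision(hindcasts, observed)
# Distance of the best single member from the observation.
closest_miss = min(abs(h - observed) for h in hindcasts)
```

In this example one member lies within about 0.1◦C of the observation, yet the ensemble mean is biased by about 0.3◦C and the spread exceeds 0.5◦C, which is the situation the text describes as unsatisfactory.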