Interpreting County Level COVID-19 Infection and
Feature Sensitivity using Deep Learning Time
Series Models
Md Khairul Islam 1, Di Zhu 1, Yingzheng Liu 1, Andrej Erkelens 2, Nick Daniello 2, Judy Fox 1,2
1Computer Science Department, University of Virginia
2School of Data Science, University of Virginia
Charlottesville, USA
Email: {mi3se, yqx8es, yl4dt, wsw3fa, njd9e, cwk9mp}@virginia.edu
Abstract—Interpretable machine learning plays a key role in
healthcare because understanding feature importance in deep
learning model predictions is challenging. We propose
a novel framework that uses deep learning to study feature
sensitivity for model predictions. This work combines sensitivity
analysis with heterogeneous time-series deep learning model
prediction, so that the interpretation of spatio-temporal features
reflects what the model has actually learned.
We forecast county-level COVID-19 infection using the Temporal
Fusion Transformer (TFT). We then apply a sensitivity analysis
that extends the Morris method to measure how sensitive the outputs
are to perturbations of our static and dynamic input
features. The significance of the work is grounded in a real-world
COVID-19 infection prediction task with highly non-stationary,
fine-grained, and heterogeneous data. 1) Our model can
capture the detailed daily changes of temporal and spatial model
behaviors and achieve high prediction performance compared to
a PyTorch baseline. 2) By analyzing the Morris sensitivity indices
and attention patterns, we decipher the meaning of feature
importance with observational population and dynamic model
changes. 3) We have collected 2.5 years of socioeconomic and
health features for 3,142 US counties, including observed cases and
deaths, a number of static features (age distribution, health disparity,
and industry), and dynamic features (vaccination, disease spread,
transmissible cases, and social distancing). Using the proposed
framework, we conduct extensive experiments and show our
model can learn complex interactions and perform predictions
for daily infection at the county level. The central idea is to
model disease infection at the county level with a hybrid measure
of prediction accuracy and descriptive accuracy based on the
Morris index, which sheds light on individual feature interpretation
via sensitivity analysis.
Index Terms—Interpretability, County Level COVID-19, Time
Series Deep Learning, TFT, Sensitivity Analysis, Morris Method.
I. INTRODUCTION
Interpretation of machine learning models has recently [1]
led to numerous research applications of AI for social impact.
This includes direct analysis of model components with causal
inference and uncertainty estimation, or the study of sensitivity
to input perturbations. Typically a simpler model is easier
to interpret but can result in lower predictive accuracy. One
natural question that arises is how to interpret these complex
deep learning models, which may describe the data better.
One major challenge of interpretability is the gap between
model prediction accuracy and descriptive accuracy in real-
world problems. The latter can be illustrated by a quantifiable
measurement and explanation of each feature’s importance to the
model’s forecast.
To our knowledge, however, no prior studies have evalu-
ated individual feature importance at the county level using
deep learning and the Morris method. We have been closely
monitoring the scientific literature and identifying reports de-
scribing the community-level impact of COVID-19. A number
of factors contribute to COVID-19 cases and deaths, including
a very diverse set of socioeconomic and geographic-specific
features. A more granular real-time analysis that considers
important county-level factors is lacking and urgently needed.
Furthermore, non-stationary time series (whose distribution
drifts over time) [2], time series with extreme events
[3], and unknown events such as new COVID-19 variants are
particularly challenging to model and interpret.
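As background for the Morris method mentioned above, the following is a minimal Python sketch of a basic one-at-a-time elementary-effect calculation. It is not the authors' extended method or the code in their repository; the function name `morris_sensitivity`, the step size `delta`, and the toy linear model `toy_predict` are assumptions introduced purely for illustration. Each feature is shifted by `delta` while the others are held fixed, and the mean absolute change in the output per unit of perturbation is reported.

```python
import numpy as np


def morris_sensitivity(predict, X, delta=0.01):
    """Mean absolute elementary effect per feature (simplified sketch).

    predict: callable mapping an (n_samples, n_features) array to an
             (n_samples,) array of outputs. Hypothetical stand-in for
             a trained forecasting model.
    X:       baseline inputs, shape (n_samples, n_features).
    delta:   perturbation added to one feature at a time (assumed value).
    """
    X = np.asarray(X, dtype=float)
    baseline = predict(X)
    mu_star = np.zeros(X.shape[1])
    for i in range(X.shape[1]):
        X_pert = X.copy()
        X_pert[:, i] += delta  # perturb only feature i
        effects = (predict(X_pert) - baseline) / delta
        mu_star[i] = np.mean(np.abs(effects))
    return mu_star


def toy_predict(X):
    # Illustrative linear model, not a real forecaster.
    return 3.0 * X[:, 0] - 0.5 * X[:, 1] + 0.0 * X[:, 2]


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 3))
scores = morris_sensitivity(toy_predict, X, delta=0.01)
print(scores)
```

For this toy linear model the scores recover the absolute values of the coefficients, which makes a convenient sanity check before applying the same idea to a trained forecaster.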
To effectively study county-level input features, we design a
novel method that computes the Morris index and generalizes it to
multidimensional spatial and temporal variables. Using a
self-attention-based Temporal Fusion Transformer (TFT) model
[4], we can capture a complex mix and full range of static
and dynamic covariates, known inputs, and other exogenous
time series parameters. We perform individual feature impor-
tance evaluations to identify the most influential features for
prediction and the sensitivity of infected cases. The results
show that the model achieves strong predictive performance and
learns temporal patterns. More significantly, our scaled Morris
index provides a sensitivity measurement for individual features
that can help policymakers develop effective control strategies in
response to the rapidly evolving pandemic. We have made
our code available on GitHub 1. In summary, we make the
following contributions:
• Introduce a measure of the sensitivity of forecasting outputs
to individual features, using an extended Morris method for
multidimensional spatial and temporal data.
1 https://github.com/Data-ScienceHub/gpce-sensitivity
arXiv:2210.03258v1 [cs.LG] 6 Oct 2022