
Probabilistic Inverse Modeling: An Application in Hydrology
Somya Sharma∗, Rahul Ghosh∗, Arvind Renganathan∗, Xiang Li∗,
Snigdhansu Chatterjee∗, John Nieber∗, Christopher Duffy+, Vipin Kumar∗
Abstract
Rapid advances in inverse modeling methods have
brought to light their susceptibility to imperfect data. The
astounding success of these methods has made it impera-
tive to obtain more explainable and trustworthy estimates
from these models. In hydrology, basin characteristics can
be noisy or missing, impacting streamflow prediction. For
solving inverse problems in such applications, ensuring ex-
plainability is pivotal for tackling issues relating to data
bias and large search space. We propose a probabilistic in-
verse model framework that can reconstruct robust hydrol-
ogy basin characteristics from dynamic input weather driver
and streamflow response data. We address two aspects of
building more explainable inverse models: uncertainty
estimation and robustness. This can help improve the trust
of water managers, improve the handling of noisy data, and
reduce costs. We propose an uncertainty-based learning
method that offers a 6% improvement in $R^2$ for streamflow
prediction (forward modeling) from basin characteristic
estimates inferred by the inverse model, a 17% reduction in
uncertainty (40% in the presence of noise), and a 4% higher
coverage rate for basin characteristics.
1 Introduction
Researchers in scientific communities study engineered
or natural systems and their responses to external
drivers. In hydrology, streamflow prediction is a crucial
research problem for understanding hydrological cycles,
flood mapping, water supply management, and
other operational decisions. For a given entity (river-
basin/catchment), the response (streamflow) is gov-
erned by external drivers (meteorological data) and
complex physical processes specific to each entity (basin
characteristics). Process-based models are commonly
used to study streamflow in river basins (for example,
Soil & Water Assessment Tool). However, these hy-
drological models are constrained by assumptions, con-
tain many parameters that need calibration and incur
enormous computation cost. In addition, these mod-
els are often calibrated on every specific catchment and
thus can require specific fine-tuning for each basin. As
promising alternatives, machine learning (ML) models
are increasingly being used [30] (Figure 1 shows the
diagrammatic representation of this data-driven forward
model). In our study, an entity’s response to external
drivers depends on its inherent properties (called entity
characteristics). For example, for the same amount of
precipitation (external driver), two river basins (entities)
will have very different streamflow (response) values
depending on their land-cover type (entity characteristic)
[38]. Disregarding these inherent properties of entities
can lead to sub-optimal model performance. Knowledge-guided
self-supervised learning (KGSSL) [16] has been proposed to
extract these entity characteristics using the input driver
and output response data.

Figure 1: Forward model using weather drivers $x_i^t$ and
river basin characteristics $z_i^t$ to estimate streamflow
$y_i^t$ [16]

∗University of Minnesota - Twin Cities. {sharm636, ghosh128,
renga016, lixx5000, chatt019, nieber, kumar001}@umn.edu
+Pennsylvania State University. {cxd11}@psu.edu
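
The precipitation example above can be made concrete with a toy sketch. The code below is not the forward model of [16] or [30], which are learned from data; it is a hand-written linear-reservoir model used only to illustrate how one static entity characteristic changes the response to an identical driver. The function name `toy_forward_model` and the parameters `runoff_coeff` (standing in for a land-cover characteristic) and `recession` are hypothetical, as are all numeric values.

```python
import numpy as np

def toy_forward_model(precip, runoff_coeff, recession=0.9):
    """Toy forward model (illustrative assumption, not the paper's model).

    precip       : driver series x^t (precipitation per time step)
    runoff_coeff : static characteristic z in [0, 1]; the share of
                   precipitation that runs off immediately
    recession    : fraction of stored water retained per time step
    Returns the response series y^t (streamflow per time step).
    """
    storage = 0.0
    flow = np.empty(len(precip))
    for t, p in enumerate(precip):
        quick = runoff_coeff * p               # immediate surface runoff
        storage += (1.0 - runoff_coeff) * p    # remainder enters storage
        base = (1.0 - recession) * storage     # slow release from storage
        storage -= base
        flow[t] = quick + base
    return flow

# Same driver, two entities that differ only in their characteristic.
precip = np.array([0.0, 10.0, 0.0, 0.0, 0.0])
urban = toy_forward_model(precip, runoff_coeff=0.8)     # mostly impervious
forested = toy_forward_model(precip, runoff_coeff=0.2)  # mostly infiltrating
```

Under these assumptions the high-runoff basin produces a sharp peak at the storm time step, while the low-runoff basin produces a lower peak followed by a longer recession. A forward model that ignores the characteristic would have to fit both responses with a single mapping from the driver, and the inverse problem corresponds to recovering `runoff_coeff` from the pair (`precip`, flow).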
Developing such entity-aware inverse models re-
quires addressing several challenges. Often, the mea-
sured characteristics are only surrogate variables for the
actual entity characteristics, leading to inconsistencies
and high uncertainty. Uncertainty can arise due to sev-
eral reasons, such as measurement error, missing data,
and temporal changes in characteristics. Moreover, in
real-world applications these characteristics may be es-
sential in modeling the driver-response relation. How-
ever, they may be completely unknown, not well un-
derstood, or not present in the available set of entity
characteristics. A principled method of managing this
uncertainty due to imperfect data can help improve
trust in data-driven decisions made with these
methods.
In this paper, we introduce uncertainty quantifica-
tion in learning representations of static characteristics.
Such a framework can help quantify the effect of multiple
sources of uncertainty that introduce bias and error
in decision-making. For instance, equifinality in
hydrological modeling (different model representations
resulting in the same model results) is a widely known
phenomenon affecting the adoption of hydrology models in
Copyright ©20XX by SIAM
Unauthorized reproduction of this article is prohibited
arXiv:2210.06213v1 [cs.LG] 12 Oct 2022