Probabilities Are Not Enough: Formal Controller Synthesis for
Stochastic Dynamical Models with Epistemic Uncertainty
Thom Badings1, Licio Romao2, Alessandro Abate2, Nils Jansen1
1Radboud University, Nijmegen, the Netherlands
2University of Oxford, Oxford, United Kingdom
Abstract
Capturing uncertainty in models of complex dynamical systems is crucial to designing safe controllers. Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters leads to epistemic uncertainty. Several approaches use formal abstractions to synthesize policies that satisfy temporal specifications related to safety and reachability. However, the underlying models exclusively capture aleatoric but not epistemic uncertainty, and thus require that model parameters are known precisely. Our contribution to overcoming this restriction is a novel abstraction-based controller synthesis method for continuous-state models with stochastic noise and uncertain parameters. By sampling techniques and robust analysis, we capture both aleatoric and epistemic uncertainty, with a user-specified confidence level, in the transition probability intervals of a so-called interval Markov decision process (iMDP). We synthesize an optimal policy on this iMDP, which translates (with the specified confidence level) to a feedback controller for the continuous model with the same performance guarantees. Our experimental benchmarks confirm that accounting for epistemic uncertainty leads to controllers that are more robust against variations in parameter values.
1 Introduction
Stochastic models. Stochastic dynamical models capture complex systems where the likelihood of transitions is specified by probabilities (Kumar and Varaiya 2015). Controllers for stochastic models must act safely and reliably with respect to a desired specification. Traditional control design methods use, e.g., Lyapunov functions and optimization to guarantee stability and (asymptotic) convergence. However, alternative methods are needed to give formal guarantees about richer temporal specifications relevant to, for example, safety-critical applications (Fan et al. 2022).
Finite abstractions. A powerful approach to synthesizing certifiably safe controllers leverages probabilistic verification to provide formal guarantees over specifications of safety (always avoid certain states) and reachability (reach a certain set of states). A common example is the reach-avoid specification, where the task is to maximize the probability of reaching desired goal states while avoiding unsafe states (Fisac et al. 2015). Finite abstractions can make continuous models amenable to techniques and tools from formal verification: by discretizing their state and action spaces, abstractions result in, e.g., finite Markov decision processes (MDPs) that soundly capture the continuous dynamics (Abate et al. 2008). Verification guarantees on the finite abstraction can thus carry over to the continuous model. In this paper, we adopt such an abstraction-based approach to controller synthesis.

Figure 1: Aleatoric (stochastic) uncertainty in the wind ($\eta$) causes probability distributions over the outcomes of controls, while epistemic uncertainty in the mass ($m$) of the drone causes state transitions to be nondeterministic.
Probabilities are not enough. The notion of uncertainty is often distinguished into aleatoric (statistical) and epistemic (systematic) uncertainty (Fox and Ülkümen 2011; Sullivan 2015). Epistemic uncertainty is, in particular, modeled by parameters that are not precisely known (Smith 2013). A general premise is that purely probabilistic approaches fail to capture epistemic uncertainty (Hüllermeier and Waegeman 2021). In this work, we aim to reason under both aleatoric and epistemic uncertainty in order to synthesize provably correct controllers for safety-critical applications. Existing abstraction methods fail to achieve this novel, general goal.
Models with epistemic uncertainty. We consider reach-avoid problems for stochastic dynamical models with continuous state and action spaces under epistemic uncertainty described by parameters that lie within a convex uncertainty set. In the simplest case, this uncertainty set is an interval, such as a drone whose mass is only known to lie between 0.75 and 1.25 kg. As shown in Fig. 1, the dynamics of the drone depend on uncertain factors, such as the wind and the drone's mass. For the wind, we may derive a probabilistic model from, e.g., weather data to reason over the likelihood of state dynamics. For the mass, however, we do not have information about the likelihood of each value, so employing a probabilistic model is unrealistic. Thus, we treat epistemic uncertainty in such imprecisely known parameters (in this case, the mass) using a nondeterministic framework instead.
Problem statement. Our goal is to synthesize a controller that (1) is robust against nondeterminism due to parameter uncertainty and (2) reasons over probabilities derived from stochastic noise. In other words, the controller must satisfy a given specification under any possible outcome of the nondeterminism (robustness) and with at least a certain probability regarding the stochastic noise (reasoning over probabilities). We wish to synthesize a controller with a probably approximately correct (PAC)-style guarantee to satisfy a reach-avoid specification with at least a desired threshold probability. Thus, we solve the following problem:
Problem. Given a reach-avoid specification for a stochastic model with uncertain parameters, compute a controller and a lower bound on the probability that the specification is satisfied under any admissible value of the parameters, where this lower bound holds with at least a user-specified confidence probability.
We solve this problem via a discrete-state abstraction of the continuous model. We generate this abstraction by partitioning the continuous state space and defining actions that induce potential transitions between elements of this partition.

Algorithmically, the closest approach to ours is Badings et al. (2022), which uses abstractions to synthesize controllers for stochastic models with aleatoric uncertainty of unknown distribution, but with known parameters. Our setting is more general, as epistemic uncertainty requires fundamental differences in the technical approach, as explained below.
Robustness to capture nondeterminism. The main contribution that allows us to capture nondeterminism is that we reason over sets of potential transitions (as shown by the boxes in Fig. 1), rather than precise transitions as, e.g., in Badings et al. (2022). Intuitively, for a given action, the aleatoric uncertainty creates a probability distribution over sets of possible outcomes. To ensure robustness against epistemic uncertainty, we consider all possible outcomes within these sets. We show that, for our class of models, computing these sets of all possible outcomes is computationally tractable. Building upon this reasoning, we provide the following guarantees to solve the above-mentioned problem.
1) PAC guarantees on abstractions. We show that both probabilities and nondeterminism can be captured in the transition probability intervals of so-called interval Markov decision processes (iMDPs, Givan, Leach, and Dean 2000). We use sampling methods from scenario optimization (Campi, Carè, and Garatti 2021) and concentration inequalities (Boucheron, Lugosi, and Massart 2013) to compute PAC bounds on these intervals. With a predefined confidence probability, the iMDP correctly captures both aleatoric and epistemic uncertainty.
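To make the flavor of such PAC intervals concrete, here is a minimal sketch that bounds a single transition probability from i.i.d. samples using Hoeffding's inequality. This is only an illustrative stand-in: the intervals in this paper come from scenario optimization and tighter concentration results (see Sect. 5), and the function name and numbers below are hypothetical.

```python
import math

def pac_interval(num_hits: int, num_samples: int, beta: float):
    """Two-sided PAC bound via Hoeffding's inequality: with confidence at
    least 1 - beta, the true probability p deviates from the empirical
    frequency by at most eps = sqrt(ln(2/beta) / (2 * num_samples))."""
    freq = num_hits / num_samples
    eps = math.sqrt(math.log(2.0 / beta) / (2.0 * num_samples))
    return max(freq - eps, 0.0), min(freq + eps, 1.0)

# Example: 412 of 1000 sampled successor states landed in a given region.
low, up = pac_interval(412, 1000, beta=0.01)  # interval around 0.412
```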
2) Correct-by-construction control. For the iMDP, we compute a robust optimal policy that maximizes the worst-case probability of satisfying the reach-avoid specification. The iMDP policy is automatically translated to a controller for the original, continuous model on the fly. We show that, by construction, the PAC guarantees on the iMDP carry over to the satisfaction of the specification by the continuous model.
Contributions. We develop the first abstraction-based, formal controller synthesis method that simultaneously captures epistemic and aleatoric uncertainty for models with continuous state and action spaces. We also provide results on the PAC-correctness of the obtained iMDP abstractions and guarantees on the synthesized controllers for a reach-avoid specification. Our numerical experiments in Sect. 6 confirm that accounting for epistemic uncertainty yields controllers that are more robust against deviations in the parameter values.
Related Work
Uncertainty models. Distinguishing aleatoric from epistemic uncertainty is a key challenge towards trustworthy AI (Thiebes, Lins, and Sunyaev 2021), and has been considered in reinforcement learning (Charpentier et al. 2022), Bayesian neural networks (Depeweg et al. 2018; Loquercio, Segù, and Scaramuzza 2020), and systems modeling (Smith 2013). Dynamical models with epistemic parameter uncertainty (but without aleatoric uncertainty) are considered by Yedavalli (2014), and Geromel and Colaneri (2006). Control of models similar to ours is studied by Modares (2022), albeit only for stability specifications.
Model-based approaches. Abstractions of stochastic models are a well-studied research area (Abate et al. 2008; Alur et al. 2000), with applications to stochastic hybrid (Cauchi et al. 2019; Lavaei et al. 2022), switched (Lahijanian, Andersson, and Belta 2015), and partially observable systems (Badings et al. 2021; Haesaert et al. 2018). Various tools exist, e.g., StocHy (Cauchi and Abate 2019), ProbReach (Shmarov and Zuliani 2015), and SReachTools (Vinod, Gleason, and Oishi 2019). However, in contrast to the approach of this paper, none of these papers deals with epistemic uncertainty. Fan et al. (2022) use optimization for reach-avoid control of linear but non-stochastic models with bounded disturbances. Barrier functions are used for cost minimization in stochastic optimal control (Pereira et al. 2020). So-called funnel libraries are used by Majumdar and Tedrake (2017) for robust feedback motion planning under epistemic uncertainty. Finally, Zikelic et al. (2022) learn policies together with formal reach-avoid certificates using neural networks for nonlinear systems with only aleatoric uncertainty.
Data-driven approaches. Models with (partly) unknown dynamics express epistemic uncertainty about the underlying system. Verification of such models based on data has been done using Bayesian inference (Haesaert, den Hof, and Abate 2017), optimization (Kenanian et al. 2019; Vinod, Israel, and Topcu 2022), and Gaussian process regression (Jackson et al. 2020). Moreover, Knuth et al. (2021), and Chou, Ozay, and Berenson (2021) use neural network models for feedback motion planning for nonlinear deterministic systems with probabilistic safety and reachability guarantees. Data-driven abstractions have been developed for monotone (Makdesi, Girard, and Fribourg 2021) and event-triggered systems (Peruffo and Mazo 2023). In contrast to our setting, these approaches consider models with non-stochastic dynamics. A few recent exceptions also consider aleatoric uncertainty (Salamati and Zamani 2022; Lavaei et al. 2023), but these approaches require stricter assumptions (e.g., discrete input sets) than our model-based approach.
Safe learning. While outside the scope of this paper, our approach fits naturally in a model-based safe learning context (Brunke et al. 2022; García and Fernández 2015). In such a setting, our approach may synthesize controllers that guarantee safe interactions with the system, while techniques from, for example, reinforcement learning (RL, Berkenkamp et al. 2017; Zanon and Gros 2021) or stochastic system identification (Tsiamis and Pappas 2019) can reduce the epistemic uncertainty based on state observations. A risk-sensitive RL scheme providing approximate safety guarantees is developed by Geibel and Wysotzki (2005); we instead give formal guarantees at the cost of an expensive abstraction.
2 Problem Statement
The cardinality of a discrete set $\mathcal{X}$ is $|\mathcal{X}|$. A probability space is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ of an arbitrary set $\Omega$, a sigma-algebra $\mathcal{F}$ on $\Omega$, and a probability measure $\mathbb{P}\colon \mathcal{F} \to [0,1]$. The convex hull of a polytopic set $\mathcal{X}$ with vertices $v_1, \dots, v_n$ is $\mathrm{conv}(v_1, \dots, v_n)$. The word controller relates to continuous models; a policy to discrete models.
Stochastic Models with Parameter Uncertainty
To capture parameter uncertainty in a linear time-invariant stochastic system, we consider a model (we extend this model with parameters describing uncertain additive disturbances in Sect. 5) whose continuous state $x_k$ at time $k \in \mathbb{N}$ evolves as

$$x_{k+1} = A(\alpha) x_k + B(\alpha) u_k + \eta_k, \qquad (1)$$

where $u_k \in \mathcal{U}$ is the control input, which is constrained by the control space $\mathcal{U} = \mathrm{conv}(u_1, \dots, u_q) \subset \mathbb{R}^m$, a convex polytope with $q$ vertices, and where the process noise $\eta_k$ is a stochastic process defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Both the dynamics matrix $A(\alpha) \in \mathbb{R}^{n \times n}$ and the control matrix $B(\alpha) \in \mathbb{R}^{n \times m}$ are convex combinations of a finite set of $r \in \mathbb{N}$ known matrices:

$$A(\alpha) = \sum_{i=1}^{r} \alpha_i A_i, \qquad B(\alpha) = \sum_{i=1}^{r} \alpha_i B_i, \qquad (2)$$

where the unknown model parameter $\alpha \in \Gamma$ can be any point in the unit simplex $\Gamma \subset \mathbb{R}^r$:

$$\Gamma = \Big\{ \alpha \in \mathbb{R}^r : \alpha_i \geq 0 \ \forall i \in \{1, \dots, r\},\ \textstyle\sum_{i=1}^{r} \alpha_i = 1 \Big\}.$$
The model in Eq. (1) captures epistemic uncertainty in $A(\alpha)$ and $B(\alpha)$ through the model parameter $\alpha$, as well as aleatoric uncertainty in the discrete-time stochastic process $(\eta_k)_{k \in \mathbb{N}}$.

Assumption 1. The noise $\eta_k$ is independent and identically distributed (i.i.d., which is a common assumption) and has a density with respect to the Lebesgue measure. However, contrary to most definitions, we allow $\mathbb{P}$ to be unknown.

Importantly, being distribution-free, our proposed techniques hold for any distribution of $\eta$ that satisfies Assumption 1.
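As a minimal illustration of Eqs. (1) and (2), the sketch below instantiates $A(\alpha)$ and $B(\alpha)$ as convex combinations of vertex matrices and simulates one step of the dynamics. The dimensions and vertex matrices are assumed for illustration, and the Gaussian noise is only a stand-in for the unknown distribution $\mathbb{P}$ of Assumption 1.

```python
import numpy as np

# Hypothetical vertex matrices A_1, ..., A_r and B_1, ..., B_r; in practice
# they encode interval bounds on physical parameters (see Example 1 below).
A_vertices = [np.array([[1.0, 0.1], [0.0, 0.9]]),
              np.array([[1.0, 0.1], [0.0, 0.8]])]
B_vertices = [np.array([[0.005], [0.1]]),
              np.array([[0.008], [0.13]])]

def instantiate(alpha):
    """A(alpha), B(alpha) as the convex combinations of Eq. (2)."""
    alpha = np.asarray(alpha, dtype=float)
    assert np.all(alpha >= 0) and np.isclose(alpha.sum(), 1.0)  # alpha in Gamma
    A = sum(a * Ai for a, Ai in zip(alpha, A_vertices))
    B = sum(a * Bi for a, Bi in zip(alpha, B_vertices))
    return A, B

def step(x, u, alpha, rng):
    """One step of Eq. (1); Gaussian noise is only a stand-in, since the
    true distribution P is allowed to be unknown (Assumption 1)."""
    A, B = instantiate(alpha)
    return A @ x + B @ u + rng.normal(scale=0.1, size=x.shape)
```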
The matrices $A_i$ and $B_i$ can represent the bounds of intervals over parameters, as illustrated by the following example.

Example 1. Consider again the drone of Fig. 1. The drone's longitudinal position $p_k$ and velocity $v_k$ are modeled as

$$x_{k+1} = \begin{bmatrix} p_{k+1} \\ v_{k+1} \end{bmatrix} = \begin{bmatrix} 1 & \tau \\ 0 & 1 - \frac{0.1\tau}{m} \end{bmatrix} x_k + \begin{bmatrix} \frac{\tau^2}{2m} \\ \frac{\tau}{m} \end{bmatrix} u_k + \eta_k,$$

with $\tau$ the discretization time, and $\mathcal{U} = [-5, 5]$. Assume that the mass $m$ is only known to lie within $[0.75, 1.25]$. Then, we obtain a model as in Eq. (1), with $r = 2$ vertices, where $A_1, B_1$ are obtained for $m := 0.75$, and $A_2, B_2$ for $m := 1.25$.
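Under the assumptions of Example 1, the two vertex systems can be written down directly; the discretization time $\tau = 1$ below is an assumed value.

```python
import numpy as np

tau = 1.0  # discretization time (assumed value for illustration)

def drone_matrices(mass):
    """A and B of Example 1 for a given mass m."""
    A = np.array([[1.0, tau],
                  [0.0, 1.0 - 0.1 * tau / mass]])
    B = np.array([[tau ** 2 / (2 * mass)],
                  [tau / mass]])
    return A, B

# The r = 2 vertices of the uncertainty set, for m = 0.75 and m = 1.25.
A1, B1 = drone_matrices(0.75)
A2, B2 = drone_matrices(1.25)
```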
Reach-Avoid Planning Problem
The goal is to steer the state $x_k$ of Eq. (1) to a desirable state within $K$ time steps while always remaining in a safe region. Formally, let the safe set $\mathcal{Z}$ be a compact subset of $\mathbb{R}^n$, and let $\mathcal{G} \subseteq \mathcal{Z}$ be the goal set (see Fig. 2). The control inputs $u_k$ in Eq. (1) are selected by a time-varying feedback controller:

Definition 1. A time-varying feedback controller $c\colon \mathbb{R}^n \times \mathbb{N} \to \mathcal{U}$ for Eq. (1) is a function that maps a state $x_k \in \mathbb{R}^n$ and a time step $k \in \mathbb{N}$ to a control input $u_k \in \mathcal{U}$.

The space of admissible feedback controllers is denoted by $\mathcal{C}$. The reach-avoid probability $V(x_0, \alpha, c)$ is the probability that Eq. (1), under parameter $\alpha \in \Gamma$ and a controller $c$, satisfies a reach-avoid planning problem with respect to the sets $\mathcal{Z}$ and $\mathcal{G}$, starting from initial state $x_0 \in \mathbb{R}^n$. Formally:

Definition 2. The reach-avoid probability $V\colon \mathbb{R}^n \times \Gamma \times \mathcal{C} \to [0,1]$ for a given controller $c \in \mathcal{C}$ on horizon $K$ is

$$V(x_0, \alpha, c) = \mathbb{P}\big\{ \eta_k \in \Omega : x_k \text{ evolves as per Eq. (1)},\ \exists k' \in \{0, \dots, K\} \text{ such that } x_{k'} \in \mathcal{G}, \text{ and } x_k \in \mathcal{Z},\ u_k = c(x_k, k)\ \forall k \in \{0, \dots, k'\} \big\}.$$
We aim to find a controller for which the reach-avoid probability is above a certain threshold $\lambda \in [0,1]$. However, since parameter $\alpha$ of Eq. (1) is unknown, it is impossible to compute the reach-avoid probability explicitly. We instead take a robust approach, thus stating the formal problem as follows:

Formal problem. Given an initial state $x_0$, find a controller $c \in \mathcal{C}$ together with a (high) probability threshold $\lambda \in [0,1]$, such that $V(x_0, \alpha, c) \geq \lambda$ holds for all $\alpha \in \Gamma$.

Thus, we want to find a controller $c \in \mathcal{C}$ that is robust against any possible instance of parameter $\alpha \in \Gamma$ in Eq. (1). Due to the aleatoric uncertainty in Eq. (1) of unknown distribution, we solve the problem up to a user-specified confidence probability $\beta \in (0,1)$, as we shall see in Sect. 5.
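Although the formal problem is solved via the abstraction developed next, the reach-avoid probability of Definition 2 can be estimated empirically for a fixed $\alpha$ and controller, e.g., to validate a synthesized controller in simulation. A Monte Carlo sketch, reusing the hypothetical step function from above (this estimate carries no formal guarantee by itself):

```python
import numpy as np

def estimate_reach_avoid(x0, alpha, controller, in_goal, in_safe, K,
                         num_runs=10_000, seed=0):
    """Empirical estimate of V(x0, alpha, c) from Definition 2 for a fixed
    alpha; controller(x, k) -> u, and in_goal/in_safe are set predicates."""
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(num_runs):
        x = np.array(x0, dtype=float)
        for k in range(K + 1):
            if in_goal(x):          # reached G at some k' <= K
                successes += 1
                break
            if not in_safe(x):      # left the safe set Z before reaching G
                break
            x = step(x, controller(x, k), alpha, rng)  # one step of Eq. (1)
    return successes / num_runs
```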
Interval Markov Decision Processes
We solve the problem by generating a finite-state abstraction of the model as an interval Markov decision process (iMDP):

Definition 3. An iMDP is a tuple $\mathcal{M}_I = (S, Act, s_I, \mathcal{P})$, where $S$ is a finite set of states, $Act$ is a finite set of actions, $s_I \in S$ is the initial state, and $\mathcal{P}\colon S \times Act \times S \to \mathbb{I} \cup \{[0,0]\}$ is an uncertain partial probabilistic transition function over the intervals $\mathbb{I} = \{[a,b] \mid a, b \in (0,1] \text{ and } a \leq b\}$.

Note that an interval may not have a lower bound of $0$, except for the $[0,0]$ interval. If $\mathcal{P}(s, a, s') = [0,0]$ for all $s' \in S$, action $a \in Act$ is not enabled in state $s \in S$. We can instantiate an iMDP into an MDP with a partial transition function $P\colon S \times Act \rightharpoonup \mathit{Dist}(S)$ by fixing a probability $P(s,a)(s') \in \mathcal{P}(s,a,s')$ for each $s, s' \in S$ and for each $a \in Act$ enabled in $s$, such that $P(s,a)$ is a probability distribution over $S$. For brevity, we write this instantiation as $P \in \mathcal{P}$ and denote the resulting MDP by $\mathcal{M}_I[P]$.

Figure 2: Partition of the safe set $\mathcal{X} \subseteq \mathcal{Z}$ (which excludes obstacles) into regions that define the iMDP states. A state $s_i$ is a goal state, $s_i \in S_G$, if its region is contained in the goal region $\mathcal{G}$.
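For the value-iteration sketch that follows, a minimal container for Definition 3 is useful. The representation below (integer states and actions, a dictionary of interval-valued successor distributions) is our own illustrative choice, not the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class IMDP:
    """Interval MDP (Definition 3), states and actions encoded as integers.
    trans[(s, a)] maps each successor s' to an interval (low, up); an action
    a is enabled in s iff (s, a) is a key of trans."""
    num_states: int
    initial_state: int
    trans: dict = field(default_factory=dict)

    def enabled(self, s):
        return [a for (s2, a) in self.trans if s2 == s]
```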
A deterministic policy (Baier and Katoen 2008) for an iMDP $\mathcal{M}_I$ is a function $\pi\colon S^* \to Act$, where $S^*$ is a sequence of states (memoryless policies do not suffice for time-bounded specifications), with $\pi \in \Pi_{\mathcal{M}_I}$ the admissible policy space. For a policy $\pi$ and an instantiated MDP $\mathcal{M}_I[P]$ for $P \in \mathcal{P}$, we denote by $Pr^\pi(\mathcal{M}_I[P] \models \varphi_{s_I}^K)$ the probability of satisfying a reach-avoid specification $\varphi_{s_I}^K$, i.e., reaching a goal in $S_G \subseteq S$ within $K \in \mathbb{N}$ steps, while not leaving a safe set $S_Z \subseteq S$. (Note that $\varphi_{s_I}^K$ is a reach-avoid specification for an iMDP, while $V(x_0, \alpha, c)$ is the reach-avoid probability on the dynamical model.) A robust optimal policy $\pi^\star \in \Pi_{\mathcal{M}_I}$ maximizes this probability under the minimizing instantiation $P \in \mathcal{P}$; such min-max (and max-min) problems are common in (distributionally) robust optimization, see Ben-Tal, Ghaoui, and Nemirovski (2009), and Wiesemann, Kuhn, and Sim (2014):

$$\pi^\star = \arg\max_{\pi \in \Pi}\ \min_{P \in \mathcal{P}}\ Pr^\pi\big(\mathcal{M}_I[P] \models \varphi_{s_I}^K\big). \qquad (3)$$

We compute an optimal policy in Eq. (3) using a robust variant of value iteration proposed by Wolff, Topcu, and Murray (2012). Note that deterministic policies suffice to obtain optimal values for Eq. (3); see Puggelli et al. (2013).
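A sketch of such a robust value iteration for Eq. (3). The key step is the adversary's inner minimization over distributions constrained to the intervals, which admits a greedy solution: fix every successor probability at its lower bound, then push the remaining mass toward successors with the lowest values. This is a simplified rendition of the cited algorithm, assuming feasible intervals (lower bounds summing to at most 1, upper bounds to at least 1), and it reuses the hypothetical IMDP container above.

```python
def worst_case_value(intervals, values):
    """min sum_i p_i * v_i  s.t.  low_i <= p_i <= up_i and sum_i p_i = 1.
    Greedy: start every p_i at its lower bound, then spend the remaining
    mass on successors in ascending order of value."""
    p = [lo for lo, _ in intervals]
    remaining = 1.0 - sum(p)
    for i in sorted(range(len(values)), key=values.__getitem__):
        extra = min(intervals[i][1] - intervals[i][0], remaining)
        p[i] += extra
        remaining -= extra
    return sum(pi * vi for pi, vi in zip(p, values))

def robust_value_iteration(imdp, goal, unsafe, K):
    """Time-bounded robust VI: V[s] is the worst-case probability of
    reaching `goal` within the remaining steps while avoiding `unsafe`,
    maximized over actions (cf. Eq. (3))."""
    V = [1.0 if s in goal else 0.0 for s in range(imdp.num_states)]
    policy = [[None] * imdp.num_states for _ in range(K)]
    for k in reversed(range(K)):
        V_new = list(V)
        for s in range(imdp.num_states):
            if s in goal or s in unsafe:
                continue  # absorbing: value stays 1 (goal) or 0 (unsafe)
            best, best_a = 0.0, None
            for a in imdp.enabled(s):
                succ = imdp.trans[(s, a)]  # {s': (low, up)}
                q = worst_case_value(list(succ.values()),
                                     [V[t] for t in succ])
                if q > best:
                    best, best_a = q, a
            V_new[s], policy[k][s] = best, best_a
        V = V_new
    return V, policy
```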
3 Finite-State Abstraction
To solve the formal problem, we construct a finite-state abstraction of Eq. (1) as an iMDP. We define the actions of this iMDP via backward reachability computations on a so-called nominal model that neglects any source of uncertainty. We then compensate for the error caused by this modeling simplification in the iMDP's transition probability intervals.
Nominal Model of the Dynamics
To build our abstraction, we rely on a nominal model that neglects both the aleatoric and epistemic uncertainty in Eq. (1), and is thus deterministic. Concretely, we fix any value $\hat{\alpha} \in \Gamma$ and define the nominal model dynamics as

$$\hat{x}_{k+1} = A(\hat{\alpha}) x_k + B(\hat{\alpha}) u_k. \qquad (4)$$

Due to the linearity of the dynamics, we can now express the successor state $x_{k+1}$ with full uncertainty, from Eq. (1), as

$$x_{k+1} = \hat{x}_{k+1} + \delta(\alpha, x_k, u_k) + \eta_k, \qquad (5)$$

with $\delta(\alpha, x_k, u_k)$ being a new term, called the epistemic error, encompassing the error caused by parameter uncertainty:

$$\delta(\alpha, x_k, u_k) = [A(\alpha) - A(\hat{\alpha})] x_k + [B(\alpha) - B(\hat{\alpha})] u_k. \qquad (6)$$

In other words, the successor state $x_{k+1}$ is the nominal one, plus the epistemic error, plus the stochastic noise. Note that for $\alpha = \hat{\alpha}$ (i.e., the true model parameters equal their nominal values), we obtain $\delta(\alpha, x_k, u_k) = 0$. We also impose the next assumption on the nominal model, which guarantees that we can compute the inverse image of Eq. (4) for a given $\hat{x}_{k+1}$, and is used in the proof of Lemma 3 in Appendix A.

Assumption 2. The matrix $A(\hat{\alpha})$ in Eq. (4) is non-singular.

Assumption 2 is mild, as it only requires the existence of a non-singular matrix in $\mathrm{conv}\{A_1, \dots, A_r\}$.
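Since $\delta(\alpha, x_k, u_k)$ in Eq. (6) is affine in $\alpha$, its image over the simplex $\Gamma$ is the convex hull of its values at the vertices $\alpha = e_i$; this is what makes computing the sets of possible outcomes tractable. A minimal sketch, reusing the hypothetical vertex matrices from earlier:

```python
import numpy as np

def epistemic_error_vertices(x, u, A_vertices, B_vertices, alpha_hat):
    """delta(alpha, x, u) of Eq. (6) is affine in alpha, so over Gamma its
    image is the convex hull of its values at the vertices alpha = e_i;
    this returns those r vertex values."""
    A_hat = sum(a * Ai for a, Ai in zip(alpha_hat, A_vertices))
    B_hat = sum(a * Bi for a, Bi in zip(alpha_hat, B_vertices))
    nominal = A_hat @ x + B_hat @ u
    return [Ai @ x + Bi @ u - nominal
            for Ai, Bi in zip(A_vertices, B_vertices)]
```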
States
We create a partition of a subset $\mathcal{X}$ of the safe set $\mathcal{Z}$ on the continuous state space, see Fig. 2. This partition fully covers the goal set $\mathcal{G}$ but excludes unsafe states, i.e., any $x \notin \mathcal{Z}$.

Definition 4. A finite collection of subsets $(\mathcal{P}_i)_{i=1}^{L}$ is called a partition of $\mathcal{X} \subseteq \mathcal{Z} \subset \mathbb{R}^n$ if the following conditions hold:
1. $\mathcal{X} = \bigcup_{i=1}^{L} \mathcal{P}_i$,
2. $\mathcal{P}_i \cap \mathcal{P}_j = \emptyset$, $\forall i, j \in \{1, \dots, L\}$, $i \neq j$.

We append to the partition a so-called absorbing region $\mathcal{P}_0 = \mathrm{cl}(\mathbb{R}^n \setminus \mathcal{X})$, which is defined as the closure of $\mathbb{R}^n \setminus \mathcal{X}$ and represents any state $x \notin \mathcal{X}$ that is disregarded in subsequent reachability computations. We consider partitions into convex polytopic regions, which will allow us to compute PAC probability intervals in Lemma 1 using results from Romao, Papachristodoulou, and Margellos (2022) on scenario optimization programs with discarded constraints:

Assumption 3. Each region $\mathcal{P}_i$ is a convex polytope given by

$$\mathcal{P}_i = \{x \in \mathbb{R}^n : H_i x \leq h_i\}, \qquad (7)$$

with $H_i \in \mathbb{R}^{p_i \times n}$ and $h_i \in \mathbb{R}^{p_i}$ for some $p_i \in \mathbb{N}$, and the inequality in Eq. (7) is to be interpreted element-wise.

We define an iMDP state for each element of $(\mathcal{P}_i)_{i=0}^{L}$, yielding a set of $L+1$ discrete states $S = \{s_i \mid i = 0, \dots, L\}$. Define $T\colon \mathbb{R}^n \to \{0, 1, \dots, L\}$ as the map from any $x \in \mathcal{X}$ to its corresponding region index $i$. We say that a continuous state $x$ belongs to iMDP state $s_i$ if $T(x) = i$. State $s_0$ is a deadlock, such that the only transition leads back to $s_0$.
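A minimal sketch of a uniform rectangular partition satisfying Definition 4 and Assumption 3, together with the region map $T$; the grid layout and 1-based region numbering below are our own illustrative choices.

```python
import numpy as np

class RectangularPartition:
    """Uniform grid over a hyper-rectangle X = [lows, highs); each cell is
    a convex polytope (Assumption 3), and region 0 plays the role of the
    absorbing region P_0."""
    def __init__(self, lows, highs, cells_per_dim):
        self.lows = np.asarray(lows, dtype=float)
        self.highs = np.asarray(highs, dtype=float)
        self.cells = tuple(cells_per_dim)
        self.width = (self.highs - self.lows) / np.asarray(self.cells)

    def T(self, x):
        """Region index of a continuous state x; 0 if x lies outside X."""
        x = np.asarray(x, dtype=float)
        if np.any(x < self.lows) or np.any(x >= self.highs):
            return 0
        idx = np.floor((x - self.lows) / self.width).astype(int)
        return int(np.ravel_multi_index(idx, self.cells)) + 1  # 1..L

# Example: partition [-2, 2] x [-2, 2] into a 10 x 10 grid (L = 100 regions).
part = RectangularPartition([-2, -2], [2, 2], (10, 10))
```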
Actions
Recall that we define the iMDP actions via backward reachability computations under the nominal model in Eq. (4). Let $\mathcal{T} = \{\mathcal{T}_1, \dots, \mathcal{T}_M\}$ be a finite collection of target sets, each of which is a convex polytope, $\mathcal{T}_\ell = \mathrm{conv}\{t_1, \dots, t_d\} \subset \mathbb{R}^n$. Every target set corresponds to an iMDP action, yielding the set $Act = \{a_\ell \mid \ell = 1, \dots, M\}$ of actions. Action $a_\ell \in Act$ represents a transition to $\hat{x}_{k+1} \in \mathcal{T}_\ell$ that is feasible under the nominal model. The one-step backward reachable set $\mathcal{R}^{-1}_{\hat{\alpha}}(\mathcal{T}_\ell)$, shown in Fig. 3, represents precisely those continuous states from which such a direct transition to $\mathcal{T}_\ell$ exists:

$$\mathcal{R}^{-1}_{\hat{\alpha}}(\mathcal{T}_\ell) = \big\{ x \in \mathbb{R}^n \mid \exists u \in \mathcal{U},\ A(\hat{\alpha}) x + B(\hat{\alpha}) u \in \mathcal{T}_\ell \big\}.$$
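By Assumption 2, this set can be computed as $\mathcal{R}^{-1}_{\hat{\alpha}}(\mathcal{T}_\ell) = A(\hat{\alpha})^{-1}\big(\mathcal{T}_\ell \oplus (-B(\hat{\alpha})\,\mathcal{U})\big)$, and since the Minkowski sum of two polytopes is the convex hull of the pairwise sums of their vertices, its vertices can be enumerated directly. A sketch, assuming the resulting point set is full-dimensional so that scipy's ConvexHull applies:

```python
import numpy as np
from scipy.spatial import ConvexHull

def backward_reachable_set(A_hat, B_hat, target_vertices, input_vertices):
    """Vertices of R^{-1}(T) = A^{-1}(T (+) (-B * U)) under the nominal
    model. Every vertex of the Minkowski sum is a pairwise vertex sum, so
    it suffices to enumerate the candidates A^{-1}(t - B u) over all target
    vertices t and input vertices u, then prune interior points."""
    A_inv = np.linalg.inv(A_hat)  # exists by Assumption 2
    candidates = np.array([A_inv @ (t - B_hat @ u)
                           for t in target_vertices
                           for u in input_vertices])
    hull = ConvexHull(candidates)  # prune candidates interior to the hull
    return candidates[hull.vertices]
```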