line. To overcome this challenge, we leverage DeepReach [6]
– a reachability toolbox that builds upon recent advances in
neural partial differential equation (PDE) solvers to compute
high-dimensional reachable sets. DeepReach along with a
parameterized FRT allows us to ensure safe human-robot
interaction despite erroneous predictions.
To summarize, the key contributions of this work are: (1)
incorporating confidence estimation in high-capacity human
prediction models, e.g., models based on deep neural net-
works. The proposed framework allows us to exploit the
predictive power of these models to plan efficient robot
trajectories, yet ensure safety when the predictions cannot be
trusted; (2) developing a Hamilton-Jacobi reachability frame-
work to update the model confidence and the corresponding
safety assurances online for safe human-robot interaction.
II. RELATED WORK
Human Modeling and Prediction. A common viewpoint is that humans are rational agents, i.e., that they act with intent. A widely used model in human-robot interaction domains is the Boltzmann model, which captures the notion that humans are exponentially more likely to choose actions that maximize some reward function [1], [7], [8].
However, reward functions that incorrectly specify human
intent can lead to overly confident incorrect predictions.
Furthermore, the reward functions used to model the human’s
goals often fail to capture semantic information that impacts
human decision-making. Specifically, in contexts such as
autonomous driving, semantic information like stop signs or
crosswalks shapes how humans make decisions. One way to
leverage semantic information and make predictions in the
continuous action space is to use a neural network-based
human model. These models have enabled inference and
planning around human arm motion [9], [10], navigation
[11], [12], and autonomous driving [13]–[15] (see [16] for
a survey). However, data-driven approaches are in general
subject to incorrect predictions in scenarios not captured in
the training data. In this work, our goal is to ensure safe
human-robot interaction despite erroneous predictions.
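As an illustration of the Boltzmann model discussed above, the following minimal sketch computes the resulting action distribution. The reward values and the inverse-temperature parameter `beta` are hypothetical placeholders; in practice the reward function would encode the inferred human intent.

```python
import numpy as np

def boltzmann_action_probs(rewards, beta=1.0):
    """Boltzmann-rational action distribution: P(u) proportional to exp(beta * R(u)).

    `rewards` is an array of rewards for each candidate human action;
    `beta` (often interpreted as model confidence or rationality) controls
    how sharply the distribution concentrates on high-reward actions.
    """
    logits = beta * np.asarray(rewards, dtype=float)
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Three candidate actions with rewards [1.0, 2.0, 0.5]:
# as beta -> 0 the distribution approaches uniform (low confidence);
# for large beta it concentrates on the reward-maximizing action.
probs = boltzmann_action_probs([1.0, 2.0, 0.5], beta=2.0)
```

Note that a misspecified reward function shifts probability mass toward the wrong actions regardless of `beta`, which is exactly the failure mode of overconfident incorrect predictions described above.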
Safe Motion Planning. The notion of safety in the context of
human-robot interaction is well studied [17], [18]. Works in
[7], [19] use backward reachability to find the set of unsafe
states and utilize them within a model predictive control
framework to plan efficient trajectories. Although some of
these works [7] track human model confidence, the safety-
enforcing backward reachable set is typically computed for
a fixed set of parameters. The works in [1], [20]–[22] add flexibility by precomputing a small discrete bank of reachable sets that reflect different potential beliefs of the human model. The system then switches between these reachable sets at runtime based on which one best fits the robot's estimate. However, in practice, the parameters that affect the model predictions (and
subsequently the unsafe set) are not known a priori and must
be observed/estimated online, such as semantic information
in the environment and the model confidence parameter.
Any precomputed bank of reachable sets will suffer from
being overly conservative in such scenarios. In this work,
we propose a method to update such reachable sets online
in an effective fashion.
III. PROBLEM SETUP
We consider a robot operating in a human-occupied space. We assume that the robot has full knowledge of the environment and of both the robot and human states.
A. Agent Dynamics
We model each agent as a dynamical system, where we denote the robot and human states as $x_R \in \mathbb{R}^n$ and $x_H \in \mathbb{R}^m$, respectively. Their individual dynamics and controls are as follows:
$$\dot{x}_i = f(x_i, u_i), \quad i \in \{R, H\}$$
We also let $\xi(\tau; x_i, u_i(\cdot), t)$ denote the agent state at time $\tau$ starting at the state $x_i$ at time $t$ and applying control $u_i(\cdot)$ over the time horizon $[t, \tau]$.
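As a concrete illustration of the trajectory map $\xi$, the sketch below approximates it by Euler integration of the dynamics; the step size and the single-integrator example dynamics are assumptions for illustration only.

```python
import numpy as np

def rollout(f, x0, u_seq, dt):
    """Approximate xi(tau; x0, u(.), t): integrate dx/dt = f(x, u)
    forward from state x0 under the control sequence u_seq, using
    simple Euler steps of size dt. Returns the full state trajectory."""
    traj = [np.asarray(x0, dtype=float)]
    for u in u_seq:
        traj.append(traj[-1] + dt * np.asarray(f(traj[-1], u)))
    return np.stack(traj)

# Example with single-integrator dynamics f(x, u) = u:
# ten steps of dt = 0.1 at unit velocity move the state by about 1.0.
traj = rollout(lambda x, u: u, x0=[0.0], u_seq=[[1.0]] * 10, dt=0.1)
```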
The robot is assumed to have some objective or task, such
as reaching a goal state, that it needs to plan and execute
a trajectory for. While the robot performs its task, it is
imperative for it to never incur any safety violations. We
denote by $\mathcal{C}$ the set of states the robot should avoid to ensure safety, e.g., because they imply physical collisions with the human. In this work, we will compute $\mathcal{C}$ via evaluating a forward reachable tube of the human.
Running example: We introduce a running example for
illustration throughout the paper. We consider a scenario
where an autonomous car is interacting with a human-driven
vehicle at a traffic intersection. We model both agents in this
scenario as extended unicycles with state $x = [x, y, \theta, v]^\top$ and dynamics $\dot{x} = [v\cos\theta,\; v\sin\theta,\; u_1,\; u_2]^\top$. The vehicle controls are given by the steering rate and acceleration. The unicycle model is widely used in the literature for modeling autonomous vehicles [14], [15]. Given a collision radius of $R_{col} = 1.5\,\mathrm{m}$, we define $\mathcal{C}$ as the positions of the autonomous vehicle that are within a distance of $R_{col}$ of the human vehicle.
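The running example above can be sketched in code as follows: one Euler-integration step of the extended unicycle dynamics and a membership check for the collision set $\mathcal{C}$. The step size is an assumption; the collision radius is the $1.5\,\mathrm{m}$ value from the example.

```python
import numpy as np

R_COL = 1.5  # collision radius [m], from the running example

def unicycle_step(state, control, dt=0.1):
    """One Euler step of the extended unicycle:
    state = [px, py, theta, v], control = [steering rate, acceleration]."""
    px, py, theta, v = state
    u1, u2 = control
    return np.array([
        px + dt * v * np.cos(theta),   # x-position advances along heading
        py + dt * v * np.sin(theta),   # y-position advances along heading
        theta + dt * u1,               # heading changes with steering rate
        v + dt * u2,                   # speed changes with acceleration
    ])

def in_collision_set(robot_state, human_state):
    """Membership check for C: robot positions within R_col of the human."""
    d = np.hypot(robot_state[0] - human_state[0],
                 robot_state[1] - human_state[1])
    return d < R_COL

next_state = unicycle_step([0.0, 0.0, 0.0, 1.0], [0.0, 0.0])
```

In the full framework, $\mathcal{C}$ is not checked pointwise like this but evaluated over the human's forward reachable tube, as described in the text.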
B. Human Prediction Model
In human-robot interaction scenarios, the robot typically
maintains a model of human behavior in order to aid in the
prediction of their future states. In this work, we are par-
ticularly interested in the settings where the human motion
predictors might be high-capacity models that use semantic
information about the environment as an input (e.g., the
roadgraph and traffic light state in the context of autonomous
driving), along with the human states (and possibly their
history) to generate continuous distributions over human
controls. We assume that at each time step t, the robot has
a prediction for each time step over the prediction horizon
$[t, t+T]$ in terms of a multivariate Gaussian distribution over human control actions:
$$u_H^{t:t+T} \sim \mathcal{N}\left(\mu^{t:t+T}, \Sigma^{t:t+T}\right) \quad (1)$$
Here, $\mu^{t:t+T}$ and $\Sigma^{t:t+T}$ are the vectors and matrices of appropriate dimensions that represent the mean and covariance for the human control actions from time $t$ to $t+T$.
Such prediction representations are common in the literature,
especially when the model is data-driven (e.g., [14] and [15]).
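To make the prediction representation in (1) concrete, the sketch below samples human control sequences from per-step Gaussian predictions. Independence across time steps (a block-diagonal covariance) is an assumption made here for simplicity; the general case would use a single joint covariance over the whole horizon.

```python
import numpy as np

def sample_human_controls(mu, sigma, n_samples=100, rng=None):
    """Sample control sequences u_H ~ N(mu, Sigma) over the horizon.

    mu:    (T, m) array of per-step mean controls.
    sigma: (T, m, m) array of per-step covariance matrices (independence
           across time steps is assumed here for simplicity).
    Returns an (n_samples, T, m) array of sampled control sequences.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    T, m = mu.shape
    samples = np.empty((n_samples, T, m))
    for t in range(T):
        samples[:, t, :] = rng.multivariate_normal(mu[t], sigma[t],
                                                   size=n_samples)
    return samples

# Example: 2-D controls (steering rate, acceleration) over a 5-step horizon,
# with zero-mean predictions and isotropic per-step covariance.
mu = np.zeros((5, 2))
sigma = np.tile(0.1 * np.eye(2), (5, 1, 1))
u_samples = sample_human_controls(mu, sigma, n_samples=1000)
```

Sampled control sequences like these can then be propagated through the human dynamics to obtain predicted state trajectories.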