Goal Recognition as a Deep Learning Task: the GRNet Approach
Mattia Chiari,1 Alfonso E. Gerevini,1 Luca Putelli,1 Francesco Percassi,2 Ivan Serina1
1Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Brescia, Via Branze 38, Brescia, Italy
2School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, United Kingdom
{m.chiari017, alfonso.gerevini, luca.putelli1, ivan.serina}@unibs.it
f.percassi@hud.ac.uk
Abstract
In automated planning, recognising the goal of an agent from a trace of observations is an important task with many applications. The state-of-the-art approaches to goal recognition rely on the application of planning techniques, which requires a model of the domain actions and of the initial domain state (written, e.g., in PDDL). We study an alternative approach where goal recognition is formulated as a classification task addressed by machine learning. Our approach, called GRNet, is primarily aimed at making goal recognition more accurate as well as faster by learning how to solve it in a given domain. Given a planning domain specified by a set of propositions and a set of action names, the goal classification instances in the domain are solved by a Recurrent Neural Network (RNN). A run of the RNN processes a trace of observed actions to compute how likely it is that each domain proposition is part of the agent’s goal, for the problem instance under consideration. These predictions are then aggregated to choose one of the candidate goals. The only information required as input of the trained RNN is a trace of action labels, each one indicating just the name of an observed action. An experimental analysis confirms that GRNet achieves good performance in terms of both goal classification accuracy and runtime, obtaining better performance w.r.t. a state-of-the-art goal recognition system over the considered benchmarks.
Introduction
Goal Recognition is the task of recognising the goal that an agent is trying to achieve from observations about the agent’s behaviour in the environment (Van-Horenbeke and Peer 2021; Geffner 2018). Typically, such observations consist of a trace (sequence) of executed actions in an agent’s plan to achieve the goal, or a trace of world states generated by the agent’s actions, while an agent goal is specified by a set of propositions. Goal recognition has been studied in AI for many years, and it is an important task in several fields, including human-computer interaction (Batrinca et al. 2016), computer games (Min et al. 2016), network security (Mirsky et al. 2019), smart homes (Harman and Simoens 2019), financial applications (Borrajo, Gopalakrishnan, and Potluru 2020), and others.
In the literature, several systems to solve goal recognition problems have been proposed (Meneguzzi and Pereira 2021). The state-of-the-art approach is based on transforming a plan recognition problem into one or more plan generation problems solved by classical planning algorithms (Ramírez and Geffner 2009; Pereira, Oren, and Meneguzzi 2020; Sohrabi, Riabov, and Udrea 2016). In order to perform planning, this approach requires domain knowledge consisting of a model of each agent action, specified as a set of preconditions and effects, and a description of an initial state of the world in which the agent performs the actions. The computational efficiency (runtime) largely depends on the performance of the planning algorithm, which can be inadequate in contexts demanding fast goal recognition (e.g., real-time/online applications).[1]
In this paper, we investigate an alternative approach in which the goal recognition problem is formulated as a classification task, addressed through machine learning, where each candidate goal (a set of propositions) of the problem can be seen as a distinct class. The primary aim is to make goal recognition more accurate as well as faster by learning how to solve it in a given domain. Given a planning domain specified by a set of propositions and a set of action names, we tackle the goal classification instances in the domain through a Recurrent Neural Network (RNN). A run of our RNN processes a trace of observed actions to compute how likely it is that each domain proposition is part of the agent’s goal, for the problem instance under consideration. These predictions are then aggregated through a goal selection mechanism to choose one of the candidate goals.
The proposed approach, that we call GRNet, is generally faster than the model-based approach to goal recognition based on planning, since running a trained neural network can be much faster than plan generation. Moreover, GRNet operates with minimal information, since the only information required as input for the trained RNN is a trace of action labels (each one indicating just the name of an observed action), and the initial state can be completely unknown.
The RNN is trained only once for a given domain, i.e., the same trained network can be used to solve a large set of goal recognition instances in the domain. On the other hand, as usual in supervised learning, a (possibly large) dataset of solved goal recognition instances for the domain under consideration is needed for training. When such data are unavailable or scarce, they can be synthesised via planning.
[1] Deciding plan existence in classical planning is PSPACE-complete (Bylander 1994).
In such a case, the resulting overall system can be seen as a combined approach (model-based for generating the training data, and model-free for the goal classification task) that outperforms the pure model-based approach in terms of both classification accuracy and classification runtime. Indeed, an experimental analysis presented in the paper confirms that GRNet achieves good performance in terms of both goal classification accuracy and runtime, obtaining consistently better performance with respect to a state-of-the-art goal recognition system over a class of benchmarks in six planning domains.
In the rest of the paper, after giving background and preliminaries about goal recognition and LSTM networks, we describe the GRNet approach; then we present the experimental results; finally, we discuss related work and give the conclusions.
Preliminaries
We describe the problem of goal recognition, starting with its relation to activity/plan recognition, and we give the essential background on Long Short-Term Memory networks and the attention mechanism.
Activity, Goal and Plan Recognition
Activity, plan, and goal recognition are related tasks (Geib and Pynadath 2007). Since in the literature they are sometimes not clearly distinguished, we begin with an informal definition of them, following (Van-Horenbeke and Peer 2021).
Activity recognition concerns analyzing temporal sequences of (typically low-level) data generated by humans, or other autonomous agents acting in an environment, to identify the corresponding activity that they are performing. For instance, data can be collected from wearable sensors, accelerometers, or images to recognize human activities such as running, cooking, driving, etc. (Vrigkas, Nikou, and Kakadiaris 2015; Jobanputra, Bavishi, and Doshi 2019).
Goal recognition (GR) can be defined as the problem of identifying the intention (goal) of an agent from observations about the agent’s behaviour in an environment. These observations can be represented as an ordered sequence of discrete actions (each one possibly identified by activity recognition), while the agent’s goal can be expressed either as a set of propositions or as a probability distribution over alternative sets of propositions (each one forming a distinct candidate goal).
Finally, plan recognition is more general than GR, and concerns both recognising the goal of an agent and identifying the full ordered set of actions (plan) that have been, or will be, performed by the agent in order to reach that goal; like GR, plan recognition typically takes as input a set of observed actions performed by the agent (Carberry 2001).
Model-based and Model-free Goal Recognition
In the approach to GR known as “goal recognition over a domain theory” (Ramírez and Geffner 2010; Van-Horenbeke and Peer 2021; Santos et al. 2021; Sohrabi, Riabov, and Udrea 2016), the available knowledge consists of an underlying model of the behaviour of the agent and its environment. Such a model represents the agent/environment states and the set of actions A that the agent can perform; typically it is specified by a planning language such as PDDL (McDermott et al. 1998). The states of the agent and environment are formalised as subsets of a set of propositions F, called fluents or facts, and each domain action in A is modeled by a set of preconditions and a set of effects, both over F. An instance of the GR problem in a given domain is then specified by:
• an initial state I of the agent and environment (I ⊆ F);
• a sequence O = ⟨obs_1, ..., obs_n⟩ of observations (n ≥ 1), where each obs_i is an action in A performed by the agent;
• a set G = {G_1, ..., G_m} (m ≥ 1) of possible goals of the agent, where each G_i is a set of fluents over F.
The observations form a trace of the full sequence π of actions performed by the agent to achieve the goal. Such a plan trace is a selection of (possibly non-consecutive) actions in π, ordered as in π. Solving a GR instance consists of identifying the G ∈ G that is the (hidden) goal of the agent.
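To make the above definitions concrete, the following minimal Python sketch encodes a GR instance as plain data; the type aliases and field names are our own illustration, not a format prescribed by the paper.

from dataclasses import dataclass

Fluent = str         # e.g. "(On Block F Block C)"
ActionLabel = str    # e.g. "(Stack Block C Block B)"

@dataclass
class GRInstance:
    initial_state: frozenset     # I ⊆ F (may be empty/unknown in the model-free setting)
    observations: list           # O = <obs_1, ..., obs_n>, a trace of actions in A
    candidate_goals: list        # G = {G_1, ..., G_m}, each a frozenset of fluents

# Solving the instance means identifying the (hidden) goal among candidate_goals.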
The approach based on a model of the agent’s actions and of the agent/environment states, that we call model-based goal recognition (MBGR), defines GR as a reasoning task addressable by automated planning techniques (Meneguzzi and Pereira 2021; Ghallab, Nau, and Traverso 2016).
An alternative approach to MBGR is model-free goal recognition (MFGR) (Geffner 2018; Borrajo, Gopalakrishnan, and Potluru 2020). In this approach, GR is formulated as a classification task addressed through machine learning. The domain specification consists of a fluent set F and a set of possible actions A, where each action a ∈ A is specified by just a label (a unique identifier for each action).
An MFGR instance for a domain is specified by an observation sequence O formed by action labels and, as in MBGR, a goal set G formed by subsets of F. MFGR requires minimal information about the domain actions, and can operate without the specification of an initial state, which can be completely unknown. Moreover, since running a learned classification model is usually fast, an MFGR system is expected to run faster than an MBGR system based on planning algorithms. On the other hand, MFGR needs a dataset of solved GR instances from which to learn a classification model for the new GR instances of the domain.
Example 1. As a running example, we will use a very simple GR instance in the well-known BLOCKSWORLD domain. In this domain the agent has the goal of building one or more stacks of blocks, and only one block may be moved at a time. The agent can perform four types of actions: Pick-Up a block from the table, Put-Down a block on the table, Stack a block on top of another one, and Unstack a block that is on another one. We assume that a GR instance in the domain involves at most 22 blocks. In BLOCKSWORLD there are three types of facts (predicates): On, that has two blocks as arguments, plus On-Table and Clear that have one argument. Therefore, the fluent set F consists of 22 × 21 + 22 + 22 = 506 propositions. The goal set G of the instance example consists of the two goals G_1 = {(On Block F Block C), (On Block C Block B)} and G_2 = {(On Block G Block H), (On Block H Block F)}; the observation sequence O is ⟨(Pick-Up Block C), (Stack Block C Block B), (Pick-Up Block F)⟩.
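As a quick sanity check of the fluent count above, one can enumerate the ground facts for 22 blocks (illustrative Python; the block naming is ours):

from itertools import permutations

blocks = [f"block{i}" for i in range(22)]
fluents = (
    [f"(On {a} {b})" for a, b in permutations(blocks, 2)]  # 22 * 21 = 462 ordered pairs
    + [f"(On-Table {b})" for b in blocks]                  # 22
    + [f"(Clear {b})" for b in blocks]                     # 22
)
assert len(fluents) == 506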
LSTM and Attention Mechanism
A Long Short-Term Memory network (LSTM) is a particular kind of Recurrent Neural Network (RNN). This kind of deep learning architecture is particularly suitable for processing sequential data like signals or text documents (Hochreiter and Schmidhuber 1997). With respect to the standard RNN, LSTM deals with typical issues such as vanishing gradients and long-term dependencies, obtaining better predictive performance (Gers, Schmidhuber, and Cummins 2000). Let x_1, x_2, ..., x_m be an input time series, where x_t ∈ R^d is the feature vector representing the t-th element of the series, and d is the dimension of each feature vector of the sequence. The long-term and short-term memory states c_t ∈ R^N and h_t ∈ R^N at time step t of the series, respectively, are computed recursively from the values at the previous time step t−1 as follows:

ĉ_t = tanh(W_c [h_{t−1}, x_t] + b_c)    i_t = σ(W_i [h_{t−1}, x_t] + b_i)
f_t = σ(W_f [h_{t−1}, x_t] + b_f)       c_t = i_t ⊙ ĉ_t + f_t ⊙ c_{t−1}
o_t = σ(W_o [h_{t−1}, x_t] + b_o)       h_t = tanh(c_t) ⊙ o_t

where σ denotes the sigmoid activation function and ⊙ corresponds to the element-wise product; W_f, W_i, W_o, W_c ∈ R^{(N+d)×N} are the weight matrices and b_f, b_i, b_o, b_c ∈ R^N are the bias vectors; the vectors in square brackets are concatenated. The weight matrices and bias vectors are typically initialized with the Glorot uniform initializer (Glorot and Bengio 2010), and they are shared by all the cells in the LSTM layer. h_0 and c_0 are initialized as zero vectors.
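For concreteness, the recurrences above translate line by line into the following NumPy sketch (illustrative only; in practice a library implementation such as keras.layers.LSTM is used). We follow the document’s shape convention W ∈ R^{(N+d)×N}, so the concatenated vector multiplies each matrix on the left.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_c, W_i, W_f, W_o, b_c, b_i, b_f, b_o):
    z = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t], shape (N + d,)
    c_hat = np.tanh(z @ W_c + b_c)      # candidate memory content
    i_t = sigmoid(z @ W_i + b_i)        # input gate
    f_t = sigmoid(z @ W_f + b_f)        # forget gate
    o_t = sigmoid(z @ W_o + b_o)        # output gate
    c_t = i_t * c_hat + f_t * c_prev    # long-term memory update
    h_t = np.tanh(c_t) * o_t            # short-term memory (cell output)
    return h_t, c_t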
The attention mechanism (Bahdanau, Cho, and Bengio 2015) is another layer which computes weights representing the contribution of each element of the sequence, and provides a representation of the sequence (also called the context vector) as the weighted average of the outputs (h_t) of the LSTM cells, improving the predictive performance with respect to base LSTM networks. In our system, we use the so-called word attention introduced by Yang et al. (2016) in the context of text classification.
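A minimal sketch of word attention over the LSTM outputs h_1, ..., h_m, following Yang et al. (2016); the parameter names W, b, u and their shapes are our own illustration:

import numpy as np

def word_attention(H, W, b, u):
    """H: (m, N) LSTM outputs, one row per step; W: (N, k); b: (k,); u: (k,)."""
    U = np.tanh(H @ W + b)            # hidden representation of each step, (m, k)
    scores = U @ u                    # one alignment score per step, (m,)
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()       # softmax: attention weights over the steps
    return alpha @ H                  # context vector: weighted average of the h_t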
Goal Recognition through GRNet
In this section we present GRNet, our approach to goal recognition based on deep learning. GRNet, depicted in Figure 1, consists of two main components. The first component takes as input the observations of the GR instance to solve, and gives as output a score (between 0 and 1) for each proposition in the domain proposition set F. This component, called the Domain Component, is general in the sense that it can be used for every GR instance over F (training is performed once for each domain). The second component, called the Instance Component, takes as input the proposition ranks generated by the domain component for a GR instance, and uses them to select a goal from the candidate goal set G.
The Domain Component of GRNet
Given a sequence of observations, represented on the left
side of Figure 1, each action aicorresponding to an ob-
servation is encoded as a vector eiof real numbers by an
embedding layer (Bengio et al. 2003).2In Figure 1, the ob-
served actions are displayed from top to bottom in the or-
der in which they are executed by the agent. The embedding
layer is initialised with random weights, and trained at the
same time with the rest of the domain component.
The index of each observed action is simply the result of
an arbitrary order of the actions that is computed in the pre-
processing phase, only once for the domain under consider-
ation. Please note that two actions aiand ajconsecutively
observed may not be consecutive actions in the full plan of
the agent (the full plan may contain any number of actions
between aiand aj).
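One straightforward way to realise this preprocessing (our assumption; the paper only requires that the order be arbitrary but fixed) is to sort the action names once and map each label to an integer index:

# Hypothetical example with four ground BLOCKSWORLD actions; index 0 is
# reserved for padding shorter observation traces.
domain_actions = sorted([
    "(Pick-Up Block C)", "(Put-Down Block C)",
    "(Stack Block C Block B)", "(Unstack Block C Block B)",
])
action_index = {name: i + 1 for i, name in enumerate(domain_actions)}

def encode_observations(obs):
    """Map a trace of action labels to the integer sequence fed to the embedding layer."""
    return [action_index[a] for a in obs]

print(encode_observations(["(Pick-Up Block C)", "(Stack Block C Block B)"]))  # [1, 3]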
The sequence of embedding vectors is then fed to an LSTM neural network, and the output of each cell is processed by the attention mechanism. After computing a weight for the contribution of each cell, this layer provides a so-called context vector that summarises the information contained in the plan trace. The context vector is then passed to a feed-forward layer, which has N output neurons with sigmoid activation functions. N is the number of the domain fluents (propositions) that can appear in any goal of G for any GR instance in the domain; for our experiments N was set to the size of the domain fluent set F, i.e., N = |F|. The output of the i-th neuron o_i corresponds to the i-th fluent f_i (fluents are lexically ordered), and the activation value of o_i gives a rank for f_i being true in the agent’s goal (with a rank equal to one meaning that f_i is true in the goal). In other words, our network is trained for a multi-label classification problem, where each domain fluent can be considered as a different binary class. As the loss function, we use standard binary cross-entropy.
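Under these design choices, a hedged Keras sketch of the domain component could look as follows; the layer sizes and domain sizes are placeholders (the actual values are selected by the hyperparameter optimisation described next), and WordAttention is our minimal rendering of the attention layer described above:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class WordAttention(layers.Layer):
    """Minimal word attention: a learned weighted average of the LSTM outputs."""
    def build(self, input_shape):
        d = int(input_shape[-1])
        self.W = self.add_weight(shape=(d, d), initializer="glorot_uniform", name="W")
        self.b = self.add_weight(shape=(d,), initializer="zeros", name="b")
        self.u = self.add_weight(shape=(d, 1), initializer="glorot_uniform", name="u")

    def call(self, h):                                      # h: (batch, steps, d)
        v = tf.tanh(tf.tensordot(h, self.W, axes=1) + self.b)
        alpha = tf.nn.softmax(tf.tensordot(v, self.u, axes=1), axis=1)
        return tf.reduce_sum(alpha * h, axis=1)             # context vector, (batch, d)

num_actions, num_fluents = 1000, 506    # hypothetical domain sizes (cf. Example 1)

model = keras.Sequential([
    layers.Embedding(num_actions + 1, 64),                  # index 0 reserved for padding
    layers.LSTM(128, return_sequences=True),                # one output h_t per observation
    WordAttention(),
    layers.Dense(num_fluents, activation="sigmoid"),        # one rank per domain fluent
])
model.compile(optimizer="adam", loss="binary_crossentropy")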
As shown in Figure 1, the dimensions of the input and output of our neural network depend only on the selected domain and some basic information, such as the maximum number of possible output facts that we want to consider. The dimension of the embedding vectors, the dimension of the LSTM layer, and the other hyperparameters of the network are selected using the Bayesian optimisation approach provided by the Optuna framework (Akiba et al. 2019), with a validation set formed by 20% of the training set, while the remaining 80% is used for training the network. More details about the hyperparameters are given in the Supplementary Material.
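A hedged sketch of such a search with Optuna is shown below; the search ranges are placeholders, and build_and_train is a hypothetical helper that trains the domain component on the 80% split and returns the accuracy on the 20% validation split:

import optuna

def objective(trial):
    # Placeholder search ranges, not the paper's actual configuration.
    embedding_dim = trial.suggest_int("embedding_dim", 32, 256)
    lstm_units = trial.suggest_int("lstm_units", 64, 512)
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)
    # build_and_train is hypothetical: it fits the network with these
    # hyperparameters and returns the validation accuracy to be maximised.
    return build_and_train(embedding_dim, lstm_units, learning_rate)

study = optuna.create_study(direction="maximize")  # uses the TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params)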
The Instance-Specific Component of GRNet
After the training and optimisation phases of the domain component, the resulting network can be used to solve any goal recognition instance in the domain through the instance-specific component of our system (right part of Figure 1). This component evaluates the candidate goals in G of the GR instance, using the output of the domain component fed with the observations of the GR instance. To choose the most probable goal in G (solving the multi-class classification task associated with the GR instance), we designed a simple score function that indicates how likely it is that G is the correct goal, according to the ranks computed by the domain component.

[2] https://keras.io/api/layers/core_layers/embedding/
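The excerpt ends here; as one plausible reading of the goal selection step (our assumption, not necessarily the paper’s exact score function), each candidate goal can be scored by averaging the network’s ranks over the fluents the goal contains, and the highest-scoring candidate is selected:

def goal_score(goal, fluent_rank):
    """goal: iterable of fluent indices; fluent_rank: network outputs in [0, 1]."""
    goal = list(goal)
    return sum(fluent_rank[f] for f in goal) / len(goal)

def select_goal(candidate_goals, fluent_rank):
    # Solve the multi-class task by picking the highest-scoring candidate goal.
    return max(candidate_goals, key=lambda g: goal_score(g, fluent_rank))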