Active Predictive Coding: A Unified Neural Framework for Learning
Hierarchical World Models for Perception and Planning
Rajesh P. N. Rao Dimitrios C. Gklezakos Vishwas Sathish
Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle
{rao,gklezd,vsathish}@cs.washington.edu
Abstract
Predictive coding has emerged as a prominent model of how the brain learns through predictions, anticipating the importance accorded to predictive learning in recent AI architectures such as transformers. Here we propose a new framework for predictive coding called active predictive coding which can learn hierarchical world models and solve two radically different open problems in AI: (1) how do we learn compositional representations, e.g., part-whole hierarchies, for equivariant vision? and (2) how do we solve large-scale planning problems, which are hard for traditional reinforcement learning, by composing complex action sequences from primitive policies? Our approach exploits hypernetworks, self-supervised learning and reinforcement learning to learn hierarchical world models that combine task-invariant state transition networks and task-dependent policy networks at multiple abstraction levels. We demonstrate the viability of our approach on a variety of vision datasets (MNIST, FashionMNIST, Omniglot) as well as on a scalable hierarchical planning problem. Our results represent, to our knowledge, the first demonstration of a unified solution to the part-whole learning problem posed by Hinton, the nested reference frames problem posed by Hawkins, and the integrated state-action hierarchy learning problem in reinforcement learning.
1 INTRODUCTION
Predictive coding (Rao and Ballard, 1999; Friston and Kiebel, 2009; Keller and Mrsic-Flogel, 2018; Jiang et al., 2021) has received growing attention in recent years as a model of how the brain learns models of the world through prediction and self-supervised learning. In predictive coding, feedback connections from a higher to a lower level of a cortical neural network (e.g., the visual cortex) convey predictions of lower level responses, and the prediction errors are conveyed via feedforward connections to correct the higher level estimates, completing a prediction-error-correction cycle. Such a model has provided explanations for a wide variety of neural and cognitive phenomena (Keller and Mrsic-Flogel, 2018; Jiang and Rao, 2022).

The layered architecture of the cortex is remarkably similar across cortical areas (Mountcastle, 1978), hinting at a common computational principle, with superficial layers (layers 2-4) receiving and processing sensory information and deeper layers (layer 5) conveying outputs to motor centers (Sherman and Guillery, 2013). The traditional predictive coding model focused primarily on learning visual hierarchical representations and did not acknowledge the important role of actions in learning internal world models.
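To make the prediction-error-correction cycle concrete, here is a minimal single-level sketch in NumPy, loosely in the spirit of the Rao–Ballard formulation; the weight matrix `W`, the learning rates, and the dimensions are illustrative assumptions rather than the exact model used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
n_input, n_latent = 64, 16
W = rng.normal(scale=0.1, size=(n_input, n_latent))  # generative (feedback) weights
r = np.zeros(n_latent)                               # higher level estimate
I = rng.normal(size=n_input)                         # sensory input

lr_r, lr_W = 0.1, 0.01
for _ in range(50):
    prediction = W @ r               # feedback: higher level predicts lower level activity
    error = I - prediction           # feedforward: prediction error
    r += lr_r * (W.T @ error)        # correct the higher level estimate using the error
    W += lr_W * np.outer(error, r)   # Hebbian-like learning of the generative weights

print(float(np.mean((I - W @ r) ** 2)))  # reconstruction error should decrease
```

The inner loop is exactly the cycle described above: feedback prediction, feedforward error, and correction of the higher level estimate.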
In this paper, we introduce Active Predictive Coding, a new model of predictive coding that combines state and action networks at different abstraction levels to learn hierarchical internal models. The model provides a unified framework for solving three important AI problems, as discussed below.
2 RELATED WORK
Part-Whole Learning Problem. Hinton and colleagues have posed the problem of how neural networks can learn to parse visual scenes into part-whole hierarchies by dynamically allocating nodes in a parse tree. They have explored networks that use a group of neurons to represent not only the presence of an object but also parameters such as its position and orientation (Sabour et al., 2017; Kosiorek et al., 2019; Hinton et al., 2018; Hinton, 2021), seeking to overcome the inability of deep convolutional neural networks (CNNs) (Krizhevsky et al., 2012) to explain the images they classify in the way humans do, in terms of objects, parts and their locations. A major open question is how neural networks can learn to create parse trees on-the-fly using learned part-whole representations.
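As an illustration of the kind of representation described above, the sketch below pairs a part's presence with explicit pose parameters, in the spirit of capsule-style models; the field names and the simple 2D pose parameterization are hypothetical simplifications, not the representation used by those papers or by APC.

```python
from dataclasses import dataclass

@dataclass
class PartCapsule:
    """A group of units encoding a part's presence and its pose (illustrative)."""
    presence: float      # probability that the part is present
    x: float             # position of the part in the image (pixels)
    y: float
    orientation: float   # rotation of the part (radians)
    scale: float = 1.0

# A toy "parse": a whole and a part placed relative to it.
face = PartCapsule(presence=0.97, x=32.0, y=32.0, orientation=0.0)
left_eye = PartCapsule(presence=0.95, x=face.x - 8.0, y=face.y - 6.0, orientation=face.orientation)
```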
Reference Frames Problem. In a parallel line of research, Hawkins and colleagues (Hawkins, 2021; Lewis et al., 2019) have taken inspiration from the cortex and "grid cells" to propose that the brain uses object-centered reference frames to represent objects, spatial environments and even abstract concepts. The question of how such reference frames can be learned and used in a nested manner for hierarchical recognition and reasoning has remained open.
Integrated State-Action Hierarchy Learning Problem. A considerable literature exists on hierarchical reinforcement learning (see (Hutsebaut-Buysse et al., 2022) for a recent survey), where the goal is to make traditional reinforcement learning (RL) algorithms more efficient through state and/or action abstraction. A particularly popular approach is to use options (Sutton et al., 1999): abstract actions that can be selected in particular states (those in the option's initiation set) and whose execution carries out a sequence of primitive actions prescribed by the option's lower level policy. A major problem that has received recent attention (e.g., Bacon et al., 2016) is learning options from interactions with the environment. The broader problem of simultaneously learning state and action abstraction hierarchies has remained relatively less explored.
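The options framework referred to above can be summarized with a small sketch. The class layout below is an illustrative rendering of the standard (initiation set, policy, termination) triple from Sutton et al. (1999), not code from this paper; the environment interface `env.step` is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Set

State = int      # illustrative discrete state space
Action = int

@dataclass
class Option:
    """An abstract action in the options framework: (I, pi, beta)."""
    initiation_set: Set[State]                  # states where the option may be invoked
    policy: Callable[[State], Action]           # lower level policy over primitive actions
    termination_prob: Callable[[State], float]  # beta(s): probability of terminating in s

def run_option(env, state: State, option: Option, rng) -> State:
    """Execute an option until its termination condition fires (sketch)."""
    assert state in option.initiation_set
    while True:
        action = option.policy(state)
        state = env.step(state, action)   # assumed environment interface
        if rng.random() < option.termination_prob(state):
            return state
```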
3 CONTRIBUTIONS OF THE PAPER
The proposed active predictive coding (APC) model addresses all three problems above in a unified manner, using state/action embeddings and hypernetworks (Ha et al., 2017) to dynamically generate and generalize over state and action networks at multiple hierarchical levels. The APC model contributes to a number of lines of research which have not been connected before:
Perception, Predictive Coding, and Reference Frame Learning: APC extends predictive coding and related neuroscience models of brain function (Rao and Ballard, 1999; Friston and Kiebel, 2009; Jiang et al., 2021) to hierarchical sensory-motor inference and learning, and connects these to learning nested reference frames (Hawkins, 2021) for perception and cognition.
Attention Networks: APC extends previous hard attention approaches such as the Recurrent Attention Model (RAM) (Mnih et al., 2014) and Attend-Infer-Repeat (AIR) (Eslami et al., 2016) by learning structured hierarchical strategies for sampling the visual scene.
Hierarchical Planning and Reinforcement Learning: APC contributes to hierarchical planning/reinforcement learning research (Hutsebaut-Buysse et al., 2022; Botvinick et al., 2009) by proposing a new way of simultaneously learning abstract macro-actions or options (Sutton et al., 1999) and abstract states.
General Applicability in AI: When applied to vision, the APC model learns to hierarchically represent and parse images into parts and locations. When applied to RL problems, the model can exploit hypernets to (a) define a state hierarchy, not merely through state aggregation, but by abstracting transition dynamics at multiple levels, and (b) potentially generalize learned hierarchical states and actions (options) to novel scenarios via interpolation/extrapolation in the input embedding space of the hypernetworks.

Figure 1: Active Predictive Coding Generative Module: (A) Canonical generative module for the APC model. The lower level functions are generated via hypernetworks based on the current higher level state and action embedding vectors. All functions (in boxes) are implemented as recurrent neural networks (RNNs). Arrows with circular terminations denote generation of function parameters (here, neural network weights and biases). (B) Two-level model used in this paper. (C) Generation of states and actions in the 2-level model based on past states and actions.
Our approach brings us closer to a neural solution to an important challenge in both AI and cognitive science (Lake et al., 2017): how can neural networks learn hierarchical compositional representations that allow new concepts to be created, recognized and learned?
4 ACTIVE PREDICTIVE CODING MODEL
The APC model implements a hierarchical version of the traditional Partially Observable Markov Decision Process (POMDP) (Kaelbling et al., 1998; Rao, 2010), with each level employing a state network and an action network. Figure 1A shows the canonical generative module for the APC model. The module consists of (1) a higher level state embedding vector $r^{(i+1)}$ at level $i+1$, which uses a function $H_s^i$ (implemented as a hypernetwork (Ha et al., 2017))[1] to generate a lower level state transition function $f_s^i$ (implemented as an RNN), and (2) a higher level action embedding vector $a^{(i+1)}$, which uses a function (hypernetwork) $H_a^i$ to generate a lower level option/policy function $f_a^i$ (implemented as an RNN). The state and action networks at the lower level are generated independently (by the higher level state/action embedding vectors) but exchange information horizontally within each level as shown in Figure 1C: the state network generates the next state prediction based on the current state and action, while the action network generates the current action based on the current state and previous action. In our current implementation, the lower level RNNs execute for a fixed number of time steps before returning control back to the higher level.[2] For the present paper, we focus on a two-level model (with a top level and bottom level) as shown in Figures 1B-C.

[1] See Supplementary Materials for an alternate neural implementation of the APC model using higher-level embedding-based inputs to lower-level RNNs.

[2] Future implementations will explore the use of termination functions (Sutton et al., 1999; Eslami et al., 2016) to allow a variable number of time steps at each level and for each input.
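As a concrete (hypothetical) illustration of this generative module, the sketch below uses a small hypernetwork to map a higher level embedding to the weights of a lower level recurrent state-transition cell. The layer sizes, the simple tanh RNN cell, and all variable names are assumptions made for illustration, not the exact architecture of the paper.

```python
import torch
import torch.nn as nn

class HyperRNNCell(nn.Module):
    """Lower level RNN cell whose parameters are produced by a hypernetwork H_s
    from a higher level embedding R_t (illustrative sketch)."""

    def __init__(self, embed_dim: int, in_dim: int, hidden_dim: int):
        super().__init__()
        self.in_dim, self.hidden_dim = in_dim, hidden_dim
        n_params = hidden_dim * (in_dim + hidden_dim) + hidden_dim  # W_x, W_h, b
        self.hyper = nn.Sequential(  # H_s: embedding -> parameters theta_s of f_s
            nn.Linear(embed_dim, 256), nn.ReLU(), nn.Linear(256, n_params)
        )

    def forward(self, R_t: torch.Tensor, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        theta = self.hyper(R_t)
        W_x, W_h, b = torch.split(
            theta,
            [self.hidden_dim * self.in_dim, self.hidden_dim * self.hidden_dim, self.hidden_dim],
        )
        W_x = W_x.view(self.hidden_dim, self.in_dim)
        W_h = W_h.view(self.hidden_dim, self.hidden_dim)
        # One step of the dynamically generated state-transition RNN f_s.
        return torch.tanh(x @ W_x.T + h @ W_h.T + b)

# Usage: a new lower level transition function is implied by each higher level state R_t.
cell = HyperRNNCell(embed_dim=32, in_dim=10, hidden_dim=20)
R_t = torch.randn(32)
h = torch.zeros(20)
h = cell(R_t, torch.randn(10), h)
```

The design choice illustrated here is the one the text emphasizes: the higher level does not pass an input to a fixed lower level network, it generates the lower level network itself.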
4.1 Inference in the Active Predictive Coding Model
Inference involves estimating the state and action vectors at multiple levels based on the sequence of inputs produced by interacting with the environment in the context of a particular task or goal. The top level runs for $T_2$ steps (referred to as "macro-steps"). For each macro-step, the bottom level runs for $T_1$ "micro-steps". As shown in Figure 1C, $F_s$ and $F_a$ are the top level state and action networks respectively, and $R_t$, $A_t$ are the recurrent activity vectors of these networks (i.e., the top level state and action embedding vectors) at macro-step $t$. We use the notation $f(\cdot\,;\theta)$ to denote a network parameterized by $\theta = \{W_l, b_l\}_{l=1}^{L}$, the weight matrices and biases for all the layers. The bottom level state and action RNNs are denoted by $f_s(\cdot\,;\theta_s)$ and $f_a(\cdot\,;\theta_a)$, while their activity vectors are denoted by $r_{t,\tau}$ and $a_{t,\tau}$ respectively ($t$ ranges over macro-steps, $\tau$ over micro-steps).
4.1.1 Higher-Level Inference and Reference Frame Generation
At each macro-step $t$, the top level state RNN $F_s$ produces a new state embedding vector $R_t$ based on the previous state and action embedding vectors. This higher level state $R_t$ defines a new "reference frame" for the lower level to operate over, as follows: $R_t$ is fed as input to the state hypernetwork $H_s$ to generate the lower level parameters $\theta_s(t) = H_s(R_t)$, specifying a dynamically generated bottom-level state RNN characterizing the state transition dynamics locally (e.g., local parts and their transformations in vision (see Section 5), navigation dynamics in a local region of a building (see Section 6)). Figure 2(a) illustrates this top-down generation process.
The current state $R_t$ is also input to the action/policy RNN $F_a$, which outputs an action embedding vector $A_t$ (a macro-action/option/sub-goal) appropriate for the current task/goal given the current state $R_t$ (Figure 2(a)). This embedding vector $A_t$ is used as input to a non-linear function, implemented by the hypernetwork $H_a$, to dynamically generate the parameters $\theta_a(t) = H_a(A_t)$ of the lower-level action RNN, which implements a policy to generate primitive actions suitable for achieving the sub-goal associated with $A_t$. Since reinforcement learning or planning requires exploration, the output of an action network at any level can be regarded as the mean value of a Gaussian with fixed variance (Mnih et al., 2014) or as a categorical representation (Hafner et al., 2021) from which to sample an action.

Figure 2: Inference in the Active Predictive Coding Model: (a) Dynamic generation of bottom-level state RNN $f_s$ and action RNN $f_a$ ("sub-programs") from top-level state vector $R_t$ and action vector $A_t$. This diagram elaborates the one in Figure 1C. (b) Update of top-level state $R_t$ and action $A_t$ based on feedback (via networks $\rho_s$ and $\rho_a$) upon bottom-level sub-program termination.
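To illustrate the stochastic action selection described in this subsection, here is a minimal sketch that treats the action network's output either as the mean of a fixed-variance Gaussian or as categorical logits; the variance value, tensor shapes, and example actions are illustrative assumptions.

```python
import torch
from torch.distributions import Normal, Categorical

def sample_continuous_action(mean: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Treat the action network output as the mean of a Gaussian with fixed variance."""
    return Normal(mean, sigma).sample()

def sample_discrete_action(logits: torch.Tensor) -> torch.Tensor:
    """Treat the action network output as a categorical distribution over primitive actions."""
    return Categorical(logits=logits).sample()

# Usage with dummy outputs from a (hypothetical) lower level action RNN f_a:
a_continuous = sample_continuous_action(torch.tensor([0.3, -0.7]))  # e.g., a 2D glimpse location
a_discrete = sample_discrete_action(torch.zeros(4))                 # e.g., one of 4 movement actions
```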
4.1.2 Lower-Level Inference and Interaction with the Higher Level
At the beginning of each macro-step, the higher-level state $R_t$ is used to initialize the bottom-level state vector via a small feedforward network $\mathrm{Init}_s$ to produce $r_{t,0} = \mathrm{Init}_s(R_t)$. Each micro-step then proceeds in a manner similar to a macro-step. The bottom-level action RNN, which was generated by the higher-level state $R_t$, produces the current action $a_{t,\tau}$ based on the current lower level state and previous action (Figure 2(a), lower right). This action (e.g., an eye movement or body movement) results in a new input being generated by the environment for the bottom (and possibly higher) state network.
To predict this new input, the lower-level state vector $r_{t,\tau}$ is fed to a generic decoder network $D$ to generate the prediction $\hat{I}_{t,\tau}$. This predicted input is compared to the actual input to generate a prediction error $\epsilon_{t,\tau} = I_{t,\tau} - \hat{I}_{t,\tau}$. Following the predictive coding model (Rao and Ballard, 1999), the prediction error is used to update the state vector via the state network: $r_{t,\tau+1} = f_s(r_{t,\tau}, \epsilon_{t,\tau}, a_{t,\tau}; \theta_s(t))$ (Figure 2(a), lower left).
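Putting the pieces of Sections 4.1.1 and 4.1.2 together, the sketch below traces one pass of two-level inference ($T_2$ macro-steps, each with $T_1$ micro-steps). The names `F_s`, `F_a`, `H_s`, `H_a`, `Init_s`, `D`, `f_s`, `f_a`, and `env` mirror the notation above, but their interfaces are assumptions made for illustration; the feedback networks $\rho_s$, $\rho_a$ and all training details are omitted.

```python
def apc_inference(env,                      # environment providing new inputs after each action
                  F_s, F_a,                 # top level state / action networks
                  H_s, H_a,                 # hypernetworks generating lower level parameters
                  Init_s,                   # maps R_t to the initial lower level state r_{t,0}
                  D,                        # decoder predicting the input from r_{t,tau}
                  f_s, f_a,                 # lower level RNN cells taking generated parameters
                  R, A,                     # initial top level state / action embeddings
                  T2: int, T1: int):
    """One pass of two-level APC inference (illustrative sketch; no learning)."""
    for t in range(T2):                     # macro-steps
        R = F_s(R, A)                       # new top level state ("reference frame")
        A = F_a(R, A)                       # macro-action / option / sub-goal
        theta_s, theta_a = H_s(R), H_a(A)   # generate bottom level sub-programs
        r = Init_s(R)                       # initialize bottom level state r_{t,0}
        a = None
        for tau in range(T1):               # micro-steps
            a = f_a(r, a, theta_a)          # primitive action from generated policy
            I = env.step(a)                 # environment returns the next input I_{t,tau}
            err = I - D(r)                  # prediction error epsilon_{t,tau}
            r = f_s(r, err, a, theta_s)     # predictive-coding style state update
        # (feedback to the top level via rho_s, rho_a would occur here)
    return R, A
```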