
Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning
reference frames to represent objects, spatial environments and even abstract concepts. The question of how such reference frames can be learned and used in a nested manner for hierarchical recognition and reasoning has remained open.
Integrated State-Action Hierarchy Learning Problem.
A considerable literature exists on hierarchical reinforcement learning (see (Hutsebaut-Buysse et al., 2022) for a recent survey), where the goal is to make traditional reinforcement learning (RL) algorithms more efficient through state and/or action abstraction. A particularly popular approach is to use options (Sutton et al., 1999): abstract actions that can be selected in particular states (those in the option's initiation set) and whose execution unfolds as a sequence of primitive actions prescribed by the option's lower level policy. A major problem that has received recent attention (e.g., Bacon et al., 2016) is learning options from interactions with the environment. The broader problem of simultaneously learning state and action abstraction hierarchies has remained relatively less explored.
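The options construct described above can be sketched as a simple data structure. This is an illustrative sketch only; the names and the deterministic toy option below are our own, not from the paper or from Sutton et al. (1999):

```python
# Minimal sketch of the options framework (Sutton et al., 1999):
# an option has an initiation set, a lower level policy over primitive
# actions, and a termination condition beta(s).
from dataclasses import dataclass
from typing import Callable, Set

State = int
Action = int

@dataclass
class Option:
    initiation_set: Set[State]                   # states where the option may be invoked
    policy: Callable[[State], Action]            # lower level policy over primitive actions
    termination_prob: Callable[[State], float]   # beta(s): probability of terminating in s

    def can_initiate(self, s: State) -> bool:
        return s in self.initiation_set

# A toy option that always takes action 1 ("move right") until state 5 is reached.
go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: 1,
    termination_prob=lambda s: 1.0 if s == 5 else 0.0,
)
```

An agent selects among options whose `can_initiate` holds in the current state, then follows the option's internal policy until its termination condition fires, at which point control returns to the higher level.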
3 CONTRIBUTIONS OF THE PAPER
The proposed active predictive coding (APC) model addresses all three problems above in a unified manner using state/action embeddings and hypernetworks (Ha et al., 2017) to dynamically generate and generalize over state and action networks at multiple hierarchical levels. The APC model contributes to a number of lines of research which have not been connected before:
Perception, Predictive Coding, and Reference Frame Learning: APC extends predictive coding and related neuroscience models of brain function (Rao and Ballard, 1999; Friston and Kiebel, 2009; Jiang et al., 2021) to hierarchical sensory-motor inference and learning, and connects these to learning nested reference frames (Hawkins, 2021) for perception and cognition.
Attention Networks: APC extends previous hard attention
approaches such as the Recurrent Attention Model (RAM)
(Mnih et al., 2014) and Attend-Infer-Repeat (AIR) (Eslami
et al., 2016) by learning structured hierarchical strategies
for sampling the visual scene.
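Hard attention of the RAM flavor can be illustrated with a minimal glimpse operation: rather than processing the whole image, the model crops a small patch at a chosen location. The function name and the zero-padding choice below are our own assumptions for illustration, not details of RAM or APC:

```python
# Sketch of a hard-attention "glimpse": crop a size x size patch
# centered at a chosen (row, col) location, zero-padding at borders.
import numpy as np

def glimpse(image: np.ndarray, center: tuple, size: int) -> np.ndarray:
    pad = size // 2
    padded = np.pad(image, pad, mode="constant")   # zero-pad all borders
    r, c = center[0] + pad, center[1] + pad        # shift center into padded frame
    return padded[r - pad : r - pad + size, c - pad : c - pad + size]

img = np.arange(36).reshape(6, 6)
patch = glimpse(img, (0, 0), 3)   # corner glimpse; out-of-bounds cells are zeros
```

A RAM-style model would feed such patches, together with their locations, to a recurrent network that chooses where to look next; APC's contribution is learning structured hierarchical versions of that sampling strategy.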
Hierarchical Planning and Reinforcement Learning: APC contributes to hierarchical planning/reinforcement learning research (Hutsebaut-Buysse et al., 2022; Botvinick et al., 2009) by proposing a new way of simultaneously learning abstract macro-actions or options (Sutton et al., 1999) and abstract states.
General Applicability in AI: When applied to vision, the APC model learns to hierarchically represent and parse images into parts and locations. When applied to RL problems, the model can exploit hypernets to (a) define a state hierarchy, not merely through state aggregation, but by abstracting transition dynamics at multiple levels, and (b) potentially generalize learned hierarchical states and actions (options) to novel scenarios via interpolation/extrapolation in the input embedding space of the hypernetworks.

Figure 1: Active Predictive Coding Generative Module: (A) Canonical generative module for the APC model. The lower level functions are generated via hypernetworks based on the current higher level state and action embedding vectors. All functions (in boxes) are implemented as recurrent neural networks (RNNs). Arrows with circular terminations denote generation of function parameters (here, neural network weights and biases). (B) Two-level model used in this paper. (C) Generation of states and actions in the 2-level model based on past states and actions.
Our approach brings us closer towards a neural solution to an important challenge in both AI and cognitive science (Lake et al., 2017): how can neural networks learn hierarchical compositional representations that allow new concepts to be created, recognized and learned?
4 ACTIVE PREDICTIVE CODING MODEL
The APC model implements a hierarchical version of the
traditional Partially Observable Markov Decision Process
(POMDP) (Kaelbling et al., 1998; Rao, 2010), with each
level employing a state network and an action network.
Figure 1A shows the canonical generative module for the APC model. The module consists of (1) a higher level state embedding vector $r^{(i+1)}$ at level $i+1$, which uses a function $H^i_s$ (implemented as a hypernetwork (Ha et al., 2017))$^1$ to generate a lower level state transition function $f^i_s$ (implemented as an RNN), and (2) a higher level action embedding vector $a^{(i+1)}$, which uses a function (hypernetwork) $H^i_a$ to generate a lower level option/policy function $f^i_a$ (implemented as an RNN). The state and action networks at the lower level are generated independently (by the higher level state/action embedding vectors) but exchange information

$^1$See Supplementary Materials for an alternate neural implementation of the APC model using higher-level embedding-based inputs to lower-level RNNs.
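As a rough sketch of the generative module in Figure 1A, the following shows a hypernetwork mapping a higher level state embedding to the parameters of a lower level transition function. The dimensions, the linear form of the hypernetwork, and the single-step $f_s$ are all simplifying assumptions of ours; in the paper both the hypernetworks and the generated functions are neural networks, with $f_s$ an RNN:

```python
# Sketch of the hypernetwork idea: a higher level embedding r_high
# is mapped to the weights W and bias b of a lower level transition
# function f_s, so different embeddings generate different dynamics.
import numpy as np

rng = np.random.default_rng(0)
state_dim, embed_dim = 4, 8
n_w = state_dim * state_dim

# H_s: here a fixed linear map from the embedding to (W, b) of f_s.
G = rng.standard_normal((n_w + state_dim, embed_dim)) * 0.1

def make_f_s(r_high: np.ndarray):
    params = G @ r_high
    W = params[:n_w].reshape(state_dim, state_dim)
    b = params[n_w:]
    return lambda s: np.tanh(W @ s + b)   # one lower level transition step

r_high = rng.standard_normal(embed_dim)   # higher level state embedding
f_s = make_f_s(r_high)                    # generated lower level dynamics
s = f_s(np.zeros(state_dim))              # one lower level state update
```

The same pattern applies to the action side: an action embedding fed through $H_a$ would generate the option/policy function $f_a$, so that higher level states and actions parameterize, rather than merely input to, the level below.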