
Algorithm 1 Extract Abstraction via Decision Tree

Data: Trajectories rolled out from a compact neural policy $\mathcal{D}_{dt} = \{(o_0, s_0, a_0, z_0, \ldots)_j\}_{j=1}^{N}$
Result: Interpreters of neuron responses $\{f^i_{\mathcal{S}}\}_{i \in I}$
for $i \in I$ do
    Train a decision tree $T_{\theta^i}$ from states $\{s_t\}$ to neuron responses $\{z^i_t\}$.
    Collect a dataset $\mathcal{D}_{dp}$ of neuron responses $\{z^i_t\}$ and decision paths $\{P^i_{s_t}\}$.
    Train a neuron response classifier $q_{\phi^i}: \mathbb{R} \to \{P\}$ with $\mathcal{D}_{dp}$.
    Obtain a decision path parser $r^i: \{P\} \to \mathcal{L}$ by tracing out $\{P^i_k\}_{k=1}^{K^i}$ in $T_{\theta^i}$.
    Construct the mapping $f^i_{\mathcal{S}} = r^i \circ q_{\phi^i}$.
end for
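For concreteness, the following is a minimal, runnable sketch of the loop in Algorithm 1 on synthetic data. The choices here are illustrative assumptions rather than prescribed by the algorithm: scikit-learn trees, leaf ids as stand-ins for decision paths $\{P\}$, a $k$-NN classifier for $q_{\phi^i}$, and randomly generated states and responses.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
S = rng.normal(size=(2000, 4))                 # states s_t from D_dt (synthetic)
Z = np.tanh(S @ rng.normal(size=(4, 3)))       # responses z_t for |I| = 3 neurons (synthetic)

def leaf_to_program(tree, leaf):
    """r^i: trace a leaf back to the root, emitting one decision rule per internal node."""
    t, rules, parent = tree.tree_, [], {}
    for n in range(t.node_count):              # build child -> parent map
        for child in (t.children_left[n], t.children_right[n]):
            if child != -1:                    # -1 marks "no child" (leaf) in sklearn
                parent[child] = n
    node = leaf
    while node in parent:                      # walk leaf -> root, collecting rules
        p = parent[node]
        op = "<=" if t.children_left[p] == node else ">"
        rules.append(f"(s[{t.feature[p]}] {op} {t.threshold[p]:.2f})")
        node = p
    return " AND ".join(reversed(rules))       # report rules in root-to-leaf order

interpreters = {}
for i in range(Z.shape[1]):                    # for i in I
    tree = DecisionTreeRegressor(max_depth=3).fit(S, Z[:, i])    # T_theta^i
    leaves = tree.apply(S)                     # decision path P^i_{s_t}, keyed by leaf id
    q = KNeighborsClassifier(n_neighbors=5).fit(Z[:, [i]], leaves)  # q_phi^i: R -> {P}
    interpreters[i] = lambda z, q=q, t=tree: leaf_to_program(t, q.predict([[z]])[0])  # f^i_S = r^i o q_phi^i

print(interpreters[0](0.3))                    # logic program inferred for neuron 0 at z = 0.3

Representing each path by its leaf id is lossless here, since every leaf determines a unique root-to-leaf path in the tree.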
In this section, we describe how to obtain factors of variation by predicting logic programs from neuron responses that reflect the learned behavior of the policies (Section 3.1), followed by a set of quantitative measures of interpretability through the lens of disentanglement (Section 3.2).
3.1 Extracting Abstraction via Decision Tree
Our goal is to formulate a logic program that represents the decision-making of a parametric policy and serves as an abstraction of learned behaviors, as summarized in Algorithm 1. First, we describe a decision process as a tuple $\{\mathcal{O}, \mathcal{S}, \mathcal{A}, P_a, h\}$, where, at a time instance $t$, $o_t \in \mathcal{O}$ is the observation, $s_t \in \mathcal{S}$ is the state, $a_t \in \mathcal{A}$ is the action, $P_a: \mathcal{S} \times \mathcal{A} \times \mathcal{S} \to [0, 1]$ is the (Markovian) transition probability from the current state $s_t$ to the next state $s_{t+1}$ under action $a_t$, and $h: \mathcal{S} \to \mathcal{O}$ is the observation model. We define a neural policy as $\pi: \mathcal{O} \to \mathcal{A}$ and the response of neuron $i \in I$ as $\{z^i_t \in \mathbb{R}\}_{i \in I}$, where $I$ refers to the set of neurons to be interpreted. For each neuron $i$, we aim to construct a mapping that infers a logic program from the neuron response, $f^i_{\mathcal{S}}: \mathbb{R} \to \mathcal{L}$, where $\mathcal{L}$ is a set of logic programs grounded on environment states $\mathcal{S}$. Note that $f^i_{\mathcal{S}}$ does not take the state as an input, because the underlying states may be inaccessible during robot deployment. In the following discussion, we make heavy use of the notation $P^i_*$ for the decision path associated with the $i$'th neuron, where the subscript $*$ indicates dependency on the state when parenthesized (e.g., $(s_t)$) and otherwise serves as an index determined by context.
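Before detailing each step, we sketch how the rollout dataset $\mathcal{D}_{dt}$ of Algorithm 1 can be collected. The gym-style environment interface, the env.get_state() accessor for the ground-truth state $s_t$, and a policy that exposes its internal activations are all assumptions for illustration; ground-truth states are needed only at data-collection time, consistent with $f^i_{\mathcal{S}}$ not taking states as input at deployment.

def collect_trajectories(env, policy, num_episodes):
    """Return D_dt = {tau_j}_{j=1}^N with tau_j = (o_0, s_0, a_0, z_0, o_1, ...)."""
    dataset = []
    for _ in range(num_episodes):
        obs, done, traj = env.reset(), False, []
        while not done:
            # a_t and the neuron responses z_t (hypothetical `return_activations` flag)
            action, z = policy(obs, return_activations=True)
            # env.get_state() is an assumed accessor for the true state s_t
            traj.append((obs, env.get_state(), action, z))
            obs, _, done, _ = env.step(action)
        dataset.append(traj)
    return dataset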
From states to neuron responses. Decision trees are non-parametric supervised learning algorithms for classification and regression. During training, they develop a set of decision rules, each thresholding one or a subset of the input dimensions. The relations across rules are described by a tree structure, with the root node as the starting point of the decision-making process and the leaf nodes as the predictions. This ability of decision trees to convert the data underlying a decision into a set of propositions makes them a natural fit for state-grounded logic programs. Given a trained neural policy $\pi$, we collect a set of rollout trajectories $\mathcal{D}_{dt} = \{\tau_j\}_{j=1}^{N}$, where $\tau_j = (o_0, s_0, a_0, z_0, o_1, \ldots)$. We first train a decision tree $T_{\theta^i}$ to predict the $i$th neuron response from states,
\[
\theta^{i*} = \arg\min_{\theta^i} \sum_{(s_t,\, z^i_t) \in \mathcal{D}_{dt}} \mathcal{L}_{dt}(\hat{z}^i_t, z^i_t), \quad \text{where } \hat{z}^i_t = T_{\theta^i}(s_t) \tag{1}
\]
where $\mathcal{L}_{dt}$ represents the underlying classification or regression criterion. The decision tree $T_{\theta^i}$ describes relations between the neuron responses and the relevant states as logical expressions. During inference, starting from the root node, the relevant state dimensions are checked against the decision rule of the current node, which directs the input to the appropriate lower layer, until it finally arrives at one of the leaf nodes, which provides the information to regress the neuron response. Each inference thus traces out a route from the root node to a leaf node, called a decision path. A decision path consists of the sequence of decision rules defined by the nodes it visits, which combine into a logic program,
\[
\bigwedge_{n \in P^i_{(s_t)},\; j = g(n)} \big(s^j_t \le c_n\big) \;\longleftrightarrow\; \text{Behavior extracted from } \hat{z}^i_t \text{ via } T_{\theta^i} \tag{2}
\]
where $\wedge$ is the logical AND, $P^i_{(s_t)}$ is the decision path of the tree $T_{\theta^i}$ that takes $s_t$ as input, $g$ gives the state dimension used in the decision rule of node $n$ (we assume each node uses one feature for notational simplicity), and $c_n$ is the threshold at node $n$.
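As a concrete reading of Eqs. (1) and (2), the following scikit-learn sketch fits $T_{\theta^i}$ on synthetic $(s_t, z^i_t)$ pairs and prints the conjunction of decision rules along $P^i_{(s_t)}$. The synthetic data and the string rendering of the program are illustrative assumptions.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
S = rng.normal(size=(1000, 4))                 # states s_t (synthetic stand-in for D_dt)
z = np.tanh(S[:, 0] - 0.5 * S[:, 2])           # neuron i's responses z^i_t (synthetic)

tree = DecisionTreeRegressor(max_depth=3).fit(S, z)   # Eq. (1): fit T_theta^i

def program_for_state(tree, s):
    """Eq. (2): conjunction of decision rules along the path P^i_(s_t).
    Right branches yield (s[j] > c_n) rules; Eq. (2) writes only the
    <= form for notational simplicity."""
    t = tree.tree_
    nodes = tree.decision_path(s[None, :]).indices    # root-to-leaf node ids
    clauses = []
    for n in nodes:
        if t.children_left[n] == -1:                  # leaf: carries the prediction, no rule
            continue
        j, c = t.feature[n], t.threshold[n]           # j = g(n), c = c_n
        clauses.append(f"(s[{j}] {'<=' if s[j] <= c else '>'} {c:.2f})")
    return " AND ".join(clauses)

s_t = S[0]
print(program_for_state(tree, s_t), "<->", f"z_hat = {tree.predict(s_t[None, :])[0]:.3f}")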
From neuron responses to decision paths. So far, we have recovered a correspondence between the neuron response $z_t$ and the state-grounded program based on the decision path $P^i_{(s_t)}$; however, this is not