
2. Illustrative examples
In this section, we consider a number of toy examples to illustrate our problem setting. These examples
involve message-passing GNNs, where pooling arises because we need to aggregate the messages coming from
adjacent nodes. In Section 2.1, we discuss a simple example to illustrate the basic principles. This example
involves aggregating sets of properties, without any further reasoning. We then build on this example in
Section 2.2 to illustrate how propositional reasoning can be carried out in this framework. Section 2.3
illustrates a setting where we need to reason in the presence of background knowledge. Finally, Section 2.4
shows how this can be extended to situations involving non-monotonic reasoning with defeasible knowledge.
2.1. Basic setting
Figure 1 displays a graph with four types of nodes, referring to research topics (t1, t2, t3, t4, t5), scientific
articles (a1, a2, a3), researchers (r1, r2, r3) and committees (c1, c2), respectively. In this graph, research topics
are connected to the articles that discuss them, articles are connected to their authors and researchers are
connected to the committees they belong to. We say that a researcher is an expert on a topic if they have
published at least one article which discusses this topic. We say that a committee is complete if it contains
an expert on each of the five topics. For instance, in the case of Figure 1, we can see that researcher r1
is an expert on topics t1, t2; researcher r2 is an expert on topics t1, t2, t3, t4; and researcher r3 is an expert
on topics t4, t5. It follows that committee c2 is complete, as each of the five topics is covered by its two
members, but committee c1 is not. We are now interested in designing a message-passing GNN which can
predict whether a given committee is complete or not. A message-passing GNN learns embeddings of the
nodes of a given graph, by iteratively updating each node’s embedding based on the embeddings of its
neighbouring nodes. To specify a message-passing GNN, we thus need to specify (i) the initial embedding
of each node and (ii) the update mechanism which is used.
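To make these two ingredients concrete, the following minimal Python sketch implements such a generic message-passing loop; the function `run_gnn` and its parameter names are our own choices for illustration, not part of any established library:

```python
from typing import Callable, Dict, Hashable, List, Tuple
import numpy as np

Node = Hashable

def run_gnn(
    edges: List[Tuple[Node, Node]],   # (y, x): there is an edge from y to x
    init: Dict[Node, np.ndarray],     # (i) the initial embedding of each node
    update: Callable[[List[np.ndarray]], np.ndarray],  # (ii) the update mechanism
    num_layers: int,
) -> Dict[Node, np.ndarray]:
    """Iteratively update each node's embedding from its in-neighbours' embeddings."""
    emb = dict(init)
    for _ in range(num_layers):
        emb = {
            x: update([emb[x]] + [emb[y] for (y, x2) in edges if x2 == x])
            for x in emb
        }
    return emb
```

Here `update` receives a node's own embedding together with the messages from its in-neighbours, which matches the update rule introduced below.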
A straightforward solution is as follows. Let us write $\mathbf{x}^{(l)} \in \mathbb{R}^5$ for the representation of a node $x$ in layer $l$ of the GNN. In particular, $\mathbf{x}^{(0)}$ represents the input embedding of node $x$, which we define as follows:
$$\mathbf{t}_1^{(0)} = (1,0,0,0,0) \qquad \mathbf{t}_2^{(0)} = (0,1,0,0,0) \qquad \mathbf{t}_3^{(0)} = (0,0,1,0,0) \qquad \mathbf{t}_4^{(0)} = (0,0,0,1,0) \qquad \mathbf{t}_5^{(0)} = (0,0,0,0,1)$$
In other words, we use a one-hot encoding for representing the different topics. The input embeddings for
all the other nodes are set to $\mathbf{x}^{(0)} = (0,0,0,0,0)$. In the subsequent layers, the node representations are updated as follows:
$$\mathbf{x}^{(l+1)} = \max\big(\{\mathbf{x}^{(l)}\} \cup \{\mathbf{y}^{(l)} \mid (y,x) \in \mathcal{E}\}\big) \tag{2}$$
where $\mathcal{E}$ represents the set of edges, i.e. $(y,x) \in \mathcal{E}$ if there is an edge from node $y$ to node $x$, and the maximum is applied component-wise. It is easy to verify that an article $a_i$ covers topic $t_j$ iff the $j$th coordinate of $\mathbf{a}_i^{(l)}$ is 1, for $l \geq 1$. Similarly, a researcher $r_i$ is an expert on topic $j$ iff the $j$th coordinate of $\mathbf{r}_i^{(l)}$ is 1, for $l \geq 2$; and a committee $c_i$ contains expertise on topic $j$ iff the $j$th coordinate of $\mathbf{c}_i^{(l)}$ is 1, for $l \geq 3$. We thus have that the committee $c_i$ is complete iff $\mathbf{c}_i^{(3)} = (1,1,1,1,1)$. Note that the latter condition can be checked using a linear scoring function, since it is equivalent to $\mathbf{c}_i^{(3)} \cdot \mathbf{1} \geq 5$, where $\mathbf{1} = (1,1,1,1,1)$.
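As a sanity check, the construction can be instantiated with the `run_gnn` sketch above. The edge list below is a hypothetical reconstruction, chosen only to be consistent with the expertise relations described in the text; the exact edges of Figure 1 may differ:

```python
# Hypothetical edges, consistent with the expertise described above;
# Figure 1's exact edges may differ. (y, x) means y sends messages to x.
edges = [
    ("t1", "a1"), ("t2", "a1"),                 # topics -> articles
    ("t3", "a2"), ("t4", "a2"),
    ("t4", "a3"), ("t5", "a3"),
    ("a1", "r1"), ("a1", "r2"),                 # articles -> their authors
    ("a2", "r2"), ("a3", "r3"),
    ("r1", "c1"), ("r2", "c2"), ("r3", "c2"),   # researchers -> committees
]
nodes = {n for edge in edges for n in edge}

# (i) one-hot input embeddings for the topics, zero vectors elsewhere
init = {n: np.zeros(5) for n in nodes}
for i in range(5):
    init[f"t{i + 1}"][i] = 1.0

# (ii) update rule (2): component-wise maximum over a node and its in-neighbours
emb = run_gnn(edges, init, lambda vecs: np.max(vecs, axis=0), num_layers=3)

# completeness check via the linear scoring function c . 1 >= 5
for c in ("c1", "c2"):
    print(c, "complete:", bool(emb[c] @ np.ones(5) >= 5))
# -> c1 complete: False
#    c2 complete: True
```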
In the proposed construction, we made a number of specific design decisions. For instance, we used
5-dimensional embeddings for representing nodes and we used the maximum for aggregating the evidence
from neighbouring nodes. Our analysis in this paper is aimed at studying the importance of such decisions.
For instance, we may wonder whether it is possible to devise an encoding that relies on four-dimensional
embeddings. If we are allowed to replace the maximum with an arbitrarily complex function (and we allow
non-linear scoring functions), then the answer is clearly positive.² However, as we will see in Section 4,
²In fact, we can even use one-dimensional embeddings. Let $n$ be a number which is larger than the number of topics $k$. Then we can initialise topic $i$ as $n^{i-1}$. To aggregate two node representations $x$ and $y$, we first decompose them as $x = x_0 + x_1 n + x_2 n^2 + \ldots + x_{k-1} n^{k-1}$ and $y = y_0 + y_1 n + y_2 n^2 + \ldots + y_{k-1} n^{k-1}$, with $x_0, \ldots, x_{k-1}, y_0, \ldots, y_{k-1} \in \{0,1\}$. Then the aggregated embedding can be defined as $\max(x_0, y_0) + \max(x_1, y_1)\,n + \ldots + \max(x_{k-1}, y_{k-1})\,n^{k-1}$.
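For illustration, this one-dimensional encoding can be sketched as follows (the helper names are our own, chosen only for this example):

```python
def init_topic(i: int, n: int) -> int:
    """One-dimensional embedding of topic i (1-indexed): n**(i-1)."""
    return n ** (i - 1)

def aggregate(x: int, y: int, n: int, k: int) -> int:
    """Digit-wise maximum of x and y, read as k-digit base-n numbers
    whose digits all lie in {0, 1}."""
    return sum(max((x // n ** j) % n, (y // n ** j) % n) * n ** j for j in range(k))

# With k = 5 topics and n = 6 > k: aggregating expertise on topics 1 and 3
# gives digits (1, 0, 1, 0, 0) in base 6, i.e. 6**0 + 6**2 = 37.
print(aggregate(init_topic(1, 6), init_topic(3, 6), n=6, k=5))  # -> 37
```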