
(in 3D; a quadtree in 2D) of all bodies, dividing the simulation space into
halves along each axis until at most one body is left per sub-space. The
algorithm then approximates the force of a whole subspace by the root of the
corresponding subtree whenever the subspace is far away relative to its size.
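As a minimal sketch of the data structure just described (the field names and layout are our own choice, not taken from the original implementation, and we assume cubic subspaces), such an octree node could look like this in C++:

#include <array>
#include <memory>

// Hypothetical octree node (our own sketch): an inner node summarizes all
// bodies in its cubic subspace by a centroid and a summed weight; a leaf
// holds a single body (neuron).
struct OctreeNode {
    double centroid[3];    // weighted average position of the contained bodies
    double weight;         // sum of vacant synaptic elements in this subspace
    double edge_length;    // maximum side length l of the subspace
    std::array<std::unique_ptr<OctreeNode>, 8> children;  // all null for a leaf

    bool is_leaf() const {
        for (const auto& c : children)
            if (c) return false;
        return true;
    }
};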
The position of a virtual body (an inner node of the tree) is the weighted
average of the positions of the approximated real bodies (we call that the
centroid), weighted by a body-dependent scalar (in our case, the number of
vacant synaptic elements). Whenever this virtual body is used, its weight is the
sum of the weights of all approximated bodies. The approximation is valid if
the maximum side length l of a subspace divided by the distance d from the target to
the virtual body is smaller than a previously fixed θ > 0, i.e., if l/d < θ.
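The following sketch shows how an inner node's centroid and weight could be aggregated and how the acceptance criterion could be evaluated, assuming the hypothetical OctreeNode layout from above (all names are ours, not the paper's):

#include <cmath>

// Aggregate the children of an inner node: the weight is the sum of the
// children's weights, the centroid is the weight-averaged position.
void aggregate(OctreeNode& node) {
    node.weight = 0.0;
    double acc[3] = {0.0, 0.0, 0.0};
    for (const auto& child : node.children) {
        if (!child) continue;
        node.weight += child->weight;
        for (int i = 0; i < 3; ++i)
            acc[i] += child->weight * child->centroid[i];
    }
    if (node.weight > 0.0)
        for (int i = 0; i < 3; ++i)
            node.centroid[i] = acc[i] / node.weight;
}

// Acceptance criterion: a subspace of side length l may be approximated by
// its virtual body if l / d < theta, with d the distance to the target.
bool satisfies_ac(const OctreeNode& node, const double target[3], double theta) {
    double d2 = 0.0;
    for (int i = 0; i < 3; ++i) {
        const double diff = target[i] - node.centroid[i];
        d2 += diff * diff;
    }
    return node.edge_length < theta * std::sqrt(d2);
}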
If a neuron wants to create a new synapse, it searches for another neuron
with a vacant synaptic element of the opposite type (a synapse is always formed
from an axon to a dendrite). It does so by checking the Acceptance Criterion
(AC) l/d < θ for the root of the octree, unpacking a virtual neuron if it does not
satisfy the AC, and rechecking its children. Once a neuron has established a list of
potential partners (actual neurons, or virtual neurons that are far enough away),
it uses this list to calculate the attracting forces and picks a partner accordingly.
If this partner is a virtual neuron, the searching neuron needs to unpack it and
apply the Barnes–Hut algorithm again, this time starting from the virtual neuron
instead of the root of the octree.
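The following sketch (our own illustration of the search just described, not the paper's Algorithm 1) shows how such a candidate list could be gathered by descending the octree with the AC, reusing the hypothetical OctreeNode and satisfies_ac from above:

#include <vector>

// Collect potential partners for the neuron at `target`: descend the octree,
// keep nodes that satisfy the AC (or are leaves), unpack everything else.
void collect_candidates(const OctreeNode* node, const double target[3],
                        double theta, std::vector<const OctreeNode*>& out) {
    if (!node || node->weight == 0.0) return;          // no vacant elements below this node
    if (node->is_leaf() || satisfies_ac(*node, target, theta)) {
        out.push_back(node);                            // real neuron, or far enough away
        return;
    }
    for (const auto& child : node->children)            // unpack the virtual neuron
        collect_candidates(child.get(), target, theta, out);
}

If the partner chosen from this list is still a virtual neuron, the same descent is started again from that node, as described above.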
If the octree is balanced, its height is O(log(n)) for n neurons, and in the
worst case, there must be O(log(n)) applications of the Barnes–Hut algorithm
per neuron searching a partner. This is the case if a neuron repeatedly picks a
virtual neuron as close as possible to the respective root. Overall, the runtime
is in O(n · log(n) · log(n)), i.e., every one of the n neurons has to perform O(log(n))
Barnes–Hut searches, which all have complexity O(log(n)).
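Written out, with log²(n) as shorthand for (log(n)) · (log(n)), this combines to
n · O(log(n)) · O(log(n)) = O(n · log²(n)).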
In the parallel version of this algorithm, each MPI rank is responsible for a
fixed number of neurons. A rank gets assigned a sub-space in which it places all
its neurons and for which it builds such an octree (enforcing a suitable number
of MPI ranks to avoid boundary issues). In every update step, a rank updates
the centroids and the numbers of vacant elements in its local sub-tree. Then,
all ranks exchange the roots of their sub-trees and construct the common upper
portion of the octree. Whenever a rank then needs access to the children of
another rank’s node (e.g., when the AC is not satisfied for that branch node, or
when the rank later picks a remote node as a partner), it downloads the children and inserts them into
its own tree. These downloaded nodes are discarded at the end of the update
step so that the newly calculated values can be used in the next update step.
Algorithm 1 shows the pseudo-code for finding targets in this parallel version.
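To make the root exchange described above concrete, the following hypothetical sketch gathers each rank's local root (centroid, weight, edge length) with MPI_Allgather; the flat NodeSummary struct and all names are our own and not taken from the original implementation:

#include <mpi.h>
#include <vector>

// Flat, trivially copyable summary of a rank's local subtree root,
// exchanged once per update step.
struct NodeSummary {
    double centroid[3];
    double weight;       // total number of vacant synaptic elements on that rank
    double edge_length;  // side length of the rank's sub-space
};

// Every rank contributes its local root and receives all other roots,
// from which the common upper portion of the octree can be rebuilt.
std::vector<NodeSummary> exchange_roots(const NodeSummary& local_root, MPI_Comm comm) {
    int num_ranks = 0;
    MPI_Comm_size(comm, &num_ranks);
    std::vector<NodeSummary> all_roots(num_ranks);
    MPI_Allgather(&local_root, sizeof(NodeSummary), MPI_BYTE,
                  all_roots.data(), sizeof(NodeSummary), MPI_BYTE, comm);
    return all_roots;
}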
A Sharp Runtime Bound
In the previous publication, the authors advised setting θ ≤ 1/√3. This way,
a subspace that contains the searching neuron is always unpacked because it
never satisfies the Acceptance Criterion (AC). This precaution prevents autapses.
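A brief geometric justification (our own paraphrase, assuming cubic subspaces): both the searching neuron and the centroid of a subspace containing it lie inside the same cube of side length l, so their distance is bounded by the cube's space diagonal, d ≤ √3 · l. Hence l/d ≥ 1/√3 ≥ θ, and the AC l/d < θ can never hold for such a subspace, which is therefore unpacked all the way down to the leaf holding the searching neuron itself.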