recursion strategy (e.g. dynamic programming that builds upon the optimality of substructures [13]).
We seek to understand overparameterization in the framework of optimizing hierarchical representations
[85]. On the other hand, an evolutionary perspective on biological and artificial neural networks [34]
offers a direct-fit approach via brute force. Such a deceptively simple model, when combined with over-parameterized optimization, provides an appealing solution for increasing the generalization (predictive) power
without explicitly modeling the unknown generative structure underlying sensory inputs.
In this paper, we construct an over-parameterized direct-fit model for visual perception. Unlike the
conventional wisdom of abstracting simple and complex cells, we use space partitioning and composition
as the building block of our hierarchical construction. In addition to biological plausibility, we offer
a geometric analysis of our construction in topological space (i.e., topological manifolds without the
definition of a distance metric or an inner product). Our construction can be interpreted as a product-
topology-based generalization of the existing k-d tree [14], making it suitable for brute-force direct-fit
in a high-dimensional space. In the presence of novelty/anomaly, a surrogate model that mimics the
escape mechanism of the hippocampus can be activated for unsupervised continual learning [98]. The
constructed model has been applied to several classical experiments in neuroscience and psychology. We
also provide an anti-sparse coding interpretation [46] of the constructed vision model and present a dynamic
programming (DP)-like solution to approximate nearest-neighbor search in high-dimensional space. Finally,
we briefly discuss two possible network implementations of the proposed model based on asymmetric
autoencoder [69] and spiking neural networks (SNN) [45], respectively.
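As a point of reference for the data structure our construction generalizes, the following minimal Python sketch (illustrative only; the variable names and the use of the SciPy k-d tree are our assumptions, not the construction proposed in this paper) shows how a k-d tree supports brute-force direct fit by recursive space partitioning and nearest-neighbor lookup.

import numpy as np
from scipy.spatial import cKDTree

# Toy direct-fit memory: store raw high-dimensional samples with associated
# responses and answer queries by nearest-neighbor lookup in a k-d tree.
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 16))   # 1000 stored sensory inputs (16-D)
labels = rng.integers(0, 10, size=1000)     # arbitrary associated responses

tree = cKDTree(samples)                     # recursive axis-aligned space partitioning

def direct_fit_predict(query, k=5):
    """Predict by majority vote among the k nearest stored samples."""
    _, idx = tree.query(query, k=k)
    return np.bincount(labels[idx]).argmax()

print(direct_fit_predict(rng.standard_normal(16)))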
II. NEUROSCIENCE FOUNDATION
A. Dichotomy: Excitatory and Inhibitory Neurons
In their work in the 1970s [92, 93], Wilson and Cowan made the crucial assumption that “all
nervous processes of any complexity depend on the interaction of excitatory and inhibitory cells.” Using
phase plane methods, they demonstrated simple and multiple hysteresis phenomena and limit cycle activity
in localized populations of model neurons. Their results offer, more or less, a primitive basis for
memory storage: stimulus intensity can be coded both in the average spike frequency and
in the frequency of periodic variations in the average spike frequency [78]. However, such ad hoc sensory
encoding cannot explain the sophistication of learning, memory, and recognition associated with higher
functions.
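For concreteness, the following sketch numerically integrates a reduced form of the Wilson-Cowan rate equations (the coupling constants, sigmoid parameters, and external drives below are illustrative assumptions rather than values taken from [92, 93]); for suitable excitatory-inhibitory coupling the trajectory exhibits the oscillatory (limit cycle) activity described above.

import numpy as np

def sigmoid(x, a=1.3, theta=4.0):
    """Sigmoidal response function used in the Wilson-Cowan equations."""
    return 1.0 / (1.0 + np.exp(-a * (x - theta)))

# Illustrative coupling constants and external drives (assumed values).
w_ee, w_ei, w_ie, w_ii = 16.0, 12.0, 15.0, 3.0
P, Q = 1.25, 0.0
tau_e, tau_i, dt = 1.0, 1.0, 0.01

E, I = 0.1, 0.05
trajectory = []
for _ in range(int(100 / dt)):
    dE = (-E + sigmoid(w_ee * E - w_ei * I + P)) / tau_e   # excitatory population
    dI = (-I + sigmoid(w_ie * E - w_ii * I + Q)) / tau_i   # inhibitory population
    E, I = E + dt * dE, I + dt * dI
    trajectory.append((E, I))

print(trajectory[-1])  # final E/I activity of the coupled populations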
B. Hebbian Learning and Anti-Hebbian Learning
Hebbian learning [38] is a dogma that claims that an increase in synaptic efficacy arises from the repeated
and persistent stimulation of a postsynaptic cell by a presynaptic cell. The Hebbian learning rule is often
summarized as “cells that fire together wire together”. The physical implementation of the Hebbian learning
rule has been well studied in the literature, for example, through spike-timing-dependent plasticity (STDP)
[16]. The mechanism of STDP is to adjust the connection strengths based on the relative timing of
a neuron’s input and output action potentials. STDP as a Hebbian synaptic learning rule has been
demonstrated in various neural circuits, from insects to humans.
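As an illustration of this timing dependence, a generic pair-based STDP window can be sketched as follows (the amplitudes and time constants are textbook-style placeholders, not values from [16]): the synapse is potentiated when the presynaptic spike precedes the postsynaptic spike and depressed when the order is reversed, with exponential decay in the timing difference.

import math

# Generic pair-based STDP window (illustrative parameters).
A_PLUS, A_MINUS = 0.01, 0.012     # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # time constants in ms

def stdp_delta_w(t_pre, t_post):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre fires before post -> potentiation (Hebbian)
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    elif dt < 0:  # post fires before pre -> depression
        return -A_MINUS * math.exp(dt / TAU_MINUS)
    return 0.0

print(stdp_delta_w(10.0, 15.0))   # small positive weight change
print(stdp_delta_w(15.0, 10.0))   # small negative weight change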
By analogy to excitatory and inhibitory neurons, it has been suggested that a reversal of Hebb’s postulate,
named anti-Hebbian learning, dictates the reduction (rather than the increase) of synaptic connectivity
strength between neurons that fire together. Synaptic plasticity that operates under the control
of an anti-Hebbian learning rule has been found to occur in the cerebellum [12]. More importantly, local
anti-Hebbian learning has been shown to be the foundation for forming sparse representations [27]. By
connecting a layer of simple Hebbian units with modifiable anti-Hebbian feedback connections, one can
learn to encode a set of patterns into a sparse representation in which statistical dependency between
the elements is reduced while preserving the information. However, sparse coding represents only a
local approximation of the sensory processing machinery. To extend it to global (nonlocal) integration,
we have to assume an additional postulate, called the “hierarchical organization principle”, which we
will introduce in the next section.
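A minimal sketch of this idea, loosely in the spirit of the sparse-coding network of [27] (the specific update rules, learning rates, and settling procedure below are illustrative assumptions): feedforward weights follow a Hebbian rule, lateral feedback weights follow an anti-Hebbian rule that decorrelates co-active units, and adaptive thresholds keep each unit near a target activity level, so the learned code becomes sparse.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, p = 64, 16, 0.1                   # input size, code units, target activity level
Q = rng.normal(scale=0.1, size=(n_out, n_in))  # feedforward weights (Hebbian)
W = np.zeros((n_out, n_out))                   # lateral feedback weights (anti-Hebbian)
thr = np.zeros(n_out)                          # adaptive thresholds
alpha, beta, gamma = 0.1, 0.02, 0.02           # illustrative learning rates

def encode(x, n_iter=20):
    """Settle the recurrent dynamics into a binary code for input x."""
    y = np.zeros(n_out)
    for _ in range(n_iter):
        y = (Q @ x + W @ y - thr > 0).astype(float)
    return y

for _ in range(2000):
    x = (rng.random(n_in) < 0.2).astype(float)     # random binary input pattern
    y = encode(x)
    W -= alpha * (np.outer(y, y) - p**2)           # anti-Hebbian: decorrelate co-active units
    np.fill_diagonal(W, 0.0)
    W = np.minimum(W, 0.0)                         # lateral connections stay inhibitory
    Q += beta * y[:, None] * (x[None, :] - Q)      # Hebbian feedforward update
    thr += gamma * (y - p)                         # keep each unit near the target sparsity

print(encode((rng.random(n_in) < 0.2).astype(float)).sum())  # few active units -> sparse code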