However, there is also important evidence that neural systems learn representations that mirror
the structure of the task, as might be predicted in the “rich” regime. For example, it is often
observed that neurons active in one task are silent during another, and vice versa. For instance,
when macaques were trained to categorise morphed animal images according to two
independent classification schemes, 29% of PFC neurons became active during a single
scheme, whereas only 2% of neurons were active during both [30]. More recently, a
similar finding was reported using modern two-photon imaging methods in the parietal cortex
of mice trained to perform both a grating discrimination task and a T-maze task. Over half of
the recorded neurons were active in at least one task, but a much smaller fraction was active in
both tasks [31]. In other words, the brain learns to partition task knowledge across independent
sets of neurons.
One recent study used a neural network model to explicitly compare the predictions of the rich
and lazy learning schemes to neural signals recorded from the human brain [13]. The authors
developed a task (similar to that of ref [30]) that involved discriminating naturalistic images in two
independent contexts. Human participants learned to make “plant / don’t plant” decisions about
quasi-naturalistic images of trees with continuously varying branch density and leaf density,
whose growth success was determined by leaf density in one “garden” (task A) and branch
density in the other (task B) (Fig. 2A). Neural networks could be trained to perform a stylised
version of this task under either a rich or a lazy learning scheme by varying the magnitude of the initial
connection strengths (Fig. 2B). Multivariate methods were used to visualise the representational
dissimilarity matrix (RDM) and corresponding neural geometry of the network's hidden layer
under each scheme, revealing that the two regimes made quite different predictions (Fig. 2C). Under lazy
learning, the network learned a high-dimensional solution whose RDM simply recapitulated
the organisation of the input signals (into a grid defined by “leafiness” and “branchiness”). This
is expected because randomly expanding the dimensionality of the inputs does not distort their
similarity structure. However, under the rich scheme, the network compressed the input dimension that
was irrelevant in each context, so that the hidden layer represented the relevant input
dimensions (leafiness and branchiness) on two neural planes lying at right angles in neural state
space (Fig. 2C-D). Strikingly, BOLD signals in the posterior parietal cortex (PPC) and
dorsomedial prefrontal cortex (dmPFC) exhibited a similar dimensionality, and their RDMs revealed
a comparable geometric arrangement on “orthogonal planes”, providing evidence in support
of “rich” task representations in the human brain (Fig. 2E-F).
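
To make the logic of this modelling analysis concrete, the sketch below trains a toy two-layer network on a stylised two-context task and computes the hidden-layer RDM under a small-gain (“rich”) and a large-gain (“lazy”) initialisation. The network size, the 5 × 5 stimulus grid, the gain values and the training settings are illustrative assumptions rather than the exact model of ref [13].

```python
# Illustrative sketch (not the exact model of ref [13]): a two-layer network is
# trained on a stylised two-context task after being initialised with either a
# small weight gain ("rich" regime) or a large weight gain ("lazy" regime); a
# representational dissimilarity matrix (RDM) is then computed on its hidden layer.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stimuli: a 5 x 5 grid of "leafiness" x "branchiness" values, shown in two contexts.
vals = torch.linspace(-1.0, 1.0, 5)
leaf, branch = torch.meshgrid(vals, vals, indexing="ij")
grid = torch.stack([leaf.flatten(), branch.flatten()], dim=1)         # 25 x 2
ctx_a = torch.cat([grid, torch.ones(25, 1), torch.zeros(25, 1)], 1)   # context A cue
ctx_b = torch.cat([grid, torch.zeros(25, 1), torch.ones(25, 1)], 1)   # context B cue
X = torch.cat([ctx_a, ctx_b], 0)                                      # 50 x 4 inputs
# Target: leafiness is the relevant dimension in context A, branchiness in context B.
y = torch.cat([grid[:, 0], grid[:, 1]]).unsqueeze(1)

def train(gain, hidden=100, steps=3000, lr=1e-2):
    """Train a tanh network whose weights are initialised at the given gain."""
    model = nn.Sequential(nn.Linear(4, hidden), nn.Tanh(), nn.Linear(hidden, 1))
    with torch.no_grad():
        for layer in (model[0], model[2]):
            layer.weight.copy_(gain * torch.randn_like(layer.weight)
                               / layer.weight.shape[1] ** 0.5)
            layer.bias.zero_()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ((model(X) - y) ** 2).mean().backward()
        opt.step()
    return model

def hidden_rdm(model):
    """Pairwise (1 - Pearson correlation) dissimilarity between hidden-layer patterns."""
    with torch.no_grad():
        h = model[1](model[0](X)).numpy()   # 50 conditions x hidden units
    return 1.0 - np.corrcoef(h)             # 50 x 50 RDM

rdm_rich = hidden_rdm(train(gain=0.1))  # small initial weights -> "rich" regime
rdm_lazy = hidden_rdm(train(gain=3.0))  # large initial weights -> "lazy" regime
```

Under these assumptions, the lazy RDM would be expected to retain the grid-like similarity structure of the inputs, whereas the rich RDM would be expected to show compression of the context-irrelevant dimension, mirroring the contrast sketched in Fig. 2C.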
Neural systems can thus learn both structured, low-dimensional task representations and
unstructured, high-dimensional codes. In artificial neural networks, the regime that emerges
depends on the magnitude of the initial connection strengths in the network [15]. In the brain, these
regimes may arise through other mechanisms such as pressure toward metabolic efficiency
(regularisation), or architectural constraints that enforce unlearned nonlinear expansions.
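
One way to make the notion of a “regime” operational is sketched below: a hedged illustration (not an analysis taken from refs [13] or [15]) of two common diagnostics, namely the relative change of a weight matrix from its initial value (weights barely move in the lazy regime but change substantially in the rich regime) and the participation ratio of the hidden-layer covariance spectrum, a standard measure of effective dimensionality that should be lower for a compressed, structured code. The function names and inputs are assumptions introduced for illustration.

```python
# Illustrative diagnostics (assumed helpers, not taken from refs [13, 15]) for
# characterising the learning regime of a trained network.
import numpy as np

def relative_weight_change(w_init, w_final):
    """||W_final - W_init|| / ||W_init||: close to zero in the lazy regime,
    large when features are learned in the rich regime."""
    return np.linalg.norm(w_final - w_init) / np.linalg.norm(w_init)

def participation_ratio(hidden_activations):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squared
    eigenvalues of the hidden-layer covariance matrix (conditions x units input)."""
    centred = hidden_activations - hidden_activations.mean(axis=0, keepdims=True)
    eigvals = np.clip(np.linalg.eigvalsh(np.cov(centred, rowvar=False)), 0.0, None)
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()
```

Applied to the hidden activations of the two toy networks from the previous sketch, the participation ratio would be expected to be markedly lower under the small-gain (“rich”) initialisation.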
Whilst it remains unclear when, how and why either coding scheme might be adopted in the
biological brain, neural theory is emerging that may help clarify this issue. One recent paper
explored how representational structure is shaped by specific task demands [32]. Comparing
recurrent artificial neural networks trained on different cognitive tasks, the authors found that
tasks that required flexible input-output mappings, such as the context-dependent
decision task outlined above, induced task-specific representations similar to those
observed under rich learning in feedforward networks. In contrast, for tasks that did not require
such a flexible mapping, the authors observed completely random, unstructured
representations. This suggests that representational geometry is not determined solely by
intrinsic factors such as initial connection strength, but also adapts flexibly to the computational
demands of specific tasks. Rich task-specific representations might therefore arise when there