Continual task learning in natural and artificial agents
Timo Flesch1, Andrew Saxe2 and Christopher Summerfield1
1Department of Experimental Psychology, University of Oxford, Oxford, UK.
2Gatsby Computational Neuroscience Unit & Sainsbury Wellcome Centre, UCL, London, UK.
Correspondence: {timo.flesch, christopher.summerfield}@psy.ox.ac.uk
Abstract
How do humans and other animals learn new tasks? A wave of brain recording studies has
investigated how neural representations change during task learning, with a focus on how tasks
can be acquired and coded in ways that minimise mutual interference. We review recent work
that has explored the geometry and dimensionality of neural task representations in neocortex,
and computational models that have exploited these findings to understand how the brain may
partition knowledge between tasks. We discuss how ideas from machine learning, including
those that combine supervised and unsupervised learning, are helping neuroscientists
understand how natural tasks are learned and coded in biological brains.
Keywords
Continual learning, neural networks, representational geometry, Hebbian gating.
Highlights
Both natural and artificial agents face the challenge of learning in ways that support
effective future behaviour
This may be achieved by different learning regimes, associated with distinct dynamics,
and differing dimensionality and geometry of neural task representations
Where two different tasks are learned, neural codes for task-relevant information may
be factorised in neocortex
Combinations of supervised and unsupervised learning mechanisms may help partition
task knowledge and avoid catastrophic interference
Declaration
Authors declare no competing interests
1. Natural tasks
In the natural world, humans and other animals behave in temporally structured ways that
depend on environmental context. For example, many mammals cycle systematically through
daily activities such as foraging, grooming, napping, and socialising. Humans live in complex
societies in which labour is shared among group members, with each adult performing multiple
successive roles, such as securing resources, caring for young, or exchanging social
information. In many settings, we can describe the behaviour of natural agents as comprising
a succession of distinct tasks for which a desired outcome (reward) is achieved by taking
actions (responses) to observations (stimuli) through the learning of latent causal processes
(rules).
The nature of task-driven behaviour, and the way that tasks are represented and implemented
in neural circuits, have been widely studied by cognitive and neural scientists. One important
finding is that switching between distinct tasks incurs a cost in decision accuracy and latency
[1]. This switch cost implies the existence of control mechanisms that ensure we remain “on
task”, possibly protecting ongoing behavioural routines from interference [2,3]. In primates,
there is good evidence that control signals originate in the prefrontal cortex (PFC) and
encourage task-appropriate behaviours by biasing neural activity in sensory and motor regions
[4]. For example, single cells in the PFC have been observed to respond to task rules [5], and
patients with PFC damage tend to select tasks erroneously, leading to disinhibited or
inappropriate behaviours [6].
How, then, are tasks coded in the PFC and interconnected regions? One key insight is that
mutual interference among tasks can be mitigated when they are coded in independent
subspaces of neural activity, such that the neural population vector evoked during task A is
uncorrelated with that occurring during task B [7,8]. Over the past decade, evidence for this
coding principle has emerged in domains as varied as skilled motor control [9], auditory
prediction [10], memory processes [11,12], and visual categorisation [13,14]. However, the
precise computational mechanisms by which tasks are encoded and implemented remain a
matter of ongoing debate.
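The subspace idea above can be illustrated in a few lines of numpy: if the population vectors evoked by two tasks have disjoint (hence orthogonal) support across the neural population, activity during one task has zero projection onto the code for the other. This is a toy sketch with hypothetical activity patterns, not real recordings:

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 100

# Hypothetical population: task A engages the first 50 neurons,
# task B the remaining 50, so their activity subspaces are orthogonal.
mask_a = np.r_[np.ones(50), np.zeros(50)]
mask_b = 1.0 - mask_a

act_a = rng.normal(size=n_neurons) * mask_a   # task-A population vector
act_b = rng.normal(size=n_neurons) * mask_b   # task-B population vector

# Disjoint support -> zero overlap between the two task codes
print(float(act_a @ act_b))  # 0.0
```

In real data the subspaces need not involve disjoint sets of neurons; it is the near-zero correlation between the population vectors, not silence of individual cells, that defines the independence.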
2. Rich and lazy learning
One way to study how tasks could be neurally encoded is to simulate learning in a simple class
of computational model: a neural network trained with gradient descent. Neural networks
uniquely allow researchers to form hypotheses about how neural codes form in biological
brains, because their representations emerge through optimisation rather than being hand-
crafted by the researcher [15]. One recent observation is that neural networks can learn to
perform tasks in different regimes that are characterised by qualitatively diverging learning
dynamics and distinct neural patterns at convergence [16,17]. In the lazy regime, which occurs
when network weights are initialised with a broader range of values (e.g., higher connection
strengths), the dimensionality of the input signals is rapidly expanded via random projections
to the hidden layer such that learning is mostly confined to the readout weights, and error
decreases exponentially [17-20]. By contrast, in the rich regime, which occurs when weights
are initialised with low variance (weak connectivity), the hidden units learn highly structured
representations that are tailored to the specific demands of the task, and the loss curve tends to
pass through one or more saddle points before convergence [21-24]. We illustrate this with a
simple example, learning an “exclusive or” (XOR) problem, in Fig. 1A-D.
Figure 1. Rich and Lazy learning in neural networks. (A) The XOR (exclusive or) problem requires the same
response (A) when exactly one of two input units is set to one, and a different response (B) when both are 0 or
both are 1. A linear classifier cannot distinguish between the two classes. (B) Feedforward neural
network architecture that can solve the XOR task. The inputs are mapped into a hidden layer with non-linear
outputs, and from there into a non-linear response layer. (C) Effect of different initial weight variances on the
training dynamics of the network shown in (B). We distinguish between rich learning (with small initial weight
variance, light blue) and lazy learning (with large initial weight variance, dark blue). The change in magnitude
of the input-to-hidden (left) and hidden-to-output (middle) weights depends strongly on initialisation strength. In
lazily initialised networks, the input-to-hidden weights remain very close to their initial values and learning is
confined to the readout weights. In richly initialised networks, all weights adapt substantially. Moreover, rich-
initialised networks learn much more slowly than lazy-initialised networks (right). (D) Learned input-to-hidden weights after rich
(top) and lazy (bottom) learning. Under rich learning, the weights learn to point to the four input types. In contrast,
under lazy learning, the weights point in arbitrary directions, effectively performing a random mapping into a
higher-dimensional space.
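The initialisation effect described in the caption can be reproduced with a short numpy simulation. This is a minimal sketch, not the authors' actual code; the architecture, learning rate, and initialisation scales are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR data: four input patterns and their binary targets
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

def relative_hidden_change(init_scale, n_hidden=64, lr=0.02, steps=3000):
    """Train a one-hidden-layer tanh network on XOR with gradient descent;
    return the relative change in the input-to-hidden weights W1."""
    W1 = rng.normal(scale=init_scale, size=(2, n_hidden))
    w2 = rng.normal(scale=init_scale, size=n_hidden)
    W1_init = W1.copy()
    for _ in range(steps):
        h = np.tanh(X @ W1)                  # hidden-layer activations
        err = h @ w2 - y                     # gradient of squared error
        grad_w2 = h.T @ err / len(X)
        grad_W1 = X.T @ (np.outer(err, w2) * (1 - h ** 2)) / len(X)
        w2 -= lr * grad_w2
        W1 -= lr * grad_W1
    return np.linalg.norm(W1 - W1_init) / np.linalg.norm(W1_init)

rich = relative_hidden_change(init_scale=0.01)  # small weights: rich regime
lazy = relative_hidden_change(init_scale=2.0)   # large weights: lazy regime

# Under lazy initialisation the hidden weights barely move relative to their
# starting values; under rich initialisation they change substantially.
print(rich > lazy)  # True
```

The comparison of relative weight change mirrors panels C-D: with large initial weights, the random hidden expansion is good enough that learning is absorbed by the readout.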
These schemes may have complementary costs and benefits. High-dimensional coding
schemes maximise the number of discriminations that can be linearly read out from the
network, allowing agents to rapidly learn a new decision rule for a task [25]. Low-dimensional
coding schemes confer robustness through redundancy, because neurons exhibit overlapping
tuning properties, and promote generalisation, because they tend to correspond to simpler
input-output functions when the neural manifold extends in fewer directions [26,27].
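The dimensionality contrast drawn above can be quantified with the participation ratio, a standard measure of effective linear dimensionality. Below is a toy sketch on synthetic data; the low- and high-dimensional codes are our own analogues of the two schemes, not measurements from any study:

```python
import numpy as np

rng = np.random.default_rng(2)

def participation_ratio(acts):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squares."""
    eig = np.linalg.eigvalsh(np.cov((acts - acts.mean(0)).T))
    eig = np.clip(eig, 0.0, None)
    return eig.sum() ** 2 / (eig ** 2).sum()

n_stim, n_neurons = 500, 100
latent = rng.normal(size=(n_stim, 2))      # two task-relevant variables

# Low-dimensional code: neurons linearly mix the two latent variables
low_d = latent @ rng.normal(size=(2, n_neurons))

# High-dimensional code: random nonlinear expansion of the same latents
high_d = np.tanh(latent @ rng.normal(scale=3.0, size=(2, n_neurons))
                 + rng.normal(size=n_neurons))

pr_low = participation_ratio(low_d)    # capped at 2: rank-2 covariance
pr_high = participation_ratio(high_d)  # nonlinear mixing inflates dimension
print(pr_high > pr_low)  # True
```

The same stimuli thus yield very different effective dimensionalities depending on whether the code mixes the latent variables linearly or nonlinearly.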
Neural recordings have offered evidence for both rich and lazy coding schemes. One important
observation is that the variables that define a task (observations, actions, outcomes and rules)
are often encoded jointly by single neurons. For example, when monkeys make choices on
the basis of distinct cues, single cells tend to multiplex input and choice variables [28,29]. In
another study, the dimensionality of neural codes recorded during performance of a dual
memory task was found to approach its theoretical maximum, implying that neurons represent
every possible combination of relevant variables [12]. This finding is consistent with “lazy”
learning, implying that brains encode tasks via high-dimensional representations that enmesh
multiple task-relevant variables across the neural population.
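The "theoretical maximum" dimensionality just described is often operationalised by counting how many dichotomies of the task conditions a linear readout can implement. The sketch below illustrates the logic with four conditions; the mixed-selectivity expansion is a hypothetical construction in the spirit of this literature, not a reanalysis of the data in [12]:

```python
import numpy as np

rng = np.random.default_rng(3)

# Four task conditions: all combinations of two binary variables
conds = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

def linearly_separable(reps, labels, iters=5000):
    """Perceptron check: converges iff the labelling is linearly separable."""
    X = np.hstack([reps, np.ones((len(reps), 1))])   # append bias term
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        wrong = np.sign(X @ w) != labels
        if not wrong.any():
            return True
        w += (labels[wrong][:, None] * X[wrong]).sum(axis=0)
    return False

def n_balanced_dichotomies(reps):
    """Count the three 2-vs-2 splits of four conditions a readout can realise."""
    count = 0
    for other in (1, 2, 3):                 # pair condition 0 with each other
        labels = -np.ones(4)
        labels[[0, other]] = 1.0
        count += linearly_separable(reps, labels)
    return count

pure = conds                                # 2-D "pure selectivity" code
mixed = np.tanh(conds @ rng.normal(size=(2, 50)) + rng.normal(size=50))

print(n_balanced_dichotomies(pure))   # 2: the XOR split is not separable
print(n_balanced_dichotomies(mixed))  # 3: the expanded code shatters all splits
```

A code that reaches the maximum number of implementable dichotomies is, in this sense, at its theoretical dimensionality ceiling.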
However, there is also important evidence that neural systems learn representations that mirror
the structure of the task, as might be predicted in the “rich” regime. For example, it is often
observed that neurons active in one task are silent during another, and vice versa. For example,
when macaques were trained to categorise morphed animal images according to two
independent classification schemes, 29% of PFC neurons became active during a single
scheme, whereas only 2% of neurons were active during both [30]. Much more recently, a
similar finding was reported using modern two-photon imaging methods in the parietal cortex
of mice trained to perform both a grating discrimination task and a T-maze task. Over half of
the recorded neurons were active in at least one task, but a much smaller fraction was active in
both tasks [31]. In other words, the brain learns to partition task knowledge across independent
sets of neurons.
One recent study used a neural network model to explicitly compare the predictions of the rich
and lazy learning schemes to neural signals recorded from the human brain [13]. They
developed a task (similar to ref [30]) that involved discriminating naturalistic images in two
independent contexts. Human participants learned to make “plant / don’t plant” decisions about
quasi-naturalistic images of trees with continuously varying branch density and leaf density,
whose growth success was determined by leaf density in one “garden” (task A) and branch
density in the other (task B) (Fig. 2A). Neural networks could be trained to perform a stylised
version of this task under either rich or lazy learning schemes by varying the different initial
connection strengths (Fig. 2B). Multivariate methods used to visualise the representational
dissimilarity matrix (RDM) and corresponding neural geometry for the network hidden layer
under either scheme revealed that they made quite different predictions (Fig. 2C). Under lazy
learning, the network learned a high dimensional solution whose RDM simply recapitulated
the organisation of the input signals (into a grid defined by “leafiness” and “branchiness”). This
is expected because randomly expanding the dimensionality of the inputs does not distort their
similarity structure. However, under the rich scheme, the network compressed information along
the dimension that was irrelevant in each context, so that the hidden layer represented the relevant input
dimensions (leafiness and branchiness) on two neural planes lying at right angles in neural state
space (Fig. 2C-D). Strikingly, BOLD signals in the posterior parietal cortex (PPC) and
dorsomedial prefrontal cortex (dmPFC) exhibited a similar dimensionality and RDMs revealed
a comparable geometric arrangement onto “orthogonal planes”, providing evidence in support
of “rich” task representations in the human brain (Fig. 2E-F).
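The “orthogonal planes” geometry can be illustrated with idealised coordinates: a toy construction matching the description above, not fitted neural data.

```python
import numpy as np

vals = np.linspace(-1.0, 1.0, 5)                 # five levels of each feature

# Idealised rich geometry: in each context, hidden activity varies only along
# that context's relevant stimulus dimension, plus a context-specific offset.
acts_a = np.array([[v, 0., 1., 0.] for v in vals])   # context A: leafiness
acts_b = np.array([[0., v, 0., 1.] for v in vals])   # context B: branchiness

def coding_axis(acts):
    """First principal component of centred activity (stimulus coding axis)."""
    centred = acts - acts.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[0]

# The stimulus coding directions of the two contexts lie at right angles
print(float(coding_axis(acts_a) @ coding_axis(acts_b)))  # 0.0
```

Computing an RDM over these activity patterns would reproduce the qualitative structure reported for PPC and dmPFC: within-context similarity ordered by the relevant feature, and orthogonal coding directions across contexts.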
Neural systems can thus learn both structured, low-dimensional task representations, and
unstructured, high-dimensional codes. In artificial neural networks, the emerging regime
depends on the magnitude of initial connection strengths in the network [15]. In the brain, these
regimes may arise through other mechanisms such as pressure toward metabolic efficiency
(regularisation), or architectural constraints that enforce unlearned nonlinear expansions.
Whilst it remains unclear when, how and why either coding scheme might be adopted in the
biological brain, neural theory is emerging that may help clarify this issue. One recent paper
explored how representational structure is shaped by specific task demands [32]. Comparing
recurrent artificial neural networks trained on different cognitive tasks, the authors found that
those tasks that required flexible input-output mappings, such as the context-dependent
decision task outlined above, induced task-specific representations, similar to the ones
observed under rich learning in feedforward networks. In contrast, for tasks that did not require
such a flexible mapping, the authors observed completely random, unstructured
representations. This suggests that representational geometry is not only determined by
intrinsic factors such as initial connection strength, but flexibly adapts to the computational
demands of specific tasks. Rich task-specific representations might therefore arise when there