
can be decoupled, and (2) the “complicated” reasoning process can be disentangled into multi-step executions of compositional “fundamental” reasoning modules, whose compositionality can be learnt with limited data. Moreover, the “fundamental” nature of basic reasoning skills means that rich training instances are available for reliable skill pre-training.
Motivated by these observations, this paper proposes the modular and compositional reasoning framework, ReasonFormer, to mirror humans' compositional reasoning process, with the following characteristics: (1) the representation module and the reasoning modules are decoupled; (2) the reasoning modules are modular and specialized in fundamental reasoning skills; (3) the reasoning modules are composed in parallel and in cascade, to dynamically decide the activated reasoning skills and the reasoning complexity; (4) the general-purpose reasoning framework is end-to-end and unified in solving multiple tasks with one model.
Specifically, the representation module learns contextual representations of problems. On top of it, cascaded reasoning modules perform compositional multi-step reasoning. The reasoning modules are pre-trained to be experts in specific reasoning skills (e.g., logic, QA, and factual knowledge), which are considered relatively fundamental and have rich resources. Two additional blocks complete the framework: the reasoning router and the reasoning adapter. The reasoning router decides which reasoning skills are activated in each reasoning step and when to stop the reasoning process. The adapter adapts the reused reasoning modules to different steps of the reasoning process.
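To make the interplay of these components concrete, below is a minimal PyTorch-style sketch of one forward pass, written under our own assumptions: the class and argument names (ReasonFormerSketch, skill_modules), the linear router with a "stop" slot, the softmax-weighted parallel mixture of skill outputs, and the per-depth linear adapters are illustrative choices, not the exact design of ReasonFormer.

import torch
import torch.nn as nn

class ReasonFormerSketch(nn.Module):
    """Hypothetical sketch of the framework's forward pass (not the paper's exact design)."""

    def __init__(self, encoder, skill_modules, hidden_size, max_steps=4):
        super().__init__()
        self.encoder = encoder                      # representation module (assumed HuggingFace-style encoder)
        self.skills = nn.ModuleList(skill_modules)  # one pre-trained module per reasoning skill,
                                                    # each assumed to map (batch, seq, hidden) -> same shape
        self.max_steps = max_steps
        # reasoning router: scores each skill plus a final "stop reasoning" slot at every step
        self.router = nn.Linear(hidden_size, len(skill_modules) + 1)
        # reasoning adapters: specialize the reused skill modules to each reasoning depth
        self.adapters = nn.ModuleList(
            [nn.Linear(hidden_size, hidden_size) for _ in range(max_steps)]
        )

    def forward(self, input_ids, attention_mask):
        # representation module: contextual representations of the problem
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state

        for step in range(self.max_steps):
            # route on a pooled summary of the current state
            scores = self.router(hidden.mean(dim=1))        # (batch, num_skills + 1)
            weights = torch.softmax(scores, dim=-1)
            if weights[:, -1].mean() > 0.5:                 # last slot = "stop reasoning"
                break

            # parallel composition: weighted mixture of the activated skill modules
            mixed = sum(
                weights[:, i].view(-1, 1, 1) * skill(hidden)
                for i, skill in enumerate(self.skills)
            )
            # cascaded composition: adapt the shared modules to this depth, residual update
            hidden = hidden + self.adapters[step](mixed)

        return hidden

The sketch only conveys the decoupling of representation from reasoning and the parallel/cascaded composition controlled by the router and adapter; the paper's own design choices for these blocks follow in §3.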
We comprehensively evaluate the framework on
11 datasets emphasizing different reasoning skills
and complexity, and highlight the following find-
ings: (1) Substantial performance boosts demonstrate that the models acquire compositional reasoning ability, and both the reasoning-centric pre-training and the reasoning adapter bring compounding performance gains. (2) Results of few-shot experiments show that the specialized modules enable better generalization to low-resource tasks by learning to compose pre-trained skills, aided by the decoupling of the representation module and the reasoning modules. (3) Fur-
ther analysis reveals the distinct reasoning skills
required for different tasks at different reasoning
depths, shoring up the modularity of reasoning
modules.
2 Reasoning Skills Formulation
The compositional reasoning process of LMs re-
lies on the pre-training of several fundamental rea-
soning skills and their compositionality. Hence, the
selection of skills is critical.
Selection Principles. There are two major principles in selecting skills: (1) Fundamental: complex problems can be decomposed into and solved by simpler basic skills, so the basic skills should be fundamental, well-defined, and included in the required skill sets of as many tasks as possible; (2) Resourceful: reliable skill pre-training requires large-scale pre-training data. However, in real-world scenarios, annotated data is expensive to obtain for most reasoning tasks, so the selected skills are expected to either already have rich resources or allow data to be collected in a self-(semi-)supervised manner.
Basic Skills Selection. Humans typically solve complex problems with fundamental skills, such as understanding key information (e.g., entities and their types) in events, recalling related facts, understanding causal relations between events, and extracting answers to questions. This motivates us to select the following basic skills: logic ability, to logically deduce the cause or consequence of events; simple question answering (QA), to understand the context and answer simple questions; named entity recognition (NER), to identify important entities in the context; natural language inference (NLI), to identify the semantic relevance of two sentences; and factual knowledge, to memorize commonsense knowledge and understand daily events. There is an additional general skill to learn the knowledge commonly shared across the selected skills. We keep this setting in our paper as these skills are relatively well-defined and resourceful.¹
We adopt self-supervised methods to construct the pre-training corpora for {logic ability, factual knowledge, NER}, a semi-supervised method to construct the pre-training corpus for simple QA, and large-scale supervised data for NLI. Further details are given in §4.2 and examples are given in Appendix A.
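To summarize the skill set and how each pre-training corpus is obtained, here is a small Python sketch of the information stated in the two paragraphs above; the dictionary name and field layout are our own illustration, and the concrete corpus sources and instance formats (detailed in §4.2 and Appendix A) are not reproduced here.

# Hypothetical summary table (names and layout are ours); "supervision" records how
# the pre-training corpus for each skill is constructed, as stated in the text.
REASONING_SKILLS = {
    "logic":     {"goal": "deduce the cause or consequence of events",          "supervision": "self-supervised"},
    "simple_qa": {"goal": "understand the context and answer simple questions", "supervision": "semi-supervised"},
    "ner":       {"goal": "identify important entities in the context",         "supervision": "self-supervised"},
    "nli":       {"goal": "identify semantic relevance of two sentences",       "supervision": "large-scale supervised data"},
    "fact":      {"goal": "memorize commonsense knowledge and daily events",    "supervision": "self-supervised"},
    "general":   {"goal": "learn knowledge shared across the skills above",     "supervision": "see Section 4.2"},
}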
¹ It is worth noting that this selection is tentative. There are plausible ways for selecting basic skills or knowledge domains, which also inspire future directions.

3 ReasonFormer Framework

As shown in Fig. 2, the general-purpose reasoning framework is built based on encoder-decoder