Multi-View Reasoning:
Consistent Contrastive Learning for Math Word Problem
Wenqi Zhang1, Yongliang Shen1, Yanna Ma2, Xiaoxia Cheng1,
Zeqi Tan1, Qingpeng Nong3, Weiming Lu1
1College of Computer Science and Technology, Zhejiang University
2University of Shanghai for Science and Technology
3Zhongxing Telecommunication Equipment Corporation
{zhangwenqi, luwm}@zju.edu.cn
Abstract
A math word problem solver requires both precise relation reasoning about quantities in the text and reliable generation of diverse equations. Current sequence-to-tree or relation extraction methods regard this task only from a fixed view, struggling to simultaneously handle complex semantics and diverse equations. However, human solving naturally involves two consistent reasoning views, top-down and bottom-up, just as math equations can also be expressed in multiple equivalent forms: pre-order and post-order. We propose multi-view consistent contrastive learning for a more complete semantics-to-equation mapping. The entire process is decoupled into two independent but consistent views, top-down decomposition and bottom-up construction, and the two reasoning views are aligned at multiple granularities for consistency, enhancing global generation and precise reasoning. Experiments on multiple datasets across two languages show that our approach significantly outperforms existing baselines, especially on complex problems.1 We also show that after consistent alignment, the multi-view model can absorb the merits of both views and generate more diverse results consistent with mathematical laws.
1 Introduction
Math word problem (MWP) solving is a significant and challenging task with a wide range of applications in both natural language processing and general artificial intelligence (Bobrow, 1964). The MWP task is to predict the mathematical equation and the final answer based on a natural language description of a scenario and a math problem. It requires mathematical reasoning over the text (Mukherjee and Garain, 2008), which is very
challenging for conventional methods (Patel et al., 2021).

* Corresponding author.
1 Our source code and data are open-sourced at https://github.com/zwq2018/Multi-view-Consistency-for-MWP

[Figure 1 shows the example problem "Xiao Ming and Zhang work in the orchard to pick fruit. Xiao Ming picks 2 fruits per minute, Zhang picks 3 fruits per minute, they worked for 4 minutes, and then ate 5 fruits. How many fruits are left?" (Answer: 15, Expr: (2 + 3) × 4 − 5), together with its top-down reasoning tree (remaining fruits → total fruits → total pick rate), its bottom-up reasoning tree, the pre-order (−, ×, +, 2, 3, 4, 5), post-order (2, 3, +, 4, ×, 5, −), and mid-order ((2 + 3) × 4 − 5) traversals, and the two views aligned in the latent space.]

Figure 1: Human solving has multiple reasoning views, and a math equation can also be expressed in multiple orders. Pre-order traversal can be seen as a top-down reasoning view, and post-order traversal corresponds exactly to the bottom-up reasoning view. Consistent contrastive learning aligns the two views in the same latent space.
MWP tasks have attracted a great deal of re-
search attention. In the early days, MWP was
treated as a sequence-to-sequence (seq2seq) trans-
lation task, translating human language into mathe-
matical language (Wang et al.,2017,2019). Then,
Xie and Sun (2019); Zhang et al. (2020); Faldu et al. (2021) proposed that tree or graph structures were more suitable for MWP. These generation methods (Seq2Tree and Graph2Tree) further improved generation capabilities through a specific structure. Although very flexible in generating complex equation combinations, the fixed-structure decoder also limits fine-grained mapping. Recently, Cao et al.
(2021); Jie et al. (2022) introduced an iterative rela-
tion extraction approach, providing a new solving
view for MWP. It performs well at capturing local
relations, but lacks global generation capabilities,
especially for complex mathematical problems.
From the seq2seq translation to the seq2tree gen-
eration and relation extraction, those are essentially
seeking a suitable solving view for MWP.

arXiv:2210.11694v2 [cs.CL] 26 Aug 2023

However, MWP is more challenging than that as it requires
both precise relation reasoning about quantities and
reliable generation for diverse equation combina-
tions. Both are necessary for mathematical reason-
ing. Existing methods all consider the MWP from
a single view and thus bring certain limitations.
We argue that multiple views are required to comprehensively solve the MWP. As shown in Figure 1, the process of human solving inherently involves multiple reasoning views, i.e., top-down decomposition (remaining fruits → total fruits → pick rate) and bottom-up construction (pick rate → total fruits → remaining fruits). The two reasoning views are reversed in process but consistent in results. Meanwhile, a mathematical equation can be expressed in multiple traversal orders, i.e., pre-order (−, ×, +, 2, 3, 4, 5) and post-order (2, 3, +, 4, ×, 5, −). The two sequences are quite dissimilar in form but equivalent in logic. The two traversal orders correspond exactly to the two reasoning processes, i.e., the pre-order equation is a top-down reasoning view, while the post-order can be seen as a bottom-up reasoning view.
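To make the correspondence concrete, the pre-order and post-order sequences above both arise from one expression tree for (2 + 3) × 4 − 5, and both evaluate to the same answer. A minimal sketch (the class and function names are illustrative, not from the paper's code):

```python
# Minimal sketch: one expression tree, multiple traversal orders.
# Node layout and function names are illustrative, not from the paper.

class Node:
    def __init__(self, token, left=None, right=None):
        self.token, self.left, self.right = token, left, right

def pre_order(n):   # top-down view: root, then left, then right
    return [] if n is None else [n.token] + pre_order(n.left) + pre_order(n.right)

def post_order(n):  # bottom-up view: left, then right, then root
    return [] if n is None else post_order(n.left) + post_order(n.right) + [n.token]

def evaluate(n):    # recursive evaluation of the tree
    if n.left is None:
        return n.token
    ops = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
           '*': lambda a, b: a * b, '/': lambda a, b: a / b}
    return ops[n.token](evaluate(n.left), evaluate(n.right))

# (2 + 3) * 4 - 5 from the Figure 1 example
tree = Node('-', Node('*', Node('+', Node(2), Node(3)), Node(4)), Node(5))
print(pre_order(tree))   # ['-', '*', '+', 2, 3, 4, 5]
print(post_order(tree))  # [2, 3, '+', 4, '*', 5, '-']
print(evaluate(tree))    # 15
```

The two sequences look very different, yet they denote the same tree and the same answer, which is exactly the consistency the two reasoning views are expected to share.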
Inspired by this, we design multi-view reasoning using multi-order traversal. The MWP solving is decoupled into two independent but consistent views: top-down reasoning using pre-order traversal to decompose the problem from global to local, and a bottom-up process following post-order traversal for relation construction from local to global. Pre-order and post-order traversals should be equivalent in math, just as top-down decomposition and bottom-up construction should be consistent. As shown in Figure 1, we add multi-granularity contrastive learning to align the intermediate expressions generated by the two views in the same latent space. Through consistent alignment, the two views constrain each other and jointly learn an accurate and complete representation for math reasoning.
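The alignment objective can be sketched with a standard InfoNCE-style contrastive loss that pulls together the top-down and bottom-up representations of the same sub-expression and pushes apart the other expressions in the batch. This is a hedged sketch of the general technique, not the paper's exact loss; the dimensions and temperature are illustrative.

```python
import numpy as np

# Hedged sketch of the consistency objective: an InfoNCE-style loss over
# paired top-down (h_td) and bottom-up (h_bu) expression representations.
# Matched sub-expressions sit on the diagonal of the similarity matrix.

def info_nce(h_td, h_bu, tau=0.1):
    # h_td, h_bu: (batch, dim) representations of the same expressions
    h_td = h_td / np.linalg.norm(h_td, axis=1, keepdims=True)
    h_bu = h_bu / np.linalg.norm(h_bu, axis=1, keepdims=True)
    logits = h_td @ h_bu.T / tau                     # pairwise cosine / tau
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # matched pairs on diagonal

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))
aligned = info_nce(h, h.copy())                  # identical views
random_ = info_nce(h, rng.normal(size=(8, 16)))  # unrelated views
print(aligned < random_)                         # aligned views incur a lower loss
```

Minimizing such a loss drives the two views' intermediate expressions toward a shared latent space, which is the "consistent alignment" the paragraph describes.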
Besides, math operators must conform to mathematical laws (e.g., the commutative law). We devise a knowledge-enhanced augmentation to incorporate mathematical rules into the learning process, making multi-view reasoning more consistent with mathematical laws.
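As a concrete illustration of rule-based augmentation, the commutative law (a + b = b + a, a × b = b × a) can be applied to an annotated expression tree to produce equivalent training equations. The nested-tuple encoding below is illustrative, not the paper's actual data format.

```python
# Hedged sketch of knowledge-enhanced augmentation: apply the commutative
# law to an expression tree to enumerate mathematically equivalent
# equations. Tree encoding (nested tuples) is illustrative only.

COMMUTATIVE = {'+', '*'}

def augment(expr):
    """Yield all equivalent trees obtained by swapping commutative operands."""
    if not isinstance(expr, tuple):      # leaf: a quantity or constant
        yield expr
        return
    op, left, right = expr
    for l in augment(left):
        for r in augment(right):
            yield (op, l, r)
            if op in COMMUTATIVE:
                yield (op, r, l)         # commutative swap

def evaluate(expr):
    if not isinstance(expr, tuple):
        return expr
    op, l, r = expr
    return {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
            '*': lambda a, b: a * b}[op](evaluate(l), evaluate(r))

# (2 + 3) * 4 - 5 from the Figure 1 example
tree = ('-', ('*', ('+', 2, 3), 4), 5)
variants = list(augment(tree))
print(len(variants))                             # 4 equivalent forms
print(all(evaluate(v) == 15 for v in variants))  # all yield the same answer
```

Training on such variants exposes the solver to equivalent but differently ordered equations, encouraging predictions that respect the underlying laws rather than a single annotated form.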
Our contributions are threefold:
• We treat multi-order traversal as a multi-view reasoning process, which contains a top-down decomposition using pre-order traversal and a bottom-up construction following post-order traversal. Both views are necessary for MWP.
• We introduce consistent contrastive learning to align the two views' reasoning processes, fusing flexible global generation and accurate semantics-to-equation mapping. We also design an augmentation process for rule injection and understanding.
• Extensive experiments on multiple standard datasets show our method significantly outperforms existing baselines. Our method can also generate equivalent but non-annotated math equations, demonstrating the reliable reasoning ability behind our multi-view framework.
2 Related Work
Reliable reasoning is a necessary capability to
move towards general-purpose AI. How to achieve
human-like reasoning has been extensively re-
searched in areas such as natural language process-
ing, reinforcement learning, and robotics (Fu et al.,
2021;Zhang et al.,2021,2022a). In particular,
mathematical reasoning is an important manifesta-
tion of intelligence. Automatically solving mathe-
matical problems has been studied for a long time,
from rule-based methods (Fletcher,1985;Bakman,
2007;Yuhui et al.,2010) with hand-crafted fea-
tures and templates-based methods (Kushman et al.,
2014;Roy and Roth,2018) to deep learning meth-
ods (Wang et al.,2017;Ling et al.,2017) with
the encoder-decoder framework. The introduction
of Transformer (Vaswani et al.,2017) and pre-
trained language models (Devlin et al.,2019;Liu
et al.,2019b) greatly improves the performance
of MWPs. From the perspective of proxy tasks,
we divide the recent works into three categories:
seq2seq-based translation, seq2structure genera-
tion, and iterative relation extraction.
Seq2seq-based translation MWPs are treated
as a translation task, translating human language
into mathematical language (Liang and Zhang,
2021). Wang et al. (2017) proposed a large-scale
dataset Math23K and used the vanilla seq2seq
method (Chiang and Chen,2019). Li et al. (2019)
introduced a group attention mechanism to en-
hance seq2seq method performance. Huang et al.
(2018) used reinforcement learning to optimize
translation task. Huang et al. (2017) incorporated
semantic-parsing methods to solve MWPs. Although seq2seq-based methods have made great progress in the field, their performance is still unsatisfactory, since generating mathematical equations requires relation reasoning over quantities rather than mere natural-language translation.
Seq2structure-based generation Liu et al.
(2019a); Xie and Sun (2019) introduced tree-
structured decoder to generate mathematical ex-
pressions. This explicit tree-based design rapidly
dominated the MWPs community. Other re-
searchers have begun to explore reasonable struc-
tures for encoder. Li et al. (2020); Zhang et al.
(2020,2022b) used graph neural networks to ex-
tract effective logical information from the natu-
ral language problem. Liang and Zhang (2021)
adopted a teacher model with contrastive learning to improve the encoder. Several researchers
have attempted to extract multi-level features from
the problems using the hierarchical encoder (Lin
et al.,2021) and pre-trained model (Yu et al.,2021).
Many auxiliary tasks are used to enhance the sym-
bolic reasoning ability (Qin et al.,2021). Wu
et al. (2020,2021) tried to introduce mathemat-
ical knowledge to solve the difficult mathematical
reasoning. These structured generation approaches
show strong generation capabilities towards com-
plex mathematical reasoning tasks.
Iterative relation extraction Recently, some re-
searchers have borrowed ideas from the field of in-
formation extraction (Shen et al.,2021b), and have
designed iterative relation extraction frameworks
for predicting math relations between two numeric
tokens. Kim et al. (2020) designed an expression-
pointer transformer model to predict expression
fragmentation. Cao et al. (2021) introduced a DAG
structure to extract numerical token relation from
bottom to top. Jie et al. (2022) further treated the
MWP task as an iterative relation extraction task,
achieving impressive performance. These works
provide a new perspective to tackle MWP from
a local relation construction view, improving the
fine-grained relation reasoning between quantities.
The above proxy tasks are designed from differ-
ent solving views. The seq2seq is a left-to-right
consecutive view, while seq2tree is a tree view, and
the relation extraction method emphasizes a local
relation view. Unlike these single-view methods,
our approach employs multiple consistent reason-
ing views to address the challenges of MWP.
3 Approach
3.1 Overview
The MWP task is to predict the equation Y and the answer based on a problem description T = {w1, w2, · · · , wn} containing n words and m quantity words Q = {q1, q2, · · · , qm}. The equation Y is a sequence of constant words (e.g., 3.14), mathematical operators op = {+, −, ×, ÷, · · · }, and quantity words from Q. Solving the MWP is to find the optimal mapping T → Ŷ, allowing the predicted Ŷ to derive the correct answer. Existing methods learn this mapping from a single view, e.g., seq2tree generation or iterative relation extraction. Our consistent contrastive learning approach instead reasons from multiple views: both the top-down and bottom-up views are necessary for a complete semantics-to-equation mapping.
3.2 Multi-View using Multi-Order
We use the labeled mid-order equation to generate two different sequences Ypre = {y^f_1, y^f_2, · · · , y^f_L} and Ypost = {y^b_1, y^b_2, · · · , y^b_L} using pre-order and post-order traversal. As shown in Figure 1, we treat Ypre as the label for training the top-down process and Ypost as the label for training the bottom-up process.
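This label-generation step can be sketched as a standard shunting-yard pass that converts the annotated mid-order (infix) equation into Ypost, followed by a stack pass that rebuilds Ypre. The token format and function names are illustrative, not the paper's preprocessing code.

```python
# Hedged sketch: derive the post-order label Y_post and pre-order label
# Y_pre from the annotated mid-order equation. Token format illustrative.

PREC = {'+': 1, '-': 1, '*': 2, '/': 2}

def to_postfix(tokens):
    """Shunting-yard: mid-order tokens -> post-order (bottom-up label)."""
    out, stack = [], []
    for t in tokens:
        if t == '(':
            stack.append(t)
        elif t == ')':
            while stack[-1] != '(':
                out.append(stack.pop())
            stack.pop()                  # discard the '('
        elif t in PREC:
            while stack and stack[-1] != '(' and PREC[stack[-1]] >= PREC[t]:
                out.append(stack.pop())
            stack.append(t)
        else:                            # quantity or constant word
            out.append(t)
    return out + stack[::-1]

def postfix_to_prefix(post):
    """Stack pass: post-order -> pre-order (top-down label)."""
    stack = []
    for t in post:
        if t in PREC:
            r, l = stack.pop(), stack.pop()
            stack.append([t] + l + r)    # operator first: pre-order
        else:
            stack.append([t])
    return stack[0]

mid = ['(', '2', '+', '3', ')', '*', '4', '-', '5']
y_post = to_postfix(mid)
y_pre = postfix_to_prefix(y_post)
print(y_post)  # ['2', '3', '+', '4', '*', '5', '-']
print(y_pre)   # ['-', '*', '+', '2', '3', '4', '5']
```

Both labels are deterministic functions of the same annotated equation, so the two training targets are equivalent in logic by construction.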
Global shared embedding Firstly, we design three types of globally shared embedding matrices: text word embeddings Ew, quantity word embeddings Eq, and mathematical operator embeddings Eop. Text embeddings and quantity word embeddings are extracted from a pre-trained language model (Devlin et al., 2019; Liu et al., 2019b), while operator embeddings are randomly initialized. Besides, all constant word embeddings are also randomly initialized and added to Eq. As shown in Figure 2, the three global embeddings are shared by the two reasoning processes. Then, the text embeddings Ew are fused into a target vector t_root by a bidirectional Gated Recurrent Unit (GRU) (Cho et al., 2014), where t_root represents the global target for top-down reasoning. The quantity embeddings Eq are used for quantity relation construction in bottom-up reasoning.
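The fusion of Ew into t_root can be sketched with a minimal bidirectional GRU: a forward and a backward pass over the word embeddings, whose final hidden states are concatenated. The NumPy cell below stands in for the trained encoder; all weights are random and the dimensions are illustrative.

```python
import numpy as np

# Hedged sketch: fuse text embeddings E_w into a global target vector
# t_root with a bidirectional GRU. Random weights, illustrative sizes.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    def __init__(self, d_in, d_h, rng):
        s = 1.0 / np.sqrt(d_h)
        self.W = rng.uniform(-s, s, (3, d_h, d_in))  # update, reset, candidate
        self.U = rng.uniform(-s, s, (3, d_h, d_h))

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h)           # update gate
        r = sigmoid(self.W[1] @ x + self.U[1] @ h)           # reset gate
        h_tilde = np.tanh(self.W[2] @ x + self.U[2] @ (r * h))
        return (1 - z) * h + z * h_tilde

def bi_gru_root(E_w, d_h=8, seed=0):
    """Run forward and backward GRUs over the word embeddings and
    concatenate the two final states into t_root."""
    rng = np.random.default_rng(seed)
    fwd = GRUCell(E_w.shape[1], d_h, rng)
    bwd = GRUCell(E_w.shape[1], d_h, rng)
    h_f = h_b = np.zeros(d_h)
    for x in E_w:            # left-to-right pass
        h_f = fwd.step(x, h_f)
    for x in E_w[::-1]:      # right-to-left pass
        h_b = bwd.step(x, h_b)
    return np.concatenate([h_f, h_b])

E_w = np.random.default_rng(1).normal(size=(30, 16))  # 30 words, dim 16
t_root = bi_gru_root(E_w)
print(t_root.shape)  # (16,)
```

In practice the paper's encoder would operate on pre-trained embeddings and learned weights; the point here is only the shape of the computation: a whole-problem summary vector that seeds the top-down decomposition.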
Top-down view using Pre-order The top-down view is a global-to-local decomposition that follows the pre-order equation Ypre (e.g., −, ×, +, 2, 3, 4, 5). This process is similar to Xie and Sun (2019). Starting from the root node, each node conducts node prediction, and each operator node also conducts node decomposition. E.g., in Figure 1, the root node predicts its node type as "operator" with output token "−", and is then decomposed into two child nodes. The two child nodes are predicted as "×" in step 2 and "5" in step 7.
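The predict-then-decompose loop can be sketched as consuming the pre-order sequence with a count of pending node slots: an operator opens two child slots, a quantity closes one, and decoding terminates when no slot remains. The driver below replays gold tokens; the real model would predict each token from the node's target vector.

```python
# Hedged sketch of the top-down decomposition loop over the pre-order
# sequence. Gold tokens are replayed here for illustration only.

OPERATORS = {'+', '-', '*', '/'}

def top_down_decode(y_pre):
    pending = 1                 # start with the root's single slot
    steps = []
    for step, token in enumerate(y_pre, start=1):
        pending -= 1            # predict (fill) the current node slot
        if token in OPERATORS:
            pending += 2        # decompose: open two child slots
        steps.append((step, token, pending))
        if pending == 0:
            break               # a complete equation has been generated
    return steps

for step, token, pending in top_down_decode(['-', '*', '+', '2', '3', '4', '5']):
    print(step, token, pending)
# step 7 emits '5' and leaves no pending slot: decoding terminates
```

This mirrors the Figure 1 trace: the root "−" at step 1 opens two slots, "×" at step 2 continues the decomposition, and "5" at step 7 closes the final slot.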
Node prediction Each node has a target vector t_n decomposed from its parent (for the root node,