
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure
Nuo Chen∗, Qiushi Sun∗, Renyu Zhu∗, Xiang Li†, Xuesong Lu, and Ming Gao
School of Data Science and Engineering, East China Normal University, Shanghai, China
{nuochen,qiushisun,renyuzhu}@stu.ecnu.edu.cn,
{xiangli,xslu,mgao}@dase.ecnu.edu.cn
∗Equal contribution; authors are listed alphabetically. †Corresponding author.
Abstract
Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, some probing methods have been applied. However, these methods fail to consider the inherent characteristics of code. In this paper, to address the problem, we propose a novel probing method, CAT-probing, to quantitatively interpret how CodePTMs attend code structure. We first denoise the input code sequences based on the token types pre-defined by the compilers to filter out those tokens whose attention scores are too small. After that, we define a new metric, the CAT-score, to measure the commonality between the token-level attention scores generated by CodePTMs and the pair-wise distances between the corresponding AST nodes. The higher the CAT-score, the stronger the ability of CodePTMs to capture code structure. We conduct extensive experiments to integrate CAT-probing with representative CodePTMs for different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our code and data are publicly available at https://github.com/nchen909/CodeAttention.
1 Introduction
In the era of “Big Code” (Allamanis et al., 2018), programming platforms such as GitHub and Stack Overflow have generated massive amounts of open-source code data. Under the assumption of “Software Naturalness” (Hindle et al., 2016), pre-trained models (Vaswani et al., 2017; Devlin et al., 2019; Liu et al., 2019) have been applied in the domain of code intelligence.
Existing code pre-trained models (CodePTMs) can be mainly divided into two categories: structure-free methods (Feng et al., 2020; Svyatkovskiy et al., 2020) and structure-based methods (Wang et al., 2021b; Niu et al., 2022b). The former utilizes only the information from raw code texts, while the latter employs code structures, such as data flow (Guo et al., 2021) and flattened abstract syntax trees (ASTs) (Guo et al., 2022), to enhance the performance of pre-trained models. For more details,
readers can refer to Niu et al. (2022a). Recently, several works have used probing techniques (Clark et al., 2019a; Vig and Belinkov, 2019; Zhang et al., 2021) to investigate what CodePTMs learn. For example, Karmakar and Robbes (2021) first probe CodePTMs, constructing four probing tasks to explain them. Troshin and Chirkova (2022) also define a series of novel diagnostic probing tasks on code syntactic structure. Further, Wan et al. (2022) conduct qualitative structural analyses to evaluate how CodePTMs interpret code structure.
Despite their success, all these methods lack a quantitative characterization of how well CodePTMs learn code structure. Therefore, a research question arises: can we develop a new probing method to quantitatively evaluate how CodePTMs attend code structure?
In this paper, we propose a metric-based probing method, namely CAT-probing, to quantitatively evaluate how CodePTMs’ Attention scores relate to the distances between AST nodes. First, to denoise the input code sequence in the original attention score matrix, we classify the rows/columns by the token types pre-defined by compilers, and then retain the tokens whose types have the highest proportion scores to derive a filtered attention matrix (see Figure 1(b)); a sketch of this step is given below.
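As a concrete illustration of this filtering step, the following minimal Python/NumPy sketch shows one possible realization. Reading the “proportion score” as simple type frequency is our assumption for illustration, and all names here are hypothetical rather than taken from the paper’s implementation.

    import numpy as np
    from collections import Counter

    def top_types(token_types, k):
        # Token types with the highest frequency; treating the "proportion
        # score" as relative frequency is an assumption for illustration.
        return {t for t, _ in Counter(token_types).most_common(k)}

    def filter_attention(attn, token_types, kept_types):
        # attn        : (n, n) token-level attention scores from one head/layer
        # token_types : compiler-defined type label for each of the n tokens
        # kept_types  : the token types to retain
        keep = np.array([t in kept_types for t in token_types])
        idx = np.where(keep)[0]
        # Restrict the attention matrix to rows/columns of retained tokens.
        return attn[np.ix_(idx, idx)]

For example, filter_attention(attn, types, top_types(types, k=4)) keeps only the rows and columns belonging to the four most frequent token types.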
Meanwhile, inspired by prior works (Wang et al., 2020; Zhu et al., 2022), we add edges to improve the connectivity of the AST and calculate the distances between the nodes corresponding to the selected tokens, which generates a distance matrix as shown in Figure 1(c). After that, we define the CAT-score to measure the matching degree between the filtered attention matrix and the distance matrix, as sketched below.
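The structural side can be sketched in the same spirit. The snippet below assumes the connectivity-augmented AST is available as a networkx graph whose selected nodes align one-to-one with the retained tokens; the matching measure in cat_score is only an illustrative stand-in, since the exact CAT-score definition is given later in the paper.

    import networkx as nx
    import numpy as np

    def distance_matrix(ast, nodes):
        # Pairwise shortest-path distances between the AST nodes that
        # correspond to the retained tokens; assumes the added edges
        # have made the graph connected.
        n = len(nodes)
        dist = np.zeros((n, n))
        for i, u in enumerate(nodes):
            lengths = nx.single_source_shortest_path_length(ast, u)
            for j, v in enumerate(nodes):
                dist[i, j] = lengths[v]
        return dist

    def cat_score(attn, dist):
        # Illustrative matching degree (not the paper's formula): how often
        # a strongly-attended token pair is also structurally close.
        mask = ~np.eye(len(attn), dtype=bool)            # ignore self-pairs
        high_attn = attn[mask] > np.median(attn[mask])   # "attended" pairs
        close = dist[mask] < np.median(dist[mask])       # "nearby" pairs
        return float((high_attn == close).mean())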