
Putting Them under Microscope: A Fine-Grained Approach for Detecting Redundant Test Cases in Natural Language
ESEC/FSE ’22, November 14–18, 2022, Singapore, Singapore
whole for the similarity comparison, while ignoring the fine-grained
semantic information hidden in the text. The new formulation
proposed in this paper, i.e., comparison of detailed content guided
by the scenario-related entities and relations, could potentially
motivate the researchers in these related fields.
In summary, the key contributions of this paper are as follows:
• The new formulation of the NL test case redundancy detection
  problem, i.e., the comparison of detailed testing content
  guided by the test-oriented entities and relations.
• A fine-grained redundancy detection approach, Tscope, for
  NL test cases, which dissects the test case into atomic test
  tuple(s) with the five entities restricted by their associated
  relations, and conducts the comparison on them.
• A context-aware model for extracting test-oriented entities
  and their relations from test case descriptions, which involves
  the global context of the test case in entity extraction,
  and the local context of the involved entities for relation
  extraction.
• Evaluation with 3,467 test cases from ten projects, with
  promising results. We also publicize the source code² for
  facilitating follow-up studies and other related tasks.
The remainder of the paper is organized as follows: Section 2 presents
the empirical study of the entity categories for redundancy detection.
Section 3 elaborates the approach. Section 4 presents the experiment
design. Section 5 describes the results. Section 6 discusses
the lessons learned. Section 7 introduces the related work and its
limitations. Section 8 concludes our work.
2 EMPIRICAL ANALYSIS OF ENTITIES AND RELATIONS
2.1 Categories of Entities and Relations
Motivated by the observations in Section 1, we provide a new
formulation of the NL test case redundancy detection problem, i.e., the
comparison of detailed testing content guided by the test-oriented
entities and relations. To achieve this, we define five categories of
entities and four categories of relations associated with the entities.
We explore the entity and relation categories through a bottom-up
analysis. Specifically, three researchers (details in Section 4.2)
are involved in mining the categories of entities and relations that
affect redundancy detection in the test case text. If all three
researchers agree on adding a category, the category is admitted and
added to the entity category set. If their views diverge, the decision
is made through a voting mechanism, i.e., the entity category is added
to the set if it is admitted by at least two researchers. Finally, we
obtain the five entity categories and the corresponding relations
among them. Table 1 shows each entity/relation category with examples.
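The admission rule described above amounts to a majority vote among the three researchers. A minimal sketch follows; the function name `admit_category`, the candidate "Tester name", and all vote values are hypothetical and only illustrate the rule, they are not data from the study.

```python
def admit_category(votes, threshold=2):
    """Return True if at least `threshold` researchers vote to admit.

    With three researchers and threshold=2, a unanimous vote and a
    2-to-1 vote both admit the category, as in the rule described above.
    """
    return sum(1 for vote in votes if vote) >= threshold


# Hypothetical votes (one Boolean per researcher) for candidate categories.
candidate_votes = {
    "Component": [True, True, True],      # unanimous: admitted
    "Manner": [True, True, False],        # views diverge, 2 of 3: admitted
    "Tester name": [False, True, False],  # only 1 of 3: rejected
}

admitted = [name for name, votes in candidate_votes.items()
            if admit_category(votes)]
print(admitted)
```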
2.1.1 Categories of Entities. The definition of the five entity
categories is based on the purpose and basics of software testing,
as well as the observations on NL test cases. First, test cases are
driven by the feature(s) in requirements, and a feature specifies the
behavior of one or more components in terms of their current
conditions [9]. Taken in this sense, the key entities in a feature will
also be reflected in the test case descriptions. Therefore, we identify
three entity categories: “Component”, “Behavior” and “Prerequisite”.
² https://github.com/czycurefun/testcase_detection
Second, according to our observations, test cases sometimes differ by
Manner. For example, Figure 1 shows the descriptions of two
non-redundant test cases. The two test cases have the same
Prerequisite (“When drawing 3D graphics”) and Component (“gear
rotation processing”), but different operation manners (“mesa-util
tool” and “UnixBench tool”). To reflect this difference, we define an
entity category “Manner”.
Third, in some cases, test cases may differ by the constraints to be
satisfied. For example, consider two descriptions, “Test there are
preset applications after the system installation” and “Test the preset
applications including FTP application after the system installation”.
The two test cases have the same Component (“preset applications”),
but the latter additionally involves a constraint (“including FTP
application”). Accordingly, we define an entity category “Constraint”
to indicate the difference.
2.1.2 Categories of Relations. As shown in Figure 2, there may
be multiple test-oriented entities per entity category within a test
case, which implies the need to inspect the entities within the
test case a step further. Taking Test Case #346 in Figure 2 as an
example, Behavior “browse” targets Component “contents
of each resource directory”, and Behavior “switch” acts on
Component “visit history”. This demonstrates the mapping between
Components and Behaviors, and we define it as the Act relation.
We also observe relations involving the other three categories
of entities, e.g., the executing manner of the testing. Considering
that the components in a test case are the basic objects of
the testing content, we define three other relations between
Component and Prerequisite, Manner, and Constraint, respectively,
to indicate the detailed information of the testing (details in Table 1).
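One way to picture the five entity categories and their relations to Component is as a record anchored on a Component, with the other four entities attached to it. The sketch below is an illustration only: the class name `AtomicTestTuple`, the helper names, and the case-insensitive string comparison are assumptions made for this example, not the paper's data model or matching method (the approach itself is elaborated in Section 3). Only the Act relation is named in the text; the other three relations are represented implicitly by the optional fields.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class AtomicTestTuple:
    """Hypothetical record: one Component plus the entities related to it."""
    component: str
    behavior: Optional[str] = None      # linked to the Component by the Act relation
    prerequisite: Optional[str] = None  # Component-Prerequisite relation
    manner: Optional[str] = None        # Component-Manner relation
    constraint: Optional[str] = None    # Component-Constraint relation


def normalize(text):
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split()) if text else None


def differing_fields(a, b):
    """Return the entity categories in which two tuples differ."""
    return [f.name for f in fields(AtomicTestTuple)
            if normalize(getattr(a, f.name)) != normalize(getattr(b, f.name))]


# The two non-redundant test cases discussed for Figure 1.
mesa = AtomicTestTuple(component="gear rotation processing",
                       prerequisite="When drawing 3D graphics",
                       manner="mesa-util tool")
unixbench = AtomicTestTuple(component="gear rotation processing",
                            prerequisite="When drawing 3D graphics",
                            manner="UnixBench tool")

# Test Case #346 carries two tuples, one per Act relation.
case_346 = {
    AtomicTestTuple(component="contents of each resource directory",
                    behavior="browse"),
    AtomicTestTuple(component="visit history", behavior="switch"),
}

print(differing_fields(mesa, unixbench))
```

Here the two Figure 1 tuples agree on Component and Prerequisite and differ only in Manner, which is the distinction the "Manner" category is meant to capture.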
2.2 Correlation Analysis
We conduct an empirical study to investigate the effectiveness of
the entity categories for redundancy detection. Specifically, we
randomly sample 5,000 test case pairs and manually label each test
case by comparing each pair.³ Then, we build five Boolean variables
by manual judgment, i.e., $EQ_{com}$, $EQ_{beh}$, $EQ_{pre}$,
$EQ_{man}$ and $EQ_{con}$. Each variable represents whether the
entities belonging to the corresponding category in the summaries
are manually judged as equivalent. At the same time, a variable
$Redundant$ is built according to the redundancy label (not based
on entity comparison), representing whether a test case is truly
redundant.
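To make the variables concrete, the sketch below builds them for a handful of pairs and computes the two statistics used in this section: the Pearson correlation between a per-category variable and $Redundant$, and Cohen's kappa between an all-categories-equivalent variable and $Redundant$. The six rows are invented toy labels, not the study's data, and the helper names are hypothetical; the p-value of the correlation test is omitted.

```python
from math import sqrt

CATEGORIES = ["com", "beh", "pre", "man", "con"]

# Toy labels (assumed): one row per test case pair, 1 = equivalent / redundant.
pairs = [
    {"com": 1, "beh": 1, "pre": 1, "man": 1, "con": 1, "redundant": 1},
    {"com": 1, "beh": 1, "pre": 1, "man": 0, "con": 1, "redundant": 0},
    {"com": 0, "beh": 1, "pre": 1, "man": 1, "con": 1, "redundant": 0},
    {"com": 1, "beh": 1, "pre": 1, "man": 1, "con": 1, "redundant": 1},
    {"com": 1, "beh": 0, "pre": 0, "man": 1, "con": 0, "redundant": 0},
    {"com": 0, "beh": 0, "pre": 1, "man": 0, "con": 1, "redundant": 0},
]


def column(name):
    return [row[name] for row in pairs]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cohen_kappa(x, y):
    """Cohen's kappa for two binary (0/1) label sequences."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n
    p_x, p_y = sum(x) / n, sum(y) / n
    expected = p_x * p_y + (1 - p_x) * (1 - p_y)
    return (observed - expected) / (1 - expected)


redundant = column("redundant")
# EQ_all is 1 only when all five per-category variables are 1.
eq_all = [int(all(row[c] for c in CATEGORIES)) for row in pairs]

for c in CATEGORIES:
    print(f"EQ_{c} vs Redundant: r = {pearson(column(c), redundant):.3f}")
print(f"kappa(EQ_all, Redundant) = {cohen_kappa(eq_all, redundant):.3f}")
```

In this toy data the all-equivalent variable coincides with the redundancy label by construction, so kappa is 1; real annotations would show some disagreement.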
We analyze the correlation between the above five variables
and the variable $Redundant$. Table 2 shows the Pearson correlation
coefficient and p-value of the correlation test. The results show
that the five entity categories are significantly correlated with the
variable $Redundant$, which indicates the effectiveness of each
entity category for redundancy detection. Moreover, we analyze the
consistency of two variables, i.e., $EQ_{all}$ and $Redundant$, where
$EQ_{all}$ represents that the entities belonging to the five entity
categories in the test case pair are all equivalent by manual comparison.
Cohen's kappa coefficient is 0.984, which shows that the two variables
are highly consistent. The results indicate that redundant
³ The test case pairs are built from the dataset in Table 4. The pairing and labeling
processes are consistent with the descriptions in Section 4.2.