Putting Them under Microscope: A Fine-Grained Approach for Detecting Redundant Test Cases in Natural Language
Zhiyuan Chang∗
Mingyang Li∗
{zhiyuan2019,mingyang2017}@iscas.ac.cn
Laboratory for Internet Software Technologies,
Institute of Software Chinese Academy of Sciences
Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Junjie Wang†
junjie@iscas.ac.cn
Laboratory for Internet Software Technologies,
Institute of Software Chinese Academy of Sciences
Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Qing Wang†
wq@iscas.ac.cn
Laboratory for Internet Software Technologies,
Institute of Software Chinese Academy of Sciences
Beijing, China
State Key Laboratory of Computer Science,
Institute of Software Chinese Academy of Sciences
Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Shoubin Li
shoubin@iscas.ac.cn
Laboratory for Internet Software Technologies,
Institute of Software Chinese Academy of Sciences
Beijing, China
University of Chinese Academy of Sciences, Beijing, China
ABSTRACT
Natural language (NL) documentation is the bridge between software managers and testers, and NL test cases are prevalent in system-level testing and other quality assurance activities. Due to reasons such as requirements redundancy, parallel testing, and tester turnover over a long evolution history, redundant test cases inevitably accumulate and significantly increase the cost. Previous redundancy detection approaches typically treat the textual descriptions as a whole to compare their similarity and suffer from low precision. Our observation reveals that a test case can have explicit test-oriented entities, such as tested function Components, Constraints, etc.; and there are also specific relations between these entities. This inspires us with a potential opportunity for accurate redundancy detection. In this paper, we first define five test-oriented entity categories and four associated relation categories and re-formulate the NL test case redundancy detection problem as the comparison of detailed testing content guided by the test-oriented entities and relations. Following that, we propose Tscope, a fine-grained approach for redundant NL test case detection by dissecting test cases into atomic test tuple(s) with the entities restricted by associated relations. To support the test case dissection, Tscope designs a context-aware model for the automatic entity and relation extraction. Evaluation on 3,467 test cases from ten projects shows Tscope could achieve 91.8% precision, 74.8% recall, and 82.4% F1, significantly outperforming state-of-the-art approaches and commonly-used classifiers. This new formulation of the NL test case redundancy detection problem can motivate the follow-up studies to further improve this task and other related tasks involving NL descriptions.
∗Both authors contributed equally to this research.
†Corresponding authors.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ESEC/FSE ’22, November 14–18, 2022, Singapore, Singapore
©2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9413-0/22/11.
https://doi.org/10.1145/3540250.3549089
CCS CONCEPTS
• Software and its engineering → Software testing and debugging; Acceptance testing.
KEYWORDS
Test Case Redundancy, Entity and Relation Extraction, Natural
Language Processing
ACM Reference Format:
Zhiyuan Chang, Mingyang Li, Junjie Wang, Qing Wang, and Shoubin Li. 2022. Putting Them under Microscope: A Fine-Grained Approach for Detecting Redundant Test Cases in Natural Language. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’22), November 14–18, 2022, Singapore, Singapore. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3540250.3549089
1 INTRODUCTION
Software testing is an activity to ensure that an entire system meets its requirements [5]. In the testing phase, testers need to analyze the requirements specification, identify all the test execution scenarios, and then instantiate them in manually written test cases [54]. Such test cases are typically described in natural language (NL). Due to their adjustability and interpretability, NL test cases are still prevalent in industrial practice [32].
arXiv:2210.01661v1 [cs.SE] 4 Oct 2022
A requirement covers multiple features, and there may be overlapping features among requirements. For a large software project, the requirements are typically tested by different engineers, who are not aware of the feature overlap. Test redundancy may arise when each test engineer individually designs test case(s) for the assigned requirements [14, 36]. As the system evolves, the redundant test cases significantly increase the cost of testing, as well as the maintenance effort [36]. The problem is especially obvious in the manual testing scenario, where human testers must read through test steps and carry them out manually by interacting with the system [22].
To alleviate the issue, information retrieval-based approaches have been proposed to automatically detect redundancy among NL test cases [32, 49, 53]. The general idea is to vectorize the description of the test case with text representation models, e.g., the vector space model or Doc2Vec, and conduct the similarity comparison on the resulting vectors. However, these existing approaches suffer from low accuracy because they treat the textual description of a test case as a whole, and thus cannot capture its fine-grained semantic information and inherent meaning. Meanwhile, we have the following two observations, which can facilitate the similarity comparison and redundancy detection of NL test cases.
First, the test case has explicit categories of test-oriented entities which can facilitate accurate redundancy detection. Take Figure 1 as an example: the two test cases look similar in their textual descriptions, and would be detected as redundant by the aforementioned information retrieval-based approaches. However, if we put these two test cases under the microscope, we can find that their executing manners (“mesa-util tool” and “UnixBench tool”) are different, based on which we can distinguish them accurately. Moreover, one can easily observe that there are different categories of test-oriented entities. For example, “gear rotation processing” is the tested functional component, while “when drawing 3D graphics” is the pre-condition for executing the test case. Only when the specific categories of test-oriented entities are matched can the two test cases be determined as redundant. In this sense, this paper aims at identifying the test-oriented entities to facilitate the accurate redundancy detection of NL test cases.
Figure 1: Non-redundant test cases with similar descriptions
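To illustrate why whole-text comparison struggles on a pair like the one in Figure 1, the following minimal sketch computes a bag-of-words cosine similarity, a simple stand-in for the vector space model mentioned above. The two sentences are hypothetical reconstructions built from the entities quoted in the text (the exact wording in Figure 1 is not reproduced here), and the helper names `tokenize` and `cosine_similarity` are our own.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical descriptions assembled from the entities quoted in the text.
case_a = "When drawing 3D graphics, test gear rotation processing with the mesa-util tool"
case_b = "When drawing 3D graphics, test gear rotation processing with the UnixBench tool"

similarity = cosine_similarity(case_a, case_b)
print(f"whole-text similarity: {similarity:.2f}")
```

On these two sentences the score is about 0.88, so any reasonable similarity threshold would flag the pair as redundant even though the Manner entity differs.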
Second, there might be multiple test-oriented entities that need to be carefully parsed and matched to ensure accurate redundancy detection. The first observation has motivated us to conduct the comparison within the same category of test-oriented entities for determining redundancy. However, when we put the two test cases in Figure 2 under the microscope, a second observation is made. Both test cases contain the testing Behavior “browse” and the tested Component “visit history”, yet they express different test-oriented operational information. In detail, in test case #346, the Behavior “browse” is targeted at the Component “content of each resource directory”, and the Component “visit history” is associated with the Behavior “switch”, while in test case #525 the Behavior “browse” is directly applied to the Component “visit history”. The observation implies that the multiple test-oriented entities need to be carefully parsed and matched, and that it is necessary to identify the test-oriented operational information, i.e., entities and associated relations, when analyzing test cases to achieve accurate redundancy detection.
Figure 2: Non-redundant test cases with multiple test-oriented entities
Motivated by the two findings, we define five test-oriented entity categories, i.e., Component, Behavior, Prerequisite, Manner, and Constraint, and four relation categories associated with the entities. We then re-formulate the NL test case redundancy detection problem as the comparison of detailed testing content guided by the test-oriented entities and relations.
Following that, we propose a fine-grained redundant test case detection approach, Tscope¹, which dissects the test case into atomic test tuple(s) with the five entities restricted by their associated relations, and conducts the comparison on them. One example test tuple dissected from Test case #525 in Figure 2 is as follows: Behavior “browse”, Component “visit history”, and Manner “mouse”. To achieve this, Tscope first designs a context-aware model for extracting test-oriented entities and relations from test case descriptions, which considers the global context of the test case for entity extraction, and the local context of the involved entities for relation extraction. After that, Tscope dissects each test case into structured atomic test tuple(s) guided by the extracted entities and relations. Finally, Tscope detects redundancy by comparing the entities in each tuple pair, considering the semantic meaning of the entities as well as their involved indicative words.
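The tuple-level comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the class `AtomicTuple`, the helpers `tuples_match` and `is_redundant`, and the two-way coverage rule are hypothetical, and exact string matching after normalization stands in for the semantic comparison that Tscope actually performs. The tuples follow the entities quoted for test cases #525 and #346.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class AtomicTuple:
    """One atomic test tuple: a Component plus its related entities."""
    component: str
    behavior: Optional[str] = None
    prerequisite: Optional[str] = None
    manner: Optional[str] = None
    constraint: Optional[str] = None


def _norm(value):
    """Normalize case and whitespace; None means the entity is absent."""
    return None if value is None else " ".join(value.lower().split())


def tuples_match(a, b):
    """Assumption: two tuples match only if all five categories agree.

    Exact matching is a placeholder for a semantic comparison."""
    return all(
        _norm(getattr(a, f.name)) == _norm(getattr(b, f.name))
        for f in fields(AtomicTuple)
    )


def is_redundant(case_a: List[AtomicTuple], case_b: List[AtomicTuple]) -> bool:
    """Hypothetical rule: every tuple of each case has a match in the other."""
    covered_a = all(any(tuples_match(t, u) for u in case_b) for t in case_a)
    covered_b = all(any(tuples_match(t, u) for u in case_a) for t in case_b)
    return covered_a and covered_b


case_525 = [AtomicTuple(component="visit history", behavior="browse", manner="mouse")]
case_346 = [
    AtomicTuple(component="content of each resource directory", behavior="browse"),
    AtomicTuple(component="visit history", behavior="switch"),
]

print(is_redundant(case_525, case_346))
```

Although both test cases mention “browse” and “visit history”, no tuple of #525 matches a tuple of #346, so the pair is not reported as redundant under this rule.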
We evaluate Tscope on 3,467 test cases from ten projects. The evaluation results show that Tscope could reach 97.5% precision and 94.8% recall for entity extraction, and 90.4% precision and 97.6% recall for relation extraction, which significantly outperforms two state-of-the-art approaches. For the redundancy detection task, Tscope could achieve 91.8% precision, 74.8% recall, and 82.4% F1. Compared with the two state-of-the-art redundancy detection approaches and four commonly-used classifiers, Tscope is 19.8%-23.4% higher in F1. Moreover, the results of ablation experiments show that the five entity categories all play significant roles in Tscope.
The new formulation of the NL test case redundancy detection problem can motivate follow-up studies to further improve this task and other related tasks involving NL descriptions. In fact, several tasks in the software engineering domain involve the similarity comparison of two textual documents, e.g., duplicate test report detection [24, 25], similar Stack Overflow question identification [55], duplicate requirements detection [38], etc. The previous techniques typically treat the textual descriptions as a whole for the similarity comparison, while ignoring the fine-grained semantic information hidden in the text. The new formulation proposed in this paper, i.e., comparison of detailed content guided by the scenario-related entities and relations, could potentially motivate the researchers in these related fields.
¹ We name our approach Tscope because it works like a microscope, inspecting the detailed information in test cases to facilitate redundancy detection.
In summary, the key contributions of this paper are as follows:
• The new formulation of the NL test case redundancy detection problem, i.e., the comparison of detailed testing content guided by the test-oriented entities and relations.
• A fine-grained redundancy detection approach Tscope for NL test cases, which dissects the test case into atomic test tuple(s) with the five entities restricted by their associated relations, and conducts the comparison on them.
• A context-aware model for extracting test-oriented entities and their relations from test case descriptions, which involves the global context of the test case in entity extraction, and the local context of the involved entities for relation extraction.
• Evaluation with 3,467 test cases from ten projects, with promising results. We also publicize the source code² for facilitating follow-up studies and other related tasks.
The remainder of the paper is organized as follows. Section 2 presents the empirical study of the entity categories for redundancy detection. Section 3 elaborates the approach. Section 4 presents the experiment design. Section 5 describes the results. Section 6 discusses the lessons learned. Section 7 introduces the related work and its limitations. Section 8 concludes our work.
2 EMPIRICAL ANALYSIS OF ENTITIES AND RELATIONS
2.1 Categories of Entities and Relations
Motivated by the observations in Section 1, we provide a new formulation of the NL test case redundancy detection problem, i.e., the comparison of detailed testing content guided by the test-oriented entities and relations. To achieve this, we define five categories of entities and four categories of relations associated with the entities. We explore entity and relation categories through a bottom-up analysis. Specifically, three researchers (details in Section 4.2) are involved in mining the categories of entities and relations that affect redundancy detection in the test case text. If all three researchers agree on adding a category, this entity category is admitted and added to the entity category set. If their views diverge, the decision is made through a voting mechanism, i.e., the entity category is added to the set if it is admitted by at least two researchers. Finally, we obtain the five entity categories and the corresponding relations among the entity categories. Table 1 shows each entity/relation category with examples.
2.1.1 Categories of Entities. The definition of the five entity categories is based on the purpose and basics of software testing, as well as the observations on NL test cases. First, test cases are driven by the feature(s) in requirements, and a feature specifies the behavior of one or more components in terms of their current conditions [9]. In this sense, the key entities in a feature will also be reflected in the test case descriptions. Therefore, we identify three entity categories: “Component”, “Behavior”, and “Prerequisite”.
² https://github.com/czycurefun/testcase_detection
Second, according to our observations, test cases sometimes differ by the Manner. For example, Figure 1 shows the descriptions of two non-redundant test cases. The two test cases have the same Prerequisite (“When drawing 3D graphics”) and Component (“gear rotation processing”), but different operation manners (“mesa-util tool” and “UnixBench tool”). To reflect this difference, we define an entity category “Manner”.
Third, in some cases, test cases may differ by the constraints to be satisfied. For example, consider two descriptions, “Test there are preset applications after the system installation” and “Test the preset applications including FTP application after the system installation”. The two test cases have the same Component (“preset applications”), but the latter additionally involves the constraint (“including FTP application”). Accordingly, we define an entity category “Constraint” to indicate the difference.
2.1.2 Categories of Relations. As shown in Figure 2, there may be multiple test-oriented entities per entity category within a test case, which implies the need to inspect the entities within the test case one step further. Taking Test Case #346 in Figure 2 as an example, the Behavior “browse” targets the Component “contents of each resource directory”, and the Behavior “switch” acts on the Component “visit history”. This demonstrates the mapping between Components and Behaviors, and we define it as the Act relation.
We also observe relations involving the other three categories of entities, e.g., the executing manner of the testing. Considering that the components in a test case are the basic objects of the testing content, we define three other relations between Component and Prerequisite, Manner, and Constraint, respectively, to indicate the detailed information of the testing (details in Table 1).
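As a rough illustration of how Component-centered relations can group entities, the sketch below dissects Test Case #346 into per-Component tuples. It is not the authors' implementation: only the relation name Act comes from the text, while the other three relation names, the `dissect` function, and the input layout are hypothetical placeholders (the real names are given in Table 1, which is not reproduced here).

```python
from enum import Enum


class EntityCategory(Enum):
    COMPONENT = "Component"
    BEHAVIOR = "Behavior"
    PREREQUISITE = "Prerequisite"
    MANNER = "Manner"
    CONSTRAINT = "Constraint"


# Relation name -> category of the non-Component endpoint.
# Only "Act" is named in the text; the other three names are placeholders.
RELATION_TARGETS = {
    "Act": EntityCategory.BEHAVIOR,
    "ComponentPrerequisite": EntityCategory.PREREQUISITE,
    "ComponentManner": EntityCategory.MANNER,
    "ComponentConstraint": EntityCategory.CONSTRAINT,
}


def dissect(entities, relations):
    """Group related entities around each Component.

    entities:  dict of entity id -> (EntityCategory, text)
    relations: list of (relation name, component id, other entity id)
    Simplification: a later entity of the same category overwrites an earlier one.
    """
    tuples = {}
    for ent_id, (category, text) in entities.items():
        if category is EntityCategory.COMPONENT:
            tuples[ent_id] = {EntityCategory.COMPONENT.value: text}
    for name, comp_id, other_id in relations:
        category, text = entities[other_id]
        if comp_id not in tuples or category is not RELATION_TARGETS[name]:
            raise ValueError(f"relation {name!r} links incompatible entities")
        tuples[comp_id][category.value] = text
    return list(tuples.values())


# Entities and Act relations quoted for Test Case #346.
entities_346 = {
    1: (EntityCategory.BEHAVIOR, "browse"),
    2: (EntityCategory.COMPONENT, "content of each resource directory"),
    3: (EntityCategory.BEHAVIOR, "switch"),
    4: (EntityCategory.COMPONENT, "visit history"),
}
relations_346 = [("Act", 2, 1), ("Act", 4, 3)]

tuples_346 = dissect(entities_346, relations_346)
for t in tuples_346:
    print(t)
```

Keying every tuple on a Component mirrors the statement that components are the basic objects of the testing content; the endpoint check rejects a relation whose entities belong to the wrong categories.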
2.2 Correlation Analysis
We conduct an empirical study to investigate the effectiveness of the entity categories for redundancy detection. Specifically, we randomly sample 5,000 test case pairs and manually label each pair by comparing its two test cases.³ Then, we build five Boolean variables by manual judgment, i.e., $EQ_{com}$, $EQ_{beh}$, $EQ_{pre}$, $EQ_{man}$, and $EQ_{con}$. Each variable indicates whether the entities belonging to the corresponding category in the two summaries are manually judged as equivalent. At the same time, a variable $Redundant$ is built according to the redundancy label (not based on entity comparison), representing whether a test case pair is truly redundant.
We analyze the correlation between the above five variables and the variable $Redundant$. Table 2 shows the Pearson correlation coefficient and p-value of the correlation test. The results show that the five entity categories are significantly correlated with the variable $Redundant$, which indicates the effectiveness of each entity category for redundancy detection. Moreover, we analyze the consistency of two variables, i.e., $EQ_{all}$ and $Redundant$, where $EQ_{all}$ represents that the entities belonging to the five entity categories in the test case pair are all equivalent by manual comparison. Cohen's kappa coefficient is 0.984, which shows the significant consistency of the two distributions. The results indicate that redundant
³ The test case pairs are built from the dataset in Table 4. The pairing and labeling processes are consistent with the descriptions in Section 4.2.
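For readers who want to reproduce this kind of analysis on their own labels, the following sketch computes the Pearson correlation coefficient and Cohen's kappa for two Boolean variables. The eight-pair label vectors are invented toy data, not the paper's 5,000 pairs; the function names are our own, and the p-value reported in Table 2 is not computed here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Assumes the two label sequences are not both constant with the same
    value (that would make the chance agreement equal to 1)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Toy labels for eight test case pairs (1 = true, 0 = false); invented data.
eq_all = [1, 1, 0, 0, 1, 0, 0, 0]     # all five entity categories equivalent
redundant = [1, 1, 0, 0, 0, 0, 0, 0]  # pair labeled as truly redundant

r = pearson(eq_all, redundant)
kappa = cohen_kappa(eq_all, redundant)
print(f"Pearson r = {r:.3f}, Cohen's kappa = {kappa:.3f}")
```

On the toy data the two variables disagree on one pair, which gives a Pearson coefficient of about 0.745 and a kappa of about 0.714; the kappa of 0.984 reported above corresponds to far closer agreement.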