ReAct: A Review Comment Dataset for Actionability (and more)
Gautam Choudhary1, Natwar Modani1, and Nitish Maurya2
1Adobe Research
2Adobe Systems
{gautamc, nmodani, nmaurya}@adobe.com
Abstract. Review comments play an important role in the evolution
of documents. For a large document, the number of review comments
may become large, making it difficult for the authors to quickly grasp
what the comments are about. It is important to identify which comments require
some action on the part of the document authors, along with the types of these
comments. In this paper, we introduce an annotated review comment dataset,
ReAct. The review comments are sourced from the OpenReview site.
We crowd-source annotations for these reviews for actionability and type
of comments. We analyze the properties of the dataset and validate the
quality of annotations. We release the dataset to the research community
as a major contribution3. We also benchmark our data with standard
baselines for classification tasks and analyze their performance.
Keywords: review dataset · actionability · taxonomy · text classification
1 Introduction
Review comments play an important role in the evolution of documents. Academic
publications routinely go through a peer-review process, where the reviewers
provide both their opinion about the suitability of the article for the publication
venue and feedback to the authors for potentially improving the contributed article.
Further, several publication venues (for example, ACL, NAACL, and EMNLP, in
addition to most journals) give the authors a chance to respond to the review
comments. Therefore, it is important for the authors to be able to quickly digest
the review comments so that they can address the reviewers' concerns and clarify
points that may not have been communicated adequately by the article itself.
In this work, we focus on two aspects of understanding the review comments.
First, determining if a review comment requires some action on the part of document
authors. This motivates the need for the classification of review comments based on
‘actionability’. Second, what type of review comment it is among Agreement,
Disagreement, Question, Suggestion, Shortcoming, Statement of Fact, and Others,
similar to (but not exactly the same as) [11]. We provide the reason for our choice
of these specific types and their justification in Section 2.2.

3 Full dataset available at https://github.com/gtmdotme/ReAct
Text classification has long been an active area of research, as classification can
help users efficiently process a large amount of content. Finding actionable
comments on social media (tweets) was addressed in [13] using new lexicon features.
A specificity score was explored in [3] in employee satisfaction survey and product
review settings to understand actionable suggestions and grievances (complaints)
for improvement. In yet another work [9], the actionability of review comments for
code review is investigated using lexical features. These works address only the
actionability aspect of our problem, and except for [9], the datasets used in these
papers are not publicly available.
Other binary classifications in prior work include question classification [15],
agreement/disagreement classification [1], and suggestion/advice mining [4].
However, such binary classifications only provide information on a single dimension
in isolation and fall short of the more extensive categorization done in [10], where
the authors investigated comments on product reviews in an e-commerce setting.
Again, the datasets are not publicly available, and the proposed categories are not
comprehensive.
OpenReview is a popular online forum for reviewing research papers, and the choice
of gathering data from this forum is motivated by a comprehensive study analyzing
the review process [14]. [6] also present the PeerRead dataset, which consolidates
reviews from several conferences. Our dataset provides finer-grained annotation,
with two labels per review comment sentence, and thereby opens up a new research
direction.
Our key contributions in this paper are:
– A review comment dataset consisting of 1,250 labeled comments for identifying
actionability and their types. We also have 52k+ unlabelled (but otherwise
processed) comments in this dataset for future extensions and/or use of
semi-supervised approaches.
– A taxonomy for types of review comments.
– Establishing strong baselines for the proposed dataset (a generic illustration is
sketched after this list).
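As an illustration of the last point, the sketch below shows what a generic text-classification baseline on this kind of data could look like. It is not the authors' benchmark setup; the file name react_labeled.csv and the column names comment and actionability are hypothetical placeholders for whatever the released files actually contain.

# Illustrative baseline only: TF-IDF features + logistic regression.
# The CSV path and column names below are hypothetical placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("react_labeled.csv")  # assumed export of the 1,250 labeled comments

X_train, X_test, y_train, y_test = train_test_split(
    df["comment"], df["actionability"],
    test_size=0.2, random_state=42, stratify=df["actionability"])

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word unigrams and bigrams
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))

The same pipeline applies to the comment-type labels by swapping in the corresponding target column.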
2 Dataset: ReAct
While the prior art focuses on feature engineering and model architecture, we note
a lack of publicly available datasets for this problem. This section describes how
we arrive at the proposed annotated dataset, ReAct.
In this paper, we use Fleiss’ kappa (κ) [5] as the measure of inter-annotator
agreement. It is used to determine the level of agreement between two or more
annotators when the response variable is measured on a categorical scale.
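Concretely, for a set of items each rated by the same number of annotators, the standard definition is

\kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e},

where \bar{P} is the mean observed pairwise agreement across items and \bar{P}_e is the agreement expected by chance from the overall category proportions; κ = 1 indicates perfect agreement, while values near or below 0 indicate agreement no better than chance. A minimal computational sketch (using statsmodels; the toy label matrix is purely illustrative):

# Sketch: computing Fleiss' kappa for crowd-sourced labels.
# `labels` is a hypothetical (n_comments, n_annotators) matrix of category ids.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

labels = np.array([[0, 0, 1], [1, 1, 1], [2, 2, 0]])  # toy data: 3 comments, 3 annotators
table, _ = aggregate_raters(labels)   # per-item counts for each category
print(fleiss_kappa(table, method="fleiss"))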
2.1 Raw Data Collection and Preprocessing
The proposed dataset is gathered from an online public forum, OpenReview, where
where research papers are reviewed and discussed. Multiple anonymous reviewers