
DISENTANGLED AND ROBUST REPRESENTATION LEARNING
FOR BRAGGING CLASSIFICATION IN SOCIAL MEDIA
Xiang Li1, Yucheng Zhou2
1College of Intelligence and Computing, Tianjin University,
2Australian AI Institute, School of Computer Science, FEIT, University of Technology Sydney
lixiang eren@tju.edu.cn, yucheng.zhou-1@student.uts.edu.au
ABSTRACT
Researching bragging behavior on social media has aroused the
interest of computational (socio)linguists. However, existing
bragging classification datasets suffer from a severe data
imbalance issue. Because labeling a class-balanced dataset is
expensive, most methods introduce external knowledge to im-
prove model learning. Nevertheless, such methods inevitably
introduce noise and irrelevant information from exter-
nal knowledge. To overcome this drawback, we propose a
novel bragging classification method with disentangle-based
representation augmentation and domain-aware adversarial
strategy. Specifically, the model learns to disentangle and re-
construct representations and to generate augmented features
via disentangle-based representation augmentation. More-
over, the domain-aware adversarial strategy constrains the
domain of the augmented features to improve their robustness.
Experimental results demonstrate that our method achieves
state-of-the-art performance compared to other methods.
Index Terms—Bragging Classification, Disentangled
Feature, Adversarial Learning, Social Media
1. INTRODUCTION
Bragging classification aims to predict the bragging type for a
social media text. As online communication on social media
grows increasingly pervasive and essential to human life, bragging (or
self-promotion) classification has become a significant area
in computational (socio) linguistics [1, 2]. It has been widely
applied in academia and industry, e.g., helping linguists dive
into the context and types of bragging [2], supporting so-
cial scientists in studying the relation between bragging and
other traits (e.g., gender, age, economic status, occupation)
[1, 3], enhancing online users’ self-presentation strategies
[4, 5], and many real-world NLP applications in business,
economics and education [6, 7].
Although bragging has been widely studied in the context
of online communication and forums, these studies all depend
on manual analyses of small datasets [8, 4, 9, 10, 3]. To
efficiently research bragging on social media, Jin et al. [2]
collect the first large-scale dataset of bragging classification in
computational linguistics, which contains six bragging types
and a non-bragging type. However, the dataset suffers from
a heavy data imbalance issue. For example, there are 2,838
examples in the non-bragging type, while only 58 to 166 (i.e.,
1% ∼ 4%) in the other bragging types. This imbalance severely hampers
model learning on examples of the minority bragging types.
To alleviate the data imbalance issue, apart from employ-
ing a weighted loss function to balance sample learning from
different types [11, 12], many researchers attempt to perform
data augmentation by injecting models with external knowl-
edge, such as knowledge graphs [13, 14], pre-trained word em-
beddings [15, 16], translation [17], and related pragmat-
ics tasks, e.g., complaint severity classification [18]. As for
bragging classification, Jin et al. [2] inject language models
with external knowledge from the NRC word-emotion lexi-
con, Linguistic Inquiry and Word Count (LIWC), and vectors
clustered by Word2Vec. Despite their success, the improve-
ment from external knowledge injection relies on the relevance
between bragging classification and other pragmatics tasks.
However, knowledge provided by other pragmatic tasks is
fixed and obtained in a model-based manner, which inevitably
brings noise.
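The class-weighted loss mentioned above can be sketched as follows. The class counts mirror the statistics reported for the bragging dataset, and the weighting scheme (inverse class frequency) is one common instantiation rather than necessarily the exact one used in [11, 12]; the class names beyond "not bragging" are illustrative:

```python
# Hedged sketch: inverse-frequency class weighting for an imbalanced
# dataset such as the one described above (2,838 non-bragging examples
# vs. 58-166 examples per bragging type).
import math

counts = {"not_bragging": 2838, "achievement": 166, "possession": 58}
total = sum(counts.values())
n_classes = len(counts)

# weight_c = total / (n_classes * count_c): rarer classes get larger weights,
# so their training signal is not drowned out by the majority class.
weights = {c: total / (n_classes * n) for c, n in counts.items()}

def weighted_nll(log_prob: float, label: str) -> float:
    """Negative log-likelihood of the gold label, scaled by its class weight."""
    return -weights[label] * log_prob

# The same prediction confidence contributes far more loss on a rare
# bragging type than on the majority non-bragging type.
rare = weighted_nll(math.log(0.9), "possession")
common = weighted_nll(math.log(0.9), "not_bragging")
```

In practice this reduces to passing per-class weights to a standard cross-entropy loss; the point is only that minority-type errors are up-weighted, which alleviates but does not eliminate the imbalance problem.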
To avoid the noise introduced by external knowledge injec-
tion, we propose a disentangle-based feature augmentation
that learns disentangled representations and augmented fea-
tures without any external knowledge. Specifically, we
first disentangle content and bragging-type information from
a representation. Next, we generate a reconstructed represen-
tation by integrating the disentangled information and then con-
strain the consistency between the original and reconstructed
representations. To address the data imbalance problem, we
fuse disentangled information from different examples to gen-
erate augmented features for model training.
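A minimal sketch of this disentangle-reconstruct-fuse idea, assuming (hypothetically) that each encoder representation splits into a content part and a bragging-type part; in the actual model the split would be produced by learned projection heads and the consistency enforced by a training loss, not by fixed slicing:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy representation size

def disentangle(h):
    """Hypothetical split: first half as content, second half as type info."""
    return h[:dim // 2], h[dim // 2:]

def reconstruct(content, brag_type):
    """Recombine the two parts; training constrains the result to stay
    consistent with the original representation."""
    return np.concatenate([content, brag_type])

h_a = rng.standard_normal(dim)  # example from a rare bragging type
h_b = rng.standard_normal(dim)  # example from a different type

c_a, t_a = disentangle(h_a)
c_b, t_b = disentangle(h_b)

# Consistency: reconstructing from an example's own parts recovers it.
assert np.allclose(reconstruct(c_a, t_a), h_a)

# Augmentation: fuse the content of one example with the bragging-type
# information of another to synthesize an extra training feature for
# the rare type, without any external knowledge.
augmented = reconstruct(c_b, t_a)
```

The fused feature carries the rare type's label, so minority classes gain additional (synthetic) training examples drawn entirely from in-dataset representations.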
Moreover, we propose a domain-aware adversarial strat-
egy to mitigate domain disorder caused by augmented fea-
tures. Specifically, we present a discriminator on top of the
language model, which is trained to distinguish whether the
input is a representation from the encoder or an augmented
feature. Meanwhile, jointly with a classification objective,
the encoder is trained to fool the discriminator, which pushes
arXiv:2210.15180v1 [cs.CL] 27 Oct 2022