Time Will Change Things: An Empirical Study on Dynamic Language
Understanding in Social Media Classification
Yuji Zhang, Jing Li
Department of Computing,
The Hong Kong Polytechnic University,
HKSAR, China
yu-ji.zhang@connect.polyu.hk jing-amelia.li@polyu.edu.hk
Abstract
Language features are ever-evolving in the real-world social media environment. Many trained models in natural language understanding (NLU), ineffective in semantic inference for unseen features, may consequently suffer deteriorating performance under such dynamicity. To address this challenge, we empirically study social media NLU in a dynamic setup, where models are trained on past data and tested on future data. This setup better reflects realistic practice than the commonly adopted static setup of random data splits. To further analyze model adaptation to the dynamicity, we explore the usefulness of leveraging unlabeled data created after a model is trained. The experiments examine unsupervised domain adaptation baselines based on auto-encoding and pseudo-labeling, as well as a joint framework coupling them both. Results on four social media tasks imply that evolving environments universally and negatively affect classification accuracy, while auto-encoding and pseudo-labeling collaboratively show the best robustness to dynamicity.
1 Introduction
The advance of natural language understanding (NLU) automates the learning of text semantics, exhibiting the potential to broadly benefit social media applications. As shown in previous work (Tong et al., 2021; Heidari and Jones, 2020; Salminen et al., 2020), pre-trained models from the BERT family (Devlin et al., 2019; Liu et al., 2019; Nguyen et al., 2020) have topped the benchmark results in many social media tasks. Nevertheless, will good benchmark results also indicate good real-world performance on social media?
In view of our dynamic world, it is not hard to envision an ever-evolving environment on social media, shaped in real time by what is discussed there and how. As a result, language features, formed by the word patterns appearing there, may also change rapidly over time. However, many popular NLU models, including the state-of-the-art (SOTA) ones based on pre-training, demonstrate compromised empirical results when facing shifted features (Hendrycks et al., 2020). The possible reason lies in the widely-argued limitation of existing NLU solutions in inferring the meanings of features that are new or shifted compared to what the models have seen in the training data (Duchi and Namkoong, 2018; Arjovsky et al., 2019; Creager et al., 2021; Shen et al., 2020; Liu et al., 2021b).
Consequently, the dynamic social media environment in realistic scenarios will continuously challenge a trained NLU model with an ever-growing set of unseen features (Nguyen et al., 2012), further resulting in deteriorating performance as time goes by. To better illustrate this challenge, we take the task and dataset of Twitter stance detection for COVID-19 topics as an example (Glandt et al., 2021). Two models based on LSTM and BERT are trained on the past data and tested on five datasets with varying time gaps from the training set. The setup and results are detailed in Figure 1 (right).
Both models exhibit dropping accuracy scores over time, implying a concrete challenge for them in tackling dynamicity. To further analyze the reasons, we employ a variational auto-encoder (VAE) (Kingma and Welling, 2014) to learn the latent topics (word clusters) from the varying test sets and display the words exhibiting the largest correlation with each cluster in Figure 1 (left). It is observed that users' discussion points change over time, where the focus gradually shifted from concern about the virus itself (indicated by words like "Mask", "Immune Compromise", "Lock Down") to disappointment with the former US President Trump (e.g., "Trump Land Slid" and "Lying Trump").
Because of the topic evolution, it might not be easy for models trained with the t0 data to connect the later-gathered "Trump" patterns to an "against" stance for topics related to his COVID-19 policies.

Time | Topics
t0 | Scientist Doctor, Covid, Corona Virus, No Mask, Return To Work, Govern, Nose, Mouth, Dread
t1 | Real Patriots Wear Mask, Failed Lock Down, 2020 US Election, Save, Immune Compromise, Covid19 Outbreak, Schools Wear a Mask, Corona Virus Pakistan, Corona Virus Canada
t2 | Symptom, Temperature, Lock Down, Panic Buy, Cough, I m With Fauci, Trump Kills Us, Mask Up
t3 | Therapy, Inject, Wear Your Mask, Trump Is A National Disgrace, Trump Land Slid, Surgeon, Covid 2019 India, Wear You Masks Dont Work
t4 | Red State, Blue State, Trump Lies Americans, Lying Trump, Trump Melt Down, End Lock Down

Figure 1: Results from the Twitter stance detection dataset for COVID-19 topics (Glandt et al., 2021). t0 refers to the time span of the earliest 40% of tweets, and the rest are equally split into 4 segments in chronological order, corresponding to t1, t2, t3, and t4, respectively. Latent topics from t0 to t4 are shown on the left, with topic words learned by VAE. Stance detection results over time are shown on the right, where the x-axis indicates test sets from t0 to t4 and the y-axis the prediction accuracy. LSTM results are displayed in the light blue line and BERT results in the dark blue line.
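As a loose illustration of how such per-period topic words can be read off, the sketch below assumes (hypothetically) a ProdLDA-style neural topic model whose decoder is a single linear layer mapping a K-dimensional topic mixture back to vocabulary logits; the paper only states that a VAE is used, so the decoder structure here is our assumption.

```python
# A loose sketch of reading topic words off a trained neural topic model,
# assuming (hypothetically) a ProdLDA-style VAE whose decoder is one linear
# layer mapping a K-dim topic mixture to vocabulary logits; decoder.weight
# then has shape (vocab_size, K), and column k scores word-topic association.
import torch

def top_topic_words(decoder: torch.nn.Linear, id2word: dict, n_words: int = 8):
    weight = decoder.weight.detach()              # (vocab_size, num_topics)
    topics = []
    for k in range(weight.shape[1]):
        top_ids = torch.topk(weight[:, k], n_words).indices.tolist()
        topics.append([id2word[i] for i in top_ids])
    return topics

# Fitting one such model per time slice (t0..t4) and printing its topics is
# one way the drift shown in Figure 1 (left) could be inspected.
```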
To empirically examine how dynamicity affects NLU performance, we experiment in a dynamic setup: the data is split at an absolute time, where the messages posted beforehand are used for training while those posted afterwards are for testing. On the contrary, most social media benchmarks adopt the static setup, where training and test sets are randomly split and tend to exhibit similar data distributions (Glandt et al., 2021; Hansen et al., 2021; Mathew et al., 2021). The static setup is thus incapable of reflecting realistic application scenarios: a model should usually learn to tackle data created after it is trained, while the evolving features continuously shift the data distributions.
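The contrast between the two setups reduces to how the data is partitioned; a minimal sketch follows, assuming each post is a dict with hypothetical "text", "label", and "timestamp" fields.

```python
# A minimal sketch contrasting the dynamic and static setups, assuming each
# post is a dict with hypothetical "text", "label", and "timestamp" fields.
import random
from datetime import datetime

def dynamic_split(posts, cutoff):
    """Dynamic setup: train on posts before an absolute time, test on the rest."""
    train = [p for p in posts if p["timestamp"] < cutoff]
    test = [p for p in posts if p["timestamp"] >= cutoff]
    return train, test

def static_split(posts, train_ratio=0.8, seed=42):
    """Static setup: random split, so train/test distributions stay similar."""
    shuffled = posts[:]                      # copy before shuffling
    random.Random(seed).shuffle(shuffled)
    k = int(len(shuffled) * train_ratio)
    return shuffled[:k], shuffled[k:]

# e.g., train on tweets posted before July 2020 and test on those afterwards:
# train_set, test_set = dynamic_split(posts, cutoff=datetime(2020, 7, 1))
```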
Language learning with distribution shift (a.k.a. OOD, short for out-of-distribution) has drawn growing attention in the NLP community (Shen et al., 2021; Arora et al., 2021). Most previous work focuses on OOD across different domains (Muandet et al., 2013; Ganin et al., 2015) and studies how to learn generalizable cross-domain features. Here we instead examine OOD in a dynamic environment, whose time-sensitive nature renders the data evolution progressive and continuous; most prior empirical studies, by contrast, discuss OOD across domains and hence focus on the relatively discrete shifts from source to target domains (Volpi et al., 2018; Krueger et al., 2021).
To further examine NLU adaptation to time evolution (henceforth time-adaptive learning), we exploit a small set of unlabeled data posted after a model is trained (henceforth trans-data) and investigate its potential in mitigating the time-shaped feature gap. For methodology, we start with existing solutions in unsupervised domain adaptation (UDA) (Ramponi and Plank, 2020) and employ two popular baselines in this line: one is feature-centric, based on auto-encoding (specifically VAE), and the other is data-centric, based on pseudo-labeling (PL). Furthermore, a joint-training framework is explored to study their coupled effects in fighting against the possible performance deterioration over time.
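To make the data-centric side concrete, here is a minimal sketch of a PL step; the `model` object and its `predict_proba(texts)` method are hypothetical stand-ins for any trained classifier, and the confidence threshold is an illustrative choice.

```python
# A minimal sketch of the data-centric PL baseline, assuming a hypothetical
# `model` exposing predict_proba(texts) -> (n, num_classes) probabilities.
import numpy as np

def pseudo_label(model, trans_texts, threshold=0.9):
    """Turn confident predictions on unlabeled trans-data into extra examples."""
    probs = np.asarray(model.predict_proba(trans_texts))
    conf = probs.max(axis=1)                 # confidence of the top class
    preds = probs.argmax(axis=1)             # predicted (pseudo) labels
    return [(t, int(y)) for t, y, c in zip(trans_texts, preds, conf)
            if c >= threshold]

# pseudo_set = pseudo_label(model, trans_texts)
# then retrain on the history labeled data plus the pseudo-labeled trans-data:
# model.fit(history_labeled + pseudo_set)
```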
The experiments are based on three popular social media tasks, concerning the detection of COVID-19 stance (Glandt et al., 2021), fake news (Hansen et al., 2021), and hate speech (Mathew et al., 2021), with benchmark data from Twitter. We also gather a new corpus for hashtag prediction to broaden our scope to the noisy user-generated labels that are abundant on social media.¹ The dynamic setup is adopted, and models are tested on multiple datasets varying in their time gap to the training data in order to quantify model sensitivity to the time evolution.
In the main results, the performance of all models generally degrades over time, implying that the dynamic social media environment may universally and negatively affect NLU effectiveness. With some trans-data, both VAE and PL can helpfully tackle dynamicity, and their joint framework achieves the best results consistently over time. We then analyze the effects of the trans-data scale and creation time, and find that both PL and VAE might benefit from trans-data with larger scales and smaller time gaps to the training data. At last, case studies interpret how VAE and PL collaboratively handle the dynamic environments.
To conclude, we present the first empirical study, to the best of our knowledge, on the universal effects of the dynamic social media environment on NLU, and provide insights into when and how UDA methods help advance model robustness over time.

¹Hashtags are tagged by the author of a post to indicate its topic label and start with a hash "#", e.g., "#COVID19".
2 Related Work
This paper is in line with previous work on the out-of-distribution (OOD) issue, aiming to mitigate the distribution gap between training and test data (Xie et al., 2021; Shen et al., 2021; Liu et al., 2021a). Most prior OOD studies experiment on domain gaps, which tend to exhibit intermittent change, whereas the shift shaped by time usually happens step by step and hence forms a continuous process. Limited attention has been paid to examining NLU models' practical and general performance in handling the evolving social media environment, and our empirical study is an initial attempt to fill this gap.
In previous OOD work, various domain adaptation methods have been explored (Chu and Wang, 2018; Ramesh Kashyap et al., 2021), e.g., adversarial learning (Liu et al., 2020), pre-training (Hendrycks et al., 2020; Goyal and Durrett, 2020; Kong et al., 2020), and data augmentation (Chen et al., 2021). Some of them require labeled data from both the source and target domains (Arora et al., 2021) to learn cross-domain features. This is however infeasible in our time-shaped OOD scenarios because of the difficulty of continuously labeling data.
Our baseline solutions are inspired by existing methods in unsupervised domain adaptation (UDA), which employ labeled source data and unlabeled target data for model training. Popular UDA baselines mostly fall into feature-centric and data-centric categories (Ramponi and Plank, 2020). The former explores implicit clusters to bridge semantic features across domains (Gururangan et al., 2019), while the latter transfers knowledge gained from the source to the target via self-training (Axelrod et al., 2011). In our experiments, VAE and pseudo-labeling (PL) are the popular baselines selected to represent feature- and data-centric UDA, respectively, whereas their individual and collaborative performance in advancing NLU robustness in dynamicity has never been studied before and will be explored here.
This work is also inspired by previous studies applying a dynamic setup to certain social media tasks, such as content recommendation (Zeng et al., 2020; Zhang et al., 2021). Based on their efforts, we take a step further to broadly examine various classification tasks in order to draw a more general conclusion on how dynamicity affects NLU.
3 Time-Adaptive Learning Baselines
Here we discuss time-adaptive learning baselines and how we leverage them for social media classification in dynamicity. We will start with the classification overview, followed by the introduction to the VAE- and PL-based baselines (§3.1) and how they can collaboratively work via joint training (§3.2).

Figure 2: Our integrated framework of VAE and PL. VAE-learned features ($z_s$) are injected into the MLP together with the BERT-encoded $b_s$ (indicated as BERT-VAE-MLP). PL-predicted trans-data (pseudo data) and the labeled data from the past are both used to train BERT-VAE-MLP.
Classification Overview. The NLU in a dynamic setup will be examined on multiple tasks, all formulated as post-level single-label classification. The input is a social media post $s$ and the output a label $l$ specified by the task. Here we assume the availability of two data types: (1) posts with gold-standard labels created in the past (henceforth history labeled data), which can be employed to train a supervised classifier; (2) posts created after the classifier is trained and without labels (i.e., trans-data).
For classification, following common practice (Devlin et al., 2019), the representation for language understanding is built with a pre-trained BERT encoder, where we feed in the input $s$ and obtain a latent vector $b_s$ as the post embedding.
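A minimal sketch of this encoding step is shown below; the Hugging Face transformers toolkit and the use of the [CLS] hidden state as $b_s$ are our assumptions, as the paper only specifies a pre-trained BERT encoder.

```python
# A minimal sketch of computing the post embedding b_s, assuming the Hugging
# Face transformers toolkit and the [CLS] hidden state as the representation.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed_post(s: str) -> torch.Tensor:
    inputs = tokenizer(s, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state[:, 0]   # (1, hidden): the [CLS] vector b_s
```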
At the output, the learned classification features are mapped to a specific label $\hat{y}_s$ with the formula:

$$\hat{y}_s = f_{out}(W_{out} \cdot r_s + b_{out}) \tag{1}$$

where $f_{out}(\cdot)$ is the activation function for the classification output (e.g., softmax), and $W_{out}$ and $b_{out}$ are learnable parameters for training. $r_s$ couples the BERT-encoded latent semantics ($b_s$) and the implicit cross-time features gained by VAE (as a feature-centric UDA method) via a multi-layer perceptron (MLP):
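The excerpt is cut off before the MLP coupling formula itself, so the following is only a hedged sketch consistent with Figure 2 and Eq. (1): $b_s$ and the VAE latent $z_s$ are concatenated and passed through an MLP to form $r_s$, after which a linear layer plus softmax yields the label distribution. All dimensions and layer choices are illustrative assumptions.

```python
# A hedged sketch of the BERT-VAE-MLP classification head; dimensions and
# layer choices are illustrative assumptions, not the paper's specification.
import torch
import torch.nn as nn

class BertVaeMlpHead(nn.Module):
    def __init__(self, bert_dim=768, vae_dim=50, hidden=256, num_labels=3):
        super().__init__()
        self.mlp = nn.Sequential(                # couples b_s and z_s into r_s
            nn.Linear(bert_dim + vae_dim, hidden),
            nn.ReLU(),
        )
        self.out = nn.Linear(hidden, num_labels) # W_out, b_out of Eq. (1)

    def forward(self, b_s, z_s):
        r_s = self.mlp(torch.cat([b_s, z_s], dim=-1))
        return torch.softmax(self.out(r_s), dim=-1)  # f_out = softmax
```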