Time Will Change Things: An Empirical Study on Dynamic Language
Understanding in Social Media Classification
Yuji Zhang, Jing Li
Department of Computing,
The Hong Kong Polytechnic University,
HKSAR, China
yu-ji.zhang@connect.polyu.hk jing-amelia.li@polyu.edu.hk
Abstract
Language features are ever-evolving in the real-world social media environment. Many trained models in natural language understanding (NLU), ineffective in semantic inference for unseen features, may consequently suffer deteriorating performance under such dynamicity. To address this challenge, we empirically study social media NLU in a dynamic setup, where models are trained on past data and tested on future data. This setup better reflects realistic practice than the commonly adopted static setup of random data splits. To further analyze model adaptation to the dynamicity, we explore the usefulness of leveraging unlabeled data created after a model is trained. The experiments examine unsupervised domain adaptation baselines based on auto-encoding and pseudo-labeling, as well as a joint framework coupling them both. Results on four social media tasks imply that evolving environments universally and negatively affect classification accuracy, while auto-encoding and pseudo-labeling collaboratively show the best robustness to dynamicity.
1 Introduction
The advance of natural language understanding (NLU) automates the learning of text semantics, exhibiting the potential to broadly benefit social media applications. As shown in previous work (Tong et al., 2021; Heidari and Jones, 2020; Salminen et al., 2020), pre-trained models from the BERT family (Devlin et al., 2019; Liu et al., 2019; Nguyen et al., 2020) have topped the benchmark results in many social media tasks. Nevertheless, will good benchmark results also indicate good real-world performance on social media?
In view of our dynamic world, it is not hard to envision an ever-evolving environment on social media, shaped in real time by what is discussed there and how. As a result, language features, formed by the word patterns appearing there, may also change rapidly over time. However, many popular NLU models, including the state-of-the-art (SOTA) ones based on pre-training, demonstrate compromised empirical results when facing shifted features (Hendrycks et al., 2020). The possible reason lies in the widely-argued limitation of existing NLU solutions in inferring the meanings of features that are new or shifted compared to what the models have seen in the training data (Duchi and Namkoong, 2018; Arjovsky et al., 2019; Creager et al., 2021; Shen et al., 2020; Liu et al., 2021b).
Consequently, the dynamic social media environment in realistic scenarios will continuously challenge a trained NLU model with an ever-growing set of unseen features (Nguyen et al., 2012), further resulting in deteriorating performance as time goes by. To better illustrate this challenge, we take the task and dataset of Twitter stance detection for COVID-19 topics as an example (Glandt et al., 2021). Two models based on LSTM and BERT are trained on the past data and tested on five datasets with varying time gaps from the training set. The setup and results are detailed in Figure 1 (right).
Both models exhibit dropping accuracy scores over time, implying a concrete challenge for them in tackling dynamicity. To further analyze the reasons, we employ a variational auto-encoder (VAE) (Kingma and Welling, 2014) to learn the latent topics (word clusters) from the varying test sets and display the words exhibiting the largest correlation with each cluster in Figure 1 (left). It is observed that users' discussion points change over time, where the focus gradually shifted from concern about the virus itself (indicated by words like "Mask", "Immune Compromise", "Lock Down") to disappointment with the former US President Trump (e.g., "Trump Land Slid" and "Lying Trump").
Because of the topic evolution, it might not be easy for models trained with the t0 data to connect the later-gathered "Trump" patterns to an "against" stance for topics related to his COVID-19 policies.

Time | Topics
t0 | Scientist Doctor, Covid, Corona Virus, No Mask, Return To Work, Govern, Nose, Mouth, Dread
t1 | Real Patriots Wear Mask, Failed Lock Down, 2020 US Election, Save, Immune Compromise, Covid19 Outbreak, Schools Wear a Mask, Corona Virus Pakistan, Corona Virus Canada
t2 | Symptom, Temperature, Lock Down, Panic Buy, Cough, I m With Fauci, Trump Kills Us, Mask Up
t3 | Therapy, Inject, Wear Your Mask, Trump Is A National Disgrace, Trump Land Slid, Surgeon, Covid 2019 India, Wear You Masks Dont Work
t4 | Red State, Blue State, Trump Lies Americans, Lying Trump, Trump Melt Down, End Lock Down

Figure 1: Results from the Twitter stance detection dataset for COVID-19 topics (Glandt et al., 2021). t0 refers to the time span of the earliest 40% of tweets, and the rest are equally split into 4 segments in chronological order, corresponding to t1, t2, t3, and t4, respectively. Latent topics from t0 to t4 are shown on the left, with topic words learned by VAE. Stance detection results over time are shown on the right, where the x-axis indicates test sets from t0 to t4 and the y-axis the prediction accuracy. LSTM results are displayed in the light blue line and BERT results in the dark blue line.
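As a loose illustration of how such per-period topic words can be read off, the sketch below assumes (hypothetically) a ProdLDA-style neural topic model whose decoder is a single linear layer mapping a K-dimensional topic mixture back to vocabulary logits; the paper only states that a VAE is used, so the decoder structure here is our assumption.

```python
# A loose sketch of reading topic words off a trained neural topic model,
# assuming (hypothetically) a ProdLDA-style VAE whose decoder is one linear
# layer mapping a K-dim topic mixture to vocabulary logits; decoder.weight
# then has shape (vocab_size, K), and column k scores word-topic association.
import torch

def top_topic_words(decoder: torch.nn.Linear, id2word: dict, n_words: int = 8):
    weight = decoder.weight.detach()              # (vocab_size, num_topics)
    topics = []
    for k in range(weight.shape[1]):
        top_ids = torch.topk(weight[:, k], n_words).indices.tolist()
        topics.append([id2word[i] for i in top_ids])
    return topics

# Fitting one such model per time slice (t0..t4) and printing its topics is
# one way the drift shown in Figure 1 (left) could be inspected.
```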
To empirically examine how dynamicity affects NLU performance, we experiment in a dynamic setup: the data is split at an absolute time, where the messages posted beforehand are used for training while those posted afterwards are for testing. On the contrary, most social media benchmarks adopt the static setup, where training and test sets are randomly split and tend to exhibit similar data distributions (Glandt et al., 2021; Hansen et al., 2021; Mathew et al., 2021). The static setup is thus incapable of reflecting realistic application scenarios: a model should usually learn to tackle data created after it is trained, while the evolving features continuously shift the data distributions.
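The contrast between the two setups reduces to how the data is partitioned; a minimal sketch follows, assuming each post is a dict with hypothetical "text", "label", and "timestamp" fields.

```python
# A minimal sketch contrasting the dynamic and static setups, assuming each
# post is a dict with hypothetical "text", "label", and "timestamp" fields.
import random
from datetime import datetime

def dynamic_split(posts, cutoff):
    """Dynamic setup: train on posts before an absolute time, test on the rest."""
    train = [p for p in posts if p["timestamp"] < cutoff]
    test = [p for p in posts if p["timestamp"] >= cutoff]
    return train, test

def static_split(posts, train_ratio=0.8, seed=42):
    """Static setup: random split, so train/test distributions stay similar."""
    shuffled = posts[:]                      # copy before shuffling
    random.Random(seed).shuffle(shuffled)
    k = int(len(shuffled) * train_ratio)
    return shuffled[:k], shuffled[k:]

# e.g., train on tweets posted before July 2020 and test on those afterwards:
# train_set, test_set = dynamic_split(posts, cutoff=datetime(2020, 7, 1))
```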
Language learning with distribution shift (a.k.a. OOD, short for out-of-distribution) has drawn growing attention in the NLP community (Shen et al., 2021; Arora et al., 2021). Most previous work focuses on OOD across different domains (Muandet et al., 2013; Ganin et al., 2015) and studies how to learn generalizable cross-domain features. Here we instead examine OOD in a dynamic environment, whose time-sensitive nature renders the data evolution progressive and continuous; most prior empirical studies, by contrast, discuss OOD across domains and hence focus on the relatively discrete shifts from source to target domains (Volpi et al., 2018; Krueger et al., 2021).
To further examine NLU adaptation to time evolution (henceforth time-adaptive learning), we exploit a small set of unlabeled data posted after a model is trained (henceforth trans-data) and investigate its potential in mitigating the time-shaped feature gap. For methodology, we start with existing solutions in unsupervised domain adaptation (UDA) (Ramponi and Plank, 2020) and employ two popular baselines in this line: one is feature-centric, based on auto-encoding (specifically VAE), and the other is data-centric, based on pseudo-labeling (PL). Furthermore, a joint-training framework is explored to study their coupled effects in fighting against the possible performance deterioration over time.
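To make the data-centric side concrete, here is a minimal sketch of a PL step; the `model` object and its `predict_proba(texts)` method are hypothetical stand-ins for any trained classifier, and the confidence threshold is an illustrative choice.

```python
# A minimal sketch of the data-centric PL baseline, assuming a hypothetical
# `model` exposing predict_proba(texts) -> (n, num_classes) probabilities.
import numpy as np

def pseudo_label(model, trans_texts, threshold=0.9):
    """Turn confident predictions on unlabeled trans-data into extra examples."""
    probs = np.asarray(model.predict_proba(trans_texts))
    conf = probs.max(axis=1)                 # confidence of the top class
    preds = probs.argmax(axis=1)             # predicted (pseudo) labels
    return [(t, int(y)) for t, y, c in zip(trans_texts, preds, conf)
            if c >= threshold]

# pseudo_set = pseudo_label(model, trans_texts)
# then retrain on the history labeled data plus the pseudo-labeled trans-data:
# model.fit(history_labeled + pseudo_set)
```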
The experiments are based on three popular social media tasks, concerning the detection of COVID-19 stance (Glandt et al., 2021), fake news (Hansen et al., 2021), and hate speech (Mathew et al., 2021), with benchmark data from Twitter. We also gather a new corpus for hashtag prediction to broaden our scope to the noisy user-generated labels that are abundant on social media.¹ The dynamic setup is adopted, and models are tested on multiple datasets varying in their time gap to the training data in order to quantify model sensitivity to the time evolution.
In the main results, the performance of all models generally degrades over time, implying that the dynamic social media environment may universally and negatively affect NLU effectiveness. With some trans-data, both VAE and PL can helpfully tackle dynamicity, and their joint framework achieves the best results consistently over time. We then analyze the effects of the trans-data scale and creation time, and find that both PL and VAE might benefit from trans-data with larger scales and smaller time gaps to the training data. At last, case studies interpret how VAE and PL collaboratively handle the dynamic environments.
To conclude, we present the first empirical study, to the best of our knowledge, on the universal effects of the dynamic social media environment on NLU, and provide insights into when and how UDA methods help advance model robustness over time.

¹Hashtags are tagged by the author of a post to indicate its topic label and start with a hash "#", e.g., "#COVID19".
2 Related Work
This paper is in line with previous work on the out-of-distribution (OOD) issue, aiming to mitigate the distribution gap between training and test data (Xie et al., 2021; Shen et al., 2021; Liu et al., 2021a). Most prior OOD studies experiment on domain gaps, which tend to exhibit intermittent change, whereas the shift shaped by time usually happens step by step and hence forms a continuous process. Limited attention has been paid to examining NLU models' practical and general performance in handling the evolving social media environment, and our empirical study is an initial attempt to fill this gap.
In previous OOD work, various domain adaptation methods have been explored (Chu and Wang, 2018; Ramesh Kashyap et al., 2021), e.g., adversarial learning (Liu et al., 2020), pre-training (Hendrycks et al., 2020; Goyal and Durrett, 2020; Kong et al., 2020), and data augmentation (Chen et al., 2021). Some of them require labeled data from both the source and target domains (Arora et al., 2021) to learn cross-domain features. This is however infeasible in our time-shaped OOD scenarios because of the difficulty of continuously labeling data.
Our baseline solutions are inspired by existing methods in unsupervised domain adaptation (UDA), which employ labeled source data and unlabeled target data for model training. Popular UDA baselines mostly fall into feature-centric and data-centric categories (Ramponi and Plank, 2020). The former explores implicit clusters to bridge semantic features across domains (Gururangan et al., 2019), while the latter transfers knowledge gained from the source to the target via self-training (Axelrod et al., 2011). In our experiments, VAE and pseudo-labeling (PL) are the popular baselines selected to represent feature- and data-centric UDA, respectively, whereas their individual and collaborative performance in advancing NLU robustness in dynamicity has never been studied before and will be explored here.
This work is also inspired by previous studies applying a dynamic setup to certain social media tasks, such as content recommendation (Zeng et al., 2020; Zhang et al., 2021). Based on their efforts, we take a step further to broadly examine various classification tasks in order to draw a more general conclusion on how dynamicity affects NLU.
3 Time-Adaptive Learning Baselines
Here we discuss time-adaptive learning baselines and how we leverage them for social media classification in dynamicity. We will start with the classification overview, followed by the introduction to the VAE- and PL-based baselines (§3.1) and how they can collaboratively work via joint training (§3.2).

Figure 2: Our integrated framework of VAE and PL. VAE-learned features ($z_s$) are injected into the MLP together with the BERT-encoded $b_s$ (indicated as BERT-VAE-MLP). PL-predicted trans-data (pseudo data) and the labeled data from the past are both used to train BERT-VAE-MLP.
Classification Overview. The NLU in a dynamic setup will be examined on multiple tasks, all formulated as post-level single-label classification. The input is a social media post $s$ and the output a label $l$ specified by the task. Here we assume the availability of two data types: (1) posts with gold-standard labels created in the past (henceforth history labeled data), which can be employed to train a supervised classifier; (2) posts created after the classifier is trained and without labels (i.e., trans-data).
For classification, following common practice (Devlin et al., 2019), the representation for language understanding is built with a pre-trained BERT encoder, where we feed in the input $s$ and obtain a latent vector $b_s$ as the post embedding.
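A minimal sketch of this encoding step is shown below; the Hugging Face transformers toolkit and the use of the [CLS] hidden state as $b_s$ are our assumptions, as the paper only specifies a pre-trained BERT encoder.

```python
# A minimal sketch of computing the post embedding b_s, assuming the Hugging
# Face transformers toolkit and the [CLS] hidden state as the representation.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed_post(s: str) -> torch.Tensor:
    inputs = tokenizer(s, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state[:, 0]   # (1, hidden): the [CLS] vector b_s
```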
At the output, the learned classification features are mapped to a specific label $\hat{y}_s$ with the formula:

$$\hat{y}_s = f_{out}(W_{out} \cdot r_s + b_{out}) \tag{1}$$

where $f_{out}(\cdot)$ is the activation function for the classification output (e.g., softmax), and $W_{out}$ and $b_{out}$ are learnable parameters for training. $r_s$ couples the BERT-encoded latent semantics ($b_s$) and the implicit cross-time features gained by VAE (as a feature-centric UDA method) via a multi-layer perceptron (MLP):
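The excerpt is cut off before the MLP coupling formula itself, so the following is only a hedged sketch consistent with Figure 2 and Eq. (1): $b_s$ and the VAE latent $z_s$ are concatenated and passed through an MLP to form $r_s$, after which a linear layer plus softmax yields the label distribution. All dimensions and layer choices are illustrative assumptions.

```python
# A hedged sketch of the BERT-VAE-MLP classification head; dimensions and
# layer choices are illustrative assumptions, not the paper's specification.
import torch
import torch.nn as nn

class BertVaeMlpHead(nn.Module):
    def __init__(self, bert_dim=768, vae_dim=50, hidden=256, num_labels=3):
        super().__init__()
        self.mlp = nn.Sequential(                # couples b_s and z_s into r_s
            nn.Linear(bert_dim + vae_dim, hidden),
            nn.ReLU(),
        )
        self.out = nn.Linear(hidden, num_labels) # W_out, b_out of Eq. (1)

    def forward(self, b_s, z_s):
        r_s = self.mlp(torch.cat([b_s, z_s], dim=-1))
        return torch.softmax(self.out(r_s), dim=-1)  # f_out = softmax
```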