Unsupervised Domain Adaptation for COVID-19 Information Service with Contrastive Adversarial Domain Mixup

Huimin Zeng, Zhenrui Yue, Ziyi Kou, Lanyu Shang, Yang Zhang, Dong Wang
School of Information Sciences
University of Illinois Urbana-Champaign, IL, USA
{huiminz3, zhenrui3, ziyikou2, lshang3, dwang24}@illinois.edu
Department of Computer Science and Engineering
University of Notre Dame, IN, USA
yzhang42@nd.edu
Abstract—In real-world COVID-19 misinformation detection, a
fundamental challenge is the lack of labeled COVID data to enable
supervised end-to-end training of models, especially at the early
stage of the pandemic. To address this challenge, we propose an
unsupervised domain adaptation framework that uses contrastive
learning and adversarial domain mixup to transfer knowledge from an
existing source data domain to the target COVID-19 data domain. In
particular, to bridge the gap between the source domain and the target
domain, our method reduces a radial basis function (RBF) based
discrepancy between the two domains. Moreover, we leverage domain
adversarial examples to establish an intermediate domain mixup, in
which the latent representations of the input text from both domains
are mixed during training. Extensive experiments on multiple
real-world datasets suggest that our method can effectively adapt
misinformation detection systems to the unseen COVID-19 target domain,
with significant improvements over state-of-the-art baselines.
I. INTRODUCTION
In this work, we focus on COVID-19 misinformation detection, given
the global impact of the ongoing pandemic and the “Infodemic”¹ it
causes on social media [1]. If language models are trained on
non-COVID datasets without any fine-tuning on COVID-19 specific data,
they may generalize poorly and perform badly on COVID-19 data because
of the domain shift between the non-COVID training data distribution
and the COVID-19 test data distribution. Recently, the ongoing
COVID-19 pandemic has inspired a variety of studies [2] that develop
NLP models to provide reliable COVID-19 information services across
social media platforms (e.g., Twitter, Facebook). However, supervised
learning approaches often require a large-scale training dataset,
while collecting annotations for COVID training data is extremely
expensive and time-consuming because of the cost and complexity of
recruiting qualified annotators and keeping the annotations up to date
with the evolving knowledge of COVID-19 (e.g., different variants of
the virus) [3]. Moreover, our unsupervised domain adaptation setting
is motivated by the more general case of any early-stage pandemic (not
limited to COVID-19), where there is no ground-truth information about
the novel disease at all but the need for correct information is
urgent. Therefore, it is critical to develop unsupervised domain
adaptation frameworks for training COVID models, so that knowledge
from an existing data domain can be adapted and transferred to the
unseen COVID data domain without requiring any ground-truth training
labels.
¹ https://www.who.int/health-topics/infodemic#tab=tab_1
In this paper, we explore an unsupervised domain adaptation
problem for COVID-19 misinformation detection on social
media. In particular, we propose an unsupervised domain
adaptation framework, Contrastive Adversarial Domain Mixup
(CADM), which uses adversarial domain mixup and contrastive
learning to bridge the gap between the source training data
domain and the target COVID data domain. The overview
of our framework is shown in Figure 1. To demonstrate
the effectiveness of the proposed CADM, we evaluate it
on several real-world COVID-19 datasets. Our experimental
results suggest that our CADM effectively adapts pre-trained
language models to the target COVID domain, and consistently
outperforms state-of-the-art baselines.
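To make the two ingredients named above more concrete, the following is a minimal sketch in plain Python, not the CADM implementation. It assumes that the RBF-based discrepancy can be illustrated by a biased estimate of the squared maximum mean discrepancy (MMD) with an RBF kernel, and that mixing latent representations can be illustrated by a simple convex combination. The function names, the `gamma` and `lam` parameters, and the toy vectors are all hypothetical; the adversarial and contrastive components are omitted.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def rbf_mmd2(source: Sequence[Vector], target: Sequence[Vector],
             gamma: float = 1.0) -> float:
    """Biased estimate of the squared MMD between two sets of latent vectors."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def mixup(z_source: Vector, z_target: Vector, lam: float) -> List[float]:
    """Convex combination lam * z_source + (1 - lam) * z_target."""
    return [lam * s + (1.0 - lam) * t for s, t in zip(z_source, z_target)]


# Toy latent representations (made-up numbers, not taken from any dataset).
source_latents = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1]]
target_latents = [[2.0, 2.0], [2.1, 1.9], [1.8, 2.2]]

# Discrepancy between the raw source and target representations.
gap_before = rbf_mmd2(source_latents, target_latents)

# Mix each source vector halfway toward a target vector (assumed lam = 0.5)
# to form an illustrative intermediate domain, then measure the gap again.
mixed_latents = [mixup(s, t, lam=0.5)
                 for s, t in zip(source_latents, target_latents)]
gap_after = rbf_mmd2(mixed_latents, target_latents)
```

In this toy setting the mixed vectors sit between the two clusters, so `gap_after` comes out smaller than `gap_before`. In an actual training loop a quantity of this kind would be computed on encoder outputs and minimized together with the task loss; how CADM combines these terms is not specified in this excerpt.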
II. RELATED WORK
Misinformation Detection. Great efforts have been made to detect
misinformation on online platforms (e.g., social media). In [2],
knowledge graphs are integrated into the misinformation detection
framework to enhance the model’s performance. The concurrent work [4]
also proposes a domain adaptation framework that addresses COVID-19
misinformation detection using label correction. However, such
misinformation detection systems are built under a supervised or
semi-supervised learning setting, whereas in practice labeled COVID-19
misinformation data is not always accessible. Therefore, this
paper focuses on unsupervised domain adaptation of language
models for COVID-19 misinformation detection, where the
arXiv:2210.03250v1 [cs.CL] 6 Oct 2022