
Unsupervised Domain Adaptation for COVID-19
Information Service with Contrastive Adversarial
Domain Mixup
Huimin Zeng∗, Zhenrui Yue∗, Ziyi Kou∗, Lanyu Shang∗, Yang Zhang†, Dong Wang∗
∗School of Information Sciences
University of Illinois Urbana-Champaign, IL, USA
{huiminz3, zhenrui3, ziyikou2, lshang3, dwang24}@illinois.edu
†Department of Computer Science and Engineering
University of Notre Dame, IN, USA
yzhang42@nd.edu
Abstract—In the real-world application of COVID-19 misinformation detection, a fundamental challenge is the lack of labeled COVID-19 data to enable supervised end-to-end training of the models, especially at the early stage of the pandemic.
To address this challenge, we propose an unsupervised domain
adaptation framework using contrastive learning and adversarial
domain mixup to transfer the knowledge from an existing source
data domain to the target COVID-19 data domain. In particular,
to bridge the gap between the source domain and the target
domain, our method reduces a radial basis function (RBF)-based discrepancy between these two domains. Moreover, we leverage domain adversarial examples to establish an intermediate domain mixup, in which the latent representations of the input text from both domains are mixed during the training process. Extensive experiments on multiple real-world datasets suggest that our method can effectively adapt
misinformation detection systems to the unseen COVID-19 target
domain with significant improvements compared to the state-of-
the-art baselines.
I. INTRODUCTION
In this work, we focus on COVID-19 misinformation detection, given the global impact of the ongoing pandemic and the “Infodemic”¹ it causes on social media [1]. Regarding COVID-19 misinformation detection, language models trained on non-COVID datasets without any fine-tuning on COVID-19-specific data might suffer from a severe generalization issue and perform poorly on COVID-19 data, due to the domain shift between the non-COVID training data distribution and the COVID-19 test data distribution. Recently, the ongoing COVID-19 pandemic has inspired a variety of studies [2] that develop NLP models to provide reliable COVID-19 information services across various social media platforms (e.g., Twitter, Facebook). However, supervised learning approaches often require a large-scale
training dataset, while collecting annotations for COVID-19 training data is extremely expensive and time-consuming due to the cost and complexity of recruiting qualified annotators and keeping the annotations up to date to accommodate the dynamics of COVID-19 knowledge (e.g., different variants of the virus) [3]. Moreover, our unsupervised domain adaptation setting is motivated by the more general setting of any early-stage pandemic (not limited to COVID-19), where no ground-truth information about the novel disease is available at all, yet the need for correct information is urgent. Therefore, it is critical to develop unsupervised domain adaptation frameworks to train COVID-19 models so that knowledge from an existing data domain can be adapted and transferred to the unseen COVID-19 data domain without requiring any ground-truth training labels.
¹https://www.who.int/health-topics/infodemic#tab=tab 1
In this paper, we explore an unsupervised domain adaptation
problem for COVID-19 misinformation detection on social
media. In particular, we propose an unsupervised domain
adaptation framework, Contrastive Adversarial Domain Mixup (CADM), which uses adversarial domain mixup and contrastive
learning to bridge the gap between the source training data
domain and the target COVID data domain. The overview
of our framework is shown in Figure 1. To demonstrate
the effectiveness of the proposed CADM, we evaluate it
on several real-world COVID-19 datasets. Our experimental
results suggest that CADM effectively adapts pre-trained language models to the target COVID-19 domain and consistently
outperforms state-of-the-art baselines.
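To make the two components described above concrete, the sketch below shows one common instantiation of an RBF-kernel domain discrepancy (an MMD-style statistic over latent representations) and a simple convex-combination form of latent-space mixup. This is an illustrative sketch under our own assumptions, not the paper's implementation; the function names `rbf_discrepancy` and `mix_latents` and all parameters are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of latent vectors."""
    # Pairwise squared Euclidean distances between rows of x and rows of y.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def rbf_discrepancy(source, target, gamma=1.0):
    """MMD-style discrepancy under an RBF kernel: near zero when the
    source and target latent distributions match, larger as they diverge."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

def mix_latents(h_src, h_tgt, lam):
    """Intermediate-domain mixup: convex combination of paired source and
    target latent representations with mixing coefficient lam in [0, 1]."""
    return lam * h_src + (1.0 - lam) * h_tgt
```

In this illustrative form, minimizing `rbf_discrepancy` over encoder outputs pulls the source and target latent distributions together, while `mix_latents` populates the region between the two domains with intermediate representations for training.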
II. RELATED WORK
Misinformation Detection. Great efforts have been made to detect misinformation on online platforms (e.g., social
media). In [2], knowledge graphs are integrated into the
misinformation detection framework to enhance the model’s
performance. The concurrent work [4] also proposed a domain adaptation framework to address COVID-19 misinformation detection using label correction. However, such misinformation
detection systems are built under a supervised or semi-
supervised learning setting, but in practice labeled COVID-19
misinformation data is not always accessible. Therefore, this
paper focuses on unsupervised domain adaptation of language
models for COVID-19 misinformation detection, where the
arXiv:2210.03250v1 [cs.CL] 6 Oct 2022