Language Independent Stance Detection: Social Interaction-based Embeddings and Large Language Models
Detección de Stance Independiente del Idioma:
Representaciones Vectoriales basadas en Interacciones
Sociales y Grandes Modelos de Lenguaje
Joseba Fernandez de Landa, Rodrigo Agerri
HiTZ Center - Ixa, University of the Basque Country UPV/EHU
joseba.fernandezdelanda@ehu.eus, rodrigo.agerri@ehu.eus
Abstract: The large majority of the research performed on stance detection has
been focused on developing more or less sophisticated text classification systems,
even when many benchmarks are based on social network data such as Twitter.
This paper aims to take on the stance detection task by placing the emphasis not
so much on the text itself but on the interaction data available on social networks.
More specifically, we propose a new method to leverage social information such as
friends and retweets by generating Relational Embeddings, namely, dense vector
representations of interaction pairs. Our experiments on seven publicly available
datasets and four different languages (Basque, Catalan, Italian, and Spanish) show
that combining our relational embeddings with discriminative textual methods helps
to substantially improve performance, obtaining state-of-the-art results for six out
of seven evaluation settings, outperforming strong baselines based on Large Lan-
guage Models, or other popular interaction-based approaches such as DeepWalk or
node2vec.
Keywords: Stance Detection, Multilinguality, Social Networks, Interactions, Large
Language Models, Natural Language Processing.
Resumen: La gran mayoría de los trabajos sobre la detección de stance
(posicionamiento) se han centrado en clasificación de texto, incluso cuando los datos se
recolectan de redes sociales como Twitter. Este artículo aborda la tarea de detección
de stance haciendo énfasis, además de en los datos textuales de los mensajes, en los
datos de interacción disponibles en las redes sociales. Proponemos un nuevo método
para representar información social como amigos y retuits generando embeddings
relacionales, es decir, representaciones vectoriales densas basadas en pares de
interacción. Nuestros experimentos en siete conjuntos de datos públicamente disponibles
y para cuatro idiomas (catalán, euskera, español e italiano) demuestran que la
combinación de los embeddings relacionales con métodos textuales ayuda a mejorar el
rendimiento, obteniendo resultados del estado del arte en seis de los siete
escenarios de evaluación, superando otras aproximaciones basadas en grandes modelos de
lenguaje u otros enfoques basados en interacciones como DeepWalk o node2vec.
Palabras clave: Detección de Stance, Multilingualismo, Redes Sociales,
Interacciones, Modelos de Lenguaje, Procesamiento del Lenguaje Natural.
arXiv:2210.05715v2 [cs.CL] 27 Feb 2025
1 Introduction
Stance detection consists of identifying the
viewpoint or attitude expressed by a piece of
text with respect to a given target. With
the enormous popularity of social networks,
users spontaneously share their opinions on
social media, generating a valuable resource
to investigate stance. This means that research
on stance has a social impact, for example,
to help address misinformation on
vaccines, or to better understand public opinion
about topics such as climate change or
migration. Furthermore, stance detection is
considered an important intermediate task
for fact-checking (Augenstein, 2021) or fake
news detection (Pomerleau and Rao, 2017).
The SemEval 2016 task on stance detec-
tion in Twitter (Mohammad et al., 2016)
presented a dataset with tweets expressing
FAVOR, AGAINST and NEUTRAL stances
with respect to five different targets, a trend
followed by many other researchers (Derczynski
et al., 2017; Taulé et al., 2018; Zotova,
Agerri, and Rigau, 2021; Hardalov et al.,
2022). However, despite many of them using
Twitter-based source data, the large majority
address the task by considering only the
textual content of tweets (Augenstein et al.,
2016; Schiller, Daxenberger, and Gurevych,
2021; Hardalov et al., 2021; Li, Zhao, and
Caragea, 2021; Ghosh et al., 2019; Küçük
and Can, 2020; Sobhani, Inkpen, and Zhu,
2017; Glandt et al., 2021a).
This shortcoming has been addressed by
proposing new datasets (Cignarella et al.,
2020; Agerri et al., 2021) that include dif-
ferent languages and social interaction data,
such as retweets or friends. Although these
new datasets have facilitated the develop-
ment of new techniques for stance detection
considering also interaction data, most of
them employ manually engineered features
tailored to each specific data type (Espinosa
et al., 2020; Lai et al., 2021; Alkhalifa and
Zubiaga, 2020), making it difficult to gen-
eralize across languages and targets. Re-
cently, significant attention has been directed
towards the use of Large Language Models
(LLMs) as few-shot learners (Brown et al.,
2020). However, the success of in-context
learning techniques using LLMs has been
mostly limited to English benchmarks such
as SemEval 2016 (Taranukhin, Shwartz, and
Milios, 2024; Gatto, Sharif, and Preum, 2023;
Zhang et al., 2023a), probably because the
pre-training of the large majority of publicly
available LLMs has been focused mostly on
English.
This paper focuses on stance detection of
tweets by placing emphasis on the interac-
tion data commonly available in social media.
More specifically, we propose a new method
to leverage social information such as friends
and retweets by generating Relational Embeddings,
namely, dense vector representations
of interaction pairs. The development
of our new method allows us to make the
following contributions to language-independent
stance detection: (i) a new method to
represent and exploit interaction data, such
as friends and/or retweets, by generating relational
embeddings based on one-to-one relations;
(ii) comprehensive experiments on
seven publicly available datasets and four
languages other than English, showing
that our relational embeddings behave robustly
across different targets and languages
without any specific manual engineering; (iii)
evidence that combining our method with
text-based discriminative classifiers helps to
systematically improve their results, outperforming
also ensembles of pre-trained language models
(Giorgioni et al., 2020) and strong in-context
learning baselines using Large Language
Models (Taranukhin, Shwartz, and
Milios, 2024); (iv) an empirical demonstration
that our new Relational Embeddings
clearly outperform popular graph-based approaches
to encode interaction data, such as
DeepWalk or node2vec; (v) exhaustive ablation
and error analyses showing that the method
used to obtain the retweet data and the size
of the user community are crucial for
state-of-the-art performance using our technique; (vi)
newly generated datasets with interaction
data and code, which are publicly available (see footnote 1).
Finally, while this paper is focused on
stance detection, we believe that our Rela-
tional Embeddings can be successfully ap-
plied to a large number of Computational So-
cial Science and NLP tasks based on social
media, especially those related to political
ideology, misinformation, and hate speech,
but also for health-related applications such
as the detection of early signs of epidemic
outbreaks (Martín-Corral et al., 2022).
2 Related work
Recent studies have demonstrated that using
LLMs on Stance Detection tasks can provide
significant performance increases (Zhang
et al., 2023b; Zhang et al., 2023a). Furthermore,
combining LLMs with Chain-of-Thought (CoT)
prompting (Wei et al., 2022), an in-context
learning technique in which the model
generates intermediate reasoning steps
to arrive at a final prediction,
has also helped to substantially improve results
(Kojima et al., 2022; Wang et al., 2023;
Gatto, Sharif, and Preum, 2023).
1 https://github.com/joseba-fdl/relational_embeddings/
Despite their high capabilities, the appli-
cation of LLMs still faces several challenges,
such as dealing with cases of implicit stance
or avoiding hallucinations, even when em-
ploying advanced prompting strategies such
as CoT reasoning (Gatto, Sharif, and Preum,
2023). To address these limitations, Stance
Reasoner (Taranukhin, Shwartz, and Milios,
2024) improves the CoT method by includ-
ing examples and reasoning as background
knowledge to achieve generalizable predic-
tions across different targets. However, these
approaches are only focused on English.
Additionally, most stance detection re-
search and datasets released do not in-
clude interaction data, despite being col-
lected from social media sources such as
Twitter. Küçük and Can (2020) list stance-annotated
datasets for 11 languages, whereas
recent work on cross-domain and cross-lingual
stance provides experimentation for
16 datasets and 15 languages (Hardalov et
al., 2021; Hardalov et al., 2022). The focus,
however, remains on the textual content
of the tweets. This trend has recently
changed with the release of, to the best of our
knowledge, two datasets which, in addition to
the stance labeled tweets, include interaction
data such as retweets and friends: SardiS-
tance (Cignarella et al., 2020) and VaxxS-
tance (Agerri et al., 2021).
The winner (Espinosa et al., 2020) of
the SardiStance shared task (Cignarella et
al., 2020) used a weighted voting ensem-
ble that combined two inputs: (a) psycho-
logical, sentiment and friends distances as
features used to learn an XGBoost (Fried-
man, 2001) model, with (b) text classifiers
based on the Transformer architecture (De-
vlin et al., 2019). Other systems combined
textual data (emoticons, special characters,
and word embeddings) with 2 dimensions ex-
tracted from the interactions distance ma-
trix using Multidimensional Scaling (MDS)
(Ferraccioli et al., 2020), or friendship-based
graphs created with DeepWalk (Perozzi, Al-
Rfou, and Skiena, 2014) and various types of
textual embeddings (Alkhalifa and Zubiaga,
2020).
The VaxxStance shared task (Agerri et al.,
2021) provided textual and interaction data
(friends and retweets) to study stance detec-
tion on vaccines in Basque and Spanish. The
one system that systematically outperformed
the baselines (Lai et al., 2021) manually en-
gineered a large number of features based on
stylistic, tweet, and user data, lexicons, de-
pendency parsing, and network information,
which were specifically developed for these
datasets and languages.
The most recent approaches tackling un-
supervised stance detection using social me-
dia interactions as features use the force-
directed algorithm (Fruchterman and Rein-
gold, 1991) or UMAP (McInnes et al.,
2018). These algorithms transform inter-
action frequency vectors into features, re-
ducing huge interaction matrices into low-
dimensional features. Darwish et al. (2020)
use both the force-directed algorithm and
UMAP for unsupervised stance detection of
Twitter users. UMAP is also used to get
interaction-based features for automatically
tagging Twitter users’ stance (Stefanov et al.,
2020) and to explore political polarization in
Turkey (Rashed et al., 2021).
Other works are based on node2vec
(Grover and Leskovec, 2016) for user pro-
filing and extracting user features for abuse
detection (Mishra et al., 2018) and also for
sentiment, stance and hate speech detection
(Del Tredici et al., 2019). Commonly used al-
gorithms for building interaction-based mod-
els like DeepWalk (Perozzi, Al-Rfou, and
Skiena, 2014) and node2vec are based on gen-
erating Random Walks. However, those ran-
domly generated walks create artificial inter-
actions that may not occur in the gathered
interaction pairs. Furthermore, selecting the
structure of the random walks and deciding
the number of context users to be predicted
needs to be manually modeled and adapted.
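The point about artificial interactions can be illustrated with a small sketch. The user names, the toy graph, the walk length, and the window size below are all hypothetical, and the walk is a plain uniform walk along directed pairs rather than the exact DeepWalk or node2vec procedure. It only shows that sliding a context window over a walk yields training pairs that were never observed as interactions.

```python
import random

# Hypothetical directed interaction pairs (source acts on target).
pairs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "a")]

def random_walk(pairs, start, length, rng):
    """Uniform random walk that follows the direction of the observed pairs."""
    successors = {}
    for source, target in pairs:
        successors.setdefault(source, []).append(target)
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(successors[walk[-1]]))
    return walk

def window_pairs(walk, window):
    """Skip-gram style (centre, context) pairs within `window` steps."""
    out = set()
    for i, centre in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if i != j:
                out.add((centre, walk[j]))
    return out

rng = random.Random(0)
walk = random_walk(pairs, "a", 6, rng)
training_pairs = window_pairs(walk, window=2)

# Treat a pair as observed if it appears in either direction.
observed = set(pairs) | {(t, s) for s, t in pairs}
artificial = {p for p in training_pairs if p not in observed and p[0] != p[1]}
print(sorted(artificial))
```

In this toy graph every walk from "a" reaches "d" in two steps, so the pair ("a", "d") enters the training set although no such interaction was gathered.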
In contrast to previous work based on
in-context learning with LLMs, supervised
text classification or interaction-based methods
such as DeepWalk or node2vec, our Relational
Embeddings method provides dense
interaction-based representations of users,
focusing on real interaction pairs. The training
process is designed to predict a target user
receiving a retweet or a follow from a source
user, so that each instance is an item-to-item
prediction instead of a context-to-item (CBOW) or
item-to-context (Skip-gram) prediction.
Additionally, we use all the interaction
pairs, without generating artificial random
interactions to train the model or manually
selecting the most salient users.
3 Method
We propose a new method to generate
vector-based representations of interactions
in social networks, such as friends and
retweets. These new representations, which
we refer to as Relational Embeddings (RE),
are then leveraged to propose two methods to
perform stance detection: (i) building clas-
sifiers using just our relational embeddings
(§3.2) and, (ii) combining RE with various
classifiers based on textual data (§3.3).
3.1 Relational Embeddings
In this paper, the types of interactions used
are retweets and friends, which are seen as
relations between two users, one generating
the action (source) and the other receiving
it (target). Thus, the actions of retweeting
or following other users are considered in-
teraction pairs. Generally, these interactions
should help to reveal users’ preferences by
capturing meaningful information from their
performative actions.
The first step in our method consists of
gathering the interactions from the users in-
cluded in the labeled data, namely, the one-
to-one retweet and follow actions between the
users/authors of the tweets. It should be
noted that a set of retweet and follow inter-
actions can consist of independent one-to-one
actions without direct relation between them.
This is why in our model we consider each
interaction pair as a single instance without
any preprocessing or modification.
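As a minimal sketch of this first step, the snippet below turns gathered retweet and follow actions into independent (source, target) instances and assigns each user an integer index. The user identifiers, the `Interaction` type, and the helper names are hypothetical and chosen only for illustration. Repeated pairs are kept as they are, in line with the statement that no preprocessing is applied.

```python
from typing import NamedTuple

class Interaction(NamedTuple):
    source: str  # user generating the action
    target: str  # user receiving the action
    kind: str    # "retweet" or "friend"

# Hypothetical gathered data: (source, target) per action.
retweets = [("u1", "u2"), ("u1", "u2"), ("u3", "u2")]
friends = [("u1", "u3"), ("u4", "u1")]

def build_instances(retweets, friends):
    """Each one-to-one action becomes one instance, unmodified."""
    instances = [Interaction(s, t, "retweet") for s, t in retweets]
    instances += [Interaction(s, t, "friend") for s, t in friends]
    return instances

def index_users(instances):
    """Map every user seen in any pair to an integer index."""
    users = sorted({u for inst in instances for u in (inst.source, inst.target)})
    return {user: i for i, user in enumerate(users)}

instances = build_instances(retweets, friends)
user_index = index_users(instances)
print(len(instances), user_index)
```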
Using this interaction data, our model
is then trained in an unsupervised manner
to predict, in each instance, a target
user from a given source user. Note that
the instances used as input are real interaction
pairs: they do not correspond
to sparse interaction frequency matrices or
to neighbors arising from interaction networks,
and no artificial pairs are generated,
as random walks do.
In order to obtain our relational representations,
we use a single hidden-layer neural
network (Figure 1). The network is used
to train a dense interaction representation
model using the friends and/or retweet based
data. Each user is encoded as a one-hot
vector of size U, where U is the number of
users among the interaction pairs (I) in a specific
dataset. Given the one-hot vector of a source
user, the aim of the single hidden-layer
feedforward neural network is to predict the target
user. The dimension of the hidden layer (D)
determines the size of the relational vectors
representing the target user, which corresponds
to the number of learned features. During
training, the weights W and W′ are modified
via backpropagation to minimize the loss
function. According to Equation 1, the
summation goes over all the interaction pairs
(I) in the training corpus, computing the log
probability of correctly predicting the target
user (u_target) from the source user (u_source)
for each interaction (i). Training
is performed with sub-sampling of the most frequent
instances and with negative sampling (Mikolov
et al., 2013). Finally, the W matrix is used to
retrieve the interaction vectors representing
each user, generating the relational embedding,
from which the relation vector for each
user is obtained. In this model, users with
similar interactions should have similar
representations, turning many interaction pairs
into dense relational representations of D
dimensions.
Figure 1: One hidden layer artificial neural network.
$$\frac{1}{I} \sum_{i=1}^{I} \log p(u_{target} \mid u_{source}) \qquad (1)$$
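A rough pure-Python sketch of how such a source-to-target objective could be optimized with negative sampling is given below. It is not the authors' implementation: the hyperparameters, the uniform sampling of negatives, the variable names, and the toy pairs are assumptions, and the sub-sampling of frequent instances is left out for brevity. Users are assumed to be already mapped to integer indices.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_relational_embeddings(pairs, n_users, dim=8, epochs=50,
                                lr=0.05, n_neg=2, seed=0):
    """Item-to-item training: predict the target user from the source user."""
    rng = random.Random(seed)
    # W: input-to-hidden weights, one row (relational vector) per user.
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    # W_out: hidden-to-output weights (assumed role of the second matrix).
    W_out = [[0.0] * dim for _ in range(n_users)]
    order = list(range(len(pairs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            src, tgt = pairs[k]
            # One positive target plus uniformly drawn negatives (assumption).
            candidates = [tgt] + [rng.randrange(n_users) for _ in range(n_neg)]
            h = W[src]
            grad_h = [0.0] * dim
            for cand in candidates:
                label = 1.0 if cand == tgt else 0.0
                score = sigmoid(sum(a * b for a, b in zip(W_out[cand], h)))
                err = score - label  # gradient of the logistic loss
                for d in range(dim):
                    grad_h[d] += err * W_out[cand][d]
                    W_out[cand][d] -= lr * err * h[d]
            for d in range(dim):
                h[d] -= lr * grad_h[d]
    return W

# Hypothetical (source, target) index pairs for four users.
pairs = [(0, 1), (2, 1), (0, 3), (2, 3)]
W = train_relational_embeddings(pairs, n_users=4)
print(len(W), len(W[0]))
```

Each row of the returned matrix plays the role of one user's relational vector of D dimensions (here `dim`).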
3.2 Interaction-based Classifier
with Relational Embeddings
Our first system consists of a linear classi-
fier taking as input only the relational em-
beddings described in the previous section.
Building such a system will allow us to un-
derstand the performance of the generated
Relational Embedding models on their own.
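To make the setup concrete, here is a minimal sketch in which each tweet is represented by its author's relational vector and a linear model is trained on those vectors. The multiclass perceptron is a stand-in chosen for brevity, since the type of linear classifier is not specified in this section, and the user vectors, tweets, and labels are made-up toy values.

```python
# Hypothetical 2-dimensional relational vectors per user.
user_vectors = {
    "u1": [1.0, 0.0],
    "u2": [0.9, 0.1],
    "u3": [0.0, 1.0],
    "u4": [0.1, 0.9],
}

# Hypothetical labeled tweets: (tweet id, author, stance label).
tweets = [
    ("t1", "u1", "FAVOR"),
    ("t2", "u2", "FAVOR"),
    ("t3", "u3", "AGAINST"),
    ("t4", "u4", "AGAINST"),
]

def featurize(tweets, user_vectors):
    """Represent each tweet by its author's vector plus a bias term."""
    return [user_vectors[author] + [1.0] for _, author, _ in tweets]

def predict_one(weights, x):
    """Return the label whose weight vector gives the highest linear score."""
    return max(weights, key=lambda label: sum(w * v for w, v in zip(weights[label], x)))

def train_perceptron(X, y, epochs=10):
    """Multiclass perceptron: one weight vector per stance label."""
    weights = {label: [0.0] * len(X[0]) for label in sorted(set(y))}
    for _ in range(epochs):
        for x, gold in zip(X, y):
            pred = predict_one(weights, x)
            if pred != gold:
                for d, value in enumerate(x):
                    weights[gold][d] += value
                    weights[pred][d] -= value
    return weights

X = featurize(tweets, user_vectors)
y = [label for _, _, label in tweets]
weights = train_perceptron(X, y)
predictions = [predict_one(weights, x) for x in X]
print(predictions)
```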
Each of the tweets from a dataset will be
represented by its author’s (user) relational
vector, which represents the interactions of
its author. By doing so, we effectively project
the relations of the author to the tweet level,
generating a link between the relational data
and the stance labels. In this step, some users