Mutual Theory of Mind for Human-AI Communication

Qiaosi Wang
qswang@gatech.edu
Georgia Institute of Technology
Atlanta, GA, USA
Ashok K. Goel
ashok.goel@cc.gatech.edu
Georgia Institute of Technology
Atlanta, GA, USA
ABSTRACT
New developments are enabling AI systems to perceive, recognize,
and respond with social cues based on inferences made from humans’
explicit or implicit behavioral and verbal cues. These AI systems,
equipped with an equivalent of humans’ Theory of Mind (ToM)
capability, are currently serving as matchmakers on dating
platforms, assisting student learning as teaching assistants, and
enhancing productivity as work partners. They mark a new era in
human-AI interaction (HAI) that diverges from traditional
human-computer interaction (HCI), where computers are commonly seen
as tools instead of social actors. Designing and understanding
human perceptions and experiences in this emerging HAI era has
become an urgent and critical issue if AI systems are to fulfill human
needs and mitigate risks across social contexts. In this paper, we
posit the Mutual Theory of Mind (MToM) framework, inspired by
our capability of ToM in human-human communications, to guide
this new generation of HAI research by highlighting the iterative
and mutual shaping nature of human-AI communication. We discuss
the motivation of the MToM framework and its three key
components that iteratively shape human-AI communication in
three stages. We then describe two empirical studies inspired by the
MToM framework to demonstrate the power of MToM in guiding
the design and understanding of human-AI communication. Finally,
we discuss future research opportunities in human-AI interaction
through the lens of MToM.
CCS CONCEPTS
• Human-centered computing → HCI theory, concepts and
models; Empirical studies in HCI; Natural language interfaces;
• Computing methodologies → Artificial intelligence.
KEYWORDS
theory of mind, human-AI interaction, social intelligence
ACM Reference Format:
Qiaosi Wang and Ashok K. Goel. 2024. Mutual Theory of Mind for Human-AI
Communication. In Proceedings of Workshop on Theory of Mind in Human-AI
Interaction at CHI 2024 (ToMinHAI at CHI 2024). ACM, New York, NY, USA,
7 pages. https://doi.org/XXXXXXX.XXXXXXX
ToMinHAI at CHI 2024, May 12th, Honolulu, Hawaii
1 INTRODUCTION
With new technology advancements, AI systems are increasingly
serving different social roles across contexts. For example, AI systems
are acting as matchmakers to provide matches for our business
or life partners, as personal assistants to manage our daily
routines, as learning assistants to facilitate student learning, and
more. These AI systems are often able to perceive, recognize, and
react to human characteristics, needs, and perceptions embedded
in our behavioral and verbal cues. This presents a new interaction
paradigm in Human-AI Interaction (HAI) that diverges from
traditional Human-Computer Interaction (HCI): people expect
AI systems to possess social intelligence for diverse social
functions, yet are often uncertain about the AI’s capabilities and
social roles during interactions. Oftentimes during HAI, people
build different perceptions or mental models of the AI based on the
AI outputs [13, 42, 44, 46]; at the same time, AI systems are also
constantly building different interpretations of human characteristics,
needs, and goals based on humans’ input [9, 12, 34, 39, 41]. In
this emerging HAI paradigm, interpretations of each other, from
both the human side and the AI side, are playing an increasingly
crucial part in shaping human-AI interactions, but how should the
HAI community study them in a systematic way to enhance human-AI
interactions?
In this paper, we propose viewing this emerging HAI paradigm
through the lens of Mutual Theory of Mind (MToM). We draw
inspiration from the human-human communication process, largely
enabled by our basic cognitive and social ability of Theory of Mind
(ToM). ToM is our ability to make conjectures about our own and
others’ mental states (e.g., emotions, intentions) [2, 16], and it is
the key to enabling many human social behaviors such as
communication repair and making shared plans or goals [3]. Having
the capability of ToM enables us to construe a mental model of
others’ minds, which includes their thoughts, preferences, goals,
needs, plans, etc. [2, 33]. In typical human-human communications,
having a Mutual Theory of Mind (MToM), meaning all parties
involved in the interaction possess ToM, enables us
to continuously refine our interpretations of each other’s minds
through behavioral and verbal feedback, helping us to maintain
constructive and coherent communications.
Drawing on the parallel between the MToM in human-human
communications and the emerging HAI paradigm where both humans
and AIs can construct representations of each other during
communications, we propose the framework of Mutual Theory of
Mind to guide the next generation of research in human-AI
communications. We argue that MToM as a framework provides a process
and content account of human-AI communication that emphasizes
the iterative mutual shaping of each party’s interpretations and
feedback through different stages of the communication process.
We will first review relevant literature on ToM and human-AI
communication, then describe the MToM framework in detail. We
then summarize two of our empirical studies inspired by the MToM
framework, focusing specifically on second-level ToM (the
idea of “I can think about what you think about my mind”) in
human-AI communications. Finally, we discuss potential research
opportunities in human-AI interaction through the lens of MToM.
arXiv:2210.03842v2 [cs.HC] 25 May 2024
2 RELATED WORK
2.1 Theoretical Perspectives of Communication
Communication is commonly defined as “the process of transmitting
information and common understanding from one person to
another” [32]. Scholars across disciplines have offered different
perspectives to study and enhance communication.
In communication studies, researchers have focused on the different
components at play during the communication process. The
classic Shannon-Weaver model of communication [38] outlines several
key components of the communication process [32]: the sender
initiates the communication process by sending messages encoded
using symbols, gestures, words, or sentences through a chosen
channel to the receiver. While the message is being transmitted
through the channel, noise could distort the
message. After receiving the message from the sender, the receiver
will decode the message into meaningful information, depending on
how the receiver interprets the message. Finally, the receiver will
provide feedback as a response to the sender. These key components
determine the quality and effectiveness of the communication.
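The components named above (sender, encoding, channel, noise, receiver, decoding, feedback) can be illustrated with a minimal sketch. This is a toy illustration under stated assumptions, not part of the Shannon-Weaver model itself or of the cited work: the function names, the `#` noise marker, and the fixed decoding rule are all hypothetical, and a real receiver's decoding would depend on how the receiver interprets the message.

```python
import random

NOISE_SYMBOL = ord("#")  # hypothetical marker for a symbol lost to noise


def encode(message: str) -> list[int]:
    """Sender: encode the message as a sequence of symbols (code points)."""
    return [ord(ch) for ch in message]


def transmit(signal: list[int], noise_rate: float, rng: random.Random) -> list[int]:
    """Channel: each symbol is corrupted independently with probability noise_rate."""
    return [NOISE_SYMBOL if rng.random() < noise_rate else s for s in signal]


def decode(signal: list[int]) -> str:
    """Receiver: decode the received symbols back into text."""
    return "".join(chr(s) for s in signal)


def feedback(received: str) -> str:
    """Receiver: respond to the sender, asking for repair if noise is visible."""
    return "please repeat" if chr(NOISE_SYMBOL) in received else "understood"


def communicate(message: str, noise_rate: float, seed: int = 0) -> tuple[str, str]:
    """Run one sender -> channel -> receiver -> feedback pass."""
    rng = random.Random(seed)
    received = decode(transmit(encode(message), noise_rate, rng))
    return received, feedback(received)
```

With `noise_rate=0.0` the message arrives intact and the feedback is "understood"; with `noise_rate=1.0` every symbol is replaced by the noise marker and the receiver asks for a repeat.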
The Cognitive Science perspective of communication highlights
the critical role of ToM [33]. ToM enables us to make suppositions
about others’ minds through verbal and behavioral cues, acting as
the foundation of human-human communication [2, 3]. From this
perspective, both interlocutors during communication can form
interpretations of what’s on the other interlocutor’s mind based
on implicit and explicit communication cues. For example, we
can often infer our interlocutors’ goals, plans, or preferences based
on what they say, their facial expressions, or their bodily
expressions [2, 33]. Based on the interpretation we form of the
other’s mind, we act accordingly to correct, explain, or persuade.
This cycle of building an interpretation of others’ minds and
then acting upon that interpretation continues iteratively throughout
the communication process. Inferring each other’s minds
through behavioral cues, according to this perspective, is therefore
crucial to smooth and successful communication.
The communication process can also be interpreted from the social
science perspective through impression management [14]. In his
seminal work, Goffman describes social interaction as an information
game between individuals and their audience to maintain the
“veneer of consensus” to keep the conversation going and to avoid
awkwardness. During social interactions, the audience usually tries
to gather as much information as it can about the individuals
it interacts with in order to elicit a desirable response from
the individual, whereas individuals put up performances to manage
impressions through two kinds of expressions: expressions that are
intentionally performed to leave a certain impression (expression
given) and expressions that are unintentionally given off that could
influence the audience’s impressions of them (expression given
off) [14]. Throughout interactions, each party conveys their
definition of the situation through communications: individuals by
expressions and audience by reactions to the individuals.
These three perspectives on communication emphasize different
aspects of the communication process: the communication studies
perspective focuses on the encoding and decoding process of
messages; the cognitive science perspective discusses how behavioral
cues can inform our interpretations of interlocutors’ minds; the
social science perspective describes how interpretations of others’
minds could predict our behaviors. Our Mutual Theory of Mind
framework attempts to bring these different emphases together into
one coherent framework to understand the mutual shaping process
of interpretations and feedback during communication.
2.2 Theory of Mind in Human-AI Communication
Over the years, many researchers have recognized the crucial role
of ToM in HAI. In human-robot teaming research, ToM has been
intentionally built in as part of the system architecture to help
robots monitor the world state as well as the human state [8], to
construct simulations of hypothetical cognitive models of the human
partner to account for human behaviors that deviate from original
plans [34], and to help robots build mental models of user
beliefs, plans, and goals [20, 24]. Robots built with ToM have
demonstrated positive outcomes in team operations [8] and are perceived
to be more natural and intelligent [29].
Other research in HCI and human-centered AI has also
explored the realm of ToM, focusing mostly on enhancing
users’ mental models and understanding of AI systems. Prior
research has explored people’s mental models of AI systems: people’s
mental model of an AI agent could include global behavior,
knowledge distribution, and local behavior [13]. People’s perception of
AI systems is instrumental in guiding how they interact with AI
systems [13] and thus serves as a precursor to their expectations of
the AI’s behavior. Some recent research has also begun to examine how
to automatically infer users’ mental models of AI. Prior research
suggests the potential of leveraging linguistic cues to indicate people’s
perceptions of AIs during human-AI interactions. Researchers have
been able to infer users’ emotions towards an AI agent [40] and
signs of conversation breakdowns [26] from communication cues.
Given that the AI’s behavior and output could also influence the user’s
mental model of the AI, and therefore how the user decides to
interact with the AI, we want to highlight that the
interpretation-feedback loop is mutual during the human-AI communication
process: the user’s mental model of the AI can be informed by the AI’s
output, yet the AI’s interpretation of the user can also be informed by
the user’s output, which is determined by the user’s mental model
of the AI. We propose the Mutual Theory of Mind framework to capture
this mutual shaping process of interpretation-feedback during
human-AI communication.
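The mutual interpretation-feedback loop described above can be made concrete with a small toy simulation. It is only a sketch under loud assumptions and is not the authors' model or either of their studies: the `Human` and `AI` classes, the 0-4 competence scale, the keyword fallback, and the update rules are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Human:
    goal: str
    # Human's mental model of the AI: perceived competence on a 0-4 scale.
    perceived_ai_competence: int = 2

    def speak(self) -> str:
        # What the human says depends on their mental model of the AI:
        # low perceived competence leads to a terse keyword query.
        if self.perceived_ai_competence < 2:
            return self.goal.split()[-1]
        return self.goal

    def interpret(self, ai_reply: str) -> None:
        # The AI's output updates the human's mental model of the AI.
        step = 1 if self.goal in ai_reply else -1
        self.perceived_ai_competence = max(0, min(4, self.perceived_ai_competence + step))


@dataclass
class AI:
    # AI's interpretation of the user: the most detailed request seen so far.
    inferred_goal: str = ""

    def interpret(self, user_message: str) -> None:
        # The user's output updates the AI's interpretation of the user.
        if len(user_message) > len(self.inferred_goal):
            self.inferred_goal = user_message

    def speak(self) -> str:
        return f"I think you want to: {self.inferred_goal}"


def run_dialogue(human: Human, ai: AI, turns: int) -> list[tuple[str, str, int]]:
    """Alternate human and AI turns, recording (message, reply, competence)."""
    trace = []
    for _ in range(turns):
        message = human.speak()
        ai.interpret(message)
        reply = ai.speak()
        human.interpret(reply)
        trace.append((message, reply, human.perceived_ai_competence))
    return trace
```

In this toy setup, a human who starts with a moderate mental model of the AI states the full goal, the AI interprets it correctly, and perceived competence rises. A human who starts with a low mental model sends only a keyword, the AI's interpretation stays impoverished, and perceived competence falls further, which illustrates how each side's interpretation shapes the other's.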
3 MUTUAL THEORY OF MIND FRAMEWORK
Drawing from theoretical and empirical work, we posit the MToM
framework to guide the understanding and design of communications
between humans and AI systems that exhibit social behaviors
enabled by ToM-like capability. The MToM framework provides
both a process and a content account of human-AI communication
by highlighting three elements that mutually shape the human-AI
communication process in three stages.
3.1 Three Elements of the MToM Framework
In the MToM framework, three elements are critical for humans
and AI to reach mutual understanding during the communication
process: interpretation, feedback, and mutuality.