AI & SOCIETY
https://doi.org/10.1007/s00146-022-01569-x
ORIGINAL PAPER
Artificial virtuous agents inamulti‑agent tragedy ofthecommons
JakobStenseke1
Received: 25 April 2022 / Accepted: 13 September 2022
© The Author(s) 2022
Abstract
Although virtue ethics has repeatedly been proposed as a suitable framework for the development of artificial moral agents (AMAs), it has proven difficult to approach from a computational perspective. In this work, we present the first technical implementation of artificial virtuous agents (AVAs) in moral simulations. First, we review previous conceptual and technical work in artificial virtue ethics and describe a functionalistic path to AVAs based on dispositional virtues, bottom-up learning, and top-down eudaimonic reward. We then provide the details of a technical implementation in a moral simulation based on a tragedy of the commons scenario. The experimental results show how the AVAs learn to tackle cooperation problems while exhibiting core features of their theoretical counterpart, including moral character, dispositional virtues, learning from experience, and the pursuit of eudaimonia. Ultimately, we argue that virtue ethics provides a compelling path toward morally excellent machines and that our work provides an important starting point for such endeavors.
Keywords Machine ethics · Artificial morality · Artificial moral agents · Virtue ethics · AI ethics · Ethics of autonomous systems
1 Introduction
Over the last decades, the rapid development and application of artificial intelligence (AI) have spawned a large body of research focusing on various ethical aspects of AI (AI ethics) and the prospects of implementing ethics into machines (machine ethics)¹. The latter project can further be divided into theoretical debates on machine morality², conceptual work on hypothetical artificial moral agents (Malle 2016), and more technically oriented work on prototypical AMAs³. Following the third branch, the vast majority of the technical work has centered on constructing agent-based deontology (Anderson and Anderson 2008; Noothigattu et al. 2018), consequentialism (Abel et al. 2016; Armstrong 2015), or hybrids (Dehghani et al. 2008; Arkin 2007).
Virtue ethics has repeatedly been suggested as a promising blueprint for the creation of artificial moral agents (Berberich and Diepold 2018; Coleman 2001; Gamez et al. 2020; Howard and Muntean 2017; Wallach and Allen 2008; Mabaso 2020; Sullins 2021; Navon 2021; Stenseke 2021)⁴. Beyond deontological rules and consequentialist utility functions, it presents a path to construe a more comprehensive picture of what it in fact is to have a moral character and be a competent ethical decision maker in general. With the capacity to continuously learn from experience, be context-sensitive, and adapt to changes, an AMA based on virtue ethics could potentially accommodate the subtleties of human values and norms in complex and dynamic environments. However, although previous work has proposed that artificial virtue could be realized through connectionism and the recent advancements made with artificial neural networks and machine learning (Wallach and Allen 2008; Howard and Muntean 2017; Gips 1995; DeMoss 1998), hardly any technical work has attempted to do so (Tolmeijer et al. 2020). The major reason is that virtue ethics has proven difficult to tackle from a computational point of view (Tolmeijer et al. 2020; Bauer 2020; Lindner et al. 2020; Arkin 2007). While action-centric frameworks such as consequentialism and deontology offer more or less straightforward instructions convenient for algorithmic implementation, those who try to construct a virtuous machine quickly find themselves overwhelmed by the task of figuring out how generic virtues relate to moral behavior and how to interpret seemingly intangible concepts such as moral character, eudaimonia (“flourishing”), phronesis (“practical wisdom”), and moral exemplars. This conundrum is further illustrated in Fig. 1.

¹ For a broader introduction to machine ethics, see Wallach and Allen (2008), Anderson and Anderson (2011), and Pereira et al. (2016).
² See Behdadi and Munthe (2020) for an excellent summary of these debates.
³ For two recent surveys on implementations in machine ethics, see Tolmeijer et al. (2020) and Cervantes et al. (2020).
⁴ Virtue ethics has also recently been explored in the context of social robotics and human–robot interaction (Constantinescu and Crisp 2022; Cappuccio et al. 2021; Sparrow 2021; Peeters and Haselager 2021).
In this paper, we refine and extend the technical details of a conceptual model (Stenseke 2021) and present the first experimental implementation of AMAs that solely focuses on virtue ethics. The experimental results show that our AVAs manage to tackle cooperation problems while exhibiting core features of their theoretical counterpart, including moral character, dispositional virtues, learning from experience, and the pursuit of eudaimonia. The main aim is to show how virtue ethics offers a promising framework for the development of moral machines that can be suitably incorporated in real-world domains.
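Before turning to the details, a bare skeleton may help fix intuitions about what such an agent minimally involves. The class below is our own illustrative sketch, not the model specified in the paper (all names, the appraisal callback, and the update rule are assumptions): moral character is represented as a set of dispositional virtues, action selection is biased by those dispositions, and learning is driven by a top-down eudaimonic reward.

```python
# Bare skeleton of an artificial virtuous agent (AVA), only to fix
# intuitions. All names, structures, and update rules are our own
# illustrative assumptions, not the paper's computational model.

class AVA:
    def __init__(self, virtues):
        # Moral character as dispositional virtues: virtue name -> a
        # scalar disposition strength that biases action selection.
        self.virtues = dict(virtues)

    def act(self, actions, appraise):
        # Choose the action most "in character": weight each candidate
        # by how strongly it expresses each virtue (appraise returns a
        # value in [0, 1] for an action/virtue pair).
        return max(actions, key=lambda a: sum(
            strength * appraise(a, v)
            for v, strength in self.virtues.items()))

    def learn(self, expressed_virtues, eudaimonic_reward, lr=0.05):
        # Phronetic learning from experience: a top-down eudaimonic
        # reward strengthens (or weakens) the virtues the agent just
        # expressed in its conduct.
        for v in expressed_virtues:
            self.virtues[v] += lr * eudaimonic_reward

# Example: an agent disposed toward honesty and courage.
agent = AVA({"honesty": 0.7, "courage": 0.4})
choice = agent.act(["tell_truth", "lie"],
                   appraise=lambda a, v: 1.0 if a == "tell_truth" else 0.1)
agent.learn(["honesty"], eudaimonic_reward=1.0)
print(choice, agent.virtues)  # tell_truth {'honesty': 0.75, 'courage': 0.4}
```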
The paper is structured as follows. In Sects. 1.1 and 1.2, we survey previous conceptual and technical work in artificial virtue and outline a eudaimonic version of the theory based on functionalism and connectionist learning. In Sect. 2.1, we outline the computational model of an AVA with dispositional virtues and a phronetic learning system based on eudaimonic reward. In Sect. 2.2, we introduce the ethical environment BridgeWorld, a virtual tragedy of the commons scenario where a population of artificial agents must balance cooperation and self-interest in order to prosper. In Sect. 2.3, we describe how AVAs based on our computational model are implemented in the environment and provide the technical details of the experimental setup. In the remaining sections, we present the experimental results (Sect. 3), discuss a number of persisting challenges, and describe fruitful avenues for future work (Sect. 4).

Fig. 1 Rough sketches of three basic ethical algorithms. (a) Consequentialism: given an ethical dilemma E, a set of possible actions in E, and a way of determining the consequences of those actions and their resulting utility, the consequentialist algorithm will perform the action yielding the highest utility. (b) Deontology: given an ethical dilemma E and a set of moral rules, the deontological algorithm will search for the appropriate rule for E and perform the action dictated by the rule. (c) Virtue ethics: by contrast, constructing an algorithm based on virtue ethics presents a seemingly intriguing puzzle.
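To make the contrast drawn in Fig. 1 concrete, the following toy sketches instantiate the two action-centric algorithms. Everything here (the dilemma encoding, the actions, the utilities, and the rule format) is a hypothetical illustration of ours, not code from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Toy, self-contained sketches of the action-centric algorithms in
# Fig. 1a-b. Dilemma encoding, actions, and utilities are hypothetical.

@dataclass
class Dilemma:
    actions: List[str]          # possible actions in the dilemma E
    outcomes: Dict[str, float]  # action -> utility of its consequences

def consequentialist_choice(d: Dilemma) -> str:
    """(a) Perform the action whose consequences yield the highest utility."""
    return max(d.actions, key=lambda a: d.outcomes[a])

def deontological_choice(
        d: Dilemma,
        rules: List[Tuple[Callable[[Dilemma], bool], str]]) -> Optional[str]:
    """(b) Search for a rule that applies to E and do what it dictates."""
    for applies, dictated_action in rules:
        if applies(d):
            return dictated_action
    return None  # no rule covers the case

# (c) Virtue ethics resists such a one-line procedure: which action is
# "virtuous" depends on character, context, and learned practical wisdom.

d = Dilemma(actions=["lie", "tell_truth"],
            outcomes={"lie": 0.4, "tell_truth": 0.9})
rules = [(lambda dil: "tell_truth" in dil.actions, "tell_truth")]
print(consequentialist_choice(d))      # tell_truth
print(deontological_choice(d, rules))  # tell_truth
```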
1.1 Virtue ethics
Virtue ethics refers to a large family of ethical traditions that can be traced back to Aristotle and Plato in the West and Confucius and Mencius in the East⁵. In Western moral philosophy of the modern day, it has claimed its place as one of the three central frameworks in normative ethics through the work of Anscombe (1958), Nussbaum (1988), Slote (1983, 1992), Hursthouse (1999), and Annas (2011).

⁵ See Crisp and Slote (1997) and Devettere (2002) for two outstanding introductions to virtue ethics.
Essentially, virtue ethics is about being rather than doing. Rather than looking at actions themselves (deontology) or the consequences of actions (consequentialism), the virtuous agent nurtures the character traits that allow her to be morally virtuous. In this way, virtues can be viewed as the morally praiseworthy dispositions—e.g., courage, temperance, fairness—that an agent has or strives to have. Central to virtue ethics is also the concept of phronesis (“practical wisdom”), which, according to Aristotle, can be defined as “a true and reasoned state of capacity to act with regard to the things that are good or bad for man” (NE VI.5). Phronesis encompasses not only the ability to achieve certain ends, but also the exercise of good judgment in relation to more general ideas of the agent's well-being. To that end, phronesis is often construed as the kind of moral wisdom gained from experience that a virtuous adult has but a nice child lacks: “[...] a young man of practical wisdom cannot be found. The case is that such wisdom is concerned not only with universals but with particulars, which become familiar from experience” (NE 1141b 10).
While most versions of virtue ethics agree on the central importance of virtue and practical wisdom, they disagree about the way these are combined and emphasized in different aspects of ethical life. For instance, eudaimonist versions of virtue ethics (Hursthouse 1999; Ryan et al. 2008) define virtues in relation to eudaimonia (commonly translated as “well-being” or “flourishing”), in the sense that the former (virtues) are the traits that support an agent in achieving the latter (eudaimonia). That is, for a eudaimonist, the key reason for developing virtues is that they contribute to an agent's eudaimonia. Agent-based and exemplarist versions, on the other hand, hold that the normative value of virtues is best explained in terms of the dispositions and motivations of the agent, and that these qualities are most suitably characterized in moral exemplars (Slote 1995; Zagzebski 2010)⁶.

⁶ However, this does not mean that moral exemplars are unimportant for eudaimonists, as they can serve to explain how one can, e.g., identify virtues and the aims of virtuous action (Hursthouse 1999). See Hursthouse and Pettigrove (2018) for a comprehensive description of contemporary directions in virtue ethics and their variations.
1.2 Previous work in artificial virtue
The various versions of virtue ethics have given rise to a rather diverse set of approaches to artificial virtuous agents, ranging from narrow applications and formalizations to more general and conceptual accounts. Of the work that explicitly considers virtue ethics in the context of AMAs, it is possible to identify five prominent themes: (1) the skill metaphor developed by Annas (2011), (2) the virtue-theoretic action guidance and decision procedure described by Hursthouse (1999), (3) learning from moral exemplars, (4) connectionism about moral cognition, and (5) the emphasis on function and role.
(1) The first theme is the idea that virtuous moral competence—including actions and judgments—is acquired and refined through active intelligent practice, similar to how humans learn and exercise practical skills such as playing the piano (Annas 2011; Dreyfus 2004). In a machine context, this means that the development and refinement of artificial virtuous cognition ought to be based on a continuous and interactive learning process, which emphasizes the “bottom-up” nature of moral development as opposed to a “top-down” implementation of principles and rules (Howard and Muntean 2017).
(2) The second theme, following Hursthouse (1999), is that virtue ethics can provide action guidance in terms of “v-rules” that express what virtues and vices command (e.g., “do what is just” or “do not do what is dishonest”), and offers a decision procedure in the sense that “An action is right iff it is what a virtuous agent would characteristically (i.e., acting in character) do in the circumstances” (Hursthouse 1999, p. 28). A toy sketch of v-rules as action filters follows this list of themes. Hursthouse's framework has been particularly useful as a response to the claim that virtue ethics is “uncodifiable” and does not provide a straightforward procedure or “moral code” that can be used for algorithmic implementation (Bauer 2020; Arkin 2007; Tonkens 2012; Gamez et al. 2020).
(3) The third theme is the recognition that moral exemplars provide an important source for moral education (Hursthouse 1999; Zagzebski 2010; Slote 1995). In turn, this has inspired a moral exemplar approach to artificial virtuous agents, which centers on the idea that artificial agents can become virtuous by imitating the behavior of excellent virtuous humans (Govindarajulu et al. 2019; Berberich and Diepold 2018; Mabaso 2020). Apart from offering convenient means for control and supervision, one major appeal of the approach is that it could potentially resolve the alignment problem, i.e., the problem of aligning machine values with human values (Armstrong 2015; Gabriel 2020).
(4) The fourth theme is based on the relationship between virtue ethics and connectionism, i.e., the cognitive theory that mental phenomena can be described using artificial neural networks. The emphasis on learning, and the possibility of apprehending context-sensitive and non-symbolic information without general rules, has indeed led many authors to highlight the appeal of unifying virtue ethics with connectionism (Berberich and Diepold 2018; Wallach and Allen 2008; Howard and Muntean 2017; Stenseke 2021). The major reason is that it would provide AVAs with a compelling theoretical framework to account for the development of moral cognition (Churchland 1996; DeMoss 1998; Casebeer 2003), as well as the technological promises of modern machine learning methods (e.g., deep learning and reinforcement learning).
(5) The fifth theme is the virtue-theoretic emphasis on function and role (Coleman 2001; Thornton et al. 2016). According to both Plato (R 352) and Aristotle (NE 1097b 26-27), virtues are those qualities that enable an agent to perform their function well. The virtues of an artificial agent would, consequently, be the traits that allow it to effectively carry out its function. For instance, a self-driving truck does not share the same virtues as a social companion robot used in childcare; they serve different roles, are equipped with different functionalities, and meet their own domain-specific challenges. Situating artificial morality within a broader virtue-theoretic conception of function would therefore allow us to clearly determine the relevant traits a specific artificial agent needs in order to excel at its particular role.
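As flagged under theme (2), here is a toy sketch of how Hursthouse-style v-rules could serve as algorithmic action filters: positive rules (“do what is just”) must be satisfied, and negative rules (“do not do what is dishonest”) must not be violated. The predicates and action names are hypothetical illustrations of ours, not a formalization from the cited literature.

```python
# Toy sketch of v-rules as action filters. Predicates and action names
# are hypothetical illustrations.

def v_rule_filter(candidate_actions, positive_rules, negative_rules):
    """Keep actions commanded by every virtue and forbidden by no vice."""
    return [a for a in candidate_actions
            if all(rule(a) for rule in positive_rules)
            and not any(rule(a) for rule in negative_rules)]

actions = ["share_fairly", "take_everything", "deceive_partner"]
permitted = v_rule_filter(
    actions,
    positive_rules=[lambda a: a != "take_everything"],  # "do what is just"
    negative_rules=[lambda a: a == "deceive_partner"],  # "do not do what is dishonest"
)
print(permitted)  # ['share_fairly']
```

Note that even this toy version presupposes predicates that decide what counts as “just” or “dishonest” in context, which is where the codifiability worry re-enters.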
The biggest challenge for the prospect of AVAs is to move from the conceptual realm of promising ideas to the level of formalism and detail required for technical implementation. Guarini (2006, 2013a, 2013b) has developed neural network systems to deal with the ambiguity of moral language, and in particular the gap between generalism and particularism. Without the explicit use of principles, the neural networks can learn to classify cases as morally permissible/impermissible. Inspired by Guarini's classification approach, Howard and Muntean have broadly explored the conceptual and technical foundations of autonomous artificial moral agents (AAMAs) based on virtue ethics (Howard and Muntean 2017). Based on Annas' skill metaphor (Annas 2011) and the moral functionalism of Jackson and Pettit (1995), they conjecture that artificial virtues (seen as dispositional traits) and artificial moral cognition can be developed and refined in a bottom-up process through a combination of neural networks and evolutionary computation methods. Their central idea is to evolve populations of neural networks using an evolutionary algorithm that, via fitness selection, alters the parameter values, learning functions, and topology of the networks. The emerging candidate solution is the AAMA with “a minimal and optimal set of virtues that solves a large enough number of problems, by optimizing each of them” (Howard and Muntean 2017, p. 153).

Although promising in theory, Howard and Muntean's proposed project is lacking in several regards. First, while combinations of neural networks and randomized search methods have yielded promising results in well-defined environments using NeuroEvolution of Augmenting Topologies (NEAT) (Stanley and Miikkulainen 2002) or deep reinforcement learning (Berner et al. 2019), Howard and Muntean's proposal turns into a costly search problem of infinite dimensions. Furthermore, due to the highly stochastic process of evolving neural networks and an equivocal definition of fitness evaluation, it is not guaranteed that morally excellent agents would appear even if we granted infinite computational resources. Besides being practically infeasible, several crucial details of their implementation are missing, and they only provide fragmentary results of an experiment where neural networks learn to identify anomalies in moral data. It therefore remains unclear how their envisioned AAMAs ought to be implemented in moral environments apart from the classification tasks investigated by Guarini (2006).
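For concreteness, the following is a minimal sketch of the kind of fitness-selection loop such a proposal rests on, restricted to mutating the weights of a fixed, tiny linear policy. The restriction is ours: Howard and Muntean's proposal additionally varies learning functions and network topology, which is precisely what inflates the search space. The fitness function and all numbers are illustrative.

```python
import random

# Minimal neuroevolution sketch: mutate the weights of a tiny linear
# policy under fitness selection. Fitness and data are illustrative.

def fitness(weights, problems):
    # Count how many toy problems the linear policy classifies correctly.
    return sum(1 for x, label in problems
               if (sum(w * xi for w, xi in zip(weights, x)) > 0) == label)

def evolve(problems, pop_size=20, generations=50, n_inputs=3):
    population = [[random.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, problems), reverse=True)
        parents = population[:pop_size // 2]            # fitness selection
        offspring = [[w + random.gauss(0, 0.1)          # weight mutation
                      for w in random.choice(parents)]
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
    return max(population, key=lambda w: fitness(w, problems))

problems = [((1, 0, 1), True), ((0, 1, 0), False), ((1, 1, 1), True)]
best = evolve(problems)
print(fitness(best, problems))  # ideally 3 (all toy cases solved)
```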
Berberich and Diepold (2018) have, in a similar vein, broadly described how various features of virtue ethics can be carried out by connectionist methods. This includes (a) how reinforcement learning (RL) can be used to inform the moral reward function of artificial agents, (b) a three-component model of artificial phronesis (encompassing moral attention, moral concern, and prudential judgment), (c) a list of virtues suitable for artificial agents (e.g., prudence, justice, temperance, courage, gentleness, and friendship to humans), and (d) learning from moral exemplars through behavioral imitation by means of inverse RL (Ng and Russell 2000). However, apart from offering a rich discussion of promising features artificial virtuous agents could have, along with some relevant machine learning methods that could potentially carry out such features, they fail to provide the technical details needed to construct and implement their envisioned agents in moral environments.
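As a minimal illustration of point (a), a moral reward function could, in the simplest case, scalarize task performance together with virtue-relevant feedback signals. The particular signals and weights below are our own assumptions; Berberich and Diepold do not specify a concrete function.

```python
# Sketch of a moral reward function for RL: mix task reward with
# virtue-relevant feedback. Signals and weights are assumptions.

def moral_reward(task_reward, virtue_signals, virtue_weights):
    """Scalarize task performance and per-virtue feedback.

    virtue_signals maps a virtue name to feedback in [-1, 1] for the
    last transition (e.g., 'justice': was the outcome fair to others?).
    """
    shaping = sum(virtue_weights[v] * s for v, s in virtue_signals.items())
    return task_reward + shaping

r = moral_reward(
    task_reward=1.0,
    virtue_signals={"justice": 0.5, "temperance": -0.2},
    virtue_weights={"justice": 0.8, "temperance": 0.4},
)
print(r)  # 1.32
```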
As a first step toward artificial virtue, Govindarajulu et al. (2019) have provided what they call an “embryonic” formalization of how artificial agents can adopt and learn from moral exemplars using deontic cognitive event calculus (DCEC). Based on Zagzebski's “exemplarist moral theory” (Zagzebski 2010), they describe how exemplars can be identified via the emotion of admiration, which is defined as “approving (of) someone else's praiseworthy action” (Govindarajulu et al. 2019, p. 33). In their model, an action is considered praiseworthy if it triggers a pleasurable