
virtuous humans (Govindarajulu et al. 2019; Berberich and
Diepold 2018; Mabaso 2020). Apart from offering conveni-
ent means for control and supervision, one major appeal of
the approach is that it could potentially resolve the alignment
problem, i.e., the problem of aligning machine values with
human values (Armstrong 2015; Gabriel 2020).
(4) The fourth theme is based on the relationship between
virtue ethics and connectionism, i.e., the cognitive theory
that mental phenomena can be described using artificial neu-
ral networks. The emphasis on learning, and the capacity to apprehend context-sensitive and non-symbolic informa-
tion without general rules, has indeed led many authors to
highlight the appeal of unifying virtue ethics with connec-
tionism (Berberich and Diepold 2018; Wallach and Allen
2008; Howard and Muntean 2017; Stenseke 2021). The
major reason is that it would provide AVAs with a com-
pelling theoretical framework to account for the develop-
ment of moral cognition (Churchland 1996; DeMoss 1998;
Casebeer 2003), as well as the technological promises of
modern machine learning methods (e.g., deep learning and
reinforcement learning).
(5) The fifth theme is the virtue-theoretic emphasis on
function and role (Coleman 2001; Thornton et al. 2016).
According to both Plato (R 352) and Aristotle (NE 1097b
26-27), virtues are those qualities that enable an agent to
perform their function well. The virtues of an artificial agent
would, consequently, be the traits that allow it to effectively
carry out its function. For instance, a self-driving truck does
not share the same virtues as a social companion robot used
in childcare; they serve different roles, are equipped with
different functionalities, and meet their own domain-specific
challenges. Situating artificial morality within a broader vir-
tue-theoretic conception of function would therefore allow
us to clearly determine the relevant traits a specific artificial
agent needs in order to excel at its particular role.
The biggest challenge for the prospect of AVAs is to move
from the conceptual realm of promising ideas to the level of
formalism and details required for technical implementation.
Guarini (2006, 2013a, 2013b) has developed neural network
systems to deal with the ambiguity of moral language, and
in particular the gap between generalism and particularism.
Without the explicit use of principles, the neural networks can learn to classify cases as morally permissible or impermissible.
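As a concrete, if simplified, illustration of Guarini’s approach, consider the following sketch of a small feed-forward classifier that learns permissibility judgments from encoded cases without any explicitly coded moral principles. The architecture, feature scheme, toy cases, and labels are illustrative assumptions, not Guarini’s actual setup.

```python
# Sketch of a Guarini-style classifier: a small network learns to label
# moral cases as permissible/impermissible from encoded features, without
# any explicitly coded moral principles. Features, cases, and labels are
# invented for illustration.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Each case is encoded as binary features, e.g.:
# [kills, deceives, self_defense, consented, prevents_greater_harm]
cases = np.array([
    [1, 0, 1, 0, 0],  # killing in self-defense
    [1, 0, 0, 0, 0],  # killing with no mitigating context
    [0, 1, 0, 1, 0],  # consensual deception (e.g., a surprise party)
    [0, 1, 0, 0, 0],  # plain deception
    [1, 0, 0, 0, 1],  # killing that prevents a greater harm
    [0, 0, 0, 0, 0],  # neutral act
])
labels = np.array([1, 0, 1, 0, 1, 1])  # 1 = permissible, 0 = impermissible

net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(cases, labels)

# The trained network can then be queried on an unseen case,
# e.g., deception in self-defense:
new_case = np.array([[0, 1, 1, 0, 0]])
print("permissible" if net.predict(new_case)[0] else "impermissible")
```

The moral “knowledge” of such a network resides entirely in its learned weights; whether its generalizations to unseen cases are acceptable is precisely what the debate between generalism and particularism turns on.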
Inspired by Guarini’s classification approach, Howard and Muntean have broadly explored the conceptual and
technical foundations of autonomous artificial moral agents
(AAMAs) based on virtue ethics (Howard and Muntean
2017). Drawing on Annas’s skill metaphor (Annas 2011) and
the moral functionalism of Jackson and Pettit (1995), they
conjecture that artificial virtues (seen as dispositional traits)
and artificial moral cognition can be developed and refined
in a bottom-up process through a combination of neural net-
works and evolutionary computation methods. Their central
idea is to evolve populations of neural networks using an
evolutionary algorithm that, via fitness-based selection, alters the
parameter values, learning functions, and topology of the
networks. The emerging candidate solution is the AAMA
with “a minimal and optimal set of virtues that solves a large
enough number of problems, by optimizing each of them”
(Howard and Muntean 2017, p. 153).
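Rendered schematically, the evolutionary loop they describe might look like the sketch below. The fitness function, task encoding, population sizes, and mutation scheme are illustrative placeholders (their proposal leaves them underdetermined), and the topology is fixed for brevity, whereas Howard and Muntean also envision evolving topologies and learning functions.

```python
# Minimal neuroevolution sketch: a population of small networks is scored
# on a toy "moral" classification task, and the fittest are mutated into
# the next generation. All task details below are placeholder assumptions.
import numpy as np

rng = np.random.default_rng(0)
IN, HID, OUT = 5, 8, 1  # assumed network dimensions

def init_net():
    return [rng.normal(0, 1, (IN, HID)), rng.normal(0, 1, (HID, OUT))]

def forward(net, x):
    h = np.tanh(x @ net[0])
    return 1.0 / (1.0 + np.exp(-(h @ net[1])))  # permissibility score in (0, 1)

def fitness(net, cases, labels):
    preds = np.array([forward(net, c)[0] for c in cases])
    return -np.mean((preds - labels) ** 2)  # placeholder moral "fitness"

def mutate(net, sigma=0.1):
    return [w + rng.normal(0, sigma, w.shape) for w in net]

cases = rng.integers(0, 2, (20, IN)).astype(float)  # toy case encodings
labels = (cases.sum(axis=1) > 2).astype(float)      # toy ground truth

population = [init_net() for _ in range(50)]
for generation in range(100):
    population.sort(key=lambda n: fitness(n, cases, labels), reverse=True)
    elite = population[:10]  # truncation selection
    population = elite + [mutate(elite[rng.integers(len(elite))])
                          for _ in range(40)]

best = max(population, key=lambda n: fitness(n, cases, labels))
print(f"best fitness: {fitness(best, cases, labels):.4f}")
```

Even in this stripped-down form, the sketch makes the open question visible: everything hinges on how `fitness` is defined, which is exactly where the proposal is underspecified.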
Although promising in theory, Howard and Muntean’s proposed project is lacking
in several regards. First, while combinations of neural net-
works and randomized search methods have yielded promis-
ing results in well-defined environments using NeuroEvolution of Augmenting Topologies (NEAT) (Stanley and Miikkulainen 2002) or deep reinforcement learning (Berner et al. 2019), Howard and Muntean’s proposal amounts to a costly search problem over a space of effectively unbounded dimensionality. Furthermore, due to
the highly stochastic process of evolving neural networks
and an equivocal definition of fitness evaluation, it is not
guaranteed that morally excellent agents would appear even
if we granted infinite computational resources. Besides being practically infeasible, the proposal also omits several crucial implementation details, and the authors only provide fragmen-
tary results of an experiment where neural networks learn
to identify anomalies in moral data. It therefore remains
unclear how their envisioned AAMAs ought to be imple-
mented in moral environments apart from the classification
tasks investigated by Guarini (2006).
Berberich and Diepold (2018) have, in a similar vein,
broadly described how various features of virtue ethics can
be carried out by connectionist methods. This includes (a)
how reinforcement learning (RL) can be used to inform the moral reward function of artificial agents (see the sketch after this paragraph), (b) a three-com-
ponent model of artificial phronesis (encompassing moral
attention, moral concern, and prudential judgment), (c) a
list of virtues suitable for artificial agents (e.g., prudence,
justice, temperance, courage, gentleness, and friendship to
humans), and (d) learning from moral exemplars through
behavioral imitation by means of inverse RL (Ng and Rus-
sell 2000). However, apart from offering a rich discussion
of promising features artificial virtuous agents could have,
along with some relevant machine learning methods that
could potentially carry out such features, they fail to provide
the technical details needed to construct and implement their
envisioned agents in moral environments.
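As an illustration of the gap, here is what point (a) could look like at the level of code: a tabular Q-learning agent whose reward sums a task component and a moral penalty. The gridworld, the harm cells, and the penalty weight are invented for illustration; Berberich and Diepold do not specify an environment or a reward at this level of detail.

```python
# Hypothetical sketch of a "moral reward function": task reward plus a
# penalty for entering harm-causing states. Environment and weights are
# illustrative assumptions, not Berberich and Diepold's specification.
import numpy as np

rng = np.random.default_rng(0)
SIZE = 5
GOAL = (4, 4)
HARM = {(2, 2), (2, 3)}  # states in which the agent causes harm
MORAL_WEIGHT = 10.0      # how heavily harm is penalized
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def reward(state):
    task = 1.0 if state == GOAL else -0.01           # task component
    moral = -MORAL_WEIGHT if state in HARM else 0.0  # moral component
    return task + moral

Q = np.zeros((SIZE, SIZE, len(ACTIONS)))

for episode in range(500):
    s = (0, 0)
    for _ in range(200):  # cap episode length
        # epsilon-greedy action selection
        a = int(rng.integers(4)) if rng.random() < 0.1 else int(np.argmax(Q[s]))
        ns = (min(max(s[0] + ACTIONS[a][0], 0), SIZE - 1),
              min(max(s[1] + ACTIONS[a][1], 0), SIZE - 1))
        Q[s][a] += 0.1 * (reward(ns) + 0.9 * Q[ns].max() - Q[s][a])
        s = ns
        if s == GOAL:
            break
```

With a sufficiently large MORAL_WEIGHT, the learned greedy policy detours around the harm cells even when that lengthens the path to the goal; setting the weight to zero recovers purely task-driven behavior. Learning such a penalty term from demonstrations, rather than hand-coding it, is where the inverse RL of point (d) would enter.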
As a first step toward artificial virtue, Govindarajulu
et al. (2019) have provided what they call an “embryonic” formalization of how artificial agents can adopt and learn
from moral exemplars using deontic cognitive event cal-
culus (DCEC). Based on Zagzebski’s “exemplarist moral
theory” (Zagzebski 2010), they describe how exemplars can
be identified via the emotion of admiration, which is defined
as “approving (of) someone else’s praiseworthy action”
(Govindarajulu et al. 2019, p. 33). In their model, an
action is considered praiseworthy if it triggers a pleasurable