2 Related Work
Previous studies have explored agent communication in a multi-agent RL setting; Kajic et al. [13] investigate message-based navigation similar to the proposed work, while Cao et al. [6] study communication grounding with respect to game rules in agents of varying degrees of self-interest. In the work of Kim et al. [14], agents use a world model to predict future agent intents and environment dynamics in order to generate, compress and transmit imagined trajectories. Other works explore topological configurations different from fully-connected communication, such as the learnable hierarchical approach of Sheng et al. [23], while communication via noisy channels has been investigated by Tung et al. [24].
Agent deception, betrayal, truthfulness and trustworthiness have been previously investigated in multiple settings [7]; for instance, Christiano et al. [8] present the challenge of discovering latent knowledge in an agent that may produce false or unreliable reports, while Usui et al. [25] evaluate analytic solutions of different strategies in iterated Prisoner's Dilemmas.
Social dilemmas that gauge cooperation versus self-interest are explored in Leibo et al. [16], applied via games like "Gather" and "Wolfpack". "Hidden Agenda" is a team-based game offering a complex action set including 2D navigation, agent/environment interaction, deception and trustworthiness estimation via voting, and is investigated by Kopparapu et al. [15]. Asgharnia [3] uses a hierarchical, fuzzy, situation-aware learning scheme to learn and utilize deception against one or multiple adversaries in a custom environment.
Mitigation approaches include the work of Hughes et al. [11], where reward regularization is approached by adding an inequity penalty in games with short-term versus long-term dilemmas, such as "Cleanup" and "Harvest". Jaques et al. [12] use the same setting with a mutual-information-based mechanism that favors influential communication between agents, adopting the assumption that influence correlates with cooperation. Blumenkamp et al. [4] utilize cooperative policy learning via a shared differentiable communication channel in three custom environments, investigating adaptation dynamics when a self-interested adversary is introduced. Finally, Schmid et al. [21] explore using agents that can explicitly impose penalties in a zero-sum setting, applied to N-player Prisoner's Dilemma games with large agent populations.
Given this body of work, our contributions are as follows:
• A betrayal-oriented environment: we design a simple, limited ruleset that can result in the emergence of complex betrayal behaviors, consolidated in a single-agent RL environment.
• Interpretable betrayal detection: we propose a classification-based detector that utilizes explainable, behavioral / observational evidence generated during agent play.
• Betrayal penalization: we propose avenues for penalizing detected betrayal during learning.
• Experimental validation: we provide preliminary empirical findings showcasing the emergence and successful detection of betrayal behaviors in the proposed environment.
• Future work proposals: we suggest pathways for utilizing the rich potential of the environment in future work, including ruleset extensions and additional investigation axes of interest.
3 Proposed Environment
The proposed environment is built with a focus on betrayal detection and penalization goals expressed
in the literature [2], extending previous work on agent communication in RL settings [13].
It implements an episodic game that consists of a collection of $N \geq 2$ gridworlds $[G_1, \dots, G_N]$, each paired with a single agent $A_i$. All worlds are associated with a pool of $k \geq N$ food items $F = [f_1, \dots, f_k]$ that provide variable reward and nutrition to agents upon consumption.
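To make this structure concrete, the following is a minimal Python sketch of the components and the per-round food allocation; class names such as BetrayalEnv, GridWorld and FoodItem, as well as the uniform sampling of reward and nutrition values, are illustrative assumptions rather than the actual implementation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FoodItem:
    reward: float      # reward yielded upon consumption
    nutrition: float   # nutrition yielded upon consumption


@dataclass
class GridWorld:
    size: int
    food: dict = field(default_factory=dict)  # position -> FoodItem


class BetrayalEnv:
    """Illustrative sketch: N >= 2 gridworlds, one agent per world, shared food pool."""

    def __init__(self, n_agents: int, k_food: int, world_size: int = 5):
        assert n_agents >= 2 and k_food >= n_agents
        self.worlds = [GridWorld(world_size) for _ in range(n_agents)]
        self.k_food = k_food

    def start_round(self):
        """Randomly allocate and position the food pool, and return a random agent order."""
        for world in self.worlds:
            world.food.clear()
        for _ in range(self.k_food):
            world = random.choice(self.worlds)
            pos = (random.randrange(world.size), random.randrange(world.size))
            world.food[pos] = FoodItem(reward=random.random(), nutrition=random.random())
        return random.sample(range(len(self.worlds)), len(self.worlds))

    def probe(self, agent_idx: int, pos: tuple):
        """Agent A_i probes a location in its own world G_i; other worlds are inaccessible."""
        item = self.worlds[agent_idx].food.pop(pos, None)
        return item.reward if item is not None else 0.0
```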
The environment advances in a single-agent, turn-based fashion, using the following rules and mechanics:
• The game is played in rounds, wherein all agents act once in a randomly generated order.
• At the start of each round, food items are randomly allocated and positioned in each world.
• The objective of each agent $A_i$ is to obtain food, which yields reward. Agent $A_i$ may harvest food by probing a location within its own world $G_i$, but other worlds are inaccessible.