
TABLE I
PRIVACY DATASETS AND THEIR ANNOTATION.

Dataset        Ref.  # Images  Labels                   Annotation
?PicAlert      [4]   37,535    private, public, und.    “Private are photos which have to do with the private sphere (like self portraits, family, friends, your home) or contain objects that you would not share with the entire world (like a private email). The rest are public. In case no decision can be made, the picture should be marked as undecidable”
VISPR          [14]  22,167    private, public          EU Data Protection Directive [15], US Privacy Act [16], social network rules [17]
?IPD           [13]  38,525    private, public, und.    see PicAlert and VISPR
?PrivacyAlert  [8]   6,800     †clearly private, clearly public, private, public    “Assume you have taken these photos, and you are about to upload them on your favourite social network [...] tell us whether these images are either private or public in nature. Assume that the people in the photos are those that you know”

KEY – und.: undecidable; VISPR: Visual Privacy dataset [14]; IPD: Image Privacy Dataset [13]; ?: full dataset not available; †: classes are merged into two classes (private and public).
(e.g. selfies, family, friends and home) (see Tab. I). Possible
labels are private, public and undecidable (if no decision could
be made on the image label). For VISPR, the annotations are
based on 68 attributes compiled from the EU Data Protection
Directive 95/46/EC [15], the US Privacy Act 1974 [16], social
network platform rules [17] and additional attributes after
manual inspection of the images [14]. Private images contain
at least one out of 32 privacy attributes related to personal life,
health, documents, visited locations, Internet conversations,
and automobiles. For PrivacyAlert, the images are annotated
through the Mechanical Turk crowd-sourcing platform
(https://www.mturk.com/), and the annotators are asked to
classify the images into 4 classes (clearly private, private,
public and clearly public). The qual-
ity of the annotations is monitored using an attention checker
set that discards annotators who failed to provide the expected
response. The four-class annotations are then grouped into
binary labels that combine clearly private with private and
clearly public with public. PrivacyAlert provides binary labels
for each image and VISPR provides privacy attributes that are
classified as private or public. In PicAlert, 17% of the images
have multiple ternary annotations, for which annotator
agreement needs to be computed. Labels can therefore be decided
depending on the desired level of privacy, for instance by
labelling an image as private in case of annotation
disagreement.
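As an aside, the label merging and the disagreement-based relabelling described above can be sketched as follows; the function names and the conservative tie-breaking rule are assumptions made for this illustration and are not part of the datasets' official tooling.

from collections import Counter

# Hypothetical mapping of PrivacyAlert's four classes to binary labels.
FOUR_TO_TWO = {
    "clearly private": "private",
    "private": "private",
    "public": "public",
    "clearly public": "public",
}

def merge_privacyalert_label(label: str) -> str:
    """Group a four-class PrivacyAlert annotation into a binary label."""
    return FOUR_TO_TWO[label]

def resolve_picalert_label(annotations: list[str]) -> str:
    """Resolve multiple ternary PicAlert annotations for a single image.

    Conservative policy (an assumption of this sketch): any 'private' vote,
    including in case of disagreement, labels the image private; a unanimous
    'public' vote labels it public; otherwise it remains undecidable.
    """
    votes = Counter(annotations)
    if votes["private"] > 0:
        return "private"
    if votes["public"] == len(annotations):
        return "public"
    return "undecidable"

print(merge_privacyalert_label("clearly private"))    # private
print(resolve_picalert_label(["public", "private"]))  # private (disagreement)
print(resolve_picalert_label(["public", "public"]))   # public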
B. Graph-based models for image classification
Graph-based methods have recently been introduced for
privacy classification [13], [18]. Graph-based networks model
information as nodes whose relationships are defined through
edges. The representation of each node is updated by
propagating information through the edges. The initialisation
of the graph, typically encoded as an adjacency matrix, is
often referred to as prior knowledge.
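For illustration, a single propagation step over such a graph can be sketched as below; the toy adjacency matrix, the feature dimensions and the mean-style aggregation are assumptions of this example, not the propagation rule of any specific model discussed here.

import torch

# Toy graph: 4 nodes with 8-dimensional representations.
num_nodes, feat_dim = 4, 8
A = torch.tensor([[0., 1., 0., 1.],   # prior knowledge: which nodes are related (edges)
                  [1., 0., 1., 0.],
                  [0., 1., 0., 0.],
                  [1., 0., 0., 0.]])
X = torch.randn(num_nodes, feat_dim)  # initial node representations

# One propagation step: each node aggregates its neighbours' representations
# (normalised by its degree) and mixes them through a learnable projection.
W = torch.nn.Linear(feat_dim, feat_dim, bias=False)
deg = A.sum(dim=1, keepdim=True).clamp(min=1)
X_updated = torch.relu(W((A @ X) / deg))
print(X_updated.shape)  # torch.Size([4, 8])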
Prior knowledge structured as a knowledge graph can
improve image classification performance [19]. The Graph
Search Neural Network (GSNN) [19] incorporates prior
knowledge into Graph Neural Networks (GNN) [20] to solve
a multi-task vision classification problem (see Tab. II). GSNN
is based on the Gated Graph Neural Network (GGNN) [21],
reducing computational requirements and observing the flow
of information through the propagation model.
TABLE II
MAIN COMPONENTS OF THE ARCHITECTURE OF GRAPH-BASED METHODS.
Model Ref. Architecture Task
GGNN [21] GRU+GNN representation learning
GSNN [19] GGNN image classification
GRM [23] GGNN+GAT relationship recognition
GIP [13] GGNN+GAT image privacy classification
DRAG [18] GCN image privacy classification
KEY – GRU: Gated Recurrent Unit [22]; GNN: Graph Neural Network [20]; GAT:
Graph Attention Networks [24]; GCN: Graph Convolutional Network [25].
GGNN uses the Gated Recurrent Unit (GRU) [22] to update the hidden
state of each node with information from the neighbouring
nodes. GGNN is a differentiable recurrent neural network that
operates on graph data representations, iteratively propagating
the relationships to learn node-level and graph-level repre-
sentations. The Graph Reasoning Model (GRM) [23] uses
objects and interactions between people to predict social rela-
tionships. The graph model weights the predicted relationships
with a graph attention mechanism based on Graph Attention
Networks (GAT) [24]. GRM uses prior knowledge on social
relationships, co-occurrences and objects in the scene as a
structured graph. Interactions between people of interest and
contextual objects are modelled by GGNN [21], where nodes
are initialised with the corresponding semantic regions. The
model learns which objects carry task-relevant information.
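A simplified version of the GRU-based node update used in GGNN can be sketched as follows; the adjacency matrix, the dimensions and the number of propagation steps are assumptions for this example and do not reproduce the exact formulation of [21].

import torch

num_nodes, hidden_dim = 5, 16
A = (torch.rand(num_nodes, num_nodes) > 0.5).float()  # assumed adjacency (prior knowledge)
h = torch.randn(num_nodes, hidden_dim)                 # node hidden states

# Each propagation step gathers the neighbours' hidden states and feeds them,
# together with the current state, to a GRU cell shared across all nodes.
gru = torch.nn.GRUCell(input_size=hidden_dim, hidden_size=hidden_dim)
for _ in range(3):
    messages = A @ h   # aggregate information from neighbouring nodes
    h = gru(messages, h)
print(h.shape)  # torch.Size([5, 16])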
Graph Image Privacy (GIP) [13] replaces the social rela-
tionship nodes with two nodes representing the privacy classes
(private, public). Dynamic Region-Aware Graph Convolutional
Network (DRAG) [18] adaptively models the correlation be-
tween important regions of the image (including objects) using
a self-attention mechanism. DRAG [18] is a Graph Convo-
lutional Network (GCN) [25] that learns the relationships
among specific regions in the image without the use of object
recognition.
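For reference, one GCN propagation layer in the spirit of [25] can be sketched as below; the region features, the randomly generated correlation matrix and the two-class read-out are illustrative assumptions and do not reproduce DRAG's actual architecture.

import torch

def gcn_layer(A: torch.Tensor, X: torch.Tensor, W: torch.nn.Linear) -> torch.Tensor:
    """One GCN step with self-loops and symmetric degree normalisation."""
    A_hat = A + torch.eye(A.size(0))          # add self-loops
    d_inv_sqrt = torch.diag(A_hat.sum(dim=1).pow(-0.5))
    return torch.relu(d_inv_sqrt @ A_hat @ d_inv_sqrt @ W(X))

# Assumed setup: 6 image regions with 32-dimensional features and a
# (here random) matrix encoding the correlation between regions.
num_regions, feat_dim = 6, 32
A = (torch.rand(num_regions, num_regions) > 0.6).float()
X = torch.randn(num_regions, feat_dim)

Z = gcn_layer(A, X, torch.nn.Linear(feat_dim, feat_dim, bias=False))
logits = torch.nn.Linear(feat_dim, 2)(Z.mean(dim=0))  # pooled private/public read-out
print(logits.shape)  # torch.Size([2])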
III. METHOD
In this section, we present the content-based features, the
prior information used to initialise Graph Privacy Advisor (GPA),
and the graph-based learning and privacy classification.
A. Features
Cardinality can affect the prediction of the privacy class,
especially considering the person category. An image is more