
user can flagrantly lie and respond with $n-1$, skewing their response by as much as the number of users. We address this challenge and design robust degree estimation protocols. Our protocols are based on a key observation: graph data is naturally redundant. Specifically, the presence (or absence) of edge $e_{ij}$ is shared by both users $U_i$ and $U_j$. This introduces data redundancy which we leverage to provide robustness: as long as at least one of $U_i$ or $U_j$ is honest, the analyst can check for consistency between the two bits obtained from $U_i$ and $U_j$. We illustrate this using the following degree-estimation protocol. For ease of exposition, we assume no privacy. Consider a protocol where the users send in their (true) adjacency lists. Here, for every edge $e_{ij}$, the analyst receives two copies of the same bit from users $U_i$ and $U_j$; in case there is an inconsistency$^1$ in the two reported bits, the analyst can safely conclude that one party is malicious and act accordingly. Thus, we see that the natural data redundancy in graphs can be leveraged to perform robust data analysis. However, LDP requires randomization, which makes the users' reports probabilistic. Consequently, the aforementioned simple consistency check cannot be applied to LDP protocols. To this end, we design novel degree estimation protocols that enable a data analyst to employ consistency checks and detect malicious users even under LDP. Our proposed protocols provide formal robustness guarantees against data poisoning attacks, significantly reducing the impact of poisoning while computing degree estimates with high accuracy.
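To make this concrete, the following minimal sketch (our own illustration, not the paper's protocol) shows the consistency check in the no-privacy setting; `reports` and `find_inconsistent_pairs` are hypothetical names chosen for exposition:

```python
# Minimal sketch (no privacy): each user U_i reports an adjacency list,
# and the analyst cross-checks the two copies of every edge bit.
def find_inconsistent_pairs(reports):
    """Return pairs (i, j) whose two reported bits for edge e_ij disagree."""
    n = len(reports)
    suspicious = []
    for i in range(n):
        for j in range(i + 1, n):
            # Edge e_ij is reported twice: as reports[i][j] by U_i
            # and as reports[j][i] by U_j.  A mismatch means at least
            # one of the two users is lying.
            if reports[i][j] != reports[j][i]:
                suspicious.append((i, j))
    return suspicious

# Toy example: user 2 falsely claims an edge to user 0.
reports = [
    [0, 1, 0],
    [1, 0, 1],
    [1, 1, 0],  # reports[2][0] = 1, but reports[0][2] = 0
]
print(find_inconsistent_pairs(reports))  # -> [(0, 2)]
```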
In summary, we make the following contributions:
• We propose a formal framework for analyzing the robustness of a protocol. The framework quantifies the impact of poisoning attacks on both honest and malicious users.
• Based on the proposed framework, we study the impact of poisoning attacks on private degree estimation in the LDP setting. We observe that LDP makes a degree estimation protocol more vulnerable to poisoning: the impact of poisoning is worse when the adversary can directly poison their (noisy) responses (Fig. 3), rather than their input data (Fig. 2). The former utilizes the characteristics of LDP while the latter is independent of LDP.
• Leveraging the redundancy in graph data, we design robust degree estimation protocols under LDP that significantly reduce the impact of adversarial poisoning and compute degree estimates with high accuracy.
• We evaluate our proposed robust degree estimation protocols under poisoning attacks on real-world datasets to demonstrate their efficacy in practice.
2 Background
We start by outlining the notation we use in the paper followed
by the necessary preliminaries on LDP for graph analysis.
$^1$For the ease of exposition, we disregard errors due to machine failures.
Notation. In the following and throughout this paper, let $G = (V, E)$ be an undirected graph with $V$ and $E$ representing the set of nodes (vertices) and edges, respectively. We assume a graph with $n \in \mathbb{N}$ nodes, i.e., $V = [n]$, where $[n]$ denotes the set $\{1, 2, \cdots, n\}$. Let $\mathcal{G}$ denote the domain of all graphs with $n$ nodes. Each node $i \in V$ corresponds to a user $U_i$. Let $l_i \in \{0,1\}^n$ be the adjacency list for $U_i$, $i \in [n]$, where bit $l_i[j]$, $j \in [n]$, encodes the edge $e_{ij}$ between a pair of users $U_i$ and $U_j$. Specifically, $l_i[j] = 1$ if $e_{ij} \in E$ and $l_i[j] = 0$ otherwise. Note that $l_i$ is the $i$-th row of the adjacency matrix $L$ of the graph $G$. Let $d = \langle d_1, \ldots, d_n \rangle \in \mathbb{R}^n$ denote the vector of degrees in $G$. $m$ denotes the number of malicious users.
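As a small illustration of this notation (our own sketch, not from the paper), the adjacency lists $l_i$ and the degree vector $d$ for a toy graph can be built as follows:

```python
# Toy illustration of the notation: n = 4 users, edge set E,
# adjacency lists l_i (rows of the matrix L), and degree vector d.
n = 4
E = {(1, 2), (1, 3), (2, 4)}  # undirected edges over V = [n] (1-indexed)

# l[i][j] = 1 iff edge e_ij exists; L is symmetric since G is undirected.
l = [[0] * n for _ in range(n)]
for (u, v) in E:
    l[u - 1][v - 1] = 1
    l[v - 1][u - 1] = 1

# The degree d_i of user U_i is the number of 1s in their list l_i.
d = [sum(row) for row in l]
print(d)  # -> [2, 2, 1, 1]
```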
2.1 Local Differential Privacy for Graphs
Differential privacy (DP) is a quantifiable measure of the stability of the output of a randomized mechanism to changes in its input. One of the most popular models of DP is known as the local model, which is the focus of our paper. The local model consists of a set of individual users (U) and an untrusted data aggregator (analyst); each user perturbs their data using a (local) DP algorithm and sends it to the aggregator, which uses these noisy data to estimate certain statistics of the entire dataset. Thus, the LDP model allows gleaning useful information from the dataset without requiring the data owners to trust any third-party entity.
2.1.1 Privacy Definitions
Relationship DP. As noted above, the knowledge of an edge in a graph is shared between two users. Motivated by this observation, recent work [30,31] has introduced a new notion of DP called relationship DP that considers the outputs of both users and hides an edge in a graph during the entire computation. Formally, we have:
Definition 1 ($\varepsilon$-relationship DP). For $i \in [n]$, let $R_i : \{0,1\}^n \mapsto \mathcal{X}$ be the local randomizer of user $U_i$ that takes their adjacency list $l_i$ as input. We say $\langle R_1(\cdot), \cdots, R_n(\cdot) \rangle$ provides $\varepsilon$-relationship DP if for any two neighboring graphs $G, G' \in \mathcal{G}$ that differ in one edge and for any $S \subseteq \mathcal{X}^n$,
$$\Pr[(R_1(l_1), \cdots, R_n(l_n)) \in S] \le e^{\varepsilon} \Pr[(R_1(l'_1), \cdots, R_n(l'_n)) \in S],$$
where $l_i$ (respectively, $l'_i$) is the $i$-th row of the adjacency matrix of graph $G$ (respectively, $G'$).
Relationship DP is closely connected to the notion of edge LDP, which is one of the most popular privacy guarantees for graphs [41,43] in the LDP setting. Formally:
Definition 2 ($\varepsilon$-edge LDP). Let $R : \{0,1\}^n \mapsto \mathcal{X}$ be a randomized algorithm that takes an adjacency list $l_i \in \{0,1\}^n$ as input. We say $R(\cdot)$ provides $\varepsilon$-edge LDP if for any two neighboring lists $l, l' \in \{0,1\}^n$ that differ in one bit (i.e., one edge) and any output $s \in \mathcal{X}$,
$$\Pr[R(l) = s] \le e^{\varepsilon} \Pr[R(l') = s].$$
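A standard way to satisfy this definition is bit-wise randomized response: since neighboring lists differ in exactly one bit, flipping every bit independently with probability $1/(e^{\varepsilon}+1)$ bounds the probability ratio of any output by $e^{\varepsilon}$. The sketch below is a generic illustration of an $\varepsilon$-edge-LDP randomizer, not the degree estimation protocol proposed in this paper:

```python
import math
import random

def randomized_response_list(l_i, eps):
    """Perturb an adjacency list bit-by-bit with randomized response.

    Each bit is kept with probability e^eps / (e^eps + 1) and flipped
    otherwise.  Neighboring lists differ in a single bit, so for any
    output the probability ratio is at most e^eps, i.e. eps-edge LDP.
    """
    keep_prob = math.exp(eps) / (math.exp(eps) + 1.0)
    return [b if random.random() < keep_prob else 1 - b for b in l_i]

# Example: user U_1 privatizes their adjacency list before sending it.
l_1 = [0, 1, 0, 1, 1]
print(randomized_response_list(l_1, eps=1.0))
```

Note that under such a randomizer the two copies of an edge bit are perturbed independently, so even two honest users may report conflicting bits, which is precisely why the naive consistency check from the introduction cannot be applied directly.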