
To degrade the performance of the global model, the literature has documented several data and model falsification attacks
that poison data, labels, or weights during training [8]. Detecting and mitigating these attacks is challenging
because of the trade-off between the performance of the global model and the privacy of the participants’ sensitive data.
In other words, since the FL paradigm aims to expose as little information about the individual participants’ data as
possible, recognizing and mitigating poisoned data samples is not easy [9]. Therefore, despite existing detection and
mitigation solutions, such as clustering techniques to detect anomalies in model parameters [10] or secure aggregation
functions to remove noisy weights [11], no fully robust solution exists today.
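For illustration, the snippet below sketches one widely discussed data-poisoning strategy, label flipping, in which an adversary replaces a fraction of its local training labels with random incorrect classes before local training. The function name, parameters, and poisoning rule are hypothetical and only convey the general idea; they are not the attacks evaluated later in this work.

```python
import numpy as np

def flip_labels(labels, poison_ratio=0.2, num_classes=10, seed=0):
    """Illustrative label-flipping attack (assumes `labels` is a NumPy
    integer array): a fraction of the local training labels is replaced
    with random incorrect classes before the participant trains."""
    rng = np.random.default_rng(seed)
    poisoned = labels.copy()
    n_poisoned = int(poison_ratio * len(poisoned))
    idx = rng.choice(len(poisoned), size=n_poisoned, replace=False)
    # Shift each selected label by a random non-zero offset so it becomes wrong.
    offsets = rng.integers(1, num_classes, size=n_poisoned)
    poisoned[idx] = (poisoned[idx] + offsets) % num_classes
    return poisoned
```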
Additionally, before thinking about detecting and mitigating adversarial attacks, it is critical to analyze the impact
of heterogeneous attacks on different FL scenarios. In this sense, robust FL architectures and models should be built
to complement detection and mitigation techniques and reduce attack impacts as much as possible. However,
the following challenges remain open regarding FL architectural robustness. First, the impact of existing data and
model poisoning attacks has mainly been validated in horizontal scenarios, while decentralized vertical scenarios remain
unexplored. Second, although different categories of attacks are well known, a direct comparison of their effectiveness
in heterogeneous horizontal and vertical FL architectures is missing. Last but not least, the distribution of data held by
participants is a critical aspect to consider in FL, and there is a lack of work evaluating the robustness of FL models
trained with non-independent and identically distributed (non-IID) data.
To address the previous challenges, this work presents the following main contributions:
•The design and implementation of three FL architectures, namely HoriChain, VertiChain, and VertiComb, one
for horizontal and two for vertical FL scenarios. HoriChain and VertiChain are inspired by a chain-based learning
protocol, while VertiComb follows a peer-to-peer network splitting strategy. The three architectures fully or
partially share the following characteristics: network architecture, training protocol, and dataset structure.
•The proposal of a distributed, decentralized, and privacy-preserving use case suitable for HFL and VFL that
uses non-IID data. In particular, the use case aims to solve the problem of classifying handwritten digits in a
privacy-preserving way by splitting the MNIST dataset among seven participants (a minimal splitting sketch is
given after this list). The three architectures are executed with the same number of participants, number of
adversaries, types of attacks, and attack implementations. Then, the performance of the three architectures is
evaluated and compared. The results show that the VertiChain architecture is less effective than VertiComb and HoriChain.
•The evaluation of the robustness of the HoriChain and VertiComb architectures when trained in the previous scenario
and affected by data and model poisoning attacks. The performed experiments show that different configurations
of both attacks highly affect the accuracy, F1-score, and learning time of both architectures. However, the
HoriChain architecture is more robust than VertiComb when the attacks poison a reduced number of samples
and gradients.
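As referenced in the second contribution, the following minimal sketch shows one possible way of producing a non-IID split of MNIST among seven participants, assuming torchvision is available. The shard-based partitioning rule, function name, and parameters are assumptions for illustration and do not necessarily match the splitting used in the experiments.

```python
import numpy as np
from torchvision.datasets import MNIST

def non_iid_split(num_participants=7, shards_per_participant=2, seed=0):
    """Illustrative non-IID partition: sort samples by label, cut them into
    label-homogeneous shards, and assign a few shards to each participant so
    that every local dataset covers only a subset of the ten digits."""
    data = MNIST(root="./data", train=True, download=True)
    labels = np.asarray(data.targets)
    shards = np.array_split(np.argsort(labels),
                            num_participants * shards_per_participant)
    rng = np.random.default_rng(seed)
    shard_order = rng.permutation(len(shards))
    # Participant i receives every num_participants-th shard of the shuffled order.
    return [np.concatenate([shards[j] for j in shard_order[i::num_participants]])
            for i in range(num_participants)]
```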
The organization of this paper is as follows. First, related work dealing with FL and adversarial attacks is reviewed
in Section 2. Section 3 details the design of the FL architectures. Section 4 describes the use case, the non-IID dataset
splitting, and the training pipeline in which the proposed architectures are tested. Section 5 focuses on explaining the
implementation of the adversarial attacks. The results of the performed experiments are evaluated and discussed in
Section 6. Finally, Section 7 provides conclusions and draws future steps.
2. Related Work
This section reviews the state-of-the-art concerning FL architectures, adversarial attacks affecting different FL
scenarios, and works evaluating the robustness of FL models and architectures.
2.1. FL Scenarios and Architectures
In 2019, [12] defined the scenarios of HFL, VFL, and FTL. The definitions use the symbol $X$ for features,
$Y$ for labels, $I$ for the IDs of participants, and $D$ for the local datasets. Then, an HFL scenario is characterized as
$X_i = X_j,\ Y_i = Y_j,\ I_i \neq I_j,\ \forall D_i, D_j,\ i \neq j$. A VFL scenario can be identified as
$X_i \neq X_j,\ Y_i \neq Y_j,\ I_i = I_j,\ \forall D_i, D_j,\ i \neq j$. Lastly, an FTL scenario has
$X_i \neq X_j,\ Y_i \neq Y_j,\ I_i \neq I_j,\ \forall D_i, D_j,\ i \neq j$. The authors also distinguished FL from
distributed ML. Despite being very similar, in FL, users have autonomy and the central server cannot control their
participation in the training process. FL also has an emphasis on privacy protection, while distributed ML does not.