1
High-Throughput Flexible Belief Propagation
List Decoder for Polar Codes
Yuqing Ren, Yifei Shen, Leyu Zhang, Andreas Toftegaard Kristensen, Alexios
Balatsoukas-Stimming, Member, IEEE, Andreas Burg, Senior Member, IEEE, Chuan Zhang, Senior Member, IEEE
Abstract—Owing to its high parallelism, belief propagation
(BP) decoding is highly amenable to high-throughput implemen-
tations and thus represents a promising solution for meeting
the ultra-high peak data rate of future communication systems.
However, for polar codes, the error-correcting performance of
BP decoding is far inferior to that of the widely used CRC-aided
successive cancellation list (SCL) decoding algorithm. To close the
performance gap to SCL, BP list (BPL) decoding expands the
exploration of candidate codewords through multiple permuted
factor graphs (PFGs). From an implementation perspective,
designing a unified and flexible hardware architecture for BPL
decoding that supports various PFGs and code configurations
presents a big challenge. In this paper, we propose the first
hardware implementation of a BPL decoder for polar codes and
overcome the implementation challenge by applying a hardware-
friendly algorithm that generates flexible permutations on-the-fly.
First, we derive the graph selection gain and provide a sequential
generation (SG) algorithm to obtain a near-optimal PFG set.
We further prove that any permutation can be decomposed
into a combination of multiple fixed routings, and we design
a low-complexity permutation network to satisfy the decoding
schedule. Our BPL decoder not only has a low decoding latency
by executing the decoding and permutation generation in parallel,
but also supports an arbitrary list size without any area overhead.
Experimental results show that, for length-1024 polar codes with
a code rate of one-half, our BPL decoder with 32 PFGs has
a similar error-correcting performance to SCL with a list size
of 4and achieves a throughput of 25.63 Gbps and an area
efficiency of 29.46 Gbps/mm2at SNR = 4.0dB, which is
1.82×and 4.33×faster than the state-of-the-art BP flip and
SCL decoders, respectively.
Index Terms—polar codes, high-throughput, belief propagation
list (BPL) decoding, permuted factor graph, permutation, auto-
morphism ensemble, hardware implementation.
I. INTRODUCTION
P
OLAR codes, proposed by Arıkan in [1], have become
an integral part of 5G new radio (NR), where they
were ratified as the standard codes for the control channels
of 5G enhanced mobile broadband (eMBB) scenarios [2].
Along with the invention of polar codes, Arıkan introduced
successive cancellation (SC) decoding and belief propagation
Y. Ren, Y. Shen, A. T. Kristensen, and A. Burg are with the Telecom-
munications Circuits Laboratory (TCL),
´
Ecole Polytechnique F
´
ed
´
erale
de Lausanne (EPFL), Lausanne 1015, Switzerland (email:
{
yuqing.ren,
yifei.shen, andreas.kristensen, andreas.burg
}
@epfl.ch). Corresponding author:
Andreas Burg.
Y. Shen, L, Zhang and C. Zhang are with the LEADS of Southeast University,
the National Mobile Communications Research Laboratory, and the Purple
Mountain Laboratories, Nanjing 210096, China (email: chzhang@seu.edu.cn).
A. Balatsoukas-Stimming is with the Department of Electrical Engineering,
Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
(email: a.k.balatsoukas.stimming@tue.nl).
(BP) decoding. Following the evolution of communication
scenarios, both SC and BP decoding led the development of
polar decoding algorithms and implementations, which were
extended into a series of advanced polar decoders such as
SC list (SCL) [3]–[9], BP list (BPL) [10]–[18], and BP flip
(BPF) [19]–[22] decoders.
While the original SC decoding algorithm can achieve
channel capacity at infinite code lengths, it shows poor error-
correcting performance with practical finite code lengths. To
improve the error-correcting performance of SC decoding, SCL
decoding was proposed in [3] to keep a list of up to
L
candi-
date codewords. Additionally, when concatenated with cyclic
redundancy check (CRC) codes [4], polar codes with SCL
decoding outperform low-density parity-check (LDPC) and
Turbo codes in terms of the error-correcting performance [23].
To satisfy the low-latency and high throughput requirements of
eMBB scenarios, node-based fast SCL decoders [5]–[9] focus
on exploiting special constituent codes [24]–[28], which help
to avoid traversing the lower stages of the decoding tree to
provide a significant reduction in decoding latency compared
to conventional bit-wise SCL decoders. The state-of-the-art
(SOA) node-based SCL decoder [9] with a list size (
L= 8
)
achieves a throughput of more than
2.94
Gbps, which fits
the reliability, latency, and throughput requirements of eMBB
scenarios. However, when considering the ultra-high peak data
rate requirements of future communication systems [29], SC-
based decoders become impractical due to the serial processing
inherent in these algorithms [5]–[9].
In contrast to SC-based decoding, BP decoding is an
inherently parallel algorithm. BP decoding can thus be im-
plemented easily in a multi-stage factor graph in pursuit of
a much higher throughput [30]. Additionally, BP decoding
has the potential to realize iterative detection and decoding
to achieve better system performance than separate detection
and decoding [31], [32], which further raises the interest in
BP decoding for academia and industry. Though the error-
correcting performance of BP decoding improves as the
iteration number increases, it is still far behind the SCL
performance. BPF decoding [19]–[22] and BPL decoding [10]–
[18] are two advanced BP algorithms that can approach the
performance of SCL by expanding the exploration of candidate
codewords. BPF decoding guesses the positions of error-prone
bits and sequentially corrects them in additional decoding
attempts. Unfortunately, online identification of error-prone
bits [20]–[22] through sorting and post-processing of channel
messages increases the hardware complexity and degrades
the maximum operating frequency [22]. Alternatively, BPL
arXiv:2210.13887v2 [cs.IT] 19 Mar 2023