Scalable Measurement Error Mitigation via Iterative Bayesian Unfolding

Siddarth Srinivasan,¹ Bibek Pokharel,²,³ Gregory Quiroz,⁴,⁵ and Byron Boots⁶
¹Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195
²Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
³Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, CA 90089, USA
⁴Johns Hopkins University Applied Physics Laboratory, Laurel, Maryland 20723
⁵William H. Miller III Department of Physics & Astronomy, Johns Hopkins University, Baltimore, Maryland 21218, USA
⁶Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195
(Dated: October 25, 2022)
Measurement error mitigation (MEM) techniques are post-processing strategies to counteract systematic readout errors on quantum computers (QCs). Currently used MEM strategies face a tradeoff: methods that scale well with the number of qubits return negative probabilities, while those that guarantee a valid probability distribution are not scalable. Here, we present a scheme that addresses both of these issues. In particular, we present a scalable implementation of iterative Bayesian unfolding, a standard mitigation technique used in high-energy physics experiments. We demonstrate our method by mitigating QC data from experimental preparation of Greenberger-Horne-Zeilinger (GHZ) states on up to 127 qubits and implementations of the Bernstein-Vazirani algorithm on up to 26 qubits.
I. INTRODUCTION
Measurement errors are a significant source of error [1–4] for today's programmable quantum computers (QCs). Currently, QCs typically implement quantum algorithms with a measurement at the end of the quantum circuit. However, advanced quantum operations, including quantum error correction [5, 6], involve measurements not only at the end of the quantum computation but throughout the computational scheme. Thus, measurement accuracy is vital to obtaining accurate computational results. Measurement error mitigation (MEM) techniques are a collection of classical post-processing techniques that aim to address the measurement errors inherent to QC hardware. Most MEM methods are unable to achieve both scalability and a guarantee of valid probability distributions after mitigation. Tackling this challenge, we present a scalable MEM strategy whose time and memory costs do not scale exponentially with the number of qubits while also guaranteeing valid probability distributions.
Most MEM methods assume that the true classical probability distribution $\vec{\theta}$ obtained from measuring a quantum state in the computational basis is modified by a response matrix $R$ to yield $\vec{p}$, a noisy probability distribution distorted by measurement occurring in a faulty basis, i.e.,

$$\vec{p} = R\,\vec{\theta}. \qquad (1)$$
Eq. (1) assumes that measurement errors are systematic, linear, and time-independent. To obtain the response matrix, most MEM techniques further assume that state preparation errors can be ignored. These simplifying assumptions have been verified for IBM Quantum Experience (IBMQE) superconducting devices [7, 8], provided the readout calibration data is up to date.
Assuming that systematic measurement errors manifest as in Eq. (1), MEM schemes require two steps. The first step is quantum detector tomography (QDT) [7] to acquire the response matrix $R$ (see Fig. 1). The entries of the response matrix $R_{ij}$ are conditional probabilities, i.e., the probability of measuring a bitstring $b_i$ given that computational basis element $|b_j\rangle$ was prepared. In the second step, the classical probability distribution $\vec{p}$ acquired from measurements after running a quantum circuit $U$ is used to compute a new probability distribution $\vec{\theta}$ which approximates the true distribution (for any future experiment). The most popular strategy, response matrix inversion, defines

$$\vec{\theta} = R^{-1}\vec{p}. \qquad (2)$$
Acquiring $R$ and implementing MEM face two challenges: scaling and negative probabilities. First, $R$ is a $2^n \times 2^n$ matrix for an $n$-qubit system and in general requires $2^n$ calibration experiments. In other words, the dimensions of $R$ are exponential in $n$, and QDT requires exponentially many experiments. Second, while the response matrix $R$ is stochastic, its inverse is not. As most MEM methods rely on applying $R^{-1}$, this leads to a 'negative probability' issue: the distributions obtained after measurement error mitigation are quasiprobabilities, whose elements can be negative or greater than 1 as long as they sum to 1.
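The negativity issue is easy to see in a single-qubit sketch. The numpy example below uses an illustrative, made-up response matrix and empirical distribution (not calibration data from any real device) to show $R^{-1}\vec{p}$ producing a quasiprobability:

```python
import numpy as np

# Illustrative single-qubit response matrix: columns are prepared states
# |0>, |1>; rows are measured outcomes 0, 1. Error rates are made up.
R = np.array([[0.97, 0.10],
              [0.03, 0.90]])

# Suppose the true state is |0>, so theta = [1, 0]. Finite-shot noise can
# make the empirical distribution slightly "cleaner" than R @ theta.
p_noisy = np.array([0.99, 0.01])

theta_hat = np.linalg.inv(R) @ p_noisy
print(theta_hat)  # one entry is negative: a quasiprobability, not a probability
```

The entries still sum to 1 (since $R^{-1}$ preserves column sums), but the second entry is negative, which is exactly the behavior that motivates IBU below.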
Various strategies have been proposed to address the scalability of the response matrix inversion method. Assuming $k$-locality for measurement crosstalk [7, 9] allows $R$ to be represented as $n/k$ different $2^k \times 2^k$ matrices. This assumption has been verified in Refs. [7, 8] and presents the first step in addressing the scalability issue. The unmitigated probability distributions are already sparse, as the experiments required to construct them are only performed for $S$ shots, generally between 500 and 100,000. In practice,
$S$ does not scale exponentially with the number of qubits due to time and computational constraints.
arXiv:2210.12284v1 [quant-ph] 21 Oct 2022
Ref. [8] relied on this sparsity to
make MEM tractable. In particular, only the matrix elements of $R$ whose bitstrings were close to the observed ones (as measured by Hamming distance) were considered, while the others were ignored. This subspace reduction of $R$ means that a user can perform QDT after, and with knowledge of, the unmitigated experimental results. In other words, the order of QDT and the experiment in Fig. 1 is flipped. Similarly, Ref. [10] suggested placing a precision threshold on $R$ and MEM, such that any probabilities below $1/(10 \times S)$ are ignored. These approaches to scaling MEM have been quite effective. In particular, Refs. [8, 10, 11] have demonstrated Greenberger-Horne-Zeilinger (GHZ) state preparation on 27, 42, and 65 qubits, respectively. MEM plays a crucial part in these demonstrations by significantly improving success metrics.
The methods listed above, while scalable, aim to invert $R$ and apply $R^{-1}$, and the non-stochasticity of $R^{-1}$ necessarily leads to quasiprobabilities. Numerous methods [1, 12, 13] have been proposed to address the negativity issue, but no clear consensus has emerged. One strategy is to use constrained least-squares optimization to find a probability distribution with no negative entries that is close to the mitigated quasiprobability distribution (in terms of some norm-based distance between probability distributions). However, it is unclear how to scale this approach. Alternatively, the negative terms can be canceled systematically by projecting to the nearest valid probability distribution, as detailed in Ref. [13]. However, this zeroes out all negative probabilities and can no longer be considered an unbiased estimator of the true distribution without measurement errors. Here, Ref. [12] is particularly notable. It highlighted that error-prone detectors are standard across experimental fields, and that unfolding techniques used in high-energy physics experiments can also be used to mitigate readout errors on quantum devices. In particular, Ref. [12] showed that iterative Bayesian unfolding (IBU) [14] can mitigate readout errors while preserving non-negativity.
While IBU does not return quasiprobabilities in its basic form, it requires iterative matrix multiplications with $R$ and is challenging to scale to many qubits without further modification. In this paper, we demonstrate a scalable implementation of IBU. The original IBU formulation starts with the response matrix $R$, the noisy empirical distribution $\vec{p}$, and an initial guess $\vec{\theta}^0$. Then it iteratively applies Bayes' rule to find a mitigated probability distribution $\vec{\theta}^k$ like so:

$$\theta^{k+1}_j = \sum_{i=1}^{2^n} \frac{p_i \, R_{ij} \, \theta^k_j}{\sum_m R_{im} \, \theta^k_m} \qquad (3)$$
A well-known technique in the high-energy physics community, IBU is also known as Richardson-Lucy deconvolution [15, 16]. While IBU uses Bayes' rule and starts with an initial guess, sometimes referred to as a 'prior', it does not report a 'posterior' in any sense and is not a Bayesian method. It is more appropriate to call it iterative expectation-maximization unfolding [17], as it converges to the maximum likelihood estimate [18] for a large number of iterations. Since this method may not be familiar to the quantum computing community, we provide a derivation of IBU as expectation maximization as relevant to the quantum computing setting.
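For concreteness, the update in Eq. (3) can be transcribed term-by-term into a short Python sketch. The response matrix and empirical distribution below are illustrative assumptions, not data from the paper's experiments:

```python
import numpy as np

def ibu_step_naive(theta, p, R):
    """One IBU iteration, transcribed directly from Eq. (3).
    Deliberately unvectorized; Sec. II develops the scalable version."""
    dim = len(theta)
    theta_new = np.zeros(dim)
    for j in range(dim):
        for i in range(dim):
            denom = sum(R[i, m] * theta[m] for m in range(dim))
            theta_new[j] += p[i] * R[i, j] * theta[j] / denom
    return theta_new

# Illustrative single-qubit example (made-up numbers).
R = np.array([[0.97, 0.10],
              [0.03, 0.90]])   # R[i, j] = P(measure b_i | prepared b_j)
p = np.array([0.99, 0.01])     # noisy empirical distribution
theta = np.array([0.5, 0.5])   # uniform initial guess theta^0
for _ in range(50):
    theta = ibu_step_naive(theta, p, R)
# Unlike R^{-1} p, every iterate is a valid probability distribution.
```

Since each term in Eq. (3) is non-negative and the column sums of the update cancel to 1, every iterate stays on the probability simplex, which is the non-negativity guarantee discussed above.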
[Figure 1: circuit schematics for quantum detector tomography (top) and the error-mitigated experiment (bottom).]

FIG. 1. Two steps of MEM. The quantum detector tomography (QDT) circuit is shown on top. QDT is performed to estimate the response matrix $R$. The bottom figure shows the schematic circuit for the MEM application. After applying a unitary $U$, the classical measurement data is fed into a MEM scheme to produce an error-mitigated probability distribution.
Our main challenge was to show that, under reasonable assumptions about measurement crosstalk and precision, IBU can be implemented scalably. To demonstrate the scalability of our method, we mitigate errors on QC data acquired by implementing the Bernstein-Vazirani algorithm [19] on the 27-qubit IBMQ Montreal [20] and by preparing GHZ states on the 127-qubit IBMQ Washington. Our MEM implementation performs as well as or better than the current state-of-the-art M3 method [8] without producing negative probabilities, albeit with slightly higher computational time.
The rest of the paper is structured as follows: in Section II, we describe our implementation of scalable iterative Bayesian unfolding. We discuss the connection between IBU and expectation maximization in Section III. Section IV demonstrates our MEM method on two quantum datasets: the Bernstein-Vazirani algorithm on up to 26 qubits (Section IV A) and GHZ state preparation up to 81 qubits (Section IV B). We give some concluding remarks in Section V.
II. SCALABLE IBU IMPLEMENTATION
Here, we discuss the assumptions and computational approaches we use to scalably implement IBU (as in Eq. (3)) for measurement error mitigation (see Algorithm 1).
a. Vectorized Implementation First, we show how to vectorize the IBU update rule given in Eq. (3) to enable fast parallel computation on GPUs. Let $Z$ be a random variable representing the true distribution over bitstrings without any measurement error and $Z'$ be a random variable representing the distribution over bitstrings with measurement error. With $c(z'_i)$ the counts of bitstring $z'_i$ and $S$ shots, let $\vec{p} \in \mathbb{R}^{2^n}$ be the noisy distribution with $p_i = c(z'_i)/S$, let $\vec{\theta}^k \in \mathbb{R}^{2^n}$ be the $k$th-iteration guess of the error-mitigated distribution with $\theta^k_j = P_{\vec{\theta}^k}(Z_j)$, and let $R \in \mathbb{R}^{2^n \times 2^n}$ be the response matrix with $R_{ij} = P(Z'_i \mid Z_j)$. Then, the IBU update rule is:

$$\vec{\theta}^{k+1} = \sum_{i=1}^{2^n} \frac{p_i \, (\vec{R}_i \odot \vec{\theta}^k)}{\vec{R}_i \cdot \vec{\theta}^k} = \left( \vec{p}^{\,T} \, \mathrm{diag}\!\left(\frac{1}{R\vec{\theta}^k}\right) R \, \mathrm{diag}(\vec{\theta}^k) \right)^{T} = \vec{\theta}^k \odot \left( R^{T} \left( \vec{p} \oslash R\vec{\theta}^k \right) \right) \qquad (4)$$

Here, $\vec{R}_i$ denotes the $i$th row of $R$, $\odot$ is element-wise multiplication, and $\oslash$ is element-wise division. This expression fully vectorizes the IBU update rule by writing it solely in terms of element-wise operations and matrix products. It also optimizes the order of operations to avoid constructing large intermediate matrices; only the additional memory for storing $\vec{\theta}^{k+1}$ is needed.
b. Tensor Product Structure of Noise Model Despite the efficient update scheme above, the memory requirements are still quite unfavorable. As previously discussed, the first big memory bottleneck is that the response matrix $R$ grows exponentially in the number of qubits: $2^n \times 2^n$. To get around this issue, we assume that there is no measurement crosstalk, and hence that $R$ has a convenient tensor product decomposition [7–9], i.e., $R = R^{(1)} \otimes R^{(2)} \otimes \cdots \otimes R^{(n)}$. While we restrict ourselves to this scenario, our implementation can be extended to cases where measurement crosstalk is restricted to $k$ qubits.

With this tensor product structure on $R$, we now have $4n$ parameters to represent $R$, as opposed to the original $4^n$ parameters. This structure also allows the matrix product $R\vec{x}$ to be computed very quickly using only matrix products and reshapes [21], without any additional memory requirement. This is the simplest structure we can impose on $R$ to obtain tractability, although it can be relaxed into other tensor network decompositions of $R$ as well (such as matrix product states (MPS)).
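The matrix-product-and-reshape trick can be sketched as follows: view $\vec{x}$ as an $n$-axis tensor and contract one single-qubit factor per axis, so the full $2^n \times 2^n$ matrix is never formed. The 2×2 factors below are illustrative, and the exact axis/bit-ordering convention is an assumption of this sketch:

```python
import numpy as np

def kron_matvec(factors, x):
    """Compute (R1 ⊗ R2 ⊗ ... ⊗ Rn) @ x with only 2x2 contractions and
    reshapes, never materializing the 2^n x 2^n Kronecker product."""
    n = len(factors)
    x = x.reshape([2] * n)  # axis 0 = most significant bit (kron order)
    for k, Rk in enumerate(factors):
        # Contract Rk's column index with axis k of x, then restore axis order.
        x = np.moveaxis(np.tensordot(Rk, x, axes=([1], [k])), 0, k)
    return x.reshape(-1)

# Check against the explicit Kronecker product on a 2-qubit example.
A = np.array([[0.95, 0.08], [0.05, 0.92]])
B = np.array([[0.97, 0.04], [0.03, 0.96]])
x = np.arange(4, dtype=float)
assert np.allclose(np.kron(A, B) @ x, kron_matvec([A, B], x))
```

Each contraction costs $O(2^n)$ multiplies per factor, so the whole product is $O(n\,2^n)$ instead of the $O(4^n)$ of a dense matrix-vector multiply, with no extra memory beyond the reshaped vector.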
c. Subspace Reduction for Tractability While we managed the exponential scaling of $R$ with a tensor product decomposition, the parameters of the true latent distribution $\vec{\theta} \in \mathbb{R}^{2^n}$ also scale exponentially in the number of qubits. To manage this, we use the subspace reduction of Ref. [8]. In particular, instead of maintaining and updating a vector of probabilities over $2^n$ bitstrings, we only maintain and update an $M$-dimensional vector over select bitstrings. Specifically, we pre-select $M$ bitstrings (e.g., all those bitstrings within Hamming distance $d$ of any bitstring in our dataset) and initialize $\vec{\theta}^0$ with non-zero values in the entries corresponding to those $M$ bitstrings; all other entries are zero. Under such an initialization $\vec{\theta}^0$, the IBU update rule guarantees that all entries that started as zero remain zero. Thus, we can ignore those entries and track only the $M \times 1$ sub-vector corresponding to non-zero entries. We refer to this subspace-reduced implementation as IBU (Reduced), and to IBU without subspace reduction as IBU (Full). Under this subspace reduction trick, we can no longer easily leverage the Kronecker product structure of $R$ to compute $R\vec{x}$ for any $\vec{x}$. Instead, we must construct a reduced matrix $\tilde{R}$ whose rows and columns correspond to the $M$ selected bitstrings. Unlike Ref. [8], we are not solving a least-squares problem, so matrix-free methods like the generalized minimal residual method (GMRES) will not be helpful; we need to construct the reduced matrix $\tilde{R}$. Even if this matrix does not grow exponentially with the number of qubits, it may still be prohibitively large to store in memory. Our computational solution to this problem is to build the solution to $\tilde{R}\vec{x}$ by weighting each column $\tilde{R}_j$ by $x_j$ and keeping only the running sum. We also use the JAX library's built-in accelerated linear algebra routines and just-in-time compilation, vectorize as many operations as possible, and use a GPU for parallelization. Although we incur the cost of repeatedly computing $\tilde{R}$, there is a tradeoff between this cost and the memory cost of explicitly storing $\tilde{R}$.
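A sketch of the subspace bookkeeping, assuming crosstalk-free single-qubit response matrices: enumerate the Hamming ball around the observed bitstrings, then fill $\tilde{R}$ entry by entry as a product of single-qubit factors. The helper names, the bit-ordering convention, and the 3-qubit data are illustrative assumptions, and for clarity this version stores $\tilde{R}$ explicitly rather than streaming its columns as described above:

```python
from itertools import combinations
import numpy as np

def hamming_ball(bitstrings, n, d):
    """All n-bit strings within Hamming distance d of any observed bitstring."""
    keep = set()
    for s in bitstrings:
        for r in range(d + 1):
            for flips in combinations(range(n), r):
                t = list(s)
                for q in flips:
                    t[q] = '1' if t[q] == '0' else '0'
                keep.add(''.join(t))
    return sorted(keep)

def reduced_response(basis, single_qubit_Rs):
    """Reduced matrix with R~[i, j] = prod_q R_q[b_i[q], b_j[q]],
    assuming no measurement crosstalk (tensor-product noise model)."""
    M = len(basis)
    Rt = np.empty((M, M))
    for i, bi in enumerate(basis):
        for j, bj in enumerate(basis):
            Rt[i, j] = np.prod([Rq[int(bi[q]), int(bj[q])]
                                for q, Rq in enumerate(single_qubit_Rs)])
    return Rt

Rq = np.array([[0.97, 0.10],
               [0.03, 0.90]])            # illustrative per-qubit response matrix
basis = hamming_ball(['000'], n=3, d=1)  # ['000', '001', '010', '100']
Rt = reduced_response(basis, [Rq] * 3)   # 4x4 instead of 8x8
```

IBU then runs entirely on the $M$-dimensional sub-vector against $\tilde{R}$; when $\tilde{R}$ itself is too large to store, the column-weighted running-sum described in the text computes $\tilde{R}\vec{x}$ with the same entry formula, one column at a time.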
d. Worst-case Computational Complexity Suppose we run an experiment with $S$ shots on an $n$-qubit system and track strings up to Hamming distance $d$ from the measured bitstrings. In the worst case, there will be $S$ different measurements, so subspace-reduced IBU tracks $S \cdot \sum_{k=0}^{d} \binom{n}{k} \in O(S n^d)$ bitstrings. The three main computational subroutines are 1) constructing the subspace-reduced response matrix, 2) matrix-vector multiplication, and 3) element-wise multiplication and division. First, the subspace-reduced response matrix has $O(S^2 n^{2d})$ cells to construct. Each cell requires a single call to each of the $n$ single-qubit response matrices, so constructing the matrix takes $O(S^2 n^{2d+1})$ operations. Second, multiplying this matrix with a vector with $O(S n^d)$ entries takes $O(S^2 n^{2d})$ operations. Third, the element-wise multiplication and division operations take $O(S n^d)$ operations. Overall, a single IBU iteration scales as $O(S^2 n^{2d+1})$, better than the exponential scaling of the naive approach.
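The worst-case bookkeeping above is easy to evaluate for concrete settings; the shot count and qubit numbers below are illustrative, not taken from the paper's experiments:

```python
from math import comb

def worst_case_tracked(S, n, d):
    """Worst-case number of tracked bitstrings: every one of the S shots
    yields a distinct bitstring, and we keep the Hamming ball of radius d
    around each, i.e., S * sum_{k=0}^{d} C(n, k)."""
    return S * sum(comb(n, k) for k in range(d + 1))

# e.g., 10,000 shots on 27 qubits with Hamming distance 2:
print(worst_case_tracked(10_000, 27, 2))  # 3790000
```

For comparison, tracking the full space on 27 qubits would require $2^{27} \approx 1.3 \times 10^8$ entries, so even this worst-case subspace is substantially smaller, and in practice repeated bitstrings shrink it further.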
III. ITERATIVE BAYESIAN UNFOLDING AS
EXPECTATION-MAXIMIZATION
Here we summarize the argument that iterative Bayesian unfolding is an instantiation of the well-known Expectation-Maximization (EM) algorithm from machine learning. See Appendix A for the complete derivation, as well as a 'true' Bayesian extension that computes the maximum a posteriori (MAP) estimate of $\vec{\theta}$ given Dirichlet priors. Shepp and Vardi [18] noted connections between IBU and EM for Poisson