
II. SCALABLE IBU IMPLEMENTATION
Here, we discuss some assumptions and computational approaches we use to scalably implement IBU (as in Eq. 3) for measurement error mitigation (see Algorithm 1).
a. Vectorized Implementation. First, we show how to vectorize the IBU update rule given in Equation 3 to enable fast parallel computation on GPUs. Let $Z$ be a random variable representing the true distribution over bitstrings without any measurement error and $Z'$ be a random variable representing the distribution over bitstrings with measurement error. With $c(z'_i)$ the counts of bitstring $z'_i$ and $S$ shots, let $\vec{p} \in \mathbb{R}^{2^n}$ be the noisy distribution with $p_i = c(z'_i)/S$, let $\vec{\theta}_k \in \mathbb{R}^{2^n}$ be the $k$th iteration guess of the error-mitigated distribution with $\theta_{kj} = P_{\vec{\theta}_k}(Z_j)$, and let $R \in \mathbb{R}^{2^n \times 2^n}$ be the response matrix where $R_{ij} = P(Z'_i \mid Z_j)$. Then, the IBU update rule is:

\begin{equation}
\vec{\theta}_{k+1}
= \sum_{i=1}^{2^n} p_i \cdot \frac{R_i^T \odot \vec{\theta}_k}{R_i \vec{\theta}_k}
= \left( \vec{p}^{\,T} \cdot \mathrm{diag}\!\left(\frac{1}{R\vec{\theta}_k}\right) \cdot R \cdot \mathrm{diag}(\vec{\theta}_k) \right)^{T}
= \vec{\theta}_k \odot R^T\!\left( \vec{p} \oslash R\vec{\theta}_k \right)
\tag{4}
\end{equation}

Here, $\odot$ is element-wise multiplication, $\oslash$ is element-wise division, and $R_i$ denotes the $i$th row of $R$. This expression fully vectorizes the IBU update rule by writing it solely in terms of elementwise operations and matrix products. It also optimizes the order of operations to avoid constructing large intermediate matrices; only the additional memory for storing $\vec{\theta}_{k+1}$ is needed.
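As a concrete illustration, Eq. (4) maps directly onto a few lines of JAX. The sketch below is illustrative rather than a verbatim excerpt of our implementation: the function name `ibu_update` and the toy single-qubit response matrices in the usage comment are assumptions for the example, and it presumes a dense $R$ small enough to hold in memory (structured cases are handled next).

```python
import jax
import jax.numpy as jnp

@jax.jit
def ibu_update(theta_k, p_noisy, R):
    """One IBU iteration, Eq. (4): theta_{k+1} = theta_k * R^T (p / (R theta_k))."""
    predicted = R @ theta_k           # R theta_k: predicted noisy distribution
    ratio = p_noisy / predicted       # elementwise division
    return theta_k * (R.T @ ratio)    # elementwise multiplication

# Illustrative usage on 2 qubits with hypothetical single-qubit response matrices:
# R = jnp.kron(jnp.array([[0.97, 0.05], [0.03, 0.95]]),
#              jnp.array([[0.98, 0.04], [0.02, 0.96]]))
# p_noisy = counts / counts.sum()     # empirical noisy distribution, shape (4,)
# theta = jnp.full(4, 0.25)           # uniform initial guess
# for _ in range(50):
#     theta = ibu_update(theta, p_noisy, R)
```

Each iteration costs one matrix-vector product with $R$, one with $R^T$, and elementwise work, matching the operation count implied by Eq. (4).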
b. Tensor Product Structure of Noise Model. Despite the efficient update scheme above, the memory requirements are still quite unfavorable. As previously discussed, the first big memory bottleneck is that the response matrix $R$ grows exponentially in the number of qubits: it is $2^n \times 2^n$. To get around this issue, we assume that there is no measurement crosstalk and hence $R$ has a convenient tensor product decomposition [7–9], i.e., $R = R^{(1)} \otimes R^{(2)} \otimes \cdots \otimes R^{(n)}$. While we restrict ourselves to this scenario, our implementation can be extended to cases where measurement crosstalk is restricted to $k$ qubits. With this tensor product structure on $R$, we now have $4n$ parameters to represent $R$, as opposed to the original $4^n$ parameters. This structure also allows the matrix-vector product $R\vec{x}$ to be computed very quickly using only matrix products and reshapes [21] without any additional memory requirement, as sketched below. This is the simplest structure we can impose on $R$ to obtain tractability, although it can be relaxed to other tensor network decompositions of $R$ as well (such as matrix product states (MPS)).
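For concreteness, one standard way to exploit this structure with only reshapes and small matrix contractions is sketched below. This is an illustrative sketch, not our exact code; it assumes each $R^{(k)}$ is a $2 \times 2$ single-qubit response matrix and that bit $k$ corresponds to the $k$th tensor axis of the reshaped vector.

```python
import jax.numpy as jnp

def kron_matvec(single_qubit_mats, x):
    """Compute (R^(1) kron ... kron R^(n)) @ x using only small matmuls and reshapes.

    single_qubit_mats: list of n arrays, each of shape (2, 2)
    x: vector of length 2**n (e.g. the current guess theta_k)
    """
    n = len(single_qubit_mats)
    t = x.reshape((2,) * n)                      # view x as an n-index tensor
    for k, Rk in enumerate(single_qubit_mats):
        # contract R^(k) against the k-th tensor index, then restore the axis order
        t = jnp.tensordot(Rk, t, axes=([1], [k]))
        t = jnp.moveaxis(t, 0, k)
    return t.reshape(-1)
```

The intermediate tensor always has $2^n$ entries, so no memory beyond the size of $\vec{x}$ itself (up to a constant) is needed, and the cost is $O(n 2^n)$ rather than the $O(4^n)$ of a dense matrix-vector product.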
c. Subspace Reduction for Tractability. While we managed the exponential scaling of $R$ with a tensor product decomposition, the parameters of the true latent distribution $\vec{\theta} \in \mathbb{R}^{2^n}$ also scale exponentially in the number of qubits. To manage this, we can use the subspace reduction used in Ref. [8]. In particular, instead of maintaining and updating a vector of probabilities over $2^n$ bitstrings, we only maintain and update an $M$-dimensional vector over select bitstrings. Specifically, we pre-select $M$ bitstrings (e.g., to be all those bitstrings within Hamming distance $d$ from any bitstring in our dataset) and initialize $\vec{\theta}_0$ with non-zero values in the entries corresponding to those $M$ bitstrings; all other entries are zero. Under such an initialization $\vec{\theta}_0$, the IBU update rule guarantees that all entries that started as zero remain zero. Thus, we can ignore those entries and track only the $M \times 1$ sub-vector corresponding to non-zero entries. We refer to this subspace-reduced implementation as IBU (Reduced), and IBU without subspace reduction as IBU (Full). Under this subspace reduction trick, we can no longer easily leverage the Kronecker product structure of $R$ to compute $R\vec{x}$ for any $\vec{x}$. Instead, we must construct a reduced matrix $\tilde{R}$ whose rows and columns correspond to the $M$ selected bitstrings. Unlike Ref. [8], we are not solving a least-squares problem, so matrix-free methods like the generalized minimal residual method (GMRES) will not be helpful; we need to construct the reduced matrix $\tilde{R}$. Even if this matrix does not grow exponentially with the number of qubits, it may still be prohibitively large to store in memory. Our computational solution to this problem is to build the result of $\tilde{R}\vec{x}$ by weighting each column $\tilde{R}_j$ by $x_j$ and keeping only the running sum. We also use the JAX library's built-in accelerated linear algebra routines and just-in-time compilation, vectorize as many operations as possible, and use a GPU for parallelization. Although we incur the cost of repeatedly computing $\tilde{R}$, there is a tradeoff between this cost and the memory cost of explicitly storing $\tilde{R}$.
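The sketch below illustrates this running-sum strategy as an explicit loop. It is a simplified illustration (our actual implementation vectorizes these operations and relies on JAX's just-in-time compilation), and the names `reduced_matvec`, `single_qubit_mats`, and `bitstrings` are placeholders introduced here.

```python
import jax.numpy as jnp

def reduced_matvec(single_qubit_mats, bitstrings, x):
    """Compute y = R_tilde @ x on the reduced subspace without storing R_tilde.

    single_qubit_mats: list of n single-qubit response matrices, each (2, 2),
                       with R^(k)[a, b] = P(measure a | prepared b) on qubit k
    bitstrings:        (M, n) integer array of the M pre-selected bitstrings
    x:                 (M,) vector living on the reduced subspace
    """
    M, n = bitstrings.shape
    y = jnp.zeros(M)
    for j in range(M):
        # build column j of R_tilde on the fly from the tensor product structure
        col = jnp.ones(M)
        for k in range(n):
            Rk = single_qubit_mats[k]
            col = col * Rk[bitstrings[:, k], bitstrings[j, k]]
        # weight the column by x_j and keep only the running sum
        y = y + x[j] * col
    return y
```

Only one length-$M$ column and the running sum are held in memory at any time, which is exactly the tradeoff between recomputing $\tilde{R}$ and storing it explicitly.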
d. Worst-case Computational Complexity. Suppose we run an experiment with $S$ shots on an $n$-qubit system and track strings up to Hamming distance $d$ from the measured bitstrings. In the worst case, there will be $S$ different measurements, so subspace-reduced IBU tracks $S \cdot \sum_{k=0}^{d} \binom{n}{k} \sim O(Sn^d)$ bitstrings. The three main computational subroutines are 1) constructing the subspace-reduced response matrix, 2) matrix-vector multiplication, and 3) elementwise multiplication and division. First, the subspace-reduced response matrix will have $O(S^2 n^{2d})$ cells to be constructed. Each cell of the matrix requires a single call to each of the $n$ single-qubit response matrices, so constructing the matrix takes $O(S^2 n^{2d+1})$ operations. Second, multiplying this matrix with a vector with $O(Sn^d)$ entries takes $O(S^2 n^{2d})$ operations. Third, the elementwise multiplication and division operations take $O(Sn^d)$ operations. Overall, a single IBU iteration scales as $O(S^2 n^{2d+1})$, better than the exponential scaling of the naive approach.
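As a quick sanity check on the counting above, the worst-case number of tracked bitstrings can be evaluated directly; the numbers below are hypothetical and purely for illustration.

```python
from math import comb

def n_tracked_bitstrings(S, n, d):
    """Worst case for IBU (Reduced): S distinct measured strings, each
    expanded to its Hamming ball of radius d, i.e. S * sum_k C(n, k)."""
    return S * sum(comb(n, k) for k in range(d + 1))

# Hypothetical example: n = 20 qubits, S = 1000 distinct measured strings, d = 2
# gives 1000 * (1 + 20 + 190) = 211,000 tracked strings, versus 2**20 = 1,048,576.
print(n_tracked_bitstrings(1000, 20, 2))   # 211000
```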
III. ITERATIVE BAYESIAN UNFOLDING AS
EXPECTATION-MAXIMIZATION
Here we summarize the argument that iterative Bayesian unfolding is an instantiation of the well-known Expectation-Maximization (EM) algorithm from machine learning. See Appendix A for the complete derivation as well as a 'true' Bayesian extension that computes the maximum a posteriori (MAP) estimate of $\vec{\theta}$ given Dirichlet priors. Shepp and Vardi [18] have noted connections between IBU and EM for Poisson