Quantum Error Mitigation
Zhenyu Cai,1,2,∗ Ryan Babbush,3 Simon C. Benjamin,1,2 Suguru Endo,4 William J. Huggins,3 Ying Li,5 Jarrod R. McClean,3 and Thomas E. O’Brien3
1Department of Materials, University of Oxford, Oxford, OX1 3PH, United Kingdom
2Quantum Motion, 9 Sterling Way, London N7 9HJ, United Kingdom
3Google Quantum AI, Venice, California 90291, USA
4NTT Computer and Data Science Laboratories, NTT Corporation, Musashino 180-8585, Japan
5Graduate School of China Academy of Engineering Physics, Beijing 100193, China
(Dated: January 1, 2024)
For quantum computers to successfully solve real-world problems, it is necessary to
tackle the challenge of noise: the errors which occur in elementary physical components
due to unwanted or imperfect interactions. The theory of quantum fault tolerance can
provide an answer in the long term, but in the coming era of ‘NISQ’ machines we
must seek to mitigate errors rather than completely remove them. This review surveys
the diverse methods that have been proposed for quantum error mitigation, assesses
their in-principle efficacy, and then describes the hardware demonstrations achieved
to date. We identify the commonalities and limitations among the methods, noting
how mitigation methods can be chosen according to the primary type of noise present,
including algorithmic errors. Open problems in the field are identified and we discuss
the prospects for realising mitigation-based devices that can deliver quantum advantage
with an impact on science and business.
CONTENTS
I. Introduction 1
II. Concepts 3
A. Narrative introduction to concepts and terminology 3
B. Error-mitigated estimators 3
C. Faults in the circuit 5
D. Exponential scaling of the sampling overhead 5
III. Methods 6
A. Zero-noise extrapolation 6
B. Probabilistic error cancellation 7
C. Measurement error mitigation 9
D. Symmetry constraints 11
E. Purity constraints 13
F. Subspace expansions 16
G. N-representability 17
H. Learning-based 18
IV. Comparisons and Combinations 21
A. Comparison among QEM methods 21
1. Noise calibration overhead 21
2. Mean square errors 22
B. Benchmarking QEM from other perspectives 22
1. State discrimination 22
2. Quantum estimation theory 23
3. State extraction 23
C. Combinations of QEM methods 24
D. Comparison to the other error suppression methods 25
V. Applications 27
A. Coherent errors 27
B. Logical errors in fault-tolerant quantum computation 28
C. Algorithmic (compilation) errors 29

Submitted to Reviews of Modern Physics.
Please address all correspondence to Zhenyu Cai.
cai.zhenyu.physics@gmail.com
VI. Open Problems 30
A. Overarching problems 30
B. Technical questions 30
VII. Conclusion 30
Acknowledgments 31
List of Symbols 31
A. Practical Techniques in Implementations 31
1. Monte carlo sampling 31
a. Exponential sampling overhead 32
2. Pauli twirling 32
3. Measurement techniques 33
References 34
I. INTRODUCTION
The central promise of quantum computing is to enable
algorithms that have been shown to provide both poly-
nomial and super-polynomial speed-ups over the best-
known classical algorithms for a special set of problems.
These problems range from simulating quantum mechan-
ics (Feynman,1982) to purely algebraic problems such
as factoring integers (Shor,1999). The strongest chal-
lenge to the viability of practical quantum computing
has always been its sensitivity to errors and noise. It
was realised early on that the coupling of quantum sys-
tems to their environment sets an ultimate time limit
and size limit for any quantum computation (Unruh,
1995). This constraint poses a formidable challenge to
the ambitions of realising a quantum computer, since it
arXiv:2210.00921v3 [quant-ph] 28 Dec 2023
sets bounds on the scalability of any algorithm. With
the advent of quantum error correction (QEC) (Calder-
bank and Shor,1996;Shor,1995;Steane,1996), this
challenge has been solved at least in theory. The cele-
brated threshold theorem (Aharonov and Ben-Or,1997;
Kitaev,1997) showed that if errors in the quantum hard-
ware could be reduced below a finite rate, known as the
threshold, a fault-tolerant quantum computation could
be carried out for arbitrary length even on noisy hard-
ware. However, besides the technical challenge of build-
ing hardware that achieves the threshold, the implemen-
tation of a fault-tolerant universal gate set with current
codes, such as the surface code (Fowler et al.,2012), gen-
erates a qubit overhead that seems daunting at the mo-
ment. For example, recent optimised approaches show
that scientific applications that are classically intractable
may require hundreds of thousands of qubits (Kivlichan
et al.,2020), while industrial applications will require
millions of qubits (Lee et al.,2021). There is ongoing
theoretical research to find alternative codes with a more
favourable overhead, and recent progress gives reasons to
be optimistic (Breuckmann and Eberhardt,2021;Dinur
et al.,2022;Gottesman,2014;Panteleev and Kalachev,
2022). Nevertheless the challenge of realising full-scale
fault-tolerant quantum computing is a considerable one.
This of course motivates the question of whether other
approaches, prior to the era of fully fault-tolerant sys-
tems, might achieve quantum advantage with significant
practical impacts. One might hope so, given the con-
tinual and remarkable progress that has been made in
quantum computational hardware. In recent years, it
has become routine to see reports of experiments demon-
strating high-quality control over multiple qubits [see e.g.
Asavanant et al. (2019); Ebadi et al. (2021); Jurcevic
et al. (2021); Madjarov et al. (2020); and Xue et al.
(2022)], some even reaching beyond 50 qubits [see e.g.
Arute et al. (2019) and Wu et al. (2021)]. Meanwhile
other experiments have indeed demonstrated early-stage
fault-tolerant potential [see e.g. Abobeih et al. (2022);
Egan et al. (2021); Google Quantum AI (2023); Krinner
et al. (2022); Postler et al. (2022); Ryan-Anderson et al.
(2022); and Takeda et al. (2022)]. Of course, the works
mentioned here are far from exhaustive, as it is impossible
to capture all the breakthroughs on different fronts across
the diverse range of platforms; we refer the reader to
Acín et al. (2018) and Altman et al. (2021) and the
references therein for the key milestones in different
platforms.
The primary goal of quantum error mitigation (QEM)
is to translate this continuous progress in quantum hard-
ware into immediate improvements for quantum infor-
mation processing. While accepting that the hardware
imperfections will limit the complexity of quantum algo-
rithms, nevertheless we can expect that every advance
should enable this boundary to be pushed further. As
this review will demonstrate, the mitigation approach
indeed proves to be both practically effective and quite
fascinating as an intellectual challenge.
When exploring the prospects for achieving quantum
advantage through error mitigation, it is crucial to con-
sider suitable forms of circuits. It is understood that
in the era of noisy, intermediate-scale quantum (NISQ)
devices, only certain approaches may be able to achieve
meaningful and useful results. Due to the limited coher-
ence times and the noise floor present in quantum hard-
ware, one typically resorts to the idea of quantum com-
putation with short-depth circuits. Motivating examples
include variational quantum circuits in physics simula-
tions (McClean et al.,2016;Peruzzo et al.,2014;Wecker
et al.,2015), approximate optimisation algorithms (Farhi
et al.,2014), and even heuristic algorithms for quantum
machine learning (Biamonte et al.,2017). Typically in
applications of these kinds, the algorithm can be under-
stood as applying a short-depth quantum circuit to a
simple initial state and then estimating the expectation
value of a relevant observable. Such expectation values
ultimately lead to the output of the algorithm, which
must be accurate enough to be useful in some context
(for example, for estimating the energies of molecular
states, a useful level of chemical accuracy corresponds
to 1kcal/mol (Helgaker et al.,2000)). This leads to the
most essential feature of QEM: the ability to minimise the
noise-induced bias in expectation values on noisy hard-
ware. However, this can also be achieved by QEC and
many other long-established tools like decoherence-free
subspaces and dynamical decoupling sequences (derived
from optimal quantum control) (Lidar, 2014; Suter and
Álvarez, 2016). Therefore this feature alone is not
sufficient to capture the QEM techniques that we wish
to cover in this review.
It is challenging to find a universally acceptable def-
inition of quantum error mitigation. For the purposes
of this review, we will define the term ‘quantum error
mitigation’ as algorithmic schemes that reduce the noise-
induced bias in the expectation value by post-processing
outputs from an ensemble of circuit runs, using circuits
at the same noise level as the original unmitigated cir-
cuit or above. That is to say QEM will only reduce the
effective damage due to noise for the whole ensemble of
circuit runs (with the help of post-processing), but when
we zoom into each individual circuit run, the circuit noise
level remains unchanged or even increases. This is in
contrast to the other techniques like QEC which aim to
reduce the effect of noise on the output in every single
circuit run.
Since QEM performs post-processing using data di-
rectly from noisy hardware, it will become impractical
if the amount of noise in the whole circuit is so large
that it completely damages the output. In practice, this
usually means that for a given hardware set-up, there is a
maximum circuit size (circuit depth times qubit number)
beyond which QEM will become impractical, usually due
to an infeasible number of circuit repetitions. In contrast
to QEC, there is no specific error threshold that one must
surpass before QEM can be useful; different qubit oper-
ation error rates simply lead to different circuit sizes for
which QEM will be practical. In other words, as quan-
tum hardware continues to advance, we will be able to
apply QEM to ever larger quantum circuits for more chal-
lenging applications, without requiring big jumps in the
technologies.
There are certain desirable features for error mitiga-
tion; the methods reviewed here meet the following cri-
teria to differing extents. First, the mitigation method
should ideally only require a very modest qubit over-
head to remain practical on current and near-term quan-
tum hardware. Nevertheless, error mitigation techniques
should provide an accuracy guarantee for the method.
Such a guarantee should ideally provide a formal error
bound on the mitigated expectation values that indicate
how well the method works at varying levels of noise.
The bounds would then indicate which concrete hard-
ware improvements would lead to improved estimates.
Needless to say, methods that are conceptually simple
and easy to implement experimentally lead to practically
feasible approaches. Lastly, a reliable error mitigation
method should require few assumptions (or, no assump-
tions) about the final state that is prepared by the com-
putation. Making strong assumptions about the final
state, for example that the state is a product state, may
restrict the method to scenarios where a computational
advantage over classical approaches may not be given.
We will start by introducing the basic notion of QEM
in Sec. II, with the details of different QEM techniques
presented in Sec. III. The comparison and combinations
of these individual techniques will then be discussed in
Sec. IV. In Sec. V we will explore the application of QEM
in different noise scenarios. Finally, we will discuss the
open problems in the field in Sec. VI and offer a conclusion
in Sec. VII.
II. CONCEPTS
A. Narrative introduction to concepts and terminology
In this section we will introduce certain key concepts
and terminology that are common to all the QEM meth-
ods. However, do note that the approach used in this
section might not be the most natural way to introduce
the individual techniques in Sec. III. In those cases, we
will keep terminology that is unique to a given technique
self-contained in the respective section.
Near-term quantum devices have imperfections that
degrade the desired output information. A QEM proto-
col will aim to minimise this degradation. We will use
the term primary circuit to describe the process that,
ideally, would produce the perfect output state ρ₀ whose
properties we are interested in. In practice, due to the
noise present in the primary circuit, the actual output
state is some noisy state ρ instead.
Typically the ideal output information we seek is the
expectation value of some observable of interest O in the
ideal output state ρ₀. Commonly we would obtain this
information simply by averaging the measurement results
of repeated executions, as opposed to, say, some phase
estimation techniques that can obtain the result through
single-shot measurements but require deeper circuits that
are more relevant to the fault-tolerant computing era.
Therefore, even if we had ideal hardware, we would still
need to perform repeated executions to determine the av-
erage. We will use Ncir to denote the number of circuit
executions, or ‘shots’, that we employ – this will include
any executions of variant circuits called for in the QEM
protocol. Even in the noiseless limit, the finite Ncir will
usually imply a finite inaccuracy in our estimated aver-
age, often called shot noise. However, with perfect noise-
less hardware, there would be zero bias. In other words,
there would be no systematic shift to the estimated mean
versus the true value (the infinite sampling limit). Given
that our hardware is not perfect, there will generally be
finite bias. QEM protocols aim to reduce this bias, but
this often means an increase in the variance (for a fixed
number of circuit executions Ncir). One could increase
Ncir to compensate but this cost should be acknowledged;
the cost is the sampling overhead of that error-mitigation
method versus the ideal noiseless case. In the following
subsections we will make these terms and concepts more
precise.
B. Error-mitigated estimators
Our goal is to estimate the expectation value Tr[Oρ₀]
of some observable of interest O. Using the outputs of
the primary circuit and its variants, we can construct an
estimator Ô for our target parameter Tr[Oρ₀]. The quality
of a given estimator can be assessed in different ways.
One way is to use prediction intervals, which calculate the
interval within which the outcome of the estimator will
fall with a given probability, offering a rigorous bound on
the worst-case deviation of the estimator. Here however,
in order to see the different factors that contribute to the
deviation of the estimator more clearly, we will instead
focus on the expected (average-case) square deviation of
our estimator Ô from the true value Tr[Oρ₀], which is
called the mean square error:

MSE[Ô] = E[(Ô − Tr[Oρ₀])²].  (1)
The ultimate goal of error mitigation is to reduce MSE[Ô]
as much as possible, but this needs to be achieved using
only finite resources. To quantify this, it is useful to
decompose the mean square error of an estimator into two
components, the bias and the variance of the estimator:

MSE[Ô] = Bias[Ô]² + Var[Ô]

with the bias and the variance defined as:

Bias[Ô] = E[Ô] − Tr[Oρ₀]
Var[Ô] = E[Ô²] − E[Ô]².
In this review when we say ‘bias’, sometimes we are
referring to the magnitude of the bias |Bias[Ô]|; the exact
meaning should be obvious from the context.
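As a concrete illustration of the decomposition above, the identity MSE = Bias² + Var can be checked numerically. The sketch below (with purely hypothetical values for Tr[Oρ₀] and the noisy mean) draws single-shot ±1 outcomes and verifies that the sample MSE about the true value equals the squared bias plus the variance:

```python
import random

# Numerical check of MSE = Bias^2 + Var for a single-shot estimator.
# All numbers are hypothetical: the ideal value Tr[O rho_0] is taken
# to be 0.8, while noise shifts the mean of each +/-1 shot to 0.6.
random.seed(0)
true_value = 0.8   # assumed Tr[O rho_0]
noisy_mean = 0.6   # assumed Tr[O rho]

def one_shot():
    """A +/-1 measurement outcome with expectation noisy_mean."""
    return 1 if random.random() < (1 + noisy_mean) / 2 else -1

samples = [one_shot() for _ in range(200_000)]
mean = sum(samples) / len(samples)
bias = mean - true_value
var = sum((s - mean) ** 2 for s in samples) / len(samples)
mse = sum((s - true_value) ** 2 for s in samples) / len(samples)

# The decomposition holds exactly for the sample moments as well.
print(abs(mse - (bias ** 2 + var)))  # ~0, up to floating-point rounding
```

The decomposition is an algebraic identity, so it holds for the empirical moments up to floating-point rounding, not merely in expectation.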
The simplest way to construct the estimator Ô is by
directly measuring O on the noisy output state ρ of the
primary circuit, and the measurement output is simply
denoted using the random variable Ô_ρ. After running the
noisy primary circuit Ncir times, we can take the average
of these noisy outputs (obtaining the noisy sample mean)
to estimate the ideal expectation value Tr[Oρ₀]. This
noisy sample mean estimator is denoted as Ō_ρ and its
mean square error is given by

MSE[Ō_ρ] = (Tr[Oρ] − Tr[Oρ₀])² + (Tr[O²ρ] − Tr[Oρ]²)/Ncir,  (2)

where the first term is the squared bias, Bias[Ō_ρ]² = Bias[Ô_ρ]²,
and the second term is the variance, Var[Ō_ρ] = Var[Ô_ρ]/Ncir.
We can see that the error contribution due to the variance,
which is often called shot noise, will reduce as we
increase the number of circuit runs Ncir. In the limit of a
large number of circuit executions, the mean square error
MSE[Ō_ρ] will be mainly limited by the bias of the estimator,
Bias[Ō_ρ], which is a systematic error that cannot
be reduced by increasing the number of circuit runs.
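The behaviour captured by Eq. (2) can be sketched numerically. In the toy model below (all numbers assumed purely for illustration), a ±1 observable is measured on a noisy state; increasing Ncir suppresses the shot-noise term while the squared bias remains as a floor:

```python
import random

# Toy illustration of Eq. (2), with assumed numbers: the ideal value
# is Tr[O rho_0] = 0.8 but noise damps the mean of each +/-1 shot to
# Tr[O rho] = 0.6. Averaging over N_cir shots leaves
# MSE = Bias^2 + Var/N_cir: the shot-noise term shrinks with N_cir
# while the squared bias (0.2^2 = 0.04) remains as a floor.
random.seed(1)
ideal, noisy = 0.8, 0.6

def sample_mean(n_cir):
    shots = [1 if random.random() < (1 + noisy) / 2 else -1
             for _ in range(n_cir)]
    return sum(shots) / n_cir

def empirical_mse(n_cir, trials=2000):
    return sum((sample_mean(n_cir) - ideal) ** 2
               for _ in range(trials)) / trials

bias_sq = (noisy - ideal) ** 2    # systematic error, independent of N_cir
var_one_shot = 1 - noisy ** 2     # Tr[O^2 rho] - Tr[O rho]^2 for O^2 = I

for n in (10, 100, 1000):
    print(n, round(empirical_mse(n), 4),
          round(bias_sq + var_one_shot / n, 4))
```

The printed empirical MSE tracks the prediction bias² + var/Ncir and approaches the bias floor of 0.04 as Ncir grows, however many shots are taken.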
In order to reduce the bias, we can apply QEM using
data obtained from the noisy primary circuit and its
variants, as will be discussed in Sec. III. We will construct
an error-mitigated estimator Ō_em using the same number
of circuit runs Ncir. We would want to construct the
error-mitigated estimator Ō_em in such a way that it can
achieve a smaller bias than the naive noisy estimator Ō_ρ,

|Bias[Ō_em]| ≤ |Bias[Ō_ρ]|.

This reduction in the bias is usually achieved by
constructing a more complex estimator that extracts and
amplifies the useful information buried within the noise.
As a result, the error-mitigated estimator is also more
sensitive to the variation in the sampled data, and thus
its variance will usually increase,

Var[Ō_em] ≥ Var[Ō_ρ].
Such a bias-variance trade-off is illustrated in Fig. 1 and
can be found in almost all areas of parameter estimation.
As we will see later, different ways of performing error
mitigation often lead to different trade-offs between bias
and variance, giving the user a choice between a quickly-
converging QEM method with large residual error, and
one that is more costly but more accurate.
We can define a ‘one-shot’ error-mitigated estimator
Ô_em that will satisfy E[Ô_em] = E[Ō_em] and Var[Ô_em] =
Ncir Var[Ō_em]. The number of circuit runs needed for a
given estimator X̂ to achieve the shot noise level ϵ is given
by N^ϵ_shot(X̂) = Var[X̂]/ϵ². To reach the same shot noise
level as the original noisy estimator, the error-mitigated
estimator will require more circuit runs. This factor of
increase in the number of circuit runs is called the
sampling overhead, which is given by:

C_em = N^ϵ_shot(Ô_em)/N^ϵ_shot(Ô_ρ) = Var[Ô_em]/Var[Ô_ρ].  (3)
FIG. 1 The probability density distributions of the unmiti-
gated estimator and the error-mitigated estimator. We see a
decrease of bias and an increase of variance after performing
error mitigation.
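As a minimal sketch of Eq. (3), consider a toy mitigation scheme of our own construction (not any specific method from Sec. III): noise damps a ±1 observable’s expectation by a factor 1 − 2q, and multiplying each shot by γ = 1/(1 − 2q) removes the bias while inflating the one-shot variance, giving a sampling overhead of roughly γ²:

```python
import random

# A toy "rescaling" mitigation (an illustrative construction, not a
# method from Sec. III). Noise flips each +/-1 shot with probability
# q, damping the expectation by 1 - 2q; multiplying every shot by
# gamma = 1/(1 - 2q) removes this bias but inflates the one-shot
# variance, so the sampling overhead C_em is roughly gamma^2.
random.seed(2)
ideal = 0.8          # assumed noise-free expectation value
q = 0.2              # assumed flip probability
gamma = 1 / (1 - 2 * q)

def unmitigated_shot():
    flip = -1 if random.random() < q else 1
    raw = 1 if random.random() < (1 + ideal) / 2 else -1
    return flip * raw                  # mean: (1 - 2q) * ideal = 0.48

def mitigated_shot():
    return gamma * unmitigated_shot()  # mean restored to ideal = 0.8

def mean_var(shots):
    m = sum(shots) / len(shots)
    v = sum((s - m) ** 2 for s in shots) / len(shots)
    return m, v

n = 200_000
m_rho, v_rho = mean_var([unmitigated_shot() for _ in range(n)])
m_em, v_em = mean_var([mitigated_shot() for _ in range(n)])
overhead = v_em / v_rho                # empirical C_em of Eq. (3)
print(round(m_rho, 3), round(m_em, 3), round(overhead, 3))
```

The mitigated mean recovers the ideal value 0.8 while the empirical variance ratio settles near γ² ≈ 2.78, a concrete instance of the bias-variance trade-off.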
The sampling overhead can also be estimated using
the range of the estimator through Hoeffding’s inequality.
The range of a random variable X̂, denoted as R[X̂],
is the difference between the maximum and minimum
possible values taken by X̂. Using Hoeffding’s inequality,
the number of samples that is sufficient to guarantee an
estimation of E[X̂] to ϵ-precision with probability 1 − δ
is given by

N^{ϵ,δ}_hff(X̂) = (ln(2/δ)/(2ϵ²)) R[X̂]²  (4)

which can be used to estimate the sampling overhead
required for the error-mitigated estimator Ô_em to achieve
the same ϵ and δ as the unmitigated estimator Ô_ρ:

C_em = N^ϵ_shot(Ô_em)/N^ϵ_shot(Ô_ρ) ∼ N^{ϵ,δ}_hff(Ô_em)/N^{ϵ,δ}_hff(Ô_ρ) = R[Ô_em]²/R[Ô_ρ]².  (5)
C. Faults in the circuit
To gain intuition about the performance and costs of
QEM, we need a way to quantify the damage due to
noise in the circuit. Doing this by modelling noise in
complete generality on a quantum system can be very
challenging (Lidar and Brun,2013). One useful approx-
imation, which is widely employed in the field of quan-
tum error correction, is to model noise as discrete, prob-
abilistic events named faults that can occur at the var-
ious locations in the circuit including gates, idling steps
and measurements (Gottesman,2009;Terhal,2015). For
simplicity, we will assume all locations are afflicted with
Pauli noise, with the error probability of location f given
by p_f.¹ If we further assume the error events in all
locations are independent, then the total probability that
there is no fault in the circuit is given by

P₀ = ∏_f (1 − p_f),  (6)
which we will simply call the fault-free probability of the
circuit. It decays exponentially with the number of fault
locations in the circuit (and thus with the number of
qubits and the circuit depth). Do note that
the fault-free probability is not the fidelity of the output
state, but it is a lower bound for the fidelity under our
noise assumption.
We can also quantify the amount of noise in the circuit
using the average number of faults in each circuit run,
which is given by:

λ = Σ_f p_f,  (7)

and will be called the circuit fault rate. In the simple case
that all M fault locations in the circuit have the same
error rate p, we then simply have λ = Mp. If the circuit
contains a large number (more than dozens) of fault
locations, and the circuit fault rate is of the order of unity,
λ ∼ 1, then the number of faults occurring in a given
circuit run can be modelled using a Poisson distribution
with mean λ using Le Cam’s theorem (Cai, 2021a;
Endo et al., 2018; Le Cam, 1960), i.e. the probability that
ℓ faults occur in the circuit is given by P_ℓ = e^{−λ} λ^ℓ/ℓ!. In
this way, the fault-free probability is given by

P₀ = e^{−λ},  (8)
¹ All of the arguments will still apply if we define 1 − p_f as the
coefficient of the error-free part of the Kraus representation of
some general noise (we will select the Kraus representation that
gives the largest 1 − p_f). In this case, 1 − p_f is not the average gate
fidelity. For example, for any non-identity unitary channel, 1 − p_f
will always be 0, which is not the case for average gate fidelity.
More generally, we can always apply Pauli twirling to transform
all noise to Pauli noise such that our arguments become valid.
which decays exponentially with the circuit fault rate.
Note that for the intuitive arguments made in the next
subsection, and indeed for general estimation about the
feasibility of error mitigation, approximate estimates of
P₀ and λ are often good enough to be useful.
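Eqs. (6)–(8) can be checked directly for an assumed circuit: with M = 500 fault locations each at p_f = 0.002 (numbers chosen purely for illustration), λ = 1 and the exact product of Eq. (6) is already within about 0.1% of the Poisson value e^{−λ}:

```python
from math import exp, prod

# Check of Eqs. (6)-(8) for an assumed circuit: M = 500 fault
# locations, each failing independently with p_f = 0.002 (numbers
# chosen purely for illustration), giving circuit fault rate
# lambda = 1. The exact fault-free probability of Eq. (6) is already
# within about 0.1% of the Poisson approximation e^{-lambda}.
p_f = [0.002] * 500
lam = sum(p_f)                        # Eq. (7): lambda = sum_f p_f = 1.0
p0_exact = prod(1 - p for p in p_f)   # Eq. (6): prod_f (1 - p_f)
p0_poisson = exp(-lam)                # Eq. (8): e^{-lambda}

print(round(lam, 6), round(p0_exact, 4), round(p0_poisson, 4))
```

This illustrates why approximate estimates of P₀ and λ suffice in practice: for many small independent error rates, the product of Eq. (6) and the exponential of Eq. (8) are nearly indistinguishable.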
D. Exponential scaling of the sampling overhead
We can perform bias-free QEM if we are able to post-select
for the fault-free circuit runs without needing any
additional circuit components. The fraction of circuit
runs that are selected is simply given by the fault-free
probability e^{−λ} in Eq. (8), which means that we will still
require e^{λ} times more circuit runs to obtain the same
number of “effective” circuit runs as a noise-free machine
and achieve the same level of shot noise. Hence, even
allowing for the “magical” post-selection of fault-free
circuit runs, the sampling overhead C_em = e^{λ} will still
increase exponentially with the circuit fault rate (and thus
the circuit size). This implies a sampling overhead of
C_em ≈ 150 when λ = 5 and C_em ≈ 10⁴ when λ = 9,
which provides intuition on why QEM is unlikely to be
efficient when the circuit fault rate is beyond O(1). Of
course, this does not constitute a rigorous bound on the
overhead of QEM.
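The overhead figures quoted above follow directly from C_em = e^{λ}, as a one-line check confirms (values rounded):

```python
from math import exp

# Even under idealised fault-free post-selection, the overhead
# C_em = e^lambda grows rapidly: ~150 at lambda = 5 and roughly
# 10^4 (more precisely ~8 x 10^3) at lambda = 9.
for lam in (1, 5, 9):
    print(lam, round(exp(lam)))  # 1 -> 3, 5 -> 148, 9 -> 8103
```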
Now let us move beyond trying to extract the error-free
state and instead focus on obtaining the right expectation
value for the observable of interest. Cai (2021a)
and Wang et al. (2021b) have shown that the expectation
value of Pauli observables under Pauli gate noise
is bounded by an exponential decay curve against the
increase of the circuit fault rate λ. In order to resolve
such an exponentially small quantity at large λ, we will
need an exponential number of samples (C_em = O(e^{βλ})
for some positive β). This exponential scaling of the
sampling overhead still applies when we consider
error-mitigated estimators that are linear combinations of the
outputs of such noisy circuits (see Appendix A.1.a).
For a given noisy circuit with the circuit fault rate λ,
rather than performing active correction to reduce λ in
each circuit run like in quantum error correction, QEM
relies on post-processing the outputs from an ensemble of
circuit runs with the same circuit fault rate λ or above.
Hence, through the simple examples above, we see that
QEM cannot efficiently tackle noisy circuits with large
λ on its own due to the exponential sampling overhead.
However, as we will see later, due to the much lower im-
plementation cost for QEM in terms of additional circuit
components and qubits, it has become an effective means
to stretch the application potential of near-term noisy de-
vices and will be a useful tool to help alongside quantum
error correction in the longer term.