Misspecified Phase Retrieval with Generative Priors
Zhaoqiang Liu
National University of Singapore
dcslizha@nus.edu.sg
Xinshao Wang
University of Oxford
xinshao.wang@eng.ox.ac.uk
Jiulong Liu
Chinese Academy of Sciences
jiulongliu@lsec.cc.ac.cn
Abstract

In this paper, we study phase retrieval under model misspecification and generative priors. In particular, we aim to estimate an $n$-dimensional signal $\mathbf{x}$ from $m$ i.i.d. realizations of the single index model $y = f(\mathbf{a}^T\mathbf{x})$, where $f$ is an unknown and possibly random nonlinear link function and $\mathbf{a} \in \mathbb{R}^n$ is a standard Gaussian vector. We make the assumption $\mathrm{Cov}[y, (\mathbf{a}^T\mathbf{x})^2] \neq 0$, which corresponds to the misspecified phase retrieval problem. In addition, the underlying signal $\mathbf{x}$ is assumed to lie in the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs. We propose a two-step approach, for which the first step plays the role of spectral initialization and the second step iteratively refines the estimated vector produced by the first step. We show that both steps enjoy a statistical rate of order $\sqrt{(k \log L)\cdot(\log m)/m}$ under suitable conditions. Experiments on image datasets are performed to demonstrate that our approach performs on par with or even significantly outperforms several competing methods.
1 Introduction
Compressed sensing (CS) is perhaps the most popular instance of high-dimensional inverse problems, for which one has the linear measurement model
$$y = \mathbf{a}^T\mathbf{x} + \eta, \qquad (1)$$
where $\mathbf{a} \in \mathbb{R}^n$ is the sensing vector, $\mathbf{x} \in \mathbb{R}^n$ is the sparse signal to estimate, and $\eta$ represents additive noise. It is well known for CS that roughly $m = O(s \log(n/s))$ i.i.d. Gaussian measurements are sufficient to ensure the accurate recovery of a signal with $s$ non-zero entries [88, 1, 23, 11, 77].
Phase retrieval (PR) arises in numerous scientific areas including X-ray crystallography, acoustics, astronomy, microscopy, optics, wireless communications, and quantum information [13], where one cannot measure $\mathbf{a}^T\mathbf{x}$ directly, and can only record its magnitude. For example, the following noisy magnitude-only measurement model has been adopted in various prior works on sparse (real-valued) PR [92, 39, 25, 36, 9, 7]:
$$y = |\mathbf{a}^T\mathbf{x}| + \eta, \qquad (2)$$
where the signal $\mathbf{x}$ is assumed to be sparse, and the sensing vector $\mathbf{a}$ is assumed to be a standard Gaussian vector.
However, both the linear and magnitude-only measurement models in (1) and (2) are idealized views of the data generating process. To make the setup more general, one can utilize the following semi-parametric single index model (SIM) for general nonlinear models:
$$y = f(\mathbf{a}^T\mathbf{x}), \qquad (3)$$
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.05571v1 [stat.ML] 11 Oct 2022
where $f : \mathbb{R} \to \mathbb{R}$ is an unknown (possibly random) nonlinear link function, and $\mathbf{a}$ is typically assumed to be Gaussian. In addition, since the norm of $\mathbf{x} \in \mathbb{R}^n$ is absorbed into the SIM, for brevity, it is common to assume that $\mathbf{x}$ is a unit vector, i.e., $\|\mathbf{x}\|_2 = 1$. SIMs have long been studied in the conventional setting where the number of measurements $m > n$ [30, 81, 45]. In recent years, they have also been analyzed in the high-dimensional setting where $m \ll n$, mainly under the assumption that the underlying signal is sparse. Relevant works include but are not limited to [72, 63, 26, 73, 68].
For all these works, it is crucial to impose the following assumption on the SIM:
$$\mathrm{Cov}[y, \mathbf{a}^T\mathbf{x}] \neq 0. \qquad (4)$$
The pivotal condition in (4) is fairly generic and encompasses notable special examples such as noisy 1-bit measurements and general (non-binary) quantization schemes. However, it is not satisfied by PR models including (2) and the related models
$$y = |\mathbf{a}^T\mathbf{x} + \eta|, \qquad y = (\mathbf{a}^T\mathbf{x})^2 + \eta, \qquad (5)$$
where $\eta$ refers to zero-mean random noise that is independent of $\mathbf{a}$. In order to formalize a class of SIMs that encompasses the above-mentioned PR models (and more general models; see the discussion in Section 2.2) as special cases, misspecified phase retrieval (MPR) has been studied in [64, 98], with the condition (4) being replaced by the assumption
$$\mathrm{Cov}[y, (\mathbf{a}^T\mathbf{x})^2] \neq 0. \qquad (6)$$
It is worth mentioning that another motivation for studying MPR is that the theoretical analysis for PR typically relies on correct model specification, i.e., that the data points are indeed generated by the assumed model; the MPR model enables theoretical analysis under statistical model misspecification.
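To make the contrast between conditions (4) and (6) concrete, the following self-contained Python snippet (an illustration, not part of the paper's method) simulates the noiseless magnitude-only model $y = |\mathbf{a}^T\mathbf{x}|$ and checks empirically that the usual SIM condition (4) fails while the MPR condition (6) holds:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 200_000

# Unit-norm signal and i.i.d. standard Gaussian sensing vectors.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
A = rng.standard_normal((m, n))

g = A @ x       # realizations of a^T x, each ~ N(0, 1)
y = np.abs(g)   # magnitude-only observations, model (2) without noise

# Condition (4): Cov[y, a^T x] vanishes because y is an even function of a^T x.
cov_linear = np.mean(y * g) - np.mean(y) * np.mean(g)
# Condition (6): Cov[y, (a^T x)^2] = E|g|^3 - E|g| = sqrt(2/pi) > 0.
cov_quadratic = np.mean(y * g**2) - np.mean(y) * np.mean(g**2)

print(cov_linear)     # approximately 0
print(cov_quadratic)  # approximately 0.798
```

Here the even symmetry of the link function kills the linear correlation, which is exactly why MPR replaces (4) with the quadratic condition (6).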
In both works [64, 98], the signal $\mathbf{x}$ is assumed to be sparse. Recently, motivated by tremendously successful applications of deep generative models and following the seminal work [6] on generative model based linear CS, it has become popular to study high-dimensional inverse problems under generative priors [76, 31, 32, 34, 96, 41, 2, 67, 95, 60, 40, 65]. In particular, instead of being sparse, the underlying signal is assumed to be contained in (or lie near) the range of a generative model. It has been empirically demonstrated in [6] and its various follow-up works (e.g., see [19, 85, 48] and a literature review in [78]) that, compared to sparsity based methods, corresponding generative model based algorithms require significantly fewer samples to recover the signal up to a given accuracy.

In this paper, we study the MPR problem under generative modeling assumptions.
1.1 Related Work
In this subsection, we summarize some relevant works on PR and SIMs, both with and without generative priors.
PR and SIM without generative priors: There is a large amount of literature providing practical algorithms for PR, including convex methods [66, 14, 12, 90, 4, 29] and empirically more competitive non-convex methods [27, 21, 62, 70]. In particular, in the seminal work [62] and a variety of its follow-up works [13, 10, 15, 92, 91, 99, 39, 82, 8], whether for general PR (with no priors on the signal) or sparse PR, two-step approaches have been proposed with provable guarantees. The first step consists of a spectral initialization method, and the second step is typically an iterative algorithm (such as alternating minimization or gradient descent) that further refines the initial guess of the first step. For general PR, the spectral initialization step turns out to be unnecessary, and optimal sample complexity guarantees can be established even when using random initialization [83, 16]. However, to the best of our knowledge, this is not the case for sparse PR. More specifically, all theoretically-guaranteed non-convex algorithms for sparse PR require spectral initialization, and this typically results in a sub-optimal sample complexity of $O(s^2 \log n)$ (instead of $O(s \log n)$), where $s$ refers to the number of non-zero entries of the signal to estimate.
MPR is also closely related to SIMs, which have been extensively studied in the conventional setting where $m > n$; see, e.g., [30, 35, 59]. High-dimensional SIMs have received a lot of attention in recent years, with theoretical guarantees for signal estimation and support recovery [22, 24, 74, 84, 55, 72, 73, 17, 97, 28, 93, 69, 20]. In particular, motivated by the idea that under the assumption (4), a SIM can be converted into a scaled linear measurement model with unconventional noise, the authors of [72] consider minimizing the linear least-squares objective function over a convex set. They show that a reliable estimate of the signal can be obtained by such a simple method despite the unknown nonlinear link function.
As mentioned earlier, MPR with sparse priors has been studied in [64, 98]. The work [64] implements a two-step procedure based on convex optimization, with the first step being a spectral initialization method. More specifically, a semidefinite program is solved to produce an initial vector, and then an $\ell_1$-regularized least squares problem is solved to obtain a refined estimator. Such a procedure suffers from high computational costs. A more efficient two-step approach, which is a simple variant of the thresholded Wirtinger flow method [10], is proposed in [98]. In the first step, identical to that in [10], the initial vector is calculated by a thresholded spectral method that first estimates the support of the sparse signal by thresholding and then performs the classic power method over the submatrix corresponding to the estimated support. In the second step, a thresholded gradient descent algorithm is employed. Both approaches in [64, 98] can attain the optimal statistical rate of order $\sqrt{(s \log n)/m}$, provided that the sensing vector is standard Gaussian and the number of samples satisfies $m = \Omega(s^2 \log n)$. In addition, the second step of the approach proposed in [98] is shown to achieve a linear convergence rate.
PR and SIM with generative priors: PR with generative priors has been studied in [31, 37, 38, 80, 3, 47]. More specifically, an approximate message passing algorithm is proposed in [3]. The authors of [31, 80] minimize the objective over the latent space in $\mathbb{R}^k$ using gradient descent, where $k$ is the latent dimension of the generative prior. The corresponding algorithms may easily get stuck in local minima since the explorable solution space is limited. Recovery guarantees for projected gradient descent algorithms over the ambient space in $\mathbb{R}^n$ for noiseless PR with pre-trained or untrained neural network priors have been provided in [37, 38]. No initialization methods are proposed in these works, making the assumption on the initial vector therein somewhat stringent. On the other hand, the authors of [47] propose a spectral initialization method for PR with generative priors and provide recovery guarantees with respect to globally optimal solutions of a corresponding optimization problem. The optimization problem is non-convex, and a projected power method is proposed in [50] to approximately find an optimal solution.
Generative model based SIMs have been studied in [51, 49, 46, 94]. The authors of [51, 49, 46] provide optimal sample complexity upper bounds under the assumption of Gaussian sensing vectors. However, their results rely on the assumption (4), which is not satisfied by widely adopted PR models. The SIM studied in [94] encompasses certain PR models as special cases, and the sensing vector can be non-Gaussian. However, the nonlinear link function $f$ is assumed to be differentiable, making it not directly applicable to the PR model in (2). Moreover, it is worth mentioning that in both works [94, 51], the recovery guarantees are with respect to globally optimal solutions of typically highly non-convex optimization problems. Attaining these optimal solutions is practically difficult.
1.2 Contributions
Throughout this paper, we assume that the signal $\mathbf{x}$ lies in the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs. Our main contributions are as follows:

- We propose a two-step approach for MPR with generative priors. In particular, in the first step, we make use of the projected power method proposed in [50] to obtain a good initial vector for the iterative algorithm used in the second step.

- We show that under appropriate initialization, both steps attain a statistical rate of order $\sqrt{(k \log L)\cdot(\log m)/m}$, and the second step achieves a linear convergence rate. The initialization condition for the first step is mild in the sense that it allows the inner product between the starting point and the signal $\mathbf{x}$ to be a sufficiently small positive constant. In contrast, the initialization condition for the second step is more restrictive, making the first step necessary. Notably, unlike the works on MPR with sparse priors [64, 98] that require the sub-optimal $O(s^2 \log n)$ sample complexity, the sample complexity requirement for our recovery guarantees is $O((k \log L)\cdot(\log m))$, which is naturally conjectured to be near-optimal [52, 42].

- We perform numerical experiments on image datasets to corroborate our theoretical results. In particular, for the noisy magnitude-only measurement model (2), we observe that our approach gives reconstructions that are competitive with those of the alternating phase projected gradient descent (APPGD) algorithm proposed in [37], which is the corresponding state-of-the-art method, even though we do not utilize knowledge of the link function and allow for model misspecification. Moreover, for several closely related measurement models that satisfy the condition (6), our approach leads to superior reconstruction performance compared to all other competing methods, including APPGD.
2 Problem Formulation
In this section, we give an overview of the main assumptions that we adopt. Before proceeding, we present the notation used in this paper.
2.1 Notation
We use upper and lower case boldface letters to denote matrices and vectors respectively. For any positive integer $N$, we use the shorthand notation $[N] = \{1, 2, \dots, N\}$, and $\mathbf{I}_N$ represents the identity matrix in $\mathbb{R}^{N \times N}$. The support (set) of a vector is the index set of its non-zero entries. We use $\|\mathbf{X}\|_{2 \to 2}$ to represent the spectral norm of $\mathbf{X}$. We use $\mathcal{B}^k(r)$ to denote the radius-$r$ ball in $\mathbb{R}^k$, i.e., $\mathcal{B}^k(r) := \{\mathbf{z} \in \mathbb{R}^k : \|\mathbf{z}\|_2 \le r\}$, and $\mathcal{S}^{n-1}$ represents the unit sphere in $\mathbb{R}^n$, i.e., $\mathcal{S}^{n-1} := \{\mathbf{s} \in \mathbb{R}^n : \|\mathbf{s}\|_2 = 1\}$. We use $G$ to denote a pre-trained $L$-Lipschitz continuous generative model from $\mathcal{B}^k(r)$ to $\mathbb{R}^n$. We focus on the setting where $k \ll n$. For any set $S \subseteq \mathcal{B}^k(r)$, we write $G(S) = \{G(\mathbf{z}) : \mathbf{z} \in S\}$. Our goal is to estimate the signal $\mathbf{x} \in \mathrm{Range}(G) = G(\mathcal{B}^k(r))$ from realizations of the sensing vector $\mathbf{a}$ and the observation $y$ (generated according to the SIM in (3)).
For any two sequences of real values $\{a_j\}$ and $\{b_j\}$, we write $a_j = O(b_j)$ if there exist an absolute constant $C_1$ and a positive integer $j_1$ such that $|a_j| \le C_1 b_j$ for any $j > j_1$, and $a_j = \Omega(b_j)$ if there exist an absolute constant $C_2$ and a positive integer $j_2$ such that $|a_j| \ge C_2 b_j$ for any $j > j_2$. We use the generic notations $C$ and $C'$ to denote large positive constants, and we use $c$ to denote a small positive constant; their values may differ from line to line.
2.2 Setup
First, the following are the standard definitions of a sub-exponential random variable and the associated sub-exponential norm.

Definition 1. A random variable $X$ is said to be sub-exponential if there exists a positive constant $C$ such that $(\mathbb{E}[|X|^p])^{1/p} \le Cp$ for all $p \ge 1$, or equivalently, if there exists a positive constant $C'$ such that $\mathbb{P}(|X| > u) \le \exp(1 - u/C')$ for all $u \ge 0$. The sub-exponential norm of $X$ is defined as $\|X\|_{\psi_1} := \sup_{p \ge 1} p^{-1}(\mathbb{E}[|X|^p])^{1/p}$.
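As a quick numerical illustration of Definition 1 (our own, not from the paper), one can estimate $p^{-1}(\mathbb{E}[|X|^p])^{1/p}$ for $X = g^2$ with $g \sim \mathcal{N}(0, 1)$ and observe that the ratios stay bounded in $p$, so $X$ is sub-exponential:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal(2_000_000) ** 2   # X = g^2, a sub-exponential variable

# Monte-Carlo estimates of p^{-1} (E|X|^p)^{1/p} for several moment orders p.
# For this X we have E[X^p] = (2p-1)!!, so the ratio stays bounded (it tends
# to 2/e as p grows), certifying sub-exponential moment growth.
ratios = [np.mean(X**p) ** (1 / p) / p for p in range(1, 7)]
print(ratios)
```

A sub-Gaussian variable would instead satisfy the stronger bound with $\sqrt{p}$ in place of $p$; squaring a Gaussian is exactly what moves it from the sub-Gaussian to the sub-exponential class.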
We will focus on the following settings except where stated otherwise:

- The observations are independently generated according to the SIM (3), where $f$ is the link function that is unknown and possibly random.

- We have an $L$-Lipschitz continuous generative model $G : \mathcal{B}^k(r) \to \mathbb{R}^n$. For convenience, similarly to [48, 50], we assume that the generative model is normalized such that $\mathrm{Range}(G) \subseteq \mathcal{S}^{n-1}$. For a general (unnormalized) generative model, we may essentially consider its normalized version; see, e.g., [50, Remark 1].

- The signal $\mathbf{x}$ is contained in the range of $G$, i.e., $\mathbf{x} \in \mathrm{Range}(G) \subseteq \mathcal{S}^{n-1}$.

- The sensing vector $\mathbf{a} \in \mathbb{R}^n$ is standard Gaussian, i.e., $\mathbf{a} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_n)$.

- The random variable $y = f(\mathbf{a}^T\mathbf{x})$ is sub-exponential, with its sub-exponential norm denoted by $K_y$, i.e., $K_y := \|y\|_{\psi_1}$. In addition, we use $M_y$ to denote the expectation of $y$, i.e., $M_y := \mathbb{E}[y]$.
Remark 1. y
will be sub-exponential when
f(x)
comprises of
xc
plus lower order terms
with
c2
(since the product of two sub-Gaussian random variables is sub-exponential),
and therefore we will see that the
y
corresponding to all the measurement models presented
in our paper is sub-exponential. We remark that the assumption of sub-exponential
y
is
not essential and it can be easily relaxed. For example, when
y=xc
with
c
being an even
integer that is larger than
2
, there will be only a minor change in the order of the
log m
term in the sample complexity and statistical rate. However, for brevity, we follow [
64
,
98
]
and make the assumption of sub-exponential yto avoid non-essential complications.
We consider MPR and assume that¹
$$\nu := \mathrm{Cov}[y, (\mathbf{a}^T\mathbf{x})^2] > 0, \qquad (7)$$
or equivalently,
$$\nu := \mathrm{Cov}[f(g), g^2] > 0, \qquad (8)$$
where $g \sim \mathcal{N}(0, 1)$ is a standard normal random variable. Note that the assumption (8) is only with respect to the nonlinear link function $f$. The condition in (7) is satisfied by the PR models described in (2) and (5). It is also satisfied by related models such as [98]
$$y = |\mathbf{a}^T\mathbf{x}| + 2\tanh(|\mathbf{a}^T\mathbf{x}|) + \eta, \qquad y = 2(\mathbf{a}^T\mathbf{x})^2 + 3\sin(|\mathbf{a}^T\mathbf{x}|) + \eta. \qquad (9)$$
See [64, Proposition 3 and Remark 4] for more general examples.
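The equivalence of (7) and (8) follows because $\mathbf{a}^T\mathbf{x} \sim \mathcal{N}(0, 1)$ for unit-norm $\mathbf{x}$. The snippet below (illustrative only, not part of the paper's method) estimates $\nu$ both ways for the first model in (9) and checks that the two estimates agree and are positive:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 200_000

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
A = rng.standard_normal((m, n))
eta = 0.1 * rng.standard_normal(m)

f = lambda t: np.abs(t) + 2 * np.tanh(np.abs(t))   # first link function in (9)

# Estimate nu via (7), from the sensing vectors and noisy observations ...
proj = A @ x
y = f(proj) + eta
nu_7 = np.mean(y * proj**2) - np.mean(y) * np.mean(proj**2)

# ... and via (8), from fresh scalar draws g ~ N(0, 1) without noise; the
# independent zero-mean noise eta does not affect the covariance.
g = rng.standard_normal(m)
nu_8 = np.mean(f(g) * g**2) - np.mean(f(g)) * np.mean(g**2)

print(nu_7, nu_8)  # both positive, approximately equal
```

Because the noise $\eta$ is independent of $\mathbf{a}$ and zero-mean, it drops out of the covariance, which is why the noisy estimate via (7) matches the noiseless one via (8).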
3 Algorithm
In this section, we describe our two-step algorithm devised for MPR with generative priors. Suppose that we have $m$ i.i.d. realizations of $\mathbf{a}$ and $y$, namely $\mathbf{a}_1, \dots, \mathbf{a}_m$ and $y_1, \dots, y_m$. To estimate the signal $\mathbf{x}$, we consider the following two-step approach:
1. We perform $T_1$ iterations in the first step. In particular, let
$$\mathbf{V} = \frac{1}{m}\sum_{i=1}^{m} y_i\big(\mathbf{a}_i\mathbf{a}_i^T - \mathbf{I}_n\big). \qquad (10)$$
We perform the projected power method proposed in [50]: for $t = 0, 1, \dots, T_1 - 1$, let
$$\mathbf{w}^{(t+1)} = \mathcal{P}_G\big(\mathbf{V}\mathbf{w}^{(t)}\big), \qquad (11)$$
where $\mathcal{P}_G(\cdot)$ denotes the projection function onto $\mathrm{Range}(G)$,² and we obtain $\mathbf{x}^{(0)} := \mathbf{w}^{(T_1)}$. Similarly to [43, 47], we set the starting point $\mathbf{w}^{(0)}$ as the column of $\frac{1}{m}\sum_{i=1}^{m} y_i\mathbf{a}_i\mathbf{a}_i^T$ (i.e., a shifted version of $\mathbf{V}$) that corresponds to the largest diagonal entry. Note that it is easy to calculate that $\mathbb{E}[\mathbf{V}] = \nu\mathbf{x}\mathbf{x}^T$ (see, e.g., [47, Lemma 8]), each column of which is a scalar multiple of $\mathbf{x}$. This motivates the use of a shifted version of $\mathbf{V}$ to obtain the initialization vector.
2. We perform $T_2$ iterations in the second step. In particular, let
$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y_i \qquad (12)$$
be the empirical mean of the observations. We perform the following iterative procedure: for $t = 0, 1, 2, \dots, T_2 - 1$, let
$$\hat{\nu}^{(t)} = \frac{1}{m}\sum_{i=1}^{m} (y_i - \bar{y}) \cdot \big(\mathbf{a}_i^T\mathbf{x}^{(t)}\big)^2, \qquad (13)$$
$$\tilde{y}_i^{(t)} = (y_i - \bar{y}) \cdot \mathbf{a}_i^T\mathbf{x}^{(t)}, \quad i = 1, 2, \dots, m, \qquad (14)$$
$$\tilde{\mathbf{x}}^{(t+1)} = \mathbf{x}^{(t)} - \frac{\zeta}{m} \cdot \sum_{i=1}^{m} \big(\hat{\nu}^{(t)} \cdot \mathbf{a}_i^T\mathbf{x}^{(t)} - \tilde{y}_i^{(t)}\big)\mathbf{a}_i, \qquad (15)$$
$$\mathbf{x}^{(t+1)} = \mathcal{P}_G\big(\tilde{\mathbf{x}}^{(t+1)}\big), \qquad (16)$$
where $\zeta > 0$ is a tuning parameter, and $\hat{\nu}^{(t)}$ can be thought of as an approximation of $\nu$ defined in (7). The idea behind calculating $\tilde{y}_i^{(t)}$ is that, by comparing (4) and (6), we observe that to transform the MPR model into a conventional SIM, we may use $(y - \mathbb{E}[y])(\mathbf{a}^T\mathbf{x})$ to
¹ The case that $\nu < 0$ can be handled similarly by considering $-y$.
² That is, for any $\mathbf{s} \in \mathbb{R}^n$, $\mathcal{P}_G(\mathbf{s}) := G\big(\arg\min_{\mathbf{z} \in \mathcal{B}^k(r)} \|G(\mathbf{z}) - \mathbf{s}\|_2\big)$. Similarly to [79, 37, 38, 71, 50], we implicitly assume the exact projection in our analysis. In practice, approximate methods such as gradient- and GAN-based projections [79, 75] have been shown to work well.
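Putting the two steps together, here is a minimal end-to-end sketch (our own illustration). It uses a placeholder projection, plain normalization onto $\mathcal{S}^{n-1}$ in place of the exact projection $\mathcal{P}_G$ onto $\mathrm{Range}(G)$, so it demonstrates the algorithmic structure of (10)-(16) rather than recovery under a genuine generative prior:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, T1, T2, zeta = 30, 20_000, 25, 40, 0.5

# Ground-truth unit signal and misspecified PR observations from model (2).
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
A = rng.standard_normal((m, n))
y = np.abs(A @ x)

def P_G(s):
    # Placeholder for the projection onto Range(G): plain normalization onto
    # S^{n-1} (i.e., no low-dimensional prior). With a real generative model,
    # this would be an approximate projection, e.g., gradient descent over z.
    return s / np.linalg.norm(s)

ybar = y.mean()                                       # (12)
V = (A * y[:, None]).T @ A / m - ybar * np.eye(n)     # (10)

# Step 1: projected power method. Start from the column of the shifted
# matrix (1/m) sum_i y_i a_i a_i^T with the largest diagonal entry.
M = V + ybar * np.eye(n)
w = P_G(M[:, np.argmax(np.diag(M))])
for _ in range(T1):
    w = P_G(V @ w)                                    # (11)
xt = w                                                # x^(0)

# Step 2: iterative refinement (13)-(16).
for _ in range(T2):
    proj = A @ xt
    nu_hat = np.mean((y - ybar) * proj**2)            # (13)
    y_tilde = (y - ybar) * proj                       # (14)
    grad = A.T @ (nu_hat * proj - y_tilde) / m
    xt = P_G(xt - zeta * grad)                        # (15)-(16)

print(abs(w @ x), abs(xt @ x))   # both close to 1 (recovery up to global sign)
```

With this sphere-projection placeholder, step 1 reduces to the classic power method on $\mathbf{V}$, whose expectation is $\nu\mathbf{x}\mathbf{x}^T$; step 2 then refines the estimate by gradient steps on the implied least-squares objective. The global sign of $\mathbf{x}$ is unidentifiable under (6), hence the absolute correlations.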