
tion is subject to different noise models (Van Veen et al., 2018; Asim et al., 2020a; Whang et al., 2021a).
In practice, the reconstruction can be challenging because L_reg involves a deep generative prior. The success fundamentally relies on an effective optimization algorithm to find the global or a satisfactory local minimum of (1). However, the non-convex nature of inverse problems often makes gradient descent unprincipled and non-robust, e.g., to the initialization. In fact, even in the simpler setting where the forward operator is the identity map (corresponding to a denoising problem), solving (1) with a deep generative prior is NP-hard, as demonstrated in Lei et al. (2019). This establishes the complexity of solving inverse problems in general. On the other hand, even in specific cases, gradient descent may fail to find global optima, unlike when training an (over-parameterized) neural network. This is because inverse problems require a consistent (or under-parameterized) system that yields a unique solution. It is known both theoretically and empirically that the more over-parameterized the system is, the easier it is to find the global minima with first-order methods (Jacot et al., 2018; Du et al., 2019; Allen-Zhu et al., 2019).
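For reference, a typical instantiation of such a regularized objective with a flow-based prior can be written as follows, where A denotes the forward operator and y the measurement; the precise form of (1) may differ, and this display is only meant to fix notation for the discussion.

```latex
% Illustrative form of a regularized inverse-problem objective with a
% flow-based prior; the exact definition of (1) may differ from this sketch.
\min_{x \in \mathbb{R}^n} \ \|A x - y\|_2^2 \ + \ \lambda \, \mathcal{L}_{\mathrm{reg}}(x),
\qquad \text{e.g.,} \quad \mathcal{L}_{\mathrm{reg}}(x) = -\log p_G(x).
```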
In this paper, we overcome this difficulty by proposing a new principled optimization scheme for inverse problems with a deep generative prior. Our algorithm incrementally optimizes the reconstruction conditioning on a sequence of λ's that are gradually increased from 0 to a prespecified value. Intuitively, suppose we have found a satisfactory solution (e.g., the global optimum) x∗(λ) of (1). Then, after a small increase ∆λ in the hyperparameter, the new solution x∗(λ + ∆λ) should be close to x∗(λ) and easy to find when starting from x∗(λ). Our algorithm is related to amortized optimization (Amos, 2022) in that the difficulty and the high computational cost of finding x∗(λ) for the original inverse problem are amortized over a sequence of much easier tasks, where finding x∗(0) is feasible and the solution to one task facilitates solving the next. We refer to our method as Amortized Inverse Problem Optimization (AIPO).
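To make the scheme concrete, the following sketch (not the exact algorithm of Section 3) illustrates the warm-start structure: the reconstruction is optimized for a gradually increasing sequence of λ values, with each stage initialized at the previous stage's solution. The squared-error data-fidelity term, the linear λ schedule, and the names `forward_op`, `neg_log_prior`, `num_stages`, and `inner_steps` are illustrative placeholders.

```python
# Minimal sketch of warm-start continuation over lambda (illustrative only).
import torch

def amortized_solve(x_init, y, forward_op, neg_log_prior,
                    lam_target=1.0, num_stages=20, inner_steps=50, lr=1e-2):
    x = x_init.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for stage in range(num_stages + 1):
        lam = lam_target * stage / num_stages      # lambda grows from 0 to lam_target
        for _ in range(inner_steps):
            optimizer.zero_grad()
            data_fit = ((forward_op(x) - y) ** 2).sum()   # squared-error fidelity
            loss = data_fit + lam * neg_log_prior(x)      # regularized objective
            loss.backward()
            optimizer.step()
        # x now approximates x*(lam) and serves as the warm start for the next stage
    return x.detach()
```

The stage with λ = 0 corresponds to the easy problem whose solution x∗(0) initializes the continuation; each subsequent stage only needs to track a nearby solution rather than solve the hard target problem from scratch.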
It is noteworthy that AIPO differs from amortized optimization in the conventional sense, which uses learning to approximate/predict the solutions to similar problems (Amos, 2022). In stark contrast, we spread the difficulty of solving the original problem over a sequence of much easier tasks, each of which is still the optimization of an inverse problem objective function. We provide a theoretical underpinning of AIPO: under some conventional assumptions, AIPO is guaranteed to find the global minimum. A practical and efficient algorithm is also provided. Empirically, our algorithm exhibits superior performance in minimizing the loss of various inverse problems, including denoising, noisy compressed sensing, and inpainting. To the best of our knowledge, AIPO is the first principled and efficient algorithm for solving inverse problems with a flow-based prior.
The paper proceeds as follows. In Section 2, we provide background on normalizing flows and amortized optimization and introduce example inverse problems. In Section 3, we formally propose AIPO and provide a theoretical analysis. In Section 4, we illustrate our algorithm and show its outstanding performance compared to conventional methods that keep λ fixed during optimization. Section 5 concludes the paper. We defer the proofs, some technical details and experiment settings, and supplementary results to the appendix.
2. Background
We first provide an overview of normalizing flows that are
used as the generative prior in our setting. We also briefly
introduce amortized optimization based on learning and
highlight its difference from the proposed AIPO. In addition,
we showcase three representative inverse problem tasks, on
which our algorithm will be evaluated.
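For concreteness, the sketch below shows commonly used forward models for these three tasks; the measurement matrix, noise levels, and mask ratio are illustrative placeholders rather than the settings used in our experiments (which are deferred to the appendix).

```python
# Hedged sketch of the three forward operators: denoising, noisy compressed
# sensing, and inpainting. All constants below are placeholders.
import torch

n, m = 1024, 256                      # signal dimension and number of measurements
x = torch.randn(n)                    # a ground-truth signal (placeholder)

# Denoising: the forward operator is the identity map.
y_denoise = x + 0.1 * torch.randn(n)

# Noisy compressed sensing: y = Ax + noise with a random Gaussian A, m < n.
A = torch.randn(m, n) / m ** 0.5
y_cs = A @ x + 0.05 * torch.randn(m)

# Inpainting: observe only the unmasked coordinates of x.
mask = (torch.rand(n) > 0.3).float()  # keep roughly 70% of the entries
y_inpaint = mask * x
```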
2.1. Normalizing Flows
Normalizing flows (NFs) (Rezende & Mohamed, 2015; Papamakarios et al., 2021) are a family of generative models capable of representing an n-dimensional complex distribution by transforming it into a simple base distribution (e.g., a standard Gaussian or uniform distribution) of the same dimension. Compared to other generative models such as GANs (Goodfellow et al., 2014) and variational autoencoders (Kingma & Welling, 2013), NFs use a bijective (invertible) mapping and are computationally flexible in the sense that they admit efficient sampling from the distribution and exact likelihood evaluation.
To be more specific, let x ∈ R^n denote a data point that follows an unknown complex distribution, and let z ∈ R^n follow some pre-specified base distribution such as a standard Gaussian. An NF model learns a differentiable bijective function G: R^n → R^n such that x = G(z). To sample from the data distribution p_G(x), one can first generate z ∼ p(z) and then apply the transformation x = G(z). Moreover, the invertibility of G allows one to use the change-of-variable formula to calculate the likelihood of x by

log p_G(x) = log p(z) + log |det(J_{G^{-1}}(x))|,
where J_{G^{-1}} denotes the Jacobian matrix of the inverse mapping G^{-1} evaluated at x. To speed up the computation, G is usually composed of several simpler invertible functions that have triangular Jacobian matrices.
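As a toy illustration (not a full RealNVP or GLOW implementation), the sketch below composes affine coupling layers, whose triangular Jacobians reduce log |det(J_{G^{-1}}(x))| to a simple sum, and evaluates log p_G(x) exactly under a standard Gaussian base density. The layer sizes and the scale/shift network are arbitrary illustrative choices.

```python
# Toy change-of-variable likelihood with affine coupling layers (illustrative).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style layer: transforms half of the coordinates conditioned on
    the other half, giving a triangular Jacobian. Real models alternate or
    permute which half is transformed; this toy version does not."""
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * (dim - self.half)))

    def inverse(self, x):
        # Map data x back toward latent coordinates, accumulating log|det J_{G^{-1}}|.
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=1)
        z2 = (x2 - t) * torch.exp(-s)   # invert the elementwise affine transform
        log_det = -s.sum(dim=1)         # triangular Jacobian: sum of log-diagonal terms
        return torch.cat([x1, z2], dim=1), log_det

def log_likelihood(x, layers):
    # Apply the inverse layers in reverse composition order, then use the
    # change-of-variable formula with a standard Gaussian base density p(z).
    z, total_log_det = x, 0.0
    for layer in reversed(layers):
        z, log_det = layer.inverse(z)
        total_log_det = total_log_det + log_det
    log_p_z = -0.5 * (z ** 2 + torch.log(torch.tensor(2 * torch.pi))).sum(dim=1)
    return log_p_z + total_log_det

layers = [AffineCoupling(dim=6) for _ in range(4)]
x = torch.randn(8, 6)                   # a batch of 8 samples in R^6
print(log_likelihood(x, layers).shape)  # one exact log-density per sample
```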
Typical NFs include RealNVP (Dinh et al., 2016) and GLOW (Kingma & Dhariwal, 2018). For more details of NF models, we refer the readers to the review by Papamakarios et al. (2021) and the