
update for a pre-determined number of iterations, fixing the overall computational budget. Note that the
network is only implicitly learning the regularizer. In practice, it is actually learning an update step, which
can be thought of as de-noising or as a projection onto the manifold of the data. LU is typically trained end-to-end, i.e., when fed some initialization x_0, the network outputs the final estimate x_K, a loss is computed with respect to the ground truth, and the error is back-propagated across the entire computational graph to update the network parameters. While end-to-end training is easier to perform and encourages faster convergence, it requires all intermediate activations to be stored in memory. Thus, the maximum number of iterations is typically kept small compared to classical iterative inverse problem solvers.
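To make the memory cost concrete, the following is a minimal PyTorch sketch of a fixed-budget loop-unrolled network. The operators A/At, the budget K, the step size, and the small CNN regularizer are illustrative assumptions, not the architecture used in this work.

```python
import torch
import torch.nn as nn

class LoopUnrolled(nn.Module):
    """Minimal sketch of fixed-budget loop unrolling (all choices illustrative)."""

    def __init__(self, A, At, K=8, step=0.1):
        super().__init__()
        self.A, self.At, self.K, self.step = A, At, K, step
        # Learned update step, shared across iterations for simplicity.
        self.reg = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x0, y):
        x = x0
        for _ in range(self.K):  # fixed computational budget
            x = x - self.step * self.At(self.A(x) - y)  # data-fidelity gradient step
            x = x + self.reg(x)                          # learned update ("de-noising")
        # End-to-end training computes a loss between x (= x_K) and the ground
        # truth and back-propagates through all K iterations, so every
        # intermediate activation must be kept in memory.
        return x
```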
Due to this memory limitation, there is a trade-off between the depth of an LU and the richness of each regularization network. Intuitively, one can improve performance by increasing the number of loop-unrolled iterations. For example, [2] extends the LU model to a potentially infinite number of iterations using an implicit network, and [7] allows deeper unrolling using invertible networks, at the cost of recomputing the intermediate results from the output during training. This approach can be computationally intensive for large-scale inverse problems or when the forward operator is nonlinear and computationally expensive to apply. For example, the forward operator may involve solving differential equations, such as the wave equation for seismic wave propagation [8] or the Lorenz equations for atmospheric modeling [9].
Alternatively, one can design a richer regularization network. For example, [10] uses a transformer as the regularization network and achieves extremely competitive results in the fastMRI challenge [11], but requires multiple 24 GB GPUs to train with a batch size of 1, which is often impractical, especially for large systems. Our proposed design strikes a balance between the expressiveness of the regularization network and memory efficiency, offering an alternative way to achieve a rich regularization network without the additional memory costs during training.
2.2 Deep Equilibrium Models for Inverse Problems (DEQ4IP)
Deep equilibrium (DEQ) models introduce an alternative to traditional feed-forward networks [3–6]. Rather
than feed an input through a fixed (relatively small) number of layers, DEQ models solve for the “fixed point” of a network given some input. More formally, given a network f_θ, an initialization x^{(0)}, and an input y, we recursively apply the network via

    x^{(k+1)} = f_θ(x^{(k)}, y),        (6)

until convergence.¹ In this instance, y acts as an input injection that determines the final output. This is termed the forward process. The weights θ of the model can be trained via implicit differentiation, removing the need to store the intermediate activations from recursively applying the network. This allows for deeper, more expressive models without the associated memory footprint during training.
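As a concrete illustration, the following is a minimal PyTorch sketch of a DEQ layer in the style of (6): the forward fixed-point solve runs without building a computational graph, and the backward pass uses implicit differentiation via a gradient hook. The map f, the iteration counts, and the tolerance are illustrative assumptions, not the exact implementation of [2].

```python
import torch
import torch.nn as nn

class DEQ(nn.Module):
    """Minimal sketch of a deep equilibrium layer (all choices illustrative)."""

    def __init__(self, f, fwd_iters=50, bwd_iters=50, tol=1e-4):
        super().__init__()
        self.f = f  # f(x, y) -> x, same shape in and out
        self.fwd_iters, self.bwd_iters, self.tol = fwd_iters, bwd_iters, tol

    def forward(self, x0, y):
        # Forward process: iterate x <- f(x, y) to a fixed point, graph-free.
        with torch.no_grad():
            x = x0
            for _ in range(self.fwd_iters):
                x_new = self.f(x, y)
                if torch.norm(x_new - x) < self.tol * (torch.norm(x) + 1e-8):
                    x = x_new
                    break
                x = x_new
        # One extra application with autograd enabled attaches theta and y
        # to the graph at the equilibrium point x*.
        x = self.f(x, y)
        if x.requires_grad:
            # Detached copy used to take vector-Jacobian products of f at x*.
            x_star = x.detach().requires_grad_()
            f_star = self.f(x_star, y)

            def backward_hook(grad):
                # Implicit differentiation: solve g = grad + J_f(x*)^T g by
                # fixed-point iteration (assumes the iteration is contractive),
                # instead of back-propagating through all forward iterations.
                g = grad
                for _ in range(self.bwd_iters):
                    g = torch.autograd.grad(f_star, x_star, g,
                                            retain_graph=True)[0] + grad
                return g

            x.register_hook(backward_hook)
        return x
```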
[2] demonstrates an application of one such model, applying similar principles to a single iteration of an LU
architecture. Such an idea is a natural extension, as it allows the model to “iterate until convergence” rather than rely on a “fixed budget”: the model loops over (4) and (5) many times (in practice, usually around 50 iterations) until x_k converges. However, such a model can be unstable to train and often performs best with pre-training of the learned portion of the model (which typically acts as a learned regularizer/de-noiser). It is also important to note that such a model must apply the forward operator (and potentially the adjoint) many times during the forward process. Although the fixed-point solve can be accelerated to reduce the number of applications, it is still often more than the number required by an equivalent LU model. This can be an issue if the forward operator is computationally expensive to apply, an issue LU methods somewhat mitigate by fixing the total number of iterations.
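To illustrate the operator-count trade-off, the sketch below packages one LU-style update, in the spirit of (4) and (5), as the fixed-point map for a DEQ wrapper like the one sketched above. All names are illustrative assumptions; the point is that each fixed-point iteration applies A and its adjoint once, so roughly 50 iterations to convergence means roughly 50 applications of each, versus exactly K for a fixed-budget LU.

```python
import torch
import torch.nn as nn

class LUStep(nn.Module):
    """One LU-style update used as the DEQ map f(x, y) (illustrative sketch)."""

    def __init__(self, A, At, reg, step=0.1):
        super().__init__()
        self.A, self.At, self.reg, self.step = A, At, reg, step

    def forward(self, x, y):
        x = x - self.step * self.At(self.A(x) - y)  # one call each to A and A^T
        return x + self.reg(x)                       # learned update

# e.g., deq = DEQ(LUStep(A, At, reg)): ~50 forward iterations to converge
# implies ~50 applications of A and A^T per reconstruction.
```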
2.3 Alternative Approaches to Tackle Memory Issues
I-RIM [7] is a deep invertible network that addresses the memory issue by recomputing the intermediate results from the output. However, it is not ideal when the forward model is computationally expensive. Gradient checkpointing [12] is another practical technique for reducing the memory costs of deep neural networks. It saves memory by storing only a subset of the intermediate activations and recomputing the rest during the backward pass, trading extra computation for memory.
¹Note that, since our approach will ultimately use both methods, to aid in a clearer presentation we use subscripts, i.e., x_k, to denote the LU iterations, and parenthesized superscripts, i.e., r^{(i)}, to denote the iterations in the deep equilibrium model.