Loop Unrolled Shallow Equilibrium Regularizer (LUSER) - A
Memory-Efficient Inverse Problem Solver
Peimeng Guan¹, Jihui Jin¹, Justin Romberg¹, Mark Davenport¹
¹Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, GA 30332, United States
October 17, 2022
Abstract
In inverse problems we aim to reconstruct some underlying signal of interest from potentially corrupted
and often ill-posed measurements. Classical optimization-based techniques proceed by optimizing a data
consistency metric together with a regularizer. Current state-of-the-art machine learning approaches
draw inspiration from such techniques by unrolling the iterative updates for an optimization-based solver
and then learning a regularizer from data. This loop unrolling (LU) method has shown tremendous
success, but often requires a deep model for the best performance leading to high memory costs during
training. Thus, to address the balance between computation cost and network expressiveness, we propose
an LU algorithm with shallow equilibrium regularizers (LUSER). These implicit models are as expressive
as deeper convolutional networks, but far more memory efficient during training. The proposed method
is evaluated on image deblurring, computed tomography (CT), as well as single-coil Magnetic Resonance
Imaging (MRI) tasks and shows similar, or even better, performance while requiring up to 8× less
computational resources during training when compared against a more typical LU architecture with
feedforward convolutional regularizers.
1 Introduction
In an inverse problem we face the task of reconstructing some data or parameters of an unknown signal from indirect observations. The forward process, or the mapping from the data to observations, is typically well known, but ill-posed or non-invertible. More formally, we consider the task of recovering some underlying signal x from measurements y taken via some forward operator A according to

y = Ax + η,    (1)

where η represents noise. The forward operator can be nonlinear, but to simplify the notation, we illustrate
the idea in linear form throughout this paper. A common approach to recover the signal is via an iterative
method based on the least squares loss:
x̂ = argmin_x ‖y − Ax‖^2.    (2)
For many problems of interest, A is ill-posed and does not have full column rank. Thus, attempting to solve (2) does not yield a unique solution. To address this, we can extend (2) by including a regularizing term to bias the inversion towards solutions with favorable properties. Common examples of regularization include ℓ2, ℓ1, and total variation (TV). Each regularizer encourages certain properties on the estimated signal x̂ (e.g., smoothness, sparsity, piece-wise constancy) and is often chosen based on task-specific prior knowledge.
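To make this concrete, the following PyTorch sketch sets up an underdetermined instance of (1) and contrasts the minimum-norm least-squares solution of (2) with a simple ℓ2 (Tikhonov) regularized solution. It is an illustration only, not code from this work; the problem sizes, the random operator, and the regularization weight are arbitrary choices.

import torch

torch.manual_seed(0)

# An underdetermined forward model y = A x + eta: fewer measurements than
# unknowns, so A cannot have full column rank and (2) has no unique minimizer.
n, m = 64, 32                                  # signal length, number of measurements
A = torch.randn(m, n)
x_true = torch.zeros(n)
x_true[::8] = 1.0                              # a simple structured test signal
y = A @ x_true + 0.01 * torch.randn(m)         # noisy measurements, as in (1)

# Plain least squares (2): the pseudoinverse returns the minimum-norm solution,
# one of infinitely many vectors that fit the data equally well.
x_ls = torch.linalg.pinv(A) @ y

# An ell_2 (Tikhonov) regularizer biases the inversion towards small-norm
# solutions: argmin_x ||y - A x||^2 + gamma ||x||^2 has the closed form below.
gamma = 0.1
x_reg = torch.linalg.solve(A.T @ A + gamma * torch.eye(n), A.T @ y)

print("least-squares error:", torch.norm(x_ls - x_true).item())
print("regularized error:  ", torch.norm(x_reg - x_true).item())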
Correspondence: guanpeimeng@gatech.edu
Recent works [1] attempt to tackle inverse problems using more data-driven methods. Unlike typical
supervised learning tasks that attempt to learn a mapping purely from examples, deep learning methods for inverse problems have access to the forward operator and thus should be able to use it to guide the learning process toward more accurate reconstructions. One popular approach to incorporating knowledge of the forward operator is termed loop unrolling (LU). These methods are heavily inspired by standard iterative inverse problem solvers, but rather than use a hand-tuned regularizer, they instead learn the update with some parameterized model.
They tend to have a fixed number of iterations (typically around 5-10) due to computational constraints.
[2] proposes an interesting alternative that takes advantage of deep equilibrium (DEQ) models [3–6] that we refer to as DEQ4IP. Equilibrium models are designed to recursively iterate on their input until a “fixed point” is found (i.e., the input no longer changes after passing through the model). They extend this principle to the LU method, choosing to iterate until convergence rather than for a “fixed budget”.
Our Contributions. We propose a novel alternative architecture for solving inverse problems called
Loop Unrolled Shallow Equilibrium Regularizer (LUSER). It incorporates knowledge of the forward model
by adopting the principles of LU architectures while reducing its memory consumption by using a shallow
(relative to existing feed-forward models) DEQ as the learned regularizer update. Unlike DEQ4IP, which converts the entire LU architecture into a DEQ model, we only convert the learned regularizer at each stage.
This has the advantage of simplifying the learning task for DEQ models, which can be unstable to train in
practice. To our knowledge, this is the first use of multiple sequential DEQ models within a single architecture
for solving inverse problems. Our proposed architecture (i) reduces the number of forward/adjoint operations
compared to the work proposed by [2], (ii) reduces the memory footprint during training without loss of
expressiveness as demonstrated by our experiments, and (iii) is more stable to train in practice than DEQ4IP.
We empirically demonstrate better reconstruction across multiple tasks than state-of-the-art LU alternatives
with a similar number of parameters, with the ability to reduce computational memory costs during training
by a factor of up to 8×.
The remainder of the paper is organized as follows. Section 2 reviews related works in solving inverse
problems. Section 3 introduces the proposed LUSER, which we compare with other baseline methods in
image deblurring, CT, and MRI tasks in Section 4. We conclude in Section 5 with a brief discussion.
2 Related Work
2.1 Loop Unrolling
As noted above, a common approach to tackling an inverse problem is to cast it as an optimization problem
consisting of the sum of a data consistency term and a regularization term
min_x ‖y − Ax‖_2^2 + γ r(x),    (3)
where r is a regularization function mapping from the domain of the parameters of interest to a real number and γ ≥ 0 is a well-tuned coefficient. The regularization function is chosen for specific classes of signals to exploit any potential structure, e.g., ‖x‖_2 for smooth signals and ‖x‖_0 or ‖x‖_1 for sparse signals. The total-variation (TV) norm is another popular example of a regularizer that promotes smoothness while preserving edges, and is often used for imaging tasks.
When r is differentiable, the solution of (3) can be obtained in an iterative fashion via gradient descent. For some step size λ, at iteration k = 1, 2, ..., K we apply the update:

x_{k+1} = x_k + λ A^T (y − A x_k) − λγ ∇r(x_k).    (4)

For non-differentiable r, the more general proximal gradient algorithm can be applied with the following update, where τ is a well-tuned hyperparameter related to the proximal operator:

x_{k+1} = prox_{τ,r}(x_k + λ A^T (y − A x_k)).    (5)
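As a concrete instance of (5), the sketch below implements proximal gradient descent with soft-thresholding, the proximal operator of the ℓ1 norm. It is an illustrative example rather than code from this work; the step-size choice and the toy recovery problem at the end are assumptions made for the demonstration.

import torch

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1 (soft-thresholding)
    return torch.sign(v) * torch.clamp(torch.abs(v) - t, min=0.0)

def proximal_gradient(A, y, lam, tau=1e-3, K=200):
    """Run the update in (5) for K iterations with an ell_1 proximal step."""
    x = torch.zeros(A.shape[1])
    for _ in range(K):
        x = x + lam * (A.T @ (y - A @ x))   # gradient step on ||y - A x||^2
        x = soft_threshold(x, tau)          # proximal step for the regularizer
    return x

# Example: recover a sparse signal from underdetermined measurements.
torch.manual_seed(0)
A = torch.randn(32, 64)
x_true = torch.zeros(64)
x_true[::8] = 1.0
y = A @ x_true + 0.01 * torch.randn(32)
# step size chosen so the gradient step is non-expansive on the data term
x_hat = proximal_gradient(A, y, lam=1.0 / torch.linalg.matrix_norm(A, 2) ** 2)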
The loop unrolling (LU) method considers running the update in (4) or (5), but replaces λγ∇r or the proximal operator with a learned neural network. The overall architecture repeats the neural-network-based update for a pre-determined number of iterations, fixing the overall computational budget. Note that the network is only implicitly learning the regularizer; in practice, it is actually learning an update step, which can be thought of as de-noising or a projection onto the data manifold. LU is typically trained end-to-end, i.e., when fed some initialization x_0, the network outputs the final estimate x_K, a loss is computed with respect to the ground truth, and gradients are back-propagated across the entire computational graph to update the network parameters. While end-to-end training is easier to perform and encourages faster convergence, it requires all intermediate activations to be stored in memory. Thus, the maximum number of iterations is always kept small compared to classical iterative inverse problem solvers.
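The following PyTorch sketch illustrates this structure: K unrolled iterations, each consisting of a data-consistency gradient step and a small learned CNN update standing in for the proximal operator. It is a generic illustration, not the LUSER or baseline architecture used in this paper; the blur operator, network sizes, and residual form of the learned update are arbitrary assumptions. Because all K iterations live in a single computational graph, every intermediate activation is retained until the backward pass, which is precisely the memory cost discussed above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LoopUnrolled(nn.Module):
    """K unrolled iterations of (5): a data-consistency gradient step followed
    by a small learned CNN update in place of the proximal operator."""

    def __init__(self, forward_op, adjoint_op, K=8, channels=32):
        super().__init__()
        self.A, self.At, self.K = forward_op, adjoint_op, K
        self.lam = nn.Parameter(torch.tensor(0.1))        # learned step size
        self.reg = nn.ModuleList([                        # one small network per iteration
            nn.Sequential(
                nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, 1, 3, padding=1))
            for _ in range(K)])

    def forward(self, y, x0):
        x = x0
        for k in range(self.K):
            x = x + self.lam * self.At(y - self.A(x))     # gradient step, as in (4)
            x = x + self.reg[k](x)                        # learned residual update
        return x

# End-to-end training step: all K iterations sit in one graph, so all
# intermediate activations are stored until backward() is called.
blur = lambda img: F.avg_pool2d(img, 3, stride=1, padding=1)   # toy, approximately self-adjoint operator
model = LoopUnrolled(forward_op=blur, adjoint_op=blur, K=8)
x_true = torch.rand(4, 1, 64, 64)
y = blur(x_true) + 0.01 * torch.randn_like(x_true)
loss = F.mse_loss(model(y, x0=y), x_true)
loss.backward()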
Due to this memory limitation, there is a trade-off between the depth of an LU model and the richness of each regularization network. Intuitively, one can improve performance by increasing the number of loop-unrolled iterations. For example, [2] extends the LU model to a potentially infinite number of iterations using an implicit network, and [7] allows deeper unrolling using invertible networks, at the cost of recalculating the intermediate results from the output during the training phase. This approach can be computationally intensive for large-scale inverse problems or when the forward operator is nonlinear and computationally expensive to apply. For example, the forward operator may involve solving differential equations such as the wave equation for seismic wave propagation [8] or the Lorenz equations for atmospheric modeling [9].
Alternatively, one can design a richer regularization network. For example, [10] uses a transformer as the regularization network and achieves extremely competitive results in the fastMRI challenge [11], but requires multiple 24GB GPUs for training with a batch size of 1, which is often impractical, especially for large systems. Our design strikes a balance between the expressiveness of the regularization network and memory efficiency during training: it is an alternative way to achieve a rich regularization network without the additional computational memory cost during training.
2.2 Deep Equilibrium Models for Inverse Problems (DEQ4IP)
Deep equilibrium (DEQ) models introduce an alternative to traditional feed-forward networks [3–6]. Rather
than feed an input through a fixed (relatively small) number of layers, DEQ models solve for the “fixed point” given some input. More formally, given a network f_θ and some input x^{(0)} and y, we recursively apply the network via

x^{(k+1)} = f_θ(x^{(k)}, y),    (6)

until convergence.¹ In this instance, y acts as an input injection that determines the final output. This is termed the forward process. The weights θ of the model can be trained via implicit differentiation, removing the need to store all intermediate activations from recursively applying the network. This allows for deeper, more expressive models without the memory footprint otherwise required to train them.
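The sketch below follows the implicit-differentiation recipe popularized in the DEQ literature: the forward pass runs a fixed-point solver with autograd disabled, a single extra application of f_θ re-attaches the computational graph, and a backward hook solves a second fixed-point equation for the gradient. It is a minimal illustration, not the implementation used by [2] or in this work; the naive fixed-point solver, iteration cap, and tolerance are placeholder choices (practical implementations typically use Anderson acceleration or a quasi-Newton solver).

import torch
import torch.nn as nn

class DEQFixedPoint(nn.Module):
    """Wraps a layer f(z, y) and returns (an approximation of) its fixed point
    z* = f(z*, y), with gradients computed by implicit differentiation."""

    def __init__(self, f, max_iter=50, tol=1e-4):
        super().__init__()
        self.f, self.max_iter, self.tol = f, max_iter, tol

    def _solve(self, g, z):
        # naive fixed-point iteration z <- g(z)
        for _ in range(self.max_iter):
            z_next = g(z)
            if torch.norm(z_next - z) < self.tol * (torch.norm(z) + 1e-8):
                return z_next
            z = z_next
        return z

    def forward(self, y, z0):
        with torch.no_grad():                         # no graph is built for the solver loop
            z_star = self._solve(lambda z: self.f(z, y), z0)

        z_star = self.f(z_star, y)                    # one extra step re-attaches autograd

        if torch.is_grad_enabled() and z_star.requires_grad:
            z0_ = z_star.clone().detach().requires_grad_()
            f0 = self.f(z0_, y)

            def backward_hook(grad):
                # solve u = (df/dz)^T u + grad, the implicit-function-theorem
                # linear system, again by fixed-point iteration
                return self._solve(
                    lambda u: torch.autograd.grad(f0, z0_, u, retain_graph=True)[0] + grad,
                    grad)

            z_star.register_hook(backward_hook)
        return z_star

Note that only the fixed point and the graph of one application of f_θ are kept in memory, so the training memory cost does not grow with the number of solver iterations.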
[2] demonstrates an application of one such model, applying similar principles to a single iteration of an LU
architecture. Such an idea is a natural extension, as it allows the model to “iterate until convergence” rather than rely on a “fixed budget”; in other words, the update in (4) or (5) is applied many times (in practice, usually around 50 iterations) until x_k converges. However, such a model can be unstable to train and often performs best with pre-training of the learned portion of the model (typically acting as a learned regularizer/de-noiser). It is also important to note that such a model would have to apply the forward operator (and potentially
the adjoint) many times during the forward process. Although this can be accelerated to reduce the number
of applications, it is still often more than the number of applications for an LU equivalent. This can be an
issue if the forward operator is computationally expensive to apply, an issue LU methods somewhat mitigate
by fixing the total number of iterations.
2.3 Alternative Approaches to Tackle Memory Issues
I-RIM [7] is a deep invertible network that addresses the memory issue by recalculating the intermediate results from the output. However, it is not ideal when the forward model is computationally expensive. Gradient checkpointing [12] is another practical technique to reduce memory costs for deep neural networks. It saves only a subset of the intermediate activations during the forward pass and recomputes the rest as needed during back-propagation, trading extra computation for a smaller memory footprint.
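As an illustration of the idea (not code from this work), the snippet below applies PyTorch's torch.utils.checkpoint to a deep stack of convolutional blocks; the depth and layer sizes are arbitrary.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# A deep feed-forward network built from repeated blocks.
blocks = nn.ModuleList([
    nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
    for _ in range(40)])

def run_checkpointed(x):
    # Only the input/output of each block is kept; activations inside a block
    # are recomputed during the backward pass, trading compute for memory.
    for blk in blocks:
        x = checkpoint(blk, x, use_reentrant=False)
    return x

x = torch.rand(2, 16, 128, 128, requires_grad=True)
run_checkpointed(x).sum().backward()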
¹Note that, since our approach will ultimately use both methods, to aid in a clearer presentation we use subscripts, i.e., x_k, to denote the LU iterations, and superscripts with parentheses, i.e., r^{(i)}, to denote the iterations in the deep equilibrium model.