LINEAR INVERSE PROBLEMS WITH
HESSIAN-SCHATTEN TOTAL VARIATION
LUIGI AMBROSIO*, SHAYAN AZIZNEJAD†, CAMILLO BRENA*, AND MICHAEL UNSER†
Abstract. In this paper, we characterize the class of extremal points of the unit ball of the Hessian-Schatten total variation (HTV) functional. The underlying motivation for our work stems from a general representer theorem that characterizes the solution set of regularized linear inverse problems in terms of the extremal points of the regularization ball. Our analysis is mainly based on studying the class of continuous and piecewise linear (CPWL) functions. In particular, we show that in dimension $d = 2$, CPWL functions are dense in the unit ball of the HTV functional. Moreover, we prove that a CPWL function is extremal if and only if its Hessian is minimally supported. For the converse, we prove that the density result (which we have only proven for dimension $d = 2$) implies that the closure of the CPWL extreme points contains all extremal points.
Contents
1. Introduction
2. Preliminaries
2.1. Schatten norms
2.2. Poincaré inequalities
2.3. Distributions
3. Hessian–Schatten Total Variation
3.1. Definitions and Basic Properties
3.2. Boundary Extension
4. A Density Result for CPWL Functions
4.1. Definitions and The Main Result
4.2. Proof of Theorem 21
5. Extremal Points of The Unit Ball
Acknowledgments
References
*Scuola Normale Superiore di Pisa, Pisa, Italy (luigi.ambrosio@sns.it, camillo.brena@sns.it).
†Biomedical Imaging Group, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland (sh.aziznejad@gmail.com, michael.unser@epfl.ch).
arXiv:2210.04077v1 [math.FA] 8 Oct 2022
1. Introduction
Broadly speaking, the goal of an inverse problem is to reconstruct an unknown signal of interest from a collection of (possibly noisy) observations. Linear inverse problems, in particular, are prevalent in various areas of signal processing, such as denoising, inpainting, and image reconstruction. They are defined via the specification of three principal components: (i) a hypothesis space $\mathcal{F}$ from which we aim to reconstruct the unknown signal $f \in \mathcal{F}$; (ii) a linear forward operator $\nu : \mathcal{F} \to \mathbb{R}^M$ that models the data-acquisition process; and (iii) the observed data, stored in an array $\mathbf{y} \in \mathbb{R}^M$, with the implicit assumption that $\mathbf{y} \approx \nu(f)$. The task is then to (approximately) reconstruct the unknown signal $f$ from the observed data $\mathbf{y}$. From a variational perspective, the problem can be formulated as a minimization of the form
$$f^* \in \operatorname*{arg\,min}_{f \in \mathcal{F}} \, \big( E(\nu(f), \mathbf{y}) + \lambda R(f) \big), \tag{1}$$
where $E : \mathbb{R}^M \times \mathbb{R}^M \to \mathbb{R}$ is a convex loss function that measures the data discrepancy, $R : \mathcal{F} \to \mathbb{R}$ is the regularization functional that enforces prior knowledge on the reconstructed signal, and $\lambda > 0$ is a tunable parameter that balances the two terms.
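To make the formulation (1) concrete, here is a minimal numerical sketch (not part of the paper; all names and choices are ours) of its simplest instance: $E$ is the squared loss, $R$ is the squared $\ell_2$-norm (Tikhonov regularization), $\mathcal{F} = \mathbb{R}^d$, and $\nu$ is a matrix, in which case the minimizer has a closed form.

```python
import numpy as np

# Illustrative instance of (1): E(z, y) = ||z - y||_2^2, R(f) = ||f||_2^2,
# F = R^d, and the forward operator nu is a matrix A in R^{M x d}.
rng = np.random.default_rng(0)
M, d = 20, 50
A = rng.standard_normal((M, d))                   # forward operator nu
f_true = rng.standard_normal(d)
y = A @ f_true + 0.05 * rng.standard_normal(M)    # noisy data y ~ nu(f)

lam = 0.1
# The minimizer of ||A f - y||^2 + lam ||f||^2 solves the normal equations
#   (A^T A + lam I) f* = A^T y.
f_star = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)
```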
The use of regularization for solving inverse problems dates back to the 1960s, when Tikhonov proposed a quadratic ($\ell_2$-type) functional for solving finite-dimensional problems [Tik63]. More recently, Tikhonov regularization has been outperformed by $\ell_1$-type functionals in various settings [Tib96, DE03]. This is largely due to the sparsity-promoting effect of the latter, in the sense that the solution of an $\ell_1$-regularized inverse problem can typically be written as the linear combination of a few predefined elements, known as atoms [Don06b, BDE09]. Sparsity is a pivotal concept in modern signal processing and constitutes the core of many celebrated methods. The most notable example is the framework of compressed sensing [CRT06, Don06a, EK12], which has attracted a lot of attention in the past decades.
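As an illustration of this sparsity-promoting effect (our own sketch, not taken from the cited works), the $\ell_1$-regularized counterpart of the Tikhonov example above can be solved by iterative soft-thresholding (ISTA), and its solution typically has few nonzero entries:

```python
import numpy as np

# ISTA for  min_f  0.5 * ||A f - y||_2^2 + lam * ||f||_1 .
def ista(A, y, lam, n_iter=500):
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / L, with L the Lipschitz constant of the gradient
    f = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = f - step * A.T @ (A @ f - y)          # gradient step on the data term
        f = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)   # soft-thresholding
    return f                                      # typically sparse: few active atoms
```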
In general, regularization enhances the stability of the problem and alleviates its inherent ill-posedness, especially when the hypothesis space is much larger than $M$. While this can happen in the discrete setting (e.g. when $\mathcal{F} = \mathbb{R}^d$ with $d \gg M$), it is inevitable in the continuum, where $\mathcal{F}$ is an infinite-dimensional space of functions. Since naturally occurring signals and images are usually indexed over the whole continuum, studying continuous-domain problems is undeniably important. It thus comes as no surprise that there is a rich literature on this class of optimization problems. Among the classical examples are the smoothing splines for interpolation [Sch88, Rei67] and the celebrated framework of learning over reproducing kernel Hilbert spaces [Wah90, SHS01]. Remarkably, the latter laid the foundation of numerous kernel-based machine-learning schemes such as support-vector machines [EPP00]. The key theoretical result of these frameworks is a "representer theorem" that provides a parametric form for their optimal solutions. While these examples formulate optimization problems over Hilbert spaces, the representer theorem has recently been extended to cover generic convex optimization problems over Banach spaces [BCDC+19, BC20, Uns21, UA22]. In simple terms, these abstract results characterize the solution set of (1) in terms of the extreme points of the unit ball of the regularization
functional $B_R = \{f \in \mathcal{F} : R(f) \le 1\}$. Hence, the original problem can be translated into finding the extreme points of the unit ball $B_R$.
In parallel, Osher-Rudin-Fatemi's total variation has been systematically explored in the context of image restoration and denoising [ROF92, Cha04, Get12]. The total variation of a differentiable function $f : \Omega \to \mathbb{R}$ can be computed as
$$\mathrm{TV}(f) = \int_\Omega \|\nabla f(x)\|_{\ell_2} \, dx. \tag{2}$$
The notion can be extended to cover non-differentiable functions using the theory of functions with bounded variation [AFP00, CDDD03]. In this case, the representer theorem states that the solution can be written as the linear combination of some indicator functions [BC20]. This adequately explains the so-called "staircase effect" of TV regularization. Subsequently, higher-order generalizations of TV regularization have been proposed by Bredies et al. [BKP10, BH14, BH20]. In particular, the second-order TV has been used in various applications [HS06, BP10, KBPS11]. By analogy with (2), the second-order TV is defined over the space of functions with bounded Hessian [Dem84]. Specifically, it can be computed for twice-differentiable functions $f : \Omega \to \mathbb{R}$ as
$$\mathrm{TV}^{(2)}(f) = \int_\Omega \|\nabla^2 f(x)\|_F \, dx, \tag{3}$$
where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. Lefkimmiatis et al. generalized the notion by replacing the Frobenius norm with any Schatten $p$-norm for $p \in [1, +\infty]$ [LWU13, LU13]. While this had only been defined for twice-differentiable functions, it has recently been extended to the space of functions with bounded Hessian [ACU21]. The extended seminorm, called the Hessian-Schatten total variation (HTV), has also been used for learning continuous and piecewise linear (CPWL) mappings [CAU21, PGU22]. The motivation and importance of the latter stems from the following observations:
(1) The CPWL family plays a significant role in deep learning. Indeed, it is known that the input-output mapping of any deep neural network (DNN) with rectified linear unit (ReLU) activation functions is a CPWL function [MPCB14]. Conversely, any CPWL mapping can be exactly represented by a DNN with ReLU activation functions [ABMM16]. These results provide a one-to-one correspondence between the CPWL family and the input-output mappings of commonly used DNNs.
(2) For one-dimensional problems (i.e., when $\Omega \subseteq \mathbb{R}$), the HTV seminorm coincides with the second-order TV. Remarkably, the representer theorem in this case states that the optimal solution can be achieved by a linear spline, that is, a univariate CPWL function. The latter suggests the use of $\mathrm{TV}^{(2)}$ regularization for learning univariate functions [SESS19, Uns19, AGCU20, BCG+20, DDUF21, ADU22].
(3) It is known from the literature on low-rank matrix recovery that the Schatten-1 norm (also known as the nuclear norm) promotes low-rank matrices [DR16]. Hence, by using the HTV seminorm with $p = 1$, one expects to obtain a mapping whose Hessian has low rank at most points, with the extreme case being the CPWL family, whose Hessian is zero almost everywhere (see the numerical sketch after this list).
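The following minimal numerical sketch illustrates observation (3); the finite-difference discretization of the HTV is our own simplification, not the paper's definition. For samples of a CPWL function, the discrete Hessian vanishes inside the linear pieces, so the Schatten-1 (nuclear) norms are summed only along the cell boundaries:

```python
import numpy as np

def discrete_htv(f, p=1):
    """Sum of Schatten-p norms of a finite-difference Hessian; f is a 2D array of samples."""
    fxx = np.diff(f, 2, axis=0)[:, 1:-1]
    fyy = np.diff(f, 2, axis=1)[1:-1, :]
    fxy = np.diff(np.diff(f, 1, axis=0), 1, axis=1)[:-1, :-1]
    htv = 0.0
    for i in range(fxx.shape[0]):
        for j in range(fxx.shape[1]):
            H = np.array([[fxx[i, j], fxy[i, j]],
                          [fxy[i, j], fyy[i, j]]])
            htv += np.linalg.norm(np.linalg.svd(H, compute_uv=False), ord=p)
    return htv

# CPWL example f(x, y) = |x|: the discrete Hessian is supported on the line x = 0.
x = np.linspace(-1, 1, 41)
f = np.abs(x)[:, None] + 0.0 * x[None, :]
print(discrete_htv(f, p=1))
```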
The aim of this paper is to identify the solution set of linear inverse problems with HTV regularization. Motivated by recent general representer theorems (see [BCDC+19, UA22]), we focus on the characterization of the extreme points of the unit ball of the HTV functional. After recalling some preliminary concepts (Section 2), we study the HTV seminorm and its associated native space from a mathematical perspective (Section 3). Next, we prove our main theoretical result on the density of CPWL functions in the unit ball of the HTV seminorm (Theorem 21) in Section 4. Finally, we invoke a variant of the Krein-Milman theorem to characterize the extreme points of the unit ball of the HTV seminorm (Section 5).
2. Preliminaries
Throughout the paper, we shall use fairly standard notation for various objects, such as function spaces and sets. For example, $\mathcal{L}^n$ and $\mathcal{H}^k$ denote the Lebesgue and the $k$-dimensional Hausdorff measures on $\mathbb{R}^n$, respectively. Below, we recall some of the concepts that are foundational for this paper.
2.1. Schatten norms.
Definition 1 (Schatten norm). Let $p \in [1, +\infty]$. If $M \in \mathbb{R}^{n \times n}$ and $s_1(M), \dots, s_n(M) \ge 0$ denote the singular values of $M$ (counted with their multiplicity), we define the Schatten $p$-norm of $M$ by
$$|M|_p := \|(s_1(M), \dots, s_n(M))\|_{\ell_p}.$$
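In numerical terms, Definition 1 can be read off directly from a singular-value decomposition; the following helper (a sketch of ours, assuming numpy) is reused in the checks below:

```python
import numpy as np

def schatten_norm(M, p):
    s = np.linalg.svd(M, compute_uv=False)   # singular values s_1(M), ..., s_n(M)
    return np.linalg.norm(s, ord=p)          # ||(s_1(M), ..., s_n(M))||_{l_p}
```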
We recall that the scalar product between $M, N \in \mathbb{R}^{n \times n}$ is defined by
$$M \cdot N := \operatorname{tr}(M^t N) = \sum_{i,j=1,\dots,n} M_{i,j} N_{i,j}$$
and induces the Hilbert-Schmidt norm. Next, we enumerate several properties of the Schatten norms that shall be used throughout the paper. We refer to standard books on matrix analysis (such as [Bha97]) for the proof of these results.
Proposition 2. The family of Schatten norms satisfies the following properties.
(1) If $M \in \mathbb{R}^{n \times n}$ is symmetric, then its singular values $s_1(M), \dots, s_n(M)$ are equal to $|\lambda_1(M)|, \dots, |\lambda_n(M)|$, where $\lambda_1(M), \dots, \lambda_n(M)$ denote the eigenvalues of $M$ (counted with their multiplicity). Hence $|M|_p = \|(\lambda_1(M), \dots, \lambda_n(M))\|_{\ell_p}$.
(2) If $M \in \mathbb{R}^{n \times n}$ and $N \in O(\mathbb{R}^n)$, then $|MN|_p = |NM|_p = |M|_p$.
(3) If $M, N \in \mathbb{R}^{n \times n}$, then $|MN|_p \le |M|_p |N|_p$.
(4) If $M \in \mathbb{R}^{n \times n}$, then $|M|_p = \sup_N M \cdot N$, where the supremum is taken among all $N \in \mathbb{R}^{n \times n}$ with $|N|_{p'} \le 1$, for $p'$ the conjugate exponent of $p$.
(5) If $M$ has rank 1, then $|M|_p$ coincides with the Hilbert-Schmidt norm of $M$ for every $p \in [1, +\infty]$.
(6) If $p \in (1, +\infty)$, then the Schatten $p$-norm is strictly convex [AU21, Corollary 1].
(7) If $M \in \mathbb{R}^{n \times n}$, then $|M|_p \le C |M|_q$, where $C = C(n, p, q)$ depends only on $n$, $p$ and $q$.
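Properties (1) and (3) are easy to verify numerically with the schatten_norm sketch above (again, an illustration of ours rather than part of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
M = M + M.T                                  # symmetric test matrix

# (1): for symmetric M, |M|_p is the l_p norm of the eigenvalues.
eigs = np.linalg.eigvalsh(M)
assert np.isclose(schatten_norm(M, 1), np.linalg.norm(eigs, 1))

# (3): submultiplicativity |MN|_p <= |M|_p |N|_p (up to round-off).
N = rng.standard_normal((4, 4))
assert schatten_norm(M @ N, 2) <= schatten_norm(M, 2) * schatten_norm(N, 2) + 1e-12
```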
LINEAR INVERSE PROBLEMS WITH HESSIAN-SCHATTEN TOTAL VARIATION 5
Definition 3 ($L^r$-Schatten $p$-norm). Let $p, r \in [1, +\infty]$ and let $M \in (L^r(\mathbb{R}^n))^{n \times n}$. We define the $L^r$-Schatten $p$-norm of $M$ as
$$\|M\|_{p,r} := \big\| |M|_p \big\|_{L^r(\mathbb{R}^n)}.$$
An analogous definition can be given when the reference measure for the $L^r$ space is not the Lebesgue measure.
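For a matrix field sampled on a uniform grid and finite $r$, Definition 3 can be approximated by a Riemann sum; this discretization is our own illustrative choice:

```python
import numpy as np

def lr_schatten_norm(field, p, r, h):
    """field: array of shape (..., n, n) of matrix samples; h: grid spacing; r finite."""
    s = np.linalg.svd(field, compute_uv=False)          # batched singular values
    pointwise = np.linalg.norm(s, ord=p, axis=-1)       # scalar field x -> |M(x)|_p
    n_dims = field.ndim - 2                             # number of grid axes
    return (np.sum(pointwise ** r) * h ** n_dims) ** (1.0 / r)
```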
2.2. Poincaré inequalities. We recall that, for a Borel set $A \subseteq \mathbb{R}^n$ with $\mathcal{L}^n(A) > 0$ and $f \in L^1(A)$,
$$\fint_A f \, d\mathcal{L}^n := \frac{1}{\mathcal{L}^n(A)} \int_A f \, d\mathcal{L}^n.$$
Definition 4. Let $A \subseteq \mathbb{R}^n$ be an open domain. We say that $A$ supports Poincaré inequalities if for every $q \in [1, n)$ there exists a constant $C = C(A, q)$, depending on $A$ and $q$, such that
$$\left( \int_A \Big| f - \fint_A f \Big|^{q^*} \, d\mathcal{L}^n \right)^{1/q^*} \le C \left( \int_A |\nabla f|^q \, d\mathcal{L}^n \right)^{1/q} \quad \text{for every } f \in W^{1,q}(A),$$
where $1/q^* = 1/q - 1/n$.
We recall that any ball in $\mathbb{R}^n$ supports Poincaré inequalities [EG15, Theorem 4.9].
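As a sanity check on the ingredients of Definition 4 (a sketch of ours with an ad-hoc discretization), one can compute both sides of the inequality on the unit square with $n = 2$ and $q = 1$, so that $q^* = 2$; the constant $C$ itself is not computed:

```python
import numpy as np

h = 1.0 / 200
x = np.arange(0.5 * h, 1.0, h)                 # cell midpoints of (0, 1)
X, Y = np.meshgrid(x, x, indexing="ij")
f = np.sin(np.pi * X) * Y ** 2                 # a smooth test function

fx, fy = np.gradient(f, h)
lhs = np.sqrt(np.sum((f - f.mean()) ** 2) * h ** 2)   # (int |f - mean of f|^{q*})^{1/q*}
rhs = np.sum(np.hypot(fx, fy)) * h ** 2               # int |grad f|  (q = 1)
print(lhs, rhs)   # lhs <= C * rhs for some C = C(A, 1)
```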
Remark 5. Let $A$ be a bounded open domain supporting Poincaré inequalities. We recall the following fact: if $f \in W^{1,1}_{\mathrm{loc}}(A)$ is such that $\int_A |\nabla f|^q \, d\mathcal{L}^n < +\infty$, then $f \in L^{q^*}(A)$, where $1/q^* = 1/q - 1/n$. To show this, apply a Poincaré inequality to the truncations $f_m := (f \wedge m) \vee (-m) \in W^{1,q}(A)$, with $\int_A |\nabla f_m|^q \, d\mathcal{L}^n \le \int_A |\nabla f|^q \, d\mathcal{L}^n$, and deduce that, for the constants $c_m := \fint_A f_m \, d\mathcal{L}^n$, it holds
$$\left( \int_A |f_m - c_m|^{q^*} \, d\mathcal{L}^n \right)^{1/q^*} \le C \left( \int_A |\nabla f|^q \, d\mathcal{L}^n \right)^{1/q}.$$
Now, if $B \subseteq A$ is a ball with $\bar{B} \subseteq A$, we have that $\|f_m\|_{L^1(B)} \le \|f\|_{L^1(B)} < +\infty$ and $\|f_m - c_m\|_{L^1(B)}$ is bounded in $m$, so that $\sup_m |c_m| < \infty$. We also have that $\|f_m - c_m\|_{L^{q^*}(A)}$ is uniformly bounded. Thus, we infer that $\|f_m\|_{L^{q^*}(A)}$ is bounded in $m$, whence $f \in L^{q^*}(A)$.
2.3. Distributions. We denote, as usual, by $\mathcal{D}(\Omega) = C^\infty_c(\Omega)$ the space of test functions and by $\mathcal{D}'(\Omega)$ its dual, i.e. the space of distributions [Sch57]. If $T \in \mathcal{D}'(\Omega)$, we denote by $\nabla^2 T$ the distributional Hessian of $T$, i.e. the matrix of distributions $\{\nabla^2_{i,j} T\}_{i,j=1,\dots,n}$, where $(\nabla^2_{i,j} T)(f) := T(\partial_i \partial_j f)$ for every $f \in \mathcal{D}(\Omega)$. In a natural way, if $F \in \mathcal{D}(\Omega)^{n \times n}$, we denote
$$\nabla^2 T(F) := \sum_{i,j=1,\dots,n} \nabla^2_{i,j} T(F_{i,j}).$$
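The defining identity $(\nabla^2_{i,j} T)(f) = T(\partial_i \partial_j f)$ can be checked numerically when $T$ is induced by a smooth function $g$, since two integrations by parts (with no boundary terms, as $f$ has compact support) give $\int g \, \partial_i \partial_j f = \int (\partial_i \partial_j g) f$. The following sketch (ours, with a crude quadrature) illustrates this for $i \ne j$:

```python
import numpy as np

h = 1.0 / 400
x = np.arange(-1 + 0.5 * h, 1, h)
X, Y = np.meshgrid(x, x, indexing="ij")
g = X ** 2 * Y                                         # T = T_g, induced by a smooth g
inside = X ** 2 + Y ** 2 < 1
f = np.where(inside,                                   # bump test function, compact support
             np.exp(-1.0 / np.maximum(1 - X ** 2 - Y ** 2, 1e-12)), 0.0)

def d12(u):                                            # mixed derivative d_1 d_2 (finite differences)
    return np.gradient(np.gradient(u, h, axis=0), h, axis=1)

lhs = np.sum(g * d12(f)) * h ** 2                      # T(d_1 d_2 f)
rhs = np.sum(d12(g) * f) * h ** 2                      # int (d_1 d_2 g) f
print(lhs, rhs)                                        # approximately equal
```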
Remark 6. Let $T$ be a distribution on $\Omega$ such that, for every $i = 1, \dots, n$, $\partial_i T$ is a Radon measure. Then $T$ is induced by a $BV_{\mathrm{loc}}(\Omega)$ function.
The proof of this fact is classical. Here, we sketch it for the reader's convenience.