1. Introduction
Broadly speaking, the goal of an inverse problem is to reconstruct an unknown signal
of interest from a collection of (possibly noisy) observations. Linear inverse problems, in
particular, are prevalent in various areas of signal processing, such as denoising, inpainting, and image reconstruction. They are defined via the specification of three principal
components: (i) a hypothesis space $\mathcal{F}$ from which we aim to reconstruct the unknown signal $f^* \in \mathcal{F}$; (ii) a linear forward operator $\nu: \mathcal{F} \to \mathbb{R}^M$ that models the data acquisition process; and, (iii) the observed data that is stored in an array $y \in \mathbb{R}^M$, with the implicit assumption that $y \approx \nu(f^*)$. The task is then to (approximately) reconstruct the unknown signal $f^*$ from the observed data $y$. From a variational perspective, the problem can be
formulated as a minimization of the form
\begin{equation}
f^* \in \arg\min_{f \in \mathcal{F}} \, \big( E(\nu(f), y) + \lambda R(f) \big), \tag{1}
\end{equation}
where $E: \mathbb{R}^M \times \mathbb{R}^M \to \mathbb{R}$ is a convex loss function that measures the data discrepancy, $R: \mathcal{F} \to \mathbb{R}$ is the regularization functional that enforces prior knowledge on the reconstructed signal, and $\lambda > 0$ is a tunable parameter that balances the two terms.
The use of regularization for solving inverse problems dates back to the 1960s, when
Tikhonov proposed a quadratic ($\ell_2$-type) functional for solving finite-dimensional problems [Tik63]. More recently, Tikhonov regularization has been outperformed by $\ell_1$-type functionals in various settings [Tib96,DE03]. This is largely due to the sparsity-promoting effect of the latter, in the sense that the solution of an $\ell_1$-regularized inverse problem can typically be written as a linear combination of a few predefined elements, known as atoms
[Don06b,BDE09]. Sparsity is a pivotal concept in modern signal processing and consti-
tutes the core of many celebrated methods. The most notable example is the framework
of compressed sensing [CRT06,Don06a,EK12], which has attracted considerable attention over the past decades.
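As a toy illustration of this sparsity-promoting effect (and not the method studied in this paper), the sketch below recovers a sparse vector via $\ell_1$-regularized least squares, solved with a basic proximal-gradient (ISTA) loop, and compares the number of nonzero entries against the Tikhonov solution of the same problem. All dimensions, the matrix H, and the thresholds are assumptions made purely for illustration.
\begin{verbatim}
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(1)
M, d = 30, 100
H = rng.standard_normal((M, d))
f_true = np.zeros(d)
f_true[rng.choice(d, size=5, replace=False)] = 1.0   # 5-sparse ground truth
y = H @ f_true + 0.01 * rng.standard_normal(M)

lam = 0.1
step = 1.0 / np.linalg.norm(H, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
f_l1 = np.zeros(d)
for _ in range(2000):                    # ISTA for min 0.5 ||H f - y||^2 + lam ||f||_1
    f_l1 = soft_threshold(f_l1 - step * H.T @ (H @ f_l1 - y), step * lam)

f_l2 = np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ y)   # Tikhonov solution
print("nonzeros (l1):", np.count_nonzero(np.abs(f_l1) > 1e-6))
print("nonzeros (l2):", np.count_nonzero(np.abs(f_l2) > 1e-6))
\end{verbatim}
The $\ell_1$ reconstruction typically concentrates on a handful of atoms, whereas the quadratic regularizer spreads energy over all coordinates.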
In general, regularization enhances the stability of the problem and alleviates its inherent ill-posedness, especially when the hypothesis space is much larger than the number $M$ of measurements. While this can happen in the discrete setting (e.g., when $\mathcal{F} = \mathbb{R}^d$ with $d \gg M$), it is inevitable in the continuum, where $\mathcal{F}$ is an infinite-dimensional space of functions. Since naturally occurring signals and images are usually indexed over the whole continuum, studying continuous-domain problems is, therefore, undeniably important. It thus comes as no surprise to
see the rich literature on this class of optimization problems. Among the classical examples
are the smoothing splines for interpolation [Sch88,Rei67] and the celebrated framework of
learning over reproducing kernel Hilbert spaces [Wah90,SHS01]. Remarkably, the latter
laid the foundation of numerous kernel-based machine learning schemes such as support-
vector machines [EPP00]. The key theoretical result of these frameworks is a “representer
theorem” that provides a parametric form for their optimal solutions. While these ex-
amples formulate optimization problems over Hilbert spaces, the representer theorem has
recently been extended to cover generic convex optimization problems over Banach spaces
[BCDC+19,BC20,Uns21,UA22]. In simple terms, these abstract results characterize the
solution set of (1) in terms of the extreme points of the unit ball of the regularization