Deep Kronecker Network
Long Feng
lfeng@hku.hk
The University of Hong Kong
Guang Yang
guang.yang@my.cityu.edu.hk
City University of Hong Kong
Abstract
We propose Deep Kronecker Network (DKN), a novel framework designed for analyzing medical imaging
data, such as MRI, fMRI, CT, etc. Medical imaging data is different from general images in at least two
aspects: i) the sample size is usually much more limited, and ii) model interpretation is more of a concern than outcome prediction. Due to this unique nature, general methods such as the convolutional neural network (CNN) are difficult to apply directly. As such, we propose DKN, which is able to i) adapt to the low sample size limitation, ii) provide the desired model interpretation, and iii) achieve prediction power comparable to CNN. The
DKN is general in the sense that it not only works for both matrix- and (high-order) tensor-represented image data, but can also be applied to both discrete and continuous outcomes. The DKN is built on a Kronecker
product structure and implicitly imposes a piecewise smooth property on coefficients. Moreover, the
Kronecker structure can be written into a convolutional form, so DKN also resembles a CNN, particularly, a
fully convolutional network (FCN). Furthermore, we prove that with an alternating minimization algorithm,
the solutions of DKN are guaranteed to converge to the truth geometrically even if the objective function
is highly nonconvex. Interestingly, the DKN is also highly connected to the tensor regression framework
proposed by Zhou et al. (2013), where a CANDECOMP/PARAFAC (CP) low-rank structure is imposed on
tensor coefficients. Finally, we conduct both classification and regression analyses using real MRI data
from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) to demonstrate the effectiveness of DKN.
Keywords: Image Analysis, Brain Imaging, Tensor Decomposition, CNN, Kronecker Product
1 Introduction
Medical imaging analysis plays a central role in medicine today. From computed tomography (CT) to magnetic resonance imaging (MRI), and from MRI to functional MRI (fMRI), the advancement of modern imaging technologies has contributed tremendously to the diagnosis and treatment of disease.
Although image analysis has been intensively studied over the past decades, medical image data differ significantly from general images in at least two aspects. First, the sample size is much more limited,
arXiv:2210.13327v1 [stat.ML] 24 Oct 2022
while the image data are of higher order and higher dimension. For example, in MRI analysis, it is common
to have a dataset containing only hundreds or at most thousands of patients, but each with an MRI scan of
millions of voxels. In fMRI analysis, the number of voxels could be even larger. As a comparison, the sample
size in general image recognition or computer vision problems could be millions and easily much larger
than the image dimension. For instance, the ImageNet database (Deng et al., 2009) nowadays contains more than 14 million images. Second, model interpretation is usually more important than outcome prediction. Compared to simply recognizing whether or not a patient has a certain disease using medical imaging data, it is usually more of a concern to interpret the prediction outcome. In contrast, for many image recognition problems, outcome prediction is nearly the only thing of interest.
Due to this unique nature, it is difficult to directly apply general image methods to medical imaging data. The Convolutional Neural Network (CNN, Fukushima and Miyake, 1982; LeCun et al., 1998) is arguably the most successful method in image recognition in recent years. By introducing thousands or even millions of unknown parameters in a composition of nonlinear functions, the CNN is able to achieve optimal prediction accuracy. However, training a CNN requires a large number of samples, which is hardly available in medical imaging analysis. Moreover, with such a large number of unknown parameters hidden in a "black box", a CNN model is extremely difficult to interpret and unable to satisfy the needs of medical imaging analysis.
In the statistics community, there have also been numerous attempts to develop methodologies for medical imaging analysis. One of the most commonly used strategies is to vectorize the image data and use the obtained pixels as independent predictors. Built on this strategy, various methods have been developed in the literature. To list a few, Total Variation (TV, Rudin et al., 1992) based approaches, e.g., Wang et al. (2017), aim to promote smoothness in the vectorized coefficients. Bayesian methods model the vectorized coefficients with certain prior distributions, such as the Ising prior (Goldsmith et al., 2014), the Gaussian process (Kang et al., 2018), etc. Although the aforementioned methods have demonstrated their effectiveness in different applications, vectorizing the image data is clearly not an optimal strategy. Not only could the spatial information be lost, but the resulting ultra-high-dimensional vectors also face severe computational limitations. Recently, Wu and Feng (2022) proposed an innovative framework named Sparse Kronecker Product Decomposition (SKPD) to detect signal regions in image regression. The proposed approach is appealing for sparse signal detection, but unable to analyze medical imaging data with dense signals.
When image data are represented as three- or higher-order tensors (such as MRI or fMRI), Zhou et al. (2013) proposed a tensor regression (TR) framework that imposes a CANDECOMP/PARAFAC (CP) low-rank structure on the tensor coefficients. By imposing the CP structure, the number of unknown parameters in the
tensor coefficients can be significantly reduced, which also eases the computation. Built on the CP structure, Feng et al. (2021) further proposed a new Internal Variation (IV) penalization to mimic the effects of TV and promote smoothness of the image coefficients. While the TR framework is effective, it is designed for general tensor-represented predictors and does not fully utilize the special nature of image data. As a consequence, it is unable to achieve prediction power comparable to CNN.
To this end, it is desirable to develop an approach for medical imaging analysis that is able to i) adapt to the low sample size limitation, ii) enjoy good interpretability, and iii) achieve prediction power comparable to CNN. In this paper, we develop a novel framework named Deep Kronecker Network (DKN) that achieves all three goals. The DKN is built on a Kronecker product structure and implicitly imposes a latent piecewise smooth property on the coefficients. Moreover, the DKN allows us to locate the image regions that are most influential to the outcome, which helps model interpretation. The DKN is general in the sense that it works for both matrix- and (high-order) tensor-represented image data. Therefore, CT, MRI, fMRI, and other types of medical imaging data can all be handled by DKN. Furthermore, the DKN is embedded in a generalized linear model, so it works for both discrete and continuous responses.
We call DKN a network because it resembles a CNN, particularly a fully convolutional network (FCN). Although DKN starts from a Kronecker structure, it can also be written in a convolutional form. But unlike a classical CNN, the convolutions in DKN have no overlaps. This design not only allows us to achieve maximal dimension reduction, but also provides the desired model interpretability. Interestingly, the DKN is also highly connected to the tensor regression framework of Zhou et al. (2013). We show that the DKN is equivalent to tensor regression applied to reshaped images. Therefore, the three seemingly unrelated methods, FCN, tensor regression, and DKN, are connected to each other.
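To make the non-overlap point concrete, the inner product with a two-factor Kronecker coefficient can be evaluated as a convolution whose stride equals its kernel size. The sketch below is illustrative only (all sizes and the NumPy usage are our own, not from the paper); it checks numerically that $\langle X, B_2 \otimes B_1 \rangle$ equals sliding $B_1$ over $X$ in non-overlapping blocks and then weighting the block responses by $B_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, p1 = 3, 3      # inner factor: plays the role of a conv kernel (hypothetical size)
d2, p2 = 4, 4      # outer factor: weights on the block responses (hypothetical size)
B1 = rng.standard_normal((d1, p1))
B2 = rng.standard_normal((d2, p2))
X = rng.standard_normal((d2 * d1, p2 * p1))   # image of matching size

# Direct inner product with the Kronecker-structured coefficient matrix.
lhs = np.sum(X * np.kron(B2, B1))

# Same quantity as a non-overlapping (stride = kernel size) convolution:
# slide B1 over disjoint d1 x p1 blocks of X, then weight the responses by B2.
conv = np.array([[np.sum(X[i*d1:(i+1)*d1, j*p1:(j+1)*p1] * B1)
                  for j in range(p2)] for i in range(d2)])
rhs = np.sum(conv * B2)

assert np.allclose(lhs, rhs)
```

The identity holds because `np.kron(B2, B1)` tiles `B1` block-wise with weights from `B2`, so the global inner product factors over disjoint blocks.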
The DKN is solved by maximum likelihood estimation via an alternating minimization algorithm. The resulting loss function is highly nonconvex; however, we prove that the solutions of DKN are guaranteed to converge to the truth under a restricted isometry property (RIP). The proof of the theory is based on a carefully constructed power method. Due to the connections between DKN and FCN, our theoretical results also shed light on the understanding of FCN. Finally, a comprehensive simulation study along with a real MRI analysis from the Alzheimer's Disease Neuroimaging Initiative (ADNI) further demonstrates the effectiveness of DKN.
The rest of the paper is organized as follows: we introduce the DKN in Section 2. In Section 3, we discuss the computation of DKN. In Section 4, we demonstrate the connections between DKN and FCN. The connections between DKN and tensor regression are illustrated in Section 5. In Section 6, we provide theoretical guarantees of DKN. Section 7 contains a comprehensive simulation study. Finally, we conduct a real MRI analysis from ADNI in Section 8.
Notation: We use calligraphic letters $\mathcal{A}, \mathcal{B}$ to denote tensors, upper-case letters $A, B$ to denote matrices, and bold lower-case letters $\mathbf{a}, \mathbf{b}$ to denote vectors. We let $\mathrm{vec}(\cdot)$ be the vectorization operator and $\mathrm{vec}^{-1}_{(\cdot)}(\cdot)$ be its inverse, with the subscript specifying the matrix/tensor size. For example, $\mathrm{vec}^{-1}_{(d,p,q)}(\cdot)$ stands for transforming a vector of dimension $dpq$ into a tensor of dimension $d \times p \times q$. We let $\langle \cdot, \cdot \rangle$ denote the inner product and $\otimes$ denote the Kronecker product. For a vector $\mathbf{v}$, $\|\mathbf{v}\|_q = (\sum_j |v_j|^q)^{1/q}$ is the $\ell_q$ norm. For a tensor $\mathcal{A}$, $\|\mathcal{A}\|_F = (\sum_{i,j,k} \mathcal{A}_{i,j,k}^2)^{1/2}$ is the Frobenius norm.

We use square brackets around the indices to denote the entries of tensors. For example, suppose that $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3 \times n_4}$ is a four-order tensor. Then the entries of $\mathcal{A}$ are denoted by $\mathcal{A}_{[i_1],[i_2],[i_3],[i_4]}$. For simplicity, we may omit the square brackets when all indices are considered separately, i.e., $\mathcal{A}_{i_1,i_2,i_3,i_4} = \mathcal{A}_{[i_1],[i_2],[i_3],[i_4]}$. By grouping indices together, we obtain lower-order tensors. For example, a three-order tensor can be obtained by grouping the first two indices together, with entries denoted by $\mathcal{A}_{[i_1 i_2],[i_3],[i_4]}$. Here the grouped index $[i_1 i_2]$ is equivalent to the linear index $i_1 + n_1(i_2 - 1)$. Grouping the last three indices together results in a matrix (two-order tensor) with entries $\mathcal{A}_{[i_1],[i_2 i_3 i_4]}$, where the index $[i_2 i_3 i_4]$ denotes $i_2 + n_2(i_3 - 1) + n_2 n_3(i_4 - 1)$. When all the indices are grouped together, we obtain the vectorization of $\mathcal{A}$, also denoted by $\mathrm{vec}(\mathcal{A})$, with entries $\mathcal{A}_{[i_1 i_2 i_3 i_4]}$.
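As a concrete check of this grouped-index convention (the first index of a group varying fastest), grouping corresponds to a Fortran-order reshape in NumPy. The small sketch below is our own illustration, using 0-based indices:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, n4 = 2, 3, 4, 5
A = rng.standard_normal((n1, n2, n3, n4))

# Grouping [i1 i2] with i1 varying fastest (linear index i1 + n1*(i2-1)
# in the paper's 1-based notation) matches order='F' reshaping.
A3 = A.reshape(n1 * n2, n3, n4, order='F')
i1, i2, i3, i4 = 1, 2, 3, 4
assert A3[i1 + n1 * i2, i3, i4] == A[i1, i2, i3, i4]

# vec(A): all four indices grouped together.
v = A.reshape(-1, order='F')
assert v[i1 + n1 * (i2 + n2 * (i3 + n3 * i4))] == A[i1, i2, i3, i4]
```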
2 Deep Kronecker Network
To get started, suppose that we observe $n$ samples with matrix-represented images $X_i \in \mathbb{R}^{d \times p}$ and scalar responses $y_i$, $i = 1, \ldots, n$. We assume that the responses $y_i$ follow a generalized linear model:
$$ y_i \mid X_i \sim P(y_i \mid X_i) = \rho(y_i) \exp\big\{ y_i \langle X_i, C \rangle - \psi\big(\langle X_i, C \rangle\big) \big\}, \qquad (1) $$
where $C \in \mathbb{R}^{d \times p}$ is the target unknown coefficient matrix, and $\rho(\cdot)$ and $\psi(\cdot)$ are certain known univariate functions. Note that in model (1), we focus on the image analysis and omit other potential design variables, such as age, sex, etc. They can easily be added back to the model if necessary. Given model (1), we have that for a certain known link function $g(\cdot)$,
$$ g\big(\mathbb{E}(y_i)\big) = \langle X_i, C \rangle. \qquad (2) $$
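As a small illustration of this setup, the sketch below (our own, with simulated data; the logistic link is just one possible choice of $g$, corresponding to a binary outcome) computes the linear predictor $\langle X_i, C \rangle$ and the implied mean response:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 5, 8, 8
C = rng.standard_normal((d, p))        # coefficient matrix (simulated, unknown in practice)
X = rng.standard_normal((n, d, p))     # n matrix-valued images

# Linear predictor <X_i, C>: elementwise product summed over all pixels.
eta = np.einsum('idp,dp->i', X, C)

# A logistic link g (binary-outcome case): E(y_i) = 1 / (1 + exp(-<X_i, C>)).
mean_y = 1.0 / (1.0 + np.exp(-eta))
```

Other links (e.g., the identity link for continuous responses) slot into the same structure, which is why DKN handles both discrete and continuous outcomes.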
Under the framework of DKN, we propose to model the coefficients $C$ with a rank-$R$ Kronecker product decomposition with $L$ ($\geq 2$) factors:
$$ C = \sum_{r=1}^{R} B_L^r \otimes B_{L-1}^r \otimes \cdots \otimes B_1^r, \qquad (3) $$
where $B_l^r \in \mathbb{R}^{d_l \times p_l}$, $l = 1, \ldots, L$, $r = 1, \ldots, R$, are unknown matrices, referred to as Kronecker factors. The sizes of $B_l^r$ are not assumed known. However, due to the property of the Kronecker product, they certainly need to satisfy $d = \prod_{l=1}^{L} d_l$ and $p = \prod_{l=1}^{L} p_l$. For ease of notation, we write
$$ B_{l_1} \otimes B_{l_1 - 1} \otimes \cdots \otimes B_{l_2} = \bigotimes_{l = l_1}^{l_2} B_l $$
for any matrices $B_{l_1}, \ldots, B_{l_2}$ with $l_1 \geq l_2$. Therefore, the decomposition (3) can be written as $C = \sum_{r=1}^{R} \bigotimes_{l=L}^{1} B_l^r$.
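As a quick numerical illustration of this decomposition, the following sketch (our own; the factor sizes mirror Figure 1 but are otherwise arbitrary) assembles $C$ from simulated Kronecker factors with `np.kron`:

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(0)
R, L = 2, 3                         # rank and factor number, as in Figure 1
sizes = [(4, 4), (2, 2), (2, 2)]    # (d_l, p_l) for l = 1, 2, 3 (hypothetical)

# Kronecker factors B_l^r.
B = [[rng.standard_normal(sizes[l]) for l in range(L)] for _ in range(R)]

# Eq. (3): C = sum_r B_L^r kron B_{L-1}^r kron ... kron B_1^r.
C = sum(reduce(np.kron, [B[r][l] for l in range(L - 1, -1, -1)])
        for r in range(R))

# Resulting size: d = prod(d_l) = 16, p = prod(p_l) = 16.
assert C.shape == (16, 16)
```

Note how few parameters are involved: each rank-one term uses $16 + 4 + 4 = 24$ entries to describe a $16 \times 16$ (256-entry) matrix, which is the dimension-reduction effect the decomposition relies on.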
Figure 1 below illustrates a decomposition of DKN. It shows a decomposition with rank $R = 2$ and factor number $L = 3$ for a sparse matrix whose signal is a circle. In general, the decomposition (3) is able to approximate arbitrary matrices with a sufficiently large rank $R$. This can be seen by relating the decomposition (3) to a tensor CP decomposition. We defer a discussion of the connections between DKN and the CP decomposition to Section 5.

Figure 1: An illustration of DKN with $L = 3$, $R = 2$, $B_3^r, B_2^r \in \mathbb{R}^{2 \times 2}$, $B_1^r \in \mathbb{R}^{4 \times 4}$, $r = 1, 2$.
We call model (1), (2), and (3) the Deep Kronecker Network because it resembles a fully convolutional network (FCN). In particular, the rank $R$ and factor number $L$ in DKN can be viewed as the width and depth of DKN, respectively. A detailed discussion of the relation between DKN and FCN is deferred to Section 4. Moreover, as in a neural network, the performance of DKN is also affected by its structure, including depth, width, and factor sizes. We defer a detailed discussion of the network structure and its impact on performance to Section 4.3.
Beyond matrix images, the DKN can easily be extended to tensor-represented images. This allows us to address general 3D and even higher-order image data, e.g., MRI, fMRI, etc. We first need to introduce the definition of the tensor Kronecker product (TKP):

Definition 2.1. (Tensor Kronecker Product) Let $\mathcal{A} \in \mathbb{R}^{d_1 \times p_1 \times q_1}$ and $\mathcal{B} \in \mathbb{R}^{d_2 \times p_2 \times q_2}$ be two three-order tensors with entries denoted by $\mathcal{A}_{i_1,j_1,k_1}$ and $\mathcal{B}_{i_2,j_2,k_2}$, respectively. Then the tensor Kronecker product $\mathcal{C} = \mathcal{A} \otimes \mathcal{B}$ is defined by $\mathcal{C}_{[h_1 h_2],[j_1 j_2],[k_1 k_2]} = \mathcal{A}_{h_1,j_1,k_1} \mathcal{B}_{h_2,j_2,k_2}$ for all possible values of $(h_1, j_1, k_1)$ and $(h_2, j_2, k_2)$.
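A small numerical sketch of Definition 2.1 (ours, not from the paper): reading the grouped index $[h_1 h_2]$ literally under the stated convention (first index of a group varying fastest), the TKP can be formed with an einsum followed by a Fortran-order reshape.

```python
import numpy as np

def tensor_kron(A, B):
    """Tensor Kronecker product of two three-order tensors (Definition 2.1),
    with grouped indices linearized first-index-fastest (Fortran order)."""
    d1, p1, q1 = A.shape
    d2, p2, q2 = B.shape
    # C6[h1, h2, j1, j2, k1, k2] = A[h1, j1, k1] * B[h2, j2, k2]
    C6 = np.einsum('hjk,HJK->hHjJkK', A, B)
    # Collapse each index pair with the first index fastest.
    return C6.reshape(d1 * d2, p1 * p2, q1 * q2, order='F')

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))
B = rng.standard_normal((3, 2, 2))
C = tensor_kron(A, B)

# Spot-check one entry against the definition (0-based indices).
h1, j1, k1 = 1, 2, 3
h2, j2, k2 = 2, 1, 0
d1, p1, q1 = A.shape
assert np.isclose(C[h1 + d1 * h2, j1 + p1 * j2, k1 + q1 * k2],
                  A[h1, j1, k1] * B[h2, j2, k2])
```

One caveat on this literal reading: with the first index of each group varying fastest, the two-order (matrix) specialization corresponds to `np.kron` with its arguments swapped relative to the slowest-first convention; we follow the index convention exactly as stated in the notation section.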
For ease of presentation, Definition 2.1 is illustrated for the three-order TKP, but it can be further extended