Deep Kronecker Network
Long Feng
lfeng@hku.hk
The University of Hong Kong
Guang Yang
guang.yang@my.cityu.edu.hk
City University of Hong Kong
Abstract
We propose Deep Kronecker Network (DKN), a novel framework designed for analyzing medical imaging
data, such as MRI, fMRI, CT, etc. Medical imaging data is different from general images in at least two
aspects: i) the sample size is usually much more limited, and ii) model interpretation is more of a concern than outcome prediction. Due to this unique nature, general methods such as the convolutional neural network (CNN) are difficult to apply directly. As such, we propose DKN, which is able to i) adapt to the low sample size limitation, ii) provide the desired model interpretation, and iii) achieve prediction power comparable to CNN. The
DKN is general in the sense that it not only works for both matrix- and (high-order) tensor-represented image data, but can also be applied to both discrete and continuous outcomes. The DKN is built on a Kronecker
product structure and implicitly imposes a piecewise smooth property on coefficients. Moreover, the
Kronecker structure can be written into a convolutional form, so DKN also resembles a CNN, particularly, a
fully convolutional network (FCN). Furthermore, we prove that with an alternating minimization algorithm,
the solutions of DKN are guaranteed to converge to the truth geometrically even if the objective function
is highly nonconvex. Interestingly, the DKN is also highly connected to the tensor regression framework
proposed by Zhou et al. (2013), where a CANDECOMP/PARAFAC (CP) low-rank structure is imposed on
tensor coefficients. Finally, we conduct both classification and regression analyses using real MRI data
from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) to demonstrate the effectiveness of DKN.
Keywords: Image Analysis, Brain Imaging, Tensor Decomposition, CNN, Kronecker Product
1 Introduction
Medical imaging analysis plays a central role in medicine today. From computed tomography (CT) to magnetic resonance imaging (MRI), and from MRI to functional MRI (fMRI), the advancement of modern imaging technologies has contributed tremendously to the diagnosis and treatment of disease.
Although image analysis has been intensively studied over the past decades, medical image data differ significantly from general images in at least two aspects. First, the sample size is much more limited,
arXiv:2210.13327v1 [stat.ML] 24 Oct 2022
while the image data are of higher order and higher dimension. For example, in MRI analysis, it is common
to have a dataset containing only hundreds or at most thousands of patients, but each with an MRI scan of
millions of voxels. In fMRI analysis, the number of voxels could be even larger. As a comparison, the sample
size in general image recognition or computer vision problems could be millions and easily much larger
than the image dimension. For instance, the ImageNet database (Deng et al., 2009) nowadays contains more than 14 million images. Second, model interpretation is usually more important than outcome prediction. Compared to simply recognizing whether or not a patient has a certain disease using medical imaging data, it is usually more of a concern to interpret the prediction outcome. In contrast, for many image recognition problems, outcome prediction is nearly the only thing of interest.
Due to this unique nature, it is difficult to directly apply general image methods to medical imaging data. The Convolutional Neural Network (CNN, Fukushima and Miyake, 1982; LeCun et al., 1998) is arguably the most successful method in image recognition in recent years. By introducing thousands or even millions of unknown parameters in a composition of nonlinear functions, the CNN is able to achieve optimal prediction accuracy. However, training a CNN requires a large number of samples, which is hardly available in medical imaging analysis. Moreover, with such a large number of unknown parameters hidden in a "black box", a CNN model is extremely difficult to interpret and unable to satisfy the needs of medical imaging analysis.
In the statistics community, there have also been numerous attempts to develop methodologies for medical imaging analysis. One of the most commonly used strategies is to vectorize the image data and use the obtained pixels as independent predictors. Built on this strategy, various methods have been developed in the literature. To list a few, Total Variation (TV, Rudin et al., 1992) based approaches, e.g., Wang et al. (2017), aim to promote smoothness in the vectorized coefficients. Bayesian methods model the vectorized coefficients with certain prior distributions, such as the Ising prior (Goldsmith et al., 2014), the Gaussian process (Kang et al., 2018), etc. Although the aforementioned methods have demonstrated their effectiveness in different applications, vectorizing the image data is clearly not an optimal strategy. Not only could the spatial information be lost, but the resulting ultra-high-dimensional vectors also face severe computational limitations. Recently, Wu and Feng (2022) proposed an innovative framework named Sparse Kronecker Product Decomposition (SKPD) to detect signal regions in image regression. The proposed approach is appealing for sparse signal detection, but unable to analyze medical imaging data with dense signals.
When image data are represented as three- or higher-order tensors (such as MRI or fMRI), Zhou et al. (2013) proposed a tensor regression (TR) framework that imposes a CANDECOMP/PARAFAC (CP) low-rank structure on the tensor coefficients. By imposing the CP structure, the number of unknown parameters in the
tensor coefficients can be significantly reduced, which also eases the computation. Built on the CP structure, Feng et al. (2021) further proposed a new Internal Variation (IV) penalization to mimic the effects of TV and promote smoothness of the image coefficients. While the TR framework is effective, it is designed for general tensor-represented predictors and does not fully utilize the special nature of image data. As a consequence, it is unable to achieve prediction power comparable to CNN.
To this end, it is desirable to develop an approach for medical imaging analysis that is able to i) adapt to the low sample size limitation, ii) enjoy good interpretability, and iii) achieve prediction power comparable to CNN. In this paper, we develop a novel framework named Deep Kronecker Network (DKN) that achieves all three goals. The DKN is built on a Kronecker product structure and implicitly imposes a latent piecewise smooth property on the coefficients. Moreover, the DKN allows us to locate the image regions that are most influential to the outcome, which helps model interpretation. The DKN is general in the sense that it works for both matrix- and (high-order) tensor-represented image data. Therefore, CT, MRI, fMRI, and other types of medical imaging data can all be handled by DKN. Furthermore, the DKN is embedded in a generalized linear model, so it works for both discrete and continuous responses.
We call DKN a network because it resembles a CNN, particularly a fully convolutional network (FCN). Although DKN starts from a Kronecker structure, it can also be written in a convolutional form. But unlike a classical CNN, the convolutions in DKN have no overlaps. This design not only allows us to achieve maximal dimension reduction, but also provides the desired model interpretability. Interestingly, the DKN is also highly connected to the tensor regression framework of Zhou et al. (2013). We show that the DKN is equivalent to tensor regression applied to reshaped images. Therefore, the three seemingly unrelated methods, FCN, tensor regression, and DKN, are connected to each other.
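To make the non-overlap point concrete, the inner product with a two-factor Kronecker coefficient can be evaluated as a convolution whose stride equals its kernel size. The sketch below is illustrative only (all sizes and the NumPy usage are our own, not from the paper); it checks numerically that $\langle X, B_2 \otimes B_1 \rangle$ equals sliding $B_1$ over $X$ in non-overlapping blocks and then weighting the block responses by $B_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, p1 = 3, 3      # inner factor: plays the role of a conv kernel (hypothetical size)
d2, p2 = 4, 4      # outer factor: weights on the block responses (hypothetical size)
B1 = rng.standard_normal((d1, p1))
B2 = rng.standard_normal((d2, p2))
X = rng.standard_normal((d2 * d1, p2 * p1))   # image of matching size

# Direct inner product with the Kronecker-structured coefficient matrix.
lhs = np.sum(X * np.kron(B2, B1))

# Same quantity as a non-overlapping (stride = kernel size) convolution:
# slide B1 over disjoint d1 x p1 blocks of X, then weight the responses by B2.
conv = np.array([[np.sum(X[i*d1:(i+1)*d1, j*p1:(j+1)*p1] * B1)
                  for j in range(p2)] for i in range(d2)])
rhs = np.sum(conv * B2)

assert np.allclose(lhs, rhs)
```

The identity holds because `np.kron(B2, B1)` tiles `B1` block-wise with weights from `B2`, so the global inner product factors over disjoint blocks.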
The DKN is solved by maximum likelihood estimation via an alternating minimization algorithm. The resulting loss function is highly nonconvex; however, we prove that the solutions of DKN are guaranteed to converge to the truth under a restricted isometry property (RIP). The proof of the theory is based on a carefully constructed power method. Due to the connections between DKN and FCN, our theoretical results also shed light on the understanding of FCN. Finally, a comprehensive simulation study along with a real MRI analysis from the Alzheimer's Disease Neuroimaging Initiative (ADNI) further demonstrates the effectiveness of DKN.
The rest of the paper is organized as follows: we introduce the DKN in Section 2. In Section 3, we discuss the computation of DKN. In Section 4, we demonstrate the connections between DKN and FCN. The connections between DKN and tensor regression are illustrated in Section 5. In Section 6, we provide theoretical guarantees of DKN. Section 7 contains a comprehensive simulation study. Finally, we conduct a real MRI analysis from ADNI in Section 8.
Notation: We use calligraphic letters $\mathcal{A}, \mathcal{B}$ to denote tensors, upper-case letters $A, B$ to denote matrices, and bold lower-case letters $\mathbf{a}, \mathbf{b}$ to denote vectors. We let $\mathrm{vec}(\cdot)$ be the vectorization operator and $\mathrm{vec}^{-1}_{(\cdot)}(\cdot)$ be its inverse, with the subscript specifying the matrix/tensor size. For example, $\mathrm{vec}^{-1}_{(d,p,q)}(\cdot)$ stands for transforming a vector of dimension $dpq$ into a tensor of dimension $d \times p \times q$. We let $\langle \cdot, \cdot \rangle$ denote the inner product and $\otimes$ denote the Kronecker product. For a vector $\mathbf{v}$, $\|\mathbf{v}\|_q = (\sum_j |v_j|^q)^{1/q}$ is the $\ell_q$ norm. For a tensor $\mathcal{A}$, $\|\mathcal{A}\|_F = (\sum_{i,j,k} \mathcal{A}_{i,j,k}^2)^{1/2}$ is the Frobenius norm.

We use square brackets around the indices to denote the entries of tensors. For example, suppose that $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3 \times n_4}$ is a four-order tensor. Then the entries of $\mathcal{A}$ are denoted by $\mathcal{A}_{[i_1],[i_2],[i_3],[i_4]}$. For simplicity, we may omit the square brackets when all indices are considered separately, i.e., $\mathcal{A}_{i_1,i_2,i_3,i_4} = \mathcal{A}_{[i_1],[i_2],[i_3],[i_4]}$. By grouping indices together, we obtain lower-order tensors. For example, a three-order tensor can be obtained by grouping the first two indices together, with entries denoted by $\mathcal{A}_{[i_1 i_2],[i_3],[i_4]}$. Here the grouped index $[i_1 i_2]$ is equivalent to the linear index $i_1 + n_1(i_2 - 1)$. Grouping the last three indices together results in a matrix (two-order tensor) with entries $\mathcal{A}_{[i_1],[i_2 i_3 i_4]}$, where the index $[i_2 i_3 i_4]$ denotes $i_2 + n_2(i_3 - 1) + n_2 n_3(i_4 - 1)$. When all the indices are grouped together, we obtain the vectorization of $\mathcal{A}$, also denoted by $\mathrm{vec}(\mathcal{A})$, with entries $\mathcal{A}_{[i_1 i_2 i_3 i_4]}$.
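As a concrete check of this grouped-index convention (the first index of a group varying fastest), grouping corresponds to a Fortran-order reshape in NumPy. The small sketch below is our own illustration, using 0-based indices:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, n4 = 2, 3, 4, 5
A = rng.standard_normal((n1, n2, n3, n4))

# Grouping [i1 i2] with i1 varying fastest (linear index i1 + n1*(i2-1)
# in the paper's 1-based notation) matches order='F' reshaping.
A3 = A.reshape(n1 * n2, n3, n4, order='F')
i1, i2, i3, i4 = 1, 2, 3, 4
assert A3[i1 + n1 * i2, i3, i4] == A[i1, i2, i3, i4]

# vec(A): all four indices grouped together.
v = A.reshape(-1, order='F')
assert v[i1 + n1 * (i2 + n2 * (i3 + n3 * i4))] == A[i1, i2, i3, i4]
```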
2 Deep Kronecker Network
To get started, suppose that we observe $n$ samples with matrix-represented images $X_i \in \mathbb{R}^{d \times p}$ and scalar responses $y_i$, $i = 1, \ldots, n$. We assume that the responses $y_i$ follow a generalized linear model:
$$ y_i \mid X_i \sim P(y_i \mid X_i) = \rho(y_i) \exp\big\{ y_i \langle X_i, C \rangle - \psi\big(\langle X_i, C \rangle\big) \big\}, \qquad (1) $$
where $C \in \mathbb{R}^{d \times p}$ is the target unknown coefficient matrix, and $\rho(\cdot)$ and $\psi(\cdot)$ are certain known univariate functions. Note that in model (1), we focus on the image analysis and omit other potential design variables, such as age, sex, etc. They can easily be added back to the model if necessary. Given model (1), we have that for a certain known link function $g(\cdot)$,
$$ g\big(\mathbb{E}(y_i)\big) = \langle X_i, C \rangle. \qquad (2) $$
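As a small illustration of this setup, the sketch below (our own, with simulated data; the logistic link is just one possible choice of $g$, corresponding to a binary outcome) computes the linear predictor $\langle X_i, C \rangle$ and the implied mean response:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 5, 8, 8
C = rng.standard_normal((d, p))        # coefficient matrix (simulated, unknown in practice)
X = rng.standard_normal((n, d, p))     # n matrix-valued images

# Linear predictor <X_i, C>: elementwise product summed over all pixels.
eta = np.einsum('idp,dp->i', X, C)

# A logistic link g (binary-outcome case): E(y_i) = 1 / (1 + exp(-<X_i, C>)).
mean_y = 1.0 / (1.0 + np.exp(-eta))
```

Other links (e.g., the identity link for continuous responses) slot into the same structure, which is why DKN handles both discrete and continuous outcomes.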
Under the framework of DKN, we propose to model the coefficients $C$ with a rank-$R$ Kronecker product decomposition with $L$ ($\geq 2$) factors:
$$ C = \sum_{r=1}^{R} B_L^r \otimes B_{L-1}^r \otimes \cdots \otimes B_1^r, \qquad (3) $$
where $B_l^r \in \mathbb{R}^{d_l \times p_l}$, $l = 1, \ldots, L$, $r = 1, \ldots, R$, are unknown matrices, referred to as Kronecker factors. The sizes of $B_l^r$ are not assumed known. However, due to the property of the Kronecker product, they certainly need to satisfy $d = \prod_{l=1}^{L} d_l$ and $p = \prod_{l=1}^{L} p_l$. For ease of notation, we write
$$ B_{l_1} \otimes B_{l_1 - 1} \otimes \cdots \otimes B_{l_2} = \bigotimes_{l = l_1}^{l_2} B_l $$
for any matrices $B_{l_1}, \ldots, B_{l_2}$ with $l_1 \geq l_2$. Therefore, the decomposition (3) can be written as $C = \sum_{r=1}^{R} \bigotimes_{l=L}^{1} B_l^r$.
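As a quick numerical illustration of this decomposition, the following sketch (our own; the factor sizes mirror Figure 1 but are otherwise arbitrary) assembles $C$ from simulated Kronecker factors with `np.kron`:

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(0)
R, L = 2, 3                         # rank and factor number, as in Figure 1
sizes = [(4, 4), (2, 2), (2, 2)]    # (d_l, p_l) for l = 1, 2, 3 (hypothetical)

# Kronecker factors B_l^r.
B = [[rng.standard_normal(sizes[l]) for l in range(L)] for _ in range(R)]

# Eq. (3): C = sum_r B_L^r kron B_{L-1}^r kron ... kron B_1^r.
C = sum(reduce(np.kron, [B[r][l] for l in range(L - 1, -1, -1)])
        for r in range(R))

# Resulting size: d = prod(d_l) = 16, p = prod(p_l) = 16.
assert C.shape == (16, 16)
```

Note how few parameters are involved: each rank-one term uses $16 + 4 + 4 = 24$ entries to describe a $16 \times 16$ (256-entry) matrix, which is the dimension-reduction effect the decomposition relies on.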
Figure 1 below illustrates a decomposition of DKN. It shows a decomposition with rank $R = 2$ and factor number $L = 3$ for a sparse matrix whose signal is a circle. In general, the decomposition (3) is able to approximate arbitrary matrices with a sufficiently large rank $R$. This can be seen by relating the decomposition (3) to a tensor CP decomposition. We defer a discussion of the connections between DKN and the CP decomposition to Section 5.

Figure 1: An illustration of DKN with $L = 3$, $R = 2$, $B_3^r, B_2^r \in \mathbb{R}^{2 \times 2}$, $B_1^r \in \mathbb{R}^{4 \times 4}$, $r = 1, 2$.
We call model (1), (2), and (3) the Deep Kronecker Network because it resembles a fully convolutional network (FCN). In particular, the rank $R$ and factor number $L$ in DKN can be viewed as the width and depth of DKN, respectively. A detailed discussion of the relation between DKN and FCN is deferred to Section 4. Moreover, as in a neural network, the performance of DKN is also affected by its structure, including depth, width, and factor sizes. We defer a detailed discussion of the network structure and its impact on performance to Section 4.3.
Beyond matrix images, the DKN can easily be extended to tensor-represented images. This allows us to address general 3D and even higher-order image data, e.g., MRI, fMRI, etc. We first need to introduce the definition of the tensor Kronecker product (TKP):

Definition 2.1. (Tensor Kronecker Product) Let $\mathcal{A} \in \mathbb{R}^{d_1 \times p_1 \times q_1}$ and $\mathcal{B} \in \mathbb{R}^{d_2 \times p_2 \times q_2}$ be two three-order tensors with entries denoted by $\mathcal{A}_{i_1,j_1,k_1}$ and $\mathcal{B}_{i_2,j_2,k_2}$, respectively. Then the tensor Kronecker product $\mathcal{C} = \mathcal{A} \otimes \mathcal{B}$ is defined by $\mathcal{C}_{[h_1 h_2],[j_1 j_2],[k_1 k_2]} = \mathcal{A}_{h_1,j_1,k_1} \mathcal{B}_{h_2,j_2,k_2}$ for all possible values of $(h_1, j_1, k_1)$ and $(h_2, j_2, k_2)$.
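A small numerical sketch of Definition 2.1 (ours, not from the paper): reading the grouped index $[h_1 h_2]$ literally under the stated convention (first index of a group varying fastest), the TKP can be formed with an einsum followed by a Fortran-order reshape.

```python
import numpy as np

def tensor_kron(A, B):
    """Tensor Kronecker product of two three-order tensors (Definition 2.1),
    with grouped indices linearized first-index-fastest (Fortran order)."""
    d1, p1, q1 = A.shape
    d2, p2, q2 = B.shape
    # C6[h1, h2, j1, j2, k1, k2] = A[h1, j1, k1] * B[h2, j2, k2]
    C6 = np.einsum('hjk,HJK->hHjJkK', A, B)
    # Collapse each index pair with the first index fastest.
    return C6.reshape(d1 * d2, p1 * p2, q1 * q2, order='F')

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))
B = rng.standard_normal((3, 2, 2))
C = tensor_kron(A, B)

# Spot-check one entry against the definition (0-based indices).
h1, j1, k1 = 1, 2, 3
h2, j2, k2 = 2, 1, 0
d1, p1, q1 = A.shape
assert np.isclose(C[h1 + d1 * h2, j1 + p1 * j2, k1 + q1 * k2],
                  A[h1, j1, k1] * B[h2, j2, k2])
```

One caveat on this literal reading: with the first index of each group varying fastest, the two-order (matrix) specialization corresponds to `np.kron` with its arguments swapped relative to the slowest-first convention; we follow the index convention exactly as stated in the notation section.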
For ease of presentation, Definition 2.1 is illustrated for the three-order TKP, but it can be further extended