A Comparative Study on 1.5T - 3T MRI
Conversion through Deep Neural Network Models
Binhua Liao∗†, Yani Chen∗, Zhewei Wang∗, Charles D. Smith‡, Jundong Liu∗
∗School of Electrical Engineering and Computer Science, Ohio University, USA
†College of Mathematics and Statistics, Huazhong Normal University, PR China
‡Department of Neurology, University of Kentucky, USA
Abstract—In this paper, we explore the capabilities of a number
of deep neural network models in generating whole-brain 3T-
like MR images from clinical 1.5T MRIs. The models include
a fully convolutional network (FCN) method and three state-
of-the-art super-resolution solutions, ESPCN [26], SRGAN [17]
and PRSR [7]. The FCN solution, U-Convert-Net, carries out
mapping of 1.5T-to-3T slices through a U-Net-like architecture,
with 3D neighborhood information integrated through a multi-
view ensemble. The pros and cons of the models, as well as the
associated evaluation metrics, are examined through experiments
and discussed in depth. To the best of our knowledge, this study
is the first work to evaluate multiple deep learning solutions for
whole-brain MRI conversion, as well as the first attempt to utilize
an FCN/U-Net-like structure for this purpose.
Index Terms—FCN, MRI, modality conversion, U-Net, U-
Convert-Net, GAN, SRGAN
I. INTRODUCTION
Magnetic resonance imaging (MRI) is widely used in neuroimaging,
and its popularity is due to its non-invasive nature, high soft
tissue contrast, as well as the availability of safe intracellular
contrast agents. Currently, 1.5 tesla (T) short-bore MRI is the
standard technology for clinical use. However, 3T (and even 7T)
MRI scanners are becoming increasingly desirable, as they can
provide considerably clearer and more detailed images. Compared
with 1.5T, 3T MR images have higher signal-to-noise ratios (SNR)
and higher contrast-to-noise ratios (CNR) between gray and white
matter. The latter makes 3T MRI a better choice for brain tissue
segmentation, as well as a generally preferred modality in
neuroimaging studies.
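As a point of reference, the CNR between gray matter (GM) and white
matter (WM) is commonly defined as

\[ \mathrm{CNR}_{\mathrm{GM,WM}} = \frac{|S_{\mathrm{GM}} - S_{\mathrm{WM}}|}{\sigma_{\mathrm{noise}}}, \]

where $S_{\mathrm{GM}}$ and $S_{\mathrm{WM}}$ denote the mean signal
intensities of the two tissues and $\sigma_{\mathrm{noise}}$ is the
standard deviation of the background noise; SNR is analogously
$S_{\mathrm{tissue}}/\sigma_{\mathrm{noise}}$.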
While the availability of 3T MRI has increased significantly
over the past decade, the majority of clinical scanners across
the US are still 1.5T systems. Converting 1.5T images into 3T-like
images, if achieved with high fidelity, would help physicians make
better-informed diagnosis and treatment decisions. In addition,
historical 1.5T MR images in various ongoing longitudinal studies
could be put to better use. One such example is the Alzheimer's
Disease Neuroimaging Initiative (ADNI) project: 1.5T was the major
MRI modality in ADNI
1, the first stage of the project, but the acquisition switched
to 3T alone in later stages (ADNI GO, 2 and 3). Converting
1.5T images into 3T-like counterparts may allow the datasets
generated in such studies to be delivered in a more uniform
form.
Establishing a nonlinear spatially-varying intensity mapping
between two images is a challenging task. Efforts to tackle this
problem can be traced back to at least the Image Analogies model
[12], which relies on a nonparametric texture model [9] to learn
the mapping from a single pair of input-output images. The
emergence of the powerful deep learning paradigm in recent years
has made the task more viable. Generative adversarial net-
works (GAN) [15], [17], [34], [37], [41] and pixel-RNN/CNN
[7] are among the models that have been applied for modality
conversion, producing impressive results.
The original GAN model by Goodfellow et al. [11] was
designed to generate images that are similar to the training
samples. Several later solutions, including DualGAN [37],
CycleGAN [41] and DiscoGAN [16], adopt a similar idea
to train image-to-image translation with unpaired natural im-
ages. The CycleGAN model has been adopted to synthesize
CT images from MRIs [34]. While flexible and with broad
applicability, this group of solutions relies on the distribution
of real samples instead of paired inputs/outputs, even if the
latter are available. Consequently, the results from this group
can be rather unstable and far from uniformly positive [41].
Some GANs, including pix2pix [15] and PAN [31], take paired
training samples to trade flexibility for stability.
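To make the paired setting concrete, the sketch below (PyTorch
assumed; the generator G, discriminator D, and optimizers are
placeholders rather than the configurations evaluated in this paper)
shows one pix2pix-style training step: the discriminator scores
(input, output) pairs, and the generator is additionally penalized by
an L1 distance to the paired 3T target, which is what stabilizes
training relative to the unpaired setting.

# Minimal sketch of a paired (conditional) adversarial training step,
# with a 1.5T slice x_15t and its registered 3T counterpart y_3t.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def paired_gan_step(G, D, opt_G, opt_D, x_15t, y_3t, lambda_l1=100.0):
    # ---- discriminator update: real pair vs. generated pair ----
    with torch.no_grad():
        y_fake = G(x_15t)
    d_real = D(torch.cat([x_15t, y_3t], dim=1))
    d_fake = D(torch.cat([x_15t, y_fake], dim=1))
    loss_D = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # ---- generator update: fool D and stay close to the paired 3T target ----
    y_fake = G(x_15t)
    d_fake = D(torch.cat([x_15t, y_fake], dim=1))
    loss_G = bce(d_fake, torch.ones_like(d_fake)) + lambda_l1 * l1(y_fake, y_3t)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()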
With paired input/output samples, MR modality conversion
could be implemented as a special case of super-resolution,
where one or multiple low-resolution images are combined
to generate images with higher spatial resolution. Traditional
super-resolution solutions include reconstruction-based meth-
ods [25], [27], [30], [35], and example-based methods [3],
[10], [14], [20], [24], [36], [39]. Under the deep learning
framework, numerous new super-resolution solutions have
recently been developed, in both the computer vision [7],
[26] and medical image computing [1], [2], [8], [18], [28],
[40] communities. SRGAN [17], a model designed to recover
fine texture details even at large upscaling factors, is
commonly regarded as one of the state-of-the-art solutions.
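As an illustration of the sub-pixel upscaling idea behind ESPCN [26],
the minimal PyTorch-style sketch below (layer widths are illustrative,
not the exact configuration compared in this study) performs all
convolutional feature extraction in the low-resolution space and
relies on a final PixelShuffle layer to rearrange channel groups into
a higher-resolution grid:

import torch.nn as nn

class ESPCNLike(nn.Module):
    # ESPCN-style network: convolutions at low resolution, then a
    # sub-pixel layer that turns r^2 channel groups into an r-times
    # larger spatial grid.
    def __init__(self, scale=2, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=5, padding=2), nn.Tanh(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
            nn.Conv2d(32, channels * scale ** 2, kernel_size=3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.body(x))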
The fully convolutional network (FCN) proposed by Long et
al. [19] was primarily designed for image segmentation, which
can also be regarded as a special type of modality mapping
– from gray-valued intensities to binary-valued labels. U-Net
[23] and its variants [4], [5], [29], [32], [33] follow a similar
idea to the FCN and rely on skip connections to concatenate
features from the contracting (convolution) and expanding