3D Brain and Heart Volume Generative Models: A Survey
YANBIN LIU, Harry Perkins Institute of Medical Research, Department of Computer Science and Software Engineering, The University of Western Australia, Australia
GIRISH DWIVEDI∗, Harry Perkins Institute of Medical Research, The University of Western Australia, Fiona Stanley Hospital, Australia
FARID BOUSSAID, Department of Electrical, Electronic and Computer Engineering, The University of Western Australia, Australia
MOHAMMED BENNAMOUN, Department of Computer Science and Software Engineering, The University of Western Australia, Australia
Generative models such as generative adversarial networks and autoencoders have gained a great deal of attention in the medical field due to their excellent data generation capability. This paper provides a comprehensive survey of generative models for three-dimensional (3D) volumes, focusing on the brain and heart. A new and elaborate taxonomy of unconditional and conditional generative models is proposed to cover diverse medical tasks for the brain and heart: unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, and registration. We provide relevant background, examine each task and also suggest potential future directions. A list of the latest publications will be updated on GitHub to keep up with the rapid influx of papers at https://github.com/csyanbin/3D-Medical-Generative-Survey.
CCS Concepts: • Computing methodologies → Computer vision; 3D imaging.
Additional Key Words and Phrases: generative models, three-dimensional, medical images, brain and heart
1 INTRODUCTION
A wide range of research fields has embraced deep learning (DL) in recent years, including image processing [52, 63, 64, 97], speech recognition [57, 68, 128], natural language processing [22, 30, 90, 190, 203], and robotics [149, 189]. Thus, the medical imaging community has put in significant efforts to take advantage of deep learning advances, and medical imaging research has made significant progress with respect to a variety of applications including classification [14, 98, 153, 154, 245], segmentation [46, 121, 137, 195], registration [160, 255], detection [150, 151, 197], denoising [161, 213, 214, 216, 252], and synthesis [34, 74, 78, 111, 119], as well as with various imaging modalities, including Computed Tomography (CT) [107, 217], ultrasound [117], Magnetic Resonance Imaging (MRI) [3, 123], and Positron Emission Tomography (PET) [163].
A large number of annotated training images, obtained with the aid of crowd-sourcing annotation platforms like Amazon Mechanical Turk [144], were required for deep learning to be successful in natural image processing. However, the complexity of collection procedures, the lack of experts, privacy concerns, and the mandatory requirement of patient consent make the annotation process a major bottleneck in medical imaging. To mitigate this issue, deep generative models (e.g., generative adversarial networks (GANs) [55] and variational autoencoders (VAEs) [92]) have been introduced to medical imaging. In these generative models, the original data distribution is mimicked so that realistic images are generated [74, 188, 207] or cross-modality synthesis can be achieved [111, 114, 237].

This work was supported by MRFF Frontier Health and Medical Research - RFRHPI000147.
Authors' addresses: Yanbin Liu, csyanbin@gmail.com, Harry Perkins Institute of Medical Research, Department of Computer Science and Software Engineering, The University of Western Australia, Canberra, ACT, Australia, 2601; Girish Dwivedi, Harry Perkins Institute of Medical Research, The University of Western Australia, Fiona Stanley Hospital, Perth, WA, Australia, girish.dwivedi@perkins.uwa.edu.au; Farid Boussaid, Department of Electrical, Electronic and Computer Engineering, The University of Western Australia, Perth, WA, Australia, farid.boussaid@uwa.edu.au; Mohammed Bennamoun, Department of Computer Science and Software Engineering, The University of Western Australia, Perth, WA, Australia, mohammed.bennamoun@uwa.edu.au.

arXiv:2210.05952v2 [eess.IV] 6 Dec 2023

Fig. 1. Statistics of the 3D brain and heart volume generative models. (a) Statistics of all publications according to medical applications. (b) Categorization by year of publication (2017-2022). Uncond. Syn.: Unconditional Synthesis, Cond. Syn.: Conditional Synthesis.
There have been numerous survey papers published on deep generative models for medical imaging due to the rapid progress of the field [4, 5, 21, 79, 83, 89, 219, 234]. These surveys cover different medical applications and provide an overall review of GANs on general medical image analysis [4, 89, 234]. Some focus on a specific application only, such as augmentation [21], segmentation [79, 83], and registration [219]. Others concentrate on a specific image modality, such as MRI [5]. Even though many surveys exist, we find that there is a lack of comprehensive surveys on three-dimensional (3D) medical volumes, which are the original data format of many medical modalities, such as MRI, CT, and PET. Moreover, existing surveys mainly focus on GANs, neglecting other effective generative models such as Autoencoders (AEs) [67] and Autoregressive models [202]. In Section 2.1, we provide a comprehensive comparison to existing survey papers in the field of medical image generation (as shown in Table 1) and detail what sets our survey apart.
This inspired us to conduct a comprehensive survey of generative models for 3D medical volume images of the brain and heart. Since 3D volume is the intrinsic representation of many medical imaging modalities, it displays the entire and thorough anatomical structure of organs, whereas a 2D medical image only shows a specific view/plane. GANs are widely used for 3D medical volumes, but there has also been an increased interest in AEs (e.g., Diffusion Models [91, 150]) and Autoregressive models (e.g., Autoregressive Transformers [151]). As a result, our survey covers all three types of generative models. As far as organs are concerned, we restrict our interest to the brain and heart for the following reasons: (1) they are two vital organs that control the mental and physiological functions of the human body; (2) both organs involve a wide range of applications, e.g., segmentation (Fig. 1(a)); (3) generative models are essential for both organs because of their data scarcity; (4) by covering these two organs, we are able to cover generative models for both static (i.e., brain) and dynamic (i.e., heart) organs.
Contributions. To provide a comprehensive and organized survey, we introduce a new taxonomy (Fig. 2) that divides generative models into unconditional (only taking a random variable as input) and conditional (taking an additional data modality as input). In Figure 1, we provide a statistical analysis of the proportion of publications per application and the number of publications per year. Our contributions can be summarized as follows:
• This is the first survey on generative models for 3D medical volume images, focusing on two important organs, i.e., the brain and the heart. It aims to bridge the gap between the research of the 3D generative models community and the research of the medical imaging community.
• We provide a new taxonomy (Fig. 2) of 3D generative models by categorizing them as unconditional or conditional generative models. Every category includes several relevant medical applications.
• Whilst most existing surveys focus on GANs only, we cover three main categories of generative models: GANs, AEs, and Autoregressive models.
• We discuss the key challenges and future directions of 3D medical generative models and applications.

Fig. 2. Proposed taxonomy of 3D generative models for the brain and the heart. The numbers in parentheses denote the publication year (after 2000).
• Unconditional
  – 3.1 Unconditional Synthesis: 3D-α-WGAN-GP (19), SlabGAN (19), Özbey et al. (20), 2D Slice VAE (20), 3D-StyleGAN (21), DCR-"-GAN (21), Shape+Texture GAN (21), Split&Shuffle-GAN (22), HA-GAN (22), DDM (22)
  – 3.2 Classification: SCGANs (17), Biffi et al. (18), 3D-CapsNet (19), 3DPixelCNN (19), Puyol-Antón et al. (20)
• Conditional
  – 3.3 Conditional Synthesis: Pan et al. (18), 3D cGAN (18), Liao et al. (18), Joyce et al. (19), Ea-GANs (19), TPSDicyc (19), Zhang et al. (20), dEa-SA-GAN (20), 3D-RevGAN (20), QSMGAN (20), XCAT-GAN (20), MCMT-GAN (20), CNet (20), Lin et al. (21), CAE-ACGAN (21), CACR-Net (21), DiCyc (21), ProvoGAN (22), ResViT (22), Subramaniam et al. (22), CounterSynth (22), HDL (22), Qiao et al. (22)
  – 3.4 Segmentation: Salehi et al. (17), Myronenko (18), S3D-UNet (18), voxel-GAN (18), Mondal et al. (18), Yang et al. (18), MuTGAN (18), VoxelAtlasGAN (18), Zhang et al. (18), Liu et al. (19), RP-Net (19), DSTGAN (20), PSCGAN (20), Yuan et al. (20), 3D DR-UNet (20), Peng et al. (20), SASSNet (20), Vox2Vox (20), Kolarik et al. (21), FM-Pre-ResNet (21), Ullah et al. (21), MVSGAN (21), Zhang et al. (21), DAR-UNet (22), Bustamante et al. (22), Xing et al. (22)
  – 3.5 Denoising: Wolterink et al. (17), LA-GANs (18), 3D c-GANs (18), RED-WGAN (19), SGSGAN (22)
  – 3.6 Detection: Uzunova et al. (19), MADGAN (21), 3D MTGA (22), Pinaya et al. (22)
  – 3.7 Registration: VTN (19), VoxelMorph (19), Deform-GAN (20), Zhu et al. (21), Krebs et al. (21), Ramon et al. (22), TGAN (22)
Paper Organization. The remainder of this paper is organized as follows. We introduce the foundational techniques and challenges of 3D generative models in Section 2. Section 3 comprehensively elaborates on the 3D medical applications of both unconditional and conditional generative models, including unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, and registration. Then, Section 4 discusses the above-surveyed applications and gives four future directions. Finally, Section 5 concludes the paper.
2 BACKGROUND
2.1 Related Survey
As generative models find increasing use in the medical field, many survey papers have been published to provide overviews [4, 5, 21, 79, 89, 234]. In Table 1, we differentiate our survey from existing ones by comparing key elements such as Model(s), Organ(s), Image Format, and Application(s). The distinct contributions and unique aspects of our survey are summarized below:
Table 1. Comparison with existing survey papers on medical image generation.

| Publication | Year | Model(s) | Organ(s) | Image Format | Application(s) |
| Yi et al. [234] | 2019 | GANs | All | mainly 2D, 3D | synthesis, reconstruction, segmentation, classification, detection, registration |
| Kazeminia et al. [89] | 2020 | GANs | All | mainly 2D, 3D | synthesis, segmentation, reconstruction, detection, de-noising, registration, classification |
| Alamir et al. [4] | 2022 | GANs | All | mainly 2D, 3D | cross-modality, segmentation, augmentation, reconstruction, detection, classification, registration |
| Chen et al. [21] | 2022 | GANs | All | 2D, 3D | augmentation |
| Iqbal et al. [79] | 2022 | GANs | All | 2D, 3D | segmentation |
| Jeong et al. [83] | 2022 | GANs | All | mainly 2D, 3D | classification, segmentation |
| Ali et al. [5] | 2022 | GANs | Brain | 2D, 3D | only statistics of all applications (no description) |
| This Survey | 2023 | GANs, AEs, Autoregressive | Brain, Heart | focus on 3D | unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, registration |
Fig. 3. Generation process in unconditional and conditional generative models. (a) Unconditional generation: noise z → generated image X. (b) Conditional generation: noise z and input data X1 → generated image X2.
• While the majority of existing surveys concentrate on Generative Adversarial Networks (GANs), our survey includes all three major types of generative models: GANs, Autoencoders (AEs), and Autoregressive models. Notably, a recent AE variant, Denoising Diffusion Probabilistic Models (DDPMs) [70], has outperformed GANs in the generation of natural images [40, 71]. Additionally, the Autoregressive Transformer [47] has shown promise in generating high-resolution images. Both AEs and Autoregressive models are poised to make significant contributions to medical image generation in the near future.
• We observed that there is a gap in the literature when it comes to survey papers focused specifically on brain and heart image generation. While Ali et al. [5] do cover brain MRI, their scope is limited to providing general statistics on demographics, applications, evaluations, and datasets. To our knowledge, no survey exists that focuses on heart image generation. Our survey aims to fill this gap.
• Given that GANs were initially developed for 2D images, several existing surveys [4, 83, 89, 234] primarily focus on 2D generative models. In contrast, our survey focuses on 3D volumes, the native image format for medical imaging. Utilizing this 3D format offers additional advantages for subsequent applications [74, 119].
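The DDPM mentioned in the first point above is easy to sketch at the level of its forward (noising) process. The snippet below is our own minimal illustration, not code from any surveyed paper: it applies the closed-form forward step x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with a linear beta schedule (the schedule values and the 4-voxel "volume" are toy assumptions), using only the Python standard library.

```python
import math
import random

def linear_beta_schedule(T, beta1=1e-4, betaT=0.02):
    """Linearly spaced noise variances beta_1..beta_T (toy values)."""
    return [beta1 + (betaT - beta1) * t / (T - 1) for t in range(T)]

def alpha_bar(betas, t):
    """Cumulative product alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    prod = 1.0
    for s in range(t + 1):
        prod *= 1.0 - betas[s]
    return prod

def forward_diffuse(x0, t, betas, rng):
    """Closed-form DDPM forward step:
    x_t = sqrt(alpha_bar_t)*x0 + sqrt(1-alpha_bar_t)*eps, eps ~ N(0, I)."""
    ab = alpha_bar(betas, t)
    return [math.sqrt(ab) * v + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
            for v in x0]

rng = random.Random(0)
betas = linear_beta_schedule(T=1000)
x0 = [1.0, -0.5, 0.25, 0.0]          # a toy 4-voxel "volume"
xT = forward_diffuse(x0, t=999, betas=betas, rng=rng)
# At t = T-1, alpha_bar is near 0, so x_T is almost pure Gaussian noise;
# the reverse (denoising) network is what a DDPM actually learns.
print(alpha_bar(betas, 999) < 1e-4)
```

A reverse process would iteratively denoise from such an x_T; the point here is only that the forward corruption is fixed and analytic, which is what distinguishes DDPMs from the adversarial training described in Section 2.3.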
2.2 Unconditional and Conditional Generative Models
In this paper, we divide generative models for 3D volume images into two categories: the unconditional model and the conditional model (Fig. 3), where we only show the generation process for simplicity. In an unconditional generative model (Fig. 3(a)), the input is a random noise variable z, and the output from the generation model is the generated image X. Several model architectures belong to this type of model. A GAN generator, for example, only uses random noise variables to synthesize images. Random Gaussian variables are input to the decoder of the VAE model. The unconditional generative model is used in several 3D medical applications, including unconditional synthesis [74, 100, 119, 207] and classification [14, 102, 153].
In a conditional generative model (Fig. 3(b)), in addition to the random noise z, informative input data X1 (e.g., semantic or visual input) is also fed to the generation model to help generate the output image X2. Depending on the application, a variety of data formats are supported for X1, including class labels [130], attributes [229], texts [164], and images [81]. For example, the original cGAN [130] generated synthetic MNIST [105] images conditioned on the class labels. Pix2pix [81] achieved image style transfer by training a conditional GAN whose generator and discriminator were both conditioned on the input images. Many 3D medical applications take advantage of the conditional generative model, including conditional synthesis [34, 111, 143, 152, 231, 235, 236], segmentation [28, 165, 242, 247], denoising [161, 213, 214, 216, 252], detection [61, 151, 197, 224], and registration [11, 160, 246, 250, 254].
From the data distribution perspective, the unconditional generative model captures the original realistic data distribution without requiring any additional information other than random noise. The conditional generative model can be regarded as a transformation from the input data distribution p(X1) to the output image distribution p(X2). By proposing a new unconditional/conditional taxonomy perspective, we hope researchers can gain valuable insights into future model design and apply them to target medical applications.
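The two-branch taxonomy can be made concrete with a toy sketch. Everything below is our own illustration, not from any surveyed model: the "generators" are fixed linear maps, and the function names, weight matrices, and one-hot conditioning are assumptions chosen only to show the interface difference between G(z) and G(z|c).

```python
import random

def generate_unconditional(z, W):
    """G(z): map a noise vector to a toy 'image' via a fixed linear layer W."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def generate_conditional(z, c, W):
    """G(z|c): same mapping, but the input is the noise concatenated with
    a one-hot condition vector, so the condition steers the output."""
    return generate_unconditional(z + c, W)

rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(4)]   # random noise input
c = [1.0, 0.0]                            # one-hot condition (class 0 of 2)
W_uncond = [[0.1 * (i + j) for j in range(4)] for i in range(3)]  # 3x4 weights
W_cond = [[0.1 * (i + j) for j in range(6)] for i in range(3)]    # 3x(4+2)

x = generate_unconditional(z, W_uncond)   # sample from p(X): noise only
x2 = generate_conditional(z, c, W_cond)   # sample from p(X2|X1): noise + condition
print(len(x), len(x2))                    # both outputs live in the same image space
```

The same contrast carries over to real architectures: a 3D GAN generator or VAE decoder consumes only z, while a conditional model additionally embeds X1 (a label, text, or image/volume) into the generator input.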
2.3 Generative Adversarial Networks
Generative Adversarial Networks were proposed by Goodfellow et al. [55] in 2014. The main idea is to design two networks (a discriminator and a generator) to contest with each other in a zero-sum two-player game. Specifically, the generator G takes a random noise variable z as the input to generate the synthesized images G(z). The role of the discriminator is to distinguish between the realistic images x and the generated fake images G(z). In an ideal case, the two-player game reaches a Nash equilibrium [48, 66] where the synthesized images G(z) are indistinguishable from real images x. The equilibrium is difficult to achieve in practice, and GAN training suffers from two problems, i.e., training instability [173, 191] and mode collapse [184, 192]. Diverse architectures and training strategies have been proposed to address these two problems and improve GAN performance [69, 172, 248, 251]. Below, we describe the variants that are most relevant to the generation of 3D medical volume images.
2.3.1 Vanilla GAN. The structure of the vanilla GAN is shown in Fig. 4(a). Based on a prior distribution of the input noise variable z ~ p_z(z), the generator trains its distribution p_g over x to approximate the real data distribution p_data. The discriminator maximizes the accuracy of classifying real/fake images by optimizing over D(x)/D(G(z)). The minimax optimization problem for G and D is defined as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]. \quad (1)
Theoretically, for arbitrary functions G and D, the optimal solution satisfies p_g = p_data and D(x) = D(G(z)) = 1/2. In practice, however, G and D are usually implemented by deep neural networks (e.g., multilayer perceptrons or convolutional neural networks), which have only limited capacity and cover a limited family of p_g distributions.
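Eq. (1) and the equilibrium value above can be checked numerically on toy discriminator outputs. This is an illustrative sketch of the value function only (the sample probabilities are invented, and no networks are trained): at the theoretical optimum D(·) = 1/2, the value is log(1/2) + log(1/2) = -log 4 ≈ -1.386.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical estimate of V(D, G) in Eq. (1):
    mean of log D(x) over real samples plus
    mean of log(1 - D(G(z))) over generated samples."""
    term_real = sum(math.log(d) for d in d_real) / len(d_real)
    term_fake = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return term_real + term_fake

# A strong discriminator: confident on real images, rejects fakes.
# Both log terms are then near 0, so V is near its maximum of 0.
print(gan_value([0.99, 0.98], [0.01, 0.02]) > -0.1)

# At the Nash equilibrium, D outputs 1/2 everywhere: V = -log 4.
v_eq = gan_value([0.5] * 4, [0.5] * 4)
print(abs(v_eq - (-math.log(4.0))) < 1e-9)
```

The generator's minimization pushes V back down toward -log 4 by making D(G(z)) rise toward 1/2, which is exactly the indistinguishability condition stated above.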
2.3.2 Conditional GAN. The vanilla GAN is unconditional, which means it cannot use additional information or control the modes of the generated data. Hence, the conditional GAN (cGAN) [130] was proposed to improve the flexibility of the vanilla GAN by conditioning on additional information. As shown in Fig. 4(b), the conditional input c is fed into both the generator G and the discriminator D (red line). The minimax optimization problem for D and G is then defined as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x|c)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z|c)))]. \quad (2)
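A common way to realize the conditioning in Eq. (2) is to concatenate a one-hot label c with the input of both networks, as in the original cGAN [130]. The sketch below is our own toy construction under that assumption (the `toy_d` rule and all sample values are invented for illustration; nothing here comes from a surveyed 3D model): a conditional discriminator scores a sample jointly with its label, so mismatched (sample, label) pairs count as fake.

```python
import math

def cgan_value(d_real_given_c, d_fake_given_c):
    """Empirical estimate of V(D, G) in Eq. (2); every discriminator
    score is computed on a sample already paired with its condition c."""
    real = sum(math.log(d) for d in d_real_given_c) / len(d_real_given_c)
    fake = sum(math.log(1.0 - d) for d in d_fake_given_c) / len(d_fake_given_c)
    return real + fake

def condition(x, c):
    """Concatenate a sample (or noise vector) with its one-hot label."""
    return x + c

def toy_d(pair):
    """Invented conditional discriminator: high score only when the
    sample's sign agrees with the one-hot label it is paired with."""
    x, c0, _c1 = pair
    return 0.9 if (x > 0) == (c0 == 1.0) else 0.1

real_pairs = [condition([1.0], [1.0, 0.0]),    # class-0 sample, class-0 label
              condition([-1.0], [0.0, 1.0])]   # class-1 sample, class-1 label
fake_pairs = [condition([-1.0], [1.0, 0.0])]   # label says class 0, sample does not
print(round(cgan_value([toy_d(p) for p in real_pairs],
                       [toy_d(p) for p in fake_pairs]), 3))  # -0.211
```

Because D judges the pair rather than the sample alone, the generator is driven not just to produce realistic volumes but to produce volumes consistent with c, which is the mechanism the conditional-synthesis, segmentation, and registration applications of Section 3 build on.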