3D Brain and Heart Volume Generative Models: A Survey
YANBIN LIU, Harry Perkins Institute of Medical Research, Department of Computer Science and Software Engineering, The University of Western Australia, Australia
GIRISH DWIVEDI∗, Harry Perkins Institute of Medical Research, The University of Western Australia, Fiona Stanley Hospital, Australia
FARID BOUSSAID, Department of Electrical, Electronic and Computer Engineering, The University of Western Australia, Australia
MOHAMMED BENNAMOUN, Department of Computer Science and Software Engineering, The University of Western Australia, Australia
Generative models such as generative adversarial networks and autoencoders have gained a great deal of attention in the medical field due to their excellent data generation capability. This paper provides a comprehensive survey of generative models for three-dimensional (3D) volumes, focusing on the brain and heart. A new and elaborate taxonomy of unconditional and conditional generative models is proposed to cover diverse medical tasks for the brain and heart: unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, and registration. We provide relevant background, examine each task and also suggest potential future directions. A list of the latest publications will be updated on GitHub to keep up with the rapid influx of papers at https://github.com/csyanbin/3D-Medical-Generative-Survey.
CCS Concepts: • Computing methodologies → Computer vision; 3D imaging.
Additional Key Words and Phrases: generative models, three-dimensional, medical images, brain and heart
1 INTRODUCTION
A wide range of research fields has embraced deep learning (DL) in recent years, including image processing [52, 63, 64, 97], speech recognition [57, 68, 128], natural language processing [22, 30, 90, 190, 203], and robotics [149, 189]. Thus, the medical imaging community has put in significant efforts to take advantage of deep learning advances, and medical imaging research has made significant progress with respect to a variety of applications including classification [14, 98, 153, 154, 245], segmentation [46, 121, 137, 195], registration [160, 255], detection [150, 151, 197], denoising [161, 213, 214, 216, 252], and synthesis [34, 74, 78, 111, 119], as well as with various imaging modalities, including Computed Tomography (CT) [107, 217], ultrasound [117], Magnetic Resonance Imaging (MRI) [3, 123], and Positron Emission Tomography (PET) [163].
A large number of annotated training images, obtained with the aid of crowd-sourcing annotation platforms like Amazon Mechanical Turk [144], were required for deep learning to be successful in natural image processing. However, the complexity of collection procedures, the lack of experts, privacy concerns, and the mandatory requirement of patient consent make the annotation process a major bottleneck in medical imaging. To mitigate this issue, deep generative models (e.g., generative adversarial networks (GANs) [55] and variational autoencoders (VAEs) [92]) have been introduced to medical imaging. In these generative models, the original data distribution is mimicked so that realistic images are generated [74, 188, 207] or cross-modality synthesis can be achieved [111, 114, 237].

This work was supported by MRFF Frontier Health and Medical Research - RFRHPI000147.
Authors' addresses: Yanbin Liu, csyanbin@gmail.com, Harry Perkins Institute of Medical Research, Department of Computer Science and Software Engineering, The University of Western Australia, Canberra, ACT, Australia, 2601; Girish Dwivedi, Harry Perkins Institute of Medical Research, The University of Western Australia, Fiona Stanley Hospital, Perth, WA, Australia, girish.dwivedi@perkins.uwa.edu.au; Farid Boussaid, Department of Electrical, Electronic and Computer Engineering, The University of Western Australia, Perth, WA, Australia, farid.boussaid@uwa.edu.au; Mohammed Bennamoun, Department of Computer Science and Software Engineering, The University of Western Australia, Perth, WA, Australia, mohammed.bennamoun@uwa.edu.au.

arXiv:2210.05952v2 [eess.IV] 6 Dec 2023

Fig. 1. Statistics of the 3D brain and heart volume generative models. (a) Statistics of all publications according to medical applications. (b) Categorization by year of publication (2017-2022). Uncond. Syn.: Unconditional Synthesis, Cond. Syn.: Conditional Synthesis.
There have been numerous survey papers published on deep generative models for medical imaging due to the rapid progress of the field [4, 5, 21, 79, 83, 89, 219, 234]. These surveys cover different medical applications and provide an overall review of GANs on general medical image analysis [4, 89, 234]. Some focus on a specific application only, such as augmentation [21], segmentation [79, 83], and registration [219]. Others concentrate on a specific image modality, such as MRI [5]. Even though many surveys exist, we find that there is a lack of comprehensive surveys on three-dimensional (3D) medical volumes, which are the original data format of many medical modalities, such as MRI, CT, and PET. Moreover, existing surveys mainly focus on GANs, neglecting other effective generative models such as Autoencoders (AEs) [67] and Autoregressive models [202]. In Section 2.1, we provide a comprehensive comparison to existing survey papers in the field of medical image generation (as shown in Table 1) and detail what sets our survey apart.
This inspired us to conduct a comprehensive survey of generative models for 3D medical volume images of the brain and heart. Since 3D volume is the intrinsic representation of many medical imaging modalities, it displays the entire and thorough anatomical structure of organs, whereas a 2D medical image only shows a specific view/plane. GANs are widely used for 3D medical volumes, but there has also been an increased interest in AEs (e.g., Diffusion Models [91, 150]) and Autoregressive models (e.g., Autoregressive Transformers [151]). As a result, our survey covers all three types of generative models. As far as organs are concerned, we restrict our interest to the brain and heart for the following reasons: (1) they are two vital organs that control the mental and physiological functions of the human body; (2) both organs involve a wide range of applications, e.g., segmentation (Fig. 1(a)); (3) generative models are essential for both organs because of their data scarcity; (4) by covering these two organs, we are able to cover generative models for both static (i.e., brain) and dynamic (i.e., heart) organs.
Contributions. To provide a comprehensive and organized survey, we introduce a new taxonomy (Fig. 2) that divides generative models into unconditional (only taking a random variable as input) and conditional (taking an additional data modality as input). In Figure 1, we provide a statistical analysis of the proportion of publications per application and the number of publications per year. Our contributions can be summarized as follows:
• This is the first survey on generative models for 3D medical volume images, focusing on two important organs, i.e., the brain and the heart. It aims to bridge the gap between the research of the 3D generative models community and the research of the medical imaging community.
• We provide a new taxonomy (Fig. 2) of 3D generative models by categorizing them as unconditional or conditional generative models. Every category includes several relevant medical applications.
• Whilst most existing surveys focus on GANs only, we cover three main categories of generative models: GANs, AEs, and Autoregressive models.
• We discuss the key challenges and future directions of 3D medical generative models and applications.

Fig. 2. Proposed taxonomy of 3D generative models for the brain and the heart. The numbers in parentheses denote the publication year (after 2000).
• Unconditional
  – 3.1 Unconditional Synthesis: 3D-α-WGAN-GP (19), SlabGAN (19), Özbey et al. (20), 2D Slice VAE (20), 3D-StyleGAN (21), DCR-"-GAN (21), Shape+Texture GAN (21), Split&Shuffle-GAN (22), HA-GAN (22), DDM (22)
  – 3.2 Classification: SCGANs (17), Biffi et al. (18), 3D-CapsNet (19), 3DPixelCNN (19), Puyol-Antón et al. (20)
• Conditional
  – 3.3 Conditional Synthesis: Pan et al. (18), 3D cGAN (18), Liao et al. (18), Joyce et al. (19), Ea-GANs (19), TPSDicyc (19), Zhang et al. (20), dEa-SA-GAN (20), 3D-RevGAN (20), QSMGAN (20), XCAT-GAN (20), MCMT-GAN (20), CNet (20), Lin et al. (21), CAE-ACGAN (21), CACR-Net (21), DiCyc (21), ProvoGAN (22), ResViT (22), Subramaniam et al. (22), CounterSynth (22), HDL (22), Qiao et al. (22)
  – 3.4 Segmentation: Salehi et al. (17), Myronenko (18), S3D-UNet (18), voxel-GAN (18), Mondal et al. (18), Yang et al. (18), MuTGAN (18), VoxelAtlasGAN (18), Zhang et al. (18), Liu et al. (19), RP-Net (19), DSTGAN (20), PSCGAN (20), Yuan et al. (20), 3D DR-UNet (20), Peng et al. (20), SASSNet (20), Vox2Vox (20), Kolarik et al. (21), FM-Pre-ResNet (21), Ullah et al. (21), MVSGAN (21), Zhang et al. (21), DAR-UNet (22), Bustamante et al. (22), Xing et al. (22)
  – 3.5 Denoising: Wolterink et al. (17), LA-GANs (18), 3D c-GANs (18), RED-WGAN (19), SGSGAN (22)
  – 3.6 Detection: Uzunova et al. (19), MADGAN (21), 3D MTGA (22), Pinaya et al. (22)
  – 3.7 Registration: VTN (19), VoxelMorph (19), Deform-GAN (20), Zhu et al. (21), Krebs et al. (21), Ramon et al. (22), TGAN (22)
Paper Organization. The remainder of this paper is organized as follows. We introduce the foundational techniques and challenges of 3D generative models in Section 2. Section 3 comprehensively elaborates on the 3D medical applications of both unconditional and conditional generative models, including unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, and registration. Then, Section 4 discusses the above-surveyed applications and gives four future directions. Finally, Section 5 concludes the paper.
2 BACKGROUND
2.1 Related Survey
As generative models find increasing use in the medical field, many survey papers have been published to provide overviews [4, 5, 21, 79, 89, 234]. In Table 1, we differentiate our survey from existing ones by comparing key elements such as Model(s), Organ(s), Image Format, and Application(s). The distinct contributions and unique aspects of our survey are summarized below:
Table 1. Comparison with existing survey papers on medical image generation.

| Publication | Year | Model(s) | Organ(s) | Image Format | Application(s) |
| Yi et al. [234] | 2019 | GANs | All | mainly 2D, 3D | synthesis, reconstruction, segmentation, classification, detection, registration |
| Kazeminia et al. [89] | 2020 | GANs | All | mainly 2D, 3D | synthesis, segmentation, reconstruction, detection, de-noising, registration, classification |
| Alamir et al. [4] | 2022 | GANs | All | mainly 2D, 3D | cross-modality, segmentation, augmentation, reconstruction, detection, classification, registration |
| Chen et al. [21] | 2022 | GANs | All | 2D, 3D | augmentation |
| Iqbal et al. [79] | 2022 | GANs | All | 2D, 3D | segmentation |
| Jeong et al. [83] | 2022 | GANs | All | mainly 2D, 3D | classification, segmentation |
| Ali et al. [5] | 2022 | GANs | Brain | 2D, 3D | only statistics of all applications (no description) |
| This Survey | 2023 | GANs, AEs, Autoregressive | Brain, Heart | focus on 3D | unconditional synthesis, classification, conditional synthesis, segmentation, denoising, detection, registration |
Fig. 3. Generation process in unconditional and conditional generative models. (a) Unconditional generation: noise z → generated image X. (b) Conditional generation: noise z and input data X1 → generated image X2.
• While the majority of existing surveys concentrate on Generative Adversarial Networks (GANs), our survey includes all three major types of generative models: GANs, Autoencoders (AEs), and Autoregressive models. Notably, a recent AE variant, Denoising Diffusion Probabilistic Models (DDPMs) [70], has outperformed GANs in the generation of natural images [40, 71]. Additionally, the Autoregressive Transformer [47] has shown promise in generating high-resolution images. Both AEs and Autoregressive models are poised to make significant contributions to medical image generation in the near future.
• We observed that there is a gap in the literature when it comes to survey papers focused specifically on brain and heart image generation. While Ali et al. [5] do cover brain MRI, their scope is limited to providing general statistics on demographics, applications, evaluations, and datasets. To our knowledge, no survey exists that focuses on heart image generation. Our survey aims to fill this gap.
• Given that GANs were initially developed for 2D images, several existing surveys [4, 83, 89, 234] primarily focus on 2D generative models. In contrast, our survey focuses on 3D volumes, the native image format for medical imaging. Utilizing this 3D format offers additional advantages for subsequent applications [74, 119].
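The DDPM mentioned in the first point above is easy to sketch at the level of its forward (noising) process. The snippet below is our own minimal illustration, not code from any surveyed paper: it applies the closed-form forward step x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with a linear beta schedule (the schedule values and the 4-voxel "volume" are toy assumptions), using only the Python standard library.

```python
import math
import random

def linear_beta_schedule(T, beta1=1e-4, betaT=0.02):
    """Linearly spaced noise variances beta_1..beta_T (toy values)."""
    return [beta1 + (betaT - beta1) * t / (T - 1) for t in range(T)]

def alpha_bar(betas, t):
    """Cumulative product alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    prod = 1.0
    for s in range(t + 1):
        prod *= 1.0 - betas[s]
    return prod

def forward_diffuse(x0, t, betas, rng):
    """Closed-form DDPM forward step:
    x_t = sqrt(alpha_bar_t)*x0 + sqrt(1-alpha_bar_t)*eps, eps ~ N(0, I)."""
    ab = alpha_bar(betas, t)
    return [math.sqrt(ab) * v + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
            for v in x0]

rng = random.Random(0)
betas = linear_beta_schedule(T=1000)
x0 = [1.0, -0.5, 0.25, 0.0]          # a toy 4-voxel "volume"
xT = forward_diffuse(x0, t=999, betas=betas, rng=rng)
# At t = T-1, alpha_bar is near 0, so x_T is almost pure Gaussian noise;
# the reverse (denoising) network is what a DDPM actually learns.
print(alpha_bar(betas, 999) < 1e-4)
```

A reverse process would iteratively denoise from such an x_T; the point here is only that the forward corruption is fixed and analytic, which is what distinguishes DDPMs from the adversarial training described in Section 2.3.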
2.2 Unconditional and Conditional Generative Models
In this paper, we divide generative models for 3D volume images into two categories: the unconditional model and the conditional model (Fig. 3), where we only show the generation process for simplicity. In an unconditional generative model (Fig. 3(a)), the input is a random noise variable z, and the output from the generation model is the generated image X. Several model architectures belong to this type of model. A GAN generator, for example, only uses random noise variables to synthesize images. Random Gaussian variables are input to the decoder of the VAE model. The unconditional generative model is used in several 3D medical applications, including unconditional synthesis [74, 100, 119, 207] and classification [14, 102, 153].
In a conditional generative model (Fig. 3(b)), in addition to the random noise z, informative input data X1 (e.g., semantic or visual input) is also fed to the generation model to help generate the output image X2. Depending on the application, a variety of data formats are supported for X1, including class labels [130], attributes [229], texts [164], and images [81]. For example, the original cGAN [130] generated synthetic MNIST [105] images conditioned on the class labels. Pix2pix [81] achieved image style transfer by training a conditional GAN whose generator and discriminator were both conditioned on the input images. Many 3D medical applications take advantage of the conditional generative model, including conditional synthesis [34, 111, 143, 152, 231, 235, 236], segmentation [28, 165, 242, 247], denoising [161, 213, 214, 216, 252], detection [61, 151, 197, 224], and registration [11, 160, 246, 250, 254].
From the data distribution perspective, the unconditional generative model captures the original realistic data distribution without requiring any additional information other than random noise. The conditional generative model can be regarded as a transformation from the input data distribution p(X1) to the output image distribution p(X2). By proposing a new unconditional/conditional taxonomy perspective, we hope researchers can gain valuable insights into future model design and apply them to target medical applications.
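The two-branch taxonomy can be made concrete with a toy sketch. Everything below is our own illustration, not from any surveyed model: the "generators" are fixed linear maps, and the function names, weight matrices, and one-hot conditioning are assumptions chosen only to show the interface difference between G(z) and G(z|c).

```python
import random

def generate_unconditional(z, W):
    """G(z): map a noise vector to a toy 'image' via a fixed linear layer W."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def generate_conditional(z, c, W):
    """G(z|c): same mapping, but the input is the noise concatenated with
    a one-hot condition vector, so the condition steers the output."""
    return generate_unconditional(z + c, W)

rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(4)]   # random noise input
c = [1.0, 0.0]                            # one-hot condition (class 0 of 2)
W_uncond = [[0.1 * (i + j) for j in range(4)] for i in range(3)]  # 3x4 weights
W_cond = [[0.1 * (i + j) for j in range(6)] for i in range(3)]    # 3x(4+2)

x = generate_unconditional(z, W_uncond)   # sample from p(X): noise only
x2 = generate_conditional(z, c, W_cond)   # sample from p(X2|X1): noise + condition
print(len(x), len(x2))                    # both outputs live in the same image space
```

The same contrast carries over to real architectures: a 3D GAN generator or VAE decoder consumes only z, while a conditional model additionally embeds X1 (a label, text, or image/volume) into the generator input.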
2.3 Generative Adversarial Networks
Generative Adversarial Networks were proposed by Goodfellow et al. [55] in 2014. The main idea is to design two networks (a discriminator and a generator) to contest with each other in a zero-sum two-player game. Specifically, the generator G takes a random noise variable z as the input to generate the synthesized images G(z). The role of the discriminator is to distinguish between the realistic images x and the generated fake images G(z). In an ideal case, the two-player game reaches a Nash equilibrium [48, 66] where the synthesized images G(z) are indistinguishable from real images x. The equilibrium is difficult to achieve in practice, and GAN training suffers from two problems, i.e., training instability [173, 191] and mode collapse [184, 192]. Diverse architectures and training strategies have been proposed to address these two problems and improve GAN performance [69, 172, 248, 251]. Below, we describe the variants that are most relevant to the generation of 3D medical volume images.
2.3.1 Vanilla GAN. The structure of the vanilla GAN is shown in Fig. 4(a). Based on a prior distribution of the input noise variable z ~ p_z(z), the generator trains its distribution p_g over x to approximate the real data distribution p_data. The discriminator maximizes the accuracy of classifying real/fake images by optimizing over D(x)/D(G(z)). The minimax optimization problem for G and D is defined as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]. \quad (1)
Theoretically, for arbitrary functions G and D, the optimal solution satisfies p_g = p_data and D(x) = D(G(z)) = 1/2. In practice, however, G and D are usually implemented by deep neural networks (e.g., multilayer perceptrons or convolutional neural networks), which have only limited capacity and cover a limited family of p_g distributions.
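Eq. (1) and the equilibrium value above can be checked numerically on toy discriminator outputs. This is an illustrative sketch of the value function only (the sample probabilities are invented, and no networks are trained): at the theoretical optimum D(·) = 1/2, the value is log(1/2) + log(1/2) = -log 4 ≈ -1.386.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical estimate of V(D, G) in Eq. (1):
    mean of log D(x) over real samples plus
    mean of log(1 - D(G(z))) over generated samples."""
    term_real = sum(math.log(d) for d in d_real) / len(d_real)
    term_fake = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return term_real + term_fake

# A strong discriminator: confident on real images, rejects fakes.
# Both log terms are then near 0, so V is near its maximum of 0.
print(gan_value([0.99, 0.98], [0.01, 0.02]) > -0.1)

# At the Nash equilibrium, D outputs 1/2 everywhere: V = -log 4.
v_eq = gan_value([0.5] * 4, [0.5] * 4)
print(abs(v_eq - (-math.log(4.0))) < 1e-9)
```

The generator's minimization pushes V back down toward -log 4 by making D(G(z)) rise toward 1/2, which is exactly the indistinguishability condition stated above.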
2.3.2 Conditional GAN. The vanilla GAN is unconditional, which means it cannot use additional information or control the modes of the generated data. Hence, the conditional GAN (cGAN) [130] was proposed to improve the flexibility of the vanilla GAN by conditioning on additional information. As shown in Fig. 4(b), the conditional input c is fed into both the generator G and the discriminator D (red line). The minimax optimization problem for D and G is then defined as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x|c)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z|c)))]. \quad (2)
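A common way to realize the conditioning in Eq. (2) is to concatenate a one-hot label c with the input of both networks, as in the original cGAN [130]. The sketch below is our own toy construction under that assumption (the `toy_d` rule and all sample values are invented for illustration; nothing here comes from a surveyed 3D model): a conditional discriminator scores a sample jointly with its label, so mismatched (sample, label) pairs count as fake.

```python
import math

def cgan_value(d_real_given_c, d_fake_given_c):
    """Empirical estimate of V(D, G) in Eq. (2); every discriminator
    score is computed on a sample already paired with its condition c."""
    real = sum(math.log(d) for d in d_real_given_c) / len(d_real_given_c)
    fake = sum(math.log(1.0 - d) for d in d_fake_given_c) / len(d_fake_given_c)
    return real + fake

def condition(x, c):
    """Concatenate a sample (or noise vector) with its one-hot label."""
    return x + c

def toy_d(pair):
    """Invented conditional discriminator: high score only when the
    sample's sign agrees with the one-hot label it is paired with."""
    x, c0, _c1 = pair
    return 0.9 if (x > 0) == (c0 == 1.0) else 0.1

real_pairs = [condition([1.0], [1.0, 0.0]),    # class-0 sample, class-0 label
              condition([-1.0], [0.0, 1.0])]   # class-1 sample, class-1 label
fake_pairs = [condition([-1.0], [1.0, 0.0])]   # label says class 0, sample does not
print(round(cgan_value([toy_d(p) for p in real_pairs],
                       [toy_d(p) for p in fake_pairs]), 3))  # -0.211
```

Because D judges the pair rather than the sample alone, the generator is driven not just to produce realistic volumes but to produce volumes consistent with c, which is the mechanism the conditional-synthesis, segmentation, and registration applications of Section 3 build on.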