MULTITASK BRAIN TUMOR INPAINTING WITH
DIFFUSION MODELS:
A METHODOLOGICAL REPORT
Pouria Rouzrokh1,2,*, Bardia Khosravi1,2,*,
Shahriar Faghani1, Mana Moassefi1, Sanaz Vahdati1,
Bradley J. Erickson1,+
(1) Mayo Clinic Artificial Intelligence Laboratory, Mayo Clinic, MN, USA
(2) Orthopedic Surgery Artificial Intelligence Laboratory, Mayo Clinic, MN, USA
(*) Co-first authors, (+) Corresponding author
Please email all correspondence to: bje@mayo.edu
Abstract
Despite the ever-increasing interest in applying deep learning (DL) models to medical imaging,
the typical scarcity and imbalance of medical datasets can severely impact the performance of
DL models. The generation of synthetic data that might be freely shared without compromising
patient privacy is a well-known technique for addressing these difficulties. Inpainting
algorithms are a subset of DL generative models that can alter one or more regions of an input
image while matching its surrounding context and, in certain cases, non-imaging input
conditions. Although the majority of inpainting techniques for medical imaging data use
generative adversarial networks (GANs), the performance of these algorithms is frequently
suboptimal due to their limited output variety, a problem that is already well-known for GANs.
Denoising diffusion probabilistic models (DDPMs) are a recently introduced family of
generative networks that can generate results of comparable quality to GANs, but with diverse
outputs. In this paper, we describe a DDPM to execute multiple inpainting tasks on 2D axial
slices of brain MRI with various sequences, and present proof-of-concept examples of its
performance in a variety of evaluation scenarios. Our model and a public online interface to
try our tool are available here.
Figure 1. Expected target features of the multitask inpainting model. A) The model should be able to inpaint regions of
interest (ROIs) with pre-determined tumoral components (task 1), a random tumor with undetermined components (task
2), or tumor-less (apparently normal) brain tissues (task 3). B) The model should be able to perform tasks 1 to 3 in the
same inference round. C) The model should be able to perform tasks 1 to 3 on two distinct modes of input ROIs, i.e.,
free-form ROIs and bounding box ROIs.
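To make the two ROI modes in panel C concrete, the following minimal Python sketch derives a bounding box ROI from a free-form ROI. It is an illustration only: the function name `bounding_box_mask` and the nested-list mask representation are assumptions made for this example, and the text above does not state how bounding boxes are actually produced.

```python
def bounding_box_mask(mask):
    """Return a mask whose filled rectangle tightly encloses all nonzero pixels.

    `mask` is a list of rows holding 0/1 values (a hypothetical representation
    of a free-form ROI). An empty mask yields an empty bounding box mask.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return [[0] * len(row) for row in mask]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)
    return [
        [1 if top <= r <= bottom and left <= c <= right else 0
         for c in range(len(row))]
        for r, row in enumerate(mask)
    ]


free_form = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
box = bounding_box_mask(free_form)
```

In this toy case the two diagonal pixels of the free-form ROI expand into the 2×2 rectangle that encloses them.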
1 Introduction
The number of artificial intelligence (AI) and, in particular, machine learning (ML) publications related to
medical imaging has grown dramatically in recent years(1). A recent PubMed search with the
MeSH keywords "artificial intelligence" and "radiology" yielded 5,369 papers in 2021, which is more
than five times the number of results from the same search in 2011. From classification to semantic
segmentation, object detection, and image generation, ML models are constantly being developed to
improve healthcare efficiency and outcomes(2). In diagnostic radiology, for instance, there are
numerous published reports indicating that ML models may perform on par with or even better than
medical experts in certain tasks, such as anomaly detection and screening for pathologies(3,4). AI can
therefore assist radiologists and substantially reduce their workload, if applied properly(5).
Despite the growing interest in developing ML models for medical imaging, there are significant
challenges that can limit the practical applications of such models or even predispose them to
significant bias(6–8). Two of these challenges are data scarcity and data imbalance. On
the one hand, medical imaging datasets are often much smaller than the natural photograph datasets
like ImageNet, and pooling institutional datasets or making them publicly available may be impossible
due to patient privacy concerns. On the other hand, even those medical imaging datasets that are
available to data scientists are frequently imbalanced. In other words, the volume of medical imaging
data for patients with particular pathologies is substantially less than that of patients with common
pathologies or healthy individuals. Training or evaluating an ML model with insufficiently large or
imbalanced datasets may result in systemic biases in model performance(6).
In addition to the public release of deidentified medical imaging datasets and the endorsement of
strategies such as federated learning, which facilitates ML model development on
multi-institutional datasets without data sharing, synthetic image generation is one of the primary
strategies to combat both data scarcity and data imbalance(9). Generative ML models can learn
to generate realistic medical imaging data that do not belong to a real patient and can therefore be
shared publicly without compromising patient privacy. Since the emergence of generative adversarial
networks (GANs), various generative models have been introduced that are capable of synthesizing
high-quality synthetic data(10). The majority of these models generate unlabeled imaging data that
may be useful for certain use cases, such as self-supervised or semi-supervised training of downstream models.
Additionally, some other models are capable of conditional generation, which provides the ability to
generate an image based on predetermined clinical, textual, or imaging variables. The latter group of
generative models enables the production of labeled synthetic data, thereby advancing machine
learning research, medical imaging quality, and patient care.
Despite the enormous success of GANs in generating synthetic medical imaging data, these models
are frequently criticized for their lack of output diversity and unstable training. As a conventional
alternative to GANs, autoencoder deep learning models are easier to train and able to generate more
diversified outputs, but their synthetic results lack the image quality of GANs(11). Denoising Diffusion
Probabilistic Models (DDPMs), or diffusion models for short, are a new class of image generation
models that surpass GANs in terms of synthetic image quality and are comparable to autoencoders in
terms of output diversity(12,13). Based on Markov chain theory, diffusion models learn to generate
their synthetic outputs by gradually denoising an initial image filled with random Gaussian noise.
This iterative denoising process makes the inference runs of diffusion models significantly slower than
other generative models, but in exchange, it allows them to extract more representative features from
their input data, enabling them to outperform other models in the end(14).
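The noising and denoising idea described above can be sketched in a few lines of Python. This is a minimal illustration of the standard DDPM closed-form noising step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, and not the authors' implementation. The linear noise schedule, its start and end values, the number of steps, and all function names are assumptions chosen for the example.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] is the product of (1 - beta) up to and including step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Sample x_t from x_0 in closed form; also return the noise that was used."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e
             for x, e in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, noise, t, alpha_bars):
    """Invert add_noise when the noise is known exactly.

    A trained diffusion model only estimates this noise, so its
    reconstruction is approximate and is refined over many steps.
    """
    a = alpha_bars[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for x, e in zip(noisy, noise)]


alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)
x0 = [0.1, -0.5, 0.9]  # three toy pixel intensities
xt, eps = add_noise(x0, 500, alpha_bars, rng)
recovered = remove_noise(xt, eps, 500, alpha_bars)
```

Because `alpha_bars` shrinks toward zero as the step index grows, late steps are almost pure noise. That is why generation starts from random Gaussian noise and has to run many denoising iterations, which accounts for the slow inference noted above.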
Figure 2. An overview of our strategy to train the multitask brain tumor inpainting algorithm. A) A pair consisting of a
ground truth image and its corresponding tumor mask is read from the training set; B) the input mask is preprocessed
according to a randomization schema to generate five separate masks for distinct ROIs (from top to bottom: normal brain,
necrotic tumor core, tumoral edema, tumoral enhancement, and multi-component tumor); C) the input image is
preprocessed so that all pixels covered by at least one ROI are filled with random Gaussian noise. This preprocessed image
is concatenated with the five ROI masks generated in the previous step to create a six-channel tensor of size
6×256×256. Furthermore, a one-dimensional class vector is built to denote the ROI mode for each of the five ROI channels
in that tensor; D) the six-channel tensor and the class vector are fed to a diffusion model, which denoises the noisy
image by one step; E) the input image is similarly converted to a ground truth noisy image at the step that the output
of the diffusion model should correspond to; F) the output of the diffusion model is compared with the ground truth
noisy image to calculate the loss and optimize the model.
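Step C of the figure can be sketched in plain Python as follows. This is a hedged illustration and not the authors' code: the function name `build_model_input`, the nested-list image representation, the channel names, and the integer encoding of ROI modes (0 for an unused channel, 1 for free-form, 2 for bounding box) are all assumptions made for the example, and a small 4×4 image stands in for the 256×256 slices.

```python
import random

# Channel order follows the caption (top to bottom); the names are illustrative.
ROI_NAMES = ["normal", "necrotic_core", "edema", "enhancement", "multi_component"]


def build_model_input(image, roi_masks, roi_modes, rng):
    """Build the six-channel input and the class vector for one training sample.

    image:     H x W nested list of floats (one MRI slice)
    roi_masks: five binary H x W masks, one per entry in ROI_NAMES
    roi_modes: five integers, one ROI mode per mask (hypothetical encoding)
    """
    if len(roi_masks) != len(ROI_NAMES) or len(roi_modes) != len(ROI_NAMES):
        raise ValueError("expected one mask and one mode per ROI channel")
    height, width = len(image), len(image[0])
    noised = []
    for r in range(height):
        row = []
        for c in range(width):
            # A pixel covered by at least one ROI is replaced with Gaussian noise.
            covered = any(mask[r][c] for mask in roi_masks)
            row.append(rng.gauss(0.0, 1.0) if covered else image[r][c])
        noised.append(row)
    # Channel 0 is the noise-filled image; channels 1 to 5 are copies of the masks.
    tensor = [noised] + [[list(row) for row in mask] for mask in roi_masks]
    return tensor, list(roi_modes)


rng = random.Random(0)
image = [[0.5] * 4 for _ in range(4)]
empty = [[0] * 4 for _ in range(4)]
edema = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
tensor, class_vector = build_model_input(
    image, [empty, empty, edema, empty, empty], [0, 0, 2, 0, 0], rng
)
```

Here only the edema channel carries an ROI, so just the two pixels it covers are replaced by noise in channel 0, while the rest of the slice is passed through unchanged as context for the inpainting.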
In this methodological paper, we introduce a proof-of-concept diffusion model that can be used for
multitask brain tumor inpainting on multi-sequence brain magnetic resonance imaging (MRI) studies.
More precisely, we developed a diffusion model that can receive a two-dimensional (2D) axial slice
from a T1-weighted (T1), a contrast-enhanced T1-weighted (T1CE), a T2-weighted (T2), or a fluid
attenuated inversion recovery (FLAIR) sequence of a brain MRI and inpaint a user-defined cropped
area of that slice with a realistic and controllable image of either a high-grade glioma and its
corresponding components (e.g., the surrounding edema), or tumor-less (apparently normal) brain
tissues. The incidence of high-grade glioma is 3.56 per 100,000 population in the United States, and
there are only a few publicly available MRI datasets for brain tumors(15,16). In the context of such