

MULTITASK BRAIN TUMOR INPAINTING WITH DIFFUSION MODELS: A METHODOLOGICAL REPORT

Pouria Rouzrokh1,2,*, Bardia Khosravi1,2,*, Shahriar Faghani1, Mana Moassefi1, Sanaz Vahdati1, Bradley J. Erickson1,+

(1) Mayo Clinic Artificial Intelligence Laboratory, Mayo Clinic, MN, USA

(2) Orthopedic Surgery Artificial Intelligence Laboratory, Mayo Clinic, MN, USA

(*) Co-first authors, (+) Corresponding author

Please email all correspondence to: bje@mayo.edu

Abstract

Despite the ever-increasing interest in applying deep learning (DL) models to medical imaging, the typical scarcity and imbalance of medical datasets can severely impact the performance of DL models. The generation of synthetic data that might be freely shared without compromising patient privacy is a well-known technique for addressing these difficulties. Inpainting algorithms are a subset of DL generative models that can alter one or more regions of an input image while matching its surrounding context and, in certain cases, non-imaging input conditions. Although the majority of inpainting techniques for medical imaging data use generative adversarial networks (GANs), the performance of these algorithms is frequently suboptimal due to their limited output variety, a problem that is already well-known for GANs. Denoising diffusion probabilistic models (DDPMs) are a recently introduced family of generative networks that can generate results of comparable quality to GANs, but with diverse outputs. In this paper, we describe a DDPM to execute multiple inpainting tasks on 2D axial slices of brain MRI with various sequences, and present proof-of-concept examples of its performance in a variety of evaluation scenarios. Our model and a public online interface to try our tool are available here.

Figure 1. Expected target features from the multitask inpainting model. A) The model should be able to inpaint regions of interest (ROIs) for pre-determined tumoral components (task 1), a random tumor with undetermined components (task 2), or tumor-less (apparently normal) brain tissues (task 3). B) The model should be able to do tasks 1 to 3 in the same inference round. C) The model should be able to perform tasks 1 to 3 on two distinct modes of input ROIs, i.e., free-form ROIs and bounding box ROIs.

1 Introduction

The number of artificial intelligence (AI) and, in particular, machine learning (ML) publications related to medical imaging has expanded dramatically in recent years(1). A recent PubMed search with the MeSH keywords "artificial intelligence" and "radiology" yielded 5,369 papers in 2021, which is more than five times the number of results from the same search in 2011. From classification to semantic segmentation, object detection, and image generation, ML models are constantly being developed to improve healthcare efficiency and outcomes(2). In diagnostic radiology, for instance, there are numerous published reports indicating that ML models may perform on par with or even better than medical experts in certain tasks, such as anomaly detection and screening for pathologies(3,4). It is therefore undeniable that AI can assist radiologists and drastically cut their labor, if applied properly(5).

Despite the growing interest in developing ML models for medical imaging, there are significant challenges that can limit the practical applications of such models or even predispose them to significant bias(6–8). Two of these challenges are data scarcity and data imbalance. On the one hand, medical imaging datasets are often much smaller than natural photograph datasets such as ImageNet, and pooling institutional datasets or making them publicly available may be impossible due to patient privacy concerns. On the other hand, even those medical imaging datasets that are available to data scientists are frequently imbalanced. In other words, the volume of medical imaging data for patients with particular pathologies is substantially smaller than that for patients with common pathologies or healthy individuals. Training or evaluating an ML model with insufficiently large or imbalanced datasets may result in systemic biases in model performance(6).

In addition to the public release of deidentified medical imaging datasets and the endorsement of strategies such as federated learning, which facilitates ML model development on multi-institutional datasets without data sharing, synthetic image generation is one of the primary strategies to combat both data scarcity and data imbalance(9). Generative ML models can learn how to generate realistic medical imaging data that does not belong to a real patient and can therefore be shared publicly without compromising patient privacy. Since the emergence of generative adversarial networks (GANs), various generative models have been introduced that are capable of synthesizing high-quality synthetic data(10). The majority of these models generate unlabeled imaging data that may be useful for certain use cases, such as self-supervised or semi-supervised training of downstream models. Additionally, some other models are capable of conditional generation, which provides the ability to generate an image based on predetermined clinical, textual, or imaging variables. The latter group of generative models enables the production of labeled synthetic data, thereby advancing machine learning research, medical imaging quality, and patient care.

Despite the enormous success of GANs in generating synthetic medical imaging data, these models are frequently criticized for their lack of output diversity and unstable training. As a conventional alternative to GANs, autoencoder deep learning models are easier to train and able to generate more diversified outputs, but their synthetic results lack the image quality of GANs(11). Denoising Diffusion Probabilistic Models (DDPMs), or diffusion models for short, are a new class of image generation models that surpass GANs in terms of synthetic image quality and are comparable to autoencoders in terms of output diversity(12,13). Based on Markov chain theory, diffusion models learn to generate their synthetic outputs by gradually denoising an initial image packed with random Gaussian noise. This iterative denoising process makes the inference runs of diffusion models significantly slower than other generative models, but in exchange, it allows them to extract more representative features from their input data, enabling them to outperform other models in the end(14).
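The iterative denoising idea can be illustrated with a minimal NumPy sketch of a generic DDPM, not the model described in this report. It assumes the commonly used linear noise schedule and a noise-predicting denoiser; the function names, the schedule values, and the `denoiser` callable are illustrative assumptions. The forward process draws x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps in closed form, and sampling runs the reverse chain one step at a time, which is why inference is slow.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (a common DDPM default, assumed here)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise used."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def reverse_step(x_t, t, eps_pred, betas, alpha_bar, rng):
    """One ancestral sampling step x_t -> x_{t-1} given a noise prediction."""
    alpha_t = 1.0 - betas[t]
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_pred) / np.sqrt(alpha_t)
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def sample(denoiser, shape, betas, rng):
    """Start from pure Gaussian noise and denoise it step by step."""
    alpha_bar = np.cumprod(1.0 - betas)
    x = rng.standard_normal(shape)
    for t in reversed(range(len(betas))):
        # `denoiser(x, t)` stands in for a trained network (hypothetical).
        x = reverse_step(x, t, denoiser(x, t), betas, alpha_bar, rng)
    return x
```

With a trained network in place of `denoiser`, `sample` would need one network evaluation per step (1,000 under the schedule assumed above), compared with a single forward pass for a GAN generator.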

Figure 2. An overview of our strategy to train the multitask brain tumor inpainting algorithm. A) A pair of a ground truth image and its corresponding tumor mask is read from the training set; B) the input mask is preprocessed according to a randomization schema to generate five separate masks for distinct ROIs (from top to bottom: normal brain, necrotic tumor core, tumoral edema, tumoral enhancement, and multi-component tumor); C) the input image is preprocessed so that all pixels with at least one corresponding ROI are filled with random Gaussian noise. This preprocessed image is concatenated with the five mask ROIs developed in the previous step to create a six-channel tensor of size 6×256×256. Furthermore, a one-dimensional class vector is built to denote the ROI mode for each of the five ROI channels in the previous tensor; D) the six-channel tensor and the class vector are fed to a diffusion model, which denoises the noisy image by one step; E) the input image is similarly converted to a ground truth noisy image at the same step that the output of the diffusion model should reach; F) the output of the diffusion model is compared with the ground truth noisy image to calculate the loss and optimize the model.
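Steps B and C of Figure 2 can be sketched in NumPy as follows. This is a hedged illustration of the data layout only: the integer mode codes, the function names, and the bounding-box conversion are assumptions made for the example, and the randomization schema itself is not reproduced.

```python
import numpy as np

# Channel order follows the caption of Figure 2 (top to bottom).
ROI_CHANNELS = ("normal", "necrotic_core", "edema", "enhancement", "multi_component")
# Hypothetical integer codes for the ROI mode of each channel.
MODE_EMPTY, MODE_FREE_FORM, MODE_BOUNDING_BOX = 0, 1, 2

def to_bounding_box(mask):
    """Replace a free-form binary mask with its tight bounding box."""
    box = np.zeros_like(mask)
    rows, cols = np.nonzero(mask)
    if rows.size:
        box[rows.min():rows.max() + 1, cols.min():cols.max() + 1] = 1
    return box

def build_model_input(image, roi_masks, modes, rng):
    """image: (H, W) floats; roi_masks: (5, H, W) binary; modes: 5 mode codes."""
    roi_masks = np.asarray(roi_masks, dtype=np.float32).copy()
    for i, mode in enumerate(modes):
        if mode == MODE_BOUNDING_BOX:
            roi_masks[i] = to_bounding_box(roi_masks[i])
        elif mode == MODE_EMPTY:
            roi_masks[i] = 0.0
    # Fill every pixel covered by at least one ROI with Gaussian noise.
    any_roi = roi_masks.max(axis=0) > 0
    noised = image.astype(np.float32).copy()
    noised[any_roi] = rng.standard_normal(int(any_roi.sum())).astype(np.float32)
    # Stack the noised image on top of the five ROI masks: (6, H, W).
    tensor = np.concatenate([noised[None], roi_masks], axis=0)
    class_vector = np.asarray(modes, dtype=np.int64)
    return tensor, class_vector
```

For a 256×256 slice, the returned tensor has shape 6×256×256, matching the caption, and the class vector holds one mode code per ROI channel.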

In this methodological paper, we introduce a proof-of-concept diffusion model that can be used for multitask brain tumor inpainting on multi-sequential brain magnetic resonance imaging (MRI) studies. More precisely, we developed a diffusion model that can receive a two-dimensional (2D) axial slice from a T1-weighted (T1), a contrast-enhanced T1-weighted (T1CE), a T2-weighted (T2), or a fluid-attenuated inversion recovery (FLAIR) sequence of a brain MRI and inpaint a user-defined cropped area of that slice with a realistic and controllable image of either a high-grade glioma and its corresponding components (e.g., the surrounding edema), or tumor-less (apparently normal) brain tissues. The incidence of high-grade glioma is 3.56 per 100,000 population in the United States, and there are only a few publicly available MRI datasets for brain tumors(15,16). In the context of such
