Diversity-Promoting Ensemble for Medical Image Segmentation


Mariana-Iuliana Georgescu

University of Bucharest

Romania

Radu Tudor Ionescu∗

University of Bucharest

Romania

raducu.ionescu@gmail.com

Andreea-Iuliana Miron

“Carol Davila” University of Medicine and Pharmacy, Colţea Hospital

Romania

ABSTRACT

Medical image segmentation is an actively studied task in medical imaging, where the precision of the annotations is of utmost importance for accurate diagnosis and treatment. In recent years, the task has been approached with various deep learning systems, among the most popular models being U-Net. In this work, we propose a novel strategy to generate ensembles of different architectures for medical image segmentation, by leveraging the diversity (decorrelation) of the models forming the ensemble. More specifically, we utilize the Dice score among model pairs to estimate the correlation between the outputs of the two models forming each pair. To promote diversity, we select models with low Dice scores among each other. We carry out gastro-intestinal tract image segmentation experiments to compare our diversity-promoting ensemble (DiPE) with another strategy to create ensembles based on selecting the top scoring U-Net models. Our empirical results show that DiPE surpasses both the individual models and the ensemble creation strategy based on selecting the top scoring models.

CCS CONCEPTS

• Computing methodologies → Supervised learning; Image processing; Image segmentation; • Applied computing → Health informatics;

KEYWORDS

medical imaging; medical image segmentation; model ensemble; neural network ensemble; deep learning; neural networks; voting-based ensemble; plurality voting.

ACM Reference Format:

Mariana-Iuliana Georgescu, Radu Tudor Ionescu, and Andreea-Iuliana Miron.

2023. Diversity-Promoting Ensemble for Medical Image Segmentation. In

The 37th ACM/SIGAPP Symposium on Applied Computing (SAC ’23), March

27-April 2, 2023, Tallinn, Estonia. ACM, New York, NY, USA, Article 12927.99,

8 pages. https://doi.org/10.1145/3555776.3577682

∗Corresponding author.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

SAC ’23, March 27-April 2, 2023, Tallinn, Estonia

ACM ISBN 978-1-4503-9517-5/23/03. . . $15.00

https://doi.org/10.1145/3555776.3577682

1 INTRODUCTION

Physicians extensively use medical imaging techniques, e.g. Computed Tomography (CT), Magnetic Resonance Imaging (MRI) and Optical Coherence Tomography (OCT) [ ], as one of the least invasive investigation alternatives to diagnose lesions inside the human body. Segmenting (delimiting) regions of interest, such as organs or tumors, is often required for precise diagnosis and treatment. For example, a precise segmentation of a malignant tumor can lead to an accurate calibration of the radiation dosage in radiotherapy [ ]. In recent years, the medical image segmentation task has been approached with various deep learning systems, ranging from convolutional neural networks (CNNs) [ ] to transformers [ ]. Among these, U-Net [ ] remains one of the most popular methods. Although U-Net was introduced in 2015, it consistently received updates [ ], keeping its performance at a competitive level. However, using a single neural network to perform segmentation is not always the best solution. Indeed, constructing ensembles of multiple neural networks is an extensively validated method [2, 7, 19, 33] to boost accuracy.

Since the precision of the medical image segmentation output is of utmost importance for accurate diagnosis and treatment, we focus on combining multiple U-Net architectures to address the task. We conjecture that decorrelated models lead to a superior ensemble, since decorrelated models can better complement each other’s decisions. To this end, we propose a novel strategy to construct ensembles of different models for medical image segmentation by promoting the diversity (decorrelation) of the models comprising the ensemble, while also giving equal importance to accuracy. To measure the correlation between two models, we compute the Dice score between the outputs of the respective models. We then construct the ensemble in a bottom-up fashion, starting from the best model and gradually adding, one by one, the models that are least correlated with those already included. At the same time, our ensemble creation strategy assigns equal importance to the performance level of the model to be added at each step. Since we select models with lower Dice scores at each step, our strategy promotes the diversity among the models comprising the ensemble, hence bearing the name Diversity-Promoting Ensemble (DiPE).
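The bottom-up construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `dice`, `select_diverse`, `preds` and `val_score` are hypothetical, masks are assumed to be flattened binary lists, and the 0.5/0.5 weighting is only one possible reading of "equal importance" between accuracy and decorrelation.

```python
from typing import Dict, List, Sequence

Mask = Sequence[int]  # flattened binary mask, 1 = foreground (assumption)


def dice(a: Mask, b: Mask) -> float:
    """Dice score 2|A∩B| / (|A| + |B|) between two binary masks."""
    inter = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * inter / total


def select_diverse(preds: Dict[str, Mask],
                   val_score: Dict[str, float],
                   size: int) -> List[str]:
    """Greedy bottom-up selection with a hypothetical scoring rule."""
    # Start from the best individual model.
    chosen = [max(val_score, key=val_score.get)]
    while len(chosen) < min(size, len(preds)):
        def gain(name: str) -> float:
            # Mean Dice with the models already in the ensemble (correlation proxy).
            mean_dice = sum(dice(preds[name], preds[c]) for c in chosen) / len(chosen)
            # Assumed equal weighting of accuracy and decorrelation.
            return 0.5 * val_score[name] + 0.5 * (1.0 - mean_dice)

        rest = [n for n in preds if n not in chosen]
        chosen.append(max(rest, key=gain))
    return chosen
```

In this toy setting, a candidate whose output duplicates an already selected model receives no diversity credit, so a slightly weaker but less correlated model can be preferred over it.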

We conduct image segmentation experiments on the gastro-intestinal tract data set provided by the UW-Madison Carbone Cancer Center [ ]. We evaluate nine individual U-Net models based on three different backbones (ResNet-34 [ ], EfficientNet-B0 [ ], EfficientNet-B1 [ ]) with or without multi-head convolutional attention [ ]. Along with the individual models, we evaluate two strategies to create voting-based ensembles, namely (i) a baseline (conventional) strategy selecting the top scoring models and (ii) our strategy promoting diversity among the selected models. The empirical results indicate that our strategy, DiPE, outperforms both the individual models and the baseline ensemble.
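As a rough illustration of a voting-based ensemble, the sketch below fuses the label maps of several models by pixel-wise plurality voting. The function name `plurality_vote`, the flattened label-map representation and the tie-breaking rule are assumptions made for this example; the excerpt does not specify how the authors implement the vote.

```python
from collections import Counter
from typing import List, Sequence


def plurality_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Fuse flattened per-model label maps by keeping the most frequent label per pixel."""
    fused = []
    for votes in zip(*masks):
        # most_common orders equal counts by first occurrence,
        # so ties go to the model listed first (an assumption of this sketch).
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused
```

For example, with three models predicting labels (2, 1, 2) for a pixel, the fused label is 2.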

In summary, our contribution is twofold:

• We introduce a diversity-promoting strategy to create an ensemble of medical image segmentation models that are weakly correlated with each other, by leveraging the Dice score between the outputs of various models.

• We provide empirical evidence showing that our diversity-promoting ensemble leads to superior performance levels compared with individual models and the conventional strategy selecting the top scoring models.

arXiv:2210.12388v2 [eess.IV] 21 Dec 2022

2 RELATED WORK

Medical image segmentation can be divided into two tasks, with respect to the input image. Indeed, there are works that tackle the segmentation task on 2D images [ ], while others rely on 3D images [ ]. The works using 2D images as input naturally produce 2D slices as output, while the works using entire 3D volumes as input produce 3D volumes as output.

Perhaps the most popular architecture for 2D segmentation is U-Net [ ]. U-Net is a fully convolutional (conv) network designed for medical image segmentation. The architecture follows a “U” shape and is composed of a contracting and an expansive path. Each step of the expansive path is composed of an upsampling operation, a convolution layer which halves the number of feature maps, and a concatenation with the corresponding cropped feature maps from the contracting path. Seo et al. [ ] proposed the mU-Net model, a modified version of the U-Net architecture. mU-Net [ ] adds a residual path to the deconvolution operations, and an additional convolutional layer to the skip connections in order to extract high-level global features of small objects.

Chen et al. [ ] proposed the voxel-wise residual network (VoxResNet), a 3D CNN formed of 25 layers with residual connections. Multimodal and multi-level contextual information is introduced into the VoxResNet model. The multimodal information is added by concatenating multimodal data before giving it as input to the model. To improve the 3D segmentation performance of brain lesions, Kamnitsas et al. [ ] employed a 3D CNN comprising 11 layers with parallel convolutional pathways for multi-scale processing. Rather than modifying the layers of their architecture, Zhao et al. [ ] inserted a lesion-related spatial attention mechanism into the network.

In order to help physicians obtain better segmentation results, Luo et al. [ ] proposed interactive segmentation to further improve the performance of CNN models, even on unseen objects.

Closer to our study, the work of Gibson et al. [ ] shares the same target task, being focused on multi-organ abdominal segmentation. Gibson et al. [ ] presented a registration-free approach based on Dense V-Networks for multi-organ abdominal segmentation of 3D images. They also proposed a batch-wise spatial dropout to lower the memory usage and processing time of dropout.

Different from the aforementioned works, which are trained in a fully-supervised learning setting, there are several works [ ] proposing weakly-supervised learning frameworks. Zhou et al. [ ] found that data sets having only one organ annotated as the positive class, leaving the other organs as part of the background, attain misleading results in multi-organ segmentation, since the background class contains many organs. In order to alleviate this problem, Zhou et al. [ ] proposed a prior-aware neural network, incorporating anatomical priors on abdominal organ sizes into the training objective.

Similar to our approach proposing an ensemble of multiple networks to improve the segmentation results, Lyksborg [ ] proposed to use a model for each of the axial, sagittal and coronal planes, fusing the corresponding segmentations into a single 3D segmentation. Baldeon et al. [ ] proposed AdaEn-Net, an ensemble of networks that boosts the segmentation performance. AdaEn-Net [ ] first employs an ensemble of 2D and 3D models to predict the output segmentation. Then, it trains the 2D-3D ensemble on 𝑘 folds, obtaining 𝑘 models. The final segmentation mask is the average of the 𝑘 models forming the final ensemble.
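To make the averaging step concrete, here is a minimal sketch of fusing the outputs of 𝑘 fold models by averaging. It assumes each model emits a flattened foreground-probability map and that the mean is binarized at 0.5; the name `average_masks` and the threshold are illustrative choices and are not taken from AdaEn-Net.

```python
from typing import List, Sequence


def average_masks(prob_maps: Sequence[Sequence[float]],
                  threshold: float = 0.5) -> List[int]:
    """Average k per-model probability maps pixel-wise, then binarize."""
    k = len(prob_maps)
    # zip(*prob_maps) groups the k predictions made for the same pixel.
    mean = [sum(p) / k for p in zip(*prob_maps)]
    # Assumed decision rule: foreground if the mean probability reaches the threshold.
    return [1 if m >= threshold else 0 for m in mean]
```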

Different from previous works, such as [ ], which directly combined models into ensembles without taking into account their output correlation, we propose a novel ensemble creation algorithm which promotes the diversity among the models comprising the ensemble.

3 METHOD

3.1 Neural Architectures

To address our medical image segmentation task, we employ the well-known U-Net architecture [ ]. The U-Net architecture is a fully convolutional network that belongs to the family of encoder-decoder neural networks. In the encoding part, the spatial information is downsampled through convolution and pooling operations. In the decoding part, the spatial information is upsampled back to the original size via transposed convolutions. High-resolution features from the encoder are passed through skip connections and concatenated to the corresponding features from the decoder, thus infusing high-resolution information into the decoder. The introduction of skip connections gives the network its “U” shape. We further present our changes to the U-Net model, leading to a total of nine distinct model variants forming the basis of our ensemble.
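The encoder-decoder data flow can be traced with a small shape-bookkeeping sketch. It is a toy model under stated assumptions: every encoder stage halves the spatial size and doubles the channels, and every decoder stage does the reverse before concatenating the matching skip features. The function `unet_shapes` and its parameters are hypothetical and do not reproduce the channel widths of the ResNet or EfficientNet backbones used in the paper.

```python
from typing import List, Tuple


def unet_shapes(size: int, base_channels: int, depth: int) -> List[Tuple[str, int, int]]:
    """Return (stage, channels, spatial size) for a toy U-Net with `depth` levels."""
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth")
    trace = []
    skips = []
    ch, s = base_channels, size
    for d in range(depth):
        trace.append((f"enc{d}", ch, s))
        skips.append(ch)          # features saved for the skip connection
        ch, s = ch * 2, s // 2    # downsample by 2, double the channels
    trace.append(("bottleneck", ch, s))
    for d in reversed(range(depth)):
        ch, s = ch // 2, s * 2    # transposed conv: upsample by 2, halve the channels
        # Channel count right after concatenating the skip features;
        # the convs that follow are assumed to reduce it back to `ch`.
        trace.append((f"dec{d}", ch + skips[d], s))
    return trace
```

Calling `unet_shapes(64, 16, 2)` lists two encoder stages, the bottleneck, and two decoder stages whose spatial sizes mirror the encoder, ending at the input resolution.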

3.1.1 Backbone Variations. To build an ensemble of a diverse set of models, we first introduce variations in terms of the backbone architecture. Therefore, we try the following three encoder architectures: ResNet-34 [12], EfficientNet-B0 [27], and EfficientNet-B1 [27]. We choose ResNet-34 due to its fairly good trade-off between running time and accuracy level. The reason behind adding EfficientNet-B0 and EfficientNet-B1 into our study is the superior performance levels of these models compared to ResNet-34.

The residual network (ResNet) architecture was proposed by He et al. [ ]. ResNet models are composed of residual blocks. A residual block consists of a few stacked conv layers and a skip connection from the first layer to the last layer of the block. Skip connections allow the training of very deep neural networks, alleviating the vanishing gradient problem. He et al. [ ] proposed five ResNet variants of different depth, namely ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152. Among these, we select ResNet-34 to serve as backbone for some of our U-Net models.
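The idea of a residual block can be shown with a deliberately simplified sketch in which the stacked layers are plain Python functions on a list of floats rather than conv layers. The name `residual_block` and this vector representation are assumptions for illustration only; the point is that the block input is added back to the output of the stacked layers.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]  # stand-in for a conv layer


def residual_block(x: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Apply the stacked layers, then add the block input back (identity skip)."""
    out = list(x)
    for layer in layers:
        out = layer(out)
    if len(out) != len(x):
        raise ValueError("identity skip needs matching input and output sizes")
    # Skip connection: the input bypasses the layers and is summed with their output.
    return [o + xi for o, xi in zip(out, x)]
```

If the stacked layers output zeros, the block reduces to the identity, which is one intuition for why such blocks are easier to optimize in very deep networks.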

The EfficientNet architecture was introduced by Tan et al. [ ] to efficiently scale convolutional neural networks. Tan et al. [ ] demonstrated that, in order to obtain better performance under a certain computational budget, all three components of the network, namely the depth, the width and the resolution, should be
