UGformer for Robust Left Atrium and Scar
Segmentation Across Scanners†
Tianyi Liu1, Size Hou3, Jiayuan Zhu2, Zilong Zhao1, and Haochuan Jiang*1
1School of Robotics
2School of Artificial Intelligence and Advanced Computing
XJTLU Entrepreneur College (Taicang), Suzhou, Jiangsu, 215412, P.R. China
3School of Science, Xi’an Jiaotong-Liverpool University, SIP, Suzhou, Jiangsu,
215123, P.R.China
Abstract. Thanks to their capacity for long-range dependencies and robustness to irregular shapes, vision transformers and deformable convolutions are emerging as powerful techniques for segmentation. Meanwhile, Graph Convolution Networks (GCNs) optimize local features based on global topological relationship modeling; in particular, they have proven effective in medical image segmentation tasks, including multi-domain generalization for low-quality images. In this paper, we present a novel, effective, and robust framework for medical image segmentation, namely UGformer. It unifies novel transformer blocks, GCN bridges, and convolution decoders originating from U-Net to predict left atriums (LAs) and LA scars. The proposed UGformer offers two appealing contributions: 1) an enhanced transformer module with deformable convolutions that improves the blending of transformer and convolutional information and helps predict irregular LA and scar shapes; 2) a bridge incorporating a GCN that overcomes the difficulty of capturing condition inconsistency across different Magnetic Resonance Imaging (MRI) scanners with varying domain information. The proposed UGformer exhibits an outstanding ability to segment the left atrium and scars on the LAScarQS 2022 dataset, outperforming several recent state-of-the-art methods.
Keywords: Left atrium segmentation, scar prediction, Transformer, Graph
convolution model
1 Introduction
Late gadolinium enhancement magnetic resonance imaging (LGE-MRI) is typically used to provide quantitative information on atrial scars [25]. In such measurements, the location and size of lesions in the left atrium (LA) indicate the pathology (i.e., LA scars) and the progression of atrial fibrillation [12].
† This research is funded by XJTLU Research Development Funding 20-02-60. Computational resources used in this research are provided by the School of Robotics, XJTLU Entrepreneur College (Taicang), Xi’an Jiaotong-Liverpool University.
arXiv:2210.05151v1 [cs.CV] 11 Oct 2022
Nowadays, deep learning models have been widely used to segment LA cavities and quantify LA scars from LGE-MRIs [3], helping radiologists with initial screening for quick pathology detection. Meanwhile, LGE-MRIs are often collected by multiple scanners, possibly at low imaging quality. Each scanner produces inconsistent domain information [14], including different contrasts and spatial resolutions (Fig. 1). Promoting the generalization of a segmentation model against such domain inconsistency thus becomes another challenge.
Fig. 1. Typical examples of the LAScarQS dataset [14,15,16] with various contrasts: (a) proper contrast, (b) low contrast; and different spatial resolutions: (c) 886 × 864, (d) 480 × 480.
Essentially, semantic segmentation is a mapping from input images to output pixel labels through an empirically designed segmentation model. Recent computer vision research communities have witnessed great achievements brought by Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) [4,10]. However, there is a lack of theoretical explanations to guarantee their prediction and generalization performance [2]. Besides, human anatomies (e.g., LAs) and pathologies (e.g., LA scars) have no fixed shape. Atlas-based segmentation strategies therefore cannot be utilized ideally [30,13], while ordinary CNNs are not good at predicting deformable objects either [22].
Conventional CNN-based segmentation models capture only local dependencies, since a convolutional kernel only sees visual information from neighboring pixels within its receptive field, ignoring the picture as a whole [21]. Common pooling layers in CNNs further degrade spatial information, since they collapse neighboring pixels into a single pixel. These losses of spatial information restrict the prediction performance of conventional CNN models [26].
Fortunately, Graph Convolutional Networks (GCNs) promise to address these challenges effectively by leveraging the robustness brought by topological properties [11]. The topological relationships extracted by a GCN during representation learning have proven more stable across application scenarios than the geometric relationships learned by general vision models such as CNNs and ViTs [1]. In addition to the local features extracted by CNNs, a GCN also provides an approach to model the relationships among different local features. It optimizes local features of low-quality images by Laplacian smoothing to a certain extent [9], which helps promote generalization across data from different domains.
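The Laplacian-smoothing effect mentioned above can be illustrated with a small sketch. This toy example is ours, not the paper's implementation: the graph, the features, and the mixing weight `alpha` are all illustrative assumptions. One propagation step over a normalized adjacency pulls each node's feature toward the average of its neighbors, which is what damps the noise in low-quality local features.

```python
import numpy as np

def laplacian_smooth(features, adjacency, alpha=0.5):
    """One step of graph Laplacian smoothing: each node's feature is
    pulled toward the mean of its neighbours' features."""
    # Symmetrically normalised adjacency with self-loops, as in GCNs.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt
    # Blend each feature with its neighbourhood average.
    return (1 - alpha) * features + alpha * (norm_adj @ features)

# Toy example: 4 patch features on a path graph 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0], [5.0], [5.0], [9.0]])
smoothed = laplacian_smooth(feats, adj)
print(smoothed.ravel())  # node values move toward their neighbours
```

After one step the feature variance across nodes shrinks; repeated steps drive connected nodes toward a common value, which is the stabilizing behavior exploited for noisy, low-quality inputs.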
Meanwhile, ViT models have recently become popular in semantic segmentation for their handling of long-range dependencies, modeling spatial image information through the self-attention mechanism [24]. The Swin Transformer [17] and SegFormer [27] are two pioneering approaches to engaging ViTs in segmentation tasks. The Swin Transformer employs sliding-window attention, achieving the localization of convolutional operations while saving computation time. SegFormer connects the transformer to lightweight multi-layer perceptron decoders, allowing it to combine local and global attention. In medical image segmentation, TransUnet [4], UTnet [7], and LeViT-Unet [28] are among the first attempts to integrate ViT modules into the U-Net [22] architecture. All of them achieve state-of-the-art segmentation performance on the Synapse dataset [23].
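The self-attention mechanism underlying these ViT models can be sketched minimally. This is an illustrative toy, not any of the cited implementations: the learned query, key, and value projections (and the multi-head split) are omitted, so attention is computed directly on the raw tokens. The point is that every output token is a softmax-weighted mixture of all input tokens, which is exactly the long-range dependency a local convolution lacks.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a token sequence.

    x: (n_tokens, d) array of patch embeddings. For brevity the query,
    key, and value projections are identity maps; a real ViT learns them.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # pairwise token similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                            # mix all tokens globally

# Four 8-dimensional "patch" tokens: each output row attends to every
# input row, regardless of spatial distance between the patches.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = self_attention(tokens)
print(out.shape)  # (4, 8)
```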
Fig. 2. Positions of LA and LA scars [16]
In terms of LA scar prediction, prior work predicts the LA and LA scars separately, without considering the relationship between them [16]. Meanwhile, the scars are relatively small, which makes their prediction difficult. Fortunately, LAs are much easier to predict, and LA scars are often detected near the identified LA boundaries (Fig. 2). Inspired by [29], we believe that combining the prediction of LAs and LA scars can improve scar segmentation performance.
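The observation that scars cluster near the identified LA boundary suggests restricting the scar search to a band around a predicted LA mask. The following is a hypothetical sketch of extracting such a band; the function name, the band width, and the shift-based morphology are our illustrative assumptions, not the paper's method.

```python
import numpy as np

def boundary_band(mask, width=1):
    """Return the band of pixels around the border of a binary mask.

    Computed as (dilation - erosion) with a 4-neighbourhood, implemented
    via array shifts so only NumPy is needed. A two-stage pipeline could
    restrict scar prediction to this band."""
    dil = mask.copy()
    ero = mask.copy()
    for _ in range(width):
        shifted = [np.roll(dil, s, axis=a) for a in (0, 1) for s in (-1, 1)]
        dil = np.maximum.reduce([dil] + shifted)   # grow the mask outward
        shifted = [np.roll(ero, s, axis=a) for a in (0, 1) for s in (-1, 1)]
        ero = np.minimum.reduce([ero] + shifted)   # shrink the mask inward
    return dil - ero

# Toy "LA" mask: a filled square; the band hugs its border on both sides.
mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = 1
band = boundary_band(mask)
print(band.sum())  # number of border-band pixels
```

Deep interior pixels and far-away background pixels are excluded, so a scar predictor operating only on the band faces a far smaller, better-conditioned search region.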
In this paper, we propose a novel U-shaped GCN with an Enhanced Transformer module (UGformer). It is a two-stage segmentation model that segments the LA before quantifying the irregularly shaped LA scars. It consists of a novel transformer block as the encoder, convolution blocks as the decoder, and skip-connections with a GCN as the bridge.
In the encoder, the novel transformer block, namely the enhanced transformer block (ETB), is built by replacing the single multi-head self-attention module with a parallel combination of multi-head self-attention (MHSA) and deformable convolutions (DCs). It models global spatial attention while handling irregular shape information, leveraging the advantages of both convolutions and transformers, i.e., proper generalization ability and sufficient model capacity [26]. The bridge with the GCN connection optimizes the fusion of long-range and contextual information between the encoder and the decoder [9]. It continuously strengthens the representation of intermediate feature maps to find a low-dimensional invariant topology, improving the extrapolation ability of segmentation models.
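The parallel fusion idea behind the ETB can be sketched at shape level. This is a toy approximation of ours, not the paper's block: the deformable convolution is stood in for by a fixed 3×3 average filter (a real DC learns per-location sampling offsets), and all learned projections and normalization layers are omitted. The sketch only shows how a global attention branch and a local convolutional branch can process the same feature map in parallel and be fused by addition.

```python
import numpy as np

def attention_branch(fmap):
    """Global branch: scaled dot-product attention over flattened pixels."""
    h, w, c = fmap.shape
    x = fmap.reshape(h * w, c)
    scores = x @ x.T / np.sqrt(c)
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ x).reshape(h, w, c)

def conv_branch(fmap):
    """Local branch: a fixed 3x3 average filter stands in for the
    deformable convolution, whose sampling offsets would be learned."""
    shifts = [np.roll(np.roll(fmap, dy, axis=0), dx, axis=1)
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sum(shifts) / 9.0

def enhanced_block(fmap):
    """Parallel fusion: global attention plus local convolution."""
    return attention_branch(fmap) + conv_branch(fmap)

rng = np.random.default_rng(1)
feat = rng.normal(size=(4, 4, 8))   # toy H x W x C feature map
out = enhanced_block(feat)
print(out.shape)  # (4, 4, 8)
```

Summing the two branches keeps the output shape identical to the input, so the block drops into a U-shaped encoder wherever a plain transformer block would sit.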