UGformer for Robust Left Atrium and Scar
Segmentation Across Scanners†
Tianyi Liu1, Size Hou3, Jiayuan Zhu2, Zilong Zhao1, and Haochuan Jiang*1
1School of Robotics
2School of Artificial Intelligence and Advanced Computing
XJTLU Entrepreneur College (Taicang), Suzhou, Jiangsu, 215412, P.R. China
3School of Science, Xi’an Jiaotong-Liverpool University, SIP, Suzhou, Jiangsu,
215123, P.R.China
Abstract. Thanks to their capacity for long-range dependencies and robustness to irregular shapes, vision transformers and deformable convolutions are emerging as powerful techniques for segmentation. Meanwhile, Graph Convolution Networks (GCNs) optimize local features based on global topological relationship modeling; in particular, they have proven effective in medical image segmentation tasks, including multi-domain generalization for low-quality images. In this paper, we present a novel, effective, and robust framework for medical image segmentation, namely UGformer. It unifies novel transformer blocks, GCN bridges, and convolution decoders originating from U-Net to predict left atriums (LAs) and LA scars. The proposed UGformer offers two appealing contributions: 1) an enhanced transformer module with deformable convolutions that improves the blending of transformer and convolutional information and helps predict irregular LA and scar shapes; 2) a bridge incorporating a GCN that overcomes the difficulty of capturing condition inconsistency across different Magnetic Resonance Imaging (MRI) scanners with varying domain information. The proposed UGformer exhibits an outstanding ability to segment the left atrium and scars on the LAScarQS 2022 dataset, outperforming several recent state-of-the-art methods.
Keywords: Left atrium segmentation, scar prediction, Transformer, Graph
convolution model
1 Introduction
Late gadolinium enhancement magnetic resonance imaging (LGE-MRI) is typically used to provide quantitative information on atrial scars [25]. In such measurements, the location and size of lesions in the left atrium (LA) indicate the pathology (i.e., LA scars) and the progression of atrial fibrillation [12].
† This research is funded by XJTLU Research Development Funding 20-02-60. Computational resources used in this research are provided by the School of Robotics, XJTLU Entrepreneur College (Taicang), Xi’an Jiaotong-Liverpool University.
arXiv:2210.05151v1 [cs.CV] 11 Oct 2022
Nowadays, deep learning models have been widely used to segment LA cavities and quantify LA scars from LGE-MRIs [3], helping radiologists with initial screening for quick pathology detection. Meanwhile, LGE-MRIs are often collected by multiple scanners, possibly at low imaging quality. Each scanner produces inconsistent domain information [14], including different contrasts and spatial resolutions (Fig. 1). Promoting the generalization of a segmentation model against such domain inconsistency thus becomes another challenge.
Fig. 1. Typical examples of the LAScarQS dataset [14,15,16] with various contrasts: (a) proper contrast, (b) low contrast; and different spatial resolutions: (c) 886 × 864, (d) 480 × 480.
Essentially, semantic segmentation is a mapping from input images to output pixel labels through an empirically designed segmentation model. Recent computer vision research communities have witnessed great achievements brought by Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) [4,10]. However, there is a lack of theoretical explanations to guarantee their prediction and generalization performance [2]. Besides, human anatomies (e.g., LAs) and pathologies (e.g., LA scars) have no fixed shape. Atlas-based segmentation strategies therefore cannot be utilized ideally [30,13], while ordinary CNNs are not good at predicting deformable objects either [22].
Conventional CNN-based segmentation models capture only local dependencies, since a convolutional kernel only sees visual information from neighboring pixels within its receptive field, ignoring the picture as a whole [21]. Common pooling layers in CNNs further degrade spatial information, since they collapse neighboring pixels into a single pixel. These losses of spatial information restrict the prediction performance of conventional CNN models [26].
Fortunately, Graph Convolutional Networks (GCNs) promise to address these challenges effectively by leveraging the robustness brought by topological properties [11]. The topological relationships extracted by a GCN during representation learning have proven more stable across application scenarios than the geometric relationships learned by general vision models such as CNNs and ViTs [1]. In addition to the local features extracted by CNNs, a GCN also provides an approach to model the relationships among different local features. It optimizes local features of low-quality images by Laplacian smoothing to a certain extent [9], which helps promote generalization across data from different domains.
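The Laplacian-smoothing effect mentioned above can be illustrated with a small sketch. This toy example is ours, not the paper's implementation: the graph, the features, and the mixing weight `alpha` are all illustrative assumptions. One propagation step over a normalized adjacency pulls each node's feature toward the average of its neighbors, which is what damps the noise in low-quality local features.

```python
import numpy as np

def laplacian_smooth(features, adjacency, alpha=0.5):
    """One step of graph Laplacian smoothing: each node's feature is
    pulled toward the mean of its neighbours' features."""
    # Symmetrically normalised adjacency with self-loops, as in GCNs.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt
    # Blend each feature with its neighbourhood average.
    return (1 - alpha) * features + alpha * (norm_adj @ features)

# Toy example: 4 patch features on a path graph 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0], [5.0], [5.0], [9.0]])
smoothed = laplacian_smooth(feats, adj)
print(smoothed.ravel())  # node values move toward their neighbours
```

After one step the feature variance across nodes shrinks; repeated steps drive connected nodes toward a common value, which is the stabilizing behavior exploited for noisy, low-quality inputs.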
Meanwhile, ViT models have recently become popular in semantic segmentation for their handling of long-range dependencies, modeling spatial image information through the self-attention mechanism [24]. The Swin Transformer [17] and SegFormer [27] are two pioneering approaches to engaging ViTs in segmentation tasks. The Swin Transformer employs sliding-window attention, achieving the localization of convolutional operations while saving computation time. SegFormer connects the transformer to lightweight multi-layer perceptron decoders, allowing it to combine local and global attention. In medical image segmentation, TransUnet [4], UTnet [7], and LeViT-Unet [28] are among the first attempts to integrate ViT modules into the U-Net [22] architecture. All of them achieve state-of-the-art segmentation performance on the Synapse dataset [23].
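The self-attention mechanism underlying these ViT models can be sketched minimally. This is an illustrative toy, not any of the cited implementations: the learned query, key, and value projections (and the multi-head split) are omitted, so attention is computed directly on the raw tokens. The point is that every output token is a softmax-weighted mixture of all input tokens, which is exactly the long-range dependency a local convolution lacks.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a token sequence.

    x: (n_tokens, d) array of patch embeddings. For brevity the query,
    key, and value projections are identity maps; a real ViT learns them.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # pairwise token similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                            # mix all tokens globally

# Four 8-dimensional "patch" tokens: each output row attends to every
# input row, regardless of spatial distance between the patches.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = self_attention(tokens)
print(out.shape)  # (4, 8)
```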
Fig. 2. Positions of LA and LA scars [16]
In terms of LA scar prediction, prior work predicts the LA and LA scars separately, without considering the relationship between them [16]. Meanwhile, the scars are relatively small, which makes their prediction difficult. Fortunately, LAs are much easier to predict, and LA scars are often detected near the identified LA boundaries (Fig. 2). Inspired by [29], we believe that combining the prediction of LAs and LA scars can improve scar segmentation performance.
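The observation that scars cluster near the identified LA boundary suggests restricting the scar search to a band around a predicted LA mask. The following is a hypothetical sketch of extracting such a band; the function name, the band width, and the shift-based morphology are our illustrative assumptions, not the paper's method.

```python
import numpy as np

def boundary_band(mask, width=1):
    """Return the band of pixels around the border of a binary mask.

    Computed as (dilation - erosion) with a 4-neighbourhood, implemented
    via array shifts so only NumPy is needed. A two-stage pipeline could
    restrict scar prediction to this band."""
    dil = mask.copy()
    ero = mask.copy()
    for _ in range(width):
        shifted = [np.roll(dil, s, axis=a) for a in (0, 1) for s in (-1, 1)]
        dil = np.maximum.reduce([dil] + shifted)   # grow the mask outward
        shifted = [np.roll(ero, s, axis=a) for a in (0, 1) for s in (-1, 1)]
        ero = np.minimum.reduce([ero] + shifted)   # shrink the mask inward
    return dil - ero

# Toy "LA" mask: a filled square; the band hugs its border on both sides.
mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = 1
band = boundary_band(mask)
print(band.sum())  # number of border-band pixels
```

Deep interior pixels and far-away background pixels are excluded, so a scar predictor operating only on the band faces a far smaller, better-conditioned search region.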
In this paper, we propose a novel U-shaped GCN with an Enhanced Transformer module (UGformer). It is a two-stage segmentation model that segments the LA before quantifying the irregularly shaped LA scars. It consists of a novel transformer block as the encoder, convolution blocks as the decoder, and skip-connections with a GCN as the bridge.
In the encoder, the novel transformer block, namely the enhanced transformer block (ETB), is built by replacing the single multi-head self-attention module with a parallel combination of multi-head self-attention (MHSA) and deformable convolutions (DCs). It models global spatial attention while handling irregular shape information, leveraging the advantages of both convolutions and transformers, i.e., proper generalization ability and sufficient model capacity [26]. The bridge with the GCN connection optimizes the fusion of long-range and contextual information between the encoder and the decoder [9]. It continuously strengthens the representation of intermediate feature maps to find a low-dimensional invariant topology, improving the extrapolation ability of segmentation models.
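The parallel fusion idea behind the ETB can be sketched at shape level. This is a toy approximation of ours, not the paper's block: the deformable convolution is stood in for by a fixed 3×3 average filter (a real DC learns per-location sampling offsets), and all learned projections and normalization layers are omitted. The sketch only shows how a global attention branch and a local convolutional branch can process the same feature map in parallel and be fused by addition.

```python
import numpy as np

def attention_branch(fmap):
    """Global branch: scaled dot-product attention over flattened pixels."""
    h, w, c = fmap.shape
    x = fmap.reshape(h * w, c)
    scores = x @ x.T / np.sqrt(c)
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ x).reshape(h, w, c)

def conv_branch(fmap):
    """Local branch: a fixed 3x3 average filter stands in for the
    deformable convolution, whose sampling offsets would be learned."""
    shifts = [np.roll(np.roll(fmap, dy, axis=0), dx, axis=1)
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sum(shifts) / 9.0

def enhanced_block(fmap):
    """Parallel fusion: global attention plus local convolution."""
    return attention_branch(fmap) + conv_branch(fmap)

rng = np.random.default_rng(1)
feat = rng.normal(size=(4, 4, 8))   # toy H x W x C feature map
out = enhanced_block(feat)
print(out.shape)  # (4, 4, 8)
```

Summing the two branches keeps the output shape identical to the input, so the block drops into a U-shaped encoder wherever a plain transformer block would sit.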