MASKED AUTOENCODERS FOR LOW DOSE CT DENOISING
Dayang Wang, Yongshun Xu, Shuo Han, Hengyong Yu*
Department of Electrical and Computer Engineering,
University of Massachusetts Lowell, Lowell, MA, USA, 01854.
ABSTRACT
Low-dose computed tomography (LDCT) reduces the
X-ray radiation dose but compromises image quality with
increased noise and artifacts. A plethora of transformer models have
been developed recently to improve LDCT image quality.
However, the success of a transformer model relies on a large
amount of paired noisy and clean data, which is often unavail-
able in clinical applications. In computer vision and natural
language processing fields, masked autoencoders (MAE)
have been proposed as an effective label-free self-pretraining
method for transformers, owing to their excellent feature
representation ability. Here, we redesign the classical encoder-decoder
learning model to match the denoising task and apply
it to the LDCT denoising problem. The MAE can leverage the
unlabeled data and facilitate structural preservation for the
LDCT denoising model when there are insufficient ground
truth data. Experiments on the Mayo dataset validate that the
MAE can boost the transformer’s denoising performance and
relieve the dependence on the ground truth data.
Index Terms— Low-dose CT, Masked Autoencoder,
Self-pretraining, Transformer.
1. INTRODUCTION
In recent years, LDCT has become mainstream in clinical
applications due to the substantial X-ray radiation risk of
normal-dose CT (NDCT). However, LDCT compromises
image quality and diagnostic value, which has been a
barrier to its applications. To overcome this issue, numerous
deep learning models have been explored along this direction [1–5].
Transformer Models. In the past few years, transformer
models have gained considerable attention due to their ability to
capture global contextual information. Several studies have
also applied them to LDCT denoising. Zhang et
al. employed a Gaussian filter to decompose LDCT images
into high/low frequency parts [4]. Then, a transformer module
was applied to the two parts for feature inference and contextual
information fusion. Wang et al. proposed a more advanced
convolution-free encoder-decoder transformer; the static and
dynamic latent learning behavior of the model was then revealed
by analyzing the attention maps [5]. Recently, the Swin
transformer has become very popular as a backbone architecture
for a variety of downstream tasks [6, 7]. SwinIR is one
of its important adaptations for image denoising, super-resolution,
and artifact reduction [8]. Notably, some SwinIR-based
transformer models or modules have also been proposed
for CT image enhancement/denoising and have achieved
excellent results [9]. Therefore, the application of the MAE
in this paper is also based on SwinIR.

*Email: hengyong-yu@ieee.org. This work has been submitted to the
IEEE for possible publication. Copyright may be transferred without notice,
after which this version may no longer be accessible.
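The high/low-frequency decomposition attributed to Zhang et al. [4] can be sketched as follows. This is a minimal illustration, not the authors' implementation; the filter width `sigma` is an assumed hyperparameter, and the subsequent transformer modules are omitted:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose_frequency(image: np.ndarray, sigma: float = 1.5):
    """Split a CT slice into low- and high-frequency parts with a Gaussian filter.

    The low-frequency part is the Gaussian-blurred image; the high-frequency
    part is the residual, so low + high reconstructs the input (up to
    floating-point error).
    """
    low = gaussian_filter(image, sigma=sigma)
    high = image - low
    return low, high

# Toy example: a random array standing in for a 512x512 LDCT slice.
img = np.random.rand(512, 512).astype(np.float32)
low, high = decompose_frequency(img)
```

Because the decomposition is a simple residual split, the two branches can be processed independently and summed back without information loss.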
Masked Autoencoder. Recently, the MAE has emerged as an
excellent self-supervised learning strategy for various computer
vision tasks [10]. He et al. revealed that a small fraction
of an image suffices to infer complex and holistic visual
concepts such as semantics [10]. Zhou et al. showed that the MAE
can enhance medical image segmentation and classification
performance [11]. However, these works address high-level
vision tasks. Currently, there is no exploration of the MAE as
a self-pretraining strategy for low-level vision tasks such as LDCT
denoising.
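The masking step at the heart of the MAE [10] can be sketched as follows. The 16-pixel patch size and 75% mask ratio are illustrative assumptions (75% follows He et al.'s default), and the transformer encoder/decoder that reconstructs the hidden patches is omitted:

```python
import numpy as np

def random_mask_patches(image: np.ndarray, patch: int = 16,
                        mask_ratio: float = 0.75, seed: int = 0):
    """Split an image into non-overlapping patches and zero out a random subset.

    Returns the masked image and a boolean grid (True = patch hidden).
    An MAE is trained to reconstruct the hidden patches from the visible ones.
    """
    h, w = image.shape
    gh, gw = h // patch, w // patch
    n = gh * gw
    rng = np.random.default_rng(seed)
    hidden = np.zeros(n, dtype=bool)
    hidden[rng.choice(n, size=int(n * mask_ratio), replace=False)] = True
    masked = image.copy()
    for idx in np.flatnonzero(hidden):
        r, c = divmod(idx, gw)
        masked[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = 0.0
    return masked, hidden.reshape(gh, gw)

# Toy example: a random array standing in for a 512x512 LDCT slice.
img = np.random.rand(512, 512).astype(np.float32)
masked, hidden = random_mask_patches(img)
```

Because the reconstruction target is the (unlabeled) input itself, this pretraining step needs no paired NDCT ground truth.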
Therefore, we are motivated to explore the potential of
the MAE in LDCT denoising. We believe this study is significant
for two reasons: i) There are usually insufficient ground
truth data in clinical applications, and the self-pretraining
paradigm of the MAE can reap the benefits of the unlabeled
data. Therefore, it is an ideal choice for LDCT
denoising. ii) Structural preservation and enhancement
are crucial goals in LDCT denoising. Moreover, the anatomical
structures in a CT image are connected with each other
mechanically and functionally. The MAE can aggregate
contextual information to infer the masked structures.
Therefore, it can potentially strengthen the dependence
between anatomical regions and compensate for the loss
of structure.
2. METHODS
2.1. Masked Autoencoder Design
As shown in the model flowchart in Fig. 1, the MAE learning
paradigm is employed for LDCT denoising. In the self-pretraining
stage, the model is trained to map LDCT images to
LDCT images to learn the structural relationships, and
arXiv:2210.04944v1 [eess.IV] 10 Oct 2022