A deep learning network with differentiable dynamic programming for retina OCT surface segmentation

Hui Xie, Weiyu Xu, and Xiaodong Wu
The University of Iowa, Iowa City, IA 52242, USA
xiaodong-wu@uiowa.edu
Abstract. Multiple-surface segmentation in Optical Coherence Tomography
(OCT) images is a challenging problem, further complicated by the
frequent presence of weak image boundaries. Recently, many deep learning
(DL) based methods have been developed for this task and yield
remarkable performance. Unfortunately, due to the scarcity of training
data in medical imaging, it is challenging for DL networks to learn the
global structure of the target surfaces, including surface smoothness. To
bridge this gap, this study proposes to unify a U-Net for
feature learning with a constrained differentiable dynamic programming
module, trained end to end, so that surface smoothness is explicitly
enforced in retina OCT surface segmentation. The method effectively utilizes
the feedback from the downstream model optimization module to guide
feature learning, yielding a better enforcement of global structures of the
target surfaces. Experiments on Duke AMD (age-related macular degen-
eration) and JHU MS (multiple sclerosis) OCT datasets for retinal layer
segmentation demonstrated very promising segmentation accuracy.
Keywords: retina OCT · surface segmentation · deep learning · differentiable
dynamic programming.
1 Introduction
Highly accurate surface segmentation for retina optical coherence tomography
(OCT) is a clinical necessity in many diagnostic and treatment tasks of oph-
thalmic diseases. In retina OCT imaging, the frequent presence of weak image
boundaries complicated by image artifacts often leads to undesirable boundary
spikes with many automated segmentation methods. However, experienced
ophthalmologists can delineate retinal surfaces well from OCT scans in those difficult
scenarios by taking advantage of the surfaces' global shape information and mutual
interaction. This indicates that insufficient surface information can be remedied by
making use of surface shape and context priors in the segmentation methods [29].
We thus propose to seamlessly integrate differentiable dynamic programming
(DDP) [19] into a deep learning framework with end-to-end training for retina
OCT surface segmentation to enforce surface smoothness.
Many retina OCT segmentation methods have been proposed in past years.
Garvin et al. first introduced the graph-based optimal surface segmentation
method [14] for surface delineation in retinal OCT [6], which was further
developed by incorporating various a priori knowledge reflecting anatomic and
imaging information [29,25]. Other known OCT surface segmentation approaches
include level set [2,7,17], probabilistic global shape model [21], random forest
classifier [12,30], and dynamic programming [3,32,10,20]. Each of these
traditional methods has its own strengths, but they all share a common drawback:
their dependence on handcrafted features.
arXiv:2210.06335v1 [eess.IV] 8 Oct 2022
Armed with superior data representation learning capacity, deep learning
(DL) methods are emerging as powerful alternatives to traditional segmentation
algorithms for many medical image segmentation tasks [15,28]. Fully convolutional
networks (FCNs) [24,18], convolutional neural networks (CNNs) [27], and
U-Net [23,13,8,16,31] have been utilized for retinal layer segmentation in OCT
images. Due to the scarcity of training data in medical imaging, it remains
nontrivial for DL networks to implicitly learn global structures of the target surfaces.
Thus, those methods can guarantee neither the retinal layer topology nor
the continuity and smoothness of the retinal surfaces. To
address those limitations, the graph-based method and dynamic programming
were used as post-processing for the deep learning models to enforce surface
monotonicity and smoothness [4,11]. In this scheme, feature learning is
disconnected from the downstream optimization, so the learned features may
not be truly appropriate for the model. He et al. further extended the deep
regression idea [27] with fully differentiable soft-argmax operations to generate
surface positions followed by ReLU operations to guarantee the surface order in
their fully convolutional regression network (FCRN) [8]. The hybrid 2D-3D CNN
[16] using B-scan alignment was proposed to obtain continuous 3D retinal layer
surfaces from OCT. The IPM optimization method [31] effectively integrates
the DL feature learning with the IPM optimization to enforce mutual interaction
between surfaces, but the IPM optimization runs on each A-scan. All these
methods [8,16,31] achieved highly accurate segmentation of retinal surfaces from
OCT. However, with limited training data, their performance is prone to
degradation on outlier images with poor quality or artifacts [8], as they lack
the capability of explicitly learning surface smoothness structure.
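The soft-argmax and ReLU ordering ideas mentioned above can be illustrated with a minimal sketch. It is not the FCRN implementation of [8]; the function names `soft_argmax` and `enforce_order` are hypothetical, and the code only shows how a softmax over one A-scan yields a differentiable expected row index and how a ReLU on successive gaps keeps surface positions ordered.

```python
import math

def soft_argmax(logits):
    """Expected row index under a softmax over one A-scan column."""
    m = max(logits)                       # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return sum(z * w for z, w in enumerate(weights)) / total

def enforce_order(positions):
    """Make surface positions non-decreasing by applying ReLU to successive gaps."""
    ordered = [positions[0]]
    for p in positions[1:]:
        gap = max(0.0, p - ordered[-1])   # ReLU: a negative gap is clipped to zero
        ordered.append(ordered[-1] + gap)
    return ordered
```

Note that neither operation couples neighboring A-scans, which is consistent with the remark above that such methods do not explicitly model surface smoothness.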
This study proposes to unify the powerful feature learning capability of DL
with a constrained DDP module in a single deep neural network for end-to-
end training to achieve globally optimal segmentation while explicitly enforcing
surface smoothness. In the proposed segmentation framework, a U-Net with ad-
ditional image gradient channels [31] is leveraged as the backbone for learning
parameterized surface costs. Retinal surface inference, which optimizes the total
surface cost subject to surface smoothness constraints, is realized by a DDP
module that yields a globally optimal solution. The differentiability of the DDP module
enables efficient backward propagation of gradients for end-to-end learning.
To the best of our knowledge, this is the first work to apply differentiable dy-
namic programming for surface segmentation in medical images. Experiments on
retina spectral-domain OCT datasets demonstrated improved surface segmenta-
tion accuracy.
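To make the notion of differentiability concrete, the sketch below shows one common way to smooth the hard max used inside a dynamic programming recursion: a log-sum-exp operator whose gradient is a softmax. Whether the DDP module of this paper uses exactly this operator is not stated in this section, so the operator, the temperature parameter `gamma`, and the function names are assumptions for illustration only.

```python
import math

def smooth_max(values, gamma=1.0):
    """Log-sum-exp smoothing of max; approaches max(values) as gamma -> 0."""
    m = max(values)                       # shift by the max for numerical stability
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))

def smooth_max_grad(values, gamma=1.0):
    """Gradient of smooth_max with respect to values: a softmax distribution."""
    m = max(values)
    w = [math.exp((v - m) / gamma) for v in values]
    s = sum(w)
    return [x / s for x in w]
```

A hard max passes gradient only to the winning entry, whereas the smoothed version distributes gradient over all candidates, which is what allows a loss on the output surface to influence the learned costs.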
2 Method
2.1 Problem Formulation
Let $I(X, Z)$ of size $X \times Z$ be a given 2D B-scan of an OCT image. For each
$x = 0, 1, \ldots, X-1$, the pixel subset $\{I(x, z) \mid 0 \le z < Z\}$ forms a column
parallel to the $z$-axis, denoted by $Col(x)$, which corresponds to an A-scan of
the OCT image. Our goal is to seek $N > 0$ retinal surfaces, each of which, $S_i$
($i = 0, 1, \ldots, N-1$), intersects every column $Col(x)$ at exactly one location
$z_x^{(i)}$, that is, $I(x, z_x^{(i)}) \in S_i$. To find an optimal surface $S_i$, each pixel $I(x, z)$
is associated with an on-surface cost $c_i(x, z)$, which is related to the likelihood
of $I(x, z)$ being on $S_i$. Each retinal surface exhibits a certain degree of smoothness;
the smoothness constraint specifies the maximum allowed change in the $z$-dimension of a feasible
surface along each unit distance change in the $x$-dimension. More specifically,
with given smoothness parameters $\Delta_x^{(i)} > 0$ for $S_i$, if $I(x, z'), I(x+1, z'') \in S_i$,
then $|z' - z''| \le \Delta_x^{(i)}$. The optimization objective of our surface segmentation
problem is to maximize the total on-surface cost of all pixels on the $N$ sought
surfaces $S^* = \{S_0^*, S_1^*, \ldots, S_{N-1}^*\}$, with

$$
\begin{aligned}
S^* &= \operatorname*{argmax}_{S = \{S_0, S_1, \ldots, S_{N-1}\}} E(S) = \sum_{i=0}^{N-1} \sum_{I(x,z) \in S_i} c_i(x, z) \\
\text{s.t. } & |z_x^{(i)} - z_{x+1}^{(i)}| \le \Delta_x^{(i)}, \quad \text{for } i = 0, 1, \ldots, N-1;\ x = 0, 1, \ldots, X-2.
\end{aligned}
\tag{1}
$$
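For a single surface, the constrained problem in Eq. (1) can be solved exactly by classical dynamic programming over columns. The sketch below is a plain, non-differentiable version with a hard max, shown only to illustrate the recursion; it is not the paper's DDP module. The function name `best_surface` and the use of one constant integer bound `delta` for all columns are simplifying assumptions.

```python
def best_surface(cost, delta):
    """Maximize the summed on-surface cost of one surface under a smoothness bound.

    cost[x][z] is the on-surface cost of pixel (x, z); delta is the maximum
    allowed |z_x - z_{x+1}| (assumed constant over x in this sketch).
    Returns (total_cost, [z_0, ..., z_{X-1}]).
    """
    X, Z = len(cost), len(cost[0])
    score = list(cost[0])                 # best total for a surface ending at (0, z)
    back = []                             # back[x-1][z]: best predecessor row for (x, z)
    for x in range(1, X):
        new_score, pointers = [], []
        for z in range(Z):
            lo, hi = max(0, z - delta), min(Z - 1, z + delta)
            prev = max(range(lo, hi + 1), key=lambda k: score[k])
            new_score.append(score[prev] + cost[x][z])
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    z = max(range(Z), key=lambda k: score[k])
    total = score[z]
    path = [z]
    for pointers in reversed(back):       # trace the optimal surface back to column 0
        z = pointers[z]
        path.append(z)
    path.reverse()
    return total, path
```

With `cost = [[0, 5, 0, 0], [0, 0, 0, 9], [0, 5, 0, 0]]`, a bound of `delta=2` lets the surface jump to the high-cost pixel in the middle column, whereas `delta=1` forbids that jump and yields a flatter surface with a lower total cost.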
2.2 Network Architecture
[Figure 1: network architecture diagram. Labeled elements: raw image and gradient channels (input); U-Net with channel counts c = 24, 48, 96, 192, 384, 768, 1536; soft-max on A-scan; multisurface CE loss; $-(z-\mu)^2$; DDP; output $z^*$; L1 loss. Legend: MaxPool2d; 3x3 Conv2d (change channels), InstanceNorm2d, ReLU; 3x3 Conv2d (keep channels), InstanceNorm2d, ReLU; bi-linear UpSample; 1x1 Conv2d; residual connection.]
Fig. 1. The U-Net based network architecture with additional seven gradient channels
as input. The number of channels for each layer is indicated with c. The DDP module
solves the optimization problem for segmentation and outputs optimal smooth retinal
surfaces $S^*$ for the L1 loss.
The proposed surface segmentation network is based on a U-Net architec-
ture [22], as illustrated in Fig. 1, which consists of seven convolution layers. This
U-Net acts as a feature-extracting module for the surface segmentation head. We
started with 24 feature maps in the first convolution layer. In each downsampling
layer, a conv2d module followed by a 2x2 max-pooling doubles the number of feature maps,
and then a cascade of three identical conv2d modules with a residual connection is