A deep learning network with differentiable dynamic programming for retina OCT surface segmentation



Hui Xie, Weiyu Xu, and Xiaodong Wu

The University of Iowa, Iowa City, IA 52242, USA

xiaodong-wu@uiowa.edu

Abstract. Multiple-surface segmentation in Optical Coherence Tomography (OCT) images is a challenging problem, further complicated by the frequent presence of weak image boundaries. Recently, many deep learning (DL) based methods have been developed for this task and have yielded remarkable performance. Unfortunately, due to the scarcity of training data in medical imaging, it is challenging for DL networks to learn the global structure of the target surfaces, including surface smoothness. To bridge this gap, this study proposes to seamlessly unify a U-Net for feature learning with a constrained differentiable dynamic programming module, achieving end-to-end learning for retina OCT surface segmentation that explicitly enforces surface smoothness. The approach effectively utilizes feedback from the downstream model optimization module to guide feature learning, yielding better enforcement of the global structures of the target surfaces. Experiments on the Duke AMD (age-related macular degeneration) and JHU MS (multiple sclerosis) OCT datasets for retinal layer segmentation demonstrated very promising segmentation accuracy.

Keywords: retina OCT · surface segmentation · deep learning · differentiable dynamic programming.

1 Introduction

Highly accurate surface segmentation for retina optical coherence tomography (OCT) is a clinical necessity in many diagnostic and treatment tasks for ophthalmic diseases. In retina OCT imaging, the frequent presence of weak image boundaries, complicated by image artifacts, often leads to undesirable boundary spikes in the output of many automated segmentation methods. However, experienced ophthalmologists can delineate retinal surfaces well from OCT scans in those difficult scenarios by taking advantage of the surfaces' global shape information and mutual interaction. This indicates that insufficient surface information in the image can be remedied by making use of surface shape and context priors in the segmentation methods [29]. We thus propose to seamlessly integrate differentiable dynamic programming (DDP) [19] into a deep learning framework with end-to-end training for retina OCT surface segmentation to enforce surface smoothness.

Many retina OCT segmentation methods have been proposed in past years. Garvin et al. first introduced the graph-based optimal surface segmentation method [14] for surface delineation in retinal OCT [6], which was further developed by incorporating various a priori knowledge reflecting anatomic and imaging information [29,25]. Other known OCT surface segmentation approaches include level sets [2,7,17], probabilistic global shape models [21], random forest classifiers [12,30], and dynamic programming [3,32,10,20]. Each of these traditional methods has its own strengths, but they all share a common drawback: their dependence on handcrafted features.

arXiv:2210.06335v1 [eess.IV] 8 Oct 2022

Armed with superior data representation learning capacity, deep learning (DL) methods are emerging as powerful alternatives to traditional segmentation algorithms for many medical image segmentation tasks [15,28]. Fully convolutional networks (FCNs) [24,18], convolutional neural networks (CNNs) [27], and U-Net [23,13,8,16,31] have been utilized for retinal layer segmentation in OCT images. Due to the scarcity of training data in medical imaging, it remains nontrivial for DL networks to implicitly learn the global structures of the target surfaces. Thus, those methods can guarantee neither the retinal layer topology nor the continuity and smoothness of the retinal surfaces. To address those limitations, the graph-based method and dynamic programming were used as post-processing for the deep learning models to enforce surface monotonicity and smoothness [4,11]. In this scheme, feature learning is, in fact, disconnected from the downstream optimization; the learned features thus may not be truly appropriate for the model. He et al. further extended the deep regression idea [27] with fully differentiable soft-argmax operations to generate surface positions, followed by ReLU operations to guarantee the surface order, in their fully convolutional regression network (FCRN) [8]. The hybrid 2D-3D CNN [16], using B-scan alignment, was proposed to obtain continuous 3D retinal layer surfaces from OCT. The IPM optimization method [31] effectively integrates DL feature learning with IPM optimization to enforce mutual interaction between surfaces, but the IPM optimization runs on each A-scan individually. All these methods [8,16,31] achieved highly accurate segmentation of retinal surfaces from OCT. However, with the limited size of training data, their performance is prone to being affected by outlier images with poor quality or artifacts [8], as they lack the capability of explicitly learning the smoothness structure of the surfaces.
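
The soft-argmax and ReLU ordering steps mentioned above can be illustrated with a small sketch. This is not the FCRN implementation from [8]; the function names `soft_argmax` and `enforce_order`, and the particular cumulative form of the ReLU ordering, are assumptions chosen for illustration. The sketch turns per-row scores of one A-scan into a sub-pixel surface position, then pushes any surface that lies above its predecessor down to the predecessor's depth.

```python
import math

def soft_argmax(logits):
    """Expected row index under softmax(logits) for a single A-scan column."""
    m = max(logits)                                  # subtract max for numerical stability
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return sum(z * w for z, w in enumerate(weights)) / total

def enforce_order(positions):
    """Make surface depths non-decreasing (hypothetical form):
    s_i = s_{i-1} + ReLU(p_i - s_{i-1})."""
    ordered = [positions[0]]
    for p in positions[1:]:
        ordered.append(ordered[-1] + max(0.0, p - ordered[-1]))
    return ordered

# Raw predictions for three surfaces in one column; the second violates the order.
print(enforce_order([3.0, 2.5, 4.0]))  # prints [3.0, 3.0, 4.0]
```

Both operations are differentiable almost everywhere, which is what allows the surface positions to be trained by regression. Neither of them, however, constrains how a surface changes from one A-scan to the next.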

This study proposes to unify the powerful feature learning capability of DL with a constrained DDP module in a single deep neural network for end-to-end training, to achieve globally optimal segmentation while explicitly enforcing surface smoothness. In the proposed segmentation framework, a U-Net with additional image gradient channels [31] is leveraged as the backbone for learning parameterized surface costs. Retinal surface inference, which optimizes the total surface cost while satisfying the surface smoothness constraints, is realized by a DDP module that yields a globally optimal solution. The differentiability of the DDP module enables efficient backward propagation of gradients for end-to-end learning. To the best of our knowledge, this is the first work to apply differentiable dynamic programming to surface segmentation in medical images. Experiments on retina spectral-domain OCT datasets demonstrated improved surface segmentation accuracy.
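
One common way to make a dynamic programming recursion differentiable is to replace its hard max with a smoothed max (log-sum-exp), whose gradient is a softmax. The sketch below illustrates only that operator. Whether the DDP module in this work uses exactly this relaxation is an assumption, and the names `smooth_max`, `smooth_max_grad`, and the temperature `gamma` are hypothetical.

```python
import math

def smooth_max(values, gamma=1.0):
    """gamma * log(sum(exp(v / gamma))): a differentiable surrogate for max(values)."""
    m = max(values)                                  # shift by the max for stability
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))

def smooth_max_grad(values, gamma=1.0):
    """Gradient of smooth_max with respect to its inputs: softmax(values / gamma)."""
    m = max(values)
    w = [math.exp((v - m) / gamma) for v in values]
    total = sum(w)
    return [x / total for x in w]

scores = [1.0, 2.0, 3.0]
print(smooth_max(scores, gamma=1.0))    # slightly above the hard max of 3.0
print(smooth_max(scores, gamma=0.01))   # approaches the hard max as gamma shrinks
print(smooth_max_grad(scores))          # non-negative weights that sum to 1
```

With a hard max, the gradient flows only to the single winning predecessor. The smoothed version spreads it over all candidates, so every cost entry receives a training signal.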


2 Method

2.1 Problem Formulation

Let $I(X, Z)$ of size $X \times Z$ be a given 2D B-scan of an OCT image. For each $x = 0, 1, \ldots, X-1$, the pixel subset $\{I(x, z) \mid 0 \le z < Z\}$ forms a column parallel to the $z$-axis, denoted by $Col(x)$, which corresponds to an A-scan of the OCT image. Our goal is to seek $N > 0$ retinal surfaces, each of which, $S_i$ ($i = 0, 1, \ldots, N-1$), intersects every column $Col(x)$ at exactly one location $z_x^{(i)}$, that is, $I(x, z_x^{(i)}) \in S_i$. To find an optimal surface $S_i$, each pixel $I(x, z)$ is associated with an on-surface cost $c_i(x, z)$, which is related to the likelihood of $I(x, z)$ being on $S_i$. Each retinal surface exhibits a certain degree of smoothness, which specifies the maximum allowed change in the $z$-dimension of a feasible surface along each unit distance change in the $x$-dimension. More specifically, with given smoothness parameters $\Delta_x^{(i)} > 0$ for $S_i$, if $I(x, z'), I(x+1, z'') \in S_i$, then $|z' - z''| \le \Delta_x^{(i)}$. The optimization objective of our surface segmentation problem is to maximize the total on-surface cost of all pixels on the $N$ sought surfaces $S^* = \{S_0^*, S_1^*, \ldots, S_{N-1}^*\}$, with

$$
\begin{aligned}
S^* = \operatorname*{argmax}_{S=\{S_0, S_1, \ldots, S_{N-1}\}} \; & E(S) = \sum_{i=0}^{N-1} \sum_{I(x,z) \in S_i} c_i(x, z) \\
\text{s.t.} \; & \left| z_x^{(i)} - z_{x+1}^{(i)} \right| \le \Delta_x^{(i)}, \quad \text{for } i = 0, 1, \ldots, N-1; \; x = 0, 1, \ldots, X-2.
\end{aligned}
\tag{1}
$$
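
As written, Eq. (1) couples the surfaces only through the outer sum, so each surface can be optimized on its own. The following sketch solves the single-surface problem exactly with a standard (non-differentiable) dynamic program. It is meant only to make the objective and the smoothness constraint concrete, not to reproduce the DDP module. The function name `segment_surface` and the list-of-lists cost layout are assumptions.

```python
def segment_surface(cost, delta):
    """Maximize sum_x cost[x][z_x] subject to |z_x - z_{x+1}| <= delta[x].

    cost  : X lists, each holding Z on-surface costs c(x, z)
    delta : X - 1 non-negative integers (smoothness bounds)
    Returns (best total cost, [z_0, ..., z_{X-1}]).
    """
    X, Z = len(cost), len(cost[0])
    score = list(cost[0])          # best total for a path ending at (0, z)
    back = []                      # back[x - 1][z]: best predecessor row in column x - 1
    for x in range(1, X):
        d = delta[x - 1]
        new_score, pointers = [], []
        for z in range(Z):
            lo, hi = max(0, z - d), min(Z - 1, z + d)
            prev = max(range(lo, hi + 1), key=score.__getitem__)
            pointers.append(prev)
            new_score.append(score[prev] + cost[x][z])
        score = new_score
        back.append(pointers)
    z = max(range(Z), key=score.__getitem__)
    best, path = score[z], [z]
    for pointers in reversed(back):    # trace the optimal path back to column 0
        z = pointers[z]
        path.append(z)
    path.reverse()
    return best, path

# Toy B-scan with X = 3 columns and Z = 4 rows.
cost = [[0, 5, 0, 0],
        [0, 0, 1, 9],
        [0, 4, 0, 0]]
print(segment_surface(cost, [1, 1]))   # prints (10, [1, 2, 1])
print(segment_surface(cost, [3, 3]))   # prints (18, [1, 3, 1])
```

With the tight bound, the strong response at row 3 of the middle column is rejected because reaching it would require a jump of two rows. This is the kind of boundary spike the smoothness constraint is meant to suppress.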

2.2 Network Architecture

[Figure 1: network diagram, not reproduced. Labeled components: raw image, gradient channels, U-Net, soft-max on A-scan, multi-surface CE loss, $-(z-\mu)^2$, DDP, L1 loss, segmentation. Legend: MaxPool2d; 3x3 Conv2d (change channels), InstanceNorm2d, ReLU; 3x3 Conv2d (keep channels), InstanceNorm2d, ReLU; bilinear upsample; 1x1 Conv2d; residual connection. Channel labels: c = 24, 48, 96, 192, 384, 768, 1536.]

Fig. 1. The U-Net based network architecture with seven additional gradient channels as input. The number of channels for each layer is indicated with c. The DDP module solves the optimization problem for segmentation and outputs optimal smooth retinal surfaces S for the L1 loss.
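
The channel labels in Fig. 1 are consistent with a width that starts at 24 and doubles at each of seven levels. The short sketch below reproduces that arithmetic. The helper name `encoder_channels` is hypothetical, and reading the labels as an encoder width schedule is an assumption based on the figure labels alone.

```python
def encoder_channels(first=24, layers=7):
    """Channel width per level when the width doubles at every level."""
    return [first * 2 ** k for k in range(layers)]

print(encoder_channels())  # prints [24, 48, 96, 192, 384, 768, 1536]
```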

The proposed surface segmentation network is based on a U-Net architecture [22], as illustrated in Fig. 1, which consists of seven convolution layers. This U-Net acts as a feature-extracting module for the surface segmentation head. We started with 24 feature maps in the first convolution layer. In each downsampling layer, a conv2d module followed by a 2x2 max-pooling doubles the feature maps, and then a cascade of three identical conv2d modules with a residual connection is
