2 H. Xie et al.
method [14] for surface delineation in retinal OCT [6], which was further de-
veloped by incorporating various a priori knowledge reflecting anatomic and
imaging information [29,25]. Other known OCT surface segmentation approaches
include level set [2,7,17], probabilistic global shape model [21], random forest
classifier [12,30], and dynamic programming [3,32,10,20]. Each of these
traditional methods has its own strengths, but they all share a common drawback:
their dependence on handcrafted features.
Armed with superior data representation learning capacity, deep learning
(DL) methods are emerging as powerful alternatives to traditional segmentation
algorithms for many medical image segmentation tasks [15,28]. Fully
convolutional networks (FCNs) [24,18], convolutional neural networks (CNNs) [27],
and U-Net [23,13,8,16,31] have been utilized for retinal layer segmentation in OCT
images. Due to the scarcity of training data in medical imaging, it remains
nontrivial for DL networks to implicitly learn global structures of the target
surfaces. Thus, the retinal layer topology cannot be guaranteed with those
methods, nor can the continuity and smoothness of the retinal surfaces be ensured. To
address those limitations, the graph-based method and dynamic programming
were used as post-processing for the deep learning models to enforce surface
monotonicity and smoothness [4,11]. In this scheme, however, feature learning is
disconnected from the downstream optimization; the learned features thus may
not be truly appropriate for the optimization model. He et al. further extended
the deep regression idea [27] with fully differentiable soft-argmax operations to
generate surface positions, followed by ReLU operations to guarantee the surface
order, in their fully convolutional regression network (FCRN) [8]. The hybrid 2D-3D CNN
[16] using B-scan alignment was proposed to obtain continuous 3D retinal layer
surfaces from OCT. The IPM optimization method [31] effectively integrates
DL feature learning with IPM optimization to enforce mutual interaction
between surfaces, but the IPM optimization runs on each A-scan individually. All
these methods [8,16,31] achieved highly accurate segmentation of retinal surfaces
from OCT. However, with limited training data, their performance is prone to be
affected by outlier images with poor quality or artifacts [8], as they lack
the capability of explicitly learning surface smoothness structure.
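
To make the soft-argmax and ReLU ordering step described above for FCRN [8] concrete, the following NumPy sketch computes a differentiable expected row position per surface and column, then pushes each surface to lie at or below the previous one. It is an illustration only and not the implementation of [8]: the function names, the (surfaces, rows, columns) array layout, and the cumulative form of the ReLU ordering are assumptions made for this example.

```python
import numpy as np

def soft_argmax(logits):
    """Hypothetical helper: logits has shape (N surfaces, H rows, W columns).

    Returns an (N, W) array of expected row positions, i.e. the
    softmax-weighted mean row index along each column (A-scan).
    """
    # Subtract the per-column maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    rows = np.arange(logits.shape[1], dtype=float).reshape(1, -1, 1)
    return (p * rows).sum(axis=1)

def enforce_order(pos):
    """Assumed ordering rule: s_i = s_{i-1} + ReLU(s_i - s_{i-1}).

    pos has shape (N, W); surface i is never placed above surface i-1.
    """
    out = pos.copy()
    for i in range(1, out.shape[0]):
        out[i] = out[i - 1] + np.maximum(out[i] - out[i - 1], 0.0)
    return out
```

Note that this ordering acts along the depth direction only. Nothing in it couples neighboring columns, which is consistent with the observation above that such approaches do not explicitly learn surface smoothness.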
This study proposes to unify the powerful feature learning capability of DL
with a constrained differentiable dynamic programming (DDP) module in a single
deep neural network for end-to-end training, to achieve globally optimal
segmentation while explicitly enforcing surface smoothness. In the proposed
segmentation framework, a U-Net with additional image gradient channels [31] is
leveraged as the backbone for learning parameterized surface costs. The DDP
module then infers the retinal surfaces by minimizing the total surface cost
subject to surface smoothness constraints, yielding a globally optimal solution.
The differentiability of the DDP module enables efficient backward propagation
of gradients for end-to-end learning.
To the best of our knowledge, this is the first work to apply differentiable
dynamic programming for surface segmentation in medical images. Experiments on
retinal spectral-domain OCT datasets demonstrated improved surface segmentation
accuracy.
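
The constrained minimization that underlies the inference step can be sketched with classical dynamic programming on a single 2D cost image. The example below is a minimal sketch under stated assumptions, not the proposed DDP module: it uses a hard minimum, so it is not differentiable, and the names `dp_surface` and `max_jump`, the (rows, columns) cost layout, and the single-surface setting are all hypothetical choices made for illustration.

```python
import numpy as np

def dp_surface(cost, max_jump=1):
    """Find one surface s (a row index per column) that minimizes
    sum_x cost[s[x], x] subject to |s[x+1] - s[x]| <= max_jump.

    cost: (H rows, W columns) array of per-pixel surface costs.
    Returns (s, total_cost).
    """
    H, W = cost.shape
    acc = np.empty((H, W), dtype=float)   # best cost of a path ending at (r, x)
    back = np.zeros((H, W), dtype=int)    # best predecessor row for (r, x)
    acc[:, 0] = cost[:, 0]
    for x in range(1, W):
        for r in range(H):
            # Only rows within max_jump of r are feasible predecessors.
            lo, hi = max(0, r - max_jump), min(H, r + max_jump + 1)
            k = lo + int(np.argmin(acc[lo:hi, x - 1]))
            back[r, x] = k
            acc[r, x] = cost[r, x] + acc[k, x - 1]
    # Backtrack from the cheapest end point in the last column.
    s = np.empty(W, dtype=int)
    s[-1] = int(np.argmin(acc[:, -1]))
    for x in range(W - 1, 0, -1):
        s[x - 1] = back[s[x], x]
    return s, float(acc[s[-1], -1])
```

Because the smoothness constraint restricts which predecessors are feasible, a single low-cost pixel far from the rest of the surface (for example, one caused by an artifact) cannot pull the path away unless the jump is reachable within `max_jump` per column.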