Dual-Stage Deeply Supervised Attention-based
Convolutional Neural Networks for Mandibular
Canal Segmentation in CBCT Scans
Muhammad Usman1,2, Azka Rehman1, Amal Saleem1, Rabeea Jawaid3, Shi-Sub Byon1, Sung-Hyun Kim1,
Byoung Dai Lee3, Byung-il Lee1, and Yeong-Gil Shin2
1Center for Artificial Intelligence in Medicine and Imaging, HealthHub Co. Ltd., Seoul, 06524, South Korea
2Seoul National University, Seoul, Republic of Korea
3Division of AI and Computer Engineering, Kyonggi University, Suwon, Republic of South Korea
Abstract—Accurate segmentation of mandibular canals in
lower jaws is important in dental implantology. Medical experts
determine the implant position and dimensions manually from
3D CT images to avoid damaging the mandibular nerve inside
the canal. In this paper, we propose a novel dual-stage deep
learning-based scheme for the automatic segmentation of the
mandibular canal. Specifically, we first enhance the CBCT scans
by employing a novel histogram-based dynamic windowing
scheme, which improves the visibility of mandibular canals. After
enhancement, we design a 3D deeply supervised attention U-Net
architecture for localizing the volumes of interest (VOIs), which
contain the mandibular canals (i.e., left and right canals). Finally,
we employ a multi-scale input residual U-Net architecture
(MS-R-UNet) to accurately segment the mandibular canals within
the VOIs. The proposed method has been rigorously evaluated
on 500 scans. The results demonstrate that our technique
outperforms current state-of-the-art methods in segmentation
performance and robustness.
Index Terms—Mandibular Canal, 3D Segmentation, Jaw Lo-
calization
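The abstract does not spell out how the histogram-based dynamic windowing works. As a rough illustration only, one common way to derive an intensity window from a scan's own histogram is percentile clipping followed by rescaling. The function name `histogram_window` and the percentile bounds below are assumptions made for this sketch, not the authors' scheme.

```python
import numpy as np

def histogram_window(volume, low_pct=1.0, high_pct=99.0):
    """Clip a volume to a window read off its own intensity histogram,
    then rescale to [0, 1]. The percentile bounds are illustrative
    assumptions, not values taken from the paper."""
    lo, hi = np.percentile(volume, [low_pct, high_pct])
    if hi <= lo:
        # Flat volume: there is no contrast to stretch.
        return np.zeros_like(volume, dtype=np.float32)
    clipped = np.clip(volume, lo, hi)
    return ((clipped - lo) / (hi - lo)).astype(np.float32)

# Synthetic volume standing in for a CBCT scan (depth, height, width).
rng = np.random.default_rng(0)
scan = rng.normal(loc=300.0, scale=200.0, size=(8, 16, 16))
enhanced = histogram_window(scan)
```

Because the window is computed per scan, the same call adapts to scans with different intensity ranges, which is the general motivation for a dynamic rather than a fixed window.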
I. INTRODUCTION
The inferior alveolar nerve (IAN), which runs through the
mandibular canal, is one of the most critical structures in the
mandible region and supplies sensation to the lower teeth.
Sensation to the lips and chin is provided via the mental nerve,
which passes through the mental foramen [1]. One of the most
critical steps in implant placement, third molar extraction, and
various other craniofacial procedures, including orthognathic
surgery, is determining the position of the mandibular canal.
Patients may experience pain and temporary paralysis if the
mandibular canal is injured during any of these procedures [2],
[3]. Localization of the mandibular canal is important not only
for the diagnosis of vascular and neurogenic diseases associated
with the nerve, but also for the diagnosis of lesions near the
mandibular canal and the planning of oral and maxillofacial
procedures. Therefore, preoperative treatment planning and
simulation are necessary to avoid nerve injury. Identifying the
exact location of the mandibular canal can assist in achieving
the planning strategy required for the task at hand [4].
One of the most frequently used three-dimensional (3D)
imaging modalities for preoperative treatment planning and
postoperative assessment in dentistry is cone beam computed
tomography (CBCT) [5]. The CBCT volume is reconstructed from
projection images acquired at different angles with a
cone-shaped beam and stored as a sequence of axial images [6].
A clinical alternative is multi-detector computed tomography
(MDCT), but its application is limited by high radiation dose
and insufficient spatial resolution. In contrast, CBCT allows
more precise imaging of hard tissues in the dentomaxillofacial
area, and its effective radiation dose is lower than that of
MDCT. In addition, CBCT is inexpensive and readily available.
However, in practice, there are certain challenges associated
with mandibular canal segmentation from CBCT images, such as
inaccurate density values and a large amount of noise [7].
Surgical planning and pre-surgical examination are crucial
in dental clinics. One of the standard imaging tools used for
such assessment and planning is the panoramic radiograph,
constructed from the dental arch to provide all the relevant
information in a single view. These radiographs have
disadvantages, such as the difficulty of obtaining a 3D
rendering of the entire canal and its connected nerves [8]. One
of the most common approaches to preoperative assessment is to
annotate the canal in 3D images to produce a segmentation of the
canal. Such manual annotation is a knowledge-intensive,
time-consuming, and tedious task. Thus, there is a need for a
tool that assists the radiologist and reduces this burden
through automatic or semi-automatic segmentation of the canal.
Kwak et al. [9] studied different models based on 2D and
3D techniques, such as 2D SegNet and 2D and 3D U-Nets.
Their study also involved detailed pre- and post-processing
steps, including thresholding of teeth as well as bones. Jaskari
et al. [10] presented an FCNN-based model to extract the IAN.
Dhar et al. [11] used a model based on 3D U-Net to segment
the canal. They used pre-processing techniques to generate
the center lines of the mandibular canals and used them as
ground truths in the training process. Verhelst et al. [12]
used a patch-based technique to localize the jaw and then
used a 3D U-Net model to segment the canal in that ROI.
Lahoud et al. [13] first coarsely segmented the canal and then
performed fine segmentation of the canal on patches that
arXiv:2210.03739v4 [eess.IV] 2 Nov 2022