Rapid and robust endoscopic content area estimation: A lean
GPU-based pipeline and curated benchmark dataset
Charlie Budd^a, Luis C. Garcia-Peraza-Herrera^a, Martin Huber^a, Sebastien
Ourselin^{a,b}, and Tom Vercauteren^{a,b}
^a King's College London, UK
^b Hypervision Surgical Ltd, UK
ARTICLE HISTORY
Compiled October 27, 2022
ABSTRACT
Endoscopic content area refers to the informative area enclosed by the dark,
non-informative border regions present in most endoscopic footage. The estimation
of the content area is a common task in endoscopic image processing and computer
vision pipelines. Despite the apparent simplicity of the problem, several factors
make reliable real-time estimation surprisingly challenging. The lack of rigorous
investigation into the topic, combined with the lack of a common benchmark
dataset for this task, has been a long-lasting issue in the field. In this paper,
we propose two variants of a lean GPU-based computational pipeline combining
edge detection and circle fitting. The two variants differ by relying on
handcrafted features and learned features, respectively, to extract content area
edge point candidates. We also present a first-of-its-kind dataset of manually
annotated and pseudo-labelled content areas across a range of surgical indications.
To encourage further developments, the curated dataset and an implementation of
both algorithms have been made public (https://doi.org/10.7303/syn32148000,
https://github.com/charliebudd/torch-content-area). We compare our proposed
algorithm with a state-of-the-art U-Net-based approach and demonstrate significant
improvement in terms of both accuracy (Hausdorff distance: 6.3 px versus
118.1 px) and computational time (average runtime per frame: 0.13 ms versus
11.2 ms).
KEYWORDS
Endoscopy; laparoscopy; computer vision; content area
1. Introduction
1.1. Endoscopic content area
In normal commercial cameras, the optics create a circular projection which fully
covers the image sensor, resulting in a full rectangular image. In minimally invasive
intervention, however, an endoscope is often used to allow an external camera to view
inside the patient. The size restriction of the endoscope constrains the field of view
that can be captured by the optics. This reduction in the potentially visible area, and
the critical nature of the application, make it desirable for the surgeon to see a large
portion of the optical field of view. As illustrated in Figure 1 and Figure 2, a typical
trade-off is thus to allow the circular projection to fall so that part of the image sensor
lies outside the projection. This maximises what is visible to the surgeon, but results
in dark, non-informative regions near the edge of the image. We define the endoscopic
content area as the informative region of the image, which is formed by the intersection
of the circular image projection with the image sensor. We further define the border as
being the (non-informative) sensor area not covered by the circular image projection.
CONTACT Charlie Budd. Email: charles.budd@kcl.ac.uk
arXiv:2210.14771v1 [cs.CV] 26 Oct 2022
[Figure 1 diagram labels: image sensor, circular projection, resultant content area]
Figure 1. A diagram showing examples of the formation of the content area as the intersection of the circular
image projection and the rectangular image sensor. The leftmost content area forms a rectangle, while the
rightmost content area forms a complete circle. The central example, however, yields a more complex shape
composed of straight lines and circular arcs.
Figure 2. A frame from a laparoscopic procedure taken from the Cholec80 dataset (Twinanda et al. 2016).
The image shows a clear example of a circular endoscopic content area. The content area is bright and well
centred, the border region is dark and almost noise free, and the edge between them is clear and sharp. The
content area is incomplete in that one or more sections of the content area extend beyond the extents of the
frame. Examples such as this, where the top and bottom of the content area are cropped, are very common.
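The definition of the content area as the intersection of the circular projection with the rectangular sensor can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `content_area_mask` and `content_fraction`, the pixel-centre convention, and the example dimensions are all assumptions chosen for illustration.

```python
def content_area_mask(width, height, cx, cy, radius):
    """Return a height x width grid of booleans; True marks content-area pixels.

    The sensor is the pixel rectangle [0, width) x [0, height). The circular
    projection has centre (cx, cy) and the given radius, all in pixels.
    A pixel belongs to the content area if it lies inside the circle; being
    on the sensor is implicit because only sensor pixels are enumerated.
    """
    r2 = radius * radius
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= r2 for x in range(width)]
        for y in range(height)
    ]


def content_fraction(mask):
    """Fraction of sensor pixels that fall inside the content area."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical example: a circle taller than the frame, so the top and
# bottom of the content area are cropped while the corners remain border.
mask = content_area_mask(width=16, height=9, cx=7.5, cy=4.0, radius=6.0)
fraction = content_fraction(mask)
```

With a radius large enough to cover every corner of the sensor the mask is entirely True (the rectangular case in Figure 1), while a small radius gives a complete circle surrounded by border.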
1.2. Content area estimation
Estimation of the content area is a common task in endoscopic image processing and
computer vision. We define the task as the estimation of the geometric shape of the
content area, not just a pixel-wise segmentation.
The estimation forms a foundational building block when constructing a geometric
understanding of an endoscopic image. Examples include determining where the endoscopic
view is centred, or whether a detected tool is obscured by organic tissue or is simply exiting
the content area. Both of these tasks are vital when considering robotic control, such
as autonomous control of the endoscope (Gruijthuijsen et al. 2022). The knowledge
of the content area may also be useful when training and using deep learning vision
models. The regions outside of the content area contain non-informative details in the
form of noise and text overlays. These details may in fact bias a
model. For example, the information outside of the content area may be characteristic
of the endoscope used during the intervention. A model trained to detect the type
of intervention may learn to detect the characteristic non-content area information
present in the training data. Masking out these details could simplify the input and
remove such sources of bias. More speculatively, when training a segmentation or object
detection model, the loss function could be modified so it does not penalise predictions
made outside of the content area. In this way, the task to be learned may be simplified
as the model need not learn to correctly classify these regions. At inference time, any
activations in these regions may be discarded. Additionally, as the content area only
takes up a portion of the image, the amount of required computation can be reduced
by skipping border regions when performing inference for time-critical applications.
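The idea of not penalising predictions outside the content area can be sketched as a masked loss. The following is only an illustration under stated assumptions: binary cross-entropy is chosen arbitrarily as the per-pixel loss, the function name `masked_bce` is hypothetical, and plain Python lists stand in for the tensors a deep learning framework would use.

```python
import math


def masked_bce(pred, target, mask, eps=1e-7):
    """Mean binary cross-entropy over content-area pixels only.

    pred   : 2D list of predicted probabilities in [0, 1]
    target : 2D list of 0/1 labels
    mask   : 2D list of booleans, True inside the content area
    Border pixels (mask False) contribute nothing to the loss.
    """
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if not m:
                continue  # skip border pixels entirely
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


# Hypothetical 2x2 example: the top-right pixel lies in the border and
# carries a confident wrong prediction that the masked loss ignores.
pred = [[0.9, 0.9], [0.1, 0.5]]
target = [[1, 0], [0, 0]]
mask = [[True, False], [True, True]]
loss = masked_bce(pred, target, mask)
```

Averaging over the number of content pixels, rather than over all pixels, keeps the loss scale comparable between frames with large and small content areas; this normalisation is a design choice of the sketch, not something specified in the text.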
For any follow-up task to be able to rely on the estimated content area, a high level
of robustness must be achieved under all expected conditions. While still important,
precision is less of a concern, as a content area found to be slightly off from the true
content area will likely have little consequence on subsequent processing. To utilise
content area estimation in a real-time setting, the estimation would ideally use minimal
computing resources and processing time, thereby leaving these resources available
for follow-up tasks.
As image sensor technology improves, and the manufacture of sensitive compact
image sensors becomes cheaper, it may become increasingly common to mount the
sensor on the end of the endoscope, known as chip-on-tip. This may remove the circular
border artefact, which is currently most prominent when using proximal cameras. Should
chip-on-tip endoscopes become the norm, estimation of the content area may
become less critical. Until such a time, it will remain important to be able to efficiently
and reliably detect the border. Several chip-on-tip endoscope manufacturers, especially
in the flexible endoscopy field, also continue to opt for an endoscope design with
an incomplete content area. Finally, the ability to exploit historical endoscopic imaging
data also warrants the availability of robust content area estimation algorithms.
1.3. Challenges in content area estimation
Delineation between the border and the content area of the image is made non-trivial
by a few factors. Figure 3 shows a selection of endoscopic images demonstrating some
of these difficulties. Firstly, while the border is generally a uniform black, a fair amount
of low-level noise is often observed, and imperfections in the scope's optics can result
in aberrations such as bright spots, diffuse light bleeding outside of the content area,
or imperfect circles. Secondly, the image within the content area may be adverse in
that it can have low brightness or contain within it a secondary circular oculus, such
as when the tip of the endoscope is only partially inserted through a trocar. Thirdly,
while the circular image projection is generally centred around the middle of the image,
it can in fact be significantly offset from the centre and its radius can fall within
a large range, even passing beyond the horizontal extent of the image for much of
the image height. The spatial position and size of the circular image projection may
also be surprisingly dynamic throughout an intervention, varying due to mechanical
stresses placed through the endoscope and as the operator adjusts the zoom level on
the camera. Finally, there can exist additional overlays such as secondary camera feeds,
logos, and text.
(a) A saturated area at the edge of the content area bleeds into the border.
(b) A dark content area leaves only a small segment of the circle visible.
(c) A partially cropped circle combined with a black overlay.
(d) Text overlays the content area and border.
(e) Structure within the top right quadrant of the content area appears as a misleading circle segment.
(f) A dark content area combined with a mostly cropped border provides a truly challenging example.
Figure 3. A selection of examples taken from our hand-annotated dataset, chosen to portray some of the
adverse features faced during endoscopic content area estimation.
1.4. Related work
Despite the fundamental nature of endoscopic content area estimation, there has been
little published work dedicated to the topic. One possible reason for this is that the
problem may be assumed to be solved by trivial methods, such as applying an intensity
threshold. While this is indeed the case for the majority of images, we find that real
endoscopic video often contains difficult cases which are missed by such approaches.
Another potential explanation may relate to the absence of established datasets to
develop and evaluate content area estimation algorithms.
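To make the intensity-threshold baseline concrete, the sketch below finds, for each image row, the first and last pixel brighter than a threshold. It is an assumed illustration of such a trivial method, not an algorithm taken from any cited work; the function name `threshold_edges`, the threshold value, and the toy frames are hypothetical.

```python
def threshold_edges(gray, threshold=10):
    """Naive content-area edge candidates via an intensity threshold.

    gray : 2D list of grey levels (0-255), one inner list per image row.
    Returns (x_left, x_right, y) for each row with any pixel above threshold.
    """
    edges = []
    for y, row in enumerate(gray):
        bright = [x for x, value in enumerate(row) if value > threshold]
        if bright:
            edges.append((bright[0], bright[-1], y))
    return edges


# Clean toy frame: dark border, bright content in columns 2 to 5.
clean = [[0, 0, 120, 130, 125, 110, 0, 0] for _ in range(3)]

# Difficult toy frame: a bright overlay pixel in the border (column 0)
# and dark content in columns 2 and 3 that falls below the threshold.
hard = [[200, 0, 4, 6, 125, 110, 0, 0] for _ in range(3)]

clean_edges = threshold_edges(clean)  # left edge found at column 2
hard_edges = threshold_edges(hard)    # left edge reported at column 0
```

On the clean frame the thresholded edges coincide with the true content boundary, whereas on the difficult frame the overlay pixel is mistaken for the left edge, which is the kind of failure that motivates a more robust estimator.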