
Rapid and robust endoscopic content area estimation: A lean

GPU-based pipeline and curated benchmark dataset

Charlie Budd^a, Luis C. Garcia-Peraza-Herrera^a, Martin Huber^a, Sebastien Ourselin^{a,b}, and Tom Vercauteren^{a,b}

^a King’s College London, UK

^b Hypervision Surgical Ltd, UK

ARTICLE HISTORY

Compiled October 27, 2022

ABSTRACT

Endoscopic content area refers to the informative area enclosed by the dark, non-informative border regions present in most endoscopic footage. The estimation of the content area is a common task in endoscopic image processing and computer vision pipelines. Despite the apparent simplicity of the problem, several factors make reliable real-time estimation surprisingly challenging. The lack of rigorous investigation into the topic, combined with the lack of a common benchmark dataset for this task, has been a long-lasting issue in the field. In this paper, we propose two variants of a lean GPU-based computational pipeline combining edge detection and circle fitting. The two variants differ in how they extract content area edge point candidates, relying on hand-crafted features and learned features respectively. We also present a first-of-its-kind dataset of manually annotated and pseudo-labelled content areas across a range of surgical indications. To encourage further developments, the curated dataset and an implementation of both algorithms have been made public (https://doi.org/10.7303/syn32148000, https://github.com/charliebudd/torch-content-area). We compare our proposed algorithm with a state-of-the-art U-Net-based approach and demonstrate significant improvement in terms of both accuracy (Hausdorff distance: 6.3 px versus 118.1 px) and computational time (average runtime per frame: 0.13 ms versus 11.2 ms).

KEYWORDS

Endoscopy; laparoscopy; computer vision; content area

1. Introduction

1.1. Endoscopic content area

In normal commercial cameras, the optics create a circular projection which fully covers the image sensor, resulting in a full rectangular image. In minimally invasive intervention, however, an endoscope is often used to allow an external camera to view inside the patient. The size restriction of the endoscope constrains the field of view that can be captured by the optics. This reduction in the potentially visible area, and the critical nature of the application, make it desirable for the surgeon to see a large portion of the optical field of view. As illustrated in Figure 1 and Figure 2, a typical trade-off is thus to allow the circular projection to fall so that part of the image sensor lies outside the projection. This maximises what is visible to the surgeon, but results in dark, non-informative regions near the edge of the image. We define the endoscopic content area as the informative region of the image, which is formed by the intersection of the circular image projection with the image sensor. We further define the border as being the (non-informative) sensor area not covered by the circular image projection.

CONTACT Charlie Budd. Email: charles.budd@kcl.ac.uk

arXiv:2210.14771v1 [cs.CV] 26 Oct 2022

[Figure 1 diagram labels: image sensor; circular projection; resultant content area]

Figure 1. A diagram showing examples of the formation of the content area as the intersection of the circular image projection and the rectangular image sensor. The leftmost content area forms a rectangle, while the rightmost content area forms a complete circle. The central example, however, produces a more complex shape composed of straight lines and circular arcs.
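The three cases in Figure 1 can be reproduced with a small sketch that rasterises the intersection of a circle with a rectangular sensor. This is an illustration only, not the pipeline proposed in this paper: the function names, the 160 x 90 sensor size and the three radii are assumptions chosen for the example, and pure Python lists are used only to keep it self-contained.

```python
def content_area_mask(width, height, cx, cy, radius):
    """Return a height x width grid of booleans that is True where the
    pixel lies inside the circle (cx, cy, radius), i.e. in the content area."""
    r2 = radius * radius
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= r2 for x in range(width)]
        for y in range(height)
    ]


def coverage(mask):
    """Fraction of sensor pixels that belong to the content area."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical 160 x 90 sensor with the projection centred on it.
W, H = 160, 90
cx, cy = (W - 1) / 2, (H - 1) / 2

small = content_area_mask(W, H, cx, cy, 40)    # circle fits on the sensor: complete circle
medium = content_area_mask(W, H, cx, cy, 70)   # top and bottom cropped: lines and arcs
large = content_area_mask(W, H, cx, cy, 100)   # circle covers the sensor: full rectangle

for name, mask in (("small", small), ("medium", medium), ("large", large)):
    print(name, round(coverage(mask), 3))
```

The border, as defined above, is simply the complement of this mask on the sensor.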

Figure 2. A frame from a laparoscopic procedure taken from the Cholec80 dataset (Twinanda et al. 2016).

The image shows a clear example of a circular endoscopic content area. The content area is bright and well

centred, the border region is dark and almost noise free, and the edge between them is clear and sharp. The

content area is incomplete in that one or more sections of the content area extend beyond the extents of the

frame. Examples such as this, where the top and bottom of the content area are cropped, are very common.

1.2. Content area estimation

Estimation of the content area is a common task in endoscopic image processing and computer vision. We define the task as the estimation of the geometric shape of the content area, not just a pixel-wise segmentation.

The estimation forms a foundational building block when constructing a geometric understanding of an endoscopic image. For example, it allows one to determine where the endoscopic view is centred, or whether a detected tool is obscured by organic tissue or is simply exiting the content area. Both of these tasks are vital when considering robotic control, such as autonomous control of the endoscope (Gruijthuijsen et al. 2022). Knowledge of the content area may also be useful when training and using deep learning vision models. The regions outside of the content area contain non-informative details in the form of noise and text overlays. These details may even serve to bias a model. For example, the information outside of the content area may be characteristic of the endoscope used during the intervention. A model trained to detect the type of intervention may learn to detect the characteristic non-content area information present in the training data. Masking out these details could simplify the input and remove such sources of bias. More speculatively, when training a segmentation or object detection model, the loss function could be modified so it does not penalise predictions made outside of the content area. In this way, the task to be learned may be simplified, as the model need not learn to correctly classify these regions. At inference time, any activations in these regions may be discarded. Additionally, as the content area only takes up a portion of the image, the amount of required computation can be reduced by skipping border regions when performing inference for time-critical applications.
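The loss modification suggested above can be sketched as a binary cross-entropy averaged only over content area pixels. This is a minimal sketch under stated assumptions, not something evaluated in this paper: the function name `masked_bce`, the choice of binary cross-entropy and the nested-list representation are illustrative, and a real training loop would use the tensor operations of its deep learning framework instead.

```python
import math


def masked_bce(pred, target, mask, eps=1e-7):
    """Mean binary cross-entropy over the pixels where mask is True.

    pred, target and mask are equally sized 2D lists: pred holds
    probabilities in [0, 1], target holds 0/1 labels, and mask is True
    inside the content area.
    """
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if not m:
                continue  # border pixel: contributes nothing to the loss
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count if count else 0.0


# Toy 2 x 2 example: the top-right pixel is a confident false positive
# that lies in the border, so the masked loss ignores it.
pred = [[0.9, 0.9], [0.9, 0.1]]
target = [[1, 0], [1, 0]]
mask = [[True, False], [True, True]]
everywhere = [[True, True], [True, True]]

masked = masked_bce(pred, target, mask)
unmasked = masked_bce(pred, target, everywhere)
print(masked, unmasked)
```

In the toy example the masked loss is lower than the unmasked one because the only badly wrong prediction falls outside the content area.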

For any follow-up task to be able to rely on the estimated content area, a high level of robustness must be achieved under all expected conditions. While still important, precision is less of a concern, as a content area found to be slightly off from the true content area will likely have little consequence on subsequent processing. To utilise content area estimation in a real-time setting, the estimation would ideally use minimal computing resources and processing time, thereby leaving these resources available for follow-up tasks.

As image sensor technology improves, and the manufacture of sensitive compact image sensors becomes cheaper, it may become increasingly common to mount the sensor on the end of the endoscope, known as chip-on-tip. This may remove the circular border artefact that is currently most prominent when using proximal cameras. Should chip-on-tip endoscopes become the norm, estimation of the content area may become less critical. Until such a time, it will remain important to be able to efficiently and reliably detect the border. Several chip-on-tip endoscope manufacturers, especially in the flexible endoscopy field, also continue to opt for an endoscope design with an incomplete content area. Finally, the ability to exploit historical endoscopic imaging data also warrants the availability of robust content area estimation algorithms.

1.3. Challenges in content area estimation

Delineation between the border and the content area of the image is made non-trivial by a few factors. Figure 3 shows a selection of endoscopic images demonstrating some of these difficulties. Firstly, while the border is generally a uniform black, a fair amount of low-level noise is often observed, and imperfections in the scope's optics can result in aberrations such as bright spots, diffuse light bleeding outside of the content area, or imperfect circles. Secondly, the image within the content area may be adverse in that it can have low brightness or contain within it a secondary circular oculus, such as when the tip of the endoscope is only partially inserted through a trocar. Thirdly, while the circular image projection is generally centred around the middle of the image, it can in fact be significantly offset from the centre and its radius can fall within a large range, even passing beyond the horizontal extent of the image for much of the image height. The spatial position and size of the circular image projection may also be surprisingly dynamic throughout an intervention, varying due to mechanical stresses placed through the endoscope and as the operator adjusts the zoom level on the camera. Finally, there can exist additional overlays such as secondary camera feeds, logos, and text.

(a) A saturated area at the edge of the content area bleeds into the border.

(b) A dark content area leaves only a small segment of the circle visible.

(c) … overlay.

(d) Text overlays the content area and border.

(e) Structure within the top right quadrant of the content area appears as a misleading circle segment.

(f) A dark content area combined with a mostly cropped border provides a truly challenging example.

Figure 3. A selection of examples taken from our hand annotated dataset, chosen to portray some of the adverse features faced during endoscopic content area estimation.

1.4. Related work

Despite the fundamental nature of endoscopic content area estimation, there has been little published work dedicated to the topic. One possible reason for this is that the problem may be assumed to be solved by trivial methods, such as applying an intensity threshold. While this is indeed the case for the majority of images, we find that real endoscopic video often contains difficult cases which are missed by such approaches. Another potential explanation may relate to the absence of established datasets to develop and evaluate content area estimation algorithms.
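To make the limits of such a trivial method concrete, the sketch below applies an intensity threshold to synthetic frames and reports the bounding box of the bright pixels. It is a hypothetical baseline written for illustration, not a method from the literature or from this paper: the function names, the threshold of 16 grey levels and the synthetic frames are all assumptions.

```python
def threshold_bounding_box(image, threshold=16):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels brighter than
    threshold, or None if no pixel exceeds it. image is a 2D list of
    grey levels in 0..255."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def synthetic_frame(width, height, cx, cy, radius, inside, outside=0):
    """Frame with a uniform disc of grey level `inside` on a uniform border."""
    r2 = radius * radius
    return [
        [inside if (x - cx) ** 2 + (y - cy) ** 2 <= r2 else outside
         for x in range(width)]
        for y in range(height)
    ]


# Easy case: bright content area on a clean black border.
clean = synthetic_frame(64, 48, 32, 24, 20, inside=120)
# Adverse case 1: the content area is darker than the threshold.
dark = synthetic_frame(64, 48, 32, 24, 20, inside=10)
# Adverse case 2: a single bright overlay pixel sits in the border.
overlay = [row[:] for row in clean]
overlay[1][1] = 255

print(threshold_bounding_box(clean))
print(threshold_bounding_box(dark))
print(threshold_bounding_box(overlay))
```

The threshold recovers the extent of the disc in the easy case, finds nothing when the content area is dark, and is dragged towards the image corner by a single overlay pixel, which mirrors the kinds of adverse features listed in Section 1.3.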
