Open-source High-precision Autonomous Suturing
Framework With Visual Guidance
Hongbin Lin1,∗, Bin Li1,∗, Yunhui Liu1 and Kwok Wai Samuel Au1
∗Equal contribution; the first two authors contributed equally.
This work was supported in part by the CUHK Chow Yuk Ho Technology Centre of Innovative Medicine, in part by the Multi-Scale Medical Robotics Centre, InnoHK, 8312051, RGC T42-409/18-R, in part by SHIAE (BME-p1-17), and in part by the Natural Science Foundation of China under Grant U1613202. (Corresponding author: K. W. Samuel Au)
1Hongbin Lin, Bin Li, Yunhui Liu and Kwok Wai Samuel Au are with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong. {hongbinlin, binli}@link.cuhk.edu.hk; {yhliu, samuelau}@cuhk.edu.hk
Abstract—Autonomous surgery has attracted increasing attention for its potential to revolutionize robotic patient care, yet it remains a distant and challenging goal. In this paper, we propose an image-based framework for high-precision autonomous suturing. We first build an algebraic-geometric algorithm to achieve accurate needle pose estimation, then design a corresponding keypoint-based calibration network for joint-offset compensation, and finally plan and control the suture trajectory. Our solution ranked first among all competitors in the AccelNet Surgical Robotics Challenge. Videos and code can be found at https://sites.google.com/view/accel-2022-cuhk.
I. INTRODUCTION
Autonomous suturing is a long-standing robotic challenge for surgical autonomy [1]. Yet research in surgical autonomy has been slowed by the limited scope of benchmarking setups and a lack of standardization [2]. To address these issues, the organizers launched the 2021-2022 AccelNet Surgical Robotics Challenge (online), providing a high-fidelity simulation, standardized problem definitions, and evaluations for benchmarking autonomous suturing [2]. Targeting the three main problems of the AccelNet Challenge, namely needle pose estimation, needle grasping under joint offset, and multi-loop suturing, we built an autonomous suturing framework and achieved state-of-the-art (SOTA) performance. Our main contributions are:
• a SOTA image-guided framework for high-precision autonomous suturing tasks in the AccelNet Challenge;
• off-the-shelf open-source software of our framework, enabling easy reproduction and accelerating future research in the robotic surgery field.
II. METHOD AND RESULTS
A. Visual Perception
1) Needle Tracking: The goal of this task is to identify the needle pose $T \in \mathbb{R}^{4\times 4}$ from the stereo visual observations $I^{l,r}$; it provides the basis for the autonomous suturing operation. To this end, we first build a multi-task network based on Mask R-CNN [3] with segmentation and keypoint heads, which extract the coarse needle masks $M^{l,r}_c$ and the start/end keypoints $P^{l,r}_{xy}$, respectively. We further design a coarse-to-fine strategy that optimizes the coarse masks in RGB color space to obtain the fine segmentations $M^{l,r}_f$. The above process is formulated as follows:
$$[M^{l,r}_c,\, P^{l,r}_{xy}] = F_{\mathrm{multitask}}(I^{l,r}), \qquad M^{l,r}_c \xrightarrow{\ \mathrm{optimize}\ } M^{l,r}_f \tag{1}$$
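The excerpt does not spell out the refinement step. Below is a minimal sketch of one plausible coarse-to-fine refinement in RGB color space, assuming the needle's color statistics are sampled under the coarse mask and re-thresholded locally; the function name, the color-distance rule, and the tolerance values are our illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
import cv2

def refine_mask(image_rgb: np.ndarray, coarse_mask: np.ndarray,
                color_tol: float = 30.0, dilate_px: int = 5) -> np.ndarray:
    """Hypothetical coarse-to-fine refinement: re-threshold pixels near the
    needle's mean RGB color, restricted to a dilated coarse-mask region."""
    # Mean needle color, sampled under the coarse mask.
    needle_pixels = image_rgb[coarse_mask > 0].astype(np.float32)
    mean_color = needle_pixels.mean(axis=0)

    # Per-pixel Euclidean distance to the mean needle color.
    dist = np.linalg.norm(image_rgb.astype(np.float32) - mean_color, axis=2)

    # Keep color-consistent pixels inside a dilated neighborhood of the
    # coarse mask, so the fine mask can only grow locally.
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    region = cv2.dilate(coarse_mask.astype(np.uint8), kernel)
    return ((dist < color_tol) & (region > 0)).astype(np.uint8)
```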
To recover the needle pose $T$ from the extracted mask and keypoints, we present an algebraic-geometry-based algorithm. We define a six-degree-of-freedom (DOF) parameterization of the needle, $x = [\theta_{1,2},\, kp^{st,ed}_{x,y}] \in \mathbb{R}^6$, which represents the needle pose uniquely: it contains the four projected coordinates $kp^{st,ed}_{x,y}$ in the image plane and two angles $\theta_{1,2}$ located in the ray-formed plane, as shown in Fig. 1b. Given a hypothetical needle pose $x$, the corresponding reprojected axis-point set $S^{(l,r)}(x)$ can be calculated. The goal of needle pose estimation is therefore to find the $x$ that minimizes the reprojection error between the point set $S^{(l,r)}(x)$ and the fine needle mask $M^{(l,r)}_f$:
$$\arg\min_{x}\; J_A = \sum_{k\in\{r,l\}} \sum_{p_i \in M^{(k)}_f} \min_{p_j \in S^{(k)}(x)} \left\| p_i - p_j \right\|_2^2 \tag{2}$$
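Equation (2) can be evaluated efficiently with a nearest-neighbor structure over the reprojected point set. The sketch below is one way to do this, assuming a hypothetical reproject_axis_points(x, view) that returns $S^{(k)}(x)$ as pixel coordinates; the KD-tree is our implementation choice, not necessarily the authors'.

```python
import numpy as np
from scipy.spatial import cKDTree

def objective_JA(x: np.ndarray, masks: dict, reproject_axis_points) -> float:
    """Reprojection objective of Eq. (2).

    masks: {'l': (N_l, 2) pixel coords of the fine mask, 'r': (N_r, 2)}
    reproject_axis_points: hypothetical callable mapping (x, view) to the
    reprojected needle-axis point set S^(k)(x) as an (M, 2) array.
    """
    total = 0.0
    for view in ('l', 'r'):
        S = reproject_axis_points(x, view)   # S^(k)(x), shape (M, 2)
        tree = cKDTree(S)                    # nearest-neighbor lookup over S
        d, _ = tree.query(masks[view])       # min_j ||p_i - p_j|| per mask pixel
        total += float(np.sum(d ** 2))       # squared L2, summed over the mask
    return total
```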
Note that the objective $J_A$ innately handles the partially occluded case: the occluded part is simply not reflected in the mask and thus does not affect the optimization. This feature further expands the usable scenarios for needle pose estimation, since the needle is usually partially occluded by the PSM gripper during surgery. The needle pose is optimized in $x$-space with a gradient-based optimization method and then transformed into the Cartesian pose $T$. We randomly placed the needle and camera with their distance ranging from 80 to 200 mm, and collected 1k samples for training the multi-task network. The optimization is performed for a maximum of 1.5k steps. Our needle pose estimation algorithm achieved an average position error of 0.3 mm and an average angular error of 1.1 degrees on the AccelNet suture platform.
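For illustration, the search over $x$ can be driven by an off-the-shelf optimizer capped at the 1.5k steps mentioned above. SciPy's L-BFGS-B stands in here for whatever gradient method the authors used, the initialization values are placeholders, and masks and reproject_axis_points are the objects from the sketch above.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical initialization: angles at zero, keypoints from the network head.
kp_pred = np.array([120.0, 85.0, 260.0, 140.0])  # placeholder [kp^st_x, kp^st_y, kp^ed_x, kp^ed_y]
x0 = np.concatenate([[0.0, 0.0], kp_pred])       # x = [theta_1, theta_2, kp^{st,ed}_{x,y}]

res = minimize(objective_JA, x0,
               args=(masks, reproject_axis_points),  # from the sketch above
               method='L-BFGS-B',                    # finite-difference gradients of J_A
               options={'maxiter': 1500})            # at most 1.5k steps, as in the text
x_opt = res.x  # finally map x_opt back to the Cartesian pose T via the ray geometry
```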
2) Joint Calibration with a Monocular Camera: There is an unknown bias $\Delta q \in \mathbb{R}^6$ between the actual and measured joint positions, $q$ and $q_{msr}$, of a Patient Side Manipulator (PSM), where $\Delta q = q - q_{msr}$. The goal of joint calibration is to identify the unknown joint bias $\Delta q$ during evaluation. First, we tracked $N$ non-colinear feature points on the jaw of the PSM using DeepLabCut (DLC) [4], a state-of-the-art, markerless, data-driven framework that achieves high tracking accuracy from limited human-labeled data. $N$ was set to 4, since $N \geq 3$ suffices to identify a unique pose. The pixel positions of the tracked features in the monocular RGB image $I_{RGB}$ along the x and y axes, $x_{ft}$ and $y_{ft}$ respectively, were predicted by a trained DLC network.
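The excerpt breaks off before the calibration details, but the stated goal, recovering $\Delta q$ from the $N = 4$ tracked jaw features, admits a straightforward least-squares sketch. Here forward_kinematics and project_to_image are hypothetical helpers; this is not the authors' keypoint-based calibration network.

```python
import numpy as np
from scipy.optimize import least_squares

def calibrate_joint_offset(q_msr, uv_tracked, forward_kinematics, project_to_image):
    """Estimate the joint offset dq such that q = q_msr + dq best explains
    the DLC-tracked feature pixels.

    q_msr:       (6,) measured joint positions of the PSM
    uv_tracked:  (N, 2) tracked pixel coordinates of the N jaw features
    forward_kinematics: hypothetical callable, (6,) joints -> (N, 3) feature
                        positions on the jaw in the camera frame
    project_to_image:   hypothetical callable, (N, 3) points -> (N, 2) pixels
    """
    def residual(dq):
        pts_3d = forward_kinematics(q_msr + dq)  # apply the candidate offset
        uv_pred = project_to_image(pts_3d)       # reproject onto the image
        return (uv_pred - uv_tracked).ravel()    # pixel-space residuals

    result = least_squares(residual, x0=np.zeros(6))  # solve for dq = q - q_msr
    return result.x
```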