Semi-supervised object detection based on single-stage detector for thighbone fracture localization



Jinman Wei (a), Jinkun Yao (b), Guoshan Zhang (a,∗), Bin Guan (a), Yueming Zhang (a), Shaoquan Wang (a)

(a) School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China.

(b) Department of Radiology, Linyi People's Hospital, 276000 Linyi, China.

Abstract

The thighbone is the largest bone supporting the lower body. If a thighbone fracture is not treated in time, it can lead to a lifelong inability to walk. Correct diagnosis of thighbone disease is therefore very important in orthopedic medicine. Deep learning is promoting the development of fracture detection technology. However, existing computer-aided diagnosis (CAD) methods based on deep learning rely on a large amount of manually labeled data, and labeling these data costs considerable time and effort. We therefore develop an object detection method that works with a limited number of labeled images and apply it to thighbone fracture localization. In this work, we build a semi-supervised object detection (SSOD) framework based on a single-stage detector, which includes three modules: the adaptive difficult sample oriented (ADSO) module, Fusion Box, and the deformable expand encoder (Dex encoder). The ADSO module uses the classification score, through weighting, as the criterion for evaluating label reliability; Fusion Box merges similar pseudo boxes into a single reliable box for box regression; and the Dex encoder is proposed to enhance adaptability to image augmentation. The experiments are conducted on a thighbone fracture dataset that includes 3484 training images and 358 testing images. The experimental results show that the proposed method achieves state-of-the-art AP in thighbone fracture detection at different labeled data rates, i.e. 1%, 5% and 10%. In addition, when the full data are used for knowledge distillation, our method achieves 86.2% AP50 and 52.6% AP75.

Keywords: Semi-supervised Learning; Object Detection; Single-stage; Thighbone Fracture Detection

∗Corresponding author

Email addresses: 2021234147@tju.edu.cn (Jinman Wei), yjk1213@163.com (Jinkun Yao), zhanggs@tju.edu.cn (Guoshan Zhang), guanbin@tju.edu.cn (Bin Guan), seife@tju.edu.cn (Yueming Zhang), sqwang@tju.edu.cn (Shaoquan Wang)

Preprint submitted to Applied Soft Computing, October 21, 2022

arXiv:2210.10998v1 [eess.IV] 20 Oct 2022

1. Introduction

The thighbone is located below the pelvis. The thighbone and acetabulum constitute the hip joint and support the whole body. Because various activities of the human body depend on the thighbone, it is one of the most vulnerable parts. The diagnosis of ordinary and comminuted fractures is a significant part of surgical diagnosis[1]. However, compared with the huge number of patients, experienced surgeons are in short supply, and they urgently need assistance to relieve their workload. To address this problem, many computer-aided detection and diagnosis methods[2] have been proposed. In recent years, substantial progress has been made in developing deep learning-based CAD systems for fracture diagnosis. Guan et al. proposed a convolutional neural network for thighbone fracture detection that can balance the information of each feature map in ResNeXt's feature pyramid[3]. Firat et al. designed an integrated object detection model for fracture detection in wrist X-ray images[4]. At present, state-of-the-art fracture detection methods are usually developed on large-scale expert annotations, such as 5134 labeled CT images for spinal fracture detection[5], 7356 wrist radiographic images[6], and 9040 labeled hand, wrist, knee, ankle and foot radiographs for multiple fracture detection[7].

Compared with the above-mentioned methods, semi-supervised learning (SSL) uses both labeled and unlabeled data when training the model; the unlabeled data assist in optimizing the model, which saves training cost. The state-of-the-art semi-supervised methods are pseudo-label based approaches[8]. Specifically, the model is first trained on labeled data, and the trained model is then used to predict pseudo labels on unlabeled images. The teacher-student model[9] is a common way to generate pseudo labels in semi-supervised learning; its key idea is to train two independent models, namely a teacher model and a student model. The teacher model is trained on the labeled images and used to label the unlabeled images; these pseudo-labeled images are then mixed with the labeled images to train the student model.
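The pseudo-labeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` record, the `make_pseudo_labels` helper, the `toy_teacher` stand-in and the threshold value of 0.5 are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    score: float  # classification confidence in [0, 1]
    label: int


def make_pseudo_labels(
    teacher: Callable[[str], List[Detection]],
    unlabeled_ids: List[str],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> Dict[str, List[Detection]]:
    """Run the teacher on unlabeled images and keep only confident detections."""
    pseudo: Dict[str, List[Detection]] = {}
    for image_id in unlabeled_ids:
        kept = [d for d in teacher(image_id) if d.score > threshold]
        if kept:  # images without any confident detection are skipped
            pseudo[image_id] = kept
    return pseudo


def toy_teacher(image_id: str) -> List[Detection]:
    """Stand-in for a trained teacher model; returns fixed predictions."""
    table = {
        "img_a": [
            Detection((10.0, 10.0, 50.0, 50.0), 0.9, 0),
            Detection((12.0, 11.0, 52.0, 49.0), 0.3, 0),
        ],
        "img_b": [Detection((30.0, 30.0, 80.0, 90.0), 0.2, 0)],
    }
    return table.get(image_id, [])


# Ground-truth annotations for the labeled subset (toy data).
labeled = {"img_0": [Detection((5.0, 5.0, 40.0, 60.0), 1.0, 0)]}

# Pseudo labels from the teacher are mixed with the labeled data
# to form the training set of the student.
pseudo = make_pseudo_labels(toy_teacher, ["img_a", "img_b"])
student_training_set = {**labeled, **pseudo}
```

In this toy run only `img_a` contributes a pseudo label, because every other prediction falls below the threshold. A real pipeline would replace `toy_teacher` with a detector trained on the labeled images.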

Most research on SSOD has focused on two-stage detectors[10–13]. However, building on single-stage detectors (such as FCOS[14], YOLOF[15] and RetinaNet[16]) is more attractive and practical, because they can be easily deployed on devices with limited resources and eliminate cumbersome preprocessing and post-processing except for NMS[17]. The main difference between single-stage and two-stage detectors is that the region proposal network (RPN)[18] of a two-stage detector filters out most of the background samples, and in the next stage the detailed categories of the remaining candidate boxes are predicted. Single-stage detectors make dense predictions for all areas of the image at once, even though only a few bounding boxes are predicted as positive samples. Because the generation and judgment of proposals are integrated in single-stage detectors, detection is faster, but the classification scores of single-stage detectors are lower than those of two-stage detectors. Directly sending pseudo labels with low classification scores to the student model introduces a lot of noise and affects the training accuracy of the model. Therefore, how to deal with a large number of low-quality pseudo labels in dense prediction remains an important problem.

The regression branch is another component of the object detection task, and the regression quality of the pseudo boxes is another important factor that determines the performance of a semi-supervised object detection model. Xu et al.[20] find that the accuracy of the regression is related to the uncertainty calculated by the BoxJitter module, but the BoxJitter module relies on Regions with CNN features (RCNN) to process the proposals, so it is not applicable to single-stage detectors. To address this issue, we propose the Fusion Box module in the regression branch for SSOD based on a single-stage detector.
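As a rough illustration of merging similar pseudo boxes into one box, the sketch below groups boxes greedily and averages each group. This excerpt does not specify the similarity measure or the merge rule of Fusion Box, so both are assumptions here: similarity is taken to be IoU with a hypothetical threshold `xi`, and the merged box is a score-weighted average of the group's coordinates.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(boxes: List[Box], scores: List[float], xi: float = 0.7) -> List[Box]:
    """Greedily group boxes whose IoU with a seed box is >= xi and merge each group.

    Assumption: the merge rule is a score-weighted coordinate average.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = [False] * len(boxes)
    fused: List[Box] = []
    for i in order:
        if used[i]:
            continue
        # The seed box always belongs to its own group.
        group = [i] + [
            j for j in order
            if j != i and not used[j] and iou(boxes[i], boxes[j]) >= xi
        ]
        for j in group:
            used[j] = True
        weights = [scores[j] for j in group]
        if sum(weights) <= 0:  # fall back to a plain mean for zero scores
            weights = [1.0] * len(group)
        total = sum(weights)
        merged = tuple(
            sum(w * boxes[j][k] for w, j in zip(weights, group)) / total
            for k in range(4)
        )
        fused.append(merged)
    return fused


# Two overlapping pseudo boxes and one distant box (toy data).
boxes = [(10.0, 10.0, 50.0, 50.0), (12.0, 12.0, 52.0, 52.0), (100.0, 100.0, 140.0, 140.0)]
scores = [0.9, 0.6, 0.8]
fused = fuse_boxes(boxes, scores, xi=0.7)
```

Here the first two boxes overlap strongly and collapse into one box, while the third is kept unchanged, so the regression branch sees two targets instead of three.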

In summary, we develop a semi-supervised framework based on a single-stage detector for thighbone fracture detection. In this framework, the adaptive difficult sample oriented (ADSO) module and the Fusion Box module are developed to reduce the impact of inaccurate pseudo label prediction. In addition, a Single-in-Single-out (SISO) encoder called the Dex encoder is proposed to improve adaptability to the augmented input images. The main contributions of this paper can be summarized as follows:

1. We develop a semi-supervised object detection framework based on a single-stage detector for thighbone fracture detection with limited annotations. Compared with previous work, it has fewer parameters and a faster detection speed.

2. The adaptive difficult sample oriented (ADSO) module is proposed, which takes the classification score of the teacher model as the criterion of pseudo label reliability.

3. The Fusion Box module is proposed to reduce the impact on model performance of regressing multiple pseudo boxes at the same position.

4. We design a Single-in-Single-out encoder named the Deformable expand encoder (Dex encoder) to enhance the ability to learn deformed features.

5. The experimental results show that our method outperforms other supervised and semi-supervised methods in thighbone fracture detection.

2. Related work

2.1. Deep learning for medical detection

CAD has been extensively studied in the past decade[21, 22], and CAD systems based on deep learning have been developed to diagnose a wide range of pathologies, such as detection of COVID-19[23, 24], mass and calcification features in mammography[25] and brain tumor diagnosis[26]. Among deep learning-based fracture detection methods[27], FAMO[7] constructed the Feature Ambiguity Mitigate Operator model to mitigate feature ambiguity in bone fracture detection on radiographs of various body parts. Because annotation requires professional medical knowledge, the labor cost of large-scale annotation is high, which hinders the development of CAD solutions based on deep learning. Computer-aided detection using SSL has emerged in recent years; for example, Yirui Wang et al. proposed the adaptive asymmetric label sharpening (AALS) algorithm using the teacher-student paradigm, which addresses the label imbalance problem unique to the medical field[28].

2.2. Object detection

Object detection is one of the core tasks in computer vision. At present, CNN-based object detectors can be divided into single-stage and two-stage detectors. Faster R-CNN[18] is the representative two-stage detector, which uses an RPN for proposal extraction and an RCNN head for regional prediction and extraction of objects. A single-stage detector uses only the features extracted by the feature extraction network for regression and classification. For example, SSD[29] uses the feature pyramid method to perform target regression and classification on features of different scales at the same time. Chen et al. developed YOLOF, which uses only the C5 feature for detection, as shown in Figure 1; the complex Multiple-in-Multiple-out encoder is replaced by a simple Single-in-Single-out encoder. YOLOF contains two key components: the dilated encoder and the uniform decoder.

Figure 1: The structure of YOLOF.

2.3. Semi-supervised learning in object detection

SSL methods play a leading role in image classification[31–35]. Because object detectors have a complex architecture and involve multi-task learning (classification and regression), transferring SSL methods to object detection is not simple. Current SSOD methods follow two main directions: Consistency Regularization[36] and Pseudo Label[8]. The former uses two deep convolutional neural networks to learn the consistency between different data augmentations[37] (horizontal flip, different contrast, brightness, etc.) of the same unlabeled image, so that the predictions for an image remain the same under small perturbations. The latter uses a model pre-trained on labeled data to infer labels for the unlabeled data. In recent years, semi-supervised object detection has attracted growing attention[38–40]. STAC[19] was the first to apply the pseudo label method to SSOD: it applies weak data augmentation to unlabeled data and uses the trained teacher model to generate pseudo labels for unlabeled images. Unbiased Teacher[41] uses focal loss[16] to address the imbalance between positive and negative samples. Instant-Teaching[42] trains two models at the same time that check and correct each other's pseudo labels, which effectively suppresses the accumulation of false predictions. Almost all of the above work is based on two-stage detectors such as Faster R-CNN, which are inconvenient to deploy in medical settings with limited resources. Inspired by the above works, we design a fast semi-supervised detection model based on a single-stage detector.

Figure 2: The pipeline of the established semi-supervised object detection framework. The labeled and unlabeled images are sent into the training pipeline in batches. The teacher annotates the unlabeled images with pseudo labels, which serve as the student's ground truth; the teacher is not updated by back-propagation. The student model transfers its parameters to the teacher model through EMA. In the classification (Cls) branch, ADSO adjusts the confidence of the pseudo labels to evaluate their reliability. The regression (Reg) branch decides whether to merge pseudo boxes according to the similarity ξ. The classification and regression branches use focal loss and CIoU loss, respectively.

3. Methodology

Our method adopts the teacher-student mutual learning mode, in which the student model learns from the detection loss on labeled and unlabeled images. Each unlabeled image has two groups of pseudo boxes, used to train the classification branch and the regression branch, respectively. The teacher model is updated from the student model with an exponential moving average (EMA). The pseudo boxes predicted by the teacher model are first filtered by confidence: only the pseudo labels with classification scores higher than the confidence threshold σ are retained. The remaining pseudo boxes are sent to the classification branch and the regression branch. There are two critical designs in this SSOD framework: ADSO and Fusion Box. Figure 2 illustrates our SSOD framework.
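The EMA update of the teacher can be sketched as follows. This is the standard exponential moving average rule, teacher = decay × teacher + (1 − decay) × student, applied per parameter. The dictionary-of-floats parameter layout and the decay value are assumptions made to keep the example free of dependencies; the excerpt does not state the decay used by the authors.

```python
from typing import Dict


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.999,  # hypothetical value, not taken from the paper
) -> None:
    """Move each teacher parameter slightly toward the student's value, in place."""
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value


# Toy parameters: in practice these would be the detector's weight tensors.
teacher_params = {"w": 1.0, "b": 0.0}
student_params = {"w": 0.0, "b": 1.0}

# A small decay is used here only to make the effect of one step visible.
ema_update(teacher_params, student_params, decay=0.9)
```

Because the teacher receives no gradients, this update is the only way its parameters change; a decay close to 1 makes the teacher a slowly varying average of past student states.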

3.1. Semi-supervised learning framework

In each training iteration, unlabeled images and labeled images are sampled according to a certain data sampling ratio. The data are preprocessed by two different preprocessing methods to obtain strongly augmented labeled images and weakly and strongly augmented unlabeled images. The student network is trained with the pseudo boxes generated by the teacher model and
