ARUBA: An Architecture-Agnostic Balanced Loss for Aerial Object Detection
Rebbapragada V C Sairam Monish Keswani Uttaran Sinha
Nishit Shah Vineeth N Balasubramanian
Indian Institute of Technology Hyderabad
{ai20resch13001, monish.keswani, cs17mtech11003, cs18mtech11020, vineethnb}@iith.ac.in
Abstract
Deep neural networks tend to reproduce the bias of their training dataset. In object detection, the bias exists in the form of various imbalances such as class, background-foreground, and object size. In this paper, we define the size of an object as the number of pixels it covers in an image and size imbalance as the over-representation of certain sizes of objects in a dataset. We aim to address the problem of size imbalance in drone-based aerial image datasets. Existing methods for solving size imbalance are based on architectural changes that utilize multiple scales of images or feature maps for detecting objects of different sizes. We, on the other hand, propose a novel ARchitectUre-agnostic BAlanced Loss (ARUBA) that can be applied as a plugin on top of any object detection model. It follows a neighborhood-driven approach inspired by the ordinality of object size. We evaluate the effectiveness of our approach through comprehensive experiments on aerial datasets such as HRSC2016, DOTAv1.0, DOTAv1.5 and VisDrone and obtain consistent improvement in performance.
1. Introduction
In recent years, drones have shown immense potential in numerous disciplines. In military warfare, they can be used as target decoys for combat missions. In agriculture, drones provide farmers with real-time data to make informed harvesting decisions. For search-and-rescue, they can reach places where humans cannot. They are also used in fire-fighting, delivery of essentials and aerial photography. This increasing demand for drones in various domains has recently encouraged the computer vision community to work extensively on vision from drones [3].
Deep neural networks have led computer vision research and development for a decade now on multiple challenging problems such as semantic segmentation, object detection/tracking, and image classification. In object detection, methods such as Faster R-CNN [27], YOLO [25], RetinaNet [15] and their variants have achieved decent performance on many challenging datasets. With the growing interest in drone-based imagery and the creation of datasets for it, aerial object detection [34,6] has attracted considerable attention from the research community. Although the aforementioned methods exhibit exceptional performance on popular general object detection datasets such as MSCOCO [16], aerial-object datasets [34,6] pose more challenges, even to state-of-the-art object detection models.
Figure 1: Predictions on an image from the VisDrone dataset [6] with Focal loss [14] vs. ours. Top: Focal loss fails to detect many objects. Bottom: Ours recognizes additional objects, including small ones, because of our ARchitectUre-agnostic BAlanced (ARUBA) loss. Yellow boxes indicate objects additionally detected.
High variation in the scale and orientation of objects in aerial datasets, especially in drone images, makes detecting these objects quite challenging. Specialized methods [8,37] have been proposed to capture oriented bounding boxes efficiently. An added difficulty in aerial datasets [34,6,19] is that they are highly skewed in their object size distribution in addition to the class distribution, as shown in Figure 2. (In Figure 2b, the x-axis shows the object area bins, with object size increasing from left to right, and the y-axis shows the number of object instances per area bin.) We also observe that size imbalance is more severe in aerial object datasets than in more general-purpose object detection datasets (see Figure 3), which motivates us to address this imbalance problem of drone-based aerial datasets in this work.
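As a rough illustration of how a size histogram like the one in Figure 2b could be computed, the sketch below bins objects by bounding-box area. The bin width, the number of bins, the use of box area as a stand-in for the pixel count, the `(width, height)` input format, and the function name `area_histogram` are all assumptions made for illustration; the binning actually used for the figure is not specified in this section.

```python
from collections import Counter


def area_histogram(boxes, bin_width=256, num_bins=10):
    """Count objects per area bin.

    boxes: iterable of (width, height) pairs in pixels (hypothetical format).
    Areas at or beyond the last bin edge are clipped into the last bin.
    """
    counts = Counter()
    for w, h in boxes:
        area = w * h  # box area used as a proxy for the number of covered pixels
        idx = min(int(area // bin_width), num_bins - 1)
        counts[idx] += 1
    return [counts[i] for i in range(num_bins)]


# Toy data: many small boxes and very few large ones.
toy_boxes = [(8, 8)] * 6 + [(20, 20)] * 3 + [(100, 100)]
hist = area_histogram(toy_boxes)
```

On the toy data, most of the mass falls into the first bins, which mirrors the skew toward small objects described above.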
arXiv:2210.04574v3 [cs.CV] 18 Nov 2023
Figure 2: Highly skewed class and size distributions in the VisDrone dataset. (a) Class imbalance: the x-axis shows classes and the y-axis shows instances per class; the bar labels read car 144867, pedestrian 79337, motor 29647, people 27059, van 24956, truck 12875, bicycle 10480, bus 5926, tricycle 4812, a-tricycle 3246. (b) Size imbalance: the x-axis shows area bins and the y-axis shows instances per bin.
Size imbalance is a common problem in object detection datasets, and many methods have been proposed to mitigate it, as summarized in [20]. Existing methods [13,18,28] have largely proposed architectural modifications to enhance the model's ability to view objects at different scales. However, such multi-scale approaches arise from careful engineering of architectures to suit a specific domain or setting. In this work, we propose to address the size imbalance problem in aerial datasets from an architecture-agnostic balanced loss perspective. One could also view our approach as a long-tailed perspective on a size imbalance problem, unlike the class imbalance setting that is typically studied in long-tailed detection/recognition problems. In contrast to existing methods, we propose an architecture-independent approach that can be applied as a plugin on top of any object detection method.
Long-tailed object detection methods typically focus on datasets with skewed class distributions, aiming to improve performance on detecting and classifying minority classes. Many methods [10,2,32,31,5,30,24] have been proposed to tackle this problem from a class imbalance perspective (summarized in Sec 2). We focus on the idea of loss reweighting [5,30,24], wherein higher weights are assigned to tail classes. Unlike class labels, size (when ordered from large to small) is an ordinal variable, making it non-trivial to apply existing solutions for class imbalance to size. Besides, as shown in Figure 2b, small-sized objects are dominant in drone-based aerial datasets and large-sized objects are sparse. Although large-sized objects form the tail, they have larger spatial support, which can provide richer and more useful features than small objects offer and thus makes them easier to detect.
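To make the loss-reweighting idea for class imbalance concrete, here is a minimal sketch of inverse-frequency class weights computed from the per-class counts read off Figure 2a. This is a generic scheme chosen for illustration and is not claimed to be the formulation used in [5,30,24]; the normalization to a mean weight of 1, the function name, and the assumption that the bar labels map to the classes in the order listed are all ours.

```python
def inverse_frequency_weights(counts):
    """Return per-class weights proportional to 1 / count, scaled to mean 1."""
    inv = {name: 1.0 / n for name, n in counts.items()}
    mean_inv = sum(inv.values()) / len(inv)
    return {name: v / mean_inv for name, v in inv.items()}


# Per-class instance counts as read from Figure 2a (class order assumed).
visdrone_counts = {
    "car": 144867, "pedestrian": 79337, "motor": 29647, "people": 27059,
    "van": 24956, "truck": 12875, "bicycle": 10480, "bus": 5926,
    "tricycle": 4812, "a-tricycle": 3246,
}
weights = inverse_frequency_weights(visdrone_counts)
```

Under this scheme the rarest class receives the largest weight and the most frequent class the smallest, which is the usual behavior of class-balanced reweighting that the size-based setting has to depart from.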
On the other hand, learning small-sized objects, although they are the majority in such datasets, is challenging even for state-of-the-art detection models [27,25,15]. The increasing use of drone images and the lack of a consistent method for detecting objects of different sizes in such datasets motivate us to address the severe size imbalance in such aerial datasets. In summary, we address the long-tailed size imbalance issue in drone-based aerial datasets rather than the long-tailed class imbalance issue that is typically addressed in earlier related efforts.
Figure 3: Comparison of size imbalance severity between general and drone-based aerial object datasets: (a) size imbalance in COCO, (b) size imbalance in VisDrone. Both panels plot instances per bin against area bins. Note that the y-axis is the log of frequency, hence the effect is exponential in terms of occurrence.
To this end, we propose a novel architecture-agnostic loss-reweighting strategy which considers the ordinality of the size variable in its design. The performance of an object detection model on instances of a given size would have a contribution from object instances of neighboring sizes. For example, given a particular class, a model learned on object instances of area X is more likely to recognize an instance of area X ± δ than one of area X ± kδ, where k is a large integer. We hence apply a Gaussian amplification on the size distribution to consider the effect of such neighborhood instances (as detailed in Section 3).
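The neighborhood effect can be sketched as a Gaussian smoothing of the per-bin instance counts, so that each bin's effective count includes contributions from nearby bins that decay with distance. This is only one plausible reading of "Gaussian amplification"; the actual formulation is given in Section 3 and may differ. The kernel width `sigma`, the unnormalized kernel, and the function name `gaussian_amplify` are assumptions.

```python
import math


def gaussian_amplify(counts, sigma=1.0):
    """Effective count per bin: own count plus Gaussian-weighted neighbors."""
    n = len(counts)
    amplified = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            # Weight is 1 for the bin itself and decays with bin distance |i - j|.
            total += counts[j] * math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
        amplified.append(total)
    return amplified


# Toy size histogram: small-object bins dominate, large-object bins are sparse.
counts = [100, 50, 10, 0, 0, 5]
effective = gaussian_amplify(counts, sigma=1.0)

# A single populated bin shows how support spreads to its neighbors.
spike = gaussian_amplify([0, 0, 10, 0, 0], sigma=1.0)
```

With an unnormalized kernel, every bin keeps its own count and only gains from its neighbors, so an empty bin next to a populated one still receives a nonzero effective count, while bins further away receive much less.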
We subsequently use a clustering approach to assign weights to object instances based on their sizes. Finally, inspired by previous balanced-loss work that focuses on class imbalance [5], we reweight the loss based on size clusters to suit our problem. Unlike existing methods for long-tailed class imbalance, which assign lower weights to head categories, our method assigns higher weights to the head categories (small-sized objects), ensuring that the model learns better on them. We show that the size-imbalance problem can be addressed using such a loss-based approach without the need for time-consuming architecture engineering. To summarize, our key contributions are as follows:
• We propose a novel architecture-agnostic loss-reweighting strategy to solve the severe size imbalance issue in drone-based aerial image datasets. We call this ARchitectUre-agnostic BAlanced Loss (ARUBA), which can be applied while training any object detection model.
• To the best of our knowledge, this is the first such loss-based approach to handle size imbalance in this domain. Our key observations around the ordinality of the considered categories and the connection of such ordering to a model's performance may be useful in other settings with ordinal categories (e.g., class labels of a disease with increasing severity levels).
• We propose a simple yet effective pipeline based on well-known modules to achieve the objectives using our loss-reweighting strategy. Our extensive experimental results corroborate the usefulness of this pipeline.
• We perform a comprehensive suite of experiments on multiple drone-based aerial image datasets including HRSC2016, DOTA-v1.0, DOTA-v1.5 and VisDrone to