
[Figure 2: Highly skewed class and size distributions in the VisDrone dataset.
(a) Class Imbalance: instances per class (car 144867, pedestrian 79337, motor 29647, people 27059, van 24956, truck 12875, bicycle 10480, bus 5926, tricycle 4812, a-tricycle 3246).
(b) Size Imbalance: instances per bin, plotted over area bins.]
Size imbalance is a common problem in object detection
datasets, and many methods have been proposed to mitigate
it, as summarized in [20]. Existing methods [13,18,28]
have largely proposed architectural modifications that
enhance the model’s ability to view objects at different
scales. However, such multi-scale approaches arise from
careful engineering of architectures to suit a specific
domain or setting. In this work, we propose to address the
size imbalance problem in aerial datasets from an
architecture-agnostic balanced loss perspective. One could
also view our approach as a long-tailed perspective on a
size imbalance problem, unlike the class imbalance setting
that is typically studied in long-tailed detection/recognition
problems. In contrast to the existing methods, our
approach is architecture-independent and can be applied as
a plugin on top of any object detection method.
Long-tailed object detection methods typically focus on
datasets with skewed class distribution, to improve perfor-
mance on detecting and classifying minority classes. Many
methods [10,2,32,31,5,30,24] have been proposed
to tackle this problem from a class imbalance perspective
(summarized in Sec 2). We focus on the idea of using loss-
reweighting [5,30,24] wherein higher weights are assigned
to tail classes. Unlike class labels, size (when ordered
from large to small) is an ordinal variable, which makes it
non-trivial to apply existing solutions for class imbalance
to size. Besides, as shown in Figure 2b, small-sized objects
are dominant in drone-based aerial datasets and large-sized
objects are sparse. Although large-sized objects form the
tail, they have larger spatial support, which can provide
richer and more useful features than small objects and thus
makes them easier to detect.
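The class-level loss re-weighting idea mentioned above can be illustrated with a minimal sketch. It uses plain inverse-frequency weighting, which is a generic scheme chosen here for illustration and not necessarily the scheme used in [5,30,24]. The helper name `inverse_frequency_weights` and the `power` parameter are hypothetical; the counts are the per-class instance counts read from Figure 2a.

```python
import numpy as np

# Per-class instance counts for VisDrone, as read from Figure 2a.
class_counts = {
    "car": 144867, "pedestrian": 79337, "motor": 29647, "people": 27059,
    "van": 24956, "truck": 12875, "bicycle": 10480, "bus": 5926,
    "tricycle": 4812, "a-tricycle": 3246,
}

def inverse_frequency_weights(counts, power=1.0):
    """Hypothetical helper: weight each class by (1 / count) ** power,
    then rescale so that the weights average to 1."""
    names = list(counts)
    freq = np.array([counts[n] for n in names], dtype=float)
    raw = (1.0 / freq) ** power
    weights = raw / raw.mean()
    return dict(zip(names, weights))

# Tail classes (few instances) receive larger weights than head classes.
weights = inverse_frequency_weights(class_counts)
for name, w in weights.items():
    print(f"{name:>12s}: {w:.3f}")
```

Such per-class weights would typically multiply the per-instance classification loss. The size-based setting discussed in this paper differs in that the weighted variable is ordinal and the head, not the tail, is the hard case.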
On the other hand, learning small-sized objects, although
they are the majority in such datasets, is challenging even
for state-of-the-art detection models [27,25,15]. The
increasing use of drone images and the lack of a consistent
method for detecting objects of different sizes in such
datasets motivate us to address the severe size imbalance in
aerial datasets. In summary, we address the long-tailed size
imbalance issue in drone-based aerial datasets rather than
the long-tailed class imbalance issue that is typically
addressed in earlier related efforts.
[Figure 3: Comparison of size imbalance severity between general and drone-based aerial object datasets. Panels: (a) Size Imbalance-COCO, (b) Size Imbalance-VisDrone; both plot instances per bin over area bins. Note that the y-axis is the log of frequency, hence the effect is exponential in terms of occurrence.]
To this end, we propose a novel architecture-agnostic
loss-reweighting strategy which considers the ordinality of
the size variable in its design. The performance of an object
detection model on instances of a given size receives a
contribution from object instances of neighboring sizes. For
example, given a particular class, a model learned on object
instances of area X is more likely to recognize an instance
of area X ± δ than one of area X ± kδ, where k is a large
integer. We hence apply a Gaussian amplification on the size
distribution to account for the effect of such neighboring
instances (as detailed in Section 3).
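To make the neighborhood intuition concrete, the following is a minimal sketch of smoothing a per-bin size histogram with a Gaussian kernel over bin indices, so that each bin also receives support from nearby bins. It is an illustration under stated assumptions, not the formulation of Section 3: the helper name `gaussian_smooth_counts`, the bandwidth `sigma`, the unnormalized kernel, and the toy histogram are all hypothetical.

```python
import numpy as np

def gaussian_smooth_counts(counts, sigma=2.0):
    """Hypothetical helper: return effective per-bin counts in which bin i
    receives a contribution from bin j weighted by
    exp(-(i - j)^2 / (2 * sigma^2))."""
    counts = np.asarray(counts, dtype=float)
    idx = np.arange(len(counts))
    # Pairwise bin-index differences; kernel[i, j] is the influence of bin j on bin i.
    diff = idx[:, None] - idx[None, :]
    kernel = np.exp(-0.5 * (diff / sigma) ** 2)
    return kernel @ counts

# Toy long-tailed size histogram (assumed values): many small objects, few large ones.
toy = np.array([200, 100, 40, 8, 0, 0, 1, 0], dtype=float)
smoothed = gaussian_smooth_counts(toy, sigma=1.0)
print(np.round(smoothed, 2))
```

In this sketch an empty bin adjacent to populated bins ends up with a non-zero effective count, which mirrors the claim that instances of area X ± δ contribute to performance at area X far more than instances of area X ± kδ.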
We subsequently use a clustering approach to assign
weights to object instances based on their sizes. Finally,
inspired by previous balanced loss work that focuses on class
imbalance [5], we reweight the loss based on size clusters
to suit our problem. Unlike existing methods for long-tailed
class imbalance, which assign lower weights to head
categories, our method assigns higher weights to the head
categories (small-sized objects), ensuring that the model
learns better on them. We show that the size-imbalance problem
can be addressed using such a loss-based approach without
the need for time-consuming architecture engineering. To
summarize, our key contributions are as follows:
• We propose a novel architecture-agnostic loss-
reweighting strategy to solve the severe size im-
balance issue in drone-based aerial image datasets.
We call this ARchitectUre-agnostic BAlanced Loss
(ARUBA), which can be applied while training any
object detection model.
• To the best of our knowledge, this is the first such loss-
based approach to handle size imbalance in this do-
main. Our key observations around the ordinality of
the considered categories and the connection of such
ordering to a model’s performance may be useful in
other settings with ordinal categories (e.g. class labels
of a disease with increasing severity levels).
• We propose a simple yet effective pipeline based on
well-known modules to achieve the objectives using
our loss-reweighting strategy. Our extensive experimental
results corroborate the usefulness of this pipeline.
• We perform a comprehensive suite of experiments on
multiple drone-based aerial image datasets including
HRSC2016, DOTA-v1.0, DOTA-v1.5 and VisDrone to