2 S. Janarthan et al.
due to their high computational and memory requirements [3]. The need for
large labelled datasets for training is another drawback of conventional deep
learning techniques [4]. Recently proposed semi-supervised learning methods for
deep networks are also not ideal for this problem, as they frequently yield low
accuracy and unstable results across iterations.
To prevent a deep model from overfitting, it is also essential to provide
sufficient data during the training phase [5]. However, constructing a large
labelled dataset in the agriculture domain, especially for plant pests, not only
requires a high level of expertise but is also time-consuming. Moreover,
inaccurate labelling of the training data produces deep models with reduced
reliability. The concept of few-shot learning was proposed to replicate the
human ability to recognize objects from only a few examples [6]. Few-shot
learning has gained popularity across various domains because it can address
classification tasks with only a few training samples. In few-shot learning,
the classification accuracy increases as the number of shots grows. However, a
major limitation of few-shot learning is that the prediction accuracy drops as
the number of ways increases [7]. Another fundamental problem of this approach
is that directly applying the classification knowledge learned from the
meta-train classes to the meta-test classes is mostly not feasible [8].
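The terms "ways" and "shots" used above can be made concrete with a minimal sketch of how a single N-way K-shot episode might be sampled. This is an illustration only, not the sampling procedure of this study: the function names, the `n_query` parameter, and the toy class names are all hypothetical. The chance-level helper reflects one common intuition for why accuracy tends to fall as the number of ways grows (a random guess is correct with probability 1/N); it is not a result from the cited works.

```python
import random

def sample_episode(dataset, n_way, k_shot, n_query, rng=None):
    """Sample one N-way K-shot episode.

    dataset: dict mapping class name -> list of samples (hypothetical layout).
    Returns (support, query), each a list of (sample, episode_label) pairs.
    """
    rng = rng or random.Random()
    # Pick N classes ("ways") for this episode.
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw K support samples ("shots") plus disjoint query samples.
        picked = rng.sample(dataset[cls], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query

def chance_accuracy(n_way):
    """Accuracy of uniform random guessing in an N-way task."""
    return 1.0 / n_way

if __name__ == "__main__":
    # Toy data: 6 made-up pest classes with 10 placeholder image IDs each.
    toy = {f"pest_{i}": [f"img_{i}_{j}" for j in range(10)] for i in range(6)}
    support, query = sample_episode(toy, n_way=5, k_shot=1, n_query=3,
                                    rng=random.Random(0))
    print(len(support), len(query))
    print(chance_accuracy(5), chance_accuracy(20))
```

In this sketch, raising `k_shot` gives the classifier more labelled examples per class, whereas raising `n_way` enlarges the set of candidate classes that must be told apart.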
Constructing well-performing models with a reduced number of trainable
parameters by downsizing the kernel size of convolutions (e.g., from 3 × 3
to 1 × 1, as demonstrated in [9]) is a significant step towards the development
of lightweight networks. In recent years, lightweight deep network architectures
have gained growing popularity as an alternative to traditional deep networks
[10,11,12,13]. The MobileNets [14] and EfficientNets [15] families are thus far
the two most widely used lightweight networks. Several lightweight deep
network-based techniques have also been proposed for real-time pest recognition
[35,36]. However, lightweight architectures struggle to reach the expected level
of classification accuracy, as they are essentially designed for faster and
lighter deployment at the expense of performance.
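The effect of downsizing a convolution kernel from 3 × 3 to 1 × 1 on the number of trainable parameters can be checked with simple arithmetic. The sketch below uses the standard parameter count of a 2D convolution layer (kernel height × kernel width × input channels × output channels, plus one bias per output channel). The helper name and the choice of 64 input and 64 output channels are assumptions made for illustration and are not taken from [9].

```python
def conv2d_params(c_in, c_out, kernel_size, bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    weights = kernel_size * kernel_size * c_in * c_out
    return weights + (c_out if bias else 0)

if __name__ == "__main__":
    # Hypothetical layer width: 64 input and 64 output channels.
    p3 = conv2d_params(64, 64, 3)
    p1 = conv2d_params(64, 64, 1)
    print(p3)  # 36928
    print(p1)  # 4160
    # Ignoring biases, the 3x3 layer has exactly 9 times the weights.
    print(conv2d_params(64, 64, 3, bias=False) /
          conv2d_params(64, 64, 1, bias=False))  # 9.0
```

The saving comes at a cost: a 1 × 1 kernel mixes channels but no longer aggregates spatial neighbours, which is one reason that reducing parameters in this way can also reduce accuracy.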
To address this issue, a novel high-performing and lightweight pest recognition
approach is proposed in this study, as illustrated in Figure 1. A double
attention mechanism is incorporated to enhance the classification performance
while preserving the lightweight characteristic of the deep network. As
attention closely imitates the natural cognition of the human brain, the most
influential regions of the pest images are emphasized so that better feature
representations are learned. Notably, attention-aware deep networks have shown
improved performance in various classification tasks [18,19]. The key
contributions of this paper are threefold:
– A novel lightweight network-based framework integrated with a double
  attention scheme is proposed to enhance in-field pest recognition, especially
  when only a small amount of training data is available.
– An extensive set of experiments was conducted under diverse environments
  to demonstrate the feasibility and validate the in-field applicability of the
  proposed framework. To represent diverse environments, three publicly
  available datasets containing small to large numbers of pest samples are used.