cient and robust robotic weeding systems. To that end,
various deep neural networks (DNNs) have been developed and have achieved great success in weed detec-
tion and classification tasks. Specifically, the authors in
(Espejo-Garcia et al., 2020a) report that pretraining deep learning (DL) models on agricultural datasets is advantageous for reducing training epochs and improving model accuracy. Four DL models are fine-tuned on two datasets,
Plant Seedlings Dataset and Early Crop Weeds Dataset
(Espejo-Garcia et al., 2020a), and classification perfor-
mance improvements ranging from 0.51% to 1.89% are reported.
In addition, two DL models, ResNet50 (He et al., 2016)
and Inception-v3 (Szegedy et al., 2016), are tested on
the DeepWeeds dataset (Olsen et al., 2019) using trans-
fer learning (Zhang et al., 2022) for weed identification,
achieving classification accuracies over 95%. In Chen
et al. (2022), 35 state-of-the-art DL models are evaluated
on the CottonWeedID15 dataset for classifying 15 com-
mon weed species in U.S. cotton production systems,
and 10 out of the 35 models obtain F1 scores over 95%.
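A common recipe underlying these studies is to initialize from ImageNet-pretrained weights and replace the classification head before fine-tuning on weed images. The following is a minimal PyTorch sketch of this recipe (not the exact setup of any cited work), assuming a 15-class task in the spirit of CottonWeedID15; the backbone and weight choice are illustrative:

    import torch.nn as nn
    from torchvision import models

    def build_weed_classifier(num_classes: int = 15) -> nn.Module:
        # Minimal transfer-learning sketch: ImageNet-pretrained backbone plus
        # a fresh task head; all hyperparameter choices are illustrative.
        model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        model.fc = nn.Linear(model.fc.in_features, num_classes)  # new weed-class head
        return model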
Large-scale and diverse labeled image data is essential
for developing the aforementioned DL algorithms. Ef-
fective, robust, and advanced DL algorithms for weed
recognition in complex field environments require a com-
prehensive dataset that covers different lighting/field con-
ditions, various weed shapes/sizes/growth stages/colors,
and mixed camera capture angles (Zhang et al., 2022;
Chen et al., 2022). In contrast, insufficient or low-
quality datasets will lead to poor model generalizability
and overfitting issues (Shorten and Khoshgoftaar, 2019).
Recently, notable progress has been made on devel-
oping dedicated image datasets for weed control (Lu
and Young, 2020), such as DeepWeeds (Olsen et al.,
2019), Early Crop Weeds Dataset (Espejo-Garcia et al., 2020b),
CottonWeedID15 (Chen et al., 2022), and YOLOWeeds
(Dang et al., 2022), just to name a few. However, col-
lecting a large-scale and labeled image dataset is often
resource intensive, time consuming, and economically
costly. One sound approach to addressing this bottleneck
is to develop advanced data augmentation techniques that
can generate high-quality and diverse images (Xu et al.,
2022). However, basic data augmentation approaches,
such as geometric transformations (e.g., flips and rotations) and color transformations (e.g., Fancy PCA (Krizhevsky et al., 2012) and color channel swapping), tend to produce highly correlated samples and are unable to capture the variations
or invariant features across the samples in the training
data (Shorten and Khoshgoftaar, 2019; Lu and Young,
2020).
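For concreteness, the following minimal NumPy sketch implements these basic augmentations, using a per-image variant of Fancy PCA (the original scheme computes the principal components over the full training set); the jitter scale and probabilities are illustrative assumptions:

    import numpy as np

    def fancy_pca(image: np.ndarray, alpha_std: float = 0.1) -> np.ndarray:
        # Fancy PCA color jitter: perturb pixels along the principal components
        # of the RGB covariance (here computed per image, for simplicity).
        flat = image.reshape(-1, 3).astype(np.float64) / 255.0
        cov = np.cov(flat - flat.mean(axis=0), rowvar=False)  # 3x3 RGB covariance
        eigvals, eigvecs = np.linalg.eigh(cov)
        alphas = np.random.normal(0.0, alpha_std, 3)          # one draw per image
        shift = eigvecs @ (alphas * eigvals)                  # per-channel offset
        out = image.astype(np.float64) / 255.0 + shift
        return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8)

    def basic_augment(image: np.ndarray) -> np.ndarray:
        # Random horizontal flip and 90-degree rotation, then color jitter.
        if np.random.rand() < 0.5:
            image = np.fliplr(image)
        image = np.rot90(image, k=np.random.randint(4))
        return fancy_pca(image)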
Lately, advanced data augmentation approaches, such
as generative adversarial networks (GANs), have gained
increased attention in the agricultural community due
to their capability of generating naturalistic images (Lu
et al., 2022). In Espejo-Garcia et al. (2021), Deep Con-
volutional GAN (DCGAN) (Radford et al., 2015) com-
bined with transfer learning is adopted to generate new
weed images for weed identification tasks. The authors
then train the Xception network (Chollet, 2017) with the
synthetic images on the Early Crop Weed dataset (Espejo-
Garcia et al., 2020b) (which contains 202 tomato and 130 black nightshade images at early growth stages) and obtain a testing accuracy of 99.07%. Conditional Generative Ad-
versarial Network (C-GAN) (Mirza and Osindero, 2014)
is adopted in (Abbas et al., 2021) to generate synthetic tomato plant leaf images to enhance the performance of plant
disease classification. The DenseNet121 model (Huang
et al., 2017) is trained on synthetic and real images using
transfer learning on the PlantVillage dataset (Hughes et al.,
2015) to classify tomato leaf images into ten disease categories, yielding a 1-4% improvement in classifi-
cation accuracy compared to training on the original data
without image augmentation. Readers are referred to (Lu et al., 2022) for recent advances in GANs
in agricultural applications. While impressive progress has been made, GANs often suffer from training instability and mode collapse issues (Arjovsky et al., 2017;
Creswell et al., 2018; Mescheder et al., 2017), and they
could fail to capture a wide data distribution (Zhao et al.,
2018), which makes them difficult to scale and apply to new domains.
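To ground the discussion, a minimal PyTorch sketch of a DCGAN-style generator in the spirit of Radford et al. (2015) follows; the latent dimension, channel widths, and 64x64 output resolution are illustrative assumptions rather than the settings of the cited studies:

    import torch
    import torch.nn as nn

    class DCGANGenerator(nn.Module):
        # DCGAN-style generator: latent vector -> 64x64 RGB image via strided
        # transposed convolutions with batch norm and ReLU (Tanh at the output).
        def __init__(self, z_dim: int = 100, base: int = 64):
            super().__init__()

            def up(cin, cout):
                return [nn.ConvTranspose2d(cin, cout, 4, 2, 1, bias=False),
                        nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]

            self.net = nn.Sequential(
                nn.ConvTranspose2d(z_dim, base * 8, 4, 1, 0, bias=False),  # 1 -> 4
                nn.BatchNorm2d(base * 8), nn.ReLU(inplace=True),
                *up(base * 8, base * 4),                                   # 4 -> 8
                *up(base * 4, base * 2),                                   # 8 -> 16
                *up(base * 2, base),                                       # 16 -> 32
                nn.ConvTranspose2d(base, 3, 4, 2, 1), nn.Tanh(),           # 32 -> 64
            )

        def forward(self, z: torch.Tensor) -> torch.Tensor:
            return self.net(z.view(z.size(0), -1, 1, 1))

    # e.g., DCGANGenerator()(torch.randn(8, 100)) yields 8 synthetic 64x64 images.

A discriminator and adversarial training loop would complete the setup, and it is precisely that adversarial loop whose instability motivates the alternatives discussed next.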
On the other hand, diffusion probabilistic models (also
known as diffusion models) have quickly gained popu-
larity for producing high-quality images (Ho et al., 2020;
Song et al., 2020; Dhariwal and Nichol, 2021). Diffusion models, inspired by non-equilibrium thermodynamics (Jarzynski, 1997; Sohl-Dickstein et al., 2015), progressively add noise to data in a fixed forward process and then generate the desired data samples by learning a reverse Markov chain that starts from white noise (Song et al., 2020).
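Concretely, the forward process admits a closed-form marginal, and training reduces to predicting the injected noise. The sketch below follows the DDPM formulation of Ho et al. (2020), where the step count, the linear beta schedule, and the `model` noise-prediction network (e.g., a U-Net) are illustrative assumptions:

    import torch
    import torch.nn.functional as F

    T = 1000                                          # illustrative step count
    betas = torch.linspace(1e-4, 0.02, T)             # linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)    # cumulative \bar{alpha}_t

    def q_sample(x0, t):
        # Closed-form forward marginal:
        # x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I).
        eps = torch.randn_like(x0)
        abar = alphas_bar[t].view(-1, 1, 1, 1)
        return abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps, eps

    def ddpm_loss(model, x0):
        # Simple DDPM training objective: the network predicts the injected noise.
        t = torch.randint(0, T, (x0.size(0),))
        x_t, eps = q_sample(x0, t)
        return F.mse_loss(model(x_t, t), eps)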
Recent research shows that diffusion models can produce high-quality synthetic images, surpassing GANs on several tasks (Dhariwal and Nichol, 2021), and there is significant interest in validating diffusion models across image generation tasks, such as video generation (Ho et al., 2022b), medical image generation (Özbey et al., 2022), and image-to-image translation (Saharia et al., 2021). However, the potential
of diffusion methods in agricultural image generation re-
mains largely unexplored, partly owing to the substantial
computational burden of image sampling in regular diffu-
sion models. In this paper, we present the first results
on image generation for weed recognition tasks using
a classifier-guided diffusion model (ADM-G; Dhariwal and Nichol, 2021) based on a 2D U-Net (Ronneberger