A Survey of Computer Vision Technologies In Urban and Controlled-environment Agriculture

JIAYUN LUO,Nanyang Technological University, Singapore
BOYANG LI,Nanyang Technological University, Singapore
CYRIL LEUNG,Nanyang Technological University, Singapore and China-Singapore International Joint Research
Institute, China
In the evolution of agriculture to its next stage, Agriculture 5.0, artificial intelligence will play a central role. Controlled-environment agriculture, or CEA, is a special form of urban and suburban agricultural practice that offers numerous economic, environmental, and social benefits, including shorter transportation routes to population centers, reduced environmental impact, and increased productivity. Due to its ability to control environmental factors, CEA couples well with computer vision (CV) in the adoption of real-time monitoring of plant conditions and autonomous cultivation and harvesting. The objective of this paper is to familiarize CV researchers with agricultural applications and agricultural practitioners with the solutions offered by CV. We identify five major CV applications in CEA, analyze their requirements and motivation, and survey the state of the art as reflected in 68 technical papers using deep learning methods. In addition, we discuss five key subareas of computer vision and how they relate to these CEA problems, as well as fourteen vision-based CEA datasets. We hope the survey will help researchers quickly gain a bird's-eye view of this thriving research area and will spark inspiration for new research and development.
CCS Concepts: • Computing methodologies → Computer vision; Neural networks; • Applied computing → Agriculture; • General and reference → Surveys and overviews.
Additional Key Words and Phrases: agriculture 5.0, controlled-environment agriculture, multimodality, pest and disease detection, growth monitoring, flower and fruit detection
ACM Reference Format:
Jiayun Luo, Boyang Li, and Cyril Leung. 2018. A Survey of Computer Vision Technologies In Urban and Controlled-environment
Agriculture. In Woodstock ’18: ACM Symposium on Neural Gaze Detection, June 03–05, 2018, Woodstock, NY. ACM, New York, NY, USA,
35 pages. https://doi.org/10.1145/1122445.1122456
The authors can be reached at the following address: 50 Nanyang Avenue, School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798. Boyang Li is the corresponding author. The research is funded by WeBank-NTU Joint Research Center and China-Singapore International Joint Research Institute.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
©2018 Association for Computing Machinery.
Manuscript submitted to ACM
arXiv:2210.11318v2 [cs.CV] 12 Oct 2023
Woodstock ’18, June 03–05, 2018, Woodstock, NY Luo, et al.
1 INTRODUCTION
Artificial intelligence (AI), especially computer vision (CV), is finding an ever-broadening range of applications in modern agriculture. The next stage of agricultural technological development, Agriculture 5.0 [15, 100, 236, 361], will feature AI-driven autonomous decision making as a central component. The term Agriculture 5.0 stems from a chronology [361] that begins with Agriculture 1.0, which depends heavily on human labor and animal power, continues with Agriculture 2.0, enabled by synthetic fertilizers, pesticides, and combustion-powered machinery, and develops into Agriculture 3.0 and 4.0, characterized by GPS-enabled precision control and Internet-of-Things (IoT) driven data collection [257]. Built upon the rich agricultural data collected, Agriculture 5.0 holds the promise to further increase productivity, meet the food demand of a growing global population, and mitigate the negative environmental impact of existing agricultural practices.
As an integral component of Agriculture 5.0, controlled-environment agriculture (CEA), a farming practice carried out within urban, indoor, resource-controlled, and sensor-driven factories, is particularly suitable for the application of AI and CV. This is because CEA provides ample infrastructure support for data collection and autonomous execution of algorithmic decisions. In terms of productivity, CEA could produce higher yield per unit area of land [8, 9] and boost the nutritional content of agricultural products [162, 313]. In terms of environmental impact, CEA farms can insulate crops from external environmental influences, reduce the need for fertilizer and pesticides, and efficiently utilize recycled resources like water, and may therefore be much more environmentally friendly and self-sustainable than traditional farming.
In light of current global challenges, such as disruptions to global supply chains and the threat of climate change, CEA appears especially appealing as a food source for urban population centers. Under pressures of deglobalization brought by geopolitical tensions [371] and global pandemics [237, 276], CEA makes it possible to build farms close to large cities, which shortens the transportation distance and maintains secure food supplies even when long-distance routes are disrupted. The city-state Singapore, for example, has promised to source 30% of its food domestically by 2030 [1, 315], which is only possible through suburban farms such as CEAs. Furthermore, CEA, as a form of precision agriculture, is by itself a viable means of reducing greenhouse gas emissions [9, 37, 249]. CEA can also shield plants from adverse climate conditions exacerbated by climate change, as its environments are fully controlled [112], and is able to effectively reuse arable land eroded due to climate change [373].
We argue that AI and CV are critical to the economic viability and long-term sustainability of CEAs, as these technologies could reduce production expenses and improve productivity. Suburban CEAs have high land costs. An analysis in Victoria, Australia [38] shows that, because of the higher land cost resulting from proximity to cities, a CEA still takes 6 to 7 years to reach the break-even point even with an estimated 50-fold productivity improvement per land area. Thus, further productivity improvement from AI would act as a strong driver for CEA adoption. Moreover, the vertical or stacked setup of vertical farms makes it more difficult for farmers to perform daily surveillance and operations. Automated solutions empowered by computer vision could effectively solve this problem. Finally, AI and CV technologies have the potential to fully characterize the complex, individually different, time-varying, and dynamic conditions of living organisms [39], which will enable precise and individualized management and further elevate yield. Thus, AI and CV technologies appear to be a natural fit for CEAs.
Most of the recent development of AI can be attributed to the newly discovered capability to train deep neural networks [175] that can (1) automatically learn multi-level representations of input data that are transferable to diverse downstream tasks [65, 137], (2) easily scale up to match the growing size of data [291], and (3) conveniently utilize massively parallel hardware architectures like GPUs [114, 337]. As function approximators, deep neural networks prove to be surprisingly effective in generalizing to previously unseen data [363]. Deep learning has achieved tremendous success in computer vision [302], natural language processing [47, 83, 118], multimedia [23, 88], robotics [300], game playing [278], and many other areas.
The AI revolution in agriculture is already underway. State-of-the-art neural network technologies, such as ResNet [134] and MobileNet [139] for image recognition, and Faster R-CNN [244], Mask R-CNN [133], and YOLO [239] for object detection, have been applied to the management of crops [197], livestock [142, 308], and plants in indoor and vertical farms [245, 366]. AI has been used to provide decision support in a myriad of tasks, from DNA analysis [197] and growth monitoring [245, 366] to disease detection [262] and profit prediction [28].
While several surveys have explored the use of computer vision (CV) techniques in agriculture, none of them specifically focuses on CEA applications. Some surveys summarize studies based on aspects of practical applications in agriculture. [74, 89, 123, 149, 286] survey pest and disease detection studies. [40, 111, 312] discuss fruit and vegetable quality grading and disease detection. [307] summarizes studies in six sub-fields, including crop growth monitoring, pest and disease detection, automatic harvesting/fruit detection, fruit quality testing, automated management of modern farms, and the monitoring of farmland information with Unmanned Aerial Vehicles (UAVs). Other surveys organize existing works from a technical perspective, namely the algorithms used [241] or the formats of data [56]. [154], as an exception, introduces the development history of CV and AI in smart agriculture without investigating any individual studies. Our work aims to address this gap and provide insights tailored to CEA-specific contexts.
As the volume of research in smart agriculture grows rapidly, we hope the current review article can bridge researchers from AI and agriculture and offer a gentle learning curve when they wish to familiarize themselves with the other area. We believe computer vision has the closest connections with, and is the most immediately applicable in, urban agriculture and CEAs. Hence, in this paper, we focus on reviewing deep-learning-based computer vision technologies in urban farming and CEAs. We focus on deep learning because it is the predominant approach in AI and CV research. The contributions of this paper are two-fold, with the former targeted at AI researchers and the latter targeted at agriculture researchers:
We identify ve major CV applications in CEA and analyze their requirements and motivation. Further, we
survey the state of the art as reected in 68 technical papers and 14 vision-based CEA datasets.
We discuss ve key subareas of computer vision and how they relate to CEA. In addition, we identify four
potential future directions for research in CV for CEA.
In gure 1we provide an graphical preview of our content. It illustrates the end-to-end agriculture process of CEAs,
from seed planting to harvest and sales, with ve major deep learning based CV applications–Growth Monitoring, Fruit
and Flower Detection, Fruit Counting, Maturity Level Classication and Pest and Disease Detection–mapped to the
corresponding applicable plant growth stages. We do not survey the autonomous seed planting and harvesting step as
they are more relevant to robot functioning and robotic control, i.e grasping, carrying and placing of objects rather
than computer vision (we do include the localization of fruit in the fruit and ower detection section that facilitate
harvesting robot to locate the targeted object and perform action). However, we provide here some literature related to
agriculture robot and end-eector design for reference [36,57,92,235,362]
We structure the survey following the process in the figure. First, to provide a bird's-eye view of CV capabilities available to researchers in smart agriculture, we summarize several major CV problems and influential technical solutions in §2. Next, we review 68 papers with respect to the application of computer vision in the CEA system in §3. The discussion is organized into five subsections: Growth Monitoring, Fruit and Flower Detection, Fruit Counting, Maturity Level Classification, and Pest and Disease Detection. In the discussion, we focus on fruits and vegetables that are suitable for CEA, including tomato [10, 13, 127, 360], mango [7], guava [277, 342], strawberry [107, 355], capsicum [177], banana [5], lettuce [368], cucumber [10, 128, 203], citrus [4], and blueberry [2]. Next, we provide a summary of fourteen publicly available datasets of plants and fruits in §4 to facilitate future studies in controlled-environment agriculture. Finally, we highlight a few research directions that could generate high-impact research in the near future in §5.
Fig. 1. An illustration of the end-to-end agriculture process of CEAs, from seed planting to harvest and sales, with five major deep-learning-based CV applications in agriculture (Growth Monitoring, Fruit and Flower Detection, Fruit Counting, Maturity Level Classification, and Pest and Disease Detection) mapped to the corresponding applicable plant growth stages. Autonomous Seed Sowing and Autonomous Harvest and Sales, shown in gray boxes, are relevant steps in the agriculture process of CEAs but are out of the scope of our survey, which focuses on CV in CEAs. Orange lines represent arrows originating from pest and disease detection. Green lines represent arrows with stage 4 as destination.
One thing to note here is that, except for the Leaf Instance Segmentation task under the Growth Monitoring section, all the tasks are performed with models trained on different datasets and evaluated with different metrics. Tables 3, 4, 5, and 6 showcase the variety in datasets and evaluation metrics. This variation makes performance incomparable between studies. Such a phenomenon further indicates the necessity of our survey, which summarizes the current progress in the literature and encourages the development of general benchmarks to promote consistency and comparability in future research.
2 COMPUTER VISION CAPABILITIES RELEVANT TO SMART AGRICULTURE
2.1 Image Recognition
The classic problem of image recognition is to classify an image containing a single object into the corresponding object class. The success of deep convolutional networks in this area dates back at least to LeNet [176] of 1998, which recognizes hand-written digits. The fundamental building block of such networks is the convolution operation. Using the principles of local connections and weight sharing, convolutional networks benefit from an inductive bias of translational invariance. That is, a convolutional network applies (approximately) the same operation to all pixel locations of the image.
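The two principles named above, local connections and weight sharing, can be illustrated with a minimal NumPy sketch. The function name `conv2d_valid`, the 10x10 canvas, and the random 3x3 kernel are all illustrative assumptions, not taken from any surveyed work. The sketch checks that shifting the input simply shifts the response, because the same kernel is applied at every location.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide one shared kernel over every location of a 2-D image ("valid" mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Local connection: each output value sees only a kh x kw window.
            # Weight sharing: the same kernel is reused at every (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))  # hypothetical filter weights

# A small bright blob on an empty canvas, and the same blob moved by (2, 3) pixels.
image = np.zeros((10, 10))
image[2:4, 1:3] = 1.0
shifted = np.roll(image, shift=(2, 3), axis=(0, 1))

response = conv2d_valid(image, kernel)
response_shifted = conv2d_valid(shifted, kernel)

# The response to the shifted image equals the original response moved by (2, 3).
same_pattern = bool(np.allclose(response_shifted[2:, 3:], response[:-2, :-3]))
print(same_pattern)
```

Real networks add padding, strides, many channels, and pooling, which is why the text says the same operation is applied only approximately at every location.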
The victory of AlexNet [166] in the 2012 ImageNet Large Scale Visual Recognition Challenge [253] is often considered a landmark event that introduced deep neural networks into the AI mainstream. Subsequently, many variants of convolutional networks [151, 173, 279, 297] have been proposed. Due to space limits, here we provide a brief review of a few influential works, which is by no means exhaustive. ResNet [135] introduces residual connections that allow the training of networks of more than 100 layers. ResNeXt [345] and MobileNet [140] employ grouped convolution, which reduces interaction between channels and improves the efficiency of the network parameters. ShuffleNet [374] utilizes the shuffling of channels, which complements group convolution. EfficientNet [301] shows that simultaneous scaling of the network width, depth, and image resolution is key to efficient use of parameters.
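To make the parameter savings of grouped convolution and the role of channel shuffling concrete, here is a small sketch under stated assumptions. The helper names `conv_weight_count` and `channel_shuffle` are hypothetical, as are the channel counts (64), kernel size (3), and group counts. The shuffle follows the common reshape-transpose-reshape formulation; it is not code from the cited papers.

```python
import numpy as np

def conv_weight_count(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Number of weights (no bias) in a k x k convolution split into `groups` groups."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return groups * (c_in // groups) * (c_out // groups) * k * k

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave the channels of a (C, H, W) array so the next grouped layer mixes groups."""
    c, h, w = x.shape
    assert c % groups == 0
    x = x.reshape(groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 0, 2, 3)               # swap group and within-group axes
    return x.reshape(c, h, w)                 # flatten back to C channels

dense = conv_weight_count(64, 64, 3)                 # ordinary convolution
grouped = conv_weight_count(64, 64, 3, groups=4)     # four independent groups
depthwise = conv_weight_count(64, 64, 3, groups=64)  # one group per channel

# Tag each of 6 channels with its index to see where the shuffle sends it.
x = np.arange(6, dtype=float).reshape(6, 1, 1) * np.ones((6, 2, 2))
order = channel_shuffle(x, groups=2)[:, 0, 0].astype(int).tolist()
print(dense, grouped, depthwise, order)
```

With g groups, the weight count falls by a factor of g, at the cost of channels in different groups never interacting. The shuffle reorders channels between layers so that information can cross group boundaries.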
Recently, the transformer model has proven to be a highly competitive architecture for image recognition and other computer vision tasks [90]. These models cut the input image into a sequence of small image patches and often apply strong regularization such as RandAugment [75]. Variants such as CaiT [311], CeiT [359], Swin Transformer [198], and others [72, 78, 346, 380] achieve outstanding performance on ImageNet.
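The step of cutting an image into a sequence of patches can be sketched as follows. The function name `image_to_patches`, the 224x224 RGB input, and the 16-pixel patch size are assumptions chosen for illustration. The learned linear projection and position embeddings that a transformer applies afterwards are omitted.

```python
import numpy as np

def image_to_patches(image: np.ndarray, patch: int) -> np.ndarray:
    """Cut an (H, W, C) image into non-overlapping patch x patch tiles, in row-major order."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "sketch assumes the image divides evenly"
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)        # (rows, cols, patch, patch, C)
    return tiles.reshape(-1, patch * patch * c)   # one flattened vector per patch

# Hypothetical input: a 224x224 RGB image filled with running indices.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = image_to_patches(image, patch=16)
print(tokens.shape)  # (196, 768)
```

Each row of `tokens` plays the role that a word embedding plays in a language model: the transformer sees 196 tokens rather than a 2-D pixel grid.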
Despite the maturity of the technology for image classification, the assumption that an image contains only one object may not be easily satisfied in real-world scenarios. Thus, it is often necessary to formulate the problem as object detection or semantic / instance segmentation instead.
2.2 Object Detection
The object detection task is to identify and locate all objects in the image. It can be understood as the task that results from relaxing the assumption that the input image contains a single object. This is one natural problem formulation for real-world images and has seen wide adoption in agricultural applications.
In broad strokes, contemporary object detection methods can be categorized into anchor-box-based and point-based / proposal-free approaches. In anchor-box methods [110, 243], the process starts with a number of predefined anchor boxes that are periodically tiled to cover the entire input image. For each anchor box, the network makes two types of predictions. First, it determines if the anchor box contains one of the predefined object classes. Second, if the box contains an object, the network attempts to move and reshape the box to become closer to the ground-truth location of the object. One-stage anchor-box detectors [77, 101, 190, 196, 240, 376] make these predictions all at once. In comparison, two-stage detectors [110, 132, 189, 243] discard anchor boxes that do not contain any object in the first stage and classify the remaining boxes into finer object categories in the second stage. The location adjustment, known as bounding box regression, can happen in both stages. It is also possible to employ more than two stages [48]. When the objects have
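The anchor-box procedure described in this paragraph (tile predefined boxes over the image, decide which ones cover an object, then move and reshape them toward the ground truth) can be sketched in NumPy. Everything concrete here is an assumption for illustration: the 64x64 image, the stride, the two anchor sizes, the example ground-truth box, and the center-offset / log-scale parameterization of the regression target, which is a common choice in R-CNN-style detectors rather than something specified in the text above.

```python
import numpy as np

def tile_anchors(img_h, img_w, stride, sizes):
    """Place square anchors of each size at every stride-th pixel; boxes are (x1, y1, x2, y2)."""
    ys = np.arange(stride // 2, img_h, stride)
    xs = np.arange(stride // 2, img_w, stride)
    cx, cy = np.meshgrid(xs, ys)
    per_size = []
    for s in sizes:
        half = s / 2.0
        boxes = np.stack([cx - half, cy - half, cx + half, cy + half], axis=-1)
        per_size.append(boxes.reshape(-1, 4))
    return np.concatenate(per_size, axis=0)

def iou(boxes, gt):
    """Intersection-over-union of each box with a single ground-truth box."""
    x1 = np.maximum(boxes[:, 0], gt[0])
    y1 = np.maximum(boxes[:, 1], gt[1])
    x2 = np.minimum(boxes[:, 2], gt[2])
    y2 = np.minimum(boxes[:, 3], gt[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area + gt_area - inter)

def encode(anchor, gt):
    """Regression target (dx, dy, dw, dh) that moves and reshapes an anchor onto gt."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    ax, ay = anchor[0] + aw / 2, anchor[1] + ah / 2
    gx, gy = gt[0] + gw / 2, gt[1] + gh / 2
    return np.array([(gx - ax) / aw, (gy - ay) / ah, np.log(gw / aw), np.log(gh / ah)])

def decode(anchor, delta):
    """Apply a predicted (dx, dy, dw, dh) to an anchor to obtain the refined box."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + aw / 2, anchor[1] + ah / 2
    cx, cy = ax + delta[0] * aw, ay + delta[1] * ah
    w, h = aw * np.exp(delta[2]), ah * np.exp(delta[3])
    return np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

anchors = tile_anchors(64, 64, stride=16, sizes=(16, 32))  # 4 x 4 locations x 2 sizes
gt = np.array([12.0, 10.0, 40.0, 36.0])                    # hypothetical fruit bounding box
overlaps = iou(anchors, gt)

best = int(np.argmax(overlaps))           # anchor that would be labeled "contains object"
target = encode(anchors[best], gt)        # what the regression head is trained to output
refined = decode(anchors[best], target)   # applying the target recovers the ground truth
print(anchors.shape, np.allclose(refined, gt))
```

In a trained detector, `target` is predicted by the network rather than computed from the ground truth. A one-stage detector predicts the class and the offsets for every anchor in a single pass, whereas a two-stage detector first filters out the anchors judged to contain no object.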