Recognition of Rice Flooding Stress and Regulation Effects Using Unmanned Aerial Vehicle Multispectral Images and an Enhanced VGG Model

In this study, we address the need for rapid, accurate, and non-destructive detection of rice flooding stress and of the effectiveness of regulatory measures. Rice production is vital for global food security but is increasingly threatened by extreme weather events such as floods, which can cause significant yield losses. Traditional methods for monitoring waterlogging stress are labor-intensive, subjective, and inefficient, making them unsuitable for modern precision agriculture. To overcome these limitations, we combine unmanned aerial vehicle (UAV) technology, specifically a DJI Phantom 4 Multispectral platform, with deep learning. Multispectral imagery captured by the UAV is used to detect subtle changes in rice plants under flooding stress and after the application of anti-stress regulatory measures. By integrating an improved VGG model, we autonomously extract both spectral and spatial features from the images, enabling high-accuracy classification. This research provides a novel methodology for disaster management and highlights the potential of UAV-based remote sensing in agricultural applications.

The experiment was conducted during the 2024 rice growing season in a controlled field setting. We selected two widely cultivated regional rice varieties, Ningxiangjing 9 and Nanjing 5718. Rice plants were subjected to flooding stress at the booting and flowering stages, with water maintained at half the plant height for 0, 7, or 14 days. Regulatory measures included the application of anti-stress agents and foliar fertilizers at specific intervals. A total of 10 treatments were implemented, each with three replicates, to ensure statistical robustness. The UAV enabled consistent, high-resolution data collection under optimal weather conditions. The treatments are summarized in the table below, which lists the variety, flooding duration, water depth, regulatory measures, and treatment codes.

| Variety | Flooding Duration | Water Depth | Regulatory Measures | Code |
|---|---|---|---|---|
| Ningxiangjing 9 and Nanjing 5718 | 0 d | Normal | Control | t1 |
| Ningxiangjing 9 and Nanjing 5718 | 7 d | 1/2 plant height | Anti-stress agent + foliar fertilizer | t2 |
| Ningxiangjing 9 and Nanjing 5718 | 7 d | 1/2 plant height | No regulation | t3 |
| Ningxiangjing 9 and Nanjing 5718 | 14 d | 1/2 plant height | Anti-stress agent + foliar fertilizer | t4 |
| Ningxiangjing 9 and Nanjing 5718 | 14 d | 1/2 plant height | No regulation | t5 |
| Ningxiangjing 9 and Nanjing 5718 | 0 d | Normal | Control | t6 |
| Ningxiangjing 9 and Nanjing 5718 | 7 d | 1/2 plant height | Anti-stress agent | t7 |
| Ningxiangjing 9 and Nanjing 5718 | 7 d | 1/2 plant height | No regulation | t8 |
| Ningxiangjing 9 and Nanjing 5718 | 14 d | 1/2 plant height | Anti-stress agent | t9 |
| Ningxiangjing 9 and Nanjing 5718 | 14 d | 1/2 plant height | No regulation | t10 |

Multispectral image acquisition was performed using a DJI Phantom 4 Multispectral unmanned aerial vehicle, equipped with six sensors: one RGB sensor and five narrowband sensors covering blue (450 ± 16 nm), green (560 ± 16 nm), red (650 ± 16 nm), red edge (730 ± 16 nm), and near-infrared (840 ± 26 nm). The platform's onboard D-RTK module provides centimeter-level positioning, and its multispectral light-intensity sensor supports radiometric consistency, ensuring high-quality data collection. Flights were conducted at 2-3 m above the rice canopy in clear weather between 11:00 and 13:00 local time. A gray reference panel was used for radiometric calibration. The raw images underwent preprocessing, including image registration, radiometric correction, and geometric correction, using Python scripts. Background elements such as soil and shadows were removed by thresholding the ExVI vegetation index, defined as:

$$ \text{ExVI} = \text{NIR} - \text{Blue} + \text{Green} - \text{Red} $$

where NIR, Blue, Green, and Red represent the reflectance values in the near-infrared, blue, green, and red bands, respectively. This index effectively segregates plant material from non-plant areas, as illustrated in the processed images. After preprocessing, the multispectral images were categorized into three classes: control (ck), flooding stress (fs), and regulated treatment (fsrt). Each image was segmented into patches of 48×48 pixels to facilitate deep learning model input. Data augmentation techniques, such as rotation and flipping, were applied to expand the dataset and prevent overfitting, resulting in a total of 6139 image patches. The dataset composition is detailed in the following table.

| Class | Code | Image Size | Number of Images |
|---|---|---|---|
| Control | ck | 48×48×5 | 1852 |
| Flooding stress | fs | 48×48×5 | 1635 |
| Regulated treatment | fsrt | 48×48×5 | 2652 |
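The ExVI-based background removal and patch extraction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the band ordering, the ExVI > 0 threshold, and the minimum vegetation fraction per patch are assumptions, since the paper only defines the index itself.

```python
# Sketch of ExVI background masking and 48x48 patch extraction.
# Assumptions (not from the paper): band order (blue, green, red,
# red-edge, nir), mask threshold ExVI > 0, and a 50% vegetation
# coverage requirement per patch.
import numpy as np

def exvi_mask(img):
    """img: H x W x 5 reflectance array; returns a boolean vegetation mask."""
    blue, green, red, nir = img[..., 0], img[..., 1], img[..., 2], img[..., 4]
    exvi = nir - blue + green - red          # ExVI = NIR - Blue + Green - Red
    return exvi > 0

def extract_patches(img, mask, size=48, min_veg=0.5):
    """Yield size x size x 5 patches whose vegetation fraction exceeds min_veg."""
    h, w = mask.shape
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            if mask[r:r + size, c:c + size].mean() >= min_veg:
                yield img[r:r + size, c:c + size]

# Toy scene: left half "vegetation" (high NIR), right half "soil" (high red).
img = np.zeros((96, 96, 5))
img[:, :48, 4] = 0.6
img[:, 48:, 2] = 0.6
patches = list(extract_patches(img, exvi_mask(img)))
print(len(patches))  # only the vegetated patches survive the mask
```

In practice the retained patches would then be augmented (rotations and flips, as described above) before being fed to the network.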

For model development, we built upon the VGG architecture, known for its effectiveness in image classification. The standard VGG16 model comprises 13 convolutional layers, 3 fully connected layers, 5 pooling layers, and a dropout layer, using 3×3 convolution kernels and 2×2 pooling kernels to deepen the network while limiting the parameter count. Our enhanced VGG27 model increases the depth to 24 convolutional layers organized into 4 blocks and reduces the fully connected layer width from 4096 to 1024 units to suit multispectral data. The model takes a 48×48×5 multispectral image patch as input and outputs a class prediction through a Softmax layer. Its structure combines convolutional blocks for feature extraction, max-pooling layers for dimensionality reduction, and Leaky ReLU activations to mitigate vanishing gradients. The overall architecture and parameters are illustrated in the model diagram, emphasizing the integration of spectral and spatial features. We compared VGG27 against ResNet50, which uses residual modules and bottleneck layers to address gradient vanishing. The training environment used Python 3.8, PyTorch 2.0.0, and an NVIDIA RTX 3090 GPU, with the AdamW optimizer, a cosine annealing learning-rate scheduler, a label-smoothed cross-entropy loss, and a batch size of 128.
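To make the layer accounting concrete, the sketch below walks a 48×48×5 input through a VGG27-style layout (24 conv layers in 4 blocks, 2×2 pooling after each block, 1024-unit fully connected layers) and counts parameters. The per-block channel widths (64/128/256/512) and six convolutions per block are illustrative assumptions; the paper does not publish the exact configuration.

```python
# Back-of-the-envelope walkthrough of a VGG27-style layout for 48x48x5
# patches. Channel widths and convs-per-block are assumed, not from
# the paper; only the totals (24 conv layers, 3 FC layers, 1024 units)
# follow the text.

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return c_out * (c_in * k * k + 1)

def vgg27_sketch(in_ch=5, n_classes=3, widths=(64, 128, 256, 512),
                 convs_per_block=6):
    size, c_in, total = 48, in_ch, 0
    for w in widths:                      # 4 blocks x 6 convs = 24 conv layers
        for _ in range(convs_per_block):
            total += conv_params(c_in, w)
            c_in = w
        size //= 2                        # 2x2 max pooling halves each side
    flat = size * size * c_in             # flatten before the FC head
    for d_out in (1024, 1024, n_classes): # three fully connected layers
        total += flat * d_out + d_out
        flat = d_out
    return size, total

final_size, n_params = vgg27_sketch()
print(final_size, n_params)  # 48 -> 24 -> 12 -> 6 -> 3 after four poolings
```

The same arithmetic explains why the FC width was cut to 1024: with a 3×3×512 flattened feature map, a 4096-unit layer alone would add roughly four times the parameters of the 1024-unit version.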

To evaluate model performance, we employed standard metrics: Accuracy, Precision, and Recall. These are defined mathematically as follows:

$$ \text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}} $$

$$ \text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} $$

$$ \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}} $$
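These three metrics can be computed directly from the confusion counts, as in the following minimal binary-classification example (the labels and predictions are synthetic, for illustration only):

```python
# Accuracy, Precision, and Recall from TP/TN/FP/FN counts, matching the
# formulas above. Binary task with positive class = 1; toy data only.
def confusion_counts(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Eight samples, six classified correctly (one FP, one FN).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)
```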

where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives. The dataset was split into training, validation, and test sets in a 7:1.5:1.5 ratio using stratified sampling before augmentation, ensuring balanced class representation. The results from model comparisons are presented in the table below, which highlights the superiority of the VGG models over ResNet50. The VGG27 model achieved the highest accuracy, precision, and recall on both validation and test sets, demonstrating its robustness in distinguishing between flooding stress and regulated rice plants. As the model depth increased from VGG19 to VGG27, performance improved marginally, indicating diminishing returns beyond a certain complexity. Therefore, we selected VGG27 as the primary model for further analysis.

| Model | Dataset | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| ResNet50 | Validation | 87.00 | 86.20 | 87.37 |
| ResNet50 | Test | 82.02 | 81.25 | 82.02 |
| VGG27 | Validation | 95.99 | 96.08 | 95.81 |
| VGG27 | Test | 92.39 | 92.61 | 92.15 |
| VGG23 | Validation | 95.85 | 95.56 | 96.01 |
| VGG23 | Test | 91.84 | 91.79 | 91.65 |
| VGG19 | Validation | 94.88 | 94.60 | 95.00 |
| VGG19 | Test | 91.15 | 90.89 | 91.08 |
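The 7:1.5:1.5 stratified split described above can be sketched per class, so each split preserves the class proportions. The fixed seed and rounding scheme are assumptions; the paper does not specify its shuffling procedure.

```python
# Per-class 70/15/15 stratified split into train/validation/test index
# lists. The seed and round-based allocation are illustrative choices,
# not taken from the paper.
import random

def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=42):
    rng = random.Random(seed)
    splits = ([], [], [])
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    for indices in by_class.values():     # split each class independently
        rng.shuffle(indices)
        n = len(indices)
        n_train = round(n * ratios[0])
        n_val = round(n * ratios[1])
        splits[0].extend(indices[:n_train])
        splits[1].extend(indices[n_train:n_train + n_val])
        splits[2].extend(indices[n_train + n_val:])
    return splits

# Toy labels mimicking the three classes (ck / fs / fsrt).
labels = ["ck"] * 40 + ["fs"] * 40 + ["fsrt"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 70 15 15
```

Splitting before augmentation, as the text notes, prevents augmented copies of the same patch from leaking across splits.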

In a comprehensive evaluation, we trained three binary VGG27 models: flooding stress versus regulated plants, control versus regulated plants, and control versus flooding stress. Their performance metrics are summarized in the following table. The control-versus-regulated model achieved the highest accuracy, exceeding 95% on both validation and test sets, indicating that regulatory measures induce distinct multispectral signatures. The stress-versus-regulated model also performed well, with accuracy above 92% on both sets, though slightly below the control-regulated pair. In contrast, the control-versus-stress model reached only 67.65% test accuracy (67.93% on validation), reflecting the subtle spectral differences between these classes owing to rice's inherent tolerance to short-term flooding. This analysis underscores the measurable impact of anti-stress regulation on rice physiology as captured by UAV multispectral imagery, as well as the difficulty of detecting mild stress levels. The UAV's high-resolution multispectral sensors were crucial for capturing these nuances, enabling the deep learning models to learn discriminative features.

| Model | Dataset | Class | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|
| Stress vs. Regulation | Validation | Stress | 95.99 | 96.83 | 94.14 |
| Stress vs. Regulation | Validation | Regulation | 95.99 | 95.34 | 97.49 |
| Stress vs. Regulation | Test | Stress | 92.39 | 94.29 | 88.92 |
| Stress vs. Regulation | Test | Regulation | 92.39 | 90.93 | 95.37 |
| Control vs. Regulation | Validation | Control | 97.14 | 96.97 | 97.22 |
| Control vs. Regulation | Validation | Regulation | 97.14 | 97.30 | 97.07 |
| Control vs. Regulation | Test | Control | 95.90 | 94.95 | 96.66 |
| Control vs. Regulation | Test | Regulation | 95.90 | 96.81 | 95.18 |
| Control vs. Stress | Validation | Control | 67.93 | 73.23 | 70.39 |
| Control vs. Stress | Validation | Stress | 67.93 | 61.27 | 64.55 |
| Control vs. Stress | Test | Control | 67.65 | 70.96 | 70.97 |
| Control vs. Stress | Test | Stress | 67.65 | 63.49 | 63.50 |

(Accuracy is computed per model and dataset, so it is repeated across the two class rows.)

In conclusion, this study demonstrates the efficacy of combining UAV multispectral imagery with an enhanced VGG27 deep learning model for detecting rice flooding stress and evaluating the effects of regulatory measures. VGG27 outperformed ResNet50 and the shallower VGG variants, achieving high accuracy, precision, and recall. UAV-based data acquisition captured detailed spectral information that the model leveraged to distinguish regulated from stressed plants. Although the separation between control and flooding-stressed plants was less pronounced, likely due to rice's resilience to short-term flooding, the model still provided valuable insights for precision agriculture. Future work will expand the dataset to more severe stress levels and diverse environmental conditions, further optimize the model, and explore its applicability to other crops and regions. This approach holds promise for advancing UAV-based monitoring in smart agriculture, contributing to more effective disaster management and food security initiatives.
