In my study, I developed a novel, non-destructive method for rapidly identifying waterlogging stress in wheat and evaluating the efficacy of different regulation measures. The method combines high-resolution multispectral drone imagery with a deep learning model I designed for the task, termed VGG21. It addresses a need in precision agriculture for objective, timely assessment of crop stress and of the effectiveness of mitigation strategies.
The foundation of my research was a controlled field experiment conducted during the 2025 wheat growing season. I established plots for two popular wheat varieties and applied four treatments: a control group (CK), waterlogging stress, waterlogging stress alleviated by silicon fertilizer regulation, and waterlogging stress alleviated by amino acid regulation. The treatments were applied during two critical growth stages, the jointing-booting stage and the flowering-grain filling stage, with waterlogging durations of 0, 10, and 15 days. The primary goal was to determine whether drone-based multispectral imagery could distinguish between these physiologically different states.

For data acquisition, I used the DJI P4 Multispectral drone, which carries a six-sensor array: one RGB sensor and five monochrome sensors covering the Blue (B), Green (G), Red (R), Red Edge (RE), and Near-Infrared (NIR) bands. I flew the drone at a low altitude of 2 to 3 m above the wheat canopy, under clear skies around solar noon, to collect high-resolution images. The raw multispectral images were preprocessed with radiometric calibration (using a grey reference panel) and geometric correction. To remove non-plant material such as soil and shadows, I applied a mask based on the Enhanced Vegetation Index (EVI), which is calculated using the formula:
$$
\text{EVI} = 2.5 \times \frac{\text{NIR} - \text{Red}}{\text{NIR} + 6 \times \text{Red} - 7.5 \times \text{Blue} + 1}
$$
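
The masking step can be sketched in a few lines of NumPy. This is only an illustration, not the pipeline used in the study: it assumes the standard EVI definition (gain 2.5, coefficients 6 and 7.5, soil adjustment 1) applied to reflectance values scaled to [0, 1], and the function names and the threshold of 0.2 are hypothetical choices rather than values from the experiment.

```python
import numpy as np

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Standard EVI from reflectance arrays scaled to [0, 1]."""
    nir, red, blue = (np.asarray(b, dtype=np.float64) for b in (nir, red, blue))
    denom = nir + c1 * red - c2 * blue + soil
    # Mark degenerate pixels (denominator ~ 0) as NaN instead of dividing by zero.
    safe = np.where(np.abs(denom) < 1e-9, np.nan, denom)
    return gain * (nir - red) / safe

def vegetation_mask(nir, red, blue, threshold=0.2):
    """True where a pixel is treated as vegetation (threshold is hypothetical)."""
    index = evi(nir, red, blue)
    # NaN compares as False, so degenerate pixels are masked out as well.
    return index > threshold

# Toy 1x2 scene with made-up reflectances: a leaf-like and a soil-like pixel.
nir = np.array([[0.50, 0.25]])
red = np.array([[0.05, 0.20]])
blue = np.array([[0.03, 0.10]])
mask = vegetation_mask(nir, red, blue)
print(mask)  # leaf-like pixel kept, soil-like pixel removed
```

In practice the threshold would have to be tuned on the calibrated imagery, for example by inspecting the EVI histogram of plots with visible soil.
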
After preprocessing, the images were segmented into 48×48 pixel patches. I compiled a comprehensive dataset comprising four classes: CK, waterlogging stress (ZSXP), silicon fertilizer regulation (TKGG), and amino acid regulation (TKAJS). The class distribution of my dataset is detailed in Table 1. I used a stratified random sampling strategy to divide the dataset into training, validation, and testing sets in a 70:15:15 ratio for each class.
| Class | Code | Image Size | Number of Samples |
|---|---|---|---|
| Control | CK | 48×48×5 | 3,169 |
| Silicon Fertilizer Regulation | TKGG | 48×48×5 | 4,605 |
| Amino Acid Regulation | TKAJS | 48×48×5 | 4,870 |
| Waterlogging Stress | ZSXP | 48×48×5 | 2,841 |
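
The patch extraction and the 70:15:15 stratified split described above could look roughly like the following. It is a sketch under assumptions: patches are taken as non-overlapping tiles (the text does not say whether tiles overlap), the helper names and the random seed are hypothetical, and only the class sizes come from Table 1.

```python
import random
import numpy as np

def tile_patches(image, size=48):
    """Cut an (H, W, C) array into non-overlapping size x size patches."""
    h, w, _ = image.shape
    return [
        image[r:r + size, c:c + size]
        for r in range(0, h - size + 1, size)
        for c in range(0, w - size + 1, size)
    ]

def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=0):
    """Return train/val/test index lists, splitting each class by `ratios`."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(ratios[0] * len(indices))
        n_val = int(ratios[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Class sizes taken from Table 1; the labels stand in for the real patches.
counts = {"CK": 3169, "TKGG": 4605, "TKAJS": 4870, "ZSXP": 2841}
labels = [code for code, n in counts.items() for _ in range(n)]
train, val, test = stratified_split(labels)

patches = tile_patches(np.zeros((100, 150, 5)))  # dummy 5-band image
print(len(patches), patches[0].shape)
print(len(train), len(val), len(test))
```

Splitting within each class keeps the class proportions of Table 1 roughly the same in all three subsets, which matters here because the classes are unevenly sized.
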
For the modeling phase, I modified the well-known VGG (Visual Geometry Group) architecture, specifically VGG19. VGG19 is effective, but its standard architecture has five pooling layers and is designed for larger input images: five successive 2×2 poolings would shrink a 48×48 patch to 1×1. Because my input patches were 48×48, I reconfigured the network into VGG21, which has four convolutional blocks instead of five. The blocks contain 3, 3, 6, and 6 convolutional layers, respectively, for a total of 18 convolutional layers; together with the three fully connected layers, this gives 21 weight layers. I replaced the standard ReLU activation function with Leaky ReLU to improve gradient flow. The architecture concludes with three fully connected layers, interspersed with two dropout layers (with a probability of 0.5) to prevent overfitting, and a final Softmax output layer. The specific structure of VGG21 is shown in Table 2.
| Layer Block | Configuration | Output Size |
|---|---|---|
| Input | 48×48×5 | 48×48×5 |
| Conv Block 1 | 3×3 Conv (64), Leaky ReLU (×3), MaxPool 2×2 | 24×24×64 |
| Conv Block 2 | 3×3 Conv (128), Leaky ReLU (×3), MaxPool 2×2 | 12×12×128 |
| Conv Block 3 | 3×3 Conv (256), Leaky ReLU (×6), MaxPool 2×2 | 6×6×256 |
| Conv Block 4 | 3×3 Conv (512), Leaky ReLU (×6), MaxPool 2×2 | 3×3×512 |
| FC Layer | Fully Connected 1024, Leaky ReLU | 1024 |
| Dropout Layer | Dropout, P=0.5 | 1024 |
| FC Layer | Fully Connected 1024, Leaky ReLU | 1024 |
| Dropout Layer | Dropout, P=0.5 | 1024 |
| FC Layer (Output) | Fully Connected, Softmax (Number of Classes) | Number of Classes |
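
The output sizes in Table 2 follow from simple arithmetic, which the short sketch below reproduces. It assumes, as the table implies but the text does not state, that every 3×3 convolution uses stride 1 with "same" padding, so only the 2×2 max-pooling layers change the spatial size. The function name is hypothetical.

```python
def trace_shapes(input_hw=48, block_channels=(64, 128, 256, 512)):
    """Feature-map shape after each conv block.

    3x3 'same' convolutions keep height and width; each 2x2 max-pool
    with stride 2 halves them (floor division).
    """
    shapes = []
    size = input_hw
    for channels in block_channels:
        size //= 2
        shapes.append((size, size, channels))
    return shapes

# Four blocks, as in Table 2.
vgg21 = trace_shapes()
# A VGG19-style stack with a fifth 512-channel block, for comparison.
five_blocks = trace_shapes(block_channels=(64, 128, 256, 512, 512))

# Number of features fed to the first fully connected layer.
flattened = vgg21[-1][0] * vgg21[-1][1] * vgg21[-1][2]
# 18 convolutional layers plus 3 fully connected layers.
weight_layers = sum((3, 3, 6, 6)) + 3

print(vgg21)            # [(24, 24, 64), (12, 12, 128), (6, 6, 256), (3, 3, 512)]
print(five_blocks[-1])  # (1, 1, 512)
print(flattened, weight_layers)  # 4608 21
```

With a fifth pooling stage the 48×48 patch collapses to a single spatial position, which is the motivation for stopping at four blocks.
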
To evaluate the performance of my VGG21 model, I compared it against other state-of-the-art deep learning models: ResNet50, SWIN-Transformer, VGG19, and VGG23 (a deeper variant). For this comparison, I trained and tested on a two-class dataset containing only ‘Silicon Fertilizer Regulation’ and ‘Waterlogging Stress’ samples. The results of this comparison are presented in Table 3. I used standard metrics including Accuracy, Precision, Recall, and F1-score, which are defined as:
$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$
$$
\text{Precision} = \frac{TP}{TP + FP}
$$
$$
\text{Recall} = \frac{TP}{TP + FN}
$$
$$
\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
$$
where TP, TN, FP, and FN represent True Positives, True Negatives, False Positives, and False Negatives, respectively.

| Model | Accuracy (%) | Class | Precision (%) | Recall (%) | F1-score (%) | Parameters (M) |
|---|---|---|---|---|---|---|
| ResNet50 | 74.42 | Silicon Regulation | 83.77 | 76.13 | 79.76 | 22.43 |
| | | Stress | 60.29 | 71.08 | 65.24 | |
| SWIN-Transformer | 83.53 | Silicon Regulation | 88.86 | 84.55 | 85.65 | 6.58 |
| | | Stress | 75.47 | 81.76 | 78.49 | |
| VGG19 | 90.14 | Silicon Regulation | 96.15 | 88.48 | 92.16 | 18.59 |
| | | Stress | 81.08 | 93.30 | 86.76 | |
| VGG21 | 91.04 | Silicon Regulation | 96.42 | 89.53 | 92.85 | 21.40 |
| | | Stress | 82.95 | 93.88 | 88.08 | |
| VGG23 | 90.07 | Silicon Regulation | 97.94 | 87.15 | 92.23 | 24.22 |
| | | Stress | 78.17 | 96.16 | 86.24 | |

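
The four metric definitions translate directly into code. The sketch below is a minimal illustration with hypothetical function names; the confusion counts in the first example are made up, while the second example recomputes the VGG21 F1-scores from the precision and recall values reported in Table 3.

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(p, r):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Made-up confusion counts for one class of a binary problem.
tp, tn, fp, fn = 90, 70, 10, 30
p, r = precision(tp, fp), recall(tp, fn)
print(accuracy(tp, tn, fp, fn), p, r, f1_score(p, r))

# Cross-check against the VGG21 rows of Table 3 (values in percent).
print(round(f1_score(96.42, 89.53), 2))  # Silicon Regulation: 92.85
print(round(f1_score(82.95, 93.88), 2))  # Stress: 88.08
```

Because F1 is fully determined by precision and recall, this kind of cross-check is a quick way to catch transcription errors in a results table.
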
The results in Table 3 show that my VGG21 model achieved the highest accuracy (91.04%), narrowly ahead of VGG19 (90.14%) and VGG23 (90.07%) and well ahead of SWIN-Transformer (83.53%) and ResNet50 (74.42%). It also outperformed the two models with more parameters, VGG23 and ResNet50, suggesting that my modifications to the VGG architecture were well suited to this classification task on drone-captured multispectral data. VGG21 offered a good balance between model complexity and discriminative power.

I then used VGG21 to create two separate two-class recognition models: one for Silicon Fertilizer Regulation vs. Waterlogging Stress, and another for Amino Acid Regulation vs. Waterlogging Stress. The performance metrics are shown in Table 4.

| Model | Accuracy (%) | Class | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| Stress & Silicon Regulation | 91.05 | Silicon Regulation | 96.42 | 89.53 | 92.85 |
| | | Stress | 82.95 | 93.88 | 88.08 |
| Stress & Amino Acid Regulation | 69.62 | Amino Acid Regulation | 85.35 | 70.59 | 77.27 |
| | | Stress | 45.53 | 66.97 | 54.21 |

There was a large performance gap between the two models: about 91% accuracy for silicon fertilizer regulation versus 69.62% for amino acid regulation. This suggests that the spectral and textural features of silicon-regulated wheat were much more distinct from those of stressed wheat than the features of amino acid-regulated wheat were. Silicon regulation appeared to cause a more pronounced and detectable physiological change in the plants.

As a further test of robustness, I used the “Silicon Fertilizer Regulation & Stress” model to classify the amino acid regulation dataset. The outcome is presented in Table 5.

| Model Applied To | Accuracy (%) | Class | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| Amino Acid Regulation Dataset | 73.44 | Amino Acid Regulation | 64.77 | 89.15 | 75.03 |
| | | Stress | 87.35 | 60.70 | 71.63 |

Interestingly, the model trained on silicon regulation data still performed reasonably well on the amino acid regulation dataset, reaching an accuracy of 73.44%. Recall for the amino acid regulation class was high (89.15%), so the model identified most regulated plants, although the lower precision (64.77%) shows that many stressed samples were also labeled as regulated. This suggests that silicon and amino acid regulation may share some spectral response patterns that drone imagery can detect.

Finally, I constructed a three-class VGG21 model to classify CK, Waterlogging Stress, and Silicon Fertilizer Regulation simultaneously. The performance of this multi-class model is summarized in Table 6.

| Model | Accuracy (%) | Class | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| CK, Stress & Silicon Regulation | 77.45 | Silicon Regulation | 95.13 | 88.71 | 91.80 |
| | | CK | 65.88 | 67.91 | 66.88 |
| | | Stress | 62.15 | 67.68 | 64.80 |

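
For a three-class model, per-class precision and recall are obtained by treating each class in turn as the positive class (one-vs-rest) in the confusion matrix. The sketch below illustrates this under stated assumptions: the function name is hypothetical and the confusion matrix is made up for illustration, since the study's actual confusion matrix is not reported here.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j.

    Returns overall accuracy and a (precision, recall, f1) tuple per class.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                         # rest of row k
        fp = sum(confusion[i][k] for i in range(n)) - tp    # rest of column k
        p = tp / (tp + fp) if (tp + fp) else 0.0
        r = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        report.append((p, r, f1))
    return correct / total, report

# Hypothetical counts, NOT taken from the study.
classes = ["CK", "Stress", "Silicon Regulation"]
toy = [
    [80, 15, 5],   # true CK
    [20, 70, 10],  # true Stress
    [2, 8, 90],    # true Silicon Regulation
]
overall, report = per_class_metrics(toy)
print(f"accuracy = {overall:.2%}")
for name, (p, r, f1) in zip(classes, report):
    print(f"{name}: precision={p:.2%} recall={r:.2%} f1={f1:.2%}")
```

The off-diagonal cells show directly which pairs of classes are confused with each other, which is the kind of information behind the CK versus Stress overlap discussed for Table 6.
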
The overall accuracy of the three-class model was 77.45%, lower than that of the binary silicon regulation model (about 91%). The main source of confusion was between the ‘CK’ and ‘Stress’ classes, whose F1-scores were only 66.88% and 64.80%. In contrast, precision, recall, and F1-score for the ‘Silicon Regulation’ class were high (95.13%, 88.71%, and 91.80%, respectively), indicating that plants under this regulation were spectrally distinct and easy to identify. The features of ‘CK’ and ‘Stress’ overlapped more, making these two classes harder for the model to separate.

To validate the modeling results, I analyzed the grain yield data from the experiment. The yield results (shown in a previously described study) confirmed a clear trend: waterlogging reduced yield, and the regulation measures, especially silicon fertilizer, significantly mitigated this loss. This agronomic evidence supports the findings of the drone-based recognition model: the superior performance of the silicon regulation model on the spectral data is consistent with silicon’s superior effect on yield.

The physiological changes induced by silicon (e.g., changes in chlorophyll content and cellular structure) were apparently more pronounced and therefore more easily captured by the drone’s multispectral sensor. The spectral signature of silicon-regulated plants set them apart from both stressed and control plants, leading to high classification accuracy. The overlap between the control and stress classes suggests that mild stress may not always produce a spectral signal distinct enough for reliable separation, a common challenge in precision agriculture.

This work shows the potential of integrating cost-effective drone technology with deep learning models for field-scale, non-destructive assessment of crop physiology. The VGG21 model, adapted specifically for multispectral data, proved to be an effective tool for this task. The study provides a pathway toward automated systems that evaluate the success of abiotic stress mitigation strategies, helping farmers make timely and informed decisions to protect yield. Future work will focus on expanding the dataset, refining the model to better differentiate control from stressed states, and establishing a direct quantitative link between the model’s output and final grain yield, thereby advancing drone-based precision phenotyping for sustainable agriculture.
