Cotton Chlorophyll Content Estimation by Integrating UAV Multispectral and Texture Features

The precise and non-destructive monitoring of crop physiological status is a cornerstone of modern precision agriculture. Among various biophysical parameters, Leaf Chlorophyll Content (LCC) serves as a critical indicator of photosynthetic capacity, nitrogen status, and overall plant health. In China, cotton is a crop of immense economic importance, with its cultivation facing challenges related to water and nutrient management, particularly in arid regions. Traditional methods for LCC determination are destructive, labor-intensive, and spatially limited, making them unsuitable for large-scale field monitoring. The advent of Unmanned Aerial Vehicle (UAV) technology offers a revolutionary platform for high-resolution, rapid, and repeated crop surveillance, and its application in agriculture is expanding rapidly as a powerful tool for data acquisition. Specifically, UAVs equipped with multispectral sensors capture canopy spectral reflectance, from which Vegetation Indices (VIs) are derived. These VIs are mathematical combinations of reflectance at specific wavelengths designed to enhance the signal related to vegetation properties like chlorophyll. However, spectral information alone may be insufficient under complex canopy structures or varying growth stages. Texture Features (TFs), extracted from the spatial patterns within the imagery, provide complementary information about canopy architecture, density, and heterogeneity. Therefore, the fusion of spectral VIs and spatial TFs from UAV imagery holds great promise for improving the robustness and accuracy of LCC estimation models.

This study focuses on developing a reliable method for estimating cotton LCC by integrating VIs and TFs derived from a UAV-based multispectral imaging system. Different irrigation and nitrogen fertilization levels were applied in a field experiment to induce a wide range of canopy conditions and chlorophyll levels. A UAV was deployed to collect high-resolution multispectral imagery during key cotton growth stages. A suite of VIs known to be sensitive to chlorophyll and a set of TFs were calculated. Advanced modeling approaches, including Random Forest Regression (RF), Residual Network (ResNet), and a 1D-Convolutional Neural Network (1D-CNN), were employed and compared to establish the relationship between the remote sensing features and the ground-measured LCC. The primary objectives were: (1) to analyze the correlation between LCC and individual VIs/TFs across growth stages; (2) to evaluate the performance of different machine learning and deep learning algorithms in modeling LCC using VIs, TFs, and their combination; and (3) to identify the optimal feature set and model for accurate cotton LCC estimation, demonstrating the efficacy of integrated UAV remote sensing for precision agriculture management.

Materials and Methods

Experimental Design and Field Data Collection

A field experiment was conducted to create a gradient in cotton canopy status. The experimental design consisted of a factorial combination of irrigation and nitrogen levels. Four nitrogen application rates (N0, N1, N2, N3) and three irrigation levels (W0.8, W1.0, W1.2) were implemented, resulting in 12 treatment plots. Cotton was sown with a specific planting pattern. A UAV platform was central to the remote data acquisition strategy.

Ground-truth LCC measurements were collected synchronously with each UAV flight. From each plot, leaf samples were taken from both the upper and lower canopy layers. The chlorophyll was extracted and its concentration determined spectrophotometrically from absorbance measurements at 664.2 nm and 648.6 nm. The total chlorophyll content (Ca+b, in mg/L) was calculated using the following equations:

$$C_a = 13.36A_{664.2} - 5.19A_{648.6}$$

$$C_b = 27.43A_{648.6} - 8.12A_{664.2}$$

$$C_{a+b} = C_a + C_b = 5.24A_{664.2} + 22.24A_{648.6}$$

where \(A_{664.2}\) and \(A_{648.6}\) are the absorbance values at the specified wavelengths.
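The equations above translate directly into code; a minimal sketch (function name hypothetical, not from the study's own software):

```python
def chlorophyll_mg_per_l(a664: float, a648: float) -> tuple:
    """Chlorophyll a, b, and total (mg/L) from absorbance at
    664.2 nm and 648.6 nm, per the equations above."""
    ca = 13.36 * a664 - 5.19 * a648   # chlorophyll a
    cb = 27.43 * a648 - 8.12 * a664   # chlorophyll b
    return ca, cb, ca + cb            # Ca, Cb, Ca+b
```

Note that the combined coefficients (5.24 and 22.24) follow from summing the Ca and Cb equations term by term.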

UAV-based Multispectral Data Acquisition and Processing

A DJI Matrice 350 RTK UAV was equipped with an MS600Pro multispectral camera. The camera captures reflectance in six spectral bands: Blue (450 nm), Green (555 nm), Red (660 nm), Red Edge 1 (720 nm), Red Edge 2 (750 nm), and Near-Infrared (NIR, 840 nm). Flights were conducted under clear-sky conditions at a constant altitude of 20 meters, with high forward and side overlap to ensure image quality. Radiometric calibration was performed using a calibrated reflectance panel prior to each flight.

The acquired multispectral images were processed using photogrammetry software (Pix4Dmapper) to generate orthomosaics and digital surface models for each spectral band. Further pre-processing, including atmospheric correction and region-of-interest (ROI) extraction for each experimental plot, was performed to obtain accurate spectral reflectance values.
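Plot-level reflectance extraction amounts to averaging the calibrated pixels inside each plot's ROI; a simplified sketch for a rectangular ROI (real plots would typically be delineated by georeferenced polygons in the photogrammetry software):

```python
import numpy as np

def plot_mean_reflectance(band: np.ndarray, rows: slice, cols: slice) -> float:
    """Mean surface reflectance over a rectangular plot ROI.
    NaN pixels (e.g. masked soil or shadow) are ignored."""
    return float(np.nanmean(band[rows, cols]))
```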

Feature Extraction: Vegetation Indices and Texture Features

Vegetation Indices (VIs): A set of vegetation indices, established in remote sensing literature for vegetation monitoring, was calculated from the surface reflectance values of the relevant bands. The selected VIs and their formulas are summarized in Table 1.

Table 1. Selected Vegetation Indices (VIs) and their formulas.

| Vegetation Index | Abbreviation | Formula | Reference |
|---|---|---|---|
| Normalized Difference Vegetation Index | NDVI | \((NIR - Red) / (NIR + Red)\) | Rouse et al. (1973) |
| Green Normalized Difference Index | GNDVI | \((NIR - Green) / (NIR + Green)\) | Gitelson & Merzlyak (1998) |
| Chlorophyll Vegetation Index | CVI | \(NIR \times Red / Green^2\) | Vincini et al. (2008) |
| Enhanced Vegetation Index | EVI | \(G \times (NIR - Red) / (NIR + C_1 \times Red - C_2 \times Blue + L)\) | Huete et al. (2002) |
| MERIS Terrestrial Chlorophyll Index | MTCI | \((NIR - RedEdge) / (RedEdge - Red)\) | Dash & Curran (2004) |
| Red Edge Chlorophyll Index | CIred edge | \(NIR / RedEdge - 1\) | Gitelson et al. (2003) |
| Structure Insensitive Pigment Index | SIPI | \((NIR - Blue) / (NIR - Red)\) | Peñuelas et al. (1995) |
| Difference Vegetation Index | DVI | \(NIR - Red\) | Jordan (1969) |
| Optimized Soil-Adjusted Vegetation Index | OSAVI | \((1 + 0.16) \times (NIR - Red) / (NIR + Red + 0.16)\) | Rondeaux et al. (1996) |
| Green-Blue NDVI | GBNDVI | \([NIR - (Green + Blue)] / [NIR + (Green + Blue)]\) | Wang et al. (2007) |
| Ratio Vegetation Index | RVI | \(NIR / Red\) | Jordan (1969) |
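The indices in Table 1 are simple band arithmetic; a sketch assuming per-plot mean reflectance values, with the standard EVI constants (G = 2.5, C1 = 6, C2 = 7.5, L = 1 from Huete et al. 2002) filled in where the table leaves them symbolic:

```python
import numpy as np

def compute_vis(blue, green, red, red_edge, nir):
    """Vegetation indices from Table 1. Inputs are surface reflectance
    values (scalars or NumPy arrays); eps guards against division by zero."""
    eps = 1e-9
    return {
        "NDVI": (nir - red) / (nir + red + eps),
        "GNDVI": (nir - green) / (nir + green + eps),
        "CVI": nir * red / (green ** 2 + eps),
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "MTCI": (nir - red_edge) / (red_edge - red + eps),
        "CIred_edge": nir / (red_edge + eps) - 1.0,
        "SIPI": (nir - blue) / (nir - red + eps),
        "DVI": nir - red,
        "OSAVI": 1.16 * (nir - red) / (nir + red + 0.16),
        "RVI": nir / (red + eps),
        "GBNDVI": (nir - (green + blue)) / (nir + (green + blue) + eps),
    }
```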

Texture Features (TFs): Texture features were extracted from the grayscale image of each spectral band using the Gray-Level Co-occurrence Matrix (GLCM) method. For each band (Blue, Green, Red, Red Edge, NIR), eight common GLCM-derived texture measures were calculated within a 3×3 moving window. These features describe the spatial distribution and relationship of pixel intensities:

1. Mean (Mea)
2. Variance (Var)
3. Homogeneity (Hom)
4. Contrast (Con)
5. Dissimilarity (Dis)
6. Entropy (Ent)
7. Second Moment (Sem)
8. Correlation (Cor)

The extraction of both VIs and TFs from the UAV imagery formed the basis for the subsequent correlation analysis and model development.
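The eight GLCM measures listed above can be written out explicitly. A simplified, dependency-free sketch for a single pixel offset over a whole image patch (a library such as scikit-image would normally handle the per-band, 3×3 moving-window extraction described in the text):

```python
import numpy as np

def glcm_features(gray: np.ndarray, levels: int = 8) -> dict:
    """The eight texture measures above, from a symmetric, normalized GLCM
    at offset (0, 1). Illustrative whole-patch version, not moving-window."""
    # quantize the grayscale patch to a small number of levels
    q = np.floor(gray / (gray.max() + 1e-12) * levels).astype(int)
    q = np.clip(q, 0, levels - 1)
    # accumulate horizontal neighbor pairs, then symmetrize and normalize
    glcm = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1.0
    p = glcm + glcm.T
    p /= p.sum()
    i, j = np.indices(p.shape)
    mean_i = (i * p).sum()
    var_i = ((i - mean_i) ** 2 * p).sum()
    nz = p[p > 0]
    cov = ((i - mean_i) * (j - mean_i) * p).sum()  # symmetric: mean_i == mean_j
    return {
        "Mea": mean_i,
        "Var": var_i,
        "Hom": (p / (1.0 + (i - j) ** 2)).sum(),
        "Con": ((i - j) ** 2 * p).sum(),
        "Dis": (np.abs(i - j) * p).sum(),
        "Ent": -(nz * np.log2(nz)).sum(),
        "Sem": (p ** 2).sum(),
        "Cor": cov / var_i if var_i > 0 else 1.0,
    }
```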

Statistical Analysis and Modeling Framework

Pearson correlation analysis was first performed between the ground-measured LCC and each extracted VI and TF for different growth stages (Seedling, Bud, Flowering-Boll, and Boll Opening stages). Highly correlated features (\(|r| > 0.6\) and \(p < 0.01\)) were selected as candidate inputs for the estimation models.
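The screening rule can be sketched with a plain Pearson correlation; the p < 0.01 significance check in the text would additionally use, e.g., scipy.stats.pearsonr, omitted here to keep the sketch dependency-free:

```python
import numpy as np

def screen_features(X: np.ndarray, names: list, y: np.ndarray,
                    r_thresh: float = 0.6) -> list:
    """Return names of feature columns whose Pearson |r| with the
    ground-measured LCC values `y` exceeds the threshold."""
    keep = []
    for k, name in enumerate(names):
        r = np.corrcoef(X[:, k], y)[0, 1]
        if abs(r) > r_thresh:
            keep.append(name)
    return keep
```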

Three modeling approaches were employed and compared:

1. Random Forest Regression (RF): An ensemble learning method that constructs multiple decision trees during training and outputs the mean prediction of the individual trees. It is robust to overfitting and can model non-linear relationships.

2. Residual Network (ResNet): A deep convolutional neural network architecture that uses residual connections to ease the training of very deep networks, helping to avoid the degradation problem.

3. 1D-Convolutional Neural Network (1D-CNN): A neural network designed to process sequential data. Here, the input features (VIs, TFs, or their combination) were treated as a 1D sequence. The model typically includes convolutional layers for local feature extraction, pooling layers for dimensionality reduction, and fully connected layers for regression.
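To make the 1D-CNN's "local feature extraction" concrete, a minimal NumPy forward pass (weights here are arbitrary; a real model, e.g. in PyTorch or Keras, would learn them from the training set):

```python
import numpy as np

def conv1d_relu(x: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """'Valid' 1D convolution + ReLU: each kernel slides along the feature
    vector (the VIs/TFs treated as a sequence), combining adjacent inputs."""
    k = kernels.shape[1]
    out = np.array([[float(np.dot(kern, x[i:i + k]))
                     for i in range(len(x) - k + 1)] for kern in kernels])
    return np.maximum(out, 0.0)

def cnn_forward(x: np.ndarray, kernels: np.ndarray, w_out: np.ndarray) -> float:
    """Minimal 1D-CNN regression head: conv -> global max pool -> linear."""
    pooled = conv1d_relu(x, kernels).max(axis=1)  # one value per kernel
    return float(pooled @ w_out)                  # scalar LCC estimate
```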

The dataset was randomly split into a training set (70%) for model building and a validation set (30%) for independent evaluation. Three types of input feature sets were tested for each model: (i) VIs only, (ii) TFs only, and (iii) Combined Features (VIs + TFs).
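The 70/30 random split can be sketched as follows (the seed is hypothetical; the study does not state one):

```python
import numpy as np

def train_val_split(n_samples: int, train_frac: float = 0.7, seed: int = 0):
    """Random index split into a training set (70%) and validation set (30%)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    return idx[:n_train], idx[n_train:]
```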

Model performance was evaluated using the Coefficient of Determination (\(R^2\)) and the Root Mean Square Error (\(RMSE\)), calculated as:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(S_i - M_i)^2}{\sum_{i=1}^{n}(M_i - \bar{M})^2}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(S_i - M_i)^2}$$

where \(M_i\) is the measured LCC value, \(\bar{M}\) is the mean of measured values, \(S_i\) is the estimated LCC value, and \(n\) is the number of samples. Higher \(R^2\) and lower \(RMSE\) indicate better model performance.
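Both metrics follow directly from their definitions:

```python
import numpy as np

def r2_rmse(measured: np.ndarray, estimated: np.ndarray):
    """R^2 and RMSE as defined above (M_i measured, S_i estimated)."""
    resid = estimated - measured
    ss_res = float((resid ** 2).sum())
    ss_tot = float(((measured - measured.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt((resid ** 2).mean()))
    return r2, rmse
```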

Results and Analysis

Correlation between Features and Leaf Chlorophyll Content

The correlation analysis revealed that the strength of association between remote sensing features and LCC varied with growth stage and feature type. Among the Vegetation Indices, GBNDVI, RVI, CIred edge, MTCI, GNDVI, and NDVI consistently showed high positive correlations (\(r > 0.66\)) with LCC across most stages, particularly during the Bud and Flowering-Boll stages, where correlations exceeded 0.77. This confirms the sensitivity of these spectral indices to chlorophyll concentration in UAV-acquired imagery.

The correlation between Texture Features and LCC was more stage-specific and generally weaker than that of VIs. Different TFs from different bands became significant at different stages. For instance, in the Bud stage, texture features from the Green and Red Edge bands (e.g., Homogeneity, Contrast) showed relatively higher correlations. This stage-dependence highlights the dynamic nature of canopy texture as the plant grows and develops.

Performance of LCC Estimation Models

The performance metrics for the models built using different algorithms and feature sets are summarized below. A key finding was the superior and more stable performance of the 1D-CNN model compared to RF and ResNet across most scenarios.

Models Based Solely on Vegetation Indices (VIs)

Table 2 presents the model performance using only the selected VIs as inputs. The 1D-CNN model consistently achieved the highest \(R^2\) and lowest \(RMSE\) on both training and validation sets across the three reported growth stages. For example, at the Bud stage, the 1D-CNN model yielded a validation \(R^2\) of 0.876 and \(RMSE\) of 0.277, outperforming both RF and ResNet. This demonstrates the capability of the 1D-CNN to effectively learn complex, non-linear relationships from spectral indices derived from UAV data.

Table 2. Performance comparison of LCC estimation models using Vegetation Indices (VIs) only.

| Growth Stage | Model | Training \(R^2\) | Training \(RMSE\) | Validation \(R^2\) | Validation \(RMSE\) |
|---|---|---|---|---|---|
| Bud Stage | RF | 0.869 | 0.309 | 0.829 | 0.514 |
| Bud Stage | ResNet | 0.848 | 0.370 | 0.811 | 0.344 |
| Bud Stage | 1D-CNN | 0.876 | 0.277 | 0.876 | 0.277 |
| Flower-Boll Stage | RF | 0.874 | 0.784 | 0.819 | 0.553 |
| Flower-Boll Stage | ResNet | 0.882 | 0.133 | 0.835 | 0.257 |
| Flower-Boll Stage | 1D-CNN | 0.893 | 0.184 | 0.853 | 0.451 |
| Boll Opening Stage | RF | 0.837 | 0.859 | 0.776 | 0.498 |
| Boll Opening Stage | ResNet | 0.802 | 0.452 | 0.751 | 0.414 |
| Boll Opening Stage | 1D-CNN | 0.845 | 0.537 | 0.789 | 0.496 |

Models Based Solely on Texture Features (TFs)

As shown in Table 3, models using only TFs generally exhibited lower accuracy compared to VI-based models. This aligns with the correlation analysis, indicating that spectral information is more directly related to chlorophyll pigmentation. However, the 1D-CNN model again proved to be the most effective among the three algorithms at extracting relevant information from texture data captured by the UAV. At the Flower-Boll stage, the 1D-CNN model with TFs achieved a training \(R^2\) of 0.587 and a validation \(R^2\) of 0.500.

Table 3. Performance comparison of LCC estimation models using Texture Features (TFs) only.

| Growth Stage | Model | Training \(R^2\) | Training \(RMSE\) | Validation \(R^2\) | Validation \(RMSE\) |
|---|---|---|---|---|---|
| Bud Stage | RF | 0.587 | 0.288 | 0.485 | 0.376 |
| Bud Stage | ResNet | 0.551 | 0.297 | 0.432 | 0.325 |
| Bud Stage | 1D-CNN | 0.594 | 0.234 | 0.493 | 0.322 |
| Flower-Boll Stage | RF | 0.570 | 1.038 | 0.460 | 0.896 |
| Flower-Boll Stage | ResNet | 0.516 | 0.891 | 0.405 | 0.812 |
| Flower-Boll Stage | 1D-CNN | 0.587 | 0.905 | 0.500 | 0.838 |

Models Based on Combined Features (VIs + TFs)

The integration of spectral VIs and spatial TFs resulted in the most accurate and robust LCC estimation models. Using the combined feature set with the 1D-CNN algorithm, identified above as the most effective, led to a marked improvement in performance. The results for the three main growth stages are consolidated in Table 4. The fusion model achieved very high accuracy on the training set (\(R^2\): 0.955-0.957) and maintained strong performance on the independent validation set (\(R^2\): 0.827-0.877). This represents a substantial improvement over models using either VIs or TFs alone. For instance, at the Boll Opening stage, the validation \(R^2\) for the combined model was 0.874, compared to 0.789 for the VI-only model and 0.491 for the TF-only model. This demonstrates the synergistic effect of combining spectral and textural information from UAV multispectral imagery.

Table 4. Performance of the optimal 1D-CNN model using Combined Features (VIs + TFs).

| Growth Stage | Training \(R^2\) | Training \(RMSE\) | Validation \(R^2\) | Validation \(RMSE\) |
|---|---|---|---|---|
| Bud Stage | 0.957 | 1.019 | 0.827 | 1.927 |
| Flower-Boll Stage | 0.957 | 1.057 | 0.877 | 1.732 |
| Boll Opening Stage | 0.955 | 0.915 | 0.874 | 1.408 |

Discussion

The results of this study underscore the potential of UAV technology as a powerful tool for precision agriculture, specifically for monitoring crop physiological traits such as chlorophyll content. The superiority of the 1D-CNN model aligns with its inherent strengths in processing sequential or structured data. Unlike RF, which builds independent trees, or standard CNNs/ResNets designed for 2D image grids, the 1D-CNN can efficiently capture intricate local patterns and dependencies within the feature vector (whether a sequence of VIs, TFs, or both). Its architecture, often incorporating mechanisms like batch normalization and dropout, provides strong regularization, which is crucial for achieving generalizable models from the high-dimensional but potentially limited datasets typical of field-based UAV studies.

The enhanced performance achieved by fusing VIs and TFs is a critical finding. Vegetation indices provide a biochemical perspective by quantifying light absorption and reflectance properties directly linked to chlorophyll and other pigments. In contrast, texture features offer a biophysical perspective, describing the structural arrangement of leaves and shadows within the canopy, which influences light interception and scattering. Chlorophyll content is inherently linked to both biochemistry (pigment concentration) and canopy structure (leaf area, orientation, clustering). Therefore, a model that incorporates both information sources has a more complete representation of the factors determining the observed remote sensing signal. The 1D-CNN model effectively learns the complex, non-linear interactions between these spectral and spatial features, leading to more accurate estimations than models based on a single data modality. This fusion strategy is particularly valuable for a UAV platform, which simultaneously captures high-resolution spectral and spatial information.

The observation that VI-only models outperformed TF-only models is consistent with the fundamental principles of plant spectroscopy. Chlorophyll has specific absorption features in the red and red-edge regions, making spectral indices directly sensitive to its concentration. Texture, while influenced by canopy density and health, is a more indirect measure and can be confounded by factors like sun angle, wind, and growth stage architecture. However, the substantial boost in accuracy when TFs are added to VIs indicates that texture provides valuable complementary information not fully encapsulated by spectral indices alone, especially for discriminating subtle variations within dense canopies or under different management practices.

The practical implications for agriculture in China are considerable. The proposed workflow (a cost-effective multispectral UAV for data acquisition, extraction of both spectral and textural features, and a 1D-CNN estimation model) provides a scalable, accurate, and timely method for mapping cotton LCC. This enables farmers and agronomists to diagnose spatial variability in crop nitrogen status and photosynthetic performance, supporting variable-rate application of fertilizers and irrigation, ultimately aiming for optimized resource use, increased yield, and reduced environmental impact.

Conclusion

This study demonstrates a robust framework for estimating cotton leaf chlorophyll content by integrating multispectral and textural information acquired from a UAV. Among the evaluated algorithms (Random Forest, ResNet, and 1D-CNN), the 1D-CNN model consistently exhibited superior stability and predictive accuracy. While vegetation indices (VIs) derived from spectral reflectance showed a stronger direct correlation with LCC than texture features (TFs), fusing both data types into a combined feature set yielded the highest estimation accuracy across the key growth stages (Bud, Flowering-Boll, and Boll Opening).

The optimal 1D-CNN model trained on the combined features achieved validation \(R^2\) values of 0.827, 0.877, and 0.874 for the three stages, respectively, significantly outperforming models based solely on VIs or TFs. This highlights the synergistic value of combining the biochemical information from spectra with the structural information from image texture for comprehensive crop monitoring.

In conclusion, the integration of UAV-based remote sensing with advanced deep learning analytics presents a highly effective and practical approach for non-destructive, high-throughput monitoring of cotton chlorophyll status. The proposed methodology provides a valuable tool for supporting precision management decisions in cotton production, contributing to the sustainable intensification of agriculture. Future work could explore the transferability of this model to other crops and regions, and the integration of temporal sequences of UAV data to model chlorophyll dynamics throughout the entire growing season.
