In my research on drip-irrigated cotton during the flowering and boll-setting stage, I systematically investigated the influence of different UAV flight altitudes on the accuracy of canopy temperature extraction. The experiment was conducted over two consecutive years, 2023 and 2024, at a test site in Shihezi, Xinjiang, China. The region has a temperate continental climate with hot, dry summers and cold winters, mean annual precipitation between 125 and 207 mm, and annual evaporation of 1000 to 1500 mm. The soil at the experimental site was clay loam, with an average field capacity of 16.07% (mass water content) and a dry bulk density of 1.43 g·cm⁻³. I selected the cotton variety Zhongmiansuo 99 for this study.
I set up four irrigation frequency treatments, with irrigation intervals of 3, 5, 7, and 12 days, and a total irrigation volume of 5250 m³·hm⁻² over the entire growth period. Each treatment was replicated four times, resulting in 16 plots, each measuring 6.0 m × 8.0 m. Water and fertilizer were managed entirely through drip fertigation. The drip tape had an emitter flow rate of 3.2 L·h⁻¹, with an emitter spacing of 20 cm. The planting pattern was one film with three pipes and three rows, with 76 cm equal row spacing and a sowing width of 2.28 m. Nitrogen was applied at 300 kg·hm⁻², P₂O₅ at 97 kg·hm⁻², and K₂O at 97 kg·hm⁻², all delivered with the irrigation water.
For thermal infrared image acquisition, I used the DJI Mavic 2 Enterprise Advanced UAV platform, equipped with a thermal infrared camera and RTK positioning module. The thermal camera operated in the 8 to 14 μm spectral range, with a sensor resolution of 640 × 512 pixels and a lens focal length of approximately 38 mm. I set the flight altitudes at 12 m, 20 m, 30 m, 50 m, and 70 m. Image acquisition was conducted between 12:00 and 13:00 Beijing time on clear, sunny days to minimize atmospheric interference. The forward and side overlap rates were both set to 80%, with the lens oriented vertically downward to capture the cotton canopy.

Ground data collection was performed simultaneously with the UAV flights. Within each experimental plot, I selected five representative cotton plants with uniform growth and no pest damage that adequately represented the overall growth conditions of the plot. Immediately after the UAV thermal infrared images were captured, I used a calibrated handheld infrared thermometer (model TP550, accuracy ±0.2°C) to measure the canopy temperature from 5 to 8 cm above the canopy. Each sample point was measured three times, and the average value was recorded. Additionally, I measured the temperatures of black and white calibration boards and pure water for temperature conversion and radiometric calibration of the UAV thermal infrared images.
The thermal infrared images were processed using Metashape software for stitching to obtain orthomosaic images of the experimental plots. I then used ENVI 5.3 to clip the plots and DJI Thermal Analysis Tool to extract temperatures from the black and white calibration boards and water. Image registration was performed in ENVI Classic by manually selecting at least 30 feature points, ensuring a root mean square error of less than 1. Finally, I converted the digital number (DN) values of the thermal infrared images to temperature values using ENVI 5.3 and calibrated the images using the measured temperatures of the black and white boards and water.
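The calibration step described above can be illustrated with a minimal Python sketch. It assumes a simple linear correction fitted to the three reference targets (black board, white board, and water); the function names, the example numbers, and the linear form are illustrative assumptions, not the ENVI workflow itself.

```python
import numpy as np

def fit_reference_calibration(image_temps, measured_temps):
    """Fit measured = slope * image + intercept from reference targets.

    Assumption: a first-order (linear) correction is adequate.
    """
    slope, intercept = np.polyfit(
        np.asarray(image_temps, dtype=float),
        np.asarray(measured_temps, dtype=float),
        deg=1,
    )
    return float(slope), float(intercept)

def apply_calibration(image, slope, intercept):
    """Apply the linear correction to every pixel of a temperature image."""
    return slope * np.asarray(image, dtype=float) + intercept

# Hypothetical readings (deg C) for black board, white board, and water.
image_refs = [20.0, 30.0, 40.0]
ground_refs = [21.0, 32.0, 43.0]
slope, intercept = fit_reference_calibration(image_refs, ground_refs)
calibrated = apply_calibration([[25.0, 35.0]], slope, intercept)
```

With only three reference targets the fit is sensitive to each reading, so in practice the targets should span the expected canopy temperature range.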
For canopy temperature extraction, I employed the K-means clustering algorithm as implemented in ENVI 5.6. The algorithm minimizes the sum of squared errors by iteratively updating cluster centroids and reassigning data points. The objective function is given by:
$$E = \sum_{i=1}^{K} \sum_{X \in C_i} \|X - \mu_i\|^2$$
where:
- \(E\) is the sum of squared errors
- \(K\) is the pre-set number of clusters
- \(C_i\) represents the i-th cluster
- \(\mu_i\) is the centroid of cluster \(C_i\)
- \(X\) is the feature vector of individual temperature observations
The centroid is calculated as:
$$\mu_i = \frac{1}{|C_i|} \sum_{X \in C_i} X$$
I set the number of classes to 5, the convergence threshold to 5.00, and the maximum number of iterations to 15. This parameter configuration effectively captured the temperature distribution patterns while controlling computational complexity.
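The iteration described above can be sketched in a few lines of NumPy. This is a generic Lloyd-style k-means on one-dimensional temperature values, not a reproduction of ENVI's implementation: the centroid initialisation, the `change_threshold` stopping rule (read here as a fraction of pixels changing class), and the function name are assumptions made for illustration.

```python
import numpy as np

def kmeans_1d(values, k=5, max_iter=15, change_threshold=0.05):
    """Cluster 1-D temperature values; returns labels, centroids, and SSE."""
    x = np.asarray(values, dtype=float).ravel()
    # Assumption: centroids start evenly spaced across the data range.
    centroids = np.linspace(x.min(), x.max(), k)
    labels = np.full(x.shape, -1)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
        changed = np.mean(new_labels != labels)
        labels = new_labels
        # Update step: each centroid becomes the mean of its members (mu_i).
        for i in range(k):
            members = x[labels == i]
            if members.size:
                centroids[i] = members.mean()
        # Stop once few enough values changed class in this pass.
        if changed <= change_threshold:
            break
    # Objective E: sum of squared distances to the assigned centroid.
    sse = float(np.sum((x - centroids[labels]) ** 2))
    return labels, centroids, sse

labels, centroids, sse = kmeans_1d([20.0, 20.1, 30.0, 30.2, 40.0, 40.1], k=3)
```

For a full thermal orthomosaic, the pixel array would be flattened first and the labels reshaped back to the image dimensions afterwards.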
To further improve canopy temperature extraction accuracy, I applied three different outlier removal methods to the temperature frequency distribution histograms:
- Top and bottom 1% elimination method: Remove the top 1% and bottom 1% of temperature pixels from the frequency distribution.
- Low-frequency 0.5% elimination method: Remove the lowest 0.5% of temperature pixels (for 12–30 m altitudes).
- Low-frequency 1% elimination method: Remove the lowest 1% of temperature pixels (for 50–70 m altitudes).
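The three removal rules listed above can be approximated with percentile cut-offs. The sketch below takes the descriptions literally (trimming the lowest and highest fractions of pixel temperatures); the function names are hypothetical, and the exact histogram-based procedure used in the study may differ.

```python
import numpy as np

def trim_top_bottom(temps, pct=1.0):
    """Drop pixels below the pct-th and above the (100 - pct)-th percentile."""
    t = np.asarray(temps, dtype=float).ravel()
    lo, hi = np.percentile(t, [pct, 100.0 - pct])
    return t[(t >= lo) & (t <= hi)]

def trim_lowest(temps, pct=0.5):
    """Drop pixels below the pct-th percentile (0.5 or 1.0 in the text)."""
    t = np.asarray(temps, dtype=float).ravel()
    lo = np.percentile(t, pct)
    return t[t >= lo]

# Synthetic example: 1000 evenly spaced "temperatures".
pixels = np.arange(1000.0)
kept_both = trim_top_bottom(pixels, pct=1.0)   # two-sided 1% trim
kept_low = trim_lowest(pixels, pct=0.5)        # one-sided 0.5% trim
```

The canopy temperature for a plot would then be the mean of the retained pixels.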
I then performed correlation analysis between the canopy temperatures extracted with and without outlier removal and the ground-measured temperatures. I used a 2:1 ratio to split the data into training and validation sets.
For model evaluation, I employed two core metrics: the coefficient of determination (\(R^2\)) and the root mean square error (\(RMSE\)). For the correlation analysis, I also used the Pearson correlation coefficient \(r\), given by:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
The coefficient of determination is:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
The root mean square error is:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
where \(x_i\) and \(y_i\) are observed values, \(\bar{x}\) and \(\bar{y}\) are their respective means, \(\hat{y}_i\) is the predicted value, and \(n\) is the number of samples.
Additionally, I calculated the coefficient of variation (CV) to assess the relative dispersion of the data:
$$CV = \frac{\sigma}{\mu} \times 100\%$$
where \(\sigma\) is the standard deviation and \(\mu\) is the mean of the dataset:
$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}$$
$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$
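The formulas above translate directly into NumPy. The following sketch mirrors them term by term, including the population (1/n) form of the standard deviation; the function names are illustrative.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two samples."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y, y_hat):
    """Root mean square error in the units of y."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def coefficient_of_variation(x):
    """CV in percent, using the population standard deviation (1/n)."""
    x = np.asarray(x, dtype=float)
    return float(np.std(x) / np.mean(x) * 100.0)
```

Note that the tables in this article report CV as a fraction (for example 0.079) rather than as a percentage.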
My study generated a total of 560 valid data points from both years, with 112 data points for each of the five flight altitudes. Based on the treatment means, I obtained 28 characteristic data points, divided into a training set (n = 19) and a validation set (n = 9) using a 2:1 ratio. I ensured uniform sample distribution through outlier removal to maintain model training stability and validation reliability.
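As a rough illustration of the 2:1 split, the sketch below shuffles sample indices and assigns two thirds to training. Random shuffling and the fixed seed are assumptions; the text does not state how samples were allocated to each set.

```python
import numpy as np

def split_two_to_one(n_samples, seed=0):
    """Return (train_idx, val_idx) with roughly a 2:1 size ratio."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = round(n_samples * 2 / 3)  # 28 samples give 19 for training
    return order[:n_train], order[n_train:]

train_idx, val_idx = split_two_to_one(28)
print(len(train_idx), len(val_idx))  # 19 9
```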
Effects of Different Outlier Removal Methods on Canopy Temperature Extraction Accuracy
I first compared the performance of different outlier removal methods for the low-altitude range (12–30 m) and the high-altitude range (50–70 m). For the 12–30 m range, I compared the original temperature data, the top and bottom 1% elimination method, and the low-frequency 0.5% elimination method.
For the 2023 data at 12–30 m, the original temperature data fitted with the measured temperature yielded a regression equation of \(y = 1.093x - 5.275\) with \(R^2 = 0.863\). The top and bottom 1% elimination method improved \(R^2\) to 0.870, an increase of 0.81%. The low-frequency 0.5% elimination method further improved \(R^2\) to 0.874, an increase of 1.27%. For the 2024 data, the original data gave \(R^2 = 0.926\). The top and bottom 1% elimination method maintained \(R^2\) at 0.926, while the low-frequency 0.5% elimination method improved it to 0.934, an increase of 0.86%.
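The percentage increases quoted above are relative changes with respect to the \(R^2\) of the original data. A short check, using only the rounded values reported in the text:

```python
def relative_change_pct(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return (value - reference) / reference * 100.0

print(round(relative_change_pct(0.863, 0.870), 2))  # 0.81
print(round(relative_change_pct(0.863, 0.874), 2))  # 1.27
print(round(relative_change_pct(0.926, 0.934), 2))  # 0.86
```

Because the inputs are themselves rounded to three decimals, percentages recomputed this way can differ slightly from those derived from unrounded values.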
Table 1 summarizes the characteristic values of canopy temperature at 12–30 m for both years.
| Item | 2023 Original | 2023 Top and Bottom 1% | 2023 Low-freq 0.5% | 2024 Original | 2024 Top and Bottom 1% | 2024 Low-freq 0.5% |
|---|---|---|---|---|---|---|
| Maximum value | 39.413 | 39.399 | 39.352 | 35.252 | 35.258 | 35.311 |
| Minimum value | 22.939 | 22.936 | 22.928 | 24.716 | 24.680 | 24.610 |
| Mean value | 29.721 | 29.706 | 29.665 | 28.048 | 28.023 | 27.987 |
| Standard deviation | 2.363 | 2.363 | 2.365 | 2.281 | 2.283 | 2.331 |
| Coefficient of variation | 0.079 | 0.080 | 0.080 | 0.081 | 0.081 | 0.083 |
The validation results for the 12–30 m range showed that the low-frequency 0.5% elimination method outperformed the other methods. For 2023, the validation \(R^2\) values were 0.744 for the top and bottom 1% method and 0.750 for the low-frequency 0.5% method, representing an improvement of 0.89%. For 2024, the validation \(R^2\) values were 0.900 for the top and bottom 1% method and 0.939 for the low-frequency 0.5% method, an improvement of 4.33%. The regression equation for the low-frequency 0.5% method in 2024 was \(y = 0.830x + 5.242\) with \(R^2 = 0.939\). These results indicate that for the 12–30 m altitude range, the low-frequency 0.5% elimination method is the most effective outlier removal strategy.
For the high-altitude range (50–70 m), I compared the original temperature data, the top and bottom 1% elimination method, and the low-frequency 1% elimination method. For 2023, the original temperature data fitted with measured temperature gave \(R^2 = 0.832\). The top and bottom 1% elimination method improved \(R^2\) to 0.833, while the low-frequency 1% elimination method resulted in \(R^2 = 0.831\). For 2024, the original data yielded \(R^2 = 0.912\). The top and bottom 1% elimination method improved \(R^2\) to 0.914, while the low-frequency 1% elimination method reduced \(R^2\) to 0.896.
Table 2 summarizes the characteristic values of canopy temperature at 50–70 m for both years.
| Item | 2023 Original | 2023 Top and Bottom 1% | 2023 Low-freq 1% | 2024 Original | 2024 Top and Bottom 1% | 2024 Low-freq 1% |
|---|---|---|---|---|---|---|
| Maximum value | 34.404 | 34.350 | 34.139 | 35.877 | 35.868 | 36.327 |
| Minimum value | 22.861 | 22.856 | 22.839 | 23.619 | 23.617 | 23.628 |
| Mean value | 28.274 | 28.252 | 28.210 | 26.564 | 26.529 | 26.365 |
| Standard deviation | 1.884 | 1.883 | 1.872 | 2.166 | 2.163 | 2.158 |
| Coefficient of variation | 0.067 | 0.067 | 0.066 | 0.082 | 0.082 | 0.082 |
Validation results for the 50–70 m range showed that the top and bottom 1% elimination method performed best. For 2023, the validation \(R^2\) was 0.814 for the top and bottom 1% method, compared to 0.799 for the original data and 0.722 for the low-frequency 1% method. For 2024, the validation \(R^2\) was 0.618 for the top and bottom 1% method, while the low-frequency 1% method achieved 0.794. However, considering the overall fitting performance, the top and bottom 1% elimination method provided the most consistent results across both years for the 50–70 m altitude range.
Based on these findings, I established a guideline for outlier removal: for low-altitude flights (12–30 m), the low-frequency 0.5% elimination method should be used, while for high-altitude flights (50–70 m), the top and bottom 1% elimination method is recommended. This differentiated approach optimizes data quality and maximizes prediction accuracy.
Effect of Flight Altitude on Canopy Temperature Retrieval Accuracy
After determining the optimal outlier removal methods for each altitude range, I investigated the specific impact of flight altitude on canopy temperature retrieval accuracy. I applied the low-frequency 0.5% elimination method to the 12 m, 20 m, and 30 m data, and the top and bottom 1% elimination method to the 50 m and 70 m data. I then established regression models between the optimized image temperatures and the measured ground temperatures for each altitude.
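A per-altitude regression of this kind can be sketched as an ordinary least-squares line fitted on the training samples and scored on the validation samples. The function names and example numbers are hypothetical, and scoring the training line on held-out data is one possible reading of the validation step, not necessarily the procedure used here.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(
        np.asarray(x, dtype=float), np.asarray(y, dtype=float), deg=1
    )
    return float(slope), float(intercept)

def evaluate(x, y, slope, intercept):
    """Return (R^2, RMSE) of the fitted line on the given samples."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    y_hat = slope * x + intercept
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot), float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Hypothetical image temperatures (x) and ground temperatures (y), deg C.
x_train, y_train = [25.0, 27.0, 29.0, 31.0], [26.1, 27.8, 30.2, 31.9]
x_val, y_val = [26.0, 30.0], [27.0, 31.2]
slope, intercept = fit_line(x_train, y_train)
r2_val, rmse_val = evaluate(x_val, y_val, slope, intercept)
```

Repeating this for each of the five altitudes yields one row of fitted coefficients and scores per altitude and year.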
For the training set, the \(R^2\) values ranged from 0.677 to 0.863 across the five altitudes in 2023 and from 0.687 to 0.783 in 2024. In 2023, the 30 m altitude produced the highest training \(R^2\) (0.863) and the lowest training RMSE (2.424). In 2024, the 20 m altitude had the highest training \(R^2\) (0.783), and the 12 m and 20 m altitudes had the lowest training RMSE (2.313 and 2.345). In the 2024 validation set, the 12 m, 20 m, and 30 m altitudes showed relatively low RMSE values of 1.104, 1.314, and 1.657, respectively, while the 50 m and 70 m altitudes had higher RMSE values of 9.744 and 5.321.
Table 3 presents the model characteristics for different flight altitudes in 2023 and 2024.
| Year | Altitude (m) | Fitting equation | Fitting \(R^2\) | Fitting RMSE | Validation equation | Validation \(R^2\) | Validation RMSE |
|---|---|---|---|---|---|---|---|
| 2023 | 12 | \(y = 0.919x - 0.408\) | 0.709 | 3.136 | \(y = 1.208x - 2.704\) | 0.855 | 3.459 |
| 2023 | 20 | \(y = 1.025x - 3.331\) | 0.819 | 2.614 | \(y = 1.440x - 9.288\) | 0.830 | 3.737 |
| 2023 | 30 | \(y = 1.607x - 21.630\) | 0.863 | 2.424 | \(y = 1.301x - 6.188\) | 0.923 | 2.682 |
| 2023 | 50 | \(y = 0.893x - 0.118\) | 0.677 | 3.653 | \(y = 0.294x + 23.514\) | 0.810 | 2.647 |
| 2023 | 70 | \(y = 0.930x - 1.980\) | 0.741 | 4.285 | \(y = 0.287x + 23.853\) | 0.541 | 2.734 |
| 2024 | 12 | \(y = 0.795x + 4.093\) | 0.740 | 2.313 | \(y = 1.162x - 4.041\) | 0.829 | 1.104 |
| 2024 | 20 | \(y = 0.752x + 5.346\) | 0.783 | 2.345 | \(y = 1.428x - 11.655\) | 0.917 | 1.314 |
| 2024 | 30 | \(y = 1.014x - 3.949\) | 0.720 | 3.664 | \(y = 0.479x + 17.025\) | 0.766 | 1.657 |
| 2024 | 50 | \(y = 0.774x + 3.243\) | 0.701 | 3.804 | \(y = 0.958x + 10.987\) | 0.817 | 9.744 |
| 2024 | 70 | \(y = 0.788x - 1.690\) | 0.687 | 4.904 | \(y = 1.550x - 11.397\) | 0.737 | 5.321 |
The validation results confirmed the superior performance of the 30 m altitude. In 2023, the validation \(R^2\) for the 30 m model reached 0.923, the highest among all altitudes, with an RMSE of 2.682. In 2024, the validation \(R^2\) for the 30 m model was 0.766, with an RMSE of 1.657. While the 12 m and 20 m altitudes showed higher validation \(R^2\) in 2024 (0.829 and 0.917, respectively), the 30 m altitude provided more consistent performance across both years, with the best overall balance between fitting accuracy and prediction stability.
Comparing the performance across all altitudes, the models demonstrated that low to medium altitudes (12–30 m) consistently outperformed high altitudes (50–70 m). In 2023, the RMSE for the 30 m model was 2.424, while the 70 m model had an RMSE of 4.285, representing a 43.4% reduction in error at 30 m. In 2024, although the 12 m and 20 m models had lower RMSE values, the 30 m model showed greater consistency in validation performance.
The results also revealed that the 50 m and 70 m altitudes showed marked degradation in model performance, particularly in 2023, where the validation \(R^2\) for 70 m dropped to 0.541, a 26.99% decrease relative to its training \(R^2\) of 0.741 and far below the validation \(R^2\) of 0.923 at 30 m. This decline can be attributed to several factors. First, higher flight altitudes increase the atmospheric path length, leading to greater atmospheric attenuation and scattering of thermal radiation. Second, increased altitude reduces spatial resolution, resulting in mixed pixels that combine canopy, soil, and shadow contributions, making it difficult to isolate pure canopy temperature signals. Third, the wider ground footprint at higher altitudes captures more background soil and non-canopy elements, introducing additional noise into the temperature extraction process.
Furthermore, I observed that the coefficient of variation (CV) did not show a consistent trend with altitude. In 2023, the CV for 12–30 m ranged from 0.079 to 0.080, while for 50–70 m it was lower, at 0.066 to 0.067. In 2024, however, the CV for 12–30 m was 0.081 to 0.083, and for 50–70 m it was 0.082. This suggests that while higher altitudes may reduce variability in some cases, they do not necessarily improve the signal-to-noise ratio, because of the increased influence of external factors.
Discussion
My findings demonstrate that the accuracy of canopy temperature extraction using UAV thermal infrared remote sensing is significantly influenced by both flight altitude and the method of outlier removal. The optimal combination identified in this study is a flight altitude of 30 m combined with the low-frequency 0.5% elimination method for outlier removal. This combination achieved the highest prediction accuracy, with the 2023 model showing an RMSE improvement of 70.71% and the 2024 model showing an improvement of 4.03% compared to other altitude-method combinations.
Selecting an appropriate outlier removal method is important. The frequency distribution histograms of canopy temperatures extracted from UAV thermal infrared images typically exhibit a unimodal normal distribution. However, due to sensor noise, atmospheric interference, and mixed pixel effects, the histograms often contain anomalous peaks or fluctuations at the extremes. Removing these outliers improves data quality and leads to more accurate temperature estimates. My results show that the optimal removal threshold varies with flight altitude. At low altitudes (12–30 m), where the thermal images have higher spatial resolution and less atmospheric contamination, the low-frequency 0.5% elimination method is sufficient to remove the most significant outliers without excessive data loss. At high altitudes (50–70 m), where the images are more susceptible to atmospheric effects and mixed pixels, a more aggressive removal method (top and bottom 1% elimination) is necessary to achieve optimal accuracy.
The effect of flight altitude on canopy temperature retrieval accuracy is profound. As flight altitude increases, the spatial resolution of the thermal infrared images decreases, leading to more mixed pixels. A mixed pixel contains contributions from multiple surface types, such as canopy, soil, and shadows, which complicates the extraction of pure canopy temperature. Additionally, the atmospheric path length increases with altitude, causing greater attenuation of thermal radiation and introducing errors from atmospheric water vapor and aerosol scattering. These factors collectively degrade the accuracy of temperature retrieval at higher altitudes.
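The loss of spatial resolution with altitude can be quantified in relative terms without knowing the sensor geometry. The sketch below assumes a nadir-pointing camera with fixed optics over flat ground, so that pixel footprint width scales linearly with flight altitude; canopy height is ignored, and the 12 m reference is simply the lowest altitude flown.

```python
ALTITUDES_M = (12, 20, 30, 50, 70)

def footprint_scale(altitude_m, reference_m=12):
    """Return (linear, area) growth of one pixel's ground footprint.

    Assumption: footprint width is proportional to flight altitude.
    """
    linear = altitude_m / reference_m
    return linear, linear ** 2

for altitude in ALTITUDES_M:
    linear, area = footprint_scale(altitude)
    print(f"{altitude} m: {linear:.2f}x width, {area:.2f}x area")
```

Under these assumptions a pixel at 30 m covers 6.25 times the ground area of a pixel at 12 m, and at 70 m about 34 times, which is consistent with the growing share of mixed pixels described above.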
My finding that 30 m is the optimal flight altitude for cotton canopy temperature extraction agrees with previous studies. For example, Dang et al. (2024) found that 30 m was the optimal altitude for extracting cotton canopy temperature at the seedling and bud stages using UAV thermal infrared technology, achieving correlation coefficients as high as 0.94 and 0.95. Similarly, Shao et al. (2024) used 30 m as the flight altitude for estimating cotton biomass based on UAV multispectral imaging, while Zhao et al. (2024) employed 30 m for cotton growth parameter and yield estimation using UAV multispectral remote sensing. Taken together, these studies suggest that 30 m is a robust and widely applicable altitude for UAV-based cotton monitoring in the Xinjiang region.
The superiority of the 30 m altitude can be attributed to a balance between spatial resolution and coverage efficiency. At 30 m, the thermal infrared camera captures sufficient detail to distinguish individual plants and canopy structures while maintaining a reasonable swath width for efficient field coverage. At lower altitudes (12–20 m), although the spatial resolution is higher, the number of images required to cover the same area increases substantially, leading to longer flight times and greater computational requirements for image stitching. Moreover, the increased number of images does not necessarily translate to higher accuracy, as the benefits of finer resolution may be offset by increased sensitivity to local variations in plant structure and leaf orientation. At higher altitudes (50–70 m), the reduction in spatial resolution leads to a loss of detail and an increase in mixed pixels, which degrades the accuracy of canopy temperature extraction.
The atmospheric effect is another critical factor. At 30 m, the atmospheric path length is relatively short, minimizing the attenuation of thermal radiation. As altitude increases to 50 m or 70 m, the path length increases, leading to greater absorption and scattering by atmospheric gases and aerosols. This effect is particularly pronounced in the thermal infrared spectral region, where water vapor absorption bands are strong. The Xinjiang region, where my study was conducted, experiences dry conditions with low atmospheric water vapor content, which somewhat mitigates the atmospheric effect. Nevertheless, the influence of altitude on atmospheric transmission remains significant.
My study also highlights the importance of considering the crop growth stage. The flowering and boll-setting stage is the critical water-sensitive period for cotton. At this stage, the canopy structure is fully developed, with dense foliage and a relatively uniform spatial distribution of leaves. The canopy temperature at this stage is highly sensitive to water stress, making accurate temperature extraction essential for irrigation decision-making. The optimal flight altitude and outlier removal method identified in this study are specifically tailored to this growth stage. For other growth stages, such as the seedling stage or the boll-opening stage, when the canopy structure is less dense or more heterogeneous, the optimal parameters may differ.
Furthermore, my results demonstrate the importance of using appropriate evaluation metrics. The coefficient of determination (\(R^2\)) and the root mean square error (\(RMSE\)) provide complementary information about model performance. \(R^2\) indicates the proportion of variance in the measured temperature that is explained by the model, while \(RMSE\) provides an absolute measure of prediction error in the original units (°C). Both metrics should be considered when evaluating model performance. For practical applications, \(RMSE\) is particularly important, as it directly quantifies the uncertainty in temperature estimates. An \(RMSE\) of 2.424°C, as achieved by the 30 m model in 2023, indicates that the predicted canopy temperature is, on average, within 2.4°C of the measured value. This level of accuracy is sufficient for many irrigation management applications, such as detecting water stress and scheduling irrigation events.
Conclusions
Based on two years of field experiments (2023 and 2024) on drip-irrigated cotton during the flowering and boll-setting stage, I systematically evaluated the effects of UAV flight altitude and outlier removal methods on canopy temperature extraction accuracy. The key conclusions of my study are as follows:
First, the optimal outlier removal method depends on the flight altitude range. For low-altitude flights between 12 m and 30 m, the low-frequency 0.5% elimination method provides the highest prediction accuracy, retaining 99.5% of the valid data while limiting the influence of outliers within 2.43°C. For high-altitude flights between 50 m and 70 m, the top and bottom 1% elimination method is more effective, ensuring data utilization rates exceeding 98% and maintaining RMSE below 3.90°C. This differentiated approach optimizes data quality and maximizes prediction accuracy, with fitted \(R^2\) values exceeding 0.90 for both altitude ranges in 2024 (0.934 and 0.914).
Second, flight altitude is a significant factor affecting canopy temperature extraction accuracy. The 30 m flight altitude consistently produced the best overall performance, with the 2023 model achieving an \(R^2\) of 0.863 and RMSE of 2.424°C, and the 2024 model achieving an \(R^2\) of 0.720 and RMSE of 3.664°C. The 30 m altitude significantly outperformed higher altitudes (50 m and 70 m), which showed substantial degradation in accuracy due to increased atmospheric effects and reduced spatial resolution.
Third, the combination of 30 m flight altitude with the low-frequency 0.5% elimination method represents the optimal strategy for cotton canopy temperature extraction during the flowering and boll-setting stage. This combination improved prediction accuracy by 70.71% in 2023 and 4.03% in 2024 compared to other altitude-method combinations. The improved accuracy is attributed to the balance between spatial resolution, atmospheric effects, and data quality achieved at this altitude.
My findings provide technical support and methodological guidance for precise monitoring of water status, dynamic assessment, and quantitative, intelligent irrigation decision-making in drip-irrigated cotton fields using UAV thermal infrared remote sensing technology. Future research could extend this work by investigating a finer gradient of flight altitudes, exploring more sophisticated outlier removal algorithms such as machine learning-based approaches, and incorporating additional variables such as atmospheric temperature and humidity to further improve canopy temperature prediction accuracy. Additionally, the generalizability of the optimal parameters to other cotton varieties and growing regions should be validated through further field studies.
