UAV Drones and Color-Space Fusion: An Advanced Framework for High-Accuracy Water Depth Inversion in Turbid Aquaculture Ponds

Accurate and frequent monitoring of water depth in coastal aquaculture ponds is paramount for sustainable management. These small, often geometrically regular water bodies are hotspots for production but also for environmental impact, with effluents rich in nutrients and organic matter influencing adjacent ecosystems. Traditional measurement techniques, while precise, are prohibitively costly and inefficient for the frequent, large-area monitoring required for dynamic management. Here, UAV drones emerge as a transformative tool, offering a unique combination of high spatial resolution, operational flexibility, and cost-effectiveness. They bridge the gap between coarse satellite data and labor-intensive field surveys.

However, deriving accurate water depth from UAV drone imagery in turbid aquaculture environments remains a significant challenge. The complex optical properties of such waters, characterized by high concentrations of suspended sediments and dissolved organic matter, coupled with interference from variable bottom substrates, severely limit the performance of traditional spectral band ratio or linear regression models. These models often fail to capture the non-linear relationships between radiance and depth under such conditions.

This study addresses this critical gap by introducing a novel methodological framework that synergizes UAV drone-acquired multispectral data with features derived from multiple color-space transformations. The core hypothesis is that converting standard Red-Green-Blue (RGB) representations into perceptually uniform or decorrelated color spaces can extract more robust features related to water optical properties, thereby significantly enhancing the ability of machine learning models to invert water depth in challenging, high-turbidity pond environments.

1. Methodology: A Fusion of Spectral and Color-Space Intelligence

The proposed workflow integrates data acquisition via UAV drones, advanced image processing through color-space conversion, feature engineering, and predictive modeling using a suite of machine learning algorithms.

1.1 Data Acquisition with UAV Drones

A commercial multirotor UAV drone equipped with a calibrated multispectral sensor was deployed. The sensor captures data in five distinct spectral bands, detailed in Table 1. Flight missions were conducted at a low altitude under stable, clear-sky conditions synchronized with in-situ depth sampling using a graduated pole, ensuring precise georeferencing between image pixels and measurement points.

Table 1: Spectral Band Specifications of the UAV Drone Sensor
Band Designation Band Name Center Wavelength (nm) Bandwidth (nm)
B1 Blue 450 32
B2 Green 560 32
B3 Red 650 32
B4 Red Edge 730 32
B5 Near-Infrared (NIR) 840 52

1.2 Color-Space Transformation and Feature Engineering

The standard RGB composite from the UAV drone imagery was programmatically transformed into three alternative color spaces: HSV (Hue, Saturation, Value), CIE Lab (Lightness, a*, b*), and YUV (luma Y, chrominance U and V). These transformations decouple intensity information from chromatic information, which are confounded in the RGB space, thereby creating features that may be more invariant to illumination changes and more sensitive to variations in water constituents.

1.2.1 RGB to HSV Conversion:
The HSV model represents color in terms more aligned with human perception.
$$\theta = \cos^{-1}\left( \frac{0.5[(R - G) + (R - B)]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right)$$
$$H = \begin{cases} \dfrac{\theta}{360°}, & B \le G \\[4pt] \dfrac{360° - \theta}{360°}, & B > G \end{cases}$$
$$S = 1 - \frac{\min(R, G, B)}{\max(R, G, B)}$$
$$V = \max(R, G, B)$$
Here, $H$ (Hue) represents the dominant wavelength, $S$ (Saturation) represents the purity of the color, and $V$ (Value) represents the brightness.
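As a concrete check of the conversion above, the following is a minimal NumPy sketch for a single normalized RGB pixel (values in [0, 1]); the small epsilon guarding the division is an implementation detail added here, not part of the original formulation.

```python
import numpy as np

def rgb_to_hsv(r, g, b):
    """Arccos-based HSV conversion as given in the text.

    Inputs are normalized RGB components in [0, 1]; H is returned
    normalized to [0, 1] (i.e., degrees divided by 360)."""
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b))
    # Epsilon avoids division by zero for achromatic (gray) pixels.
    theta = np.degrees(np.arccos(np.clip(num / (den + 1e-12), -1.0, 1.0)))
    h = theta / 360.0 if b <= g else (360.0 - theta) / 360.0
    mx = max(r, g, b)
    s = 1.0 - min(r, g, b) / mx if mx > 0 else 0.0
    v = mx
    return h, s, v
```

For pure green (0, 1, 0) this yields H = 120°/360° = 1/3, and for pure blue (0, 0, 1) the B > G branch gives H = 240°/360° = 2/3, matching the conventional hue circle.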

1.2.2 RGB to CIE Lab Conversion:
The Lab color space is designed to be perceptually uniform. The conversion first involves a transformation to an intermediate XYZ space.
$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.490 & 0.310 & 0.200 \\ 0.177 & 0.812 & 0.011 \\ 0.000 & 0.010 & 0.990 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$
$$L^* = 116 \cdot f(Y/Y_n) - 16$$
$$a^* = 500 \cdot [f(X/X_n) - f(Y/Y_n)]$$
$$b^* = 200 \cdot [f(Y/Y_n) - f(Z/Z_n)]$$
where $f(t) = t^{1/3}$ for $t > 0.008856$ and $f(t) = 7.787t + 16/116$ otherwise, and $X_n$, $Y_n$, $Z_n$ are the tristimulus values of the reference white. $L^*$ represents lightness, $a^*$ the green-red opponent axis, and $b^*$ the blue-yellow opponent axis.
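A direct transcription of this two-step conversion, using the RGB-to-XYZ matrix above; normalizing so that the reference white maps to $X_n = Y_n = Z_n = 1$ is an assumption made here purely for illustration.

```python
import numpy as np

# RGB -> XYZ matrix from the text.
M = np.array([[0.490, 0.310, 0.200],
              [0.177, 0.812, 0.011],
              [0.000, 0.010, 0.990]])

def f(t):
    """Piecewise cube-root compression from the Lab definition."""
    return np.where(t > 0.008856, np.cbrt(t), 7.787 * t + 16.0 / 116.0)

def rgb_to_lab(rgb, white=(1.0, 1.0, 1.0)):
    """Convert normalized RGB to CIE Lab via the intermediate XYZ space.

    `white` gives (X_n, Y_n, Z_n); the default of all ones is an
    illustrative assumption, not a value from the study."""
    x, y, z = M @ np.asarray(rgb, dtype=float)
    xn, yn, zn = white
    L = 116.0 * f(y / yn) - 16.0
    a = 500.0 * (f(x / xn) - f(y / yn))
    b = 200.0 * (f(y / yn) - f(z / zn))
    return float(L), float(a), float(b)
```

Note that each row of the matrix sums to 1.0, so an RGB white of (1, 1, 1) maps to X = Y = Z = 1 and hence to $L^* = 100$, $a^* = b^* = 0$, which is a convenient sanity check.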

1.2.3 RGB to YUV Conversion:
The YUV model separates luminance (Y) from chrominance (U, V).
$$
\begin{aligned}
Y &= 0.299R + 0.587G + 0.114B \\
U &= -0.169R - 0.331G + 0.500B \\
V &= 0.500R - 0.419G - 0.081B
\end{aligned}
$$
In this study, the V component from YUV is denoted as $V1$ to avoid confusion with the Value component from HSV.
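The YUV transform is a single matrix multiply; a minimal sketch, with the chrominance V renamed V1 as in the text:

```python
import numpy as np

# Coefficients taken directly from the YUV equations above.
YUV_MATRIX = np.array([[ 0.299,  0.587,  0.114],
                       [-0.169, -0.331,  0.500],
                       [ 0.500, -0.419, -0.081]])

def rgb_to_yuv(rgb):
    """Convert a normalized RGB pixel to (Y, U, V1).

    V1 is the YUV chrominance component, renamed to avoid a clash
    with HSV's Value channel in the fused feature set."""
    y, u, v1 = YUV_MATRIX @ np.asarray(rgb, dtype=float)
    return float(y), float(u), float(v1)
```

For an achromatic pixel such as white (1, 1, 1), both chrominance components are zero by construction (each chrominance row sums to 0), leaving only the luma Y = 1.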

From these transformations, nine new feature channels (H, S, V, L, a, b, Y, U, V1) were extracted per pixel and fused at the pixel level with the original five spectral bands (B1-B5). This created an enriched feature dataset for model training.
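Pixel-level fusion amounts to stacking the nine derived channels with the five bands along the channel axis and flattening to one row per pixel. A sketch with random placeholder imagery (the tile size and random values are illustrative assumptions, not study data):

```python
import numpy as np

h, w = 64, 64                      # illustrative tile size
bands = np.random.rand(h, w, 5)    # B1-B5 reflectance placeholder
hsv   = np.random.rand(h, w, 3)    # H, S, V
lab   = np.random.rand(h, w, 3)    # L, a, b
yuv   = np.random.rand(h, w, 3)    # Y, U, V1

# Fuse along the channel axis: 5 spectral + 9 color-space = 14 features.
features = np.concatenate([bands, hsv, lab, yuv], axis=-1)

# One row per pixel, ready for model training.
X = features.reshape(-1, features.shape[-1])
```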

1.3 Feature Importance Analysis with SHAP

To understand the contribution of each feature and to perform feature selection, SHapley Additive exPlanations (SHAP) was employed. Based on cooperative game theory, SHAP assigns each feature an importance value for a particular prediction by considering all possible feature combinations. The mean absolute SHAP value across all samples was used to rank features. For each model type, the top five most important features were selected to construct a more parsimonious and interpretable final model, mitigating potential overfitting from the initial high-dimensional feature set.
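The study used the SHAP library; purely to illustrate the coalition-enumeration idea behind it, the following toy computes exact Shapley values for a single prediction by brute force, replacing "absent" features with a baseline. This is feasible only for a handful of features (2^n coalitions) and is not the study's implementation.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction.

    `predict` maps a feature vector to a scalar; features outside a
    coalition are set to `baseline`. Weights follow the classic
    Shapley formula |S|!(n-|S|-1)!/n!."""
    n = len(x)

    def value(S):
        z = baseline.copy()
        S = list(S)
        z[S] = x[S]          # present features take their true values
        return predict(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi
```

For a linear model the attributions reduce to $w_i (x_i - \text{baseline}_i)$, which makes the toy easy to verify; ranking features by the mean absolute attribution across samples mirrors the mean-|SHAP| ranking used in the study.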

1.4 Machine Learning Models for Depth Inversion

Five distinct machine learning algorithms were rigorously evaluated to identify the most robust predictor for water depth from UAV drone data. Their key characteristics are summarized below.

Table 2: Summary of Machine Learning Algorithms Evaluated
Algorithm Acronym Type Core Principle
Support Vector Regression SVR Kernel-based Maps data to a high-dimensional space to find a hyperplane that minimizes error within a margin (ε).
Multi-Layer Perceptron Regressor MLPR Neural Network A feedforward artificial neural network model that uses backpropagation for training multiple layers of nodes.
Random Forest Regressor RF Ensemble (Bagging) Constructs a multitude of decision trees during training and outputs the mean prediction of the individual trees.
Extreme Gradient Boosting XGBoost Ensemble (Boosting) An optimized gradient boosting library that builds trees sequentially, each correcting errors of the previous one, with regularization.
Gradient Boosting Decision Tree GBDT Ensemble (Boosting) Builds an additive model in a forward stage-wise fashion, allowing optimization of arbitrary differentiable loss functions.

The dataset (89 samples) was split into 80% for training and 20% for independent testing. A 10-fold cross-validation combined with grid search was used on the training set for hyperparameter tuning to prevent overfitting and find the optimal model configuration. The optimal hyperparameters for the final models are listed in Table 3.

Table 3: Optimal Hyperparameters for the Machine Learning Models
Model Key Hyperparameter Optimal Value
Random Forest (RF) n_estimators 77
max_depth 11
XGBoost max_depth 4
learning_rate 0.28
reg_alpha 0.15
reg_lambda 0.9
GBDT n_estimators 99
learning_rate 0.09
subsample 0.5
SVR C 100
epsilon 0.05
MLPR hidden_layer_sizes (550, 350)
alpha 0.055
learning_rate_init 0.0025
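The tuning loop described above can be sketched without any ML framework as a grid search wrapped in k-fold cross-validation; here closed-form ridge regression stands in for the actual estimators, and the data, grid, and sample sizes are synthetic placeholders, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(71, 5))       # ~80% of 89 samples, 5 features
y = X @ np.array([0.3, -0.2, 0.5, 0.1, 0.4]) + 0.05 * rng.normal(size=71)

def ridge_fit_predict(Xtr, ytr, Xte, alpha):
    """Closed-form ridge regression as a stand-in estimator."""
    d = Xtr.shape[1]
    w = np.linalg.solve(Xtr.T @ Xtr + alpha * np.eye(d), Xtr.T @ ytr)
    return Xte @ w

def cv_rmse(X, y, alpha, k=10):
    """Mean RMSE over k cross-validation folds for one grid point."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)
    errs = []
    for fold in folds:
        tr = np.setdiff1d(idx, fold)
        pred = ridge_fit_predict(X[tr], y[tr], X[fold], alpha)
        errs.append(np.sqrt(np.mean((y[fold] - pred) ** 2)))
    return float(np.mean(errs))

# Grid search: pick the hyperparameter with the lowest CV error.
grid = [0.01, 0.1, 1.0, 10.0]
best_alpha = min(grid, key=lambda a: cv_rmse(X, y, a))
```

The winning hyperparameter is then refit on the full training set and evaluated once on the held-out 20%, which is the pattern the study follows with its five models.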

1.5 Model Evaluation Metrics

Model performance was assessed using three standard metrics calculated on the held-out test set:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$
where $y_i$ is the measured depth, $\hat{y}_i$ is the predicted depth, $\bar{y}$ is the mean of measured depths, and $n$ is the number of samples. $R^2$ (Coefficient of Determination) indicates the proportion of variance explained, with values closer to 1.0 being better. RMSE (Root Mean Square Error) and MAE (Mean Absolute Error) quantify the prediction error in the original units (meters), with lower values indicating higher accuracy.
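The three metrics translate directly into NumPy:

```python
import numpy as np

def r2(y, yhat):
    """Coefficient of determination: fraction of variance explained."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y, yhat):
    """Root mean square error, in the units of y (meters here)."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error, in the units of y."""
    return float(np.mean(np.abs(y - yhat)))
```

A perfect prediction gives R² = 1 and zero RMSE/MAE; a constant offset of 0.1 m in every prediction gives RMSE = MAE = 0.1 m while R² stays near 1, which is why all three metrics are reported together.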

2. Results and Analysis

2.1 Baseline Performance: Spectral Bands Only

The initial models trained solely on the five UAV drone spectral bands (B1-B5) established a performance baseline. As shown in Table 4, the ensemble tree-based methods (RF, XGBoost, GBDT) outperformed SVR and MLPR, with RF achieving the highest $R^2$ of 0.69. This underscores the inherent non-linearity of the depth-radiance relationship in turbid ponds, which tree-based models capture more effectively than the linear kernel SVR or the simpler MLPR configuration used.

Table 4: Baseline Water Depth Inversion Accuracy Using Only Spectral Bands
Model $R^2$ RMSE (m) MAE (m)
Random Forest (RF) 0.69 0.07 0.06
XGBoost 0.62 0.06 0.05
GBDT 0.63 0.10 0.08
MLPR 0.55 0.07 0.05
SVR 0.36 0.09 0.08

2.2 Impact of Color-Space Feature Fusion

Integrating features from color-space transformations consistently and significantly improved model accuracy across all algorithms. Table 5 presents the results when models were trained on the full set of five spectral bands plus the three channels from each color space.

Table 5: Inversion Accuracy with Full Spectral Bands and Different Color-Space Features
Input Features Model $R^2$ RMSE (m) MAE (m)
B1-B5 + HSV RF 0.79 0.04 0.03
XGBoost 0.82 0.05 0.03
GBDT 0.77 0.05 0.04
MLPR 0.72 0.06 0.04
SVR 0.78 0.05 0.04
B1-B5 + Lab RF 0.78 0.05 0.04
XGBoost 0.77 0.05 0.04
GBDT 0.74 0.05 0.04
MLPR 0.59 0.07 0.06
SVR 0.50 0.08 0.06
B1-B5 + YUV RF 0.73 0.05 0.04
XGBoost 0.76 0.06 0.04
GBDT 0.66 0.09 0.08
MLPR 0.61 0.07 0.06
SVR 0.48 0.08 0.07

The improvement was most pronounced for the SVR model when using HSV features, where its $R^2$ increased by 117% (from 0.36 to 0.78). The HSV color space consistently yielded the highest accuracy across models, particularly with XGBoost achieving an $R^2$ of 0.82. This suggests that the Hue (H) and Saturation (S) components, which separate color type and purity from brightness (V), provide a powerful, decorrelated feature set that is highly informative for water depth estimation under turbid conditions.

2.3 Optimized Models with SHAP-based Feature Selection

Applying SHAP analysis to select the top five most important features for each model type further refined performance. The selected features varied between color spaces and models, highlighting the complementary nature of the information they encode. Table 6 shows the performance of the optimized models.

Table 6: Performance of Optimized Models After SHAP Feature Selection
Model Optimal Feature Set (Example) $R^2$ RMSE (m) MAE (m)
Random Forest (RF) V, B4, b, B5, H 0.83 0.05 0.04
XGBoost V1, V, U, B4, B5 0.84 0.05 0.04
GBDT U, S, B5, a, V 0.85 0.05 0.04
MLPR Features from B1-B5+HSV 0.72 0.06 0.04
SVR Features from B1-B5+HSV 0.78 0.05 0.04

The GBDT model, trained on a selected set of features including U (from YUV), S (from HSV), a (from Lab), and spectral bands, achieved the highest overall accuracy with an $R^2$ of 0.85 and an RMSE of 0.05 m. This represents a 35% relative improvement over its baseline performance using only spectral bands. The ensemble methods (RF, XGBoost, GBDT) consistently outperformed SVR and MLPR after optimization, demonstrating their superior capacity to learn from the fused spectral and color-space feature set acquired by UAV drones.

2.4 Feature Importance and Contribution Analysis

SHAP analysis revealed distinct patterns in how different color-space features contribute to predictions. In models utilizing HSV features, the Hue (H) channel was consistently the most important, indicating its strong correlation with changes in water optical properties related to depth. For Lab-based models, the chromaticity channels (a, b) were more influential than lightness (L). In YUV-based models, the chrominance components (U, V1) dominated over the luma (Y). This universal trend—where color or chrominance information outweighs pure brightness or lightness—validates the core premise: separating spectral information from intensity through color-space conversion is key to robust depth retrieval from UAV drone imagery in complex water.

3. Discussion and Implications

This study establishes a significant advancement in remote sensing of shallow water depths by effectively integrating UAV drone technology with computational color science and machine learning. The persistent challenge in turbid aquatic environments like aquaculture ponds is the non-linear and compounded attenuation of light by water, dissolved matter, and suspended particles. Traditional multispectral indices or band ratios often fail under these conditions. The proposed framework addresses this by creating a richer, more discriminant feature space.

The superior performance of HSV-derived features, particularly Hue (H), can be attributed to its invariance to certain illumination changes and its direct relation to the dominant spectral signature reflected from the water column. In turbid water, the spectral shift towards longer wavelengths (green-red) as depth changes or sediment concentration varies is effectively captured by the H channel. Saturation (S) may help normalize for the “whitening” effect caused by high scattering from suspended solids. The decorrelation inherent in the Lab and YUV spaces similarly provides stable chromaticity features that are less sensitive to absolute brightness variations caused by sun glint or cloud shadows—common issues in UAV drone imagery.

The dominance of tree-based ensemble methods (GBDT, XGBoost, RF) highlights their suitability for this task. They naturally handle non-linear relationships, interactions between features (e.g., between a spectral band and a color-space channel), and are relatively robust to irrelevant features, especially after SHAP-based selection. The dramatic improvement in SVR performance with the right feature set (HSV) underscores that the choice of input features is sometimes more critical than the model itself; by providing linearly more separable features in a higher-dimensional space, the kernel trick in SVR becomes far more effective.

Operationally, the use of UAV drones is central to this methodology’s success. The high spatial resolution (<10 cm/pixel) ensures that small ponds and within-pond depth variations are accurately resolved. The low-altitude flight minimizes atmospheric interference, simplifying pre-processing. The on-demand deployment capability of UAV drones allows for monitoring synchronized with tidal cycles or management activities, providing data at the most relevant temporal scales for aquaculture.

While highly effective, this study has limitations. The models were trained on data from a specific geographic region and pond type. Generalization to markedly different water types (e.g., coral reefs, seagrass meadows) or under vastly different illumination conditions requires further validation. Future work will focus on testing the transferability of the approach using data from UAV drones equipped with hyperspectral sensors for even finer spectral feature extraction, and on developing a physical understanding of why specific color-space channels (like Hue) are so strongly linked to water depth in turbid environments.

4. Conclusion

This research demonstrates that fusing multispectral data from UAV drones with engineered features from multiple color-space transformations (HSV, Lab, YUV) significantly enhances the accuracy of machine learning models for water depth inversion in high-turbidity aquaculture ponds. The method successfully overcomes key limitations posed by complex in-water optical processes. Among the evaluated color spaces, HSV provided the most informative features, leading to the highest model performance gains. Among machine learning algorithms, the Gradient Boosting Decision Tree (GBDT) model achieved the optimal result, with a coefficient of determination ($R^2$) of 0.85 and a root mean square error (RMSE) of 0.05 meters. The systematic use of SHAP for feature importance analysis provided critical insights into model behavior and enabled the construction of efficient, high-performance predictive models.

The framework presented is not merely an incremental improvement but a paradigm shift towards leveraging the full potential of UAV drone remote sensing through advanced image processing and machine intelligence. It provides a robust, scalable, and practical solution for the precise and frequent monitoring of water depth, a fundamental parameter for the sustainable management of aquaculture ecosystems, pollution flux estimation, and the conservation of coastal water resources.
