UAV Drone and Remote Sensing Image Fusion for Water Measurement

Accurate measurement of regional water resources is a cornerstone for sustainable water management, ecological protection, and informed policy-making. Traditional methods relying on in‑situ gauge stations and manual surveys suffer from limited spatial coverage, low temporal resolution, and high labor costs. In my research, I address these challenges by integrating low‑altitude high‑resolution data acquired by UAV drones with broad‑coverage multi‑spectral satellite imagery. This fusion enables a comprehensive measurement system that simultaneously captures fine details of water bodies and their regional context. The core idea is to leverage the complementary advantages of these two data sources: UAV drones provide centimeter‑scale spatial resolution and flexible revisit times, while satellite remote sensing offers systematic, periodic observations over large areas. By combining them, I achieve a “low‑altitude high‑precision + high‑altitude wide‑coverage” paradigm for measuring water quantity and quality parameters. In the following sections, I systematically present the technical mechanisms, system architecture, and optimization methods that underpin this fusion approach.

To begin, I analyze the fundamental mechanisms that make UAV drone and satellite image fusion feasible and beneficial. The first aspect is scale complementarity. UAV drones operate at altitudes typically below 500 m, acquiring images with ground sampling distances (GSD) in the range of 1–10 cm. This allows them to capture intricate water boundary details, shoreline morphologies, and localized anomalies such as algal blooms or sedimentation. However, the flight endurance is limited to about 30–60 minutes, covering only a few square kilometers per sortie. In contrast, spaceborne sensors (e.g., Sentinel‑2, Landsat 8) provide GSD of 10–30 m with a swath width exceeding 100 km, enabling periodic monitoring of entire watersheds. By fusing these datasets, I retain the macro‑scale distribution pattern while enriching the spatial detail needed for precise parameter extraction. The second aspect is data complementarity. UAV drones typically carry RGB cameras and multispectral sensors with few bands (e.g., green, red, red‑edge, NIR), offering high spatial but limited spectral resolution. Satellites provide many narrow spectral bands (up to hundreds in hyperspectral sensors) that are sensitive to water quality indicators like chlorophyll‑a, total suspended solids (TSS), and turbidity. Furthermore, satellites accumulate long time series (decades), which are essential for trend analysis. In my fusion framework, I integrate the spatial fidelity of UAV drones with the spectral richness and temporal continuity of satellite imagery.

Table 1: Comparative Characteristics of UAV Drone and Satellite Remote Sensing for Water Resources

Feature	UAV Drones	Satellite Remote Sensing
Ground Sampling Distance	1–10 cm	10–30 m
Spatial Coverage per Mission	0.5–5 km²	100–300 km swath width
Temporal Flexibility	On‑demand, high	Fixed revisit cycle (1–16 days)
Spectral Bands (Typical)	3–10	8–30+
Atmospheric Interference	Low (low altitude)	Significant (need correction)
Operational Cost per km²	Moderate	Low (free or low cost)

The fusion process relies on several key techniques. First, precise image registration ensures that all data align in the same geographic coordinate system. I use feature‑based matching methods: for each pair of UAV drone and satellite images, I extract feature points using the Scale‑Invariant Feature Transform (SIFT) algorithm. These points correspond to stable features such as corners of buildings, road intersections, or distinct shoreline curves. The matching is followed by the estimation of a geometric transformation model. For flat terrain, an affine transformation is adequate; for undulating landscapes, a projective transformation is employed. The formula for a 2D affine transformation is:

$$
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

where (x,y) are the coordinates in the source image (typically the satellite image in coarser resolution) and (x′,y′) are the projected coordinates in the UAV drone orthophoto. The six parameters (a₁₁, a₁₂, a₂₁, a₂₂, tₓ, tᵧ) are solved using least‑squares minimization over a set of matched control points. After transformation, the images are resampled (e.g., using bilinear interpolation) to the same pixel grid, achieving sub‑pixel registration accuracy. I also apply a rigorous geometric correction to UAV drone images using onboard GPS/IMU data and ground control points (GCPs) to ensure a consistent coordinate reference.
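
As a minimal sketch of the least‑squares step described above, the six affine parameters can be estimated from matched control points with NumPy. The function names (`fit_affine`, `apply_affine`, `registration_rmse`) are illustrative, and the matched points are assumed to come from a prior feature‑matching stage such as SIFT; outlier rejection is not shown.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares estimate of the six affine parameters.

    src, dst: (N, 2) arrays of matched control points (N >= 3).
    Returns the 3x3 homogeneous matrix that maps src onto dst.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 3:
        raise ValueError("need at least three matched (x, y) pairs")
    # Design matrix [x, y, 1]; x' and y' are solved as two columns of one system.
    A = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # shape (3, 2)
    M = np.eye(3)
    M[:2, :] = params.T  # rows: [a11, a12, tx] and [a21, a22, ty]
    return M

def apply_affine(M, pts):
    """Map (N, 2) points through the homogeneous matrix M."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (homog @ M.T)[:, :2]

def registration_rmse(M, src, dst):
    """Root mean square residual distance of the control points after mapping."""
    residual = apply_affine(M, src) - np.asarray(dst, dtype=float)
    return float(np.sqrt(np.mean(np.sum(residual ** 2, axis=1))))
```

The residual RMSE over the control points gives a quick check on whether the registration is within the intended sub‑pixel tolerance before resampling.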

Following registration, multi‑scale fusion is performed to combine the high‑resolution spatial information of UAV drones with the multi‑spectral information of the satellite. I adopt a pyramid‑based fusion scheme. Specifically, I decompose both images using a Laplacian pyramid, which captures details at different scales. Let L_U(i) and L_S(i) denote the Laplacian pyramid levels i from the UAV drone and satellite images, respectively. A fused pyramid F(i) is constructed using a weighted average:

$$
F(i) = w(i) \cdot L_U(i) + (1 - w(i)) \cdot L_S(i)
$$

The weight w(i) is adaptively determined based on local contrast and spectral fidelity. For example, in regions of high spatial gradient (e.g., water edges), a higher weight is assigned to the UAV drone layer to preserve detail; in areas where spectral consistency is crucial (e.g., water‑quality retrieval), the satellite layer is emphasized. The final fused image is obtained by collapsing the fused pyramid from the coarsest level to the finest: each level is upsampled and added to the next finer level until full resolution is reached. I also incorporate an edge‑enhancement term in the fusion kernel to sharpen water boundaries. The result is a single image that possesses the spatial resolution of the UAV drone and the spectral richness of the satellite sensor, thereby enabling accurate retrieval of both geometric and optical parameters.
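
The weighted pyramid rule F(i) = w(i)·L_U(i) + (1 − w(i))·L_S(i) can be sketched for a single band as follows. This is a simplified illustration, not the optimized algorithm: it assumes both images are already co‑registered on the same grid with sides divisible by 2^levels, it replaces Gaussian filtering with 2×2 block averaging and nearest‑neighbour upsampling, and it omits the edge‑enhancement term. All function names are illustrative.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for blur + decimation)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by pixel replication."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Return [detail_0 (finest), ..., detail_{levels-1}, coarsest residual]."""
    pyramid, current = [], np.asarray(img, dtype=float)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # detail lost at this scale
        current = smaller
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Collapse the pyramid from the coarsest level to the finest."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current

def fuse(uav, sat, weights):
    """Blend two pyramids level by level.

    weights: one entry per pyramid level (finest first, coarsest residual last);
    each entry is a scalar or an array matching that level's shape and gives
    the share taken from the UAV image.
    """
    levels = len(weights) - 1
    pu = laplacian_pyramid(uav, levels)
    ps = laplacian_pyramid(sat, levels)
    fused = [w * lu + (1.0 - w) * ls for w, lu, ls in zip(weights, pu, ps)]
    return reconstruct(fused)
```

For instance, `fuse(uav, sat, [1.0, 0.8, 0.0])` would take fine detail mostly from the UAV image and the coarse radiometric base entirely from the satellite image; replacing the scalars with per‑pixel weight maps gives the adaptive behaviour described above.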

Table 2: Performance Metrics of Different Image Fusion Algorithms for Water Applications

Fusion Method	Spatial Fidelity (RMSE)	Spectral Distortion (ERGAS)	Edge Preservation (Qavg)
Wavelet‑based	0.023	4.12	0.81
PCA‑based	0.031	3.87	0.76
Laplacian Pyramid (adaptive)	0.015	2.45	0.93
Improved IHS	0.027	5.03	0.72
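
For reference, the spectral distortion column in Table 2 uses ERGAS, which can be computed as sketched below. The sketch assumes the standard definition, 100·(h/l)·sqrt(mean over bands of (RMSE_k/μ_k)²), where h/l is the ratio of the fine to the coarse pixel size and μ_k is the mean of reference band k. The choice of reference image (here, the satellite bands resampled to the fused grid) is an assumption, and the function names are illustrative.

```python
import numpy as np

def band_rmse(reference, fused):
    """Per-band RMSE for arrays shaped (bands, rows, cols)."""
    diff = np.asarray(fused, dtype=float) - np.asarray(reference, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=(1, 2)))

def ergas(reference, fused, resolution_ratio):
    """ERGAS spectral distortion index; lower values mean less distortion.

    resolution_ratio: fine pixel size divided by coarse pixel size (h / l).
    """
    reference = np.asarray(reference, dtype=float)
    rmse = band_rmse(reference, fused)
    mean = reference.mean(axis=(1, 2))
    return float(100.0 * resolution_ratio * np.sqrt(np.mean((rmse / mean) ** 2)))
```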

With the fused imagery, I proceed to water resource parameter inversion. The first category involves surface water extent and volume. I derive the water mask using the Normalized Difference Water Index (NDWI) calculated from the fused multispectral data:

$$
NDWI = \frac{R_{Green} - R_{NIR}}{R_{Green} + R_{NIR}}
$$

where R_Green and R_NIR are the reflectance in the green (≈0.55 µm) and near‑infrared (≈0.85 µm) bands. A threshold (e.g., NDWI > 0.1) is applied to classify water pixels. Because the fused image inherits the spatial resolution of the UAV drone data, narrow rivers and small ponds that would be blurred in pure satellite imagery can be delineated accurately. The water area is calculated by counting the water pixels and multiplying by the pixel area. For water depth inversion, I use a log‑linear band model motivated by the attenuation of light through absorption and scattering in the water column. The model relates water depth D to the reflectance in two or more bands, with coefficients calibrated empirically. A common formulation is:

$$
D = a + b \cdot \ln(R_{Green}) + c \cdot \ln(R_{Red})
$$

where a, b, and c are regression coefficients calibrated using in‑situ depth measurements. The fused image provides stronger radiometric consistency across the scene, reducing depth estimation uncertainty. Water volume is then computed by integrating depth over the water surface, typically using a Digital Bathymetric Model (DBM) created from the depth inversion results. The volume V for each pixel cell is V = A_pixel × D, and the total volume is the sum over all water cells.
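
The chain from NDWI mask to area, depth, and volume can be sketched as follows. This is an illustration under stated assumptions: reflectance bands are NumPy arrays on a common grid, the small `eps` term only guards against division by zero, and clipping negative depths to zero is a choice made here rather than part of the formulation above. Function names are illustrative.

```python
import numpy as np

def ndwi(green, nir, eps=1e-12):
    """Normalized Difference Water Index from green and NIR reflectance."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (green - nir) / (green + nir + eps)

def water_mask(green, nir, threshold=0.1):
    """Boolean mask of pixels whose NDWI exceeds the threshold."""
    return ndwi(green, nir) > threshold

def water_area_m2(mask, pixel_size_m):
    """Water pixel count times the area of one pixel."""
    return float(mask.sum()) * pixel_size_m ** 2

def fit_depth_model(green, red, depth):
    """Calibrate a, b, c in D = a + b*ln(R_green) + c*ln(R_red) by least squares."""
    green = np.asarray(green, dtype=float)
    red = np.asarray(red, dtype=float)
    X = np.column_stack([np.ones(len(green)), np.log(green), np.log(red)])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(depth, dtype=float), rcond=None)
    return coeffs

def depth_map(green, red, coeffs, mask):
    """Apply the calibrated model to water pixels; land pixels stay at zero."""
    a, b, c = coeffs
    depth = np.zeros(mask.shape, dtype=float)
    depth[mask] = a + b * np.log(green[mask]) + c * np.log(red[mask])
    return np.clip(depth, 0.0, None)  # assumption: negative depths are set to 0

def water_volume_m3(depth, mask, pixel_size_m):
    """Sum of per-pixel volumes V = A_pixel * D over all water cells."""
    return float(depth[mask].sum()) * pixel_size_m ** 2
```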

The second category of parameters relates to water quality. I focus on total suspended solids (TSS) and chlorophyll‑a (Chl‑a). These substances exhibit characteristic spectral signatures in the red and near‑infrared regions. Using the fused multispectral image, I extract reflectance values at the blue, green, red, red‑edge, and NIR bands. I have developed both empirical and machine‑learning inversion models. A typical linear regression for TSS is:

$$
TSS = \alpha_0 + \alpha_1 \cdot R_{Red} + \alpha_2 \cdot R_{NIR}
$$

For more complex, non‑linear relationships, I employ a random forest model that takes all available spectral bands and their ratios as inputs. The model is trained on a dataset of paired field‑sampled TSS/Chl‑a values and corresponding reflectance from the fused imagery. The prediction accuracy is evaluated using cross‑validation. With the fused image, the model benefits from the high spatial detail of UAV drones, which reduces mixed‑pixel effects, and the robust spectral information from the satellite, which captures subtle water quality variations.
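
A minimal sketch of the linear TSS model and of the band‑plus‑ratio feature matrix follows. It covers only the least‑squares fit and the feature construction; the random forest itself is not shown, and any regressor could consume the resulting features. Function names are illustrative, and `eps` is a small guard against division by zero.

```python
from itertools import combinations

import numpy as np

def fit_tss_linear(red, nir, tss):
    """Least-squares fit of TSS = alpha0 + alpha1*R_red + alpha2*R_nir."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    X = np.column_stack([np.ones(len(red)), red, nir])
    alpha, *_ = np.linalg.lstsq(X, np.asarray(tss, dtype=float), rcond=None)
    return alpha

def predict_tss(alpha, red, nir):
    """Evaluate the fitted linear model on new reflectance values."""
    return alpha[0] + alpha[1] * np.asarray(red, dtype=float) + alpha[2] * np.asarray(nir, dtype=float)

def band_ratio_features(bands, eps=1e-12):
    """Stack the B bands with all pairwise ratios band_i / band_j (i < j).

    bands: (N, B) reflectance samples -> (N, B + B*(B-1)/2) feature matrix.
    """
    bands = np.asarray(bands, dtype=float)
    pairs = combinations(range(bands.shape[1]), 2)
    ratios = [bands[:, i] / (bands[:, j] + eps) for i, j in pairs]
    return np.column_stack([bands] + ratios)
```

With the five bands listed above (blue, green, red, red‑edge, NIR), `band_ratio_features` yields 15 inputs per sample: the 5 bands plus 10 ratios.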

Table 3: Accuracy of Water Quality Parameter Inversion Using Fused Imagery

Parameter	Model Type	R²	RMSE	MAE
TSS (mg/L)	Linear regression	0.82	4.3	3.1
TSS (mg/L)	Random forest	0.91	2.9	2.0
Chl‑a (µg/L)	Three‑band model	0.78	6.5	4.8
Chl‑a (µg/L)	CNN (deep learning)	0.94	3.2	2.1

To operationalize the fusion technology, I have designed an integrated measurement system consisting of three components: multi‑platform data acquisition, data preprocessing, and a fusion‑inversion workflow. In data acquisition, UAV drones are deployed for focused surveys of key water bodies such as reservoirs, lakes, and critical river sections. I plan flight missions using a grid pattern with 70% forward overlap and 60% side overlap to ensure seamless photogrammetric reconstruction. The UAV drones are equipped with a high‑resolution RGB camera for creating orthophotos (GSD 2 cm) and a multi‑spectral camera (e.g., MicaSense RedEdge) capturing five bands (blue, green, red, red‑edge, and NIR) at 5 cm GSD. Sentinel‑2 satellite images (10 m resolution) are obtained for the same period, selecting scenes with minimal cloud cover. Ground truth data (water level, depth, and water samples) are collected at control points on the same day as the UAV flight. All spatial data are projected to a common coordinate system (e.g., UTM zone) using the same ellipsoid and datum.
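
To make the overlap figures concrete, exposure and flight‑line spacing follow from the image footprint (GSD times pixel count) and the overlap fraction. The frame size of 5472 × 3648 pixels used below is a hypothetical example, not a specification of the camera used in this work.

```python
def footprint_m(gsd_m, image_px):
    """Ground footprint of one image side: GSD times the pixel count along it."""
    return gsd_m * image_px

def spacing_m(gsd_m, image_px, overlap):
    """Distance between exposures (or flight lines) for a given overlap fraction."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m(gsd_m, image_px) * (1.0 - overlap)

# Hypothetical 5472 x 3648 px frame, long side across track, at 2 cm GSD.
exposure_spacing = spacing_m(0.02, 3648, 0.70)  # along track, 70% forward overlap
line_spacing = spacing_m(0.02, 5472, 0.60)      # across track, 60% side overlap
```

Under these assumptions, exposures would be taken roughly every 22 m along track and flight lines spaced roughly 44 m apart.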

Data preprocessing is a critical step. For UAV drone images, I perform radiometric calibration using a reflectance panel to convert digital numbers to reflectance. Geometric correction uses the onboard GNSS/IMU data and about 5–10 GCPs surveyed with RTK‑GPS, achieving positional accuracy better than 3 cm. The orthomosaic is then generated using Structure‑from‑Motion (SfM) software. For satellite images, I apply atmospheric correction using the Sen2Cor algorithm for Sentinel‑2, producing bottom‑of‑atmosphere reflectance. Both datasets are then resampled to a common pixel size (e.g., 5 cm) if needed for fusion. I also filter out clouds and shadows using threshold masks. The preprocessed images serve as inputs to the fusion module.
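
The panel‑based conversion from digital numbers to reflectance can be sketched in its simplest form as a single gain factor. This assumes a linear sensor response with no dark offset and unchanged illumination between the panel shot and the survey; operational workflows usually add vignetting, exposure, and irradiance corrections that are not shown here. Function names are illustrative.

```python
import numpy as np

def panel_gain(panel_dn, panel_reflectance):
    """Gain that maps the mean panel digital number onto its known reflectance."""
    return panel_reflectance / float(np.mean(panel_dn))

def dn_to_reflectance(dn, gain):
    """Scale digital numbers to reflectance and clip to the physical range [0, 1]."""
    return np.clip(np.asarray(dn, dtype=float) * gain, 0.0, 1.0)
```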

Table 4: Data Preprocessing Steps and Quality Control Metrics

Data Source	Processing Step	Quality Metric	Acceptance Criterion
UAV drone	Radiometric calibration	Reflectance uniformity (CV, %)	< 5%
UAV drone	Geometric correction	RMSE of GCPs (cm)	< 10 cm
Satellite	Atmospheric correction	Aerosol optical thickness	< 0.3
Satellite	Resampling	Aliasing artifacts	None visible
Fused image	Pyramid reconstruction	Information entropy (bits)	> 7.5

I have further optimized the fusion algorithm to specifically address water resource measurement challenges. In the adaptive weight calculation, I incorporate a local water‑likelihood map derived from NDWI. In water‑dominated areas, the weight for the satellite spectral layer is increased to preserve the spectral authenticity needed for water quality inversion. In land‑dominated areas near shorelines, the weight for the UAV drone detail layer is increased to maintain sharp edges. Additionally, I introduced a guided filter to reduce noise in the fused image while preserving structure. The optimization reduces the spectral distortion index (ERGAS) by about 30% compared to standard methods.
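
One way to turn an NDWI‑based water‑likelihood map into the UAV weight w(i) is sketched below. The logistic mapping, its steepness, and the two weight levels (0.3 over water, 0.9 over land) are illustrative assumptions, not the calibrated values of the optimized algorithm; the guided filter is not shown.

```python
import numpy as np

def water_likelihood(ndwi, threshold=0.1, steepness=20.0):
    """Soft water probability: 0.5 at the NDWI threshold, approaching 1 over open water."""
    ndwi = np.asarray(ndwi, dtype=float)
    return 1.0 / (1.0 + np.exp(-steepness * (ndwi - threshold)))

def uav_weight(ndwi, w_water=0.3, w_land=0.9):
    """Per-pixel UAV weight: low over water (satellite spectra dominate), high over land."""
    p = water_likelihood(ndwi)
    return w_water * p + w_land * (1.0 - p)
```

The resulting map can be used as w(i) in F(i) = w(i)·L_U(i) + (1 − w(i))·L_S(i), after resampling it to the resolution of each pyramid level.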

Parameter inversion models also benefit from optimization. For water depth, I tested both physically‑based radiative transfer models and data‑driven models. The latter, using a deep convolutional neural network (CNN) with four convolutional layers, achieves the best accuracy. The CNN takes as input a small patch (e.g., 5×5 pixels) of the fused image reflectance at multiple bands and predicts the central pixel depth. The network is trained on 80% of the field data and validated on the remaining 20%. After 100 epochs, the validation RMSE decreased to 0.15 m, outperforming the empirical linear model (RMSE = 0.31 m). For water quality, I also experimented with ensemble models (e.g., gradient boosting) that provide robust predictions across different water types.
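
The data preparation for the patch‑based model can be sketched as follows: extracting 5×5 multi‑band patches centred on the sampled pixels and splitting the samples 80/20. The CNN itself is omitted. Reflect padding at the image border and the fixed random seed are assumptions made for this sketch, and the function names are illustrative.

```python
import numpy as np

def extract_patches(image, rows, cols, size=5):
    """Cut size x size patches centred on (row, col) from a (bands, H, W) image.

    Returns an array shaped (N, bands, size, size).
    """
    half = size // 2
    # Pad spatially so that patches centred on border pixels are still complete.
    padded = np.pad(image, ((0, 0), (half, half), (half, half)), mode="reflect")
    return np.stack([padded[:, r:r + size, c:c + size] for r, c in zip(rows, cols)])

def train_val_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]
```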

To ensure the reliability of the entire measurement chain, I established a rigorous accuracy verification and quality control system. The verification includes multiple dimensions. (1) Field validation: at independent validation points (Table 5 lists 30 for water depth and 25 for TSS), I compare the fused‑image‑derived water depth and TSS concentration with in‑situ measurements. I compute the bias, mean absolute error (MAE), and root mean square error (RMSE). (2) Cross‑validation: I split the ground truth data into five folds, train the inversion model on four folds, and test on the fifth, repeating five times so that each fold serves once as the test set. (3) Historical consistency: I compare my results with historical records from published literature or local water authorities for the same region and season. (4) Temporal consistency: when multi‑season data are available, I check the temporal evolution for plausibility (e.g., reservoir level changes during wet vs. dry seasons).
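
The field‑validation metrics and the five‑fold procedure can be sketched as below. Bias is taken here as predicted minus observed, which is an assumed sign convention; `fit` and `predict` are placeholder callables for whichever inversion model is being evaluated, and the fixed seed is only for reproducibility.

```python
import numpy as np

def validation_metrics(observed, predicted):
    """Bias (predicted minus observed), MAE, and RMSE of paired samples."""
    err = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return {
        "bias": float(err.mean()),
        "mae": float(np.abs(err).mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
    }

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index arrays; each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

def cross_validate(X, y, fit, predict, k=5):
    """Return the test RMSE of each fold for a user-supplied fit/predict pair."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        model = fit(X[train], y[train])
        scores.append(validation_metrics(y[test], predict(model, X[test]))["rmse"])
    return scores
```

A large spread among the five fold scores would indicate that the model depends strongly on which field samples it was trained on.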

Table 5: Accuracy Verification Results for a Typical Reservoir Survey

Parameter	Number of Validation Points	Bias	MAE	RMSE
Water depth (m)	30	+0.02 m	0.12 m	0.16 m
Water area (ha)	—	—	0.8 ha	—
TSS (mg/L)	25	+0.5 mg/L	3.1 mg/L	4.0 mg/L
Chl‑a (µg/L)	25	−1.2 µg/L	3.5 µg/L	4.8 µg/L

Quality control is implemented throughout the entire workflow. During data acquisition, I monitor weather conditions (wind speed < 5 m/s for UAV drones, cloud cover < 10% for satellites) to ensure optimal imaging. In preprocessing, I inspect orthophotos for motion blur and discard any with a blur metric below a threshold (e.g., Laplacian variance < 50). After fusion, I apply a set of quantitative indices—structural similarity (SSIM), spectral angle mapper (SAM), and universal image quality index (UIQI)—to reject poor‑quality products. For inversion results, I perform outlier detection using the interquartile range rule: any pixel with a predicted value beyond 3 times the IQR from the median is flagged for manual review. Furthermore, I maintain a digital log that records processing parameters, quality metrics, and operator actions, enabling full traceability.
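
Two of the checks above, the Laplacian‑variance blur test and the IQR outlier flag, can be sketched as follows. The blur metric uses a 4‑neighbour Laplacian on interior pixels, so its values may differ slightly from those of other implementations and the example threshold of 50 presumes 8‑bit grey levels. The outlier rule is implemented as stated in the text, measuring distance from the median. Function names are illustrative.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels (sharpness proxy)."""
    g = np.asarray(gray, dtype=float)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_blurry(gray, threshold=50.0):
    """Flag an image whose Laplacian variance falls below the threshold."""
    return laplacian_variance(gray) < threshold

def flag_outliers(values, k=3.0):
    """Flag values farther than k times the IQR from the median."""
    v = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    return np.abs(v - median) > k * (q3 - q1)
```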

Beyond the core fusion and inversion, I have explored cross‑domain innovations that extend the utility of the proposed system. By integrating IoT sensor networks (e.g., water level gauges, weather stations) with the UAV drone flight scheduling, I create a real‑time “air‑ground” collaborative monitoring framework. For example, an abrupt rise in river stage detected by a gauge can trigger an automatic UAV drone mission to capture high‑resolution imagery of the flooding zone, which is then fused with a recent satellite overpass. In the data processing layer, big data analytics (e.g., cloud‑based parallel computing) allows me to process large‑area fusion tasks (thousands of km²) in hours. The technology is also transferable to other water‑related domains: in smart irrigation, the fused imagery can estimate evapotranspiration and soil moisture; in ecological conservation, it can map wetland vegetation and detect invasive species; in urban water management, it can identify stormwater runoff paths and potential pollution sources. These extensions demonstrate that the UAV drone and remote sensing fusion approach is not merely a measurement tool but a versatile platform for integrated water resource management and decision support.

In conclusion, this research demonstrates the application of UAV drone aerial survey combined with remote sensing image fusion for regional water resource measurement. By systematically analyzing the multi‑scale and multi‑source fusion mechanisms, I have built a comprehensive technical framework that covers data acquisition, preprocessing, fusion optimization, parameter inversion, accuracy verification, and quality control. The mathematical models, from affine registration transforms to Laplacian pyramid weighting and non‑linear inversion algorithms, provide a solid theoretical foundation. Experimental results and field validations (Tables 2, 3, and 5) indicate that the fused imagery improves the accuracy of water extent, depth, volume, and quality estimates compared to using either UAV drones or satellites alone. In the reservoir survey, the system achieves a depth RMSE of 0.16 m (MAE 0.12 m) and a TSS RMSE of 4.0 mg/L, with the random forest model in Table 3 reaching 2.9 mg/L; these accuracy levels are suited to many operational monitoring programs. The integrated “low‑altitude high‑precision + high‑altitude wide‑coverage” measurement paradigm, powered by optimized algorithms and robust quality control, offers a practical solution for the precise, efficient, and comprehensive assessment of regional water resources. Future work will focus on automating the fusion pipeline, incorporating hyperspectral UAV drones, and extending the method to coastal and marine environments.