Research on Multi-source Data Fusion Surveying and Mapping Modeling Technology Based on UAV Oblique Photography

Traditional aerial surveying is limited in flight altitude and camera perspective, which hinders high-resolution, multi-angle image acquisition over complex terrain. These constraints directly compromise the accuracy and efficiency of surveying and modeling workflows. Surveying UAVs overcome these barriers through flexible low-altitude flight and multi-sensor integration. This study proposes a Multi-source Attention Edge-constrained U-Net Convolutional Neural Network (MAEU-CNN) model that combines UAV oblique photography with multi-scale feature extraction and deep learning to improve geospatial data processing.

Methodology

Image Preprocessing

Surveying UAVs capture oblique imagery that requires noise reduction and enhancement. Key preprocessing steps include:

Grayscale Conversion:
$$Gray = \alpha_1 \times R + \alpha_2 \times G + \alpha_3 \times B$$
where $R$, $G$, $B$ denote the red, green, and blue channel intensities and $\alpha_1$, $\alpha_2$, $\alpha_3$ are the corresponding channel weights.
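The weighted sum above can be sketched in a few lines of NumPy. The source does not give values for the weights, so this sketch assumes the widely used ITU-R BT.601 luma coefficients (0.299, 0.587, 0.114); the function name `to_gray` is hypothetical.

```python
import numpy as np

# Assumed weights (ITU-R BT.601 luma); the source leaves alpha_1..alpha_3 unspecified.
BT601_WEIGHTS = (0.299, 0.587, 0.114)

def to_gray(rgb, weights=BT601_WEIGHTS):
    """Weighted sum of the R, G, B channels of an (H, W, 3) array."""
    rgb = np.asarray(rgb, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    return rgb @ w  # contracts the last (channel) axis, giving an (H, W) array

pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)  # a single pure-red pixel
gray = to_gray(pixel)
```

With these assumed weights a pure-red pixel maps to 255 × 0.299, and a white pixel stays at 255 because the weights sum to one.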

Binarization:
$$f(x) = \begin{cases} 0, & x < N \\ 1, & x \geq N \end{cases}$$
Threshold $N$ separates foreground/background pixels.
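As a minimal sketch, the piecewise rule above is a single comparison in NumPy. The function name `binarize` and the threshold value 128 are illustrative assumptions; the source does not say how $N$ is chosen.

```python
import numpy as np

def binarize(gray, threshold):
    """Return 1 where gray >= threshold and 0 elsewhere, matching f(x) above."""
    gray = np.asarray(gray)
    return (gray >= threshold).astype(np.uint8)

gray = np.array([[10, 127], [128, 250]])
mask = binarize(gray, threshold=128)  # threshold chosen only for illustration
```

Pixels equal to the threshold fall on the foreground side, as the `x \geq N` branch requires.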

Histogram Equalization:
$$P(r_k) = \frac{n_k}{N}, \quad s_k = \sum_{i=0}^{k} P(r_i)$$
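where $n_k$ is the number of pixels with gray level $r_k$ and $N$ here is the total pixel count (not the binarization threshold). Mapping each level $r_k$ to its cumulative probability $s_k$ expands the dynamic range and enhances contrast.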
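The two formulas can be sketched as a lookup table built from the normalized histogram. This assumes 8-bit input and rescales $s_k$ to the output range by multiplying by 255, a common convention the text does not state; `equalize` is a hypothetical name.

```python
import numpy as np

def equalize(gray, levels=256):
    """Histogram-equalize an 8-bit integer image with values in [0, levels)."""
    gray = np.asarray(gray)
    hist = np.bincount(gray.ravel(), minlength=levels)  # n_k for each gray level
    p = hist / gray.size                                 # P(r_k) = n_k / N
    cdf = np.cumsum(p)                                   # s_k, in [0, 1]
    # Assumption: scale s_k to [0, levels - 1]; the text stops at s_k itself.
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[gray]                                     # apply the mapping per pixel

img = np.array([[50, 50], [100, 200]], dtype=np.uint8)
out = equalize(img)
```

In this toy image the three occupied levels sit in a narrow band starting at 50; after equalization they are spread toward the top of the range, with the brightest level mapped to 255.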

Intensity Stretching:
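$$g(x,y) = \frac{f(x,y) - \text{min}}{\text{max} - \text{min}} \times 255$$
where min and max are the smallest and largest pixel values in the image. This maps each image onto the full 0-255 range, normalizing pixel values across datasets.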
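A minimal NumPy sketch of the stretching formula follows. The handling of a flat image (max equal to min), where the formula would divide by zero, is an assumed convention and not taken from the source; `stretch` is a hypothetical name.

```python
import numpy as np

def stretch(img):
    """Linearly map the [min, max] range of img onto [0, 255]."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Flat image: the formula is undefined here; returning zeros is an assumption.
        return np.zeros(img.shape, dtype=np.uint8)
    return np.round((img - lo) / (hi - lo) * 255).astype(np.uint8)

img = np.array([[50, 100], [150, 250]])
out = stretch(img)
```

The darkest pixel always lands on 0 and the brightest on 255, whatever the original range was.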

MAEU-CNN Architecture

The model integrates four innovations for surveying UAV data:

Multi-scale Feature Extraction: Parallel convolutions capture granular/textural patterns
Edge Constraint Module: Boundary regularization loss:
$$\mathcal{L}_{edge} = \sum \|\nabla Y_{pred} - \nabla Y_{true}\|^2$$
Convolutional Block Attention (CBAM):
Channel attention: $M_C(F) = \sigma(MLP(\text{AvgPool}(F)) + MLP(\text{MaxPool}(F)))$
Spatial attention: $M_S(F) = \sigma(f^{7\times7}([\text{AvgPool}(F); \text{MaxPool}(F)]))$
U-Net Fusion: Encoder-decoder structure with skip connections
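The two CBAM attention formulas above can be illustrated with a forward-pass sketch in plain NumPy. Several details are assumptions not given in the source: the feature map layout `(C, H, W)`, the reduction ratio `r` of the shared MLP, the random weights, and the channel-then-spatial ordering. A trainable version would live in a deep learning framework.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W1, W2):
    """M_C(F): a shared two-layer MLP applied to avg- and max-pooled descriptors."""
    def mlp(v):
        return W2 @ np.maximum(W1 @ v, 0.0)   # hidden layer with ReLU
    avg = F.mean(axis=(1, 2))                 # (C,) global average pooling
    mx = F.max(axis=(1, 2))                   # (C,) global max pooling
    return sigmoid(mlp(avg) + mlp(mx))        # (C,)

def spatial_attention(F, kernel):
    """M_S(F): a 7x7 filter over the channel-wise average and max maps."""
    pooled = np.stack([F.mean(axis=0), F.max(axis=0)])   # (2, H, W)
    pad = kernel.shape[-1] // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding keeps H x W
    windows = sliding_window_view(padded, kernel.shape[1:], axis=(1, 2))  # (2, H, W, 7, 7)
    return sigmoid(np.einsum("chwij,cij->hw", windows, kernel))          # (H, W)

rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4                     # r: assumed MLP reduction ratio
F = rng.standard_normal((C, H, W))
W1 = rng.standard_normal((C // r, C)) * 0.1   # random stand-ins for learned weights
W2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1

F_c = F * channel_attention(F, W1, W2)[:, None, None]  # reweight channels
F_out = F_c * spatial_attention(F_c, kernel)[None]     # then reweight locations
```

Both attention maps lie strictly between 0 and 1, so they rescale features and leave the tensor shape unchanged.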

Module	Function	Parameters
Encoder Blocks	Feature downsampling	5×5 conv, ReLU
Multi-scale Fusion	Feature aggregation	Kernel sizes: 3,5,7
CBAM Decoder	Attention-guided upsampling	Transposed conv
Edge Constraint	Boundary preservation	Laplacian filter
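To make the edge-constraint term concrete, here is a sketch of $\mathcal{L}_{edge}$ on single-channel maps. It approximates $\nabla$ with NumPy's finite differences, following the formula. The table lists a Laplacian filter for this module, so the exact operator used by the authors may differ, and `edge_loss` is a hypothetical name.

```python
import numpy as np

def edge_loss(y_pred, y_true):
    """Sum of squared differences between the spatial gradients of two (H, W) maps."""
    gy_p, gx_p = np.gradient(np.asarray(y_pred, dtype=np.float64))
    gy_t, gx_t = np.gradient(np.asarray(y_true, dtype=np.float64))
    return float(np.sum((gy_p - gy_t) ** 2 + (gx_p - gx_t) ** 2))

# Ground truth with a sharp vertical boundary between columns 2 and 3.
y_true = np.zeros((6, 6))
y_true[:, 3:] = 1.0

# A prediction with the same boundary smeared over several columns.
y_blur = np.clip((np.arange(6) - 1) / 3.0, 0, 1) * np.ones((6, 1))

sharp_penalty = edge_loss(y_true, y_true)  # identical edges
blur_penalty = edge_loss(y_blur, y_true)   # blurred edge is penalized
```

Because the term compares gradients only, adding a constant offset to the prediction leaves it unchanged, which is why such a term would normally accompany a pixel-wise segmentation loss and not replace it.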

Experimental Analysis

Testing used the WHU Aerial and SpaceNet datasets on an NVIDIA GeForce RTX 2080 Ti GPU. Performance metrics:

Table 1: Model performance comparison on 1600-image dataset
Metric	U-Net	SegNet	MAEU-CNN	Improvement
IoU	0.85	0.92	0.98	+15.3% vs U-Net
Ambiguity	0.21	0.18	0.12	-42.9% vs U-Net
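For reference, the IoU metric and the relative changes in the Improvement column can be reproduced with a short sketch. The toy masks are invented for illustration and are not drawn from the datasets; the function names are hypothetical, and treating two empty masks as a perfect match is an assumed convention.

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty; assumed convention
    return np.logical_and(pred, target).sum() / union

def relative_change(baseline, value):
    """Percentage change of value with respect to baseline."""
    return (value - baseline) / baseline * 100.0

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = iou(pred, target)  # 2 shared pixels out of 4 in the union

iou_gain = relative_change(0.85, 0.98)        # Table 1, IoU row: U-Net vs MAEU-CNN
ambiguity_drop = relative_change(0.21, 0.12)  # Table 1, Ambiguity row
```

Rounded to one decimal place, the two relative changes agree with the +15.3% and -42.9% figures in Table 1.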

Table 2: Inference time in milliseconds (percentages are reductions relative to U-Net)
Structure	U-Net	SegNet	MAEU-CNN
Residential	625	517	446 (-28.6%)
Industrial	590	482	412
Medical	568	451	376 (-33.8%)
Cultural	612	503	428

The surveying UAV-optimized MAEU-CNN demonstrates superior generalization across terrains. Attention mechanisms reduced feature ambiguity by prioritizing geomorphological elements, while edge constraints preserved critical infrastructure boundaries.

Conclusion

MAEU-CNN significantly advances surveying drone capabilities through:
1. Multi-scale fusion adapting to terrain complexity
2. Attention mechanisms reducing occlusion errors
3. Edge preservation maintaining structural fidelity
4. Computational efficiency supporting near-real-time processing (376-446 ms per image in Table 2)

Surveying UAV systems integrating this model achieve sub-decimeter accuracy in heterogeneous environments, with 15.3% higher IoU and 42.9% lower ambiguity than the U-Net baseline (Table 1). Future work will optimize model compression for edge-computing deployments on surveying UAVs.