Novel Auto-Focusing Algorithm for UAV Aerial Photography

1. Introduction
The adoption of UAV aerial photography has revolutionized fields such as geographic mapping, battlefield reconnaissance, and cinematography due to its broad perspective, high mobility, efficiency, and low cost. However, achieving precise auto-focusing (AF) in low-altitude environments remains challenging. Complex lighting conditions and airflow disturbances degrade AF stability and sensitivity, hindering accurate ground target detection. Traditional passive focusing methods, like Depth-from-Focus (DFF), leverage image sharpness evaluation functions but struggle with multidirectional edge gradients in UAV aerial photography. This study addresses these limitations by proposing a novel Threshold-based Tenengrad-Roberts (TR) focusing evaluation function.

2. Limitations of Traditional Spatial Domain Focus Functions
Spatial domain focus functions evaluate sharpness using gradient variations. Clear images exhibit steeper edge gradients than blurred ones. An ideal focus evaluation curve (Figure 1) must satisfy:

Unimodality: A single global peak at the optimal focus position.
High Sensitivity: Steep gradient near the peak for precise focus localization.
Robustness: Noise immunity and computational efficiency.

Common functions and their limitations in UAV aerial photography are summarized below:

Table 1: Traditional Focus Evaluation Functions

Function	Gradient Calculation	Limitations in UAV Context
SMD	Absolute differences in horizontal/vertical neighbors.	Low clarity ratio (R), multi-peak curves.
Roberts	Diagonal gradients (±45°).	Ignores horizontal/vertical edges.
Sobel	Weighted horizontal/vertical gradients.	Low sensitivity (MSE).
Brenner	Horizontal gradients (0°).	Limited directionality.
SML	Laplacian-based second-order differences.	Prone to local peaks.
Tenengrad	Thresholded Sobel gradients.	Insufficient multidirectional sensitivity.

Mathematically:

Roberts: $G_x = |f(x+1,y+1) - f(x,y)|$, $G_y = |f(x+1,y) - f(x,y+1)|$
Tenengrad: $F_{\mathrm{Ten}} = \sum (G_x^2 + G_y^2)$, where $G_x, G_y$ are Sobel gradients.
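The two definitions above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the first array axis is taken as x, the Roberts value is aggregated as a plain sum of $G_x + G_y$ (the text does not state the aggregation), and the optional Tenengrad threshold is applied to the per-pixel energy $G_x^2 + G_y^2$ as an assumption.

```python
import numpy as np

def roberts_focus(img):
    """Roberts cross focus value; summing Gx + Gy is an assumed aggregation."""
    f = np.asarray(img, dtype=np.float64)
    gx = np.abs(f[1:, 1:] - f[:-1, :-1])   # |f(x+1, y+1) - f(x, y)|
    gy = np.abs(f[1:, :-1] - f[:-1, 1:])   # |f(x+1, y) - f(x, y+1)|
    return float(np.sum(gx + gy))

def tenengrad_focus(img, threshold=0.0):
    """Sum of Gx^2 + Gy^2 from 3x3 Sobel responses on interior pixels.

    The threshold is compared with the per-pixel energy (an assumption).
    """
    f = np.asarray(img, dtype=np.float64)
    # Sobel response along the first axis (x).
    gx = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]) \
       - (f[:-2, :-2] + 2 * f[:-2, 1:-1] + f[:-2, 2:])
    # Sobel response along the second axis (y).
    gy = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]) \
       - (f[:-2, :-2] + 2 * f[1:-1, :-2] + f[2:, :-2])
    energy = gx ** 2 + gy ** 2
    return float(np.sum(energy[energy > threshold]))
```

On a synthetic step edge and a linear ramp spanning the same intensity range, the squared Tenengrad measure separates the two, whereas a plain sum of absolute differences does not, which is one reason squared or thresholded measures are preferred for focusing.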

3. TR Focus Evaluation Function
3.1 Design Principles
The TR function synergizes:

Tenengrad’s noise robustness via thresholding.
Roberts’ multidirectional sensitivity by extending gradient detection to ±45° and ±135°:

$$
\begin{aligned}
G_1(x,y) &= f(x+1,y+1) - f(x,y) \\
G_2(x,y) &= f(x-1,y-1) - f(x,y) \\
G_3(x,y) &= f(x+1,y) - f(x,y+1) \\
G_4(x,y) &= f(x-1,y) - f(x,y-1)
\end{aligned}
$$
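The four differences can be evaluated for every interior pixel with array slicing. The sketch below assumes the first array axis is x and the second is y; the function name is hypothetical, and border pixels are simply skipped because the text does not say how they are handled.

```python
import numpy as np

def directional_gradients(img):
    """Return G1..G4 for all interior pixels of a 2-D grayscale image."""
    f = np.asarray(img, dtype=np.float64)
    c = f[1:-1, 1:-1]                      # f(x, y)
    g1 = f[2:, 2:] - c                     # f(x+1, y+1) - f(x, y)
    g2 = f[:-2, :-2] - c                   # f(x-1, y-1) - f(x, y)
    g3 = f[2:, 1:-1] - f[1:-1, 2:]         # f(x+1, y) - f(x, y+1)
    g4 = f[:-2, 1:-1] - f[1:-1, :-2]       # f(x-1, y) - f(x, y-1)
    return g1, g2, g3, g4
```

Each returned array is two rows and two columns smaller than the input.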

3.2 Adaptive Thresholding
Edge pixels (2–5% of total pixels) dominate gradient information. TR incorporates an adaptive threshold $T$ derived from Otsu’s method:

$$T = \frac{1}{MN} \sum \left[ f(x,y) - T_{\mathrm{Otsu}} \right]^2$$

where $T_{\mathrm{Otsu}}$ is Otsu’s segmentation threshold, and $MN$ is the image size ($M \times N$ pixels).
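A possible way to compute this threshold is sketched below. It assumes an 8-bit grayscale image (values 0 to 255) and uses a plain histogram-based Otsu search; both function names are hypothetical, and a library routine for Otsu's method could be substituted.

```python
import numpy as np

def otsu_threshold(img):
    """Otsu's threshold for an 8-bit image via between-class variance."""
    f = np.clip(np.asarray(img), 0, 255).astype(np.int64)
    hist = np.bincount(f.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()                  # gray-level probabilities
    levels = np.arange(256)
    w0 = np.cumsum(p)                      # weight of class "<= t"
    mu = np.cumsum(p * levels)             # cumulative mean up to t
    w1 = 1.0 - w0                          # weight of class "> t"
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * w0 - mu) ** 2 / (w0 * w1)
    # Levels where one class is empty give 0/0; treat them as zero variance.
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b))

def adaptive_threshold(img):
    """T = (1/MN) * sum over pixels of (f(x, y) - T_Otsu)^2."""
    f = np.asarray(img, dtype=np.float64)
    return float(np.mean((f - otsu_threshold(img)) ** 2))
```

Because $T$ is a mean squared deviation from the Otsu level, it grows with image contrast, so higher-contrast scenes get a stricter threshold.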

3.3 Final TR Formulation

$$S(x,y) = G_1^2 + G_2^2 + G_3^2 + G_4^2$$

$$F_{TR} = \sum \left[ S(x,y) \right]^2 \quad \text{for } S(x,y) > T$$

This ensures:

Multidirectional gradient detection (±45°, ±135°).
Noise suppression via adaptive thresholding.
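Putting the pieces together, the formulation in Section 3.3 can be sketched as a single function. This is an illustrative reading of the equations rather than the authors' code: the name `tr_focus` is hypothetical, the threshold $T$ is passed in as a parameter (for example, computed as in Section 3.2), the first array axis is taken as x, and border pixels are skipped.

```python
import numpy as np

def tr_focus(img, threshold):
    """F_TR = sum of S(x, y)^2 over interior pixels where S(x, y) > threshold."""
    f = np.asarray(img, dtype=np.float64)
    c = f[1:-1, 1:-1]                      # f(x, y)
    g1 = f[2:, 2:] - c                     # f(x+1, y+1) - f(x, y)
    g2 = f[:-2, :-2] - c                   # f(x-1, y-1) - f(x, y)
    g3 = f[2:, 1:-1] - f[1:-1, 2:]         # f(x+1, y) - f(x, y+1)
    g4 = f[:-2, 1:-1] - f[1:-1, :-2]       # f(x-1, y) - f(x, y-1)
    s = g1 ** 2 + g2 ** 2 + g3 ** 2 + g4 ** 2
    return float(np.sum(s[s > threshold] ** 2))
```

In a focus sweep, `tr_focus` would be evaluated on each frame and the lens position with the largest value taken as the in-focus position.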

4. Experimental Validation
4.1 Setup

Hardware: Intel® Core™ i7-13700KF, UAV-mounted imaging system.
Datasets: 3 groups (A, B, C) of 21 images each (defocus → focus → defocus).
Metrics:
- Clarity Ratio ($R$): $R = f_{\max} / f_{\min}$.
- Sensitivity (MSE): $\mathrm{MSE} = \dfrac{f_{\max} - f(x_{\max} + \Delta x)}{f(x_{\max} + \Delta x)}$ ($\Delta x = 3$).
- Interval Sum ($S$): $S = \sum f(x)$ (lower = better).
- Local Peaks ($\alpha$): Fewer peaks indicate higher stability.
- Runtime ($\tau$).
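The first three metrics follow directly from a focus curve, i.e. the sequence of focus values over the 21 lens positions. The sketch below is a hedged illustration with hypothetical function names. It does not normalize the curve, since the text does not say whether normalization is applied before summing. The exact definition of $\alpha$ is not given either, so `count_local_maxima` is only a simple stand-in that counts strict interior maxima and should not be read as the paper's $\alpha$.

```python
import numpy as np

def curve_metrics(values, dx=3):
    """Clarity ratio R, sensitivity MSE and interval sum S of a focus curve."""
    f = np.asarray(values, dtype=np.float64)
    i_max = int(np.argmax(f))              # index of the focus peak
    if i_max + dx >= f.size:
        raise ValueError("peak is too close to the end of the sweep for this dx")
    f_max = f[i_max]
    f_off = f[i_max + dx]                  # value dx steps past the peak
    return {
        "R": float(f_max / f.min()),
        "MSE": float((f_max - f_off) / f_off),
        "S": float(f.sum()),
    }

def count_local_maxima(values):
    """Number of strict interior maxima (a stand-in, not the paper's alpha)."""
    f = np.asarray(values, dtype=np.float64)
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))
```

A unimodal curve yields exactly one interior maximum under this count; any additional maxima indicate local peaks that could trap a hill-climbing search.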

4.2 Results
Table 2: Performance Comparison (Group A)

Function	R	MSE	α	S	τ (s)
Sobel	3.77	0.47	1.00	0.92	0.421
Roberts	17.00	1.58	0.50	0.563	0.503
Tenengrad	11.78	1.21	0.70	0.998	0.598
TR (Proposed)	373.81	2.67	0.53	0.830	0.830

Table 3: Performance Comparison (Group B)

Function	R	MSE	α	S	τ (s)
Roberts	26.86	3.74	0.57	0.574	0.574
Tenengrad	27.69	2.69	0.54	1.00	0.571
TR (Proposed)	300.03	7.10	0.30	0.830	0.830

Table 4: Performance Comparison (Group C)

Function	R	MSE	α	S	τ (s)
Roberts	12.64	1.02	0.59	0.563	0.563
Tenengrad	14.83	0.74	0.34	0.999	0.999
TR (Proposed)	539.99	2.12	0.41	0.831	0.831

Key Observations:

Clarity Ratio ($R$): TR outperforms others by 1–2 orders of magnitude (e.g., 373.81 vs. 17.00 in Group A).
Sensitivity (MSE): TR achieves the highest values (e.g., 7.10 vs. 3.74 in Group B).
Stability: TR suppresses multi-peak behavior ($\alpha \le 0.53$ in all three groups) and keeps its interval sum $S$ below those of Sobel and Tenengrad.
Real-time Viability: All $\tau < 1$ s, suitable for UAV aerial photography.
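The size of the clarity-ratio gain can be checked directly from the table values. The short script below copies the TR value and the best traditional value of $R$ from Tables 2–4 and prints their ratio per group; the variable names are illustrative only.

```python
import math

# Clarity ratios copied from Tables 2-4: (TR, best traditional function).
clarity = {"A": (373.81, 17.00), "B": (300.03, 27.69), "C": (539.99, 14.83)}

# Gain of TR over the best traditional function in each group.
gains = {group: tr / best for group, (tr, best) in clarity.items()}

for group, gain in gains.items():
    print(f"Group {group}: gain {gain:.1f}x, log10 = {math.log10(gain):.2f}")
```

Measured against the best traditional function in each group, the gain is roughly one order of magnitude; the two-orders figure is reached only against the weakest baseline (Sobel in Group A, 373.81 vs. 3.77).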

5. Conclusion
The TR focus evaluation function significantly enhances AF performance in UAV aerial photography:

Superior Clarity & Sensitivity: $R$ exceeds traditional functions by one to two orders of magnitude, and MSE is the highest in every test group.
Robustness: Adaptive thresholding mitigates noise from lighting/aerodynamic disturbances.
Efficiency: Sub-second computation enables real-time focusing.

This work bridges the gap between theoretical AF algorithms and practical demands of low-altitude UAV operations. Future efforts will optimize TR for embedded UAV hardware.