Occluded Target Detection and Cooperative Tracking for Surveying Drones in Low-Altitude Complex Environments

Surveying drones (UAVs) face significant challenges in low-altitude environments, where obstacles such as buildings, trees, and infrastructure cause target occlusion. We propose an integrated framework combining lightweight vision algorithms with multi-UAV coordination to overcome these limitations. Our approach enables real-time detection of heavily occluded targets and dynamic multi-angle observation for persistent tracking.

For occluded target detection, we designed a pure-encoder Transformer model eliminating redundant decoder layers. The architecture comprises three components: a RepVGG backbone for feature extraction, Transformer encoders for attention-based feature enhancement, and a fully connected prediction head. The backbone processes input images (template: 128×128, search area: 5× bounding box size) using re-parameterized VGG blocks:

$$ \text{RepVGG}(I) = \text{ReLU}\big(\text{BN}(\text{Conv}_{3\times3}(I)) + \text{BN}(\text{Conv}_{1\times1}(I)) + \text{BN}(I)\big) $$
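The practical appeal of RepVGG is that, after training, the parallel branches fuse into a single 3×3 convolution for fast inference. A minimal NumPy sketch of this re-parameterization (BatchNorm fusion is omitted for brevity, and the weights are hypothetical):

```python
import numpy as np

def reparam_kernel(k3, k1, channels):
    """Fuse RepVGG's 3x3, 1x1, and identity branches into one 3x3 kernel.

    k3: (C_out, C_in, 3, 3) conv weights; k1: (C_out, C_in, 1, 1).
    BatchNorm fusion is omitted for brevity; weights here are hypothetical.
    """
    fused = k3.copy()
    # 1x1 branch: pad to 3x3 by placing its weight at the kernel center
    fused[:, :, 1, 1] += k1[:, :, 0, 0]
    # identity branch: a 3x3 kernel that passes channel c through unchanged
    for c in range(channels):
        fused[c, c, 1, 1] += 1.0
    return fused
```

Because the fused kernel is algebraically equivalent to the branch sum, the multi-branch accuracy benefit carries over to a plain single-path network at inference time.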

Features are flattened into $[HW, B, C]$ and fed into Transformer encoders where multi-head attention computes target-context correlations:

$$ \text{Attention}(Q,K,V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V $$
$$ \text{MultiHead}(Q,K,V) = \text{Concat}(\text{head}_1,\ldots,\text{head}_h)W^O $$
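The attention computation above can be sketched in NumPy; the shapes and the output projection $W^O$ below are illustrative, not the model's actual dimensions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, h, W_O):
    """Split features into h heads, attend per head, concat, project by W_O."""
    heads = [attention(q, k, v) for q, k, v in
             zip(np.split(Q, h, axis=-1),
                 np.split(K, h, axis=-1),
                 np.split(V, h, axis=-1))]
    return np.concatenate(heads, axis=-1) @ W_O
```

In the tracker, the flattened $[HW, B, C]$ template and search features play the roles of $Q$, $K$, and $V$, so each search-region token is re-weighted by its correlation with the target template.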

The prediction head outputs bounding box probabilities through four Conv-BN-ReLU layers. Corner expectations are calculated as:

$$ (\hat{x}_{tl}, \hat{y}_{tl}) = \left( \sum_{y=0}^H \sum_{x=0}^W x \cdot P_{tl}(x,y), \sum_{y=0}^H \sum_{x=0}^W y \cdot P_{tl}(x,y) \right) $$
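This soft-argmax over the corner probability map can be sketched as follows, assuming the head's raw score map is softmax-normalized into $P_{tl}$:

```python
import numpy as np

def soft_argmax_corner(logits):
    """Expected (x, y) corner position from a raw score map of shape (H, W)."""
    H, W = logits.shape
    p = np.exp(logits - logits.max())
    p /= p.sum()                      # softmax over the whole map -> P_tl(x, y)
    ys, xs = np.mgrid[0:H, 0:W]       # integer pixel coordinates
    # expectations: x_hat = sum x * P(x, y), y_hat = sum y * P(x, y)
    return float((xs * p).sum()), float((ys * p).sum())
```

Taking the expectation rather than the argmax yields sub-pixel corner estimates and keeps the head differentiable end-to-end.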

For multi-UAV cooperative tracking, we introduce a visibility-aware observation planning system. First, target positions are estimated via ray intersection from multiple surveying drones:

$$ \min_{p'_t} \sum_{i=1}^N D(p'_t, l_i) $$

where $D$ denotes the distance from the estimated position $p'_t$ to observation ray $l_i$. Non-occluded regions $\Omega_i$ around the target are identified through radial scanning at angular intervals $\beta$:

$$ \Omega_i = \{\theta | \text{line}(p_t, p_{\theta}) \cap \text{obstacles} = \emptyset \} $$
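A minimal sketch of this radial scan, under the simplifying assumption that obstacles are circles in the horizontal plane (the scan radius and interval below are hypothetical):

```python
import math

def visible_sectors(p_t, obstacles, radius=10.0, beta_deg=10.0):
    """Radially scan around target p_t; return angles whose ray hits no obstacle.

    obstacles: list of ((cx, cy), r) circles -- a simplifying assumption;
    the real planner would query an occupancy map instead.
    """
    def ray_blocked(theta):
        dx, dy = math.cos(theta), math.sin(theta)
        for (cx, cy), r in obstacles:
            # project the obstacle center onto the ray direction
            t = (cx - p_t[0]) * dx + (cy - p_t[1]) * dy
            if 0.0 <= t <= radius:
                px, py = p_t[0] + t * dx, p_t[1] + t * dy
                if math.hypot(px - cx, py - cy) <= r:
                    return True
        return False

    return [math.radians(a) for a in range(0, 360, int(beta_deg))
            if not ray_blocked(math.radians(a))]
```

The returned angle set is the discrete analogue of $\Omega_i$: directions from which a UAV can see the target without an intervening obstacle.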

Optimal observation points $P_{\text{obv}}$ are generated within the occlusion-free sectors. When the number of UAVs exceeds the number of visible regions, observation points are allocated by solving an assignment problem (via particle swarm optimization) that balances the total observation cost against the worst-case cost $z$:

$$ \text{minimize} \quad \alpha \sum_{i=1}^n \sum_{j=1}^n c_{ij}x_{ij} + \beta z $$
$$ \text{subject to} \quad \sum_{j=1}^n x_{ij} = 1, \quad \sum_{i=1}^n x_{ij} = 1, \quad x_{ij} \in \{0,1\}, \quad z \geq c_{ij}x_{ij} \;\; \forall i,j $$

Here $\alpha$ and $\beta$ are weighting coefficients (distinct from the scan interval $\beta$ above), and $c_{ij}$ is the cost of assigning UAV $i$ to observation point $j$.
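For small team sizes the objective can be checked by brute force; the sketch below enumerates permutations rather than running PSO, with hypothetical weights:

```python
from itertools import permutations

def assign_observation_points(cost, alpha=1.0, beta=1.0):
    """Minimize alpha * (total cost) + beta * (max single cost).

    cost[i][j] is c_ij, e.g. the angular deviation if UAV i takes point j.
    Weights alpha/beta are hypothetical; the paper solves this with PSO,
    which scales better than this O(n!) enumeration.
    """
    n = len(cost)
    best, best_val = None, float("inf")
    for perm in permutations(range(n)):            # x_ij = 1 iff perm[i] == j
        costs = [cost[i][perm[i]] for i in range(n)]
        val = alpha * sum(costs) + beta * max(costs)  # z = max c_ij x_ij
        if val < best_val:
            best, best_val = perm, val
    return best, best_val
```

The $\beta z$ term is what differentiates this from a plain minimum-cost assignment: it discourages solutions where one UAV is stuck with a very poor viewpoint even if the total cost is low.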

Paths are planned using hybrid A* with occlusion cost $C_{\text{occ}}$ and safety cost $S_d$:

$$ h^k(x) = D(p_k, p_{k+1}) + C_{\text{occ}} + S_d $$
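A sketch of the per-expansion cost, with hypothetical weights and a linear safety penalty inside a margin $d_{\text{safe}}$:

```python
import math

def hybrid_astar_cost(p_k, p_next, occlusion_penalty, d_obstacle,
                      d_safe=2.0, w_occ=5.0, w_safe=3.0):
    """Node-expansion cost h^k(x) = D(p_k, p_{k+1}) + C_occ + S_d.

    occlusion_penalty: in [0, 1], how blocked the target view is from p_next.
    d_obstacle: distance from p_next to the nearest obstacle.
    All weights (w_occ, w_safe, d_safe) are hypothetical tuning parameters.
    """
    dist = math.dist(p_k, p_next)                    # D(p_k, p_{k+1})
    c_occ = w_occ * occlusion_penalty                # C_occ: avoid blocked views
    s_d = w_safe * max(0.0, d_safe - d_obstacle)     # S_d: stay off obstacles
    return dist + c_occ + s_d
```

Folding visibility and safety into the expansion cost steers the planner toward waypoints that both clear obstacles and keep the target in view, rather than treating path length alone as optimal.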

Yaw control maintains a target-centered perspective for every surveying UAV:

$$ \psi = \operatorname{atan2}\left( e_y^T (p_i - p'_t),\; e_x^T (p_i - p'_t) \right) $$
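A direct transcription of the yaw formula, with $e_x$, $e_y$ taken as the world-frame unit axes, $p_i$ the UAV position, and $p'_t$ the estimated target position:

```python
import math

def target_yaw(p_i, p_t):
    """psi = atan2(e_y . (p_i - p_t), e_x . (p_i - p_t)).

    With e_x = (1, 0) and e_y = (0, 1), the dot products reduce to the
    x and y components of the offset between UAV and target.
    """
    dx, dy = p_i[0] - p_t[0], p_i[1] - p_t[1]
    return math.atan2(dy, dx)
```

Depending on the heading convention, the camera axis points along $\psi$ or $\psi + \pi$; either way, updating $\psi$ as the target estimate moves keeps it centered in each UAV's field of view.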

Our occlusion detection method offers a strong speed-accuracy trade-off relative to state-of-the-art approaches: it runs twice as fast as E.T.Tracker at a modest accuracy cost, while clearly outperforming DiMP18 and KCF. Testing on the LaSOT dataset and in Gazebo simulations shows consistent tracking under up to 90% occlusion:

| Method      | FPS (RTX 2060) | Accuracy | Max Occlusion Tolerance |
|-------------|----------------|----------|-------------------------|
| Proposed    | 80             | 54.3%    | 90%                     |
| E.T.Tracker | 40             | 59.0%    | 90%                     |
| DiMP18      | 55             | 53.1%    | 90%                     |
| KCF         | 75             | 17.8%    | 50%                     |

Onboard performance demonstrates real-time capability for surveying UAV applications:

| Hardware         | FPS | Resource Utilization       |
|------------------|-----|----------------------------|
| Jetson Xavier NX | 36  | GPU: 62%, CPU: 79.6%       |
| de next-TGU8     | 26  | CPU: 407.3% (multi-core)   |

Multi-UAV cooperative tracking reduces localization error by 62% compared to single-drone operations. In dense jungle flight tests, three surveying drones maintained continuous target coverage through coordinated viewpoint adjustment. The framework’s key innovations include:

  1. Computationally efficient occlusion handling (36 FPS on edge devices)
  2. Visibility-optimized formation control for surveying drones
  3. Integrated perception-planning for cluttered environments

Future work will focus on occlusion prediction and communication-efficient coordination for large surveying UAV swarms. The proposed system significantly advances persistent target monitoring capabilities for surveying drones operating in urban canyons, forested terrain, and industrial complexes where traditional approaches fail.
