YOLO-DAP: An Advanced Anti-UAV Object Detection Algorithm Based on Improved YOLOv8

In recent years, the rapid proliferation of unmanned aerial vehicles (UAVs) has posed significant challenges in security and surveillance, necessitating robust anti-UAV detection systems. Traditional anti-UAV methods, such as radar detection, radio frequency monitoring, and acoustic sensing, often suffer from limitations like high false detection rates, missed detections, and low accuracy, especially in complex environments. To address these issues, I propose YOLO-DAP, an improved YOLOv8-based algorithm specifically designed for anti-UAV object detection. This algorithm enhances detection precision for small UAV targets while maintaining real-time performance, making it suitable for practical anti-UAV applications. In this article, I will detail the methodology, experimental validation, and results, emphasizing the integration of novel modules and loss functions to optimize anti-UAV capabilities.

The core of YOLO-DAP lies in four key improvements over the baseline YOLOv8n model. First, I remove the large object detection layer (P5) and replace the original detection heads with new ones sized 160×160, 80×80, and 40×40, which significantly boosts the network's ability to detect tiny UAV targets. Second, I introduce the Dilation-wise Residual (DWR) attention module to enhance feature extraction by capturing multi-scale contextual information. Third, I incorporate the lightweight downsampling module ADown to improve feature fusion across different scales. Finally, I adopt the PIoU loss function for bounding box regression, accelerating convergence and improving accuracy. These modifications collectively enhance the anti-UAV detection performance, as validated on the public TIB-Net UAV dataset.

To mathematically represent the improvements, consider the baseline YOLOv8 structure. The original detection heads operate at scales of 80×80, 40×40, and 20×20, which may lose fine-grained features for small UAVs. By replacing these with higher-resolution heads, the feature map dimensions are adjusted as follows: let the input image size be $I_{h} \times I_{w}$ (e.g., 640×640). After downsampling, the feature maps at the three detection levels can be denoted as $F_{1}, F_{2}, F_{3}$. In YOLO-DAP, the new detection heads correspond to $F_{1}' = 160\times160$, $F_{2}' = 80\times80$, and $F_{3}' = 40\times40$. For a 640×640 input, each cell of the 160×160 grid covers a 4×4-pixel region instead of the 8×8 region of the finest baseline head, so small targets are sampled on a finer grid and retain more spatial detail, which improves detection accuracy. The removal of the P5 layer also reduces the parameter count, as the ablation table later shows.
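
The head resolutions follow directly from the input size and the downsampling stride of each head. The short sketch below makes that arithmetic explicit. The stride values (8, 16, 32 for the baseline and 4, 8, 16 for YOLO-DAP) are inferred from the grid sizes quoted above, and the function name is only illustrative.

```python
def head_grid_sizes(input_size, strides):
    """Return the (height, width) of each detection grid for a square input."""
    return [(input_size // s, input_size // s) for s in strides]

# Strides are inferred from the quoted grid sizes, not taken from a config file.
baseline = head_grid_sizes(640, strides=(8, 16, 32))
yolo_dap = head_grid_sizes(640, strides=(4, 8, 16))

print("baseline heads:", baseline)
print("YOLO-DAP heads:", yolo_dap)
```

Under these assumptions, a UAV that occupies 16×16 pixels spans 4×4 cells on the finest YOLO-DAP grid but only 2×2 cells on the finest baseline grid.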

The DWR module is integrated into the C2f block, where it replaces the Bottleneck to strengthen feature extraction. The module uses dilated convolutions with different rates to capture multi-scale information. Formally, for an input feature map $X \in \mathbb{R}^{C \times H \times W}$, the DWR module first applies a 3×3 convolution that expands the channels to $3C$, giving $X_{e} = \text{Conv}_{3\times3}(X)$, and then applies depthwise convolutions with dilation rates $d=3$ and $d=5$ to $X_{e}$. The outputs are aggregated and passed through batch normalization and a 1×1 convolution that restores $C$ channels, producing a residual $R$. The final output is $Y = X + R$, where $Y$ retains enriched features for anti-UAV detection. This can be expressed as:
$$Y = X + \text{Conv}_{1\times1}(\text{BN}(\text{Aggregate}(\text{DWConv}_{d=3}(X_{e}), \text{DWConv}_{d=5}(X_{e}))))$$
where $\text{DWConv}$ denotes depthwise convolution, and $\text{Aggregate}$ combines the branch outputs. This structure allows the network to better handle UAVs in cluttered backgrounds.
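
To see why different dilation rates yield multi-scale context, it helps to compute how far a dilated kernel reaches. The sketch below uses the standard formula for the spatial extent of a dilated kernel. It assumes 3×3 depthwise kernels, which the description above does not state explicitly, and the function name is illustrative.

```python
def effective_kernel(kernel_size, dilation):
    """Spatial extent covered by a dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# Assumption: the depthwise branches use 3x3 kernels.
for d in (3, 5):
    k = effective_kernel(3, d)
    print(f"dilation {d}: a 3x3 kernel spans {k}x{k} positions")
```

Each branch therefore looks at a different neighbourhood size with the same number of weights, which is what lets the aggregated output mix context from several scales.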

The ADown module is used in the neck network for downsampling. It splits the input channels into two halves, applies average pooling to one and max pooling to the other, convolves each branch, and concatenates the results along the channel axis. For an input $Z \in \mathbb{R}^{C \times H \times W}$, it produces $Z' \in \mathbb{R}^{C \times H/2 \times W/2}$, halving the spatial dimensions while keeping both averaged and peak responses. This lightweight design keeps the parameter count low while improving fusion, which is crucial for real-time anti-UAV systems. The operation can be summarized as:
$$Z' = \text{Concat}(\text{Conv}_{3\times3}(\text{AvgPool}(Z_{1})), \text{Conv}_{1\times1}(\text{MaxPool}(Z_{2})))$$
where $Z_{1}$ and $Z_{2}$ are the two channel halves of $Z$.
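
A rough weight count illustrates why this design is lightweight. The sketch below compares the two ADown branches with a plain 3×3 stride-2 convolution that maps $C$ channels to $C$ channels. It assumes each branch maps $C/2$ channels to $C/2$ channels and that the branch outputs are concatenated. Biases and normalization layers are ignored, and the channel count of 128 is purely illustrative.

```python
def conv_weights(c_in, c_out, k):
    """Weight count of a k x k convolution without bias."""
    return c_in * c_out * k * k

def adown_weights(c):
    """Weights in the two ADown branches, each assumed to map c/2 -> c/2 channels."""
    half = c // 2
    return conv_weights(half, half, 3) + conv_weights(half, half, 1)

c = 128  # illustrative channel count, not taken from the model definition
plain = conv_weights(c, c, 3)   # plain 3x3 stride-2 downsampling convolution
adown = adown_weights(c)

print(f"plain 3x3 conv: {plain} weights")
print(f"ADown branches: {adown} weights ({adown / plain:.1%} of plain)")
```

Under these assumptions the ratio is 10/36 regardless of the channel count, since the plain convolution needs $9C^{2}$ weights and the two branches need $2.5C^{2}$.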

For bounding box regression, I replace the CIoU loss with PIoU loss to address limitations in small target detection. The PIoU loss is defined as:
$$L_{PIoU} = L_{IoU} + 1 - e^{-P^{2}}$$
where $P$ is a penalty factor adaptively scaled to target size:
$$P = \left( \frac{|dw_{1}|}{w_{gt}} + \frac{|dw_{2}|}{w_{gt}} + \frac{|dh_{1}|}{h_{gt}} + \frac{|dh_{2}|}{h_{gt}} \right) / 4$$
Here, $dw_{1}, dw_{2}, dh_{1}, dh_{2}$ are absolute distances between prediction and ground truth box edges, and $w_{gt}, h_{gt}$ are the width and height of the ground truth box. This loss function ensures faster convergence and higher accuracy for anti-UAV tasks, as it penalizes misalignments more effectively for small UAVs.
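
The two formulas above can be sketched for a single pair of boxes as follows. This is a minimal illustration, not the training implementation: it assumes boxes in `(x1, y1, x2, y2)` pixel format and takes $L_{IoU} = 1 - \text{IoU}$, which is the usual definition but is not spelled out above. The function name is illustrative.

```python
import math

def piou_loss(pred, gt):
    """PIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection over union of the two boxes.
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_pred + area_gt - inter)

    # Edge distances, normalised by the ground-truth width and height.
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    dw1, dw2 = abs(pred[0] - gt[0]), abs(pred[2] - gt[2])
    dh1, dh2 = abs(pred[1] - gt[1]), abs(pred[3] - gt[3])
    p = (dw1 / w_gt + dw2 / w_gt + dh1 / h_gt + dh2 / h_gt) / 4

    # Assumption: L_IoU = 1 - IoU.
    return (1.0 - iou) + 1.0 - math.exp(-p ** 2)

small_gt = (100, 100, 110, 110)   # 10x10-pixel target
large_gt = (100, 100, 200, 200)   # 100x100-pixel target

loss_small = piou_loss((102, 102, 112, 112), small_gt)  # 2-pixel shift
loss_large = piou_loss((102, 102, 202, 202), large_gt)  # same 2-pixel shift

print(f"small target loss: {loss_small:.4f}")
print(f"large target loss: {loss_large:.4f}")
```

Because the edge distances are divided by the ground-truth size, the same 2-pixel shift produces a much larger penalty on the 10×10 box than on the 100×100 box, which is the size-adaptive behaviour described above.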

To evaluate YOLO-DAP, I conducted experiments on the TIB-Net dataset, which includes 2850 images of UAVs in varied scenarios. The dataset was split into training, validation, and test sets in a 7:2:1 ratio. Experimental setup involved an AMD Ryzen 9 CPU, NVIDIA GeForce RTX 4060 GPU, and PyTorch framework. Hyperparameters included an input size of 640×640, SGD optimizer with learning rate 0.01, momentum 0.937, weight decay 0.0005, batch size 8, and 300 epochs. Evaluation metrics were precision (P), recall (R), mean average precision (mAP), parameters, and frames per second (FPS), all critical for assessing anti-UAV performance.
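
For reference, a 7:2:1 split of 2850 images can be reproduced along the following lines. The shuffling, the fixed seed, and the function name are assumptions made for illustration. How the images were actually assigned to each subset is not specified here.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/val/test by integer ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test set
    return train, val, test

# Image indices stand in for file paths in this sketch.
train, val, test = split_dataset(range(2850))
print(len(train), len(val), len(test))
```

With 2850 images this yields 1995 training, 570 validation, and 285 test images.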

Ablation studies were performed to validate each improvement. The baseline YOLOv8n model achieved an mAP of 84.9% with 3.16×10^6 parameters. Each modification was first added to the baseline on its own, and all four were then combined in the final model. The table below summarizes the ablation results, highlighting the impact of each component on anti-UAV detection.

Model	Precision (%)	Recall (%)	mAP (%)	Parameters	FPS (frames/s)
Baseline YOLOv8n	89.9	76.4	84.9	3.16×10^6	130.0
+ Remove P5 and replace heads	88.8	90.9	92.0	1.25×10^6	154.6
+ C2f_DWR module	87.0	77.3	85.1	3.12×10^6	123.0
+ ADown module	88.1	81.0	86.0	3.02×10^6	137.5
+ PIoU loss	88.2	80.0	86.9	3.16×10^6	145.9
All improvements (YOLO-DAP)	89.8	92.5	92.7	1.23×10^6	135.2

The results show that removing P5 and replacing the detection heads had the largest effect: recall rose from 76.4% to 90.9% and mAP from 84.9% to 92.0%, while the parameter count fell to 1.25×10^6, which is essential for efficient anti-UAV systems. The C2f_DWR module provided a slight mAP gain (85.1%), and ADown raised mAP to 86.0% with fewer parameters (3.02×10^6). PIoU loss raised mAP to 86.9% without changing the parameter count. Combined, YOLO-DAP achieved an mAP of 92.7%, a 7.8 percentage point improvement over the baseline, with 1.93×10^6 fewer parameters and a higher frame rate (135.2 versus 130.0 FPS), demonstrating its advantage for anti-UAV applications.
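
The headline differences between the baseline and YOLO-DAP can be recomputed directly from the ablation table. The snippet below does only that arithmetic. The dictionary layout and variable names are illustrative.

```python
# Values copied from the first and last rows of the ablation table.
baseline = {"map": 84.9, "params": 3.16e6, "fps": 130.0}
yolo_dap = {"map": 92.7, "params": 1.23e6, "fps": 135.2}

map_gain = yolo_dap["map"] - baseline["map"]
params_saved = baseline["params"] - yolo_dap["params"]
params_cut_pct = 100 * params_saved / baseline["params"]
fps_gain = yolo_dap["fps"] - baseline["fps"]

print(f"mAP gain: {map_gain:.1f} percentage points")
print(f"parameters removed: {params_saved:.2e} ({params_cut_pct:.1f}% of baseline)")
print(f"FPS gain: {fps_gain:.1f}")
```

Expressed as a fraction, the combined model uses roughly 61% fewer parameters than YOLOv8n.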

Comparative experiments with other YOLO variants further validated YOLO-DAP’s advancements. The table below contrasts performance metrics, emphasizing its lead in mAP and parameter reduction for anti-UAV tasks.

Model	Precision (%)	mAP (%)	Parameters	FPS (frames/s)
YOLOv3	89.2	85.1	4.20×10^6	157.6
YOLOv5	85.1	82.1	2.65×10^6	148.0
YOLOv6	84.1	77.7	4.50×10^6	151.8
YOLOv8n	89.9	84.9	3.16×10^6	130.0
YOLO-DAP	89.8	92.7	1.23×10^6	135.2
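
The FPS column can also be read as a per-frame time budget. The sketch below converts each frame rate from the comparison table to milliseconds per frame, assuming frames are processed one at a time, and lists which models run faster than YOLO-DAP. Variable names are illustrative.

```python
# FPS values copied from the comparison table.
fps = {"YOLOv3": 157.6, "YOLOv5": 148.0, "YOLOv6": 151.8,
       "YOLOv8n": 130.0, "YOLO-DAP": 135.2}

# Assumption: one frame at a time, so latency is the reciprocal of throughput.
latency_ms = {name: 1000.0 / value for name, value in fps.items()}
faster_than_dap = sorted(n for n, v in fps.items() if v > fps["YOLO-DAP"])

for name, ms in latency_ms.items():
    print(f"{name}: {ms:.2f} ms per frame")
print("Faster than YOLO-DAP:", faster_than_dap)
```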

YOLO-DAP achieves the highest mAP and the fewest parameters of all compared models, with precision (89.8%) essentially level with YOLOv8n (89.9%). YOLOv3, YOLOv5, and YOLOv6 run at higher FPS, but at 135.2 FPS YOLO-DAP is still faster than YOLOv8n, and its balance of accuracy and speed makes it well suited to real-time anti-UAV detection. The visual results on test images, including complex backgrounds and varying lighting, show reduced missed detections and false alarms compared to YOLOv8n, as illustrated below. The enhanced detection heads and feature extraction modules allow YOLO-DAP to capture finer details of small UAVs, critical for reliable anti-UAV operations.

The image above exemplifies the challenging scenarios in anti-UAV detection, where UAVs appear as small objects against cluttered skies. YOLO-DAP’s improvements enable accurate localization and classification in such conditions, reducing errors common in traditional methods. This visual evidence underscores the algorithm’s practicality for anti-UAV systems deployed in fields like border security or event monitoring.

In conclusion, YOLO-DAP represents a significant step forward in anti-UAV object detection. By integrating higher-resolution detection heads, the DWR module for multi-scale feature extraction, the ADown module for efficient downsampling, and the PIoU loss for optimized regression, the algorithm achieves high precision and recall while reducing computational cost. The experimental results on the TIB-Net dataset confirm its effectiveness, with a 92.7% mAP and 1.23×10^6 parameters, compared with 84.9% and 3.16×10^6 for the YOLOv8n baseline. Future work could explore adapting YOLO-DAP for other small target detection tasks or integrating it with sensor fusion for comprehensive anti-UAV solutions. This research contributes to the growing field of anti-UAV technology, offering a robust tool for enhancing security and surveillance capabilities.