Graph Attention Network Based Defense Decision Making Against UAV Drone Swarms

The rapid advancement of communication and intelligent technologies has enabled modern UAV drones to execute diverse military missions through swift configuration of task modules, and defending against UAV drone swarms has become an urgent challenge in contemporary combat operations. Traditional air defense weapons, mainly point-kill systems, have inherent limitations against swarms, such as low clearance efficiency and high consumption of high-value munitions. In contrast, ultrawideband high-power microwave (HPM) weapons radiate high-energy electromagnetic waves within a fixed emission angle, providing short-range, large-area damage capability at comparatively low cost. These weapons do not rely on high-precision tracking systems; they only require control of the antenna pointing. When the spatial distribution of a UAV drone swarm exceeds the radiation angle, the antenna pointing and continuous rotation strategy must be optimized to maximize the protection of friendly assets. This constitutes a critical combat decision problem.

In the field of air defense decision making, threat assessment aims to quantify the damage risk posed by enemy targets and guide optimal firepower allocation. Typical features used for threat assessment of aerial targets include target velocity, range, altitude, time-to-fly, path shortcut, target type, and jamming capability. Based on these features, researchers have utilized methods such as analytic hierarchy process and weighted summation to compute threat evaluation values (also called threat indicators), ensuring that defensive fire is always preferentially assigned to targets with the highest threat value in the current decision cycle.

However, this paradigm faces dual challenges when confronting UAV drone swarms. First, to achieve saturation attack and cooperative effects, individual UAV drones in a swarm tend to be highly similar in features such as speed and range, making it difficult for traditional methods to effectively distinguish threat levels within the swarm, often leading to inaccurate fire allocation. Second, to accomplish formation keeping, cooperative obstacle avoidance, and consensus decision making, individual UAV drones within a swarm inevitably exhibit communication and behavioral correlations, resulting in complex interactions and evolutionary patterns in the spatiotemporal dimension—a factor that is severely neglected in traditional static evaluation models.

To address these issues, we propose a threat assessment framework based on joint spatiotemporal modeling that combines a Graph Attention Network (GAT) with a Long Short-Term Memory (LSTM) network. Through its adaptive attention mechanism, GAT parses the spatial collaborative structure within the swarm and dynamically learns the correlation weights between UAV drone nodes, while the gated memory mechanism of LSTM captures the temporal evolution of target states. By fusing the two architectures, our model mines the spatiotemporal interaction features embedded in the incoming process of UAV drone swarms, combines them with the traditional evaluation system, and achieves fine-grained dynamic differentiation of threat levels among individual targets within the swarm. This provides decision-making support for area-kill weapons such as ultrawideband HPM weapons, thereby enhancing the operational effectiveness of counter-UAV drone swarm defense.

1. Related Work

Existing counter-UAV drone measures include jamming and capture, kinetic intercept, directed energy weapon intercept, and tethered net capture. To effectively counter saturation attacks by UAV drone swarms, relevant institutions worldwide are accelerating the upgrade and innovation of various anti-UAV drone equipment technologies. For example, Russia has developed the “Sickle-VS” series of counter-UAV drone systems; Israel’s D-Fend EnforceAir2 version 24.04.2 supports recommending appropriate countermeasure strategies for each UAV drone and shortening decision cycles. Lockheed Martin in the United States mounts HPM weapons on UAV drone platforms, enabling high-power microwave emission near incoming UAV drone swarms. According to the timing of use of various measures, the characteristics of defense operation phases against UAV drone swarms are summarized in Table 1.

Table 1: Characteristics of Defense Operation Phases Against UAV Drone Swarms

Phase	Applied Means	Expected Effect
Long-range reconnaissance and early warning	Radar detection, radio detection, electromagnetic signal detection, acoustic detection, infrared detection, etc.	Identify UAV drone swarm position and movement, distinguish friend from foe
Medium-range jamming and capture	Link blocking, signal spoofing, sensor blinding, network capture, etc.	Reduce or paralyze UAV drone combat capability
Short-range jamming and destruction	High-power directional electromagnetic jamming, microwave weapon area kill, laser weapon precision destruction, anti-aircraft shell fragmentation, etc.	Implement all-around dense protection, disable UAV drone swarm
End-point protection	Install protective nets or shields, deploy decoys, release smokescreen, etc.	Reduce UAV drone reconnaissance and positioning

In terms of threat assessment for target groups, existing research can be roughly divided into three categories. The first category introduces intelligent methods based on single-target threat assessment to support swarm target threat judgment, such as using deviation-maximization weighting and K-nearest neighbor discrimination for aerial target threat levels, or training support vector regression models to determine swarm target threat levels. However, these methods struggle to effectively differentiate the incoming characteristics of similar types of multiple targets. The second category first performs group clustering based on target track tables and then conducts threat assessment, for example, adding “non-membership” parameters and time weights to traditional fuzzy sets to achieve multi-group target threat evaluation, or using weighted spatial clustering models for swarm target clustering threat estimation, or using Canopy-K-means and analytic hierarchy processes to quantify threat evaluation values. These methods require pre-clustering of targets and neglect the threat changes caused by dynamic interactions between swarm nodes. The third category utilizes graph neural networks, graph convolutional networks, spectral adaptation, swarm dynamic models, and radial basis function neural networks to capture the rich features of UAV drone swarm attacks, particularly achieving good discrimination in key node identification. However, existing methods mostly focus on static or short-term feature extraction, lacking the ability to perform continuous, fine-grained threat ranking of individuals within homogeneous UAV drone swarms. How to deeply integrate the real-time motion state of targets with the cooperative interaction relationships within the swarm to generate more discriminative and stable individual threat quantification metrics remains a critical problem to be addressed in the field of threat assessment.

Regarding counter-UAV drone task allocation, researchers have constructed mathematical models of swarm splitting and trained multi-level residual networks for multi-weapon autonomous decision making. Others have combined the Kuhn-Munkres algorithm and Hungarian algorithm to achieve effective formation and interception of UAV drone swarms. Adaptive genetic algorithms with enhanced local search capability have been designed to solve multi-constraint weapon-target allocation models with good solution quality. Multi-agent deep deterministic policy gradient algorithms have been proposed for cooperative interception by multiple UAV drones. Fuzzy mathematics has been used to evaluate the threat of behavioral factors of UAV drone swarms, and convolutional neural networks have been trained to select the optimal interception weapon. A spatiotemporal benefit-based multi-UAV drone defense strategy has been proposed, dividing tasks into temporal and spatial benefits, showing strong scalability and fast convergence. Machine learning has been demonstrated to provide automatic decision support for laser weapons, predicting the best engagement plan for UAV drone swarms based on threat type, quantity, and attack strategy. While these studies have contributed significantly to improving solution quality and intelligence, the problem of constructing an efficient, rule-based, and interpretable “basic weapon usage strategy model” for area-kill weapons like ultrawideband HPM still remains to be clarified.

2. Graph Network Modeling of Incoming UAV Drone Swarms

2.1 Establishment of Graph Elements

At each observation time step $t$, we construct an incoming UAV drone swarm graph $G_t = \{S_t, E_t, P_t\}$, where $S_t = \{s_1^t, s_2^t, \ldots, s_N^t\}$ is the set of $N$ nodes (each node represents an incoming UAV drone), $E_t$ is the set of edges denoting information interaction relationships between UAV drones, and $P_t$ is the weight matrix of the edges. Since maritime UAV drones have limited communication capability, the signal becomes increasingly submerged in noise as distance increases. Therefore, we set a link channel capacity threshold $\text{cap}_{\text{thres}}$. When the link channel capacity is at or above this threshold, the two UAV drones can communicate and cooperate, i.e., an edge exists between the two nodes in the incoming graph; otherwise, no edge exists.

Let $\text{linkCap}_{ij}^t$ denote the link channel capacity between UAV drone nodes $i$ and $j$ at time $t$, calculated as:

$$
\text{linkCap}_{ij}^t = B \log_2\left(1 + \frac{1}{N_0} \cdot \frac{P_0}{(u_{ij}^t)^\alpha}\right)
$$

where $B$ is the channel bandwidth (Hz), $N_0$ is the noise intensity at the receiver, $P_0$ is the transmission signal power, $u_{ij}^t$ is the distance between two UAV drone nodes $i$ and $j$ at time $t$, and $\alpha$ is the path loss exponent (typically in the range 2 to 4).

The edge weight is also measured by the link channel capacity. The weight $p_{ij}^t$ of the edge in the incoming graph is defined as:

$$
p_{ij}^t = \begin{cases}
0, & \text{linkCap}_{ij}^t < \text{cap}_{\text{thres}} \\
\text{linkCap}_{ij}^t, & \text{linkCap}_{ij}^t \geq \text{cap}_{\text{thres}}
\end{cases}
$$
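The capacity and thresholding rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the default values for $B$, $N_0$, $P_0$, and $\alpha$, the three positions, and the threshold are all assumptions chosen only to show one surviving and two suppressed links. It also assumes no two UAV drones share the same position, since the distance appears in the denominator.

```python
import math

def link_capacity(dist, bandwidth, noise, tx_power, alpha):
    """Capacity B * log2(1 + P0 / (N0 * u^alpha)) of a single link."""
    return bandwidth * math.log2(1.0 + tx_power / (noise * dist ** alpha))

def edge_weights(positions, cap_thres, bandwidth=1e6, noise=1e-9,
                 tx_power=1.0, alpha=2.0):
    """Return the N x N weight matrix P_t; links below cap_thres get weight 0."""
    n = len(positions)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            u_ij = math.dist(positions[i], positions[j])
            cap = link_capacity(u_ij, bandwidth, noise, tx_power, alpha)
            if cap >= cap_thres:
                weights[i][j] = weights[j][i] = cap  # undirected edge
    return weights

# Hypothetical positions in metres: two close UAVs and one far away
positions = [(0.0, 0.0, 100.0), (50.0, 0.0, 100.0), (5000.0, 0.0, 100.0)]
P_t = edge_weights(positions, cap_thres=1.0e7)
```

With these assumed values, only the 50 m link clears the threshold, so the distant node is isolated in the graph.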

2.2 Basic Features of Graph Nodes

Features commonly used for threat assessment of aerial targets include target velocity, range, time-to-fly, path shortcut, target type, and jamming capability. Based on three-coordinate radar information, for incoming UAV drone target $s_i$, we have velocity vector $\vec{v_i} = (v_{X_i}, v_{Y_i}, v_{Z_i})$ and relative position vector $\vec{l_i} = (l_{X_i}, l_{Y_i}, l_{Z_i})$. The target range $d_i$, time-to-fly $ts_i$, and path shortcut $slp_i$ are calculated as:

$$
d_i = \sqrt{(l_{X_i})^2 + (l_{Y_i})^2 + (l_{Z_i})^2}
$$
$$
ts_i = -\frac{\vec{l_i} \cdot \vec{v_i}}{|\vec{v_i}|^2}
$$
$$
slp_i = \frac{|\vec{l_i} \times \vec{v_i}|}{|\vec{v_i}|}
$$

Generally, a shorter time-to-fly, smaller range, and smaller path shortcut imply a higher threat level.
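As a quick check of the three formulas, the sketch below evaluates them for a single target. The function name and the sample vectors are hypothetical; the example places a target 3000 m out along the x-axis with a 400 m lateral offset, flying straight in at 50 m/s.

```python
import math

def kinematic_features(l, v):
    """Return (range d, time-to-fly ts, path shortcut slp) for one target.

    l is the relative position vector and v the velocity vector, both 3-tuples.
    """
    d = math.sqrt(sum(c * c for c in l))
    speed_sq = sum(c * c for c in v)
    dot = sum(a * b for a, b in zip(l, v))
    cross = (l[1] * v[2] - l[2] * v[1],
             l[2] * v[0] - l[0] * v[2],
             l[0] * v[1] - l[1] * v[0])
    ts = -dot / speed_sq                      # time to the closest point of approach
    slp = math.sqrt(sum(c * c for c in cross)) / math.sqrt(speed_sq)
    return d, ts, slp

d, ts, slp = kinematic_features(l=(3000.0, 400.0, 0.0), v=(-50.0, 0.0, 0.0))
```

Here the target reaches its closest point after 60 s and passes 400 m from the defended position, which matches the geometric reading of time-to-fly and path shortcut.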

Following related references, the UAV drone target type $c_i$ can be classified into reconnaissance, jamming, and attack UAV drones, with increasing threat levels, quantized as 1, 3, and 5 respectively. The target jamming capability $a_i$ reflects the probability that the opponent’s radar can detect the target. We divide jamming capability into four levels: strong, relatively strong, weak, and none, quantized as 8, 6, 4, and 2 respectively.

2.3 Node Importance Estimation

UAV drone swarms typically possess adaptive flight and task allocation capabilities. With the rapid development of distributed cooperative control technologies, any node in the swarm can take over integrated control functions at any time, and nodes closer to the center often assume more critical tasks. We adopt the eigenvector centrality (EC) to enhance the graph features, meaning that a node connected to more important nodes is itself more critical. The eigenvector centrality $EC_i^t$ is computed as:

$$
EC_i^t = \frac{1}{\lambda} \sum_{j=1}^{N} p_{ij}^t \, EC_j^t
$$

where $N$ is the total number of nodes (i.e., total number of UAV drones) in the graph and $\lambda$ is the largest eigenvalue of the weight matrix $P_t$.
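Eigenvector centrality is commonly computed by power iteration on the weight matrix. The sketch below is one such approach under stated assumptions: the matrix is symmetric and the graph connected, the scores are normalized to sum to 1 (the eigenvector is only defined up to scale), and an identity shift is added so the iteration also settles on bipartite graphs. The function name and the 4-node example are hypothetical.

```python
def eigenvector_centrality(weights, max_iter=500, tol=1e-12):
    """Power iteration for the dominant eigenvector of a symmetric weight matrix."""
    n = len(weights)
    # Rescale so the identity shift below is comparable to the edge weights;
    # rescaling does not change the eigenvector.
    scale = max(max(row) for row in weights) or 1.0
    ec = [1.0 / n] * n
    for _ in range(max_iter):
        # Apply (P / scale + I) to the current scores
        nxt = [ec[i] + sum(weights[i][j] * ec[j] for j in range(n)) / scale
               for i in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, ec)) < tol:
            return nxt
        ec = nxt
    return ec

# Hypothetical graph: nodes 0, 1, 2 form a triangle; node 3 links only to node 0
P = [[0.0, 2.0, 2.0, 2.0],
     [2.0, 0.0, 2.0, 0.0],
     [2.0, 2.0, 0.0, 0.0],
     [2.0, 0.0, 0.0, 0.0]]
ec = eigenvector_centrality(P)
```

In this example node 0 scores highest, and node 3 scores lowest even though it is attached to the most central node, because it has only one link.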

2.4 Static Threat Indicator

After normalizing each feature to the interval $[1, 2]$, we sum them as the target threat value $rw_i$, and then compute the threat indicator $\text{Weight}_i$ as:

$$
rw_i = -d_i - slp_i - ts_i + c_i + a_i + EC_i^t
$$
$$
\text{Weight}_i = \frac{rw_i}{\sum_{k=1}^{N} rw_k}
$$

$\text{Weight}_i$ represents the static threat assessment based solely on current incoming UAV drone swarm features, i.e., an evaluation completed only using information from the current observation period.

3. Threat Assessment Based on GAT-LSTM

We model the entire incoming trajectory of the UAV drone swarm as a sequence of discrete time slices forming an incoming temporal graph. At each time slice, we use GAT to capture the interaction relationships among UAV drones, generating node embeddings that contain implicit influences from the neighborhood. Then we employ LSTM to perform temporal aggregation learning on the node embedding features across time slices. Finally, the threat evaluation layer computes the threat indicator for each target. Each stage is described in the subsections that follow.

3.1 Spatial Latent Feature Learning

For each time slice, we train a GAT autoencoder based on the incoming graph features $X_0 \in \mathbb{R}^{N \times F}$ and the edge index $\epsilon$. The training objective is to minimize the reconstruction error. The process is as follows:

1. Linear transformation. Let the feature dimension in layer $l$ be $F_l$. After linear transformation, we obtain the input $X_{\text{in}}^l$:

$$
X_{\text{in}}^l = H^{l-1} W^l \in \mathbb{R}^{N \times F_l}, \quad l \geq 1, \quad H^0 = X_0
$$

where $W^l \in \mathbb{R}^{F_{l-1} \times F_l}$ is a learnable weight matrix.

2. Node embedding feature computation. For each edge $(i, j) \in \epsilon$, we calculate the attention score $e_{ij}^l$ between node $i$ and neighbor node $j$:

$$
e_{ij}^l = \text{LeakyReLU}\left((a^l)^T (X_{\text{in}}^l(i) \parallel X_{\text{in}}^l(j))\right)
$$

where $a^l \in \mathbb{R}^{2F_l}$ is a learnable attention vector, and $\parallel$ denotes concatenation of the transformed features of nodes $i$ and $j$.

We then normalize over the neighbor set of node $i$:

$$
a_{ij}^l = \frac{\exp(e_{ij}^l)}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik}^l)}
$$

where $\mathcal{N}_i$ is the neighbor set of node $i$ (including itself).

The hidden state (embedding feature) is updated by aggregating neighbor features:

$$
H^{l}(i) = \sigma\left( \sum_{j \in \mathcal{N}_i} a_{ij}^l X_{\text{in}}^l(j) \right)
$$

where $\sigma(\cdot)$ is a nonlinear activation function (e.g., ELU, ReLU, Sigmoid).

3. Multi-head attention for improved embedding features. To capture complex dynamic relationships in the graph and improve model expressiveness and robustness, we employ a multi-head attention mechanism. Define $M$ attention heads. The output layer uses mean aggregation for stable training:

$$
H^{l} = \sigma\left( \frac{1}{M} \sum_{m=1}^{M} H_m^l \right)
$$

where $H_m^l$ is the embedding feature of nodes at layer $l$ under the $m$-th attention head.

4. Reconstruction of original features. A linear transformation is applied to the embedding features to reconstruct the original features, yielding the reconstructed feature matrix $X_{\text{recon}}$:

$$
X_{\text{recon}} = H^L W'
$$

where $W' \in \mathbb{R}^{F' \times F}$ is a learnable weight matrix, $F' = F_L$ is the embedding dimension of the final GAT layer, and $L$ is the number of GAT layers.

5. Calculation of reconstruction error. We use mean squared error (MSE) to measure the difference between $X_{\text{recon}}$ and $X_0$:

$$
L_{\text{MSE}} = \frac{1}{N \times F} \sum_{i=1}^{N} \sum_{f=1}^{F} \left( X_{\text{recon}}(i,f) - X_0(i,f) \right)^2
$$

The training goal of the GAT autoencoder is to minimize this reconstruction error $L_{\text{MSE}}$.
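The forward pass of steps 1 to 5 can be sketched with NumPy as follows. This is an illustration under assumptions rather than the trained model: it covers a single layer, omits gradient-based training and dropout, uses randomly initialized weights, and takes a binary adjacency matrix as the edge index. All function names, the LeakyReLU slope of 0.2, and the small sizes are hypothetical; ReLU is used as the nonlinearity.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_head(H, W, a, adj):
    """One attention head: returns (aggregated features before activation, attention)."""
    X_in = H @ W                                    # linear transformation, shape (N, F_l)
    F_l = X_in.shape[1]
    # a^T [X_in(i) || X_in(j)] splits into a source term plus a neighbor term
    e = leaky_relu((X_in @ a[:F_l])[:, None] + (X_in @ a[F_l:])[None, :])
    mask = (adj + np.eye(len(adj))) > 0             # neighbor sets include the node itself
    e = np.where(mask, e, -np.inf)
    att = np.exp(e - e.max(axis=1, keepdims=True))  # softmax over each neighbor set
    att = att / att.sum(axis=1, keepdims=True)
    return att @ X_in, att

def gat_layer(H, heads, adj):
    """Average the M heads, then apply the nonlinearity (ReLU here)."""
    mean = np.mean([gat_head(H, W, a, adj)[0] for W, a in heads], axis=0)
    return np.maximum(mean, 0.0)

def reconstruction_mse(H_L, W_dec, X0):
    """Mean squared error between X_recon = H_L W' and the input features X0."""
    return float(np.mean((H_L @ W_dec - X0) ** 2))

rng = np.random.default_rng(0)
N, F, F_hidden, M = 4, 6, 3, 2                      # illustrative sizes only
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)
X0 = rng.normal(size=(N, F))
heads = [(rng.normal(size=(F, F_hidden)), rng.normal(size=2 * F_hidden))
         for _ in range(M)]
W_dec = rng.normal(size=(F_hidden, F))

W0, a0 = heads[0]
_, att = gat_head(X0, W0, a0, adj)
H1 = gat_layer(X0, heads, adj)
loss = reconstruction_mse(H1, W_dec, X0)
```

Masking non-neighbors with negative infinity before the softmax gives them exactly zero attention, so each row of `att` is a distribution over the node's own neighbor set.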

3.2 Spatiotemporal Feature Learning

LSTM receives and processes the sequence of embedding features for each UAV drone node across different time slices, denoted as $V_i^{(T)} = \{ H^L(t) \mid t \in T \}$. Let $h_{t-1}$ represent the hidden state at time step $t-1$ in the LSTM. The input at time step $t$ is $V_L(t) = \{ V_i^{(T)} \parallel X_0(t) \}$, i.e., the concatenation of the temporal embedding and the current original features. The input gate processing is:

$$
\text{in}(t) = \sigma(W_{\text{in}} V_L(t) + U_{\text{in}} h_{t-1} + b_{\text{in}})
$$

where $W_{\text{in}}$, $U_{\text{in}}$, and $b_{\text{in}}$ are the input gate weight matrix, hidden state weight matrix, and bias term, respectively.

The forget gate $fg(t)$ is computed as:

$$
fg(t) = \sigma(W_{\text{fg}} V_L(t) + U_{\text{fg}} h_{t-1} + b_{\text{fg}})
$$

The candidate memory cell $\tilde{z}(t)$ is:

$$
\tilde{z}(t) = \tanh(W_c V_L(t) + U_c h_{t-1} + b_c)
$$

The updated memory cell $z(t)$ is:

$$
z(t) = fg(t) \odot z(t-1) + \text{in}(t) \odot \tilde{z}(t)
$$

where $\odot$ denotes element-wise multiplication.

The output gate $ou(t)$ is:

$$
ou(t) = \sigma(W_{\text{ou}} V_L(t) + U_{\text{ou}} h_{t-1} + b_{\text{ou}})
$$

The hidden state $h_t$ is updated as:

$$
h_t = ou(t) \odot \tanh(z(t))
$$
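The gate equations above map directly onto a single LSTM step. The NumPy sketch below is a minimal illustration with randomly initialized parameters; the function names, the parameter layout, the input dimension of 9, and the five-step random sequence are assumptions, while the hidden size of 32 follows the LSTM setting used in the experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(v_t, h_prev, z_prev, params):
    """One LSTM update; returns (hidden state h_t, memory cell z_t, output gate ou_t)."""
    W, U, b = params["W"], params["U"], params["b"]
    in_gate = sigmoid(W["in"] @ v_t + U["in"] @ h_prev + b["in"])
    fg_gate = sigmoid(W["fg"] @ v_t + U["fg"] @ h_prev + b["fg"])
    z_cand = np.tanh(W["c"] @ v_t + U["c"] @ h_prev + b["c"])
    z_t = fg_gate * z_prev + in_gate * z_cand       # element-wise memory update
    ou_gate = sigmoid(W["ou"] @ v_t + U["ou"] @ h_prev + b["ou"])
    h_t = ou_gate * np.tanh(z_t)
    return h_t, z_t, ou_gate

def init_params(input_dim, hidden_dim, rng):
    """Small random weights and zero biases for the four gate blocks."""
    gates = ("in", "fg", "c", "ou")
    return {
        "W": {g: rng.normal(scale=0.1, size=(hidden_dim, input_dim)) for g in gates},
        "U": {g: rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)) for g in gates},
        "b": {g: np.zeros(hidden_dim) for g in gates},
    }

rng = np.random.default_rng(0)
params = init_params(input_dim=9, hidden_dim=32, rng=rng)
h, z = np.zeros(32), np.zeros(32)
for v_t in rng.normal(size=(5, 9)):                 # five hypothetical time slices
    h, z, ou = lstm_step(v_t, h, z, params)
```

Each input row stands in for the concatenated embedding and original features of one UAV drone at one time slice; in practice the same step would be applied per node over its full sequence.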

3.3 Computation of Threat Indicator

We process the LSTM output $ou(t+1)$ to compute the threat indicator $\text{thr}_i(t)$ for UAV drone $i$:

$$
\text{thr}_i(t) = [\mu, (1-\mu)] \begin{bmatrix} \text{Weight}_i(t) \\ ou(t+1) \end{bmatrix}
$$

where $\mu \in [0,1]$ represents the importance of the current incoming UAV drone swarm features for threat assessment. A larger $\mu$ indicates that the current situational features are more important for threat ranking, while $(1-\mu)$ reflects the importance of historical information in judging the threat of each target in the swarm.
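As a worked illustration with assumed values (not taken from the experiments), let $\mu = 0.6$, $\text{Weight}_i(t) = 0.30$, and suppose the LSTM output for the same target is the scalar $0.50$. The fusion then gives:

$$
\text{thr}_i(t) = 0.6 \times 0.30 + (1 - 0.6) \times 0.50 = 0.18 + 0.20 = 0.38
$$

With $\mu = 1$ the indicator reduces to the static value $\text{Weight}_i(t)$, and with $\mu = 0$ it depends only on the LSTM output.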

4. Weapon Strategy Optimization Model Based on Angular Coverage

Previous weapon task allocation models rarely consider the allocation mechanism of area-kill weapons, often assuming that a weapon can only attack one target at a time, which is quite different from the operational characteristics of ultrawideband HPM weapons. Such weapons can almost instantaneously damage multiple targets within their attack angle. The optimization goal of the weapon strategy is to ensure that, in the shortest possible time, more high-threat UAV drones are positioned within the weapon’s attack angle to achieve destruction. When facing multi-directional, coordinated arrivals, the allocation must not only consider the spatial states of each UAV drone at the current moment but also account for the impact of the weapon’s pursuit sequence on the final defense effect.

We propose a novel angular coverage–based weapon strategy optimization model. The objective function maximizes the total sum of threat indicators of eliminated UAV drones over all time steps. The decision variable is the weapon rotation amount $\Delta\gamma(t)$ at each time step $t$, where a left rotation is denoted as “-”, a right rotation as “+”, and no movement as $\Delta\gamma(t)=0$. The model assumes a finite number of time steps involved in the calculation (generally defaulting to the maximum flight time among the UAV drone swarm). Let $\beta$ be the attack angle of the ultrawideband HPM weapon, $\gamma(t)$ be the weapon’s pointing at time $t$, and $v$ be the weapon’s strapdown scanning speed. The proposed angular coverage–based weapon strategy optimization model is formulated as:

$$
\max \sum_{t} \sum_{j} \text{thr}_j(t_{\text{Lh}})
$$
$$
\text{s.t.} \quad \{ j \mid \theta_j(t_{\text{Lh}}) \in (\gamma(t) + \Delta\gamma(t)) \pm \beta/2 \}
$$

where $t_{\text{Lh}} = t + (|\Delta\gamma(t)| / v)$ is the future time step, $\text{thr}_j(t_{\text{Lh}})$ is the threat indicator of UAV drone $j$ at that future time, and $\theta_j(t_{\text{Lh}})$ is the azimuth angle of UAV drone $j$ at $t_{\text{Lh}}$. The constraint indicates that at the future time, the azimuth of UAV drone $j$ must lie within the attack angle range of the HPM weapon after rotation.
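To make the decision variable concrete, the sketch below scores candidate rotations for a single time step and picks the one that covers the largest total threat. It is a one-step greedy illustration, not the full multi-step optimization, and rests on several assumptions: $\beta$ is treated as the full beam width (so a target is covered when it lies within $\beta/2$ of the new pointing), angles are in degrees, rotations come from a discrete candidate set, and `azimuth` and `threat` are hypothetical callables returning $\theta_j$ and $\text{thr}_j$ at a given time.

```python
def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def covered_threat(t, gamma, delta, beta, speed, azimuth, threat, n_targets):
    """Sum of thr_j(t_Lh) over targets inside the beam after rotating by delta."""
    t_lh = t + abs(delta) / speed                   # time at which the rotation completes
    pointing = gamma + delta
    return sum(threat(j, t_lh) for j in range(n_targets)
               if abs(angle_diff(azimuth(j, t_lh), pointing)) <= beta / 2.0)

def best_rotation(t, gamma, candidates, **kw):
    """Candidate rotation with the largest covered threat (first one on ties)."""
    return max(candidates, key=lambda delta: covered_threat(t, gamma, delta, **kw))

# Hypothetical static scene: azimuths in degrees and constant threat indicators
AZ = [10.0, 15.0, 80.0]
THR = [0.5, 0.3, 0.6]
scene = dict(beta=20.0, speed=10.0, n_targets=3,
             azimuth=lambda j, t: AZ[j], threat=lambda j, t: THR[j])

best = best_rotation(t=0.0, gamma=0.0, candidates=range(-90, 91, 5), **scene)
gain = covered_threat(0.0, 0.0, best, **scene)
```

In this scene a 5 degree right rotation covers the two clustered targets for a combined threat of 0.8, which beats swinging to the single higher-threat target at 80 degrees. A full solver would chain such choices over the lookahead horizon.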

5. Simulation Experiments

We conducted simulations to validate the feasibility and effectiveness of the proposed method. Without considering UAV drone carriers or release phases, we assumed that the UAV drone swarm approached in a triangular formation. During flight, UAV drones could communicate with each other to confirm attack routes or use their own perception to plan routes. Regardless of the coordination style, all UAV drones followed basic motion and control rules to ensure no collisions along their trajectories. The time difference between any two UAV drones entering the defense zone was required to be within 10 seconds; otherwise, it would be difficult to achieve the mission of saturation attack on the target defense system. Based on a bird swarm model, we simulated scenarios with path shortcuts ranging from 0 to 100 meters and a UAV drone count of no less than 10. For example, with 40 UAV drones, the incoming trajectory map can be generated (shown conceptually in the earlier figure).

Through extensive simulations, the GAT-LSTM model parameters were set as shown in Table 2.

Table 2: Model Training Parameter Settings

Parameter	Description	Value
GAT_layer	Number of GAT layers	2
GAT_hidden	GAT hidden dimension	16
GAT_heads	Number of attention heads	8
LSTM_hidden	Number of LSTM units	32
Optimizer	Optimization algorithm	Adam
Shuffle	Shuffle training data	every-epoch
dropout	Prevent overfitting	0.6
GAT_activation	GAT activation function	ReLU
epochs	Number of training iterations	100
batch_size	Samples per training batch	16
lr	Learning rate	0.001
lookahead_steps	Decision steps into the future	5
popsize	Population size (for optimization)	30

The training convergence of the GAT-LSTM model is shown (in the analysis). The coefficient of determination is $R^2 = 0.8824$, i.e., the model explains 88.24% of the variation in the data. In the region where the true $\text{Weight} < 0.6$, the predicted values align well with the true values. Although there is some dispersion in the high-value region (e.g., true $\text{Weight} > 0.8$), the trend is preserved, demonstrating strong predictive capability for the threat indicator.
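For reference, the coefficient of determination compares the residual error with the variance of the true values. The short sketch below uses the standard definition; the function name and the four sample values are hypothetical and unrelated to the reported experiment.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical true and predicted threat indicators
r2 = r_squared([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.3])
```

A single 0.1 miss on the highest value already lowers the score to about 0.8 in this toy case, which illustrates how dispersion in the high-value region pulls $R^2$ down.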

To evaluate the performance of the weapon strategy optimization, we define three indicators: detection rate $dRate$, weighted coverage $dCover$, and average detection time $dTime$. Let $N_{\text{cover}}$ be the number of detected UAV drones, $t_{\text{Cover}_i}$ be the detection time of the $i$-th UAV drone, and $\text{thr}_i(t_{\text{Cover}_i})$ be its threat indicator at detection. The indicators are calculated as:

$$
dRate = \frac{N_{\text{cover}}}{N}
$$
$$
dCover = \frac{\sum_{i=1}^{N_{\text{cover}}} \text{thr}_i(t_{\text{Cover}_i})}{\sum_{i=1}^{N} \sum_{t} \text{thr}_i(t)}
$$
$$
dTime = \frac{\sum_{i=1}^{N_{\text{cover}}} t_{\text{Cover}_i}}{N_{\text{cover}}}
$$
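The three indicators can be computed from a record of which UAV drones were covered and when. The sketch below follows the formulas as written; the function name, the data layout (a dictionary of detection time steps and a per-drone list of threat values), and the sample numbers are assumptions for illustration.

```python
def coverage_metrics(cover_times, thr, n_total):
    """Return (dRate, dCover, dTime).

    cover_times maps a drone index to the time step at which it was detected;
    thr[i][t] is the threat indicator of drone i at time step t.
    """
    n_cover = len(cover_times)
    d_rate = n_cover / n_total
    covered = sum(thr[i][t] for i, t in cover_times.items())
    total = sum(sum(row) for row in thr)            # threat summed over all drones and steps
    d_cover = covered / total
    d_time = sum(cover_times.values()) / n_cover if n_cover else float("nan")
    return d_rate, d_cover, d_time

# Hypothetical run: three drones, two time steps, drones 0 and 2 detected
thr = [[0.2, 0.3],
       [0.1, 0.1],
       [0.2, 0.1]]
d_rate, d_cover, d_time = coverage_metrics({0: 1, 2: 0}, thr, n_total=3)
```

Because the denominator of $dCover$ sums threat over every time step, its value depends on the length of the evaluation horizon, so comparisons are meaningful only between runs over the same horizon.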

The core of our optimization model is to select the weapon action sequence and determine how much to rotate at each step. To find a suitable solver, we applied seven mainstream optimization algorithms (greedy, ant colony optimization, particle swarm optimization, differential evolution, genetic algorithm, simulated annealing, and tabu search) to the angular coverage model and compared the results. Both our method and the static threat assessment method detected all 12 UAV drones, but our method achieved a weighted coverage $dCover$ of no less than 0.68 across all algorithms, consistently higher than the static method, meaning it is more effective at eliminating high-threat targets within the swarm. In terms of average detection time, our method was faster than the static method with the ant colony, differential evolution, genetic, and simulated annealing algorithms; with particle swarm, greedy, and tabu search, its detection times were comparable to or slightly longer than those of the static method.

This divergence stems from the adaptation between algorithm mechanism and problem characteristics. Our method outputs threat assessment results with higher discrimination—i.e., the threat values are more significantly differentiated. This is beneficial for algorithms that rely on population diversity information for searching (e.g., differential evolution, genetic algorithm) and for algorithms that rely on positive feedback learning (e.g., ant colony). High discrimination accelerates learning, selection, and convergence. However, for algorithms that depend on single-trajectory local search (e.g., greedy, tabu search), higher discrimination may increase the decision cost of falling into suboptimal paths. For algorithms that rely on historical best guidance (e.g., particle swarm), dynamically changing threat assessments may interfere with stable convergence direction, weakening speed advantages.

Based on experimental results, if the operational goal prioritizes maximizing the elimination of high-threat targets (i.e., ensuring weighted coverage), the greedy algorithm can still be a preferred solution method due to its simplicity and acceptable performance in this context.

6. Conclusion

Through temporal discretization of the incoming UAV drone swarm scenario, we fused GAT and LSTM to mine the spatiotemporal features during the incoming process, achieving dynamic differentiated assessment of the threat of each node in an adaptive swarm of UAV drones. Based on the dynamic threat assessment values, we established an area-kill weapon strategy optimization model and conducted experiments with mainstream optimization algorithms. The experimental results demonstrate:

1. Our method can effectively distinguish the threat levels of individual targets within the UAV drone swarm, providing critical information for interception decisions.
2. The established weapon strategy optimization model supports the generation of sequential rotation decisions, enabling efficient and orderly coverage of the UAV drone swarm by the ultrawideband HPM weapon.
3. The decision effectiveness of our proposed approach is higher than that of the static threat assessment method when facing multi-UAV drone incoming scenarios, making it more conducive to supporting HPM weapon counter-UAV drone swarm operational decisions.

Future work will further expand data samples by incorporating UAV drone swarm motion models such as the Vicsek model and Couzin moving model for mixed neural network training, thereby improving the generalization of the proposed method to a wider range of operational scenarios.