Game Task Assignment for Heterogeneous Drone Swarms

In the rapidly evolving field of drone technology, the coordination of multiple heterogeneous unmanned aerial vehicles (UAVs) has become a cornerstone for complex mission execution in dynamic environments. The task allocation problem in heterogeneous drone swarms presents significant challenges due to the diversity of drone capabilities, varying task requirements, and the need for real-time adaptability. We propose a novel task allocation algorithm that integrates task screening, preference computation, and stable matching theory to address these challenges effectively.

Our approach begins with a task region screening mechanism that filters potential UAV candidates based on mission requirements and environmental conditions within the airspace. This mechanism drastically reduces computational complexity by narrowing down the candidate set, thereby improving allocation efficiency. Subsequently, we compute UAV preference values for each task using a multi-factorial formula that accounts for drone capability, environmental harshness, and task reward. These preferences are then used to form an initial assignment, which is refined through the Gale-Shapley stable matching framework to ensure rationality and stability. In the coalition formation phase, we employ Shapley value analysis to dynamically optimize coalition gains by removing drones with low marginal contributions and reallocating them through a secondary assignment process to maximize task completion rates.

Experimental results demonstrate that our method outperforms random allocation, auction-based algorithms, and marginal utility approaches in terms of computational efficiency, total coalition benefit, and structural stability. The proposed algorithm achieves over 90% of the optimal coalition benefit while requiring fewer iterations and shorter runtime, making it highly suitable for large-scale, dynamic task scenarios in modern drone technology applications.

Introduction

The proliferation of drone technology has enabled unprecedented capabilities in environmental monitoring, disaster response, military reconnaissance, and logistics. However, single-UAV systems often fall short in covering extensive areas, handling complex tasks, or adapting to rapidly changing conditions. Multi-UAV systems, leveraging distributed coordination and task specialization, offer a solution by forming dynamic coalitions to execute missions collaboratively. The core challenge lies in efficiently assigning heterogeneous UAVs to tasks while optimizing overall system benefit and maintaining stability under constraints.

Existing research has explored various approaches, including auction-based methods, marginal utility algorithms, and game-theoretic frameworks. Auction algorithms suffer from local optima and lack of cooperation, while marginal utility methods require numerous iterations and frequent coalition restructuring. Game theory, particularly coalition formation games, provides a robust foundation for modeling UAV interactions, but often overlooks the impact of task environment heterogeneity. Our work bridges this gap by introducing a task-driven screening mechanism and a stable matching process that accounts for both drone heterogeneity and environmental complexity.

The main contributions of our paper are as follows:

We propose a task region screening method that filters potential UAV candidates based on task demands and environmental factors, reducing computational load and improving allocation efficiency.
We develop a UAV preference computation formula that integrates drone capability, task reward, and environmental cost, enabling rational initial assignment under the Gale-Shapley stable matching framework.
We introduce a secondary allocation strategy for removed UAVs, optimizing task completion rates by reallocating low-marginal-contribution drones to other coalitions.
We evaluate the proposed algorithm against random, auction, and marginal utility baselines, demonstrating superior performance in benefit, stability, and runtime across various scales.

The remainder of this paper is organized as follows: Section 2 presents the system model and problem formulation. Section 3 details the proposed task allocation algorithm. Section 4 discusses the Gale-Shapley stability analysis. Section 5 reports experimental results. Section 6 concludes the paper.

System Model and Problem Formulation

Airspace and Task Representation

We consider a three-dimensional airspace $\xi$ defined as:

$$
\xi = \{(x,y,z) \mid x_{\min}\leq x\leq x_{\max},\; y_{\min}\leq y\leq y_{\max},\; z_{\min}\leq z\leq z_{\max}\},\quad \xi\subseteq\mathbb{R}^3
$$

Within this airspace, we have a set of $N$ UAVs $\mathcal{U} = \{u_1, u_2, \dots, u_n\}$ and a set of $M$ tasks $\mathcal{R} = \{R_1, R_2, \dots, R_m\}$, with $N > M$. Each task $R_i$ is defined as a spherical region in space:

$$
R_i = \{(x,y,z) \mid \|(x,y,z)-(x_i,y_i,z_i)\| \leq r_i,\; z\geq0\}
$$

where $(x_i,y_i,z_i)$ is the task center and $r_i$ the radius. The environment within each task region may contain obstacles, whose number and distribution affect the difficulty of task execution.

Environmental Cost

For a UAV $u_i$ executing task $R_j$, the environmental cost $H(p_j)$ is computed based on the Euclidean distances to obstacles and their intensity factors:

$$
H(x,y,z) = \sum_{i=1}^{N_{\text{obs}}} \frac{\kappa_i}{(x-x_i)^2 + (y-y_i)^2 + (z-z_i)^2}
$$

where $N_{\text{obs}}$ is the number of obstacles in the task region, $(x_i,y_i,z_i)$ are obstacle positions, and $\kappa_i$ are intensity coefficients. Higher $H$ indicates more severe environmental conditions that increase task difficulty and reduce UAV efficiency.

Task Region Screening

To reduce the search space, we define a screening radius $d_j$ for each task $R_j$:

$$
d_j = D(d_j) = G(f_j) \cdot t
$$
$$
G(f_j) = \frac{1}{1 + H(P_j)}
$$

Here, $t$ is an initial radius parameter, and $G(f_j)$ is a task complexity coefficient inversely related to the environmental cost $H(P_j)$ at the task center $P_j$. For low-difficulty tasks, the screening radius is enlarged to consider more distant UAVs; for high-difficulty tasks, the radius shrinks to prioritize nearby drones.

Within the screening radius $d_j$, we select the set of candidate UAVs:

$$
\mathcal{U}_j = \{u_i \in \mathcal{U} \mid \|u_i – P_j\| \leq d_j\}
$$

If $|\mathcal{U}_j|$ exceeds the maximum capacity $C_{\max}(R_j)$, we retain only the top $k$ UAVs based on their capability function $p(u_i)$:

$$
\mathcal{U}’_j = \text{topk}_j\{\,p(u_i) \mid u_i \in \mathcal{U}_j\}
$$

If $|\mathcal{U}_j|$ is insufficient, we expand the radius:

$$
d_j = (1+\varepsilon)\,d_j
$$

where $\varepsilon$ is a scaling factor. The overall initial UAV pool is the union of all screened sets:

$$
\mathcal{U}_{\text{int}} = \bigcup_{j=1}^m \mathcal{U}’_j
$$

UAV Preference Calculation

Each UAV $u_i \in \mathcal{U}_{\text{int}}$ computes a preference value for each task $R_j$ using the following function:

$$
Y(u_i, R_j) = \mathcal{T}\left( P(u_i,R_j) + H(p_j) + F(u_i,R_j) \right)
$$

where $\mathcal{T}$ denotes normalization. The capability evaluation $P(u_i,R_j)$ is defined as:

$$
P(u_i,R_j) = w_1 \cdot \frac{E(u_i)}{E_{\max}} + w_2 \cdot \frac{v(u_i)}{v_{\max}} + w_3 \cdot A(u_i,R_j)
$$

Here, $E(u_i)$ is the endurance capability, $v(u_i)$ the speed, $A(u_i,R_j)$ the task adaptability (based on historical assignment frequency and screening hits), and $w_1,w_2,w_3$ are weights with $w_1 \geq w_2 \geq w_3$. The net benefit $F(u_i,R_j)$ is:

$$
F(u_i,R_j) = F_{\text{reward}}(R_j) – F_{\text{cost}}(u_i,R_j)
$$
$$
F_{\text{cost}}(u_i,R_j) = c_1 \cdot d(u_i,R_j) + c_2 \cdot \frac{d(u_i,R_j)}{E(u_i)}
$$

where $d(u_i,R_j)$ is the Euclidean distance from UAV to task center, and $c_1,c_2$ are cost weights. The preference list $\text{T}(u_i)$ is constructed by sorting all tasks in descending order of $Y(u_i,R_j)$. Each UAV initially selects its most preferred task, forming an initial coalition allocation.

Coalition Benefit Function

The total benefit $S(R_j)$ of a coalition $\mathcal{C}_j$ assigned to task $R_j$ is given by:

$$
S(R_j) = Q(R_j) + V(R_j) – K(\mathcal{C}_j,R_j) – \gamma\,H(p_j)
$$

subject to constraints:

$$
\begin{cases}
\text{Cap}(\mathcal{C}_j) \geq Q(R_j) \\
|\mathcal{C}_j| \leq C_{\max}(R_j) \\
\mathcal{C}_i \cap \mathcal{C}_j = \emptyset,\quad i\neq j
\end{cases}
$$

where $Q(R_j)$ is the task workload, $V(R_j)$ is the cooperation reward, $K(\mathcal{C}_j,R_j)$ is a penalty for overcapacity, and $\gamma$ is the environmental weight. The coalition capability is:

$$
\text{Cap}(\mathcal{C}_i) = \sum_{u_j \in \mathcal{C}_i} P(u_j,R_i) – H(p_i)
$$

The cooperation reward is defined as:

$$
V(R_j) =
\begin{cases}
\alpha \cdot \beta(R_j) \cdot (\text{Cap}(\mathcal{C}_j) – Q(R_j)), & \text{if } \text{Cap}(\mathcal{C}_j) > Q(R_j) \\
0, & \text{otherwise}
\end{cases}
$$
$$
\beta(R_j) = \lambda_1 \cdot Q(R_j) + \lambda_2 \cdot H(p_j)
$$

The penalty for excessive capacity is:

$$
K(\mathcal{C}_j,R_j) = \mu \cdot \max(0,\; \text{Cap}(\mathcal{C}_j) – \lambda_3 \cdot Q(R_j))^2
$$

Combining these, the final benefit function becomes:

$$
S(R_j) = Q(R_j) + \alpha \cdot \max(0,\; \beta(R_j)\cdot\text{Cap}(\mathcal{C}_j) – Q(R_j)) – \mu \cdot \max(0,\; \text{Cap}(\mathcal{C}_j) – \lambda_3 Q(R_j))^2 + \gamma H(p_j)
$$

The objective is to maximize the sum of benefits over all tasks:

$$
\arg\max \sum_{R_j \in \mathcal{R}} S(R_j)
$$

Proposed Algorithm

Task Coalition Screening Algorithm

The screening algorithm reduces the problem size by filtering UAVs based on task proximity and capacity. The pseudocode is presented in the table below.

Algorithm 1: Task Coalition Screening
Step	Description
1	Initialize task set $\mathcal{T}$, UAV set $\mathcal{U}$
2	For each task $s$ from 1 to $T_n$:
3	Compute screening radius $d_s$ using Equations (4) and (5). Obtain $\mathcal{U}_s = \{u_i \in D_s\}$
4	If $\|\mathcal{U}_s\| > U_{\max}$:
5	Select top $k$ UAVs via $\mathcal{U}’_s = \text{topk}_j\{p(u_i) \mid u_i \in \mathcal{U}_s\}$
6	Else if $\|\mathcal{U}_s\| \leq U_{\min}$:
7	Expand radius: $d’_s$, update set $\mathcal{U}’_s = \{u_i \in D’_s\}$
8	Obtain initial pool $\mathcal{U}_{\text{int}} = \bigcup_{s=1}^m \mathcal{U}’_s$

Stable Matching Algorithm Based on Preference Computation

After screening, we compute preference lists for all UAVs and apply the Gale-Shapley stable matching with coalition optimization. The algorithm proceeds as follows:

Algorithm 2: Stable Matching Allocation
Step	Description
1	Obtain $\mathcal{U}_{\text{int}}$ and $\mathcal{T}_{\text{int}}$ from Algorithm 1
2–6	For each UAV $u_{\text{item}} \in \mathcal{U}_{\text{int}}$, compute preference $Y(u_i,R_j)$ and select top task $T_{\text{item}}$.
7–13	For each task $R_{\text{item}}$, compute coalition benefit $S(R_j)$ and Shapley values $\phi(u_i)$ for all members. Remove the UAV with minimum $\phi(u_i)$ and add it to leisure queue $\mathcal{U}_{\text{leisure}}$.
14–25	While $\mathcal{U}_{\text{leisure}} \neq \emptyset$: randomly select $u_i$, pick next-preferred task from its list, compute new coalition benefit, and update if benefit increases. Continue until stable.

The Shapley value $\phi(u_i)$ for UAV $u_i$ within coalition $\mathcal{C}_j$ is:

$$
\phi(u_i) = \sum_{S \subseteq \mathcal{C}_j \setminus \{u_i\}} \frac{|S|!\,(|\mathcal{C}_j|-|S|-1)!}{|\mathcal{C}_j|!} \cdot \Delta S(u_i,S)
$$
$$
\Delta S(u_i,S) = S(R_j, S \cup \{u_i\}) – S(R_j, S)
$$

Removing the UAV with the smallest Shapley value improves coalition efficiency. The removed UAVs are reallocated in a secondary process, where each UAV attempts to join the next best task that yields a positive marginal benefit. This iterative process continues until no further improvements can be made, at which point the system reaches a Nash equilibrium.

Gale-Shapley Stability Analysis

Theorem: In any two-sided matching problem with strict preference lists, a stable matching exists even under many-to-one matching with additional capacity constraints.

Proof sketch: The Gale-Shapley algorithm guarantees a stable matching when participants propose in order of descending preference. In our setting, UAVs propose to tasks based on their computed $Y(u_i,R_j)$ values, and tasks accept the most preferred UAVs according to the coalition benefit $S(R_j)$. Tasks always retain the best set of proposers up to their capacity. Since UAVs only propose to tasks they prefer more than their current match, and tasks never swap a higher-ranked UAV for a lower-ranked one, no blocking pair can exist. The secondary allocation phase maintains this property by ensuring that reallocation only occurs when it strictly improves the coalition benefit. Thus, a stable matching is guaranteed.

Experimental Results

We conducted extensive simulations to evaluate the proposed algorithm (denoted Optimize) against three baselines: Greedy Auction (GA), Random Marginal Utility (RMU), and a pure Auction algorithm. Experiments were performed on a PC with AMD Ryzen 5 4600H and 8GB RAM using Python 3. All results are averaged over 20 independent runs.

Parameter Settings

For the experimental scenarios, we considered a three-dimensional airspace containing 3 to 8 tasks and 10 to 120 UAVs. Key parameters are summarized below.

UAV Parameters
UAV ID	Position	Endurance	Load (km)	Adaptability
1	(1.9,3.0,3.2)	1.1	16	0.70
2	(1.7,3.4,3.5)	1.0	15	0.92
3	(2.8,3.2,3.0)	1.4	25	0.55
4	(3.3,3.1,3.4)	1.5	26	0.80
5	(3.0,3.4,3.3)	1.2	24	0.97
6	(4.0,3.3,3.2)	1.6	21	0.78
7	(4.2,3.7,3.4)	1.0	20	0.80
8	(3.9,3.3,3.7)	1.6	22	0.85
9	(3.8,3.2,3.8)	1.4	20	0.90
10	(3.6,3.0,3.9)	1.3	23	0.80

Task Parameters
Task ID	Position	Workload	Reward	Capacity
1	(2.0,3.0,3.0)	35	5	3
2	(3.0,3.2,3.2)	45	8	4
3	(4.2,3.4,3.3)	42	6	4

Total Benefit Comparison

Figure 4 (not shown due to no picture references) illustrates the total benefit over algorithm iterations for the small-scale scenario with 10 UAVs and 3 tasks. The proposed Optimize algorithm achieves a final benefit of 118.75, closely approaching the optimal value of 123.69 obtained by the exhaustive marginal utility method after many iterations. The Auction algorithm yields only 70.93 due to its greedy nature. The Optimize algorithm converges in 5 coalition structure changes, whereas RMU requires 14 changes, indicating higher stability.

Table 6 and 7 show the initial and final coalition structures for the three algorithms. Our method consistently produces near-optimal coalitions, e.g., Task 1: {4,5}, Task 2: {3,8}, Task 3: {6,7} with benefits 29.32, 47.30, 42.13 respectively, totaling 118.75. The optimal solution (from RMU after 500 iterations) is {1,5}, {3,4}, {6,8} with total 123.69. Our algorithm achieves over 96% of the optimal benefit with far fewer iterations.

Initial Coalition Distribution
Algorithm	Task	Initial Structure	Benefit
Optimize	1	{1,2,4,5,6,7,8,9,10}	-18.38
	2	{2}	18.53
	3	–	0
RMU	1	{4,5}	27.80
	2	{3,6,7,8,10}	12.96
	3	{1,2,9}	30.20
Auction	1	–	0
	2	–	0
	3	–	0

Final Coalition Distribution
Algorithm	Task	Final Structure	Benefit
Optimize	1	{4,5}	29.32
	2	{3,8}	47.30
	3	{6,7}	42.13
RMU	1	{1,5}	35.83
	2	{3,4}	47.25
	3	{6,8}	40.61
Auction	1	{1}	11.75
	2	{3}	15.81
	3	{6,7}	42.07

Scalability Analysis

We tested the algorithms with varying numbers of UAVs and tasks: 25–120 UAVs and 5–60 tasks. The results are shown in the following tables (averaged over 20 runs).

Multi-Scale Data Test Results
Algorithm	Scenario (UAVs-Tasks)	Coalition Changes	Runtime (s)
Optimize	25-5	7	0.31
	40-10	22	1.37
	90-30	31	3.67
	120-60	43	4.61
RMU	25-5	261.89	0.31
	40-10	54	6.31
	90-30	61	12.27
	120-60	105	29.34
Auction	25-5	8	0.27
	40-10	10	1.38
	90-30	22	1.91
	120-60	38	2.66

Our algorithm exhibits a gradual increase in coalition changes and runtime with scale, while RMU shows explosive growth due to its exhaustive local search. Auction methods remain fast due to simplicity but yield poor overall benefits. In the largest scenario (120 UAVs, 60 tasks), the Optimize algorithm achieves a total benefit exceeding 90% of the theoretical optimum (obtained by enumerating all possible coalitions for small subsets), whereas Auction only reaches about 60% and RMU fails to converge within the allocated iteration limit (500 iterations).

Figure 5 (not displayed) compares the total benefit across different UAV-to-task ratios. Our algorithm consistently outperforms the baselines, especially when the ratio is 3:1 or 2:1, which are typical in realistic drone technology deployments. The benefit margin widens as the problem size increases, confirming the scalability advantage of our approach.

Conclusion

We have presented a novel task allocation algorithm for heterogeneous drone swarms that integrates task screening, preference computation, and stable matching theory. By leveraging Shapley value-based coalition optimization and secondary reallocation, our method achieves high total benefit, structural stability, and computational efficiency. Experimental results demonstrate that the algorithm attains over 90% of the optimal coalition benefit while requiring significantly fewer iterations than marginal utility methods and offering far greater benefit than auction-based algorithms. The approach is particularly effective for large-scale dynamic task environments, making it a promising solution for next-generation drone technology systems requiring real-time, adaptive task assignment.

Future work will explore the integration of deep reinforcement learning to further enhance adaptability under uncertainty, as well as incorporate communication constraints and energy consumption models to extend the applicability to real-world operations. Large-scale field trials are also planned to validate the engineering feasibility of the proposed algorithm.