We propose a novel task assignment algorithm for heterogeneous drone swarms, specifically designed to address the challenges of multi-UAV cooperation in complex and resource‑limited airspace environments. Our method integrates task region screening, preference computation, and stable matching theory, and is crafted with the growing needs of China drone operations in mind. Through systematic analysis and extensive simulations, we demonstrate that our approach significantly improves allocation efficiency, coalition stability, and overall mission benefits compared to traditional methods.
1. Introduction
The rapid advancement of unmanned aerial vehicle (UAV) technology has enabled a wide range of applications, from environmental monitoring to military reconnaissance. In many practical scenarios, especially within China drone ecosystems, single‑UAV systems often fall short due to limited coverage and low fault tolerance. Multi‑UAV systems, on the other hand, can collaborate through dynamic coalition formation to achieve complex tasks. However, the allocation of heterogeneous UAVs to tasks remains a critical and challenging problem. Existing methods, such as auction‑based or marginal‑benefit approaches, often suffer from high computational complexity, frequent coalition restructuring, and suboptimal global reward.
In this work, we propose a game‑theoretic task assignment method that combines task‑region screening, preference calculation, and stable matching based on the Gale‑Shapley algorithm. Our contributions are threefold:
- We design a task‑region screening mechanism that filters candidate UAVs based on environmental difficulty and task requirements, reducing the problem size and improving computational efficiency.
- We introduce a preference‑based initial assignment followed by a Gale‑Shapley stable matching to ensure stability, and then optimize coalition composition using Shapley value‑based pruning and secondary reallocation.
- We evaluate our method against two benchmark algorithms (random marginal benefit and greedy auction) under various scales. The results show that our method achieves over 90% of the optimal coalition reward, with fewer iterations and shorter runtime, making it well suited for large‑scale China drone applications.
2. Problem Model
Consider a three‑dimensional airspace containing a set of heterogeneous UAVs \(U = \{u_1, u_2, \dots, u_n\}\) and a set of tasks \(R = \{R_1, R_2, \dots, R_m\}\), with \(n > m\). Each task \(R_i\) is defined as a spherical region centered at \((x_i, y_i, z_i)\) with radius \(r_i\):
$$
R_i = \{(x, y, z) \ | \ \|(x,y,z)-(x_i,y_i,z_i)\| \le r_i,\ z \ge 0\}.
$$
The environment may contain obstacles; their quantity and positions influence the task difficulty. Let the environmental cost function be:
$$
H(x,y,z) = \sum_{i=1}^{N} \frac{\kappa_i}{(x-x_i)^2 + (y-y_i)^2 + (z-z_i)^2},
$$
where \(\kappa_i\) is the intensity factor of the \(i\)-th obstacle and \(N\) is the number of obstacles.
2.1 Task Region Screening
For each task \(R_j\), we compute a dynamic screening radius:
$$
d_j = D(d_j) = G(f_j)\cdot t, \qquad G(f_j) = \frac{1}{1+H(P_j)},
$$
where \(t\) is an initial radius parameter and \(P_j\) is the task center. UAVs within distance \(d_j\) form an initial candidate set \(U_j = \{u_i \in U \ |\ \|u_i – P_j\| \le d_j\}\). If \(|U_j|\) exceeds the maximum capacity \(C_{\max}(R_j)\), we select the top \(k\) UAVs according to their capability function \(p(u_i)\):
$$
U’_j = \text{top}_k\{p(u_i)\ |\ u_i \in U_j\}.
$$
If \(|U_j|\) is too small, we expand the radius by a factor \(\varepsilon\):
$$
d_j = (1+\varepsilon)\, d_j.
$$
The global initial UAV set is the union over all tasks:
$$
U_{\text{int}} = \bigcup_{j=1}^{m} U’_j.
$$
2.2 Preference Calculation
Each UAV \(u_i\) in \(U_{\text{int}}\) computes a preference value for each task \(R_j\):
$$
Y(u_i, R_j) = T\big(P(u_i,R_j) + H(p_j) + F(u_i,R_j)\big),
$$
where \(T\) denotes normalization, and
$$
P(u_i,R_j) = w_1\frac{E(u_i)}{E_{\max}} + w_2\frac{v(u_i)}{v_{\max}} + w_3 A(u_i,R_j),
$$
$$
F(u_i,R_j) = F_{\text{reward}}(R_j) – F_{\text{cost}}(u_i,R_j),
$$
$$
F_{\text{cost}}(u_i,R_j) = c_1\,d(u_i,R_j) + c_2\frac{d(u_i,R_j)}{E(u_i)}.
$$
Here \(E(u_i)\) is the endurance, \(v(u_i)\) velocity, \(A(u_i,R_j)\) an adaptability factor (function of historical assignment), and \(F_{\text{reward}}(R_j)\) the task reward. The coefficients satisfy \(w_1 \ge w_2 \ge w_3\).
2.3 Coalition Utility Function
The total utility of a coalition \(C_j\) assigned to task \(R_j\) is:
$$
S(R_j) = Q(R_j) + V(R_j) – K(C_j,R_j) – \gamma H(p_j).
$$
The cooperation reward term is:
$$
V(R_j) =
\begin{cases}
\alpha\cdot\beta(R_j)\cdot(\text{Cap}(C_j) – Q(R_j)), & \text{Cap}(C_j) > Q(R_j),\\
0, & \text{otherwise},
\end{cases}
$$
$$
\beta(R_j) = \lambda_1 Q(R_j) + \lambda_2 H(p_j),
$$
and the penalty for excessively large capacity is:
$$
K(C_j,R_j) = \mu\cdot\max(0,\ \text{Cap}(C_j) – \lambda_3 Q(R_j))^2.
$$
The total coalition capability is:
$$
\text{Cap}(C_i) = \sum_{u_j \in C_i} P(u_j,R_i) – H(p_i).
$$
Combining all, we obtain the final utility function:
$$
S(R_j) = Q(R_j) + \alpha\max(0,\beta(R_j)\cdot\text{Cap}(C_j)-Q(R_j)) – \mu\max(0,\text{Cap}(C_j)-\lambda_3 Q(R_j))^2 + \gamma H(p_j).
$$
3. Algorithm
3.1 Task Coalition Screening
Algorithm 1 performs the screening phase, as described in Section 2.1.
| Step | Description |
|---|---|
| 1 | Initialize task set \(T\) and UAV set \(U\). |
| 2 | For each task \(s=1\) to \(T_n\): |
| 3 | Compute screening radius \(d_s\) using (4) and (5). Obtain \(U_s = \{u_i \in D_s\}\). |
| 4 | If \(|U_s| > U_{\max}\): select top \(k\) UAVs via (6) → \(U’_s\). |
| 5 | Else if \(|U_s| \le U_{\min}\): expand radius using (7), update set \(U’_s\). |
| 6 | \(U_{\text{int}} = \bigcup_{s=1}^m U’_s\). |
3.2 Stable Matching Allocation
Algorithm 2 performs the preference‑based assignment and coalition optimization.
| Step | Description |
|---|---|
| 1 | Input \(U_{\text{int}}\) and tasks \(T_{\text{int}}\). |
| 2 | For each UAV \(u_i\) in \(U_{\text{int}}\): compute preference list \(Y_i\) using (9); assign to task with highest preference. |
| 3 | For each task \(R_j\): evaluate coalition utility \(S(R_j)\); if constraints violated, compute Shapley values \(\phi(u_i)\) for all members. |
| 4 | Remove UAV with smallest \(\phi(u_i)\) and add to leisure list \(U_{\text{leisure}}\). |
| 5 | While \(U_{\text{leisure}} \ne \emptyset\): pick a random UAV, try to assign it to the next preferred task; accept if coalition utility increases; update coalitions. |
The Shapley value of a UAV \(u_i\) in coalition \(C_j\) is defined as:
$$
\phi(u_i) = \sum_{S \subseteq C_j \setminus \{u_i\}} \frac{|S|!\,(|C_j|-|S|-1)!}{|C_j|!} \cdot \Delta S(u_i,S),
$$
with \(\Delta S(u_i,S) = S(R_j,\,S\cup\{u_i\}) – S(R_j,S)\).
3.3 Stability Analysis
We prove that under strict preference lists, the Gale‑Shapley algorithm always yields a stable matching even in the many‑to‑one setting with capacity constraints. UAVs propose to tasks in decreasing order of preference; tasks temporarily hold the best proposals. The iterative removal of low‑contribution UAVs (via Shapley value) and reallocation preserves stability. The process terminates at a Nash equilibrium where no UAV can unilaterally improve its payoff by switching tasks.
4. Experimental Results
We conducted simulations using a Python environment. The airspace contained 10–120 UAVs and 5–60 tasks. Parameters were set as: coalition reward coefficient \(\alpha = 0.5\), penalty coefficient \(\mu = 0.2\), screening radius scaling \(\varepsilon = 0.1\). Three algorithms were compared: our proposed Optimized Stable Matching (OSM), Random Marginal Benefit (RMB), and Greedy Auction (GA).
4.1 Small‑Scale Scenario
We first tested with 10 UAVs and 3 tasks. The optimal coalition structure was \(C_1 = \{u_1,u_5\},\ C_2 = \{u_3,u_4\},\ C_3 = \{u_6,u_8\}\) yielding total reward 123.69. Table 3 and Table 4 show initial and final allocations.
| Algorithm | Task | Initial Coalition | Reward |
|---|---|---|---|
| OSM | 1 | {1,2,4,5,6,7,8,9,10} | -18.38 |
| OSM | 2 | {2} | 18.53 |
| OSM | 3 | – | 0 |
| RMB | 1 | {4,5} | 27.8 |
| RMB | 2 | {3,6,7,8,10} | 12.96 |
| RMB | 3 | {1,2,9} | 30.20 |
| GA | 1 | – | 0 |
| GA | 2 | – | 0 |
| GA | 3 | – | 0 |
| Algorithm | Task | Final Coalition | Reward |
|---|---|---|---|
| OSM | 1 | {4,5} | 29.32 |
| OSM | 2 | {3,8} | 47.30 |
| OSM | 3 | {6,7} | 42.13 |
| RMB | 1 | {1,5} | 35.83 |
| RMB | 2 | {3,4} | 47.25 |
| RMB | 3 | {6,8} | 40.61 |
| GA | 1 | {1} | 11.75 |
| GA | 2 | {3} | 15.81 |
| GA | 3 | {6,7} | 42.07 |
Our OSM algorithm achieved total reward 118.75 (96% of optimal), RMB reached 123.69 (optimal) but required many iterations, while GA only obtained 69.63. OSM also had the lowest structural change count (5 changes) versus RMB (14) and GA (4). This demonstrates OSM’s stability and near‑optimal performance.
4.2 Large‑Scale Experiments
We tested with 25–120 UAVs and 5–60 tasks. Table 5 summarizes average coalition changes and runtime over 20 runs per configuration.
| Scenario | Algorithm | Changes | Runtime (s) |
|---|---|---|---|
| U25‑T5 | OSM | 7 | 0.31 |
| RMB | 261.89 | 0.31 | |
| GA | 8 | 0.27 | |
| U40‑T10 | OSM | 22 | 1.37 |
| RMB | 54 | 6.31 | |
| GA | 10 | 1.38 | |
| U90‑T30 | OSM | 31 | 3.67 |
| RMB | 61 | 12.27 | |
| GA | 22 | 1.91 | |
| U120‑T60 | OSM | 43 | 4.61 |
| RMB | 105 | 29.34 | |
| GA | 38 | 2.66 |
OSM consistently required fewer coalition changes and maintained low runtime, especially in large‑scale scenarios. RMB’s change count grew dramatically, and GA, although fast, produced much lower total rewards. Figure 1 (below) illustrates the total reward comparison across multiple scales.

The figure confirms that OSM’s total reward remains above 90% of the optimal (computed by exhaustive search for small cases). For large problems with 120 UAVs and 60 tasks, OSM outperformed both GA and RMB in reward, while maintaining the best stability and runtime.
5. Conclusion
We have presented a novel game‑theoretic task assignment algorithm for heterogeneous UAV swarms. By combining task‑region screening, preference‑based Gale‑Shapley matching, and Shapley‑value coalition optimization, our method achieves efficient, stable, and near‑optimal allocations. Extensive experiments demonstrate that our algorithm significantly reduces computational overhead and coalition restructuring, making it highly suitable for large‑scale dynamic environments. As China drone technologies continue to evolve, such intelligent allocation mechanisms will be crucial for real‑world applications including disaster response, surveillance, and logistics. Future work will incorporate deep learning for online preference learning and consider communication constraints to further enhance practicality.
