Multi-UAV Task Allocation for Emergency Rescue Using Improved Grey Wolf Optimization

In recent years, the application of drone technology in emergency response scenarios has gained significant attention due to its ability to rapidly address road incidents such as accidents, vehicle failures, road damage, and natural disasters. These events often occur simultaneously across multiple road segments, posing substantial challenges to rescue operations. Unmanned Aerial Vehicles (UAVs), as a modern technological tool, have played a crucial role in enhancing the efficiency of emergency rescue missions. This article focuses on the task allocation problem for multiple UAVs engaged in reconnaissance and information collection at incident points, referred to as “event points.” The primary objective is to optimize the assignment of tasks to UAVs, ensuring minimal flight distance and the least number of drones required, thereby improving the overall effectiveness of emergency response efforts.

The task allocation problem for multiple UAVs is inherently complex, often modeled as a Multiple Traveling Salesperson Problem (MTSP). In this context, each UAV departs from a central emergency rescue center, visits a subset of event points to perform intelligence, surveillance, and reconnaissance (ISR) tasks, and returns to the base. The challenges include ensuring that each event point is visited exactly once, adhering to UAV endurance constraints, and minimizing both the total flight path and the number of UAVs deployed. This problem is critical in real-world scenarios where time and resources are limited, and efficient task allocation can significantly impact rescue outcomes. The integration of advanced optimization algorithms, such as those inspired by swarm intelligence, has shown promise in addressing these challenges, leading to more robust and scalable solutions for multi-UAV systems.

To formalize the problem, a mathematical model is developed based on the MTSP framework. Let $ m $ denote the total number of UAVs, indexed by $ k = 1, 2, \ldots, m $, and $ n $ represent the total number of event points, indexed by $ i, j = 1, 2, \ldots, n $. The central rescue center is designated as point 0. Key parameters include $ T_k^{\text{max}} $, the maximum endurance time for UAV $ k $; $ s_i $, the ISR time required at event point $ i $; $ d_{ij} $, the distance between event points $ i $ and $ j $; $ t_{ij} $, the travel time between $ i $ and $ j $; and $ v_k $, the velocity of UAV $ k $. The decision variables are $ x_{ijk} $, a binary variable that equals 1 if UAV $ k $ travels from event point $ i $ to $ j $, and 0 otherwise; and $ y_{ik} $, a binary variable that equals 1 if event point $ i $ is assigned to UAV $ k $, and 0 otherwise. The objective functions are defined as follows:

$$ \min F_1 = \sum_{k=1}^{m} \sum_{j=1}^{n} x_{0jk} $$

$$ \min F_2 = \sum_{k=1}^{m} \sum_{i=0}^{n} \sum_{j=0}^{n} d_{ij} x_{ijk} $$

where $ F_1 $ aims to minimize the number of UAVs used, and $ F_2 $ aims to minimize the total flight distance. The constraints are formulated as:

$$ \sum_{k=1}^{m} \sum_{i=0}^{n} x_{ijk} = 1, \quad \forall j $$

$$ \sum_{j=0}^{n} x_{0jk} = \sum_{i=0}^{n} x_{i0k} = 1, \quad \forall k $$

$$ \sum_{i=0}^{n} x_{ijk} = y_{jk}, \quad \forall j,k $$

$$ \sum_{j=0}^{n} x_{ijk} = y_{ik}, \quad \forall i,k $$

$$ \sum_{i=1}^{n} \sum_{j=0}^{n} x_{ijk} t_{ij} + \sum_{i=1}^{n} s_i y_{ik} \leq T_k^{\text{max}}, \quad \forall k $$

$$ t_{ij} = \frac{d_{ij}}{v_k}, \quad \forall i,j,k $$

$$ x_{ijk} \in \{0,1\}, \quad \forall i,j,k $$

$$ y_{ik} \in \{0,1\}, \quad \forall i,k $$

These constraints ensure that each event point is visited exactly once, each UAV starts and ends at the rescue center, the sequence of tasks is maintained, and the endurance limits of UAVs are respected. The model effectively captures the multi-objective nature of the problem, balancing resource utilization and operational efficiency.

In the realm of optimization algorithms, swarm intelligence-based methods have been widely adopted for solving complex task allocation problems. Two prominent algorithms are the Particle Swarm Optimization (PSO) algorithm and the Grey Wolf Optimization (GWO) algorithm. The PSO algorithm is inspired by the social behavior of bird flocking and fish schooling. It operates by iteratively updating the velocity and position of particles in the search space to converge towards optimal solutions. The update equations for PSO are:

$$ v_i^d = \omega v_i^{d-1} + c_1 r_1 (pbest_i^d – x_i^d) + c_2 r_2 (gbest^d – x_i^d) $$

$$ x_i^{d+1} = x_i^d + v_i^d $$

where $ v_i^d $ is the velocity of particle $ i $ at iteration $ d $, $ x_i^d $ is the position of particle $ i $, $ pbest_i^d $ is the best position found by particle $ i $ so far, $ gbest^d $ is the global best position found by the swarm, $ \omega $ is the inertia weight, $ c_1 $ and $ c_2 $ are acceleration coefficients, and $ r_1 $ and $ r_2 $ are random numbers in [0,1]. The PSO algorithm is known for its simplicity and fast convergence, but it may suffer from premature convergence in complex problems.

On the other hand, the GWO algorithm mimics the hunting behavior and social hierarchy of grey wolves. In GWO, the population is divided into alpha, beta, delta, and omega wolves, representing the best, second-best, third-best, and remaining solutions, respectively. The algorithm involves three main phases: encircling prey, hunting, and attacking prey. The mathematical model for encircling prey is given by:

$$ D = | C \cdot X_p(t) – X_i(t) | $$

$$ X_i(t+1) = X_p(t) – A \cdot D $$

where $ D $ represents the distance between the wolf and the prey, $ X_p(t) $ is the position of the prey at iteration $ t $, $ X_i(t) $ is the position of wolf $ i $, and $ A $ and $ C $ are coefficient vectors calculated as:

$$ A = 2a r_1 – a $$

$$ C = 2 r_2 $$

$$ a = 2 – 2 \frac{t}{t_{\text{max}}} $$

Here, $ a $ decreases linearly from 2 to 0 over iterations, $ r_1 $ and $ r_2 $ are random vectors in [0,1], and $ t_{\text{max}} $ is the maximum number of iterations. The hunting phase is modeled as:

$$ D_{\alpha} = | C_1 \cdot X_{\alpha}(t) – X_i(t) | $$

$$ D_{\beta} = | C_2 \cdot X_{\beta}(t) – X_i(t) | $$

$$ D_{\delta} = | C_3 \cdot X_{\delta}(t) – X_i(t) | $$

$$ X_1 = X_{\alpha}(t) – A_1 \cdot D_{\alpha} $$

$$ X_2 = X_{\beta}(t) – A_2 \cdot D_{\beta} $$

$$ X_3 = X_{\delta}(t) – A_3 \cdot D_{\delta} $$

$$ X_i(t+1) = \frac{X_1 + X_2 + X_3}{3} $$

where $ X_{\alpha}(t) $, $ X_{\beta}(t) $, and $ X_{\delta}(t) $ are the positions of the alpha, beta, and delta wolves, respectively. The GWO algorithm is effective for continuous optimization but requires modifications to handle discrete problems like MTSP.

To address the limitations of the standard GWO algorithm in solving discrete task allocation problems, an improved hybrid algorithm, termed CPS-GWO, is proposed. This algorithm integrates concepts from PSO and incorporates a Kent chaotic mapping for population initialization, along with enhanced position update rules. The CPS-GWO algorithm begins with a novel encoding scheme for the wolf population. Each wolf is represented as a one-dimensional vector of natural numbers, where each element corresponds to an event point, and the vector length equals the total number of event points. For example, for 7 event points, a random permutation like [7, 2, 5, 4, 3, 6, 1] is generated, and after considering UAV endurance constraints, it is decoded into multiple paths, such as 0→7→2→0, 0→5→4→0, and 0→3→6→1→0, where 0 denotes the rescue center.

The initialization phase uses Kent chaotic mapping to enhance population diversity. The Kent map is defined as:

$$ x_{i+1} = \begin{cases} \frac{5}{2} x_i, & 0 < x_i \leq 0.4 \\ \frac{5}{3} (1 – x_i), & 0.4 < x_i < 1 \end{cases} $$

This mapping ensures a uniform distribution of initial solutions in the search space, improving the algorithm’s exploration capability. Additionally, the control parameter $ a $ in GWO is adjusted nonlinearly to better reflect the complex search process:

$$ a = 2 – 2 \left( \frac{t}{t_{\text{max}}} \right)^2 $$

This nonlinear decrease allows for a more adaptive balance between exploration and exploitation during iterations. The position update rule is further improved by incorporating PSO-inspired elements and weighted influences from the alpha, beta, and delta wolves. The new update equation is:

$$ X_i(t+1) = \omega X_i(t) + c_1 r_1 [X_{i}^{\text{best}} – X_i(t)] + c_2 r_2 \left\{ \lambda_1 [X_{\alpha}^{\text{best}} – X_{\alpha}(t)] + \lambda_2 [X_{\beta}^{\text{best}} – X_{\beta}(t)] + \lambda_3 [X_{\delta}^{\text{best}} – X_{\delta}(t)] \right\} $$

where $ \omega $ is an adaptive inertia weight, $ c_1 $ and $ c_2 $ are learning factors, and $ \lambda_1, \lambda_2, \lambda_3 $ are weight coefficients calculated as:

$$ \lambda_1 = \frac{ | D_{\delta} | }{ | D_{\alpha} | + | D_{\beta} | + | D_{\delta} | } $$

$$ \lambda_2 = \frac{ | D_{\beta} | }{ | D_{\alpha} | + | D_{\beta} | + | D_{\delta} | } $$

$$ \lambda_3 = \frac{ | D_{\alpha} | }{ | D_{\alpha} | + | D_{\beta} | + | D_{\delta} | } $$

The adaptive inertia weight $ \omega $ is defined based on the fitness of solutions:

$$ \omega_i^d = \begin{cases} \omega_{\text{min}} + (\omega_{\text{max}} – \omega_{\text{min}}) \frac{f(x_i^d) – f_{\text{min}}^d}{f_{\text{ave}}^d – f_{\text{min}}^d}, & f(x_i^d) \leq f_{\text{ave}}^d \\ \omega_{\text{max}}, & f(x_i^d) > f_{\text{ave}}^d \end{cases} $$

where $ f_{\text{ave}}^d = \frac{1}{n} \sum_{i=1}^{n} f(x_i^d) $ is the average fitness at iteration $ d $, and $ f_{\text{min}}^d = \min \{ f(x_1^d), f(x_2^d), \ldots, f(x_n^d) \} $ is the minimum fitness. This approach emphasizes the influence of higher-ranked wolves and adapts the search strategy dynamically, leading to improved convergence and solution quality for MTSP problems.

The CPS-GWO algorithm’s workflow involves initializing the population using Kent chaotic mapping, evaluating fitness, updating the positions of alpha, beta, and delta wolves, and iteratively refining solutions until termination criteria are met. The integration of PSO components helps in maintaining diversity and avoiding local optima, making it suitable for complex multi-UAV task allocation scenarios. The pseudocode for CPS-GWO is summarized as follows: initialize parameters and population; while stopping condition not met, calculate fitness, update alpha, beta, delta, adjust parameters, update positions using the hybrid rule, and decode solutions to obtain task assignments.

To validate the effectiveness of the CPS-GWO algorithm, simulation experiments were conducted using instances from the TSPLIB database. The experimental setup assumed homogeneous UAVs with a maximum endurance time $ T_k^{\text{max}} = 120 $ minutes for all $ k $, a constant flight speed $ v_k = 20 $ m/s, and an ISR time $ s_i = 0.1 $ hours at each event point. The algorithm parameters were set as: population size NIND = 50, maximum iterations MAXGEN = 100. The experiments were performed in MATLAB on a system with Intel® Core™ i7-9750H CPU and 8.00 GB RAM. Each instance was run 30 times to obtain statistical results, and the best solutions were recorded.

The following table presents the simulation results for six TSPLIB instances, including the number of event points, computation time, number of UAVs used (F1), total flight distance (F2), and the optimal task sequences for each UAV.

Instance	Event Points	Computation Time (s)	Number of UAVs (F1)	Flight Distance (F2) km	Task Sequences
burma14	14	1.000	2	20.614	UAV1: 0-1-8-14-12-6-5-7-13-11-9-0; UAV2: 0-10-2-3-4-0
ulysses16	16	1.090	3	51.441	UAV1: 0-3-10-9-5-15-16-0; UAV2: 0-13-6-7-12-2-4-14-8-0; UAV3: 0-11-0
ulysses22	22	1.429	3	51.632	UAV1: 0-8-14-13-20-19-9-10-21-16-3-0; UAV2: 0-22-2-17-4-18-12-7-6-5-15-0; UAV3: 0-11-0
bayg29	29	2.300	5	302.483	UAV1: 0-18-17-14-0; UAV2: 0-21-2-20-10-4-15-19-13-24-28-0; UAV3: 0-11-22-0; UAV4: 0-5-29-3-26-9-12-6-0; UAV5: 0-16-25-7-23-27-8-0
eil51	51	10.320	6	112.710	UAV1: 0-28-31-26-7-43-24-23-48-8-0; UAV2: 0-41-40-19-0; UAV3: 0-22-3-36-35-20-29-21-34-50-16-2-0; UAV4: 0-51-17-42-44-45-15-37-46-0; UAV5: 0-32-11-12-47-4-18-13-25-14-6-27-0; UAV6: 0-38-5-49-10-33-39-30-9-0
st70	70	21.220	9	210.789	UAV1: 0-68-44-30-20-14-0; UAV2: 0-23-13-31-69-38-22-66-6-41-17-43-42-18-0; UAV3: 0-16-47-4-3-28-32-7-2-29-70-36-0; UAV4: 0-12-21-34-39-45-61-33-0; UAV5: 0-25-0; UAV6: 0-40-46-27-9-0; UAV7: 0-19-55-49-26-8-0; UAV8: 0-37-58-50-60-52-10-5-53-63-57-24-15-59-35-0; UAV9: 0-51-65-56-67-54-62-48-11-64-0

The results demonstrate that the CPS-GWO algorithm efficiently solves the MTSP problem, achieving balanced task assignments with minimal UAV usage and flight distances. For instance, in the burma14 case, only 2 UAVs are needed, covering a total distance of 20.614 km, while larger instances like st70 require 9 UAVs but maintain a reasonable flight distance of 210.789 km. The computation times are practical for real-time applications, increasing with problem size but remaining within acceptable limits.

To further evaluate the performance, CPS-GWO was compared with other optimization algorithms, including standard GWO, an improved GWO (IGWO), a genetic algorithm-based IGWO (GA-IGWO), PSO, genetic algorithm (GA), ant colony optimization (ACO), and brain storm optimization (BSO). The comparison focused on instances with up to 50 event points, using the same population size and iteration limits. The following table summarizes the results for four instances, highlighting the number of UAVs (F1), flight distance (F2), and average computation time.

Instance	Algorithm	F1 (UAVs)	F2 (km)	Average Time (s)
burma14	GWO	2	22.405	2.169
	CPS-GWO	2	20.614	1.000
	GA-IGWO	2	20.749	1.468
	IGWO	2	22.249	1.700
ulysses16	GWO	3	64.764	3.028
	CPS-GWO	3	51.441	1.090
	GA-IGWO	3	54.444	1.996
	IGWO	3	62.016	2.200
ulysses22	GWO	3	66.315	3.357
	CPS-GWO	3	51.632	1.429
	GA-IGWO	3	53.401	2.634
	IGWO	3	55.901	2.115
bayg29	GWO	5	554.418	5.677
	CPS-GWO	5	302.483	2.309
	GA-IGWO	5	377.236	3.225
	IGWO	5	400.101	3.410

The results indicate that CPS-GWO consistently outperforms other variants of GWO in terms of solution quality and computation time. For example, in ulysses16, CPS-GWO reduces the flight distance by approximately 20% compared to standard GWO and achieves faster convergence. Similarly, in bayg29, CPS-GWO yields a flight distance of 302.483 km, significantly lower than the 554.418 km from GWO. The integration of Kent chaotic mapping and PSO-inspired updates contributes to this improvement by enhancing exploration and exploitation balance. When compared to other swarm intelligence algorithms, CPS-GWO demonstrates competitive performance, as shown in the following table for the same instances:

Instance	PSO	GA	ACO	BSO	CPS-GWO
burma14 (F2 km)	25.282	26.121	25.282	20.076	20.614
ulysses16 (F2 km)	71.565	72.669	71.296	50.362	51.441
ulysses22 (F2 km)	77.741	79.024	77.741	50.443	51.632
bayg29 (F2 km)	720.524	722.385	719.624	301.718	302.483

In all cases, CPS-GWO achieves flight distances close to or better than BSO, which is known for its effectiveness in combinatorial optimization. This confirms the algorithm’s robustness and suitability for multi-UAV task allocation in emergency rescue operations. The use of drone technology and Unmanned Aerial Vehicle systems in these algorithms highlights their potential to revolutionize disaster response strategies, providing faster and more efficient solutions.

In conclusion, this study addresses the multi-UAV task allocation problem in emergency rescue scenarios by formulating it as an MTSP and proposing the CPS-GWO algorithm. The algorithm combines the strengths of GWO and PSO, incorporating Kent chaotic mapping for initialization and adaptive position updates to handle discrete optimization. Simulation results on TSPLIB instances validate its effectiveness in minimizing the number of UAVs and total flight distance while adhering to operational constraints. Compared to existing algorithms, CPS-GWO shows superior performance in solution quality and computational efficiency. Future work could explore the integration of heterogeneous UAVs with varying capabilities and tasks, as well as dynamic environments where event points change over time. Additionally, extending the model to include multi-objective optimization considering factors like energy consumption and task priorities would further enhance the practicality of drone technology in real-world emergency response systems. The continuous advancement of Unmanned Aerial Vehicle technology and optimization algorithms will undoubtedly lead to more intelligent and adaptive solutions for complex rescue missions.