Unmanned Aerial Vehicle Path Planning Based on Multi-Strategy Improved White Shark Optimizer

In recent years, the rapid advancement of Unmanned Aerial Vehicle technology has propelled its widespread application across numerous fields, including agriculture, disaster rescue, and environmental monitoring. In these scenarios, planning an optimal path for the Unmanned Aerial Vehicle from a starting point to a target point is a fundamental and critical task. However, in practical applications, Unmanned Aerial Vehicle path planning must consider various factors, such as terrain obstacles, flight restrictions, and energy consumption. Addressing these complex environments and multi-objective tasks makes it particularly important to plan a safe and efficient path for the Unmanned Aerial Vehicle.

Commonly used path planning algorithms for Unmanned Aerial Vehicles fall into three main categories: traditional algorithms, reinforcement learning-based algorithms, and heuristic algorithms. Traditional algorithms, such as the A*, RRT, and Dijkstra algorithms, are typically based on mathematical models and graph-theoretic methods. They offer solid theoretical foundations, determinism, and ease of implementation, and perform well in small-scale environments; in complex, high-dimensional environments, however, their performance degrades and their computational efficiency is low. For instance, an improved Lazy Theta* path planning algorithm, based on Theta* (a variant of A*), was proposed that reduces search nodes through octree mapping and introduces line-of-sight checks and heuristic weight adjustments to effectively improve search speed and path smoothness. Nevertheless, constructing the octree map is time-consuming, which may impair real-time performance, and the method's ability to handle complex obstacles is limited. Another study proposed an improved kinematically constrained bidirectional RRT algorithm that achieves efficient path planning and smooth trajectory generation in complex orchard environments through strategies such as improved artificial-potential-field-guided tree growth, motion control parameter sampling, and dual-tree connection-area optimization; however, its real-time responsiveness in complex environments is limited, and it cannot effectively avoid complex obstacles. A method combining an improved information-enhanced RRT* with dynamic windows was proposed to solve the path planning problem for Unmanned Aerial Vehicles amid dynamic or unknown static obstacles, but the algorithm is highly complex and requires substantial computational resources.

Reinforcement learning-based methods, which use Deep Q-Networks, Q-learning, and related techniques for path planning, offer strong adaptability, real-time planning capability, and autonomous learning, and can plan paths in complex, dynamic, and uncertain environments. However, they require large amounts of training data and computational resources, and training is often time-consuming. For example, an improved dueling double deep Q-network was proposed that successfully addresses path planning in complex scenarios by combining heuristic action strategies with prioritized experience replay; its computational demands are high, however, since deep neural networks require extensive training data and computing power, and model convergence is slow. Another study introduced a Q-learning algorithm that slightly reduces the distance the Unmanned Aerial Vehicle travels to reach the target point by incorporating a shortest-distance-first strategy during learning, effectively solving path planning in static and dynamic obstacle environments. However, Q-learning must maintain and update a Q-table: when the state space is large, computational complexity grows significantly, and as the environment becomes more complex and changes frequently, training time also increases substantially.

Heuristic algorithms offer high flexibility, strong adaptability, and good multi-objective optimization capability in path planning, enabling them to handle many kinds of path planning problems effectively; however, they face challenges in convergence speed and in reaching global optima. For instance, a novel hybrid particle swarm optimization algorithm was proposed that enhances optimization capability, avoids local convergence, and improves convergence speed by integrating simulated annealing and dimension learning strategies; yet in large-scale search spaces or high-dimensional optimization problems it still suffers from insufficient population diversity. A hybrid grey wolf and differential evolution algorithm was proposed for the Unmanned Aerial Vehicle path planning problem, extending the wolf pack's position update equation and using a rank-based mutation operator from differential evolution to promote exploration and help the algorithm escape local optima; this algorithm, however, exhibits lagging convergence. Another study proposed a path planning and collision avoidance algorithm for Unmanned Aerial Vehicles based on partially observable Markov decision processes and an improved grey wolf optimization algorithm, which can generate an optimal flight path in static environments while achieving dynamic obstacle avoidance; but it demands substantial computational resources, as Monte Carlo tree search and information particle filters incur heavy computation in continuous state spaces.

The White Shark Optimizer (WSO) is a novel heuristic algorithm proposed in 2022 for global optimization problems, capable of handling different types of problems simply and quickly. However, it is prone to insufficient exploration of the search space and falling into local optima. To address these issues, this paper proposes a Multi-Strategy Improved White Shark Optimizer (MSIWSO) and applies it to the three-dimensional path planning problem for Unmanned Aerial Vehicles. The improved algorithm incorporates a spiral search strategy to balance global exploration and local exploitation, enhancing search efficiency; utilizes a crisscross strategy to improve global search capability and solution accuracy; and introduces a Cauchy mutation strategy to enhance population diversity and avoid local optima. Experiments demonstrate that this algorithm exhibits excellent optimization capability and stability, enabling the planning of safe and efficient flight paths for Unmanned Aerial Vehicles.

This study focuses on scenarios such as single Unmanned Aerial Vehicle monitoring and rescue in complex mountainous environments, addressing how to plan a safe and efficient flight path from a starting point to a target point. As shown in the model, path planning must automatically generate an optimal path from a fixed starting point to safely reach the target point, fully considering terrain environment and flight constraints (such as yaw angle, pitch angle, etc.), to ensure efficient task completion. The evaluation of each path must integrate three-dimensional environmental factors and flight physical constraints, which determine the feasibility, safety, and efficiency of the path. The following sections detail the Unmanned Aerial Vehicle path planning model.

The Unmanned Aerial Vehicle path planning involves finding an optimal and collision-free trajectory in the solution space from a specified starting point to a target point, while considering environmental threats (e.g., terrain obstacles) and flight physical constraints. A feasible path $P$ can be represented as a series of ordered path points: $P = \{p_0, p_1, \ldots, p_D, p_{D+1}\}$, where $p_0$ is the starting point, $p_{D+1}$ is the target point, and each path point $p_i = (x_i, y_i, z_i)$ must satisfy boundary constraints. The horizontal boundaries are defined by $[X_{\min}, X_{\max}]$ and $[Y_{\min}, Y_{\max}]$, and the vertical boundaries by $[Z_{\min}, Z_{\max}]$. The specific symbol definitions are provided in Table 1.

Table 1: Symbol Definition Table
Symbol Definition
$D$ Number of path points
$p_0$, $p_{D+1}$ Starting point and target point
$\alpha_i$, $\beta_i$ Yaw angle and pitch angle
$\alpha_{\max}$, $\beta_{\max}$ Maximum yaw angle and maximum pitch angle
$H_{\min}$, $H_{\max}$ Minimum and maximum height from terrain
$\lambda_l$ Length penalty coefficient
$\lambda_h$ Height penalty coefficient
$\lambda_\alpha$, $\lambda_\beta$ Yaw angle and pitch angle penalty coefficients
$K$ Maximum number of iterations
$N$ Population size
$M$ Number of crisscross individuals
$v_k^i$ Current white shark’s velocity
$\omega_k^i$ Current white shark’s position
$\omega_k^{\text{gbest}}$ Global best white shark position
$\omega_{\text{best}}^{\hat{v}_k^i}$ Individual best position of white shark
$\omega_k^r$ Position of a random white shark in population
$D_w$ Distance between best and current white shark
$D_r$ Distance between current and random white shark

The environmental modeling for Unmanned Aerial Vehicle flight is fundamental, describing the conditions that the Unmanned Aerial Vehicle must meet for obstacle avoidance and constraint satisfaction during flight in a specific space. In this paper, mountains are defined as obstacles that severely affect the flight safety of the Unmanned Aerial Vehicle. The mathematical model of obstacles can be expressed by the following formula:

$$ z(x, y) = \sum_{i=1}^{m} h(i) \times \exp\left(-\left(\frac{x - x_0(i)}{s_x^i}\right)^2 - \left(\frac{y - y_0(i)}{s_y^i}\right)^2\right) $$

This formula is used to construct a three-dimensional mountainous environment model by superimposing multiple rotationally symmetric Gaussian surfaces to generate continuous terrain. Here, $(x, y)$ is the projection of a point on the horizontal plane, $z$ is the corresponding height at that point; $(x_0(i), y_0(i))$ represents the center coordinates of the $i$-th peak; $s_x^i$ and $s_y^i$ control the decay of the $i$-th peak on the x-axis and y-axis, respectively; $m$ is the number of peaks; and $h(i)$ is the height of the $i$-th peak. The formula simulates the slope change of peaks through an exponential function: the farther from the peak center, the more significant the height attenuation. In practical modeling, several independent peak forms are first generated, and then these peaks are superimposed and combined to simulate the complex natural landform characteristics of multiple peaks coexisting.
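As a concrete illustration, the terrain model above can be sketched in a few lines of Python (the peak parameters below are made-up example values, not taken from this paper):

```python
import numpy as np

def terrain_height(x, y, peaks):
    """Superimposed Gaussian peaks; each peak is (x0, y0, h, sx, sy).
    Implements z(x, y) = sum_i h(i) * exp(-((x-x0)/sx)^2 - ((y-y0)/sy)^2)."""
    z = np.zeros_like(np.asarray(x, dtype=float))
    for x0, y0, h, sx, sy in peaks:
        z += h * np.exp(-((x - x0) / sx) ** 2 - ((y - y0) / sy) ** 2)
    return z

# Two hypothetical peaks: (center_x, center_y, height, x-decay, y-decay)
peaks = [(30.0, 40.0, 100.0, 10.0, 12.0), (70.0, 60.0, 80.0, 15.0, 10.0)]
print(terrain_height(30.0, 40.0, peaks))  # close to 100, the first peak's height
```

Evaluating the function on a meshgrid of $(x, y)$ values yields the continuous multi-peak surface described above.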

In Unmanned Aerial Vehicle path planning, to ensure that the Unmanned Aerial Vehicle can avoid terrain obstacles during flight while preventing increased energy consumption or exposure to adverse conditions such as high-altitude wind speeds due to excessive flight height, a reasonable range is planned for the Unmanned Aerial Vehicle’s flight height, as shown below:

$$ z(x_i, y_i) + H_{\min} \leq H_i \leq z(x_i, y_i) + H_{\max} $$

where $z(x_i, y_i)$ is the terrain height at the $i$-th path point; $H_i$ is the height of the Unmanned Aerial Vehicle at the $i$-th path point; $H_{\min}$ and $H_{\max}$ are the minimum and maximum allowed flight heights from the terrain, respectively.

In actual flight, the trajectory of the Unmanned Aerial Vehicle is required to be as smooth and stable as possible. Excessively large yaw and pitch angles pose unnecessary risks to the Unmanned Aerial Vehicle. Therefore, the yaw and pitch angles of the Unmanned Aerial Vehicle should be as small as possible.

$$ \alpha_i = \arccos\left(\frac{\mathbf{q}_i \cdot \mathbf{q}_{i+1}}{\|\mathbf{q}_i\| \|\mathbf{q}_{i+1}\|}\right) < \alpha_{\max} $$

$$ \beta_i = \arctan\left(\frac{z_{i+1} - z_i}{\sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}}\right) < \beta_{\max} $$

where $\mathbf{q}_i = (x_i - x_{i-1}, y_i - y_{i-1})$ and $\mathbf{q}_{i+1} = (x_{i+1} - x_i, y_{i+1} - y_i)$ are direction vectors at the $i$-th path point.
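The two angle formulas can be evaluated numerically as follows (a minimal sketch; the sample waypoints are illustrative):

```python
import numpy as np

def yaw_pitch(p_prev, p_cur, p_next):
    """Yaw angle alpha_i at p_cur (angle between consecutive horizontal
    direction vectors) and pitch angle beta_i of the segment p_cur -> p_next,
    per the formulas above."""
    q_i = np.array([p_cur[0] - p_prev[0], p_cur[1] - p_prev[1]])
    q_next = np.array([p_next[0] - p_cur[0], p_next[1] - p_cur[1]])
    cos_a = np.dot(q_i, q_next) / (np.linalg.norm(q_i) * np.linalg.norm(q_next))
    alpha = np.arccos(np.clip(cos_a, -1.0, 1.0))  # clip guards rounding error
    dz = p_next[2] - p_cur[2]
    horiz = np.hypot(p_next[0] - p_cur[0], p_next[1] - p_cur[1])
    beta = np.arctan2(dz, horiz)
    return alpha, beta

# Straight horizontal leg, then a 45-degree climb: alpha = 0, beta = pi/4
a, b = yaw_pitch((0, 0, 0), (1, 0, 0), (2, 0, 1))
```

A path point is then feasible when `alpha < alpha_max` and `beta < beta_max`.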

The quality of the path generated by Unmanned Aerial Vehicle path planning is typically evaluated by a fitness function, which comprehensively considers path length, path height, and Unmanned Aerial Vehicle flight physical constraints. The expression of the fitness function $F$ is as follows:

$$ F = \lambda_l F_L + \lambda_h F_H + \lambda_\alpha F_\alpha + \lambda_\beta F_\beta $$

where $F_L$, $F_H$, $F_\alpha$, and $F_\beta$ represent path length penalty, path height penalty, yaw angle penalty, and pitch angle penalty, respectively. $\lambda_l$ is the path length penalty coefficient; $\lambda_h$ is the height penalty coefficient; $\lambda_\alpha$ and $\lambda_\beta$ are the yaw angle and pitch angle penalty coefficients, respectively.

In Unmanned Aerial Vehicle path planning, to ensure that the path meets actual task requirements, the path length should be minimized while satisfying basic constraints. Here, the flight path of the Unmanned Aerial Vehicle consists of $D+1$ path segments, and the length of each path segment $l_i$ is the Euclidean distance between adjacent nodes. The path length penalty $F_L$ is as shown in Formula (8), where $L_{\min}$ is the Euclidean distance from the starting point to the target point; $L_{\max}$ is the theoretical maximum path length, defined as the total path length of the Unmanned Aerial Vehicle vertically ascending from the starting point to the maximum safe flight height, flying straight to above the target point, and then vertically descending.

$$ F_L = \frac{\sum_{i=1}^{D+1} l_i - L_{\min}}{L_{\max} - L_{\min}} $$

Considering that paths may violate constraints, this paper adopts a penalty mechanism. The specific penalty formulas are as follows:

$$ F_H = \sum_{i=1}^{D} \begin{cases} 0 & \text{if } H_{\min} \leq H_i - z(x_i, y_i) \leq H_{\max} \\ \frac{(H_i - z(x_i, y_i) - H_{\max}) (F_H^{\max} - F_H^{\min})}{H_{\max} - H_{\min}} + F_H^{\min} & \text{if } H_i - z(x_i, y_i) > H_{\max} \\ \frac{(H_{\min} - (H_i - z(x_i, y_i))) (F_H^{\max} - F_H^{\min})}{H_{\max} - H_{\min}} + F_H^{\min} & \text{if } H_i - z(x_i, y_i) < H_{\min} \end{cases} $$

$$ F_\alpha = \sum_{i=1}^{D} \begin{cases} 0 & \text{if } \alpha_i \leq \alpha_{\max} \\ \frac{(\alpha_i - \alpha_{\max}) (F_\alpha^{\max} - F_\alpha^{\min})}{\alpha_{\max}} + F_\alpha^{\min} & \text{if } \alpha_i > \alpha_{\max} \end{cases} $$

$$ F_\beta = \sum_{i=1}^{D} \begin{cases} 0 & \text{if } \beta_i \leq \beta_{\max} \\ \frac{(\beta_i - \beta_{\max}) (F_\beta^{\max} - F_\beta^{\min})}{\beta_{\max}} + F_\beta^{\min} & \text{if } \beta_i > \beta_{\max} \end{cases} $$

where $F_H^{\max}$ and $F_H^{\min}$ are the maximum and minimum penalty values when the flight height violates the maximum safe height constraint, respectively; $F_\alpha^{\max}$ and $F_\alpha^{\min}$ are the maximum and minimum penalty values for violating the yaw angle constraint, respectively; $F_\beta^{\max}$ and $F_\beta^{\min}$ are the maximum and minimum penalty values for violating the pitch angle constraint, respectively.
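A minimal sketch of how the length penalty and the weighted fitness sum might be computed; the remaining penalty terms are assumed precomputed, and all names are illustrative rather than the paper's implementation:

```python
import numpy as np

def path_length_penalty(path, l_min, l_max):
    """F_L from the formula above: normalised excess length of the path.
    `path` is a (D+2, 3) array of waypoints including start and goal."""
    segs = np.diff(path, axis=0)                    # per-segment displacement
    total = np.sum(np.linalg.norm(segs, axis=1))    # sum of Euclidean lengths
    return (total - l_min) / (l_max - l_min)

def fitness(path, lam_l, lam_h, lam_a, lam_b, F_H, F_alpha, F_beta,
            l_min, l_max):
    """Weighted sum F = lam_l*F_L + lam_h*F_H + lam_a*F_alpha + lam_b*F_beta."""
    return (lam_l * path_length_penalty(path, l_min, l_max)
            + lam_h * F_H + lam_a * F_alpha + lam_b * F_beta)
```

A straight path from start to goal attains $F_L = 0$, since its length equals $L_{\min}$.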

The main concept of the White Shark Optimizer (WSO) is derived from the behavior of white sharks during hunting. Through their keen hearing and smell, they can accurately locate prey, achieving search, tracking, positioning, and hunting of prey. The White Shark Optimizer maps the prey position to the optimal solution to be solved and achieves solution optimization by simulating the process of white sharks sensing and tracking prey. The algorithm simulates three core predation stages: (1) moving towards prey; (2) randomly searching for prey; and (3) searching near prey.

In this behavior, white sharks track and locate prey based on their senses. White sharks perceive the position of prey through fluctuations generated by the movement of prey and move directly towards the prey. The velocity update method is:

$$ v_{k+1}^i = \mu \left[ v_k^i + p_1 \left( \omega_k^{\text{gbest}} - \omega_k^i \right) \times c_1 + p_2 \left( \omega_{\text{best}}^{\hat{v}_k^i} - \omega_k^i \right) \times c_2 \right] $$

$$ \hat{v} = \lfloor N \times \text{rand} \rfloor + 1 $$

Formula (12) is used to update the velocity of the white shark, simulating predatory behavior in which the shark is guided simultaneously by the population's best direction and its own historical best direction. It converts perceived prey-position information into the shark's movement velocity and direction, enabling the shark to gradually approach the prey. In the formula, $v_{k+1}^i$ is the velocity of the $i$-th white shark at iteration $k+1$; $v_k^i$ is its velocity at iteration $k$; $\omega_k^{\text{gbest}}$ is the global best position of the population over the first $k$ iterations; $\omega_k^i$ is the position of the $i$-th white shark at iteration $k$; $\omega_{\text{best}}^{\hat{v}_k^i}$ is the individual best position reached so far; $\hat{v}$ is the index of the white shark currently at the best position, defined in Formula (13), where $N$ is the population size; $c_1$ and $c_2$ are random numbers in the interval $[0,1]$; $p_1$ and $p_2$ are two control parameters; and $\mu$ is the contraction factor.
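The velocity update of Formula (12) can be sketched as follows (a hedged illustration; parameter values and function names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def wso_velocity(v, w, w_gbest, w_ibest, p1, p2, mu):
    """One WSO velocity update per Formula (12): the shark is pulled toward
    the global best position and its own best position, scaled by the
    contraction factor mu. c1, c2 are fresh U[0, 1] draws per dimension."""
    c1 = rng.random(v.shape)
    c2 = rng.random(v.shape)
    return mu * (v + p1 * (w_gbest - w) * c1 + p2 * (w_ibest - w) * c2)
```

When the shark already sits at both best positions with zero velocity, the update leaves it at rest, as expected from the formula.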

In this behavior, white sharks search for prey based on the scent left by the prey and fluctuations generated by movement. In some cases, prey leave scent at the location they departed, and white sharks can still smell the scent left by the prey. In this situation, white sharks randomly search within the search space to find prey.

$$ \omega_{k+1}^i = \begin{cases} \omega_k^i \cdot \neg \oplus \mathbf{w}_0 + \mathbf{u} \cdot \mathbf{a} + \mathbf{l} \cdot \mathbf{b} & \text{if } \text{rand} < mv \\ \omega_k^i + \frac{v_k^i}{f} & \text{if } \text{rand} \geq mv \end{cases} $$

Formula (14) primarily elaborates how white sharks dynamically adjust their positions within the search space using smell and hearing. In the formula, $\neg$ is the negation operator; $\mathbf{u}$ and $\mathbf{l}$ are the upper and lower bounds of the search space, respectively; $\mathbf{a}$ and $\mathbf{b}$ are one-dimensional binary vectors; $\mathbf{w}_0$ is a logical vector; $\text{rand}$ is a random number uniformly distributed in the range $[0,1]$; $f$ is the frequency of the white shark’s wave motion; $mv$ is the intensity of the white shark’s hearing and smell, used to control the search intensity of the white shark. When $\text{rand} < mv$, the white shark randomly searches for prey within the search space; when $\text{rand} \geq mv$, the white shark moves towards the direction of the prey.

In this behavior, white sharks update their positions based on the best white shark, and the updated position is very close to the prey.

$$ \omega_{k+1}^i = \omega_k^{\text{gbest}} + r_1 \cdot D_w \cdot \text{sgn}(r_2 - 0.5) \quad \text{if } r_3 < Ss $$

$$ D_w = \left| \text{rand} \times (\omega_k^{\text{gbest}} - \omega_k^i) \right| $$

Formula (15) enhances the local exploitation capability of the algorithm by simulating the fine search behavior of white sharks approaching prey. In the formula, $\omega_{k+1}^i$ is the updated position of the $i$-th white shark based on the best white shark; $\text{sgn}(\cdot)$ is the sign function; $r_1$, $r_2$, and $r_3$ are random numbers in the range $[0,1]$; $D_w$ is the distance between the prey and the white shark; and $Ss$ is a parameter representing the shark's smell and visual intensity.

Simultaneously, white sharks update their positions through schooling behavior. The position of the white shark is updated based on the two best-positioned white sharks during the search process.

$$ \omega_{k+1}^i = \frac{\omega_k^i + \omega_{k+1}^i}{2 \times \text{rand}} $$
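The near-prey update and the schooling step can be combined into one sketch, following the standard WSO formulation with $r_3 < Ss$ gating the local search; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def wso_exploit(w, w_gbest, Ss):
    """Local search near the prey plus schooling. With probability Ss the
    shark moves near the global best (Formula (15)) and the result is
    averaged with the current position per the schooling formula."""
    r1, r2, r3 = rng.random(3)
    if r3 < Ss:
        Dw = np.abs(rng.random() * (w_gbest - w))        # distance to best shark
        w_new = w_gbest + r1 * Dw * np.sign(r2 - 0.5)    # fine search near prey
        return (w + w_new) / (2.0 * rng.random())        # schooling update
    return w
```

With $Ss = 0$ the shark's position is left unchanged, since the local-search branch never triggers.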

WSO has advantages such as few parameters, ease of implementation, and fast convergence speed. However, it still suffers from issues such as insufficient global search capability, tendency to fall into local optima, and low convergence accuracy. To address these limitations, this paper proposes three collaborative improvement strategies.

In WSO, the random search for prey described by Formula (14) drives white sharks to explore near the boundaries of the search space when searching for potential prey locations. However, due to its limited exploration capability, this method does not sufficiently explore the solution space. To overcome this limitation, this paper proposes a spiral search strategy inspired by natural spiral motion patterns. This strategy balances exploration and exploitation by dynamically adjusting the spiral trajectory: expanding the exploration range in early iterations to enhance global search; and contracting the spiral trajectory in later stages to focus on local optimization.

Based on this, to balance global search and local exploitation capabilities and further improve search efficiency, the method of randomly searching for prey, i.e., Formula (14), is improved as follows:

$$ \omega_{k+1}^i = \begin{cases} \omega_k^i + e^{\psi \tau} \cos(2\pi\tau) \cdot D_r & \text{if } \text{rand} < mv \\ \omega_k^i + \frac{v_k^i}{f} & \text{if } \text{rand} \geq mv \end{cases} $$

$$ D_r = \text{rand} \times (\omega_k^r - \omega_k^i) $$

In the formula, when $\text{rand} < mv$, spiral motion is simulated to perform prey search. Here, $\psi$ is a constant, taken as 1; $\tau$ is a random number in $[-1,1]$; $D_r$ is the position difference between a randomly selected white shark individual $\omega_k^r$ and the current white shark individual $\omega_k^i$.

In the spiral search formula, $e^{\psi \tau}$ produces random nonlinear expansion of the search radius, and $\cos(2\pi\tau)$ generates circular search trajectories at arbitrary angles, ensuring diversity of search directions and effectively covering corner regions of the solution space. In the early stages, differences between white shark individuals are large; $D_r$ injects population-difference information by randomly selecting individuals, perturbing the search direction and maintaining population diversity. In later stages, the white sharks gradually approach the prey, differences between individuals shrink, and the sharks begin local exploitation. Through the synergy of the dynamic radius, omnidirectional search, and population perturbation, the sharks' exploration of the search space is expanded, achieving a balance between exploration and exploitation.
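A minimal sketch of the spiral-search move for the $\text{rand} < mv$ branch, with $\psi = 1$ as stated above (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def spiral_step(w, w_rand, psi=1.0):
    """One spiral-search move: w + e^{psi*tau} * cos(2*pi*tau) * D_r,
    with tau ~ U[-1, 1] and D_r the randomly scaled difference to a
    randomly selected shark, per the improved update above."""
    tau = rng.uniform(-1.0, 1.0)
    Dr = rng.random() * (w_rand - w)   # difference to a random individual
    return w + np.exp(psi * tau) * np.cos(2.0 * np.pi * tau) * Dr
```

When the random shark coincides with the current one, $D_r = 0$ and the position is unchanged, which is why the perturbation naturally fades as the population converges.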

In WSO, premature convergence and insufficient search are the main problems affecting algorithm performance. Premature convergence is usually caused by stagnation in certain dimensions, while insufficient search limits the global search capability of the algorithm. To address this, this paper introduces a crisscross strategy. Horizontal crossover performs search, which can reduce unexplored areas in the solution space and improve the global search capability of the algorithm. Meanwhile, vertical crossover helps break stagnation in certain dimensions and avoid premature convergence.

Horizontal crossover and vertical crossover are performed sequentially, and the two crossover methods influence each other, synergistically improving the solution accuracy and search efficiency of the algorithm. In the vertical crossover process, once a stagnant dimension breaks through the local optimum, this information quickly spreads throughout the population through horizontal crossover, thereby helping other stagnant dimensions to escape the local optimum as soon as possible. Additionally, the crisscross strategy includes a competition mechanism, where offspring are compared with parents, and superior individuals are retained, ensuring that updates proceed in a better direction. To balance computational efficiency and optimization performance, this paper randomly selects $M$ white shark individuals to implement the crisscross strategy. Horizontal crossover is an arithmetic crossover between two different white shark individuals operating on all dimensions. First, the selected white shark individuals are randomly paired, and then the paired two individuals undergo horizontal crossover with crossover probability $P_{hc}$, usually set to 1. For randomly matched parent individuals $\omega_{i1}$ and $\omega_{i2}$, their offspring individuals $\omega_{i1}^{hc}$ and $\omega_{i2}^{hc}$ can be obtained by the following formulas:

$$ \omega_{i1,j}^{hc} = r_4 \cdot \omega_{i1,j} + (1 - r_4) \cdot \omega_{i2,j} + c_3 \cdot (\omega_{i1,j} - \omega_{i2,j}) $$

$$ \omega_{i2,j}^{hc} = r_5 \cdot \omega_{i2,j} + (1 - r_5) \cdot \omega_{i1,j} + c_4 \cdot (\omega_{i2,j} - \omega_{i1,j}) $$

where $\omega_{i1,j}$ and $\omega_{i2,j}$ are the $j$-th dimensions of $\omega_{i1}$ and $\omega_{i2}$, respectively; $\omega_{i1,j}^{hc}$ and $\omega_{i2,j}^{hc}$ are the corresponding $j$-th dimensions of the offspring generated by horizontal crossover; $r_4$ and $r_5$ are random numbers uniformly distributed in the range $[0,1]$; $c_3$ and $c_4$ are random numbers uniformly distributed in the range $(-1,1)$. The generated offspring are compared with their parents, and the better individuals are retained.

Vertical crossover is an arithmetic crossover where each white shark individual operates on two different dimensions. Each white shark individual undergoes one vertical crossover and only updates one random dimension, while other dimensions remain unchanged, providing an opportunity for stagnant dimensions to escape local optima without destroying another dimension that may be optimal. During vertical crossover, two dimensions $j_1$ and $j_2$ are randomly selected, and the $j_1$-th dimension of the offspring $\omega_i^{vc}$ is obtained by Formula (22), while other dimensions remain consistent with the parent $\omega_i$.

$$ \omega_{i,j_1}^{vc} = r_6 \cdot \omega_{i,j_1} + (1 - r_6) \cdot \omega_{i,j_2} $$

where $\omega_{i,j_1}^{vc}$ is the $j_1$-th dimension of the offspring of $\omega_i$; $r_6$ is a random number uniformly distributed in the range $[0,1]$. The generated offspring are compared with their parents, and the better individuals are retained.
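Both crossover operators can be sketched compactly (an illustration of the formulas above; the competition step of comparing offspring against parents is left to the caller):

```python
import numpy as np

rng = np.random.default_rng(3)

def horizontal_crossover(w1, w2):
    """Arithmetic crossover on every dimension of a randomly paired parent
    pair, per the horizontal-crossover formulas above."""
    r4 = rng.random(w1.shape)
    r5 = rng.random(w1.shape)
    c3 = rng.uniform(-1.0, 1.0, w1.shape)
    c4 = rng.uniform(-1.0, 1.0, w1.shape)
    o1 = r4 * w1 + (1 - r4) * w2 + c3 * (w1 - w2)
    o2 = r5 * w2 + (1 - r5) * w1 + c4 * (w2 - w1)
    return o1, o2

def vertical_crossover(w):
    """Crossover between two random dimensions j1, j2 of one individual;
    only dimension j1 of the offspring changes, as in the formula above."""
    j1, j2 = rng.choice(len(w), size=2, replace=False)
    r6 = rng.random()
    off = w.copy()
    off[j1] = r6 * w[j1] + (1 - r6) * w[j2]
    return off
```

Note that crossing two identical parents reproduces them exactly, so the operators only inject variation where the population still differs.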

Cauchy mutation is a random perturbation strategy based on the Cauchy distribution. Its core idea is to use the long-tail characteristic of the Cauchy distribution to generate random values far from the current solution, thereby enhancing the global search capability of the algorithm. Specifically, the probability density function of the Cauchy distribution makes it easier to generate extreme values, which helps the algorithm escape local optima. In this algorithm, Cauchy mutation is applied to update the global best solution. By applying random perturbations to the current solution, new candidate solutions are generated. The new solution is compared with the old solution, and the better solution is retained. This mechanism can effectively enhance population diversity and avoid falling into local optima. The global best solution $\omega^{\text{gbest}}$ update formula is:

$$ \omega^{\text{gbest}'} = \omega^{\text{gbest}} + c \cdot \text{Cauchy}(0,1) \cdot \delta \quad \text{if } \text{rand} > \delta $$

$$ c = \text{sgn}(\text{rand} - 0.5) $$

$$ \delta = \frac{\delta_0}{1 + e^{-b_1 (k / K)}} $$

where $\text{Cauchy}(0,1)$ represents a standard Cauchy distribution; $c$ is the direction control parameter; $\delta$ is the nonlinear factor controlling mutation; $\delta_0$ and $b_1$ are two positive numbers controlling the initial value and decay rate of $\delta$, respectively; $K$ is the maximum number of iterations. When $\text{rand} > \delta$, mutation is performed on the best solution to generate a new solution $\omega^{\text{gbest}'}$, and the better solution is retained.
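A hedged sketch of the Cauchy mutation step together with the $\delta$ schedule of Equation (25); greedy selection against the old best is assumed to be done by the caller, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def cauchy_mutation(w_gbest, k, K, delta0=0.75, b1=0.1):
    """Perturb the global best with a standard Cauchy draw, gated and
    scaled by the nonlinear factor delta from Equation (25). Returns the
    candidate; comparing it with the old best is left to the caller."""
    delta = delta0 / (1.0 + np.exp(-b1 * (k / K)))
    if rng.random() > delta:
        c = np.sign(rng.random() - 0.5)              # random direction
        step = rng.standard_cauchy(w_gbest.shape)    # heavy-tailed perturbation
        return w_gbest + c * step * delta
    return w_gbest
```

The heavy tail of `standard_cauchy` occasionally produces large jumps, which is what lets the mutated best solution escape a local optimum.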

To further optimize the Cauchy mutation strategy, the value of $\delta$ is adjusted by Equation (25) to dynamically tune the intensity and frequency of mutation. In the early stages, the algorithm must explore the search space extensively through frequent mutations to increase solution diversity; excessive mutation, however, over-disperses the solutions and greatly increases convergence difficulty. An appropriate initial value of $\delta$ is therefore needed, with $\delta_0$ balancing the initial mutation frequency: if $\delta_0$ is too small, initial mutation is insufficient, limiting solution diversity and easily causing premature convergence; if $\delta_0$ is too large, the search over-explores and convergence becomes markedly harder.

As iterations progress, the algorithm gradually transitions from the exploration phase to the exploitation phase. At this time, $\delta$ needs to decay smoothly to reduce mutation frequency, achieving a smooth transition from exploration to exploitation. If $b_1$ is too small, $\delta$ decays slowly, the exploration phase is prolonged, and it is difficult to focus on high-quality areas; if $b_1$ is too large, $\delta$ decays too quickly, the transition from exploration to exploitation occurs too early, and it is easy to fall into local optima due to overly hasty exploitation. After multiple experimental adjustments, when $\delta_0 = 0.75$ and $b_1 = 0.1$, the dynamic change of $\delta$ better matches the iterative needs of the algorithm. In the early stage, strong mutations fully explore the solution space; in the middle stage, the search range steadily narrows, orderly transitioning to the exploitation phase, effectively avoiding premature convergence.

Algorithm 1: Improved White Shark Optimizer

Input: Population size $N$, number of iterations $K$

Output: Optimal solution

Initialize white shark population $\omega_i$ (i=1,2,…,N);

Calculate the fitness value of individuals in $\omega_i$, and record the best individual as the optimal solution $\omega^{\text{gbest}}$;

WHILE (k < K)

Calculate and update parameters $mv$, $Ss$, and $\delta$;

FOR i=1 to N DO:

Update the white shark’s position using the spiral search strategy Formula (18);

END FOR

FOR i=1 to N DO:

IF rand < Ss THEN

The white shark performs local search near the prey, updating its position using Formula (15);

And update the population position based on the white shark’s schooling behavior using Formula (17);

END IF

END FOR

Randomly select M white sharks and execute the crisscross strategy;

Calculate the fitness value of each individual and update $\omega^{\text{gbest}}$;

IF rand > $\delta$ THEN

Execute the Cauchy mutation strategy and select the better solution through a greedy strategy;

END IF

k = k + 1;

END WHILE

Return optimal solution

In Unmanned Aerial Vehicle path planning, the flight path is usually generated from a series of discrete path nodes produced by the optimization algorithm and formed by connecting those nodes. To apply the improved white shark algorithm to path planning, following the environmental modeling in Section 2.1, each path is set to consist of $D$ path nodes, and $D$ is taken as the position dimension of each white shark: a white shark's $D$-dimensional position vector encodes one planned path, and the $j$-th dimension of the shark's position is the $j$-th path point of that path.

In the path initialization phase, first randomly generate N initial paths. Each path consists of D three-dimensional path points, representing a white shark characterized by a D-dimensional position vector. These initial paths are generated by uniform random sampling, and a feasibility verification mechanism is added during the generation process to exclude invalid paths that obviously violate flight constraints.

Then, execute the improved white shark optimization algorithm to dynamically optimize the path by simulating the predatory behavior of white sharks. Specifically, each path is represented by a set of discrete three-dimensional node sequences, corresponding to the individual position vectors in the white shark population. In each iteration, the algorithm evaluates the fitness value of each path through the objective function and selects better paths. Then, based on the spiral search, crisscross, and Cauchy mutation strategies, the positions of the path nodes are dynamically adjusted, so that the population gradually evolves in the direction of better fitness. After multiple iterations, the algorithm finally converges to an optimal discrete path node sequence.

Finally, after the optimal discrete path node sequence is generated, cubic spline interpolation is used to fit the discrete nodes into a smooth flight path, ensuring path continuity and curvature smoothness. This improves the flight stability and safety of the Unmanned Aerial Vehicle and yields a smooth trajectory that meets actual flight requirements.

Specific steps are as follows:

Step 1: Model the three-dimensional mountainous environment map and set the starting and ending points for Unmanned Aerial Vehicle flight;

Step 2: Establish a multi-objective fitness function (such as Formula (7));

Step 3: Randomly generate N initial paths, each a sequence of D path points, and iteratively optimize them with MSIWSO until convergence to an optimal trajectory;

Step 4: Use the cubic spline interpolation method to smooth the obtained optimal path and generate the final flight path.
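The multi-objective fitness of Step 2 (Formula (7), not reproduced here) can be approximated as a weighted sum of path length and constraint penalties. The sketch below is a simplified stand-in: the weights, the safe-altitude corridor, and the penalty shape are assumptions, and the yaw- and pitch-angle terms of the actual formula are omitted:

```python
import math

def path_length(path):
    """Total Euclidean length of a 3-D polyline."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))

def fitness(path, w_len=1.0, w_h=1.0, safe_z=(1.0, 5.0)):
    """Weighted sum of path length and an altitude penalty for nodes
    leaving an assumed safe corridor [1, 5]. Lower is better; a path
    that climbs needlessly pays both a length and a height penalty."""
    lo, hi = safe_z
    h_pen = sum(max(lo - z, 0.0) + max(z - hi, 0.0) for _, _, z in path)
    return w_len * path_length(path) + w_h * h_pen

straight = [(0, 0, 2), (5, 0, 2), (10, 0, 2)]
climbing = [(0, 0, 2), (5, 0, 9), (10, 0, 2)]
# The level path is both shorter and penalty-free, so it scores lower.
```

This structure explains the behavior noted later in the experiments: because climbing increases both terms, the optimizer is steered toward laterally bypassing obstacles rather than flying over them.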

In Unmanned Aerial Vehicle path planning, the improved white shark algorithm generates a series of discrete path nodes and forms a complete flight path by connecting them. Directly connecting these nodes, however, may produce an unsmooth path or even sharp turns, which poses challenges for flight control. Therefore, after the optimal discrete node sequence is generated, cubic spline interpolation is further used to fit the discrete nodes into a smooth flight path.

The specific interpolation process for generating the path is as follows:

Step 1: Combine the path points generated by MSIWSO with the starting and target points to form an ordered node sequence $P = \{p_0, p_1, \ldots, p_D, p_{D+1}\}$, where each node $p_i = (x_i, y_i, z_i)$.

Step 2: Divide the sequence P into three independent sequences according to coordinates: $A = \{x_0, x_1, \ldots, x_{D+1}\}$, $B = \{y_0, y_1, \ldots, y_{D+1}\}$, $C = \{z_0, z_1, \ldots, z_{D+1}\}$.

Step 3: Construct cubic spline interpolation functions for the coordinate sequences of each dimension separately. For any adjacent node interval $(x_i, x_{i+1})$, $(y_i, y_{i+1})$, and $(z_i, z_{i+1})$, the interpolation function is a cubic polynomial and satisfies continuous function values, first derivatives, and second derivatives over the entire sequence interval.

Step 4: On the parameterized interval $\{1,2,\ldots,n\}$ (where n is the number of interpolation points), calculate the smooth path points for each dimension through the cubic spline interpolation function: $(x_1^*, x_2^*, \ldots, x_n^*)$, $(y_1^*, y_2^*, \ldots, y_n^*)$, and $(z_1^*, z_2^*, \ldots, z_n^*)$.

Step 5: Combine the starting point, interpolation points, and target point to form a new sequence $P^* = \{p_0, p_1^*, \ldots, p_n^*, p_{D+1}\}$, where $p_i^* = (x_i^*, y_i^*, z_i^*)$. Connect this sequence in order to form the final smooth flight path.
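Steps 1 to 5 can be sketched with a from-scratch natural cubic spline on uniform knots. The unit knot spacing and the "natural" end condition (zero second derivative at both ends) are common choices assumed here, since the boundary conditions are not stated above:

```python
def natural_cubic_spline(ys):
    """Natural cubic spline through ys at uniform knots t = 0, 1, ..., n.
    Returns an evaluator s(t) with continuous values, first and second
    derivatives across the whole interval (Step 3 for one coordinate)."""
    n = len(ys) - 1
    M = [0.0] * (n + 1)  # second derivatives at the knots; M[0] = M[n] = 0
    if n >= 2:
        # Interior tridiagonal system: M[i-1] + 4*M[i] + M[i+1] = rhs[i],
        # solved with the Thomas algorithm.
        rhs = [6.0 * (ys[i + 1] - 2.0 * ys[i] + ys[i - 1]) for i in range(1, n)]
        c = [0.0] * (n - 1)
        d = [0.0] * (n - 1)
        c[0], d[0] = 0.25, rhs[0] / 4.0
        for i in range(1, n - 1):
            m = 4.0 - c[i - 1]
            c[i] = 1.0 / m
            d[i] = (rhs[i] - d[i - 1]) / m
        M[n - 1] = d[-1]
        for i in range(n - 2, 0, -1):
            M[i] = d[i - 1] - c[i - 1] * M[i + 1]

    def s(t):
        i = min(int(t), n - 1)  # segment index
        u = t - i               # local coordinate in [0, 1]
        return (M[i] * (1 - u) ** 3 / 6.0 + M[i + 1] * u ** 3 / 6.0
                + (ys[i] - M[i] / 6.0) * (1 - u)
                + (ys[i + 1] - M[i + 1] / 6.0) * u)
    return s

# Step 2: split the node sequence into per-coordinate sequences, fit each
# dimension independently, then sample a denser smooth path (Steps 4-5).
pts = [(0, 0, 2), (2, 1, 3), (4, 0, 4), (6, 2, 3), (8, 1, 2)]
sx, sy, sz = (natural_cubic_spline([p[k] for p in pts]) for k in range(3))
smooth = [(sx(t), sy(t), sz(t)) for t in (i / 4.0 for i in range(17))]
```

The spline passes exactly through every original waypoint, so obstacle clearance verified at the nodes is preserved at the nodes themselves, while the matched first and second derivatives remove the sharp turns that direct node-to-node connection would produce.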

This paper uses the CEC2022 test functions to evaluate the performance of MSIWSO, comparing it with the Hybrid Grey Wolf Differential Evolution algorithm (HGWODE) and the original White Shark Optimizer (WSO), to verify the effectiveness of MSIWSO in solving the Unmanned Aerial Vehicle path planning problem. The simulation environment is as follows: Windows 11 (64-bit), 16 GB of memory, AMD Ryzen 9 7945HX processor (2.50 GHz) with integrated Radeon graphics, and MATLAB R2021a.

Benchmark function sets play an important role in evaluating algorithm performance in optimization research. The CEC2022 suite consists of 12 single-objective test functions with boundary constraints: a unimodal function (F1), multimodal functions (F2-F5), hybrid functions (F6-F8), and composition functions (F9-F12). To verify the performance of MSIWSO, the deviations of the solutions obtained by MSIWSO, HGWODE, and WSO on CEC2022 from the global optimum are compared and analyzed. For fairness, the population size for all three algorithms is 60 and the dimension is 20. Each algorithm was independently run 20 times on each benchmark function, and the results are shown in Table 2.
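The per-function statistics in Table 2 follow directly from the 20 independent runs. A minimal sketch of the aggregation (the run values below are toy numbers, not experimental data):

```python
import statistics

def summarize(run_results):
    """Reduce the deviations from the global optimum over independent
    runs to the four metrics reported in Table 2: mean, (sample)
    standard deviation, best (minimum), and worst (maximum)."""
    return {
        "Mean": statistics.mean(run_results),
        "Std": statistics.stdev(run_results),
        "Best": min(run_results),
        "Worst": max(run_results),
    }

runs = [0.12, 0.08, 0.31, 0.05, 0.22]  # toy deviations for illustration
s = summarize(runs)
```

Note that the mean of n runs can never fall below Worst/n when all deviations are non-negative, a sanity check worth applying when reading such tables.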

Table 2: Comparison of CEC2022 Test Function Results

| Function | Metric | WSO | HGWODE | MSIWSO |
|----------|--------|-----------|-----------|-----------|
| F1 | Mean | 6.6063E+3 | 5.1557E+4 | 3.6930E-1 |
| | Std | 1.2846E+4 | 1.7787E+4 | 1.0519E+0 |
| | Best | 7.6375E+1 | 2.1072E+4 | 5.6101E-3 |
| | Worst | 5.5527E+4 | 9.1184E+4 | 4.7583E+0 |
| F2 | Mean | 7.1111E+1 | 7.3370E+1 | 3.9969E+1 |
| | Std | 4.7849E+1 | 4.5124E+1 | 2.0541E+1 |
| | Best | 6.7036E+0 | 4.4895E+1 | 3.5191E-1 |
| | Worst | 1.9018E+2 | 2.0258E+2 | 6.8497E+1 |
| F3 | Mean | 1.1212E+1 | 3.4376E+0 | 5.8599E-2 |
| | Std | 6.9474E+0 | 2.9394E+0 | 3.6817E-2 |
| | Best | 1.1239E+0 | 1.5057E-1 | 9.7804E-3 |
| | Worst | 2.6268E+1 | 1.2497E+1 | 1.4413E-1 |
| F4 | Mean | 4.6237E+1 | 9.2060E+1 | 3.6512E+1 |
| | Std | 1.8113E+1 | 1.8285E+1 | 1.1712E+1 |
| | Best | 1.9908E+1 | 6.3145E+1 | 1.7909E+1 |
| | Worst | 9.4949E+1 | 1.2211E+2 | 5.7707E+1 |
| F5 | Mean | 2.1281E+2 | 1.9559E+1 | 2.6477E+1 |
| | Std | 1.7308E+2 | 6.5643E+1 | 3.2081E+1 |
| | Best | 2.2926E+1 | 6.0019E-1 | 1.1773E+0 |
| | Worst | 6.5081E+2 | 2.9661E+2 | 1.3115E+2 |
| F6 | Mean | 3.6311E+6 | 4.5594E+7 | 1.5354E+4 |
| | Std | 1.1370E+7 | 6.7329E+7 | 7.0112E+3 |
| | Best | 1.2954E+2 | 1.8659E+4 | 1.2708E+3 |
| | Worst | 4.3117E+7 | 3.2445E+8 | 2.3274E+4 |
| F7 | Mean | 1.2885E+2 | 1.4006E+2 | 4.9540E+1 |
| | Std | 7.7938E+1 | 5.1615E+1 | 3.3091E+1 |
| | Best | 2.8913E+1 | 4.0574E+1 | 2.4611E+1 |
| | Worst | 2.8562E+2 | 2.5550E+2 | 1.6948E+2 |
| F8 | Mean | 1.1242E+2 | 3.1875E+1 | 5.9004E+1 |
| | Std | 7.1440E+1 | 7.5534E+0 | 5.6606E+1 |
| | Best | 2.1882E+1 | 2.3716E+1 | 2.1019E+1 |
| | Worst | 2.8790E+2 | 4.9529E+1 | 1.6554E+2 |
| F9 | Mean | 2.4072E+2 | 1.9963E+2 | 1.8078E+2 |
| | Std | 6.3157E+1 | 2.4360E+1 | 5.3886E-4 |
| | Best | 1.8078E+2 | 1.8078E+2 | 1.8078E+2 |
| | Worst | 3.8076E+2 | 2.8008E+2 | 1.8078E+2 |
| F10 | Mean | 1.8345E+3 | 4.7949E+2 | 2.2791E+2 |
| | Std | 1.2359E+3 | 3.9170E+2 | 1.7296E+2 |
| | Best | 1.0096E+2 | 1.2632E+2 | 4.0477E+1 |
| | Worst | 3.3135E+3 | 1.4766E+3 | 5.7842E+2 |
| F11 | Mean | 4.4462E+2 | 3.0748E+2 | 3.0112E+2 |
| | Std | 1.1483E+2 | 1.0016E+1 | 4.6634E+0 |
| | Best | 3.0678E+2 | 3.0086E+2 | 3.0002E+2 |
| | Worst | 6.5283E+2 | 3.3946E+2 | 3.2093E+2 |
| F12 | Mean | 3.6492E+2 | 2.5682E+2 | 2.5313E+2 |
| | Std | 7.9126E+1 | 1.6898E+1 | 9.5241E+0 |
| | Best | 2.4954E+2 | 2.4449E+2 | 2.4125E+2 |
| | Worst | 5.0071E+2 | 3.0211E+2 | 2.7734E+2 |

From Table 2, MSIWSO shows clear performance advantages on most test functions. On the unimodal function F1, the mean value of MSIWSO is significantly better than those of WSO and HGWODE, and its best solution is close to the theoretical global optimum, indicating that it can quickly locate the global optimum.

Among the multimodal functions, MSIWSO attains the lowest best values on F2-F4, indicating that it can effectively avoid premature convergence on complex multimodal landscapes. On F5, although the best solution of MSIWSO is worse than that of HGWODE, its smaller worst value and standard deviation indicate more stable overall performance.

In hybrid functions F6 and F7, the average values of MSIWSO are much lower than the other two algorithms. This shows that it can better find the global optimal solution, avoid falling into local optima, and has strong global search capability.

In composition functions F9-F12, MSIWSO has the best optimal values and average values, indicating that it has strong global search capability and can widely explore in complex solution spaces to find the best solution. Moreover, in function F9, the standard deviation of MSIWSO approaches zero, and the optimal value and worst value are almost the same, indicating that MSIWSO can stably cover multiple local optimal regions and converge to the global optimum.

Across these 12 functions, the standard deviation of MSIWSO is generally lower than that of the other algorithms, indicating that its performance is more consistent across independent runs and less affected by initial conditions, random disturbances, or the complexity of the function itself. At the same time, MSIWSO performs well on unimodal, multimodal, hybrid, and composition functions; even on functions with asymmetric landscapes (F12) or high noise (F7), it maintains a low standard deviation and high-quality solutions, reflecting its adaptability to complex problems.

Meanwhile, to further verify the convergence performance of MSIWSO, convergence experiments were conducted on these test functions, and the convergence curves are shown in the figure. From the convergence curves, it can be found that MSIWSO can achieve faster convergence speed and obtain better solutions in these test functions, showing good convergence performance. The experimental results show that MSIWSO has good exploration and exploitation capabilities, improving the convergence speed and solution accuracy of the algorithm.

In summary, MSIWSO demonstrates good optimization capability and stability on the CEC2022 function set, and can effectively cope with complex and changing optimization problems. MSIWSO has strong adaptability and solving ability, and has broad application potential in solving complex optimization problems in the real world.

To evaluate the effectiveness and feasibility of MSIWSO in solving the three-dimensional path planning problem for Unmanned Aerial Vehicles, this paper conducts simulation experiments under two terrain models, 10-peak and 20-peak, and compares them with WSO and HGWODE algorithms.

As shown in the figures, all three algorithms can generate feasible paths from the starting point to the target point in the three-dimensional environment for Unmanned Aerial Vehicle path planning. However, the path planned by MSIWSO has more advantages. This is mainly because when encountering obstacles, MSIWSO adopts a lateral bypass strategy instead of climbing, which can effectively avoid the impact of obstacles. At the same time, MSIWSO can comprehensively consider the height factor of the flight path, maintain a safe distance from obstacles, and prioritize shorter paths, thereby improving the efficiency and feasibility of the path.

Moreover, from the convergence curves, it can be observed that in both 10-peak and 20-peak environments, MSIWSO shows better convergence performance. The fitness value of MSIWSO in the early iterations is not outstanding, but as the iterations progress, its fitness value continues to optimize, gradually decreases, and achieves stable convergence after 200 iterations. MSIWSO not only has good convergence speed but can also obtain better solutions. This is because MSIWSO combines the spiral search strategy for extensive exploration in the early stage and local fine search in the later stage, achieving a balance between exploration and exploitation and improving search efficiency.

From the box plot of fitness values, it can be seen that MSIWSO shows better advantages in both 10-peak and 20-peak environments. MSIWSO has better average and median values than other algorithms, and the box length and error lines of MSIWSO are shorter, indicating that its solution distribution is more concentrated, with better stability and consistency. In the 10-peak environment, compared with WSO and HGWODE, the average fitness value of MSIWSO is reduced by 40.35% and 68.64%, respectively, and the optimal fitness value is reduced by 4.46% and 5.51%, respectively. At the same time, in the more complex 20-peak environment, the average fitness value of MSIWSO is reduced by 72.2% and 34.7%, respectively, and the optimal fitness value is further reduced by 41.4% and 26.8%. This performance improvement is mainly due to the crisscross strategy and Cauchy mutation strategy. The former optimizes the solution distribution and improves the solution accuracy; the latter expands the search range, enhances population diversity, and avoids local optima. Therefore, the fitness value of the path generated by MSIWSO is lower, indicating that it can find a better flight path for the Unmanned Aerial Vehicle during the path planning process.

In addition, to more comprehensively evaluate the quality of the path planned by MSIWSO, this paper conducts experimental evaluations from four aspects: path height penalty, yaw angle penalty, pitch angle penalty, and path length, to verify the applicability of this algorithm in actual flight environments.

According to the table, MSIWSO performs better overall in terms of height penalty, yaw angle penalty, and pitch angle penalty. Specifically, MSIWSO is better than WSO and HGWODE in height penalty and pitch angle penalty, indicating that the flight path it plans has excellent performance in vertical height control and attitude adjustment. In the 10-peak environment, the average height penalty of MSIWSO is reduced by 99.7% compared to WSO, and its pitch angle penalty standard deviation is reduced by 82.8% compared to HGWODE. In the more challenging 20-peak environment, MSIWSO still maintains a large advantage. The worst height penalty is reduced by 14.5% compared to HGWODE, and the average yaw angle penalty is the lowest among the three algorithms. Although the worst yaw angle value of MSIWSO in the 20-peak scenario is slightly higher than HGWODE, its standard deviation is only 28.2% higher than HGWODE, indicating that the path fluctuation is still within a controllable range. Overall, MSIWSO has great advantages in the smoothness and stability of flight path planning, generating low-penalty paths in complex terrain, providing reliable guarantee for the actual flight of Unmanned Aerial Vehicles.

The box plot of path length shows that the path length distribution of MSIWSO is more concentrated in both environments. In the 10-peak environment, the average path length of MSIWSO is reduced by 1.16% and 2.78% compared to WSO and HGWODE, respectively, and the worst path length is reduced by 6.46% and 11.18%, respectively. In the 20-peak environment, although the terrain complexity increases, the average path length of MSIWSO is shortened by 11.2% and 2.7% compared to WSO and HGWODE, respectively, and the worst path length is significantly reduced by 16.9% compared to WSO, indicating that it can still plan a shorter path in complex terrain. In addition, the box length and error lines of MSIWSO are shorter, which indicates that the path planned by this algorithm has higher consistency and better stability in multiple run results, and can plan shorter paths for Unmanned Aerial Vehicles in most cases.

This paper proposes a multi-strategy improved white shark optimization algorithm to solve the path planning problem of Unmanned Aerial Vehicles in three-dimensional space. First, the spiral search strategy enables broad exploration in the early stage and focused local refinement in the later stage, balancing exploration and exploitation and improving search efficiency. Second, the crisscross strategy finely optimizes the solution distribution, improving global search capability and solution accuracy. Finally, the Cauchy mutation strategy exploits the long-tailed Cauchy distribution to introduce jump perturbations that generate new solutions, effectively expanding the search range, increasing population diversity, and preventing the algorithm from falling into local optima. Experimental results show that MSIWSO has good optimization capability and stability and can plan safe, efficient flight paths for Unmanned Aerial Vehicles in three-dimensional environments, providing a practical and feasible solution to the Unmanned Aerial Vehicle path planning problem.

Although the algorithm proposed in this study can effectively solve the Unmanned Aerial Vehicle path planning problem, there is still room for optimization in real-time performance in dynamic environments. Future research can focus on more efficient dynamic environment obstacle avoidance strategies to enhance the algorithm’s autonomous obstacle avoidance ability in unknown environments and further expand its application range.
