The orchestrated flight of multiple unmanned aerial vehicles, known as drone formation, represents a pinnacle of coordination and autonomy in modern aerospace systems. From a historical perspective, the conceptual underpinnings of coordinated aerial systems can be traced to early 20th-century military reconnaissance, where single drones operated in isolation due to technological constraints. The paradigm shifted markedly in the 1960s, with pioneering efforts by military entities to develop multi-drone cooperative systems, exemplified by the use of “Firebee” drone formations in the Vietnam War. Today, the evolution of drone formation technology is profoundly intertwined with advancements in artificial intelligence, autonomous navigation, and adaptive control, enabling unprecedented efficiency and precision in complex missions spanning defense, disaster response, and commercial applications.
Operating a cohesive drone formation offers significant advantages over solitary platforms. It provides enhanced situational awareness through a wider sensor footprint, increases operational efficiency via task parallelism, and improves system robustness through mutual support—ensuring mission continuity even if individual units fail. The core challenge in realizing these benefits lies in effective formation control, which scholars often categorize based on the information used: position-based, displacement-based, or distance-based control. Central to all these approaches is the fundamental requirement for accurate and reliable localization—determining the precise spatial relationship between each drone and its peers or targets. This capability is the bedrock upon which stable drone formation geometry and cooperative behavior are built.

Localization methodologies for drone formations are broadly classified into two distinct philosophies: active and passive. This article, after a comparative analysis of both, will focus primarily on surveying the principles, methods, and characteristics of passive localization techniques for cooperative aerial networks.
Active vs. Passive Localization: A Foundational Dichotomy
The choice between active and passive localization fundamentally shapes the capabilities, vulnerabilities, and application domain of a drone formation system. A clear understanding of this dichotomy is essential.
Active Localization involves drones actively emitting signals—such as radio waves, acoustic pulses, or laser beams—and processing the reflected returns from other drones, beacons, or the environment. Techniques like Time Difference of Arrival (TDoA), Frequency Difference of Arrival (FDoA), and direct ranging (e.g., using Ultra-Wideband) fall under this category. The measured parameters are used to solve geometric equations for position.
The principal advantage of active localization is its potential for high precision and self-contained operation. It provides direct range information, which is often crucial for tight formation-keeping. The underlying calculation for a simple ranging method between two drones can be expressed as:
$$ d = \frac{c \cdot \Delta t}{2} $$
where \( d \) is the estimated distance, \( c \) is the speed of the signal (e.g., speed of light for RF), and \( \Delta t \) is the measured round-trip time; the factor of one half accounts for the signal covering the distance twice (for a one-way time of flight the relation is simply \( d = c \cdot \Delta t \)). This direct measurement leads to well-posed estimation problems. Furthermore, active systems can operate independently of external infrastructure, providing robust navigation in GPS-denied environments.
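As a minimal sketch of the ranging relation, the function below converts a round-trip time into a one-way distance. It assumes the signal travels to the peer and back, so only half of the round-trip path counts. The function name and the `reply_delay_s` parameter (a known processing delay at the responding drone) are illustrative assumptions, not part of any particular ranging standard.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def two_way_range(round_trip_s, reply_delay_s=0.0, c=C_LIGHT):
    """One-way distance from a measured round-trip time.

    reply_delay_s is a hypothetical, known turnaround delay at the responder;
    it is subtracted before halving the remaining time of flight.
    """
    time_of_flight = (round_trip_s - reply_delay_s) / 2.0
    if time_of_flight < 0:
        raise ValueError("reply delay exceeds the measured round-trip time")
    return c * time_of_flight
```

A round trip of 2 microseconds with no reply delay corresponds to roughly 300 m of separation, which shows why nanosecond-level timing is needed for sub-meter formation-keeping.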
However, this approach carries inherent disadvantages. The emission of energy makes the drone formation detectable and susceptible to electronic warfare tactics like jamming and spoofing. It also increases the platform’s power consumption and electromagnetic signature, which is undesirable for covert operations.
Passive Localization, in contrast, operates on the principle of reception only. Drones in the formation determine their relative positions or the location of a target by solely processing signals that are naturally present or emitted by external sources (e.g., other drones’ communication links, terrestrial radio towers, or target emissions). Common techniques include Angle of Arrival (AoA), Time of Arrival (ToA) from unsynchronized emitters, and received signal strength (RSS) fingerprinting.
The defining characteristic of passive localization is its low probability of intercept (LPI): because the drone formation does not reveal its presence through emissions, it is hard to detect and hard to target with jamming. This makes it well suited to surveillance, reconnaissance, and electronic intelligence (ELINT) missions. Its challenges stem from the nature of the measurements. For instance, bearing-only (AoA) measurements lead to an observability problem: without direct range data, the estimation process is non-linear and can converge slowly or diverge without careful filtering and maneuvering. The core measurement model for AoA is:
$$ \theta_i = \arctan\left(\frac{y_t - y_i}{x_t - x_i}\right) + \nu_{\theta} $$
where \( \theta_i \) is the bearing measured by drone \( i \), \( (x_t, y_t) \) is the target position, \( (x_i, y_i) \) is the drone’s position, and \( \nu_{\theta} \) is measurement noise.
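To illustrate how several bearing measurements combine into a position fix, the sketch below intersects the bearing lines in a least-squares sense. It assumes noise-free or mildly noisy bearings expressed in a shared world frame and known observer positions; the function name `triangulate` is hypothetical. Each bearing constrains the target to a line, so the method cannot tell whether the target lies ahead of or behind an observer along that line.

```python
import math


def triangulate(observers, bearings):
    """Least-squares intersection of bearing lines in 2D.

    observers: list of (x_i, y_i) drone positions
    bearings:  list of theta_i in radians, measured from the +x axis
    """
    # Each bearing constrains the target (x, y) to the line
    #   sin(th) * x - cos(th) * y = sin(th) * x_i - cos(th) * y_i
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (xi, yi), th in zip(observers, bearings):
        s, c = math.sin(th), math.cos(th)
        rhs = s * xi - c * yi
        # Accumulate the 2x2 normal equations A^T A [x, y]^T = A^T b
        a11 += s * s
        a12 += -s * c
        a22 += c * c
        b1 += s * rhs
        b2 += -c * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are (nearly) parallel; target not observable")
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y
```

The determinant check makes the geometry requirement explicit: when all lines of sight are parallel, no amount of averaging recovers the range.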
The following table summarizes the core attributes of both approaches in the context of drone formation operations:
| Feature | Active Localization | Passive Localization |
|---|---|---|
| Core Principle | Emit signal and process reflection/response. | Process ambient or external signals only. |
| Key Advantage | High precision, direct range data, self-sufficient. | High covertness (LPI), low detectability, resistant to jamming. |
| Primary Disadvantage | Easily detectable, susceptible to jamming, higher power draw. | Observability issues (e.g., bearing-only), often requires sensor fusion, complex estimation. |
| Typical Metrics | Range (d), Time Difference (TDoA). | Bearing (θ), Signal Strength (RSS), Time of Arrival (ToA). |
| Formation Suitability | Tight, high-precision formation-keeping in permissive environments. | Covert, wide-area surveillance formations; electronic warfare packages. |
A Deep Dive into Passive Localization Methodologies for Drone Formations
Given its strategic advantages in covertness and electronic resilience, passive localization has been the focus of extensive research for drone formation control. Without active ranging, the formation must infer its geometry and maintain it through sophisticated control laws based on other sensed or communicated parameters. The mature and stable control paradigms for passively-localized drone formations can be classified into several distinct categories, each with its own mathematical foundation and practical implications.
1. Leader-Follower (Master-Slave) Approach
This is a hierarchical control strategy where one or a few designated drones, the leaders, are responsible for trajectory planning and navigation. The remaining follower drones regulate their states (position, velocity) relative to their assigned leader(s) using only local relative measurements (like bearing or visual tracking), which are inherently passive. The fundamental control objective for a follower \( i \) with respect to its leader \( L \) is to track a desired position \( \mathbf{p}_{i, des} = \mathbf{p}_L + \mathbf{d}_{i}^L \), where \( \mathbf{p}_L \) is the leader’s position and \( \mathbf{d}_{i}^L \) is the desired offset from the leader, expressed in the common navigation frame (an offset specified in the leader’s body frame must first be rotated by the leader’s heading). A common simplified control law for a 2D case using bearing \( \phi \) and estimated range \( \hat{r} \) might be:
$$ \mathbf{v}_i = K_p (\hat{r} \cos\phi - d_x, \hat{r} \sin\phi - d_y)^T $$
where \( \mathbf{v}_i \) is the follower’s velocity command, \( K_p > 0 \) is a gain, \( \phi \) and \( \hat{r} \) are the bearing and estimated range of the leader as seen from the follower, and \( (d_x, d_y) \) is the desired position of the leader relative to the follower, i.e. \( (d_x, d_y)^T = -\mathbf{d}_{i}^L \), so the command vanishes once the measured relative position matches the desired one. The strength of this method lies in its conceptual simplicity and reduced inter-drone communication load, since often only leader states or local measurements are needed. However, it introduces a critical single point of failure: the failure of a leader can destabilize the entire sub-formation. Research has focused on enhancing robustness, for instance, by introducing dynamic leader re-assignment or hybrid virtual structures to mitigate this dependency.
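The proportional law above can be exercised with a short simulation. This is a sketch under stated assumptions: the leader is stationary, the bearing \( \phi \) and range \( \hat{r} \) describe the leader as seen from the follower in a world-aligned frame, and \( (d_x, d_y) \) is the desired position of the leader relative to the follower. The function names, gain, and step size are illustrative, and the "sensor" is simulated from true positions rather than estimated.

```python
import math


def follower_step(p_follower, p_leader, d, kp=1.0, dt=0.05):
    """One Euler step of v = Kp * (r*cos(phi) - d_x, r*sin(phi) - d_y)."""
    # Stand-in for the passive sensor: range estimate and bearing of the
    # leader as seen from the follower (a real system would estimate these).
    rel_x = p_leader[0] - p_follower[0]
    rel_y = p_leader[1] - p_follower[1]
    r_hat = math.hypot(rel_x, rel_y)
    phi = math.atan2(rel_y, rel_x)
    # Velocity command proportional to the relative-position error
    vx = kp * (r_hat * math.cos(phi) - d[0])
    vy = kp * (r_hat * math.sin(phi) - d[1])
    return (p_follower[0] + vx * dt, p_follower[1] + vy * dt)


def simulate_follower(p_follower, p_leader, d, steps=400):
    """Repeat follower_step against a stationary leader."""
    for _ in range(steps):
        p_follower = follower_step(p_follower, p_leader, d)
    return p_follower
```

With a leader at (10, 5) and a desired leader-relative-to-follower position of (2, 0), the follower settles at the leader position minus that vector. A moving leader would add a tracking lag unless the leader's velocity is fed forward.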
2. Graph Theory-Based Formations
This approach provides a rigorous mathematical framework for modeling the sensing and communication interactions within a drone formation. Each drone is represented as a node \( v_i \) in a graph \( G = (V, E) \), and an edge \( e_{ij} \in E \) exists if drone \( i \) can measure a relative state (e.g., relative position, bearing, or distance) with respect to drone \( j \). This measurement topology is often described by an adjacency matrix \( A \) or a Laplacian matrix \( L \). The formation control problem then translates to driving the system states to the null space of a matrix derived from \( L \) and the desired geometric constraints. For a formation defined by desired relative positions \( \mathbf{p}_{ij}^* = \mathbf{p}_i^* - \mathbf{p}_j^* \), a standard gradient-based control law using only relative position measurements \( \mathbf{p}_{ij} \) is:
$$ \dot{\mathbf{p}}_i = -\sum_{j \in N_i} (\mathbf{p}_{ij} - \mathbf{p}_{ij}^*) $$
where \( N_i \) is the set of neighbors of drone \( i \). Graph theory elegantly handles arbitrary formation shapes and allows for analysis of connectivity, convergence, and rigidity—a property ensuring the formation shape is uniquely determined by the inter-agent distance measurements. The challenge lies in ensuring the graph remains connected and that the local passive measurements (like bearing) collectively provide sufficient information for global shape stability, a concept formalized as “bearing rigidity.”
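A small simulation can make the displacement-based law concrete. The sketch below assumes \( \mathbf{p}_{ij} = \mathbf{p}_i - \mathbf{p}_j \), an undirected and connected neighbor graph, and single-integrator drones updated with explicit Euler steps. The function names, the dictionary-based neighbor structure, and the step size are illustrative choices.

```python
def formation_step(positions, neighbors, shape, dt=0.05):
    """One synchronous Euler step of p_i' = -sum_j (p_ij - p_ij*).

    positions: list of (x, y) for each drone
    neighbors: dict mapping drone index -> list of neighbor indices
    shape:     list of (x, y) template points; p_ij* = shape[i] - shape[j]
    """
    updated = []
    for i, (xi, yi) in enumerate(positions):
        ux = uy = 0.0
        for j in neighbors[i]:
            xj, yj = positions[j]
            # Relative-position error with respect to neighbor j
            ux -= (xi - xj) - (shape[i][0] - shape[j][0])
            uy -= (yi - yj) - (shape[i][1] - shape[j][1])
        updated.append((xi + dt * ux, yi + dt * uy))
    return updated


def simulate_formation(positions, neighbors, shape, steps=600, dt=0.05):
    """Iterate formation_step from the given initial positions."""
    for _ in range(steps):
        positions = formation_step(positions, neighbors, shape, dt)
    return positions
```

Only relative quantities enter the update, so the formation converges to the template shape up to a translation; with an undirected graph the centroid of the group stays where it started.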
3. Behavior-Based (Bio-Inspired) Methods
Inspired by the emergent, coordinated behaviors of flocks of birds or schools of fish, this decentralized paradigm eschews explicit global geometry. Each drone in the formation executes a set of simple local rules based on passive observations of its nearby peers. These rules typically include:
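- Separation: Steer to avoid crowding local flockmates. A repulsive potential field can model this: $$ \mathbf{F}_{sep, i} = \sum_{j \in N_i} k_{sep} \frac{1}{||\mathbf{r}_{ij}||^2} \hat{\mathbf{r}}_{ij} $$ where \( \mathbf{r}_{ij} = \mathbf{p}_i - \mathbf{p}_j \) points from neighbor \( j \) toward drone \( i \), \( \hat{\mathbf{r}}_{ij} \) is its unit vector, and \( k_{sep} > 0 \), so each term pushes drone \( i \) away from its neighbor.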
- Alignment: Steer towards the average heading of local flockmates. $$ \mathbf{v}_{align, i} = \frac{1}{|N_i|} \sum_{j \in N_i} \mathbf{v}_j $$
- Cohesion: Steer to move toward the average position of local flockmates. $$ \mathbf{F}_{coh, i} = k_{coh} \left( \frac{1}{|N_i|} \sum_{j \in N_i} \mathbf{p}_j - \mathbf{p}_i \right) $$
The final control input for drone \( i \) is a weighted sum: \( \mathbf{u}_i = w_{sep}\mathbf{F}_{sep,i} + w_{align}\mathbf{v}_{align,i} + w_{coh}\mathbf{F}_{coh,i} \). This method offers remarkable robustness and adaptability, as the drone formation self-organizes without a central plan. It is highly scalable and fault-tolerant. However, guaranteeing precise geometric shape or achieving specific, rigid patterns is difficult, making it more suitable for tasks like area coverage, surveillance, or obstacle avoidance where exact positioning is less critical than overall group cohesion and motion.
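The three rules and their weighted sum can be sketched in a few lines. This example assumes the separation term pushes drone \( i \) away from each neighbor (along \( +\hat{\mathbf{r}}_{ij} \) with \( \mathbf{r}_{ij} = \mathbf{p}_i - \mathbf{p}_j \)), and that neighbor positions and velocities are already available from passive sensing. The function name, the default gains, and the small distance floor are illustrative assumptions.

```python
import math


def flocking_input(i, pos, vel, neighbors, k_sep=1.0, k_coh=1.0,
                   weights=(1.0, 1.0, 1.0)):
    """Weighted sum of separation, alignment and cohesion terms for drone i."""
    if not neighbors:
        return (0.0, 0.0)
    n = len(neighbors)
    sep_x = sep_y = 0.0
    for j in neighbors:
        rx = pos[i][0] - pos[j][0]          # r_ij = p_i - p_j
        ry = pos[i][1] - pos[j][1]
        dist = max(math.hypot(rx, ry), 1e-9)  # floor avoids division by zero
        # Magnitude k_sep / |r|^2 along the unit vector r / |r| (away from j)
        sep_x += k_sep * rx / dist ** 3
        sep_y += k_sep * ry / dist ** 3
    # Alignment: average neighbor velocity
    align_x = sum(vel[j][0] for j in neighbors) / n
    align_y = sum(vel[j][1] for j in neighbors) / n
    # Cohesion: pull toward the neighbors' mean position
    coh_x = k_coh * (sum(pos[j][0] for j in neighbors) / n - pos[i][0])
    coh_y = k_coh * (sum(pos[j][1] for j in neighbors) / n - pos[i][1])
    w_sep, w_align, w_coh = weights
    return (w_sep * sep_x + w_align * align_x + w_coh * coh_x,
            w_sep * sep_y + w_align * align_y + w_coh * coh_y)
```

Because separation grows as the inverse square of distance while cohesion grows linearly, the two balance at a preferred spacing set by the ratio of their gains and weights, which is how a loose group shape emerges without any explicit geometry.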
4. Virtual Structure Approach
Here, the entire drone formation is treated as a single, rigid virtual body. Each drone is assigned a fixed reference point on this virtual structure. The control objective for each drone is to make its actual position \( \mathbf{p}_i \) track its moving reference point \( \mathbf{p}_{i, vs}(t) \) on the virtual structure. This reference is defined by:
$$ \mathbf{p}_{i, vs}(t) = \mathbf{p}_{vs}(t) + R(\theta_{vs}(t)) \mathbf{d}_i $$
where \( \mathbf{p}_{vs}(t) \) and \( \theta_{vs}(t) \) are the virtual structure’s global position and orientation, \( R \) is a rotation matrix, and \( \mathbf{d}_i \) is the drone’s fixed offset in the structure’s body frame. The drones need to know or estimate the virtual structure’s state \( (\mathbf{p}_{vs}, \theta_{vs}) \), which can be broadcast by a leader or agreed upon collaboratively. The control law is essentially a set of independent tracking controllers:
$$ \ddot{\mathbf{p}}_i = \ddot{\mathbf{p}}_{i, vs} + K_d (\dot{\mathbf{p}}_{i, vs} - \dot{\mathbf{p}}_i) + K_p (\mathbf{p}_{i, vs} - \mathbf{p}_i) $$
This method provides excellent precision for maintaining complex, time-varying geometric patterns (like letters in a light show), as the virtual structure’s motion is perfectly known. The drawback is its centralized flavor—the definition and dissemination of the virtual structure’s trajectory create a potential single point of failure and require reliable, potentially high-bandwidth communication, which conflicts with the pure passive sensing ideal. It is often used in scenarios where precise choreography is paramount and the environment is controlled.
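The reference-point equation and the tracking law can be sketched as follows. The example assumes a planar virtual structure, a double-integrator drone model, and a stationary structure in the simulation helper so that the reference velocity and acceleration are zero. The function names and the gains \( K_p = K_d = 4 \) (a critically damped choice) are illustrative assumptions.

```python
import math


def reference_point(p_vs, theta_vs, d_i):
    """p_vs + R(theta_vs) * d_i for a 2D virtual structure."""
    c, s = math.cos(theta_vs), math.sin(theta_vs)
    return (p_vs[0] + c * d_i[0] - s * d_i[1],
            p_vs[1] + s * d_i[0] + c * d_i[1])


def tracking_accel(p, v, p_ref, v_ref, a_ref, kp=4.0, kd=4.0):
    """Feedforward acceleration plus PD feedback on the tracking error."""
    return tuple(a_ref[k] + kd * (v_ref[k] - v[k]) + kp * (p_ref[k] - p[k])
                 for k in range(2))


def track_static_structure(p, p_vs, theta_vs, d_i, steps=2000, dt=0.01):
    """Drive a drone at rest toward its slot on a stationary structure."""
    v = (0.0, 0.0)
    zero = (0.0, 0.0)
    p_ref = reference_point(p_vs, theta_vs, d_i)
    for _ in range(steps):
        a = tracking_accel(p, v, p_ref, zero, zero)
        # Explicit Euler integration of the double integrator
        p = (p[0] + v[0] * dt, p[1] + v[1] * dt)
        v = (v[0] + a[0] * dt, v[1] + a[1] * dt)
    return p
```

Each drone runs this controller independently; the only shared quantity is the structure state \( (\mathbf{p}_{vs}, \theta_{vs}) \), which is why reliable dissemination of that state matters so much for this approach.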
5. Consensus-Based Algorithms
Consensus theory provides the mathematical tools for a group of agents to reach an agreement on a shared value (e.g., a common velocity, a rendezvous point, or a geometric configuration) using only local communication or sensing. For drone formation control, consensus is typically applied to achieve and maintain a desired relative state. In a widely used approach, the formation is defined by a set of desired inter-agent distances \( d_{ij}^* \). The control objective is to drive all drones to satisfy \( ||\mathbf{p}_i - \mathbf{p}_j|| \to d_{ij}^* \) for all specified pairs \( (i,j) \). A standard consensus-based control law for this is:
$$ \dot{\mathbf{p}}_i = \sum_{j \in N_i} (\mathbf{p}_j - \mathbf{p}_i) - \sum_{j \in N_i} \frac{(||\mathbf{r}_{ij}|| - d_{ij}^*)}{||\mathbf{r}_{ij}||} \mathbf{r}_{ij} $$
where \( \mathbf{r}_{ij} = \mathbf{p}_i - \mathbf{p}_j \). The first term is a consensus term on positions that encourages agents to gather, while the second term is the negative gradient of a potential function based on distance error: it pulls drone \( i \) toward neighbor \( j \) when they are farther apart than \( d_{ij}^* \) and pushes it away when they are closer. This method is fully distributed, robust to dynamic changes in network topology, and does not require a leader. Its performance is highly dependent on the connectivity of the underlying interaction graph and can be sensitive to communication delays. It represents a powerful framework for self-organizing drone formations where passive relative measurements (like vision-based estimates of neighbor positions) are used to compute the control input.
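The sketch below simulates the distance-error term of this law. It assumes \( \mathbf{r}_{ij} = \mathbf{p}_i - \mathbf{p}_j \) and applies the term as a descent direction, so that drones approach each other when too far apart. The gathering term is exposed through a hypothetical weight `k_gather` that defaults to zero, because a nonzero gathering term shifts the equilibrium: for two drones the separation settles at \( d^*/(1 + k) \) rather than \( d^* \). The function names and step size are illustrative.

```python
import math


def distance_formation_step(pos, desired, k_gather=0.0, dt=0.05):
    """One Euler step of distance-based formation control.

    pos:      list of (x, y) drone positions
    desired:  dict {(i, j): d_ij*}, each undirected pair listed once
    k_gather: hypothetical weight on the position-consensus (gathering) term
    """
    vel = [[0.0, 0.0] for _ in pos]
    for (i, j), d_star in desired.items():
        rx = pos[i][0] - pos[j][0]            # r_ij = p_i - p_j
        ry = pos[i][1] - pos[j][1]
        dist = max(math.hypot(rx, ry), 1e-9)  # floor avoids division by zero
        g = (dist - d_star) / dist            # normalized distance error
        # Drone i moves along -(k_gather + g) * r_ij; drone j mirrors it
        fx, fy = (k_gather + g) * rx, (k_gather + g) * ry
        vel[i][0] -= fx
        vel[i][1] -= fy
        vel[j][0] += fx
        vel[j][1] += fy
    return [(p[0] + dt * v[0], p[1] + dt * v[1]) for p, v in zip(pos, vel)]


def settle(pos, desired, k_gather=0.0, steps=300):
    """Iterate the step function and return the final positions."""
    for _ in range(steps):
        pos = distance_formation_step(pos, desired, k_gather)
    return pos
```

Since only distances are constrained, the final formation is determined up to translation and rotation, and for larger groups the set of constrained pairs must form a rigid graph for the shape to be unique.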
Synthesis and Comparative Analysis of Passive Formation Methods
The following table synthesizes the key attributes of the five primary passive localization and control methods for drone formations, highlighting their operational principles, strengths, and inherent limitations.
| Method | Core Principle | Key Advantages | Primary Challenges & Limitations | Typical Sensor Suite |
|---|---|---|---|---|
| Leader-Follower | Hierarchical tracking of designated leader(s). | Simple to design and implement; reduces global communication needs; clear chain of command. | Single point of failure (leader); error propagation down the chain; poor scalability for complex shapes. | Visual (camera) for leader tracking; IMU; possibly inter-drone bearing/range (passive). |
| Graph Theory | Maintain constraints defined by a graph of desired relative states. | Rigorous mathematical foundation; can describe arbitrary formations; analyzes stability and connectivity formally. | Requires careful design for rigidity/observability; control laws can be complex; sensitive to initial conditions and measurement noise. | Relative position/bearing/distance sensors (e.g., vision, UWB for ToA); communication for neighbor state. |
| Behavior-Based | Local reactive rules (separation, alignment, cohesion). | Highly robust and fault-tolerant; emergent adaptability to environment; completely decentralized and scalable. | Difficult to guarantee precise geometric shape; predictable global behavior is hard to formally specify; potential for chaotic motion. | Local relative position/velocity sensing (e.g., radar, lidar, vision); no need for global ID. |
| Virtual Structure | Track a fixed point on a moving virtual rigid body. | Enables precise execution of complex, time-varying patterns; simplifies high-level path planning for the whole formation. | Centralized planning component; requires reliable dissemination of virtual structure state; less flexible to dynamic changes. | Absolute or relative positioning to compute tracking error (GPS, vision-based SLAM); good inter-drone comms for structure data. |
| Consensus-Based | Reach agreement on shared states (e.g., relative distances, velocities) via local interaction. | Fully distributed; strong theoretical guarantees on convergence; highly flexible and robust to topology changes. | Convergence speed depends on network connectivity; sensitive to communication delays; may require many iterations for large formations. | Relative state measurement (distance, bearing); local communication for state exchange. |
Mathematical Underpinnings and Observability
A unifying challenge across many passive drone formation localization schemes is the observability problem. When drones rely on bearing-only or other incomplete relative measurements, the system’s state may not be fully observable without specific maneuvers. Consider a simplified case where two drones estimate the relative position of a third using only bearings. The system’s observability can be analyzed through the rank of the observability matrix \( \mathcal{O} \) derived from the linearized measurement model. For bearing \( \theta \) to a target, the measurement Jacobian \( H \) is:
$$ H = \nabla_{\mathbf{x}} \theta = \left[ \frac{(y_t-y_i)}{r^2}, \frac{-(x_t-x_i)}{r^2}, \frac{-(y_t-y_i)}{r^2}, \frac{(x_t-x_i)}{r^2} \right] $$
for state \( \mathbf{x} = [x_i, y_i, x_t, y_t]^T \) and range \( r = \sqrt{(x_t-x_i)^2+(y_t-y_i)^2} \). The system becomes observable only if the observer drone undertakes maneuvers that make the measurement directions linearly independent over time. This is formalized as the need for non-zero “relative motion” between the observer and the target. For a stable drone formation, this implies that passive localization schemes often require integrated estimation filters (like Extended Kalman Filters or Particle Filters) and carefully designed cooperative control laws that ensure the collective motion maintains observability of the entire formation’s state.
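The Jacobian and the role of geometry in observability can be checked numerically. In the sketch below, the analytic row is obtained by differentiating the bearing model directly for the state ordering \( [x_i, y_i, x_t, y_t] \) and is compared against central finite differences. The helper `target_info_det` is a hypothetical illustration: it sums the target-related part of the Jacobian over several observer positions, assuming unit measurement noise, and returns the determinant of the resulting 2x2 information matrix.

```python
import math


def bearing(x):
    """Bearing from observer (x_i, y_i) to target (x_t, y_t)."""
    xi, yi, xt, yt = x
    return math.atan2(yt - yi, xt - xi)


def bearing_jacobian(x):
    """Analytic gradient of the bearing w.r.t. [x_i, y_i, x_t, y_t]."""
    xi, yi, xt, yt = x
    dx, dy = xt - xi, yt - yi
    r2 = dx * dx + dy * dy
    return [dy / r2, -dx / r2, -dy / r2, dx / r2]


def numeric_jacobian(x, eps=1e-6):
    """Central finite-difference gradient, used only as a cross-check."""
    grad = []
    for k in range(4):
        hi, lo = list(x), list(x)
        hi[k] += eps
        lo[k] -= eps
        grad.append((bearing(hi) - bearing(lo)) / (2 * eps))
    return grad


def target_info_det(observers, target):
    """det of sum_k H_t^T H_t, where H_t is the target part of the Jacobian."""
    a11 = a12 = a22 = 0.0
    for xi, yi in observers:
        h = bearing_jacobian([xi, yi, target[0], target[1]])
        hx, hy = h[2], h[3]
        a11 += hx * hx
        a12 += hx * hy
        a22 += hy * hy
    return a11 * a22 - a12 * a12
```

When every observer position lies on the same line of sight to the target, the determinant is zero and the range cannot be recovered; spreading the observers, or maneuvering a single observer across the line of sight, makes it positive.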
Conclusion and Future Trajectories
In summary, passive localization stands as a critical enabling technology for covert, robust, and resilient drone formation operations. Its principal virtue lies in its low electromagnetic signature, granting a decisive advantage in contested or surveillance-oriented environments. This article has surveyed the dominant control paradigms—Leader-Follower, Graph Theory, Behavior-Based, Virtual Structure, and Consensus-Based methods—that enable formations to function without active ranging. Each method presents a unique trade-off between precision, robustness, decentralization, and complexity.
The fundamental limitation of passive approaches, namely the lack of direct distance information and associated observability challenges, continues to drive research. Future directions are poised at the intersection of several advanced fields. Firstly, the fusion of passive sensing modalities (e.g., multi-camera vision, acoustic arrays, and electronic support measures) with lightweight active sensors in a controlled, sporadic manner could yield hybrid systems that balance covertness with periodic high-precision updates. Secondly, the integration of machine learning, particularly deep reinforcement learning, offers promise for developing adaptive control policies that can optimize formation behavior and state estimation in complex, uncertain environments where traditional models may fail. Finally, the advancement of secure, low-probability-of-intercept communication protocols is essential to support the information exchange required by many distributed passive localization algorithms without negating their covertness benefit.
The evolution of drone formation technology is inexorably linked to solving these localization challenges. As research progresses towards more autonomous, intelligent, and adaptable swarms, passive and hybrid localization strategies will undoubtedly form the cornerstone of next-generation systems capable of operating effectively in the most demanding and unpredictable scenarios.
