Autonomous Fire Drone for High-Rise Building Fire Surveillance and Reconnaissance

The proliferation of high-rise and super-tall structures globally presents a formidable challenge to conventional firefighting methodologies. The core dilemma lies in the severe limitation of situational awareness. Firefighters on the ground struggle to accurately assess the scope, intensity, and progression of a blaze occurring dozens or hundreds of meters above street level. This critical information gap hinders effective command decisions, resource deployment, and ultimately, rescue and suppression operations. External firefighting from aerial platforms like helicopters is risky, expensive, and often constrained by smoke, heat, and structural obstacles. Our project addresses this fundamental need for persistent, external, and intelligent reconnaissance. We have designed and developed an autonomous fire drone system specifically engineered for the high-rise environment. This system transcends the limitations of remote-controlled consumer-grade drones by integrating advanced machine perception, automated navigation, and robust data-link capabilities to provide a continuous, information-rich aerial perspective of the fire zone.

The overarching design philosophy for our fire drone centers on autonomous operation from launch to persistent station-keeping. The system must reliably locate a fire source on a high-rise facade, navigate to a safe and optimal observation point, and maintain its position while streaming multi-sensor data to incident commanders. This automation is crucial as it removes the burden of manual piloting in a complex, dynamic, and potentially communications-degraded environment, allowing human operators to focus on analysis and decision-making. The system architecture is built upon a powerful onboard computing core that synthesizes data from a suite of heterogeneous sensors. Key capabilities include visual and thermal fire detection, obstacle avoidance, precise relative positioning, and long-range, low-latency data transmission. The following table summarizes the core design objectives and the corresponding technological solutions implemented in our fire drone.

| Design Objective | Technical Solution | Key Benefit |
| --- | --- | --- |
| Autonomous Fire Source Detection | Fusion of RGB & thermal imagery with deep learning (YOLO) | Reliable detection day/night, through minor smoke/obscurants |
| Safe Navigation & Obstacle Avoidance | Stereo vision (dual RGB cameras) for 3D mapping and depth estimation | Real-time perception of building facade, windows, and other obstacles |
| Persistent, Stable Observation | Advanced flight controller with GPS-denied positioning (visual odometry) | Maintains a fixed relative position to the fire for continuous monitoring |
| High-Bandwidth, Reliable Data Link | Adaptive communication (5G P2P / robust Wi-Fi) with automatic failover | Ensures continuous streaming of video and sensor data to ground command |
| Extended Operational Loiter Time | Efficient powertrain & power management system | Maximizes on-station time for prolonged reconnaissance missions |

System Architecture and Workflow

The operational workflow of the fire drone is a sequence of autonomous states, transitioning based on sensor input and algorithmic decisions. The process begins with the drone being launched, typically from a secure ground or rooftop location within line-of-sight of the target building. Upon reaching a pre-defined scan altitude, it enters the Search Phase.

Search and Identification Phase: The drone performs a systematic scan of the building facade. The core sensor fusion for fire detection happens here. The RGB camera and the thermal imaging sensor are spatially aligned through a pre-calibration process. This calibration determines a homography or transformation matrix $$ \mathbf{H} $$ that maps pixels from the thermal image onto the RGB image plane, correcting for parallax and lens distortion. The aligned thermal channel (representing temperature) is then combined with the RGB channels to create a multi-channel input array $$ \mathbf{I}_{fusion} $$.

$$ \mathbf{I}_{fusion}(x,y) = [R(x,y), G(x,y), B(x,y), T_{aligned}(x,y)] $$
This 4-channel image is fed into a convolutional neural network (CNN) based on the YOLO (You Only Look Once) architecture, which has been specifically trained on a diverse dataset of high-rise fire scenarios. The model outputs bounding box predictions $$ \mathbf{b} = [x_{center}, y_{center}, width, height] $$ and a confidence score $$ c \in [0,1] $$ for fire-like regions. A detection is considered valid when $$ c > \tau $$, where $$ \tau $$ is a high confidence threshold (e.g., 0.8).
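The fusion and thresholding steps above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: `fuse_rgbt` and `filter_detections` are hypothetical helper names, and the detection tuples stand in for the YOLO model's decoded outputs.

```python
import numpy as np

def fuse_rgbt(rgb, thermal_aligned):
    """Stack the aligned thermal channel onto the RGB image, producing
    the 4-channel input I_fusion(x, y) = [R, G, B, T_aligned]."""
    return np.dstack([rgb, thermal_aligned])

def filter_detections(detections, tau=0.8):
    """Keep only boxes whose confidence exceeds the threshold tau.
    Each detection is (x_center, y_center, width, height, confidence)."""
    return [d for d in detections if d[4] > tau]

# A 4x4 toy frame: three RGB channels plus one thermal channel.
rgb = np.zeros((4, 4, 3))
thermal = np.ones((4, 4))
fused = fuse_rgbt(rgb, thermal)   # shape (4, 4, 4)
```

In a real deployment the thermal frame would first be warped through the calibration transform $$ \mathbf{H} $$ so that pixel (x, y) in both channels refers to the same scene point.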

Navigation and Approach Phase: Once a fire is identified and its image coordinates are known, the drone must plan a path to a suitable observation point. The stereo vision system is activated for this phase. By comparing the two rectified images from the baseline-separated cameras, a disparity map $$ \mathbf{D}(x,y) $$ is computed. The depth $$ Z $$ (distance to an object) at a pixel is inversely proportional to its disparity:
$$ Z = \frac{f \cdot B}{D} $$
where $$ f $$ is the focal length and $$ B $$ is the baseline distance between the cameras. This depth map allows the drone to build a local 3D obstacle map of the facade, enabling it to avoid protrusions like balconies or signage while approaching the target.
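The disparity-to-depth conversion is a direct application of $$ Z = fB/D $$. The sketch below assumes the disparity map has already been computed from rectified image pairs; the function name and the invalid-pixel handling (masking near-zero disparities as infinite depth) are illustrative choices, not the system's actual implementation.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m, min_disp=0.5):
    """Convert a disparity map (pixels) to metric depth via Z = f * B / D.
    Pixels with near-zero disparity carry no range information and are
    marked invalid (infinite depth) rather than producing huge values."""
    disparity_px = np.asarray(disparity_px, dtype=float)
    depth = np.full_like(disparity_px, np.inf)
    valid = disparity_px > min_disp
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth
```

For example, with a 800 px focal length and a 12 cm baseline, a disparity of 8 px corresponds to a facade feature 12 m away.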

Persistent Observation Phase: Upon reaching a safe stand-off distance from the identified fire zone, the fire drone transitions to a loitering observation mode. It typically executes an automated orbit or a station-keeping hover. The flight controller now fuses data from visual odometry (tracking visual features on the building), inertial measurement units (IMUs), and potentially a laser rangefinder to maintain a fixed position relative to the fire, even in the absence of GPS signal which is often unreliable near large structures. All sensor data—RGB video, thermal video, temperature readouts, and drone telemetry (position, attitude, battery)—are packaged and transmitted.

Hardware Platform Design

The hardware design is a critical enabler of the fire drone's autonomous capabilities. It is bifurcated into two main subsystems: the High-Level Perception & Computing System and the Low-Level Flight Control & Powertrain System. This separation of concerns ensures that computationally intensive tasks like image processing and neural network inference do not interfere with the time-critical stability control of the aircraft.

High-Level Perception & Computing System: At the heart of this subsystem is a powerful Linux-based single-board computer (SBC), such as an NVIDIA Jetson series module. This choice is dictated by the need for substantial processing power to run modern CNN models in real-time and handle multiple high-bandwidth sensor streams concurrently. The SBC interfaces with the following key perception sensors:

  • Gimbal-Mounted Dual-Sensor Pod: A pivotal design innovation is the co-location of the primary RGB camera and the thermal imaging sensor on a 2-axis brushless gimbal. This allows the drone’s “eyes” to pan and tilt independently of the aircraft’s body orientation. During the search phase, the gimbal can sweep vertically. During observation, it can keep the sensors locked on the fire source while the drone maneuvers. This is more cost-effective and lightweight than installing multiple fixed sensors.
  • Stereo Vision Rig: Two global-shutter monochrome or RGB cameras, fixed in a rigid housing with a known baseline, provide the depth perception necessary for obstacle avoidance and proximity estimation.
  • Communication Module: A hybrid 5G/Wi-Fi module allows the system to select the optimal data link. High-bandwidth, low-latency 5G is preferred for direct Peer-to-Peer (P2P) streaming to the ground station. A robust long-range Wi-Fi link serves as a fallback.

Low-Level Flight Control & Powertrain System: This subsystem is responsible for the fundamental stability, maneuvering, and propulsion of the fire drone. It is built around a reliable and deterministic real-time microcontroller, typically an STM32 series. Its primary functions are:

  1. Sensor Fusion for Attitude Estimation: It reads data from an IMU (gyroscope, accelerometer), a magnetometer, and a barometer. A sensor fusion algorithm (like a complementary filter or Kalman filter) computes the drone’s precise orientation (roll $$ \phi $$, pitch $$ \theta $$, yaw $$ \psi $$) and altitude.
  2. Flight Control Law Execution: It runs Proportional-Integral-Derivative (PID) or more advanced controllers to maintain stable flight. It receives high-level navigation commands (e.g., “move to these relative coordinates”) from the SBC and translates them into low-level motor commands.
  3. Powertrain Management: The flight controller generates Pulse-Width Modulation (PWM) signals to control Electronic Speed Controllers (ESCs), which in turn drive the brushless DC motors. The thrust $$ T_i $$ for each motor i is calculated to achieve the desired total thrust and rotational torques.
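The control-law step (item 2) can be illustrated with a minimal PID loop of the kind a flight controller runs per axis. This is a textbook sketch, not the firmware's actual controller: the class name is hypothetical and the gains are illustrative placeholders, not tuned values for this airframe.

```python
class PID:
    """Minimal single-axis PID controller, as used for attitude
    stabilization (one instance each for roll, pitch, yaw)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        """Return the control output for one timestep of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        # No derivative term on the very first sample.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

roll_pid = PID(kp=2.0, ki=0.0, kd=0.0)
correction = roll_pid.update(setpoint=0.0, measurement=-0.1, dt=0.01)
```

On the real flight controller this loop runs at several hundred hertz, and its outputs feed the motor mixer that allocates thrust $$ T_i $$ across the six motors.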

The power budget and component selection are vital for mission endurance. The table below outlines a representative hardware configuration and its power profile.

| Component | Representative Model/Spec | Approx. Power Draw |
| --- | --- | --- |
| Flight Controller | STM32H7 Series | 1-2 W |
| Onboard Computer | NVIDIA Jetson Xavier NX | 10-15 W |
| RGB Camera | Global Shutter, 4K | 2-3 W |
| Thermal Camera | Uncooled VOx Microbolometer, 640×512 | 1.5-2.5 W |
| Stereo Vision Cameras (x2) | Global Shutter Mono | 3 W (total) |
| Communication Module | 5G/Wi-Fi Combo | 3-8 W (peak) |
| Brushless Motors (x6) | KV300, for Hexacopter | 200-400 W (total, variable) |
| Total Estimated Average (Hover) | | ~250-300 W |

Given a high-capacity 6-cell LiPo battery (e.g., 10,000mAh, 22.2V, ~222 Wh), the theoretical maximum hover endurance $$ t_{hover} $$ can be estimated as:
$$ t_{hover} \approx \frac{E_{battery}}{P_{hover}} \cdot \eta \approx \frac{222\ \text{Wh}}{280\ \text{W}} \cdot 0.85 \approx 0.67\ \text{hours} \approx 40\ \text{minutes} $$
where $$ \eta $$ is a system efficiency factor accounting for voltage sag and other losses. This endurance is sufficient for initial reconnaissance and monitoring, with potential for extension through optimized aerodynamics and hybrid power systems.
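The endurance estimate reduces to one line of arithmetic, which can be parameterized for quick what-if analysis (larger battery, heavier payload, etc.). The function name is illustrative.

```python
def hover_endurance_minutes(battery_wh, hover_power_w, efficiency=0.85):
    """Estimate hover endurance t = (E / P) * eta, returned in minutes.
    efficiency accounts for voltage sag and other system losses."""
    return battery_wh / hover_power_w * efficiency * 60.0

# The configuration from the text: 222 Wh pack, ~280 W average hover draw.
t = hover_endurance_minutes(222, 280)   # roughly 40 minutes
```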

Software and Algorithmic Core

The intelligence of the fire drone is embodied in its software stack. The algorithms transform raw sensor data into actionable intelligence and precise control.

Multi-Modal Fire Detection Algorithm: The YOLO-based detection model is the cornerstone. Training data is augmented with synthetic smoke and glare to improve robustness. The model outputs are processed with Non-Maximum Suppression (NMS) to eliminate duplicate detections. The confidence score $$ c $$ is derived from the product of the objectness probability and the conditional class probability. For the fire class, this can be expressed as:
$$ c = P_{obj} \cdot P_{fire|obj} $$
The bounding box coordinates $$ (x, y, w, h) $$ are normalized to the image dimensions, providing a target vector for the navigation system.
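Greedy non-maximum suppression, as referenced above, can be sketched as follows. This is the standard algorithm rather than the project's specific code; boxes are given in corner form (x1, y1, x2, y2) and the 0.5 overlap threshold is a common default, not a value from the source.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, discard any remaining
    box overlapping it above iou_thresh, and repeat on the rest."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Two heavily overlapping fire detections thus collapse to the single higher-confidence box before being passed to the navigation system.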

Navigation and Guidance: The guidance system converts the image-plane target into 3D world commands. Using the depth $$ Z_{target} $$ estimated from the stereo system at the target bounding box, and the camera’s intrinsic calibration matrix $$ \mathbf{K} $$, we can back-project the image point $$ \mathbf{p}_{image} = [u, v, 1]^T $$ to a 3D vector $$ \mathbf{P}_{camera} $$ in the camera coordinate frame:
$$ \mathbf{P}_{camera} = Z_{target} \cdot \mathbf{K}^{-1} \mathbf{p}_{image} $$
This vector, transformed into the drone’s body frame, provides the relative 3D offset the drone needs to travel to center the fire in its view from a safe distance.
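The back-projection $$ \mathbf{P}_{camera} = Z \cdot \mathbf{K}^{-1} \mathbf{p}_{image} $$ is a one-liner given the intrinsics. The sketch below uses an assumed pinhole matrix with a 500 px focal length and a 640×480 principal point; the function name is illustrative.

```python
import numpy as np

def back_project(u, v, depth_m, K):
    """Back-project pixel (u, v) at metric depth Z into the camera
    frame: P_camera = Z * K^-1 * [u, v, 1]^T."""
    p_image = np.array([u, v, 1.0])
    return depth_m * np.linalg.inv(K) @ p_image

# Assumed pinhole intrinsics: f = 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])

# A fire centered in the image at 10 m range lies straight ahead.
target = back_project(320, 240, 10.0, K)   # -> [0, 0, 10]
```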

Communication and Ground Control Software: The ground station software is designed for clarity and reliability. It performs two key functions:

  1. Direct Video/Sensor Display: It decodes the live video streams (RGB and thermal), displays them side-by-side or fused, and presents vital telemetry on a dashboard.
  2. Cloud Relay & Data Logging: Recognizing that the ground station may be in a challenging radio environment, it simultaneously acts as a robust relay. It re-transmits the received data via a stable wired internet connection (e.g., fiber or LTE backup) to a secure cloud server. This ensures that command personnel away from the immediate scene, or in a headquarters, have uninterrupted access to the fire drone's feed. The cloud platform also enables data recording, analysis, and sharing across multiple agencies.

The communication protocol between the drone and ground station uses a state-aware adaptive mechanism. It continuously monitors link quality indicators (LQI) like signal strength (RSSI) and packet loss. Based on predefined thresholds, it can dynamically adjust video compression parameters or, in extreme cases, command the drone to a pre-programmed position with a better communication link.
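One way to realize such threshold-based adaptation is a bitrate ladder keyed to RSSI and packet loss. The sketch below is an assumption-laden illustration: the function name, the specific dBm/loss thresholds, and the bitrate profiles are all hypothetical, not values from the deployed protocol.

```python
def select_bitrate_kbps(rssi_dbm, packet_loss, profiles=(8000, 4000, 1500, 500)):
    """Pick a video bitrate from a descending profile ladder based on
    link-quality indicators. Thresholds are illustrative, not field-tuned."""
    if rssi_dbm > -65 and packet_loss < 0.01:
        return profiles[0]   # strong link: full-quality stream
    if rssi_dbm > -75 and packet_loss < 0.05:
        return profiles[1]   # good link: reduced bitrate
    if rssi_dbm > -85 and packet_loss < 0.15:
        return profiles[2]   # marginal link: heavy compression
    return profiles[3]       # poor link: minimum survivable stream
```

In the extreme case, a sustained stay in the lowest tier would trigger the fallback described above: commanding the drone to a pre-programmed position with a better link.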

System Integration and Performance Considerations

Integrating the hardware and software modules into a cohesive, reliable fire drone system involves addressing several interdisciplinary challenges. Key among them are real-time performance, system robustness, and operational safety.

Real-Time Scheduling and Latency: The software architecture on the onboard computer must carefully manage process priorities. The sensor ingestion, core detection pipeline, and command generation must run in a high-priority real-time thread or process to minimize latency $$ L_{total} $$ from scene change to control reaction. This total latency can be modeled as:
$$ L_{total} = L_{sensor} + L_{processing} + L_{comm_{down}} + L_{control} $$
where $$ L_{processing} $$ is dominated by the neural network inference time. Using optimized inference engines like TensorRT is essential to keep this value below 100 ms. $$ L_{comm_{down}} $$ is the latency for sending commands from the SBC to the flight controller, which must be on the order of milliseconds.

Robustness in Adverse Conditions: The fire drone must perform in the harsh environment of a structure fire, which may involve:

  • High Temperatures & Updrafts: The airframe and electronics require passive cooling and thermal shielding. The flight controller must be tuned to handle strong, turbulent thermal plumes.
  • Smoke and Particulates: While thermal imaging can penetrate smoke better than visible light, heavy soot can degrade all optical sensors. Filters and pressurization systems for the camera housings may be necessary for prolonged exposure.
  • Electromagnetic Interference (EMI): Fireground communications and equipment generate significant EMI. The drone’s radio systems and flight control electronics must be well-shielded, and communication protocols should include strong error correction.

Safety and Fail-Safe Protocols: Autonomous operation demands rigorous fail-safe logic. The system continuously performs self-checks on battery voltage $$ V_{bat} $$, communication link status, and motor/ESC health. A hierarchical set of contingency actions is programmed:

  1. Low Battery Return-to-Launch (RTL): Triggered when $$ V_{bat} < V_{threshold\_RTL} $$.
  2. Communication Loss Hold & Return: If the command link is lost for $$ t_{timeout} $$ seconds, the drone will hold its position briefly, then execute a slow ascent and RTL to re-acquire signal.
  3. Critical System Failure Landing: In case of a critical sensor failure (e.g., IMU), the drone will attempt an immediate controlled landing at its current location, prioritizing avoidance of populated areas.
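The contingency hierarchy above amounts to a priority-ordered decision function, which could be sketched as follows. Function and action names are hypothetical, and the check order (critical failure first, then battery, then link) reflects one reasonable prioritization of the three rules.

```python
def contingency_action(v_bat, v_threshold_rtl, link_lost_s, t_timeout, imu_ok):
    """Evaluate the fail-safe hierarchy in priority order and return the
    commanded action. Checks mirror the three contingency rules above."""
    if not imu_ok:
        return "LAND_NOW"        # critical sensor failure: immediate controlled landing
    if v_bat < v_threshold_rtl:
        return "RTL"             # low battery: return to launch
    if link_lost_s > t_timeout:
        return "HOLD_THEN_RTL"   # comm loss: hold, slow ascent, then RTL
    return "CONTINUE"            # all checks passed: continue the mission
```

Running this check every control cycle ensures no single failure leaves the drone without a defined behavior.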

Applications, Economic Viability, and Future Directions

The application of this autonomous fire drone extends beyond immediate firefighting into prevention, inspection, and training. Its primary value proposition is the enhancement of situational awareness, which is the single most critical factor in managing complex high-rise incidents.

Key Operational Applications:

  • Initial Reconnaissance and Size-Up: Provides the incident commander with the first clear external view of the fire’s location and extent within minutes of arrival.
  • Progress Monitoring: Tracks the effectiveness of interior attack lines by showing changes in thermal signature and visible flame from the outside.
  • Search for Victim Indicators: Thermal imaging can potentially detect human heat signatures at windows or on balconies, guiding rescue efforts.
  • Post-Fire Assessment: Safely inspects structural integrity and identifies hotspots during overhaul.
  • Preventive Inspection: Can be used to periodically scan building facades for thermal anomalies indicating electrical faults or insulation issues.

Economic and Practical Considerations: Compared to manned helicopter operations, the fire drone system offers a radically lower cost per mission, both in terms of initial acquisition and operational expense. It eliminates pilot risk and can be deployed from the fire apparatus itself. The design prioritizes the use of commercial off-the-shelf (COTS) components where possible, keeping manufacturing and maintenance costs manageable for municipal fire departments. The table below contrasts key operational aspects.

| Aspect | Autonomous Fire Drone | Manned Helicopter |
| --- | --- | --- |
| Deployment Time | Minutes (on-scene) | Tens of minutes to hours |
| Operational Cost per Hour | Very low (electricity, maintenance) | Extremely high (fuel, crew, maintenance) |
| Risk to Personnel | None (remote/autonomous) | High (operating in fire plume) |
| Low-Altitude Hover Capability | Excellent; can get very close to facade | Limited and hazardous |
| Mission Flexibility | High (multiple drones possible) | Lower (limited airframes) |

Future Enhancements and Research: The platform is a foundation for continuous improvement. Future work will focus on:

  1. Advanced AI for Scene Understanding: Implementing semantic segmentation to not only find fire but also identify building features (windows, doors, vents), potential collapse hazards, and victim locations.
  2. Swarm Coordination: Enabling multiple fire drones to work collaboratively—one for close observation, another for wide-area monitoring, a third acting as a communication relay.
  3. Integrated Payload Delivery: Evolving from a pure reconnaissance platform to one capable of delivering emergency supplies (e.g., respirators, radios) to trapped occupants or deploying lightweight external suppression agents.
  4. Enhanced Autonomy and Docking: Developing the ability for the drone to autonomously dock with a charging station on a fire truck or nearby building for indefinite loiter during prolonged incidents.

In conclusion, the development of this autonomous fire drone represents a significant step forward in fire service technology, specifically tailored to the unique and growing challenge of high-rise building fires. By fusing advanced perception, automated navigation, and robust communication, it transforms the aerial perspective from a rare and risky luxury into a persistent, reliable, and intelligent asset on the fireground. This system directly addresses the critical information gap, promising to enhance firefighter safety, improve operational efficiency, and ultimately, save lives and property in our vertically expanding urban landscapes.