A Comprehensive System Design for Fire Rescue Drones Based on Visible and Infrared Image Fusion

In firefighting and emergency response, unmanned aerial vehicles (UAVs), often called fire drones, have become an important reconnaissance tool. Existing systems, however, typically rely on either a visible-light camera or a thermal imaging camera in isolation, and leave image interpretation to the firefighter. Without multi-spectral data fusion, situational awareness suffers, especially at night or in smoke-obscured environments. To address these limitations, this article presents an integrated system design for fire drones. The proposed system combines a visible-light camera and an infrared thermal imaging camera, processes their feeds on a dedicated onboard computing unit, and applies computer vision algorithms for automated fire detection, image fusion, and three-dimensional localization of fire points. The design improves the drone’s reconnaissance capability and its operational reliability in diverse conditions, and it provides actionable intelligence to firefighting teams.

1. System Overview and Hardware Architecture

The proposed fire drone system is built upon a robust and reliable hardware platform, ensuring functionality in the demanding environments typical of fire scenes. The core philosophy is to create a cohesive unit where sensor data acquisition, real-time processing, and command/control are seamlessly integrated. The system’s effectiveness hinges on the careful selection and integration of each component, tailored specifically for the fire reconnaissance mission profile.

The hardware design revolves around several key modules:

Unmanned Aerial Vehicle (UAV) Platform: This forms the mobile base for the entire system. The platform must prioritize reliability, endurance, and payload capacity. Key requirements include:
- Integration of a high-precision GPS module for navigation, trajectory logging, and position holding.
- An advanced flight controller (e.g., PX4) capable of interfacing with an onboard computer for high-level command execution.
- Redundant sensors including a triple-redundant IMU (Inertial Measurement Unit) with a 3-axis gyroscope, 3-axis accelerometer, and 3-axis magnetometer for stable flight and navigation.
- Essential autonomous functions: Low-battery Return-to-Home (RTH), Fail-safe RTH (upon signal loss), and waypoint mission planning to reduce operator workload.
- Performance Specifications: A minimum flight endurance of 30 minutes while fully laden, an operational altitude capability up to 1 km, a control range exceeding 5 km, and a maximum speed over 12 m/s.
- Environmental Resilience: A minimum ingress protection rating of IP63, ensuring complete protection against dust and protection against water spray at angles up to 60° from vertical, making the fire drone operational in light rain and smoky conditions.
Gimbal-Stabilized Payload Platform: A 3-axis brushless gimbal is crucial for stabilizing the camera module. It eliminates vibrations and allows the operator to remotely control the camera’s pan, tilt, and roll angles independently of the drone’s movement. This ensures clear image capture and enables precise aiming at areas of interest. The gimbal also houses sensors (gyroscopes and accelerometers) to provide real-time attitude data of the camera module to the central computer.
Fire Detection Camera Module (The Core Sensor Suite): This dual-camera module is the primary innovation. It features a visible-light camera and an infrared thermal camera mounted side by side with parallel optical axes, so that both sensors cover nearly the same field of view. For this design, a global shutter visible-light camera with a resolution of 1280×720 pixels is selected for high-definition, high-frame-rate capture. The thermal camera is a microbolometer-based module with a resolution of 160×120 pixels. Both cameras are fitted with lenses providing a 22°×16° field of view, which translates to a ground coverage of approximately 39×28 meters at a 100-meter altitude when the camera points straight down (2 × 100 m × tan 11° ≈ 38.9 m and 2 × 100 m × tan 8° ≈ 28.1 m).
Onboard Computing Unit: An Intel NUC or similar compact, high-performance computer serves as the system’s brain. It is responsible for:
- Receiving and synchronizing image streams from both cameras.
- Running the real-time fire detection and image fusion algorithms.
- Processing gimbal attitude data and GPS information.
- Calculating the 3D world coordinates of detected fire points.
- Communicating with the flight controller for potential automated navigation responses.
Communication and Control Link: A long-range remote controller (RC) with a range of at least 6 km provides manual piloting and gimbal control. A separate, high-definition, low-latency digital video link transmits the fused visual feed and system telemetry (fire alerts, coordinates) back to the ground control station, which is typically integrated with the RC.
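
The ground coverage of the camera module follows from simple trigonometry. The sketch below assumes a nadir-pointing camera over flat ground; the function names are illustrative and not part of the described system. It also estimates the ground distance covered by one pixel for each sensor.

```python
import math

def ground_footprint(altitude_m, hfov_deg, vfov_deg):
    """Width and height (m) of the ground area seen by a nadir-pointing camera."""
    width = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    height = 2.0 * altitude_m * math.tan(math.radians(vfov_deg) / 2.0)
    return width, height

def ground_sample_distance(footprint_m, pixels):
    """Approximate ground distance (m) covered by one pixel along one axis."""
    return footprint_m / pixels

# Values taken from the camera module description: 22 x 16 degree lens, 100 m altitude.
width_m, height_m = ground_footprint(100.0, 22.0, 16.0)
gsd_visible = ground_sample_distance(width_m, 1280)  # visible camera, horizontal axis
gsd_thermal = ground_sample_distance(width_m, 160)   # thermal camera, horizontal axis

print(f"footprint: {width_m:.1f} m x {height_m:.1f} m")
print(f"GSD visible: {gsd_visible * 100:.1f} cm/px, thermal: {gsd_thermal * 100:.1f} cm/px")
```

Under these assumptions a 22°×16° lens at 100 m covers roughly 39 m × 28 m, and one thermal pixel spans eight times the ground distance of one visible pixel. Tilting the gimbal or flying over sloped terrain changes these figures.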

The following table summarizes the key hardware components and their specifications:

Component	Key Specifications & Purpose
UAV Airframe & Propulsion	Multirotor configuration, capable of carrying 2-3 kg payload, weather-resistant (IP63).
Flight Controller (e.g., PX4)	Manages flight stability, executes autonomous missions, provides sensor fusion for navigation.
Visible-Light Camera	Global shutter, 1280×720 @ 30fps+, 22°×16° FOV. Provides high-detail contextual imagery.
Infrared Thermal Camera	160×120 pixels, Radiometric (provides temperature data), 22°×16° FOV. Detects heat signatures.
3-Axis Gimbal	Stabilizes cameras, provides remote pan/tilt/roll control, outputs camera attitude angles.
Onboard Computer (Intel NUC)	Runs Linux/ROS, executes CV algorithms, performs coordinate transformations.
Long-Range RC & Video Link	>6 km control range, HD low-latency video downlink for real-time operator view.

2. Software Pipeline and Fire Detection Algorithm

The intelligence of the fire drone is embedded in its software pipeline. After hardware calibration (camera intrinsics, extrinsics, and distortion correction), the system follows a structured workflow to transform raw sensor data into actionable fire intelligence. The software architecture is designed for real-time operation on the onboard computer.

The overarching software workflow can be visualized as a sequential pipeline:

Data Acquisition & Preprocessing: Synchronized image frames (typically at 30 fps) are captured from both the visible and thermal cameras. The images are rectified using their respective calibration matrices. The thermal image, due to its lower native resolution, is upscaled to match the visible image’s dimensions (1280×720) using an interpolation function like `resize()` in OpenCV, facilitating pixel-level correspondence for later fusion.
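
As a minimal sketch of the upscaling step, the snippet below enlarges a 160×120 thermal frame to 1280×720 by pixel replication using NumPy only. This is a stand-in chosen to keep the example dependency-light; with OpenCV, the equivalent step would be `cv2.resize(thermal, (1280, 720), interpolation=cv2.INTER_LINEAR)`. The constant and function names are illustrative.

```python
import numpy as np

VISIBLE_SHAPE = (720, 1280)  # rows, cols of the visible frame

def upscale_thermal(thermal):
    """Nearest-neighbour upscaling of a thermal frame to the visible frame size."""
    rows, cols = thermal.shape
    if VISIBLE_SHAPE[0] % rows or VISIBLE_SHAPE[1] % cols:
        raise ValueError("visible size must be an integer multiple of thermal size")
    fy = VISIBLE_SHAPE[0] // rows  # 6 for a 120-row frame
    fx = VISIBLE_SHAPE[1] // cols  # 8 for a 160-column frame
    # Repeat each row fy times, then each column fx times.
    return np.repeat(np.repeat(thermal, fy, axis=0), fx, axis=1)

thermal = np.arange(120 * 160, dtype=np.float32).reshape(120, 160)
upscaled = upscale_thermal(thermal)
print(upscaled.shape)
```

The scale factors differ per axis (8 horizontally, 6 vertically) because the two sensors have different aspect ratios while sharing the same field of view.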

Dual-Modality Fire Detection: Fire detection runs in parallel on both image streams, increasing robustness.

Visible-Light Processing: The algorithm searches for visual characteristics of fire and smoke. For flame detection, it uses color space analysis. In the RGB model, flame pixels generally have a high red (R) value, with red at least as large as green (G) and green at least as large as blue (B). The HSI (Hue, Saturation, Intensity) model is also effective, as flame pixels have specific hue and saturation signatures. The logical rules applied are:
$$ \text{rule1: } R \geq G \geq B $$
$$ \text{rule2: } R \geq R_T $$
$$ \text{rule3: } S \geq \frac{(255 - R) \cdot S_T}{R_T} $$
where $R_T$ and $S_T$ are empirically determined thresholds for the red component and saturation, respectively. Pixels satisfying these conditions are classified as potential flame, creating a binary mask. Morphological operations (dilation, erosion) are then applied to connect regions and reduce noise.
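
The three rules can be sketched as a vectorized mask. The threshold values `r_t=120` and `s_t=60` are hypothetical placeholders, and rescaling the HSI saturation to the 0 to 255 range is an assumption made so that rule3 compares like with like. Morphological cleanup is omitted here.

```python
import numpy as np

def flame_mask(rgb, r_t=120.0, s_t=60.0):
    """Boolean mask of pixels satisfying rule1 to rule3 (rgb: HxWx3, values 0-255)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    total = r + g + b
    min_c = np.minimum(np.minimum(r, g), b)
    safe_total = np.where(total > 0, total, 1.0)  # avoid division by zero on black pixels
    # HSI saturation S = 1 - 3*min(R,G,B)/(R+G+B), rescaled to 0..255 (assumption).
    sat = np.where(total > 0, 1.0 - 3.0 * min_c / safe_total, 0.0) * 255.0
    rule1 = (r >= g) & (g >= b)
    rule2 = r >= r_t
    rule3 = sat >= (255.0 - r) * s_t / r_t
    return rule1 & rule2 & rule3

# One row of test pixels: orange flame, mid gray, sky blue, dark brown.
pixels = np.array([[[255, 160, 40], [128, 128, 128], [80, 140, 230], [100, 60, 20]]],
                  dtype=np.uint8)
mask = flame_mask(pixels)
print(mask)
```

Only the orange pixel passes all three rules: the gray pixel fails the saturation test, the blue pixel fails the channel ordering, and the brown pixel is below the red threshold.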

Thermal Image Processing: This is the more reliable detection channel. The radiometric data allows direct temperature reading. A fixed temperature threshold (e.g., 101°C or higher, configurable based on scene) is applied to isolate high-temperature regions. To distinguish true fire from other hot objects (e.g., vehicles, machinery), secondary temporal and shape features are analyzed within the candidate Regions of Interest (ROI):

Feature	Description	Purpose
Circularity	$C = \frac{4\pi \cdot Area}{Perimeter^2}$. Fire tends to have low, irregular circularity.	Filters out regularly shaped hot objects (e.g., humans, engines).
Area Change Rate	Rate of change of the high-temperature region’s area between consecutive frames.	Fire typically expands; static hot objects have near-zero change.
Edge Flicker/Jitter	Irregular variation in contour shape and number of sharp corners over time.	Characteristic of turbulent flame boundaries.
Spectral Flicker	Frequency of intensity oscillation, typically in the 8-12 Hz range for flames.	A strong discriminant based on the physics of combustion.
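
The two shape features can be sketched in a few lines of standard-library Python, assuming the contour of a candidate region is available as a list of (x, y) vertices. The helper names and the example shapes are illustrative only.

```python
import math

def polygon_area_perimeter(points):
    """Area (shoelace formula) and perimeter of a closed polygon."""
    area2 = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area2 += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(area2) / 2.0, perimeter

def circularity(points):
    """C = 4*pi*Area / Perimeter^2; 1.0 for a circle, lower for ragged shapes."""
    area, perimeter = polygon_area_perimeter(points)
    return 0.0 if perimeter == 0 else 4.0 * math.pi * area / perimeter ** 2

def area_change_rate(prev_area, curr_area, dt_s):
    """Relative change of the hot-region area per second between two frames."""
    if prev_area <= 0 or dt_s <= 0:
        return 0.0
    return (curr_area - prev_area) / (prev_area * dt_s)

# A near-circular contour and a spiky, flame-like contour (synthetic examples).
circle = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
          for k in range(64)]
star = [((1.0 if k % 2 == 0 else 0.4) * math.cos(math.pi * k / 8),
         (1.0 if k % 2 == 0 else 0.4) * math.sin(math.pi * k / 8))
        for k in range(16)]

print(round(circularity(circle), 3), round(circularity(star), 3))
print(area_change_rate(100.0, 110.0, 1.0 / 30.0))
```

The regular contour scores close to 1, while the spiky contour scores far lower, which is the separation the circularity feature relies on.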

The thermal processing flow is as follows: the thermal frame, rendered as a pseudo-color image, is converted to HSV space; red/orange hues are thresholded to obtain an ROI; the Canny edge detector is applied within the ROI; and the region is finally validated using the features listed above.
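
For the temporal part of the validation, a minimal sketch of the flicker check is shown below. It assumes a per-frame intensity (or hot-area) signal for the candidate region and uses NumPy's real FFT to find the dominant oscillation frequency. The function names and the default threshold are illustrative. Note that the frame rate must be more than twice the highest frequency of interest, so an 8-12 Hz band requires well over 24 frames per second.

```python
import numpy as np

def hot_area(temps_c, threshold_c=101.0):
    """Number of pixels at or above the temperature threshold in one frame."""
    return int(np.count_nonzero(np.asarray(temps_c) >= threshold_c))

def dominant_flicker_hz(signal, fps):
    """Frequency (Hz) of the strongest non-DC component of a per-frame signal."""
    x = np.asarray(signal, dtype=np.float64)
    x = x - x.mean()  # remove the DC component
    if x.size < 4 or np.allclose(x, 0.0):
        return 0.0  # too short or completely static
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    return float(freqs[int(np.argmax(spectrum[1:])) + 1])

def is_flame_like(signal, fps, band=(8.0, 12.0)):
    """True if the dominant flicker frequency lies in the expected flame band."""
    f = dominant_flicker_hz(signal, fps)
    return band[0] <= f <= band[1]

# Synthetic 3-second signals at 30 fps: a 10 Hz flicker and a static hot object.
t = np.arange(90) / 30.0
flame_signal = 50.0 + 10.0 * np.sin(2.0 * np.pi * 10.0 * t)
static_signal = np.full(90, 50.0)

print(dominant_flicker_hz(flame_signal, 30.0), is_flame_like(static_signal, 30.0))
```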

Sensor Fusion and Alert Generation: A fire alarm is triggered only when both modalities corroborate the presence of a fire, or when the thermal detection is high-confidence. Upon confirmation, the system proceeds to fuse the information. The coordinates of the fire region’s bounding polygon from the thermal image are mapped onto the corresponding high-resolution visible image. A prominent red rectangle (e.g., with a 16-pixel-wide outline) is then drawn around the mapped region on the visible image to indicate the fire location, providing immediate visual context to the operator.
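
The alarm rule and the coordinate mapping can be sketched as follows. The confidence score and its cutoff `high_conf=0.9` are hypothetical, since the article does not specify how thermal confidence is quantified; the mapping assumes the two images are already registered and differ only in resolution.

```python
def should_alarm(visible_hit, thermal_hit, thermal_confidence, high_conf=0.9):
    """Alarm if both modalities agree, or if thermal alone is high-confidence."""
    if visible_hit and thermal_hit:
        return True
    return thermal_hit and thermal_confidence >= high_conf

def thermal_box_to_visible(box, thermal_size=(160, 120), visible_size=(1280, 720)):
    """Scale an (x0, y0, x1, y1) box from thermal pixels to visible pixels."""
    x0, y0, x1, y1 = box
    sx = visible_size[0] / thermal_size[0]  # horizontal scale factor
    sy = visible_size[1] / thermal_size[1]  # vertical scale factor
    return (round(x0 * sx), round(y0 * sy), round(x1 * sx), round(y1 * sy))

print(should_alarm(False, True, 0.95))
print(thermal_box_to_visible((40, 30, 80, 60)))
```

With OpenCV, the mapped box could then be drawn on the visible frame with `cv2.rectangle`, passing the two corner points, a BGR color of `(0, 0, 255)`, and the desired line thickness.
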
Advanced Image Fusion (Laplacian Pyramid): For a more detailed integrated view, the interior of the detected fire region undergoes a multi-resolution fusion process. The corresponding sub-images from the visible ($I_v$) and thermal ($I_t$) feeds are extracted. The Laplacian pyramid fusion algorithm is employed:
1. Construct Gaussian pyramids $G_v$ and $G_t$ for $I_v$ and $I_t$, respectively, by repeated smoothing and downsampling.
2. Build Laplacian pyramids $L_v$ and $L_t$ from the Gaussian pyramids. The Laplacian level $L_l$ is the difference between Gaussian level $G_l$ and the expanded version of the next level $G_{l+1}$: $$L_l = G_l - \text{UP}(G_{l+1})$$ where $\text{UP}()$ denotes upsampling. The coarsest level has no next level to subtract, so it is kept as the Gaussian level itself.
3. Create a binary mask $M$ defining the fusion region (e.g., a vertical split) and build its Gaussian pyramid $G_M$.
4. Fuse the Laplacian pyramids at each level $l$ using the mask pyramid as weights: $$L_{Fused,l} = G_{M,l} \cdot L_{v,l} + (1 - G_{M,l}) \cdot L_{t,l}$$
5. Reconstruct the final fused image $I_{fused}$ by starting from the top of the fused Laplacian pyramid and recursively adding the expanded version to the next level: $$I_{fused} = \text{Reconstruct}(L_{Fused})$$ This results in a seamless blend where the thermal “hot spot” is integrated into the detailed visible background, enhancing information content.
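
The five steps above can be sketched with NumPy alone. This is a simplified illustration, not the system's implementation: it uses a 5-tap binomial blur in place of a true Gaussian, upsamples by pixel replication followed by the same blur, and assumes single-channel images whose sides are divisible by 2 to the power of `levels`. With OpenCV, `cv2.pyrDown` and `cv2.pyrUp` would normally fill these roles.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # binomial approximation of a Gaussian

def blur(img):
    """Separable 5-tap blur with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    tmp = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))  # horizontal pass
    return sum(k * tmp[i:i + h, :] for i, k in enumerate(KERNEL))    # vertical pass

def down(img):
    """Smooth, then keep every second row and column."""
    return blur(img)[::2, ::2]

def up(img):
    """UP(): double the size by replication, then smooth."""
    return blur(np.repeat(np.repeat(img, 2, axis=0), 2, axis=1))

def gaussian_pyramid(img, levels):
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    lap = [g[l] - up(g[l + 1]) for l in range(levels)]
    lap.append(g[-1])  # coarsest level keeps the low-pass residual
    return lap

def fuse(i_v, i_t, mask, levels=3):
    """Blend visible and thermal images level by level, weighted by the mask pyramid."""
    l_v = laplacian_pyramid(i_v, levels)
    l_t = laplacian_pyramid(i_t, levels)
    g_m = gaussian_pyramid(mask, levels)
    fused = [m * a + (1.0 - m) * b for m, a, b in zip(g_m, l_v, l_t)]
    out = fused[-1]
    for lap in reversed(fused[:-1]):  # reconstruct from coarse to fine
        out = up(out) + lap
    return out

rng = np.random.default_rng(0)
i_v, i_t = rng.random((16, 16)), rng.random((16, 16))
mask = np.zeros((16, 16))
mask[:, :8] = 1.0  # vertical split: visible on the left, thermal on the right
print(fuse(i_v, i_t, mask).shape)
```

Because the same `up()` operator is used for decomposition and reconstruction, a mask of all ones returns the visible image unchanged and a mask of all zeros returns the thermal image.
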
3D Geo-Localization of Fire Point: This is a critical step for actionable intelligence. When a fire is detected, the system calculates its precise 3D location in world coordinates (latitude, longitude, altitude). The process involves two transformations:
1. Image to Drone Body Coordinates: Using the camera’s intrinsic matrix $K$, the camera-to-body rotation $R^b_c$, the gimbal-reported camera attitude (yaw $\psi$, pitch $\theta$, roll $\phi$ relative to the drone body), and the drone’s estimated altitude above ground ($\hat{h}$), the image pixel coordinates $(x_{ft}, y_{ft})$ of the fire point are projected onto the ground. A simplified model for a nadir-pointing camera scales the line-of-sight vector $K^{-1} [x_{ft}, y_{ft}, 1]^T$ by $\hat{h}$, so that its third component equals the height above ground and its first two components are the horizontal offsets $(\Delta x, \Delta y)$ in the drone’s body frame: $$\begin{bmatrix} \Delta x \\ \Delta y \\ \hat{h} \end{bmatrix} = \hat{h} \cdot K^{-1} \begin{bmatrix} x_{ft} \\ y_{ft} \\ 1 \end{bmatrix}$$ The fire point’s body-frame coordinates $(x_d, y_d, z_d)$ are then: $$x_d = x_v + \Delta x, \quad y_d = y_v + \Delta y, \quad z_d = -\hat{h}$$ where $(x_v, y_v)$ is the drone’s own body-frame origin (typically its center of mass). For a gimbaled camera, the full rotation matrix $C^b_n$ (from navigation to body frame, incorporating drone and gimbal angles) must be applied to the line-of-sight vector.
2. Drone Body to World Coordinates: The body-frame coordinates are transformed into the North-East-Down (NED) local navigation frame using the drone’s attitude. Finally, using the drone’s own high-precision GPS/INS-derived position $(lat_{uav}, lon_{uav}, alt_{uav})$, the local NED offsets are converted into absolute geographic coordinates (WGS84). The final output is the fire’s estimated $(lat_{fire}, lon_{fire}, alt_{fire})$.
This calculated location is instantly overlaid on the video feed and can be transmitted to other units or integrated into a Geographic Information System (GIS) for coordinated response.
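
The localization chain can be sketched end to end under strong simplifying assumptions: a nadir-pointing camera with the image top aligned to the drone's nose, zero roll and pitch, flat ground, and a small-offset spherical-earth approximation for the geographic conversion. The intrinsics, positions, and the GeoJSON property names below are hypothetical; GeoJSON itself orders coordinates as longitude, latitude, elevation.

```python
import json
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used here as a spherical radius

def pixel_to_body_offset(px, py, intrinsics, height_m):
    """Forward/right ground offset (m) of a pixel for a nadir-pointing camera."""
    fx, fy, cx, cy = intrinsics
    right = height_m * (px - cx) / fx
    forward = -height_m * (py - cy) / fy  # image rows grow downward
    return forward, right

def body_to_ned(forward, right, heading_deg):
    """Rotate body-frame offsets into north/east using the heading (clockwise from north)."""
    psi = math.radians(heading_deg)
    north = forward * math.cos(psi) - right * math.sin(psi)
    east = forward * math.sin(psi) + right * math.cos(psi)
    return north, east

def ned_to_wgs84(uav_lla, north, east, down):
    """Small-offset conversion of NED metres to latitude, longitude, altitude."""
    lat, lon, alt = uav_lla
    dlat = math.degrees(north / EARTH_RADIUS_M)
    dlon = math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon, alt - down

def locate_fire(px, py, intrinsics, height_m, heading_deg, uav_lla):
    forward, right = pixel_to_body_offset(px, py, intrinsics, height_m)
    north, east = body_to_ned(forward, right, heading_deg)
    return ned_to_wgs84(uav_lla, north, east, height_m)  # fire lies height_m below the UAV

def to_geojson(lat, lon, alt, confidence):
    """GeoJSON Point feature for a fire location (property names are illustrative)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat, alt]},
        "properties": {"kind": "fire_point", "confidence": confidence},
    }

# Hypothetical intrinsics (fx, fy, cx, cy) and an arbitrary UAV position.
intrinsics = (1000.0, 1000.0, 640.0, 360.0)
fire_lla = locate_fire(640, 260, intrinsics, 100.0, 0.0, (0.0, 0.0, 500.0))
feature = to_geojson(*fire_lla, confidence=0.9)
print(json.dumps(feature))
```

A full implementation would replace the heading-only rotation with the complete drone and gimbal attitude, and would use a proper geodetic library for offsets that are large relative to the earth's curvature.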

3. Operational Advantages and Application Scenarios

The implementation of this integrated fire drone system offers clear advantages over conventional reconnaissance methods. The fusion of visible and thermal data directly addresses the limitations of single-sensor systems. The operator receives a single, enriched video stream in which the thermal heat signature is registered onto the high-resolution visual scene, making fire identification intuitive and rapid, day or night. The automated detection and alarm system reduces operator cognitive load and lowers the risk that critical threats are missed during prolonged surveillance. Most importantly, the automatic generation of 3D geo-coordinates for each fire point turns qualitative observation into quantitative, actionable data. This allows for precise resource deployment, creation of heat maps for fire spread analysis, and effective coordination among multiple firefighting teams.

The application domains for such a sophisticated fire drone are extensive:

Urban Firefighting: Assessing structural fires from a safe stand-off distance, identifying hotspots behind walls or under roofs from the surface heat they produce, locating trapped individuals via their heat signatures, and monitoring the effectiveness of firefighting efforts in real-time.
Wildland and Forest Fire Management: Conducting pre-emptive patrols over large, inaccessible areas for early detection of smoldering fires or ignitions caused by lightning strikes. Once a fire is active, the drone can map the fire’s perimeter, identify spot fires ahead of the main front, and track fire progression to inform evacuation orders and tactical ground operations.
Industrial Facility Monitoring: Inspecting refineries, chemical plants, or energy facilities for abnormal heat patterns that may indicate equipment failure, electrical faults, or insulation leaks, enabling preventive maintenance and avoiding catastrophic incidents.
Post-Incident Analysis and Forensics: The logged data, including fused images, fire locations, and timestamps, serves as a valuable record for post-fire investigation to determine the origin and cause of the fire.

4. Conclusion and Future Perspectives

The design and implementation of a fire drone system based on visible and infrared image fusion represent a significant leap forward in fire service technology. By moving beyond simple aerial photography to an intelligent, sensor-fused reconnaissance platform, this system addresses the core needs of modern firefighting: safety, speed, accuracy, and information superiority. The integration of reliable hardware with robust computer vision algorithms for detection, fusion, and localization creates a tool that extends the senses of the firefighter, allowing them to “see” heat through smoke and darkness and to “know” the exact location of threats. The proposed architecture is modular and scalable, allowing for the integration of additional sensors or more powerful AI models in the future.

Future enhancements for the next generation of such fire drones could include the integration of multi-spectral or hyperspectral cameras for detecting specific chemical signatures, the use of onboard AI for predicting fire spread dynamics, and the implementation of secure, mesh-network communications for coordinating swarms of drones over large disaster zones. The path forward lies in deepening the autonomy and intelligence of the fire drone, transforming it from a remotely piloted camera into a fully-fledged, collaborative partner in emergency response, capable of not only finding fire but also understanding its behavior and guiding the optimal response. The continuous evolution of this technology promises to further safeguard both communities and the brave personnel who protect them.