In the rapidly expanding low‑altitude economy, UAV drones have become deeply integrated into everyday life, supporting applications such as agricultural assistance, package delivery, and fire rescue. However, UAV drone development is often hampered by long development cycles, high experimental costs, physical tests that poorly reflect real operating conditions, and the difficulty of reproducing high‑risk scenarios. To address these challenges, I present a simulation system for UAV drones based on ROS (Robot Operating System) and Unity3D. ROS serves as the core platform for perception, decision‑making, and control, while Unity3D serves as the external environment platform, providing high‑fidelity 3D rendering, a physics engine, and UI components. Through real‑time message exchange between ROS and Unity3D, the system simulates the dynamic behavior and decision‑making responses of UAV drones in a variety of virtual environments. The following sections detail the system architecture, communication mechanism, performance evaluation, and typical application scenarios, and show that, under ideal conditions, the platform meets the development needs of UAV drones with real‑time data transmission, synchronized detection, and timely response.
1. Overview of Existing Simulation Platforms for UAV Drones
Before presenting my own system, I briefly review the mainstream simulation platforms commonly used for UAV drones: Gazebo, Webots, V‑REP (CoppeliaSim), and AirSim. Each platform has its own strengths and limitations. For instance, Gazebo provides a rich sensor model library and a flexible physics engine, making it suitable for multi‑sensor integration. AirSim offers excellent 3D visuals and a built‑in Weather API for dynamic weather adjustment, which is especially useful for UAV drone flight testing. Nevertheless, these platforms either lack intuitive human‑machine interaction interfaces or are limited in applicability. My work introduces Unity3D as the external simulation environment, which achieves comparable visual fidelity while offering highly customizable UI components and straightforward integration with ROS. Table 1 summarizes the key features of these platforms and highlights the advantages of the proposed approach for UAV drone simulation.
| Feature | Gazebo | Webots | V‑REP | AirSim | My System (ROS+Unity3D) |
|---|---|---|---|---|---|
| Weather simulation | Manual configuration | Manual configuration | Manual configuration | Built‑in Weather API | Customizable via Unity3D scripts |
| 3D visual fidelity | Moderate | Moderate | Moderate | High | High (Unity3D rendering) |
| Human‑machine interaction | Limited | Limited | Moderate | Moderate | Rich (Unity3D UI) |
| ROS integration | Native | Native | Native | Native | TCP‑based, high performance |
| Sensor simulation | Extensive library | Extensive | Extensive | Standard sensors | Customizable via Unity3D |
| Applicability for UAV drones | High | High | High | Specialized for drones | General purpose |
As shown in Table 1, my ROS‑Unity3D system combines the strengths of ROS and Unity3D, offering high visual fidelity, a flexible UI, and ROS compatibility over TCP, which makes it a competitive choice for UAV drone simulation.
2. Communication between ROS and Unity3D
Three common methods support bi‑directional data exchange between ROS and Unity3D: ROS‑TCP‑Connector/ROS‑TCP‑Endpoint, ROS#, and ROSBridge. After evaluating their characteristics, I selected ROS‑TCP‑Connector and ROS‑TCP‑Endpoint for the following reasons: they are officially provided by Unity, support both ROS1 and ROS2, offer high performance with low latency, and are well suited to real‑time control of UAV drones. Table 2 compares the three methods.
| Method | Protocol | Advantages | Disadvantages | Applicable Scenarios |
|---|---|---|---|---|
| ROS‑TCP‑Connector (Official) | TCP/IP | High performance, low latency, support for ROS1/ROS2, suitable for high‑frequency data (e.g., images, point clouds) | Manual configuration required | Robot visualization, remote control, sensor data interaction |
| ROS# | WebSocket (JSON) | Cross‑platform, flexible | Higher latency, bottleneck for high‑frequency image transmission | Robot motion control, autonomous driving simulation |
| ROSBridge | WebSocket (JSON) | High universality, multi‑language client support | Low communication efficiency, not suitable for real‑time control | Remote monitoring, data visualization, web‑based robot control |
The underlying architecture is shown in Figure 1 (conceptually). ROS nodes publish and subscribe to topics, while Unity3D, acting as a special ROS node via the TCP‑Connector, can both publish sensor data and subscribe to control commands. A coordinate transformation between ROS and Unity3D is essential when controlling UAV drones because the axis definitions differ: ROS uses a right‑handed frame with X forward, Y left, and Z up, whereas Unity3D uses a left‑handed frame with X right, Y up, and Z forward. The mapping for positions and linear velocities is:
$$
\begin{pmatrix} X_{\text{Unity}} \\ Y_{\text{Unity}} \\ Z_{\text{Unity}} \end{pmatrix}
=
\begin{pmatrix} -Y_{\text{ROS}} \\ Z_{\text{ROS}} \\ X_{\text{ROS}} \end{pmatrix}
$$
This coordinate mapping ensures that the movement commands generated by ROS are correctly interpreted in Unity3D’s world space, allowing UAV drones to respond as expected.
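As a minimal sketch of this axis conversion, the helpers below assume the usual ROS convention (X forward, Y left, Z up) and the Unity3D convention (X right, Y up, Z forward). The function names `ros_to_unity` and `unity_to_ros` are hypothetical and are not part of any library.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def ros_to_unity(v: Vec3) -> Vec3:
    """Map a ROS vector (x forward, y left, z up) to Unity axes (x right, y up, z forward)."""
    x, y, z = v
    return (-y, z, x)


def unity_to_ros(v: Vec3) -> Vec3:
    """Inverse mapping: Unity axes (x right, y up, z forward) back to ROS axes."""
    x, y, z = v
    return (z, -x, y)


# A forward velocity command in ROS points along Unity's +Z axis.
forward_unity = ros_to_unity((1.0, 0.0, 0.0))

# Converting to Unity and back returns the original vector.
round_trip = unity_to_ros(ros_to_unity((1.0, 2.0, 3.0)))
```

This sketch covers positions and linear velocities only. Because the two frames differ in handedness, rotations and angular velocities need additional sign handling.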
3. System Architecture and Implementation
The overall system architecture comprises two main parts: the ROS side and the Unity3D side. On the ROS side, I designed nodes that handle control message publishing, image processing, and decision‑making. On the Unity3D side, the system is responsible for environment simulation, camera image capture, model motion execution, and UI interaction. The communication uses the ROS‑TCP‑Connector and ROS‑TCP‑Endpoint packages, which establish a persistent TCP connection between the two platforms.
3.1 ROS Side Design
ROS employs a distributed node architecture. For the control of UAV drones, I primarily use the publish‑subscribe model for continuous motion commands. A dedicated control node publishes Twist messages (linear and angular velocities) to a specific topic (e.g., /cmd_vel). The control node can be driven by various decision‑making algorithms, such as path following, obstacle avoidance, or coverage search. Additionally, ROS can receive image data from Unity3D for vision‑based tasks. The image messages, in the format sensor_msgs/Image, are converted to OpenCV format using cv_bridge, enabling efficient processing. A typical image processing pipeline for obstacle detection involves computing the average depth in a region of interest (ROI) and comparing it to a predefined threshold:
$$
d_{\text{avg}} = \frac{1}{N} \sum_{i \in \text{ROI}} D(i)
$$
$$
\text{ObstacleDetected} = \begin{cases}
\text{true}, & \text{if } d_{\text{avg}} < \tau \\
\text{false}, & \text{otherwise}
\end{cases}
$$
where D(i) is the depth at pixel i, N is the number of pixels in the ROI, and τ is a user‑defined threshold (e.g., 0.03 m). When an obstacle is detected, the control node switches from forward motion to a sequence of avoidance maneuvers using a finite‑state machine (FSM). The state transition logic can be modeled as:
$$
S(t+1) = f_{\text{FSM}}(S(t), \text{detect}(t))
$$
where S ∈ {FORWARD, STOPPED, AVOID, RETURN}. The FSM gives the simulated UAV drone a predictable stop, avoid, and return sequence whenever an obstacle is detected.
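The transition function can be sketched as follows. The four states come from the text, but the concrete transition rules (for example, leaving AVOID only once the obstacle is no longer detected) are assumptions made for illustration and may differ from the rules used in the actual control node.

```python
from enum import Enum


class State(Enum):
    FORWARD = "FORWARD"
    STOPPED = "STOPPED"
    AVOID = "AVOID"
    RETURN = "RETURN"


def fsm_step(state: State, detect: bool) -> State:
    """One application of S(t+1) = f_FSM(S(t), detect(t)).

    Assumed rules (illustrative only):
      FORWARD -> STOPPED when an obstacle is detected
      STOPPED -> AVOID   on the next tick
      AVOID   -> RETURN  once the obstacle is no longer detected
      RETURN  -> FORWARD on the next tick, or STOPPED if an obstacle reappears
    """
    if state is State.FORWARD:
        return State.STOPPED if detect else State.FORWARD
    if state is State.STOPPED:
        return State.AVOID
    if state is State.AVOID:
        return State.AVOID if detect else State.RETURN
    # Remaining case: state is State.RETURN.
    return State.STOPPED if detect else State.FORWARD


def run(detections, start=State.FORWARD):
    """Feed a sequence of detection flags through the FSM and return every visited state."""
    states = [start]
    for detect in detections:
        states.append(fsm_step(states[-1], detect))
    return states


trace = run([False, True, True, False, False])
```

Keeping the transition logic in a pure function like this makes it easy to unit-test the state sequence separately from message handling.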
3.2 Unity3D Side Design
On the Unity3D side, I built a 3D environment containing terrain, buildings, trees, and other obstacles relevant to UAV drone flight. A camera is attached to the drone model as a child object and captures RGB and depth images at configurable resolutions. The image data is encoded and sent to ROS via the TCP‑Connector. At the same time, Unity3D subscribes to the control topic to receive Twist messages and converts the velocities into position updates each frame, using the coordinate transformation described in Section 2. For smooth motion, either the drone’s rigid body is driven with the linear and angular velocities, or its transform is manipulated directly for purely kinematic simulation.
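The per-frame conversion from a Twist message to a pose update can be illustrated with a small kinematic sketch. It is written in Python for consistency with the ROS side (the Unity3D implementation itself would be a C# script), works in the ROS frame before any axis conversion, and assumes body-frame linear velocities, yaw-only rotation, and explicit Euler integration. The `Pose` class and `integrate_twist` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    yaw: float = 0.0  # heading in radians, positive counter-clockwise


def integrate_twist(pose, linear, angular_z, dt):
    """Advance the pose by one frame of length dt using explicit Euler integration.

    linear is (vx, vy, vz) in the drone's body frame; angular_z is the yaw rate in rad/s.
    """
    vx, vy, vz = linear
    cos_yaw, sin_yaw = math.cos(pose.yaw), math.sin(pose.yaw)
    return Pose(
        x=pose.x + (vx * cos_yaw - vy * sin_yaw) * dt,
        y=pose.y + (vx * sin_yaw + vy * cos_yaw) * dt,
        z=pose.z + vz * dt,
        yaw=pose.yaw + angular_z * dt,
    )


# Simulate one second at 60 frames per second with a 1 m/s forward command.
pose = Pose()
for _ in range(60):
    pose = integrate_twist(pose, linear=(1.0, 0.0, 0.0), angular_z=0.0, dt=1.0 / 60.0)
```

Scaling each update by the frame time keeps the simulated speed independent of the rendering frame rate.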
I also developed an interactive UI using Unity3D’s Canvas system. The UI includes buttons to switch between different environments, sliders to adjust environmental parameters (e.g., obstacle height, wind speed), and a control panel to start/pause the simulation. This makes the system highly user‑friendly for testing various scenarios of UAV drones.
4. Performance Evaluation
I conducted a series of experiments to evaluate the real‑time performance of the ROS‑Unity3D simulation system for UAV drones. Two different hardware configurations were tested, as listed in Table 3.
| Component | Device 1 (Desktop) | Device 2 (Laptop) |
|---|---|---|
| CPU | Intel Core i5‑14600KF (3.50 GHz) | AMD Ryzen 7 4800H with Radeon Graphics (2.90 GHz) |
| GPU | NVIDIA GeForce RTX 5060 Ti 8 GB | AMD Radeon RX 5600M Series 6 GB |
| RAM | 32 GB | 16 GB |
| Storage | 932 GB | 477 GB |
| Operating System (ROS host) | Linux Ubuntu 20.04 (WSL2) | Linux Ubuntu 20.04 (WSL2) |
I first measured the latency when Unity3D publishes variable‑sized data (from 0 to 2.5 MB) to ROS. The results, plotted in Figure 2 (conceptually), show that latency increases moderately with data size and remains under 0.1 s for payloads up to 1 MB. Beyond 1 MB, latency rises more sharply but stays within acceptable limits for most UAV drone simulation tasks, where control messages are typically small (a few hundred bytes). To a first approximation, the relationship is linear:
$$
L = \alpha \cdot S + \beta
$$
where L is latency in seconds, S is data size in MB, and the coefficients depend on network conditions and hardware. For Device 1, α ≈ 0.035 s/MB and β ≈ 0.005 s.
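As a worked example, the snippet below evaluates the linear model with the Device 1 coefficients quoted above. The function names and the idea of inverting the model to obtain a payload budget are illustrative additions, not part of the measurement procedure.

```python
ALPHA_S_PER_MB = 0.035  # slope reported for Device 1
BETA_S = 0.005          # intercept reported for Device 1


def predicted_latency(size_mb, alpha=ALPHA_S_PER_MB, beta=BETA_S):
    """Evaluate L = alpha * S + beta and return the latency in seconds."""
    if size_mb < 0:
        raise ValueError("size_mb must be non-negative")
    return alpha * size_mb + beta


def max_size_for_budget(budget_s, alpha=ALPHA_S_PER_MB, beta=BETA_S):
    """Largest payload in MB whose predicted latency stays within budget_s seconds."""
    return max(0.0, (budget_s - beta) / alpha)


for size in (0.0, 0.5, 1.0, 2.5):
    print(f"{size:.1f} MB -> {predicted_latency(size) * 1000:.1f} ms")
```

With these coefficients the model predicts about 40 ms at 1 MB, consistent with the observation that latency stays under 0.1 s in that range. Since the measured latency rises more sharply beyond 1 MB, the linear prediction should be treated as a lower bound for larger payloads.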
Next, I measured three latency metrics that are critical for UAV drone control: (1) the time from ROS publishing a motion command to Unity3D starting to execute it (ROS→Unity), (2) the time from Unity3D receiving an emergency stop command to the drone actually stopping (stop conversion time), and (3) the round‑trip time from ROS publishing a command to ROS receiving feedback from Unity3D. Ten trials were conducted per scenario, each consisting of 10 messages. The results are summarized in Table 4 (Device 1) and Table 5 (Device 2).
Table 4. Latency on Device 1 (desktop)
| Test Case | Latency Range (ms) | Mean Latency (ms) |
|---|---|---|
| ROS publishes command → ROS receives Unity feedback (round trip) | [6.00, 7.50] | 6.65 |
| Unity receives stop command → drone stops | [1.90, 3.30] | 2.26 |
| ROS publishes command → Unity starts executing | [2.90, 3.65] | 3.32 |

Table 5. Latency on Device 2 (laptop)
| Test Case | Latency Range (ms) | Mean Latency (ms) |
|---|---|---|
| ROS publishes command → ROS receives Unity feedback (round trip) | [163.00, 242.70] | 201.43 |
| Unity receives stop command → drone stops | [332.00, 512.00] | 409.20 |
| ROS publishes command → Unity starts executing | [69.00, 102.00] | 92.55 |
Device 2 (laptop) exhibits latency one to two orders of magnitude higher than Device 1, which I attribute to its slower CPU/GPU and its network configuration. The worst‑case stop conversion time (512 ms) may be tolerable for non‑time‑critical simulations, but at that delay a fast‑moving drone could collide before it stops. I therefore recommend a high‑performance desktop for UAV drone control tasks that require high responsiveness.
I further validated the system by reproducing an existing multi‑agent coverage path planning algorithm. In this test, a 2D matrix map (0 for free cells, 1 for obstacles) was sent from ROS to Unity3D, which generated a 3D representation with red cubes for obstacles and green icons for UAV drones. The UI allowed switching between multiple maps and adjusting obstacle dimensions. The algorithm’s output waypoints were streamed to Unity3D, and the drone agent followed the path smoothly. This shows that the system can integrate and visualize third‑party algorithms with little extra effort, which can shorten development time for UAV drones.
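The step from a 2D matrix map to obstacle cubes can be sketched as below. The cell size, the cube height, and the choice to lay rows along Unity's Z axis and columns along its X axis are assumptions for illustration, and `grid_to_obstacles` is a hypothetical helper name.

```python
def grid_to_obstacles(grid, cell_size=1.0, height=1.0):
    """Return one (x, y, z) cube centre per obstacle cell, using Unity axes (y up)."""
    centres = []
    for row_idx, row in enumerate(grid):
        for col_idx, cell in enumerate(row):
            if cell == 1:
                centres.append((
                    (col_idx + 0.5) * cell_size,  # x: columns run along Unity's X axis
                    height / 2.0,                 # y: cube rests on the ground plane
                    (row_idx + 0.5) * cell_size,  # z: rows run along Unity's Z axis
                ))
    return centres


# 0 marks a free cell, 1 marks an obstacle, as in the map sent from ROS.
grid = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
obstacles = grid_to_obstacles(grid, cell_size=2.0, height=3.0)
```

Each returned centre could then be used to place one cube in the scene, with the height parameter tied to the obstacle-dimension control in the UI.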
5. Vision‑Based Obstacle Avoidance for UAV Drones
To test the autonomous development capability, I implemented a simple vision‑based obstacle avoidance system using the depth camera in Unity3D. The workflow is as follows: Unity3D captures depth images (32‑bit float), encodes them as sensor_msgs/Image, and publishes them to ROS. The ROS node subscribes, converts the image to OpenCV format using cv_bridge, and computes the average depth in the central region (ROI). If the average depth falls below a threshold (e.g., 0.03 m), the FSM transitions from FORWARD to STOPPED, then to AVOID (moving sideways with a slight rotation), and finally to RETURN (realigning to original heading). The control commands are published as Twist messages, which Unity3D subscribes to and applies to the drone model. The entire perception‑decision‑action loop demonstrated stable obstacle avoidance in a static environment with randomly placed obstacles. This confirms that the system can serve as a testbed for developing and validating real‑time vision algorithms for UAV drones.
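The detection step of this loop (average depth over a central ROI, compared with a threshold) can be sketched in plain Python. In the actual pipeline the image arrives through cv_bridge; here the depth image is simply a list of rows of floats so that the example has no dependencies. The ROI size (`roi_fraction`) and the function names are assumptions.

```python
def central_roi_mean(depth, roi_fraction=0.2):
    """Average depth over a centred ROI; depth is a list of rows of floats in metres."""
    rows, cols = len(depth), len(depth[0])
    roi_h = max(1, int(rows * roi_fraction))
    roi_w = max(1, int(cols * roi_fraction))
    top = (rows - roi_h) // 2
    left = (cols - roi_w) // 2
    values = [
        depth[r][c]
        for r in range(top, top + roi_h)
        for c in range(left, left + roi_w)
    ]
    return sum(values) / len(values)


def obstacle_detected(depth, tau=0.03, roi_fraction=0.2):
    """Return True when the average ROI depth falls below the threshold tau."""
    return central_roi_mean(depth, roi_fraction) < tau


# A 10x10 depth image with everything 5 m away: no obstacle in the ROI.
clear_view = [[5.0] * 10 for _ in range(10)]

# The same image with the central 2x2 block almost touching the camera.
blocked_view = [row[:] for row in clear_view]
for r in (4, 5):
    for c in (4, 5):
        blocked_view[r][c] = 0.01
```

The boolean returned by `obstacle_detected` is the detection flag that drives the FSM transitions described above.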
I also conducted a preliminary multi‑drone cooperative obstacle search experiment. Two drones, labeled Drone A and Drone B, moved randomly within a bounded area. Each drone had its own camera and subscribed to a common obstacle‑detection topic. When Drone A detected an object (based on red‑color thresholding of its RGB camera image), it stopped and sent its position to Drone B through a ROS service (over WebSocket). Drone B then interrupted its random search and moved toward the target location. Inter‑drone communication latency stayed under 50 ms during the experiment, which indicates that collaborative tasks with multiple UAV drones are feasible in the simulation system.
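Red-color thresholding of the kind used for target detection can be sketched as follows. The channel thresholds (`r_min`, `gb_max`) and the minimum red fraction are hypothetical values chosen for the example, not the ones used in the experiment, and the image is a plain list of rows of (R, G, B) tuples.

```python
def is_red(pixel, r_min=150, gb_max=80):
    """Return True when an (R, G, B) pixel is strongly red under the assumed thresholds."""
    r, g, b = pixel
    return r >= r_min and g <= gb_max and b <= gb_max


def red_fraction(image):
    """Fraction of red pixels in an image given as a list of rows of (R, G, B) tuples."""
    pixels = [pixel for row in image for pixel in row]
    if not pixels:
        return 0.0
    return sum(1 for pixel in pixels if is_red(pixel)) / len(pixels)


def target_detected(image, min_fraction=0.05):
    """Report a detection when enough of the image is red."""
    return red_fraction(image) >= min_fraction


grey = (120, 120, 120)
red = (220, 30, 25)

# A 4x4 scene with no target, and a copy containing two red pixels.
empty_scene = [[grey] * 4 for _ in range(4)]
scene_with_target = [row[:] for row in empty_scene]
scene_with_target[1][1] = red
scene_with_target[1][2] = red
```

Requiring a minimum fraction of red pixels, rather than a single pixel, makes the check less sensitive to isolated noisy pixels.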
6. Conclusion
In this work, I have designed and implemented a simulation system for UAV drones based on ROS and Unity3D. The system is designed to address the common challenges in UAV drone development: long development cycles, high experimental costs, poor physical test fidelity, and difficulty in reproducing high‑risk scenarios. By using the ROS‑TCP‑Connector for low‑latency communication, the system achieves real‑time data transmission and synchronization between the control and environment components. The performance evaluation on two hardware platforms shows that, under ideal conditions on the desktop configuration, mean latencies stay below 7 ms, well within the bounds for real‑time control, whereas the laptop configuration reached mean latencies of roughly 90 to 410 ms. Additionally, I have demonstrated the system’s versatility by integrating an existing algorithm (coverage path planning) and developing new ones (vision‑based obstacle avoidance and multi‑drone cooperative search).
Future research directions include further optimizing transmission performance, especially for high‑bandwidth sensor data (e.g., lidar point clouds), and incorporating more physics‑based aerodynamic models to improve the fidelity of UAV drone flight simulation. I also plan to extend the system to support hardware‑in‑the‑loop (HIL) testing, bridging the gap between pure simulation and real‑world deployment.
Figure (conceptual): a typical simulation scenario in which multiple UAV drones navigate through a 3D environment, illustrating the system’s potential for scalable, high‑fidelity testing.
