A Collaborative Simulation Training System for Power Line Inspection Using Drones

Drone inspection of transmission and distribution lines is a critical means for ensuring the safe and stable operation of the power grid. Conducting specialized operational skill training is of paramount importance for the smooth execution of line maintenance operations. In practice, drone inspection tasks often require collaborative completion by a team consisting of two or more personnel. Traditional training typically involves expert-led theoretical instruction followed by hands-on practical exercises. However, relying solely on lectures covering precautions, responsibilities, and coordination points often results in trainees having an insufficiently profound understanding of the collaborative process and its essentials. Transitioning directly from theoretical explanation to operational training on actual power lines can lead to poor coordination and poses significant safety risks. The integrated application of virtual reality, human-computer interaction, and network technologies to construct an immersive operational environment and a freely operable simulation training platform allows multiple personnel to participate in the inspection process through different roles. This enables them to experience how to cooperate and synergistically complete the inspection and maintenance of line equipment, effectively addressing the challenges faced in current drone inspection training.

Currently, simulation training is widely used in operational training across various industries, including military and education. Collaborative operation simulation has found applications in numerous fields such as aerospace, assembly, and air combat simulation. Research results indicate that this approach not only yields significant economic benefits but also effectively enhances training levels. The application of simulation training in the field of power line maintenance initially involved using 3D simulation technology to recreate maintenance scenarios and standardize operational processes, thereby improving training effectiveness by visually explaining job methods, personnel composition, tool allocation, procedural steps, and safety measures. In recent years, with the advancement of virtual reality (VR) technology, devices such as VR headsets, controllers, and sensors have been gradually integrated into line maintenance training. This has enabled human-computer interaction and immersive experiences in the maintenance process, significantly enhancing training effectiveness and making skill training based on virtual scenarios a reality. However, current applications are largely confined to single-user, single-process simulation training and cannot yet support multi-user collaborative operations.

Collaborative simulation training is based on VR technology. Within a computer-generated 3D virtual environment that includes the operational scene and the action processes of personnel, different users interact in real-time through their respective terminals, driving human models within the shared virtual environment to complete the entire operational process simulation. Realizing virtual collaborative operations requires solving two key problems: first, how to control the concurrent operational behaviors of multiple users to prevent conflicts; second, how to ensure consistency of the simulated operational scene across multiple different clients. Based on this, our research focuses on constructing a model for multi-user collaborative operations in line maintenance. Through the design of the system composition, architecture, and network topology for collaborative operations, and utilizing fundamental development platforms such as Unity 3D, 3DMax, VC++, and database management system software, we have implemented a multi-user collaborative simulation system for line maintenance. This system also enables monitoring and management of the operational process, thereby solving the problems of multi-user concurrent operation conflict control and virtual simulation scene consistency.

1. Multi-User Collaborative Operation Model

1.1 Collaborative Operation Process Design

In the multi-user collaborative simulation for line maintenance, various operators use VR interaction devices to enter the same virtual operational scene and perform simulated tasks. The process involves numerous concurrent operations that may lead to conflicts. Therefore, controlling the concurrent behaviors of multiple users to prevent conflicts is the primary challenge in realizing virtual collaborative training for line maintenance. Given the complexity of collaborative line maintenance operations and the variety of tools and equipment involved, we employ a token mechanism and a dynamic permission allocation method to orchestrate the operational workflow. By actively allocating operation tokens, the system guides each step of the process, preventing conflicts during multi-user operations.

According to the actual requirements of line maintenance projects, the operational process of each task can be categorized into parallel operation processes and synchronous multi-user operation processes. Each role is defined by four attributes: sequence number, position, permission, and state. The position of each role is assigned based on actual operational requirements, with states set as ‘waiting’ or ‘operating’. The operational content and responsible roles for each step are designed according to actual job requirements. Operational permissions are dynamically allocated at the start of each step. A reminder flag is set at the end of a step, prompting the operator to release their permissions upon completion. Permissions are then reallocated before the next step begins. For instance, assume a maintenance task involves K personnel and consists of M operational steps. After the simulation task initiates, the system first assigns different permissions (1 to K, with 1 being highest) to different roles based on actual requirements. The role with the highest permission (e.g., 1) is responsible for executing specific actions, while others assist. If the highest-permission role experiences an anomaly (e.g., network disconnection), the system automatically reassigns the operation token to the waiting personnel with the next highest priority. This dynamic permission allocation avoids concurrency conflicts and ensures the smooth progression and stable operation of the system.
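The token-and-permission scheme above can be sketched in code. The following is a minimal illustrative sketch (not the system's actual Unity3D/VC++ implementation); the class and role names are assumptions made for the example. It shows permissions ranked 1-to-K, token assignment at the start of a step, reassignment on an anomaly, and release at the step boundary.

```python
class TokenManager:
    """Sketch of dynamic operation-token allocation for one step.

    Roles are ordered by permission (index 0 = highest). The token holder
    executes the step; the others assist. If the holder drops out, the
    token passes to the next-ranked waiting role. Names are illustrative.
    """

    def __init__(self, roles):
        self.roles = list(roles)
        self.states = {r: "waiting" for r in self.roles}
        self.holder = None

    def begin_step(self):
        # Assign the token to the highest-permission available role.
        for r in self.roles:
            if self.states[r] == "waiting":
                self.holder = r
                self.states[r] = "operating"
                return r
        raise RuntimeError("no available role")

    def handle_anomaly(self, role):
        # A role disconnected: mark it offline and reassign if it held the token.
        self.states[role] = "offline"
        if self.holder == role:
            self.holder = None
            return self.begin_step()
        return self.holder

    def end_step(self):
        # The holder releases its permission at the reminder flag; permissions
        # are reallocated when the next step begins.
        if self.holder is not None:
            self.states[self.holder] = "waiting"
            self.holder = None
```

Calling `begin_step()` at each of the M steps and `end_step()` at each reminder flag reproduces the allocate/release cycle described above.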

1.1.1 Parallel Operation Process

In collaborative line maintenance, situations arise where multiple personnel perform independent tasks concurrently within the same timeframe. We define this as a parallel operation process. The workflow proceeds as follows. The system first divides personnel into N mutually independent serial operation groups. Within each group, the system assigns permissions to all participating personnel based on actual operational requirements. The member with the highest priority operates in a single-user serial mode. If that member encounters an anomaly, the system selects a substitute from the waiting personnel to continue. All personnel release their permissions upon completing a step and receive newly assigned permissions upon entering the next phase. Given the strict operational standards in line maintenance, if one member of a two-person team has not yet completed their assigned task, the member who has finished must wait before both can proceed to the next operational phase.

The token assignment for a parallel step involving N groups can be modeled. Let $G = \{G_1, G_2, \ldots, G_N\}$ represent the set of groups. For each group $G_i$, there is an ordered set of members $M_i = \{m_{i1}, m_{i2}, \ldots\}$ sorted by permission level (1 is highest). At the start of a parallel phase, the system assigns the token $T_{op}$ to the highest-priority member in each group:
$$ T_{op}(G_i, t_{start}) = m_{i1} \quad \forall i \in [1, N] $$
The operation proceeds until all members in all groups have completed their sub-tasks. A synchronization condition governs progression to the next phase $P_{next}$:
$$ \text{Proceed to } P_{next} \iff \forall G_i \in G, \forall m_{ij} \in M_i, \quad \text{TaskComplete}(m_{ij}) = \text{True} $$
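The two conditions above (per-group token assignment and the all-groups synchronization barrier) can be expressed directly as a short sketch. This is an illustrative reduction, not the system's implementation; group and member identifiers are hypothetical.

```python
def assign_tokens(groups):
    """Give the operation token T_op to the highest-permission member of
    each of the N independent groups (members pre-sorted, index 0 highest)."""
    return {gid: members[0] for gid, members in groups.items()}

def can_proceed(done):
    """Synchronization barrier: advance to the next phase P_next only when
    every member of every group reports TaskComplete = True."""
    return all(all(flags) for flags in done.values())
```

A finished group therefore idles at `can_proceed` until the slowest member of the slowest group completes, matching the two-person waiting rule described above.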

1.1.2 Synchronous Multi-User Operation Process

Certain drone inspection tasks, such as flight control and gimbal operation, require synchronous multi-user collaboration to ensure correctness and effectiveness. The system must manage permissions for such tasks. The workflow is as follows: First, two personnel are selected in descending order of permission level. The system then issues a collaboration request to both. The operation commences only after both confirm acceptance. During the operation, the system continuously monitors for anomalies (e.g., disconnection). If one user drops out, the system issues the command to the next available personnel in the permission hierarchy to ensure task completion. Upon finishing their part in a step, personnel actively release their permissions before the next step begins.

This process can be formalized. For a synchronous task requiring $U$ users (typically $U=2$), let the set of available users be $A = \{a_1, a_2, \ldots, a_K\}$ sorted by permission $p(a_1) > p(a_2) > \ldots > p(a_K)$. The system selects the top $U$ users, $S = \{a_1, a_2\}$. It sends a collaboration request $C_{req}$. The operation starts if all confirm:
$$ \text{StartOperation} \iff \forall a_i \in S, \quad \text{Confirm}(a_i, C_{req}) = \text{True} $$
If a user $a_j \in S$ fails during operation at time $t_f$, the system replaces them with the next eligible user $a_{U+1}$:
$$ S' = (S \setminus \{a_j\}) \cup \{a_{U+1}\} $$
The state of the operation $OpState(t)$ must remain consistent across all clients despite the change.
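The select/confirm/replace cycle formalized above can be sketched as three small functions. This is a simplified illustration under the assumption that the availability list is already sorted by descending permission; the user identifiers are hypothetical.

```python
def select_collaborators(available, u=2):
    """Select the top U users S = {a_1, ..., a_U} from the permission-sorted
    availability list A."""
    return available[:u]

def start_operation(confirmations):
    """The operation starts only after every selected user confirms the
    collaboration request C_req."""
    return all(confirmations.values())

def replace_failed(selected, available, failed):
    """On failure of a_j at time t_f, swap in the next eligible user from
    the permission hierarchy: S' = (S \\ {a_j}) U {a_(U+1)}."""
    substitute = next(a for a in available if a not in selected)
    return [u for u in selected if u != failed] + [substitute]
```

Because the replacement is drawn from the same sorted list, the invariant that the two operating users are the highest-permission users still online is preserved across failures.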

1.2 Simulation Task Modeling

1.2.1 Hierarchical Task Structure

Simulating the multi-user collaborative process for power line drone inspection first requires a comprehensive and clear description of all possible operational tasks and a realistic, fluent reproduction of specific actions. Decomposing the simulation tasks hierarchically, thereby modeling and encapsulating line maintenance operations, allows for more flexible control of virtual humans and drones to execute specified actions and enables accurate, real-time control of the inspection simulation process. Therefore, task decomposition is key to operational process simulation. Our research integrates task decomposition with human activity simulation, designing an operational task decomposition model centered on human activity. The specific operational process for a person completing a task can be represented as a sequence of actions within a spatiotemporal context.

Based on the human activities within an operational process, simulating each specific action is the foundation for completing the entire simulation. Therefore, task decomposition should break down the complete process to the level of all movements and action information required to perform the operation, and accurately express this information. Guided by process-oriented and hierarchical design principles, we posit that complex operational tasks consist of several subtasks, and each subtask comprises multiple independent, easily describable basic actions. According to action type and task abstraction level, we categorize line maintenance operational activities into three layers from top to bottom: the business-related Maintenance Task Layer, the Operation Unit Layer oriented towards smaller operational objectives, and the task-agnostic Basic Action Layer. The decomposition model is illustrated below.

Table 1: Hierarchical Task Decomposition Model

| Layer | Description | Example Components |
|---|---|---|
| Maintenance Task Layer | The final operational goal, composed of a sequence of operation units. | Drone takeoff, hover, photograph defect, land. |
| Operation Unit Layer | Describes the operations a virtual worker must perform to accomplish a sub-goal; formed by combining basic actions. | Pick up controller, power on drone, perform pre-flight check. |
| Basic Action Layer | Parameterized, generic actions independent of specific tasks; possess universal semantics. | Walk to(point), bend over, turn(angle), grasp(tool), release(object). |

The hierarchical structure not only completely describes any operational task but also provides information support for evaluating and analyzing the operational process.
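As a concrete illustration of the three-layer decomposition in Table 1, a task can be represented as nested data: a maintenance task is a sequence of operation units, each of which is a sequence of parameterized basic actions. The task, unit, and action names below are hypothetical examples, not entries from the system's actual task database.

```python
# Maintenance Task Layer -> Operation Unit Layer -> Basic Action Layer
task = {
    "name": "inspect_tension_tower",           # Maintenance Task Layer
    "units": [
        {
            "name": "prepare_drone",           # Operation Unit Layer
            "actions": [                       # Basic Action Layer
                ("walk_to", {"point": (3.0, 0.0, 1.5)}),
                ("grasp", {"tool": "remote_controller"}),
            ],
        },
        {
            "name": "photograph_defect",
            "actions": [("turn", {"angle": 45.0})],
        },
    ],
}

def flatten(task):
    """Expand a task into the ordered list of basic actions to execute."""
    return [action for unit in task["units"] for action in unit["actions"]]
```

Flattening the hierarchy yields exactly the action sequence the virtual human must perform, which is also the granularity at which the process can be recorded and evaluated.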

1.2.2 Establishing the Action Library

Based on the hierarchical decomposition, when an operator interacts with a simulated object in the virtual scene, they call basic actions from the library, combine them into an operational behavior to complete a unit task, and finally achieve the maintenance task goal by completing multiple unit tasks. Therefore, the basic action library must be disjoint, complete, and reusable, covering a comprehensive set of reusable basic actions that describe all operations required in actual work. To facilitate the modeling and simulation of various line maintenance operations, we define a set of basic actions as standards for describing operational behaviors and invoke them through predefined commands, as summarized below.

Table 2: Basic Action Library Commands

| Command | Parameters | Description |
|---|---|---|
| Worker_Gesture(gesture1, gesture2) | gesture1: hand posture; gesture2: body posture | Adjusts the virtual human from the current posture to the desired hand operation and appropriate body posture. |
| Target_Location(location) | location: target coordinates (x, y, z) | Moves the virtual human to the specified operational position. |
| Use_Tool(name, gesture1, gesture2, equipment) | name: tool name; gesture1, gesture2: postures; equipment: target object | Retrieves a tool from the scene and performs an operation on a specific equipment item. |
| Operation(equipment_area, action) | equipment_area: equipment region identifier; action: specific operation | Executes a specific inspection action on a defined area of equipment. |
| Release(equipment, gesture1, gesture2) | equipment: object name; gesture1, gesture2: postures to revert to | Releases the held equipment and reverts hand and body to the initial or a specified state. |
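Invoking basic actions "through predefined commands" amounts to a command dispatcher: a lookup table from command names to handlers. The sketch below is illustrative only; the handler bodies (which in the real system would drive animations) are stubbed out to return descriptive strings, and only two of the Table 2 commands are shown.

```python
# Stub handlers standing in for the animation-driving routines.
def worker_gesture(gesture1, gesture2):
    return f"posture set: hand={gesture1}, body={gesture2}"

def target_location(location):
    return f"moved to {location}"

# The basic action library: command name -> handler.
ACTION_LIBRARY = {
    "Worker_Gesture": worker_gesture,
    "Target_Location": target_location,
}

def invoke(command, **params):
    """Look up a basic action by its predefined command name and execute it
    with the given parameters."""
    if command not in ACTION_LIBRARY:
        raise KeyError(f"unknown basic action: {command}")
    return ACTION_LIBRARY[command](**params)
```

An operation unit is then just an ordered sequence of `invoke` calls, which is what makes the library disjoint, complete, and reusable across different maintenance tasks.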

We employ a combination of keyframe animation and inverse kinematics (IK) computation to animate the virtual human's basic actions. Keyframe methods are used for relatively fixed actions: we first build a basic animation library composed of keyframe animations and realize the virtual worker's operational behaviors by calling these animations. For actions difficult to represent with fixed animations (such as grasping objects of varying sizes, where hand aperture and posture differ), we use IK methods to achieve realistic interaction. For example, given the initial hand position/orientation $H_{init}$ and the target hand position/orientation $H_{target}$, we interpolate the transformation over a time interval $\Delta t$, simulating non-fixed actions. For a limb with $n$ joints, the IK solution finds joint angles $\theta_i$ that satisfy:
$$ H_{target} = FK(\theta_1, \theta_2, \ldots, \theta_n) $$
where $FK$ is the forward kinematics function. In practice, a cyclic coordinate descent (CCD) or Jacobian-based method is used within the game engine to compute $\theta_i$ in real-time.
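The CCD approach mentioned above can be demonstrated on a planar joint chain: each joint in turn is rotated so that the end effector swings toward the target, and the sweep is repeated until convergence. This is a self-contained 2D sketch for illustration, not the engine's 3D solver, and it ignores joint limits.

```python
import math

def ccd_ik(lengths, angles, target, iterations=50):
    """Cyclic coordinate descent IK for a planar chain.

    lengths: link lengths; angles: relative joint angles in radians;
    target: (x, y) goal for the end effector. Returns (angles, effector).
    """
    angles = list(angles)

    def forward(angles):
        # Forward kinematics FK: accumulate joint positions along the chain.
        pts, x, y, a = [(0.0, 0.0)], 0.0, 0.0, 0.0
        for length, theta in zip(lengths, angles):
            a += theta
            x += length * math.cos(a)
            y += length * math.sin(a)
            pts.append((x, y))
        return pts

    for _ in range(iterations):
        # Sweep joints from the effector back to the base.
        for j in reversed(range(len(angles))):
            pts = forward(angles)
            jx, jy = pts[j]
            ex, ey = pts[-1]
            # Rotate joint j so the effector direction aligns with the target.
            to_effector = math.atan2(ey - jy, ex - jx)
            to_target = math.atan2(target[1] - jy, target[0] - jx)
            angles[j] += to_target - to_effector
    return angles, forward(angles)[-1]
```

For grasping, the same loop would run on the arm chain with $H_{target}$ set to the grasp point on the object, with the hand aperture handled separately.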

1.3 3D Simulation Resource Modeling

1.3.1 Model Definition

To enable dynamic interaction and display between virtual personnel and objects in the scene, some 3D objects in the line maintenance simulation environment are non-static. To facilitate animation control and improve interaction efficiency, we adopt a feature-based modeling approach to create predefined object descriptions. These define not only the geometric characteristics of equipment but also their functional features and the tasks they participate in. By pre-storing information relevant to interaction and providing parameters for the basic actions in the library, virtual personnel can quickly access this information and invoke corresponding actions to perform specified interactive tasks. Integrating interactive object parts, interaction locations, equipment state changes, and other information enables the simulation of any possible interactive behavior between virtual personnel and objects. We categorize interactive features based on the operational characteristics of line maintenance projects, as detailed below.

Table 3: Interactive Feature Classification

| Category | Feature | Specific Content |
|---|---|---|
| Device Attributes | Parts | Describes the geometric shape, hierarchy, position, and physical properties (mass, center of gravity) of each component. |
| Device Attributes | Motion | Defines device movement and any other changes (e.g., color, material). |
| Interaction Information | Location | Defines interaction hotspots on the device and valid positions for the virtual human to facilitate interaction. |
| Interaction Information | Posture | Describes the posture the virtual human should adopt, primarily defining hand operation and body stance. |
| State Change | Variables | Records the state of the device and its changes after various interaction operations. |
| Character Action | Operation | Describes preset actions for virtual roles via simple instruction sets (e.g., positioning to avoid collision during device operation). |
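A feature-based object description in the sense of Table 3 bundles parts, interaction hotspots, required postures, and state variables into one record that the virtual human queries before interacting. The field names and the drone example below are hypothetical, chosen only to show the shape of such a record.

```python
# Hypothetical feature-based description of an interactive drone model.
drone_features = {
    "parts": {"body": {"mass": 6.1}, "gimbal": {"mass": 0.8}},
    "interaction": {
        "location": {"power_switch": (0.12, 0.03, 0.0)},   # hotspot coords
        "posture": {"power_switch": ("pinch", "crouch")},  # hand, body
    },
    "state": {"powered": False},
}

def interact(features, hotspot):
    """Look up the stand point and posture for a hotspot, apply the
    state change, and return what the virtual human needs to act."""
    point = features["interaction"]["location"][hotspot]
    posture = features["interaction"]["posture"][hotspot]
    features["state"]["powered"] = not features["state"]["powered"]
    return point, posture
```

Because everything needed for the interaction is pre-stored on the object, the basic actions of Table 2 can be parameterized directly from this record at runtime.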

1.3.2 3D Modeling

To construct a realistic virtual operational environment for line maintenance, it is necessary to create fundamental resource models, including 3D models of line equipment, defect models, human models, and the operational environment. We use the parametric modeling software 3DMax to create the 3D models involved in the virtual line maintenance process. The modeling workflow follows a standard pipeline from conceptual design and reference collection to geometric modeling, texture mapping, and optimization. The final published model files are imported into the VR development platform Unity3D for constructing the line maintenance simulation scene. The geometry of a model $M$ can be represented as a set of vertices $V$, faces $F$, and textures $T$: $M = \{V, F, T\}$. For efficient rendering in a collaborative real-time environment, the polygon count for complex models like transmission towers must be optimized: $|F|_{optimized} \leq F_{budget}$.

2. System Design

2.1 System Composition

We divide the functionality of the drone line inspection simulation training system into a Trainee Client and an Instructor Station. Multiple trainee clients enable collaborative operation of the line maintenance process, where trainees use different hardware (controllers, HMDs) to interact and immerse themselves in the same operational scenario. The instructor monitors, records, and manages the multi-user training process via a control console. For network architecture, both distributed and centralized coordination are common. Centralized coordination, where a server receives input from all nodes, processes it uniformly, and broadcasts results, is simpler for conflict resolution and suitable for systems with fewer nodes. Since drone training is typically organized within a local area network (LAN) with 2-4 concurrent users, we adopted a centralized architecture for our simulation training system.

The system consists of one simulation server, multiple trainee client computers, one instructor station, one stereoscopic projection monitoring system, and a network switch. Trainees wear VR input devices (e.g., motion controllers, trackers) and feedback devices (e.g., HMDs, force feedback devices) to participate in collaborative simulation training. The client captures input data, and based on computation results from the server, renders the first-person 3D operational scene for that trainee and drives force feedback devices. The simulation server receives operational data from all trainees, performs simulation computations (including 3D scene updates, collision detection, grasp simulation, dynamics, and collaborative processing), and sends results back to clients and the instructor station. It also handles process recording, replay, and multi-view observation. The instructor can send control commands via the instructor station and observe the operational process and collaboration of all trainees through the projection system.
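The centralized flow described above reduces to a server tick: gather every client's input, advance the simulation once, and broadcast a single authoritative snapshot. The sketch below illustrates only that control flow; the state layout and client identifiers are assumptions, and real computation steps (collision detection, dynamics, grasp simulation) are abstracted into a position update.

```python
def server_tick(state, inputs):
    """One tick of the centralized server: apply each client's input delta
    to the shared state, then return the snapshot to broadcast to all
    clients and the instructor station."""
    for client_id, delta in inputs.items():
        pos = state["positions"].get(client_id, (0.0, 0.0, 0.0))
        state["positions"][client_id] = tuple(p + d for p, d in zip(pos, delta))
    state["tick"] += 1
    # Broadcast payload: authoritative tick number plus object states.
    return {"tick": state["tick"], "positions": dict(state["positions"])}
```

Since every client renders from the same broadcast snapshot, conflict resolution happens in one place, which is why the centralized topology suits the 2-4 user LAN setting.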

2.2 System Architecture

Addressing the functional requirements of the multi-user collaborative drone inspection simulation training system, we constructed the system architecture based on a data-modeling approach, comprising the Presentation Layer, Logic Layer, Data Layer, and Support Layer.

Table 4: System Architecture Layers

| Layer | Components | Function |
|---|---|---|
| Presentation Layer | Trainee HMD/controller UI; instructor console UI; projection display. | Provides human-computer interaction interfaces for different users; displays simulation data. |
| Logic Layer | Simulation training module, assessment module, permission manager, training monitor, information manager. | Core processing for business logic, system management, and configuration; implements the collaborative workflow and task model. |
| Data Layer | 3D model DB, equipment info DB, question DB, user profile DB, action library DB. | Encapsulates all resources needed for simulation (models, information, user data). |
| Support Layer | Unity3D, 3DMax, VC++, DBMS; TCP/IP, IPX, DirectPlay, video protocols. | Provides foundational software development platforms and network protocols for scene rendering, control, and data transmission. |

2.3 Network Structure

The line maintenance collaborative simulation system is a multi-computer, multi-display training environment. During collaborative operation, consistency across all clients’ virtual scenes—including scene elements, object positions/states, and inter-object constraints—is essential for smooth collaboration. Frequent data exchange between clients and a central server regarding scenes, roles, scripts, and communication is key to realizing collaborative work. To ensure real-time and reliable data synchronization, we employ a data synchronization technique in designing the network topology.

A public data server acts as a synchronization hub for virtual scene element data. Each client accesses this public information in a Client/Server (C/S) mode. Before system runtime, each client downloads a copy of the virtual scene element database. By modifying this central database, all clients can update their local scenes. During operation, the network does not transmit the entire scene data, but only the dynamic data such as object positions, orientations, and state relationships. Clients receive this data and update their local scenes and the execution information for virtual humans and feedback devices accordingly. This approach minimizes network bandwidth usage, ensures real-time data interaction, and maintains virtual scene consistency across all trainee clients during training. The synchronization of an object’s state $S_o$ at time $t$ across $N$ clients can be expressed as a function of the authoritative server state $S_o^{server}(t)$ and network latency $L_i$ for client $i$:
$$ S_o^{client_i}(t) \approx S_o^{server}(t - L_i) $$
The goal is to minimize the discrepancy $\epsilon_i(t)$:
$$ \epsilon_i(t) = \| S_o^{client_i}(t) - S_o^{server}(t) \| $$
for all objects $o$ and clients $i$, using synchronization algorithms like dead reckoning or state interpolation.
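Dead reckoning, mentioned above, is easily sketched: between server updates each client extrapolates an object's position from its last authoritative state and velocity, and snaps (or blends) back to the server state when the error $\epsilon_i$ exceeds a threshold. The threshold value below is an arbitrary illustrative choice.

```python
def dead_reckon(last_pos, last_vel, elapsed):
    """Extrapolate an object's position from the last authoritative server
    update, masking the network latency L_i between updates."""
    return tuple(p + v * elapsed for p, v in zip(last_pos, last_vel))

def needs_correction(predicted, actual, threshold=0.05):
    """Return True when the discrepancy epsilon between the local prediction
    and the server state exceeds the tolerance, triggering a correction."""
    error = sum((p - a) ** 2 for p, a in zip(predicted, actual)) ** 0.5
    return error > threshold
```

Only position/velocity deltas and occasional corrections cross the network, which is exactly the bandwidth-saving property the topology design relies on.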

3. System Implementation

3.1 Visual Scene Simulation

Constructing a virtual line maintenance scene is the foundation for multi-user collaborative training. We first built a 3D dynamic simulation platform for line maintenance scenes using the Unity 3D development platform. Leveraging the 3D model library and this platform, dynamic 3D operational scenes can be quickly assembled through custom script analysis methods and importing equipment files.

3.2 Functional Realization

The system first uses the 3D dynamic simulation platform to recreate the actual line maintenance environment in three dimensions. Human-computer interaction is then achieved via VR input devices. Multiple trainees, using different clients, enter the same operational environment, assume different roles, and perform collaborative operations according to the system’s predefined workflow to complete each step. For example, in a 500 kV single-circuit tension tower inspection task, one operator controls the remote controller (pilot) and another operates the ground control station (payload operator). The control setup and viewing perspectives are identical to actual inspection operations, and VR goggles can be used for training.

The pilot is responsible for maneuvering the drone to a specified location and hovering. The payload operator adjusts the gimbal, aligning the drone’s camera with the inspection point to capture compliant photographs. After collaboratively completing one task, both personnel proceed to the next inspection point. This phase follows the synchronous multi-user operation process. Apart from the assigned tasks, trainees are not restricted from performing other exploratory operations within the simulation, enhancing the flexibility of the drone training experience.

By connecting all trainee clients, the instructor station, and the server via a LAN, a collaborative simulation training system is established. Multiple trainees can form a team according to actual line maintenance requirements, play different operational roles based on their positions, and conduct collaborative training for the same line inspection project within the same virtual environment. Within this virtual environment, each trainee experiences the scene from a first-person perspective. They perform free-form simulation operations according to actual job requirements, coordinating with each other, and can communicate freely in real space. This allows them to genuinely experience the working conditions and operational responsibilities of their respective roles, strengthening their training in multi-user collaborative operational capabilities and essentials.

During collaborative operations, the instructor can observe the entire process from a freely controllable third-person perspective via the instructor station. This view can be displayed on a projection screen for on-site observation and technical exchange by other personnel.

4. Conclusion

In summary, this research presents a comprehensive solution for multi-user collaborative drone training for power line inspection. The key contributions are:

  1. Collaborative Operation Model: We designed a robust model using a token mechanism and dynamic permission allocation for both parallel and synchronous operations. A hierarchical task structure decomposes complex maintenance tasks, while feature-based modeling describes interactive objects. This model effectively resolves operational conflicts and ensures system stability.
  2. System Architecture & Network: A centralized LAN-based system was constructed, employing data synchronization techniques in its network topology to guarantee real-time consistency of the virtual simulation scene across all client machines, which is fundamental for credible collaborative drone training.
  3. Functional Implementation: Utilizing development platforms like Unity 3D and 3DMax, we implemented a fully functional collaborative simulation training system. It enables multiple trainees to engage in immersive, first-person operational training within a shared virtual environment, while providing the instructor with comprehensive monitoring and management tools.

The results demonstrate that the system operates stably and reliably. It provides an advanced and effective training tool for collaborative line maintenance operations and can be used for specialized skill training in drone inspection and for pre-operational simulation drills. This approach addresses the limitations of traditional and single-user simulation training, offering a safer, more controlled, and deeply engaging platform for developing the critical coordination skills required for modern power line inspection teams. The principles and architecture developed are also broadly applicable to other domains requiring collaborative procedural training in hazardous or complex environments.
