This research focuses on the collaborative application of advanced Unmanned Aerial Vehicle (UAV) mapping technologies for the systematic survey and documentation of immovable cultural relics. The work was conducted as part of a national heritage census initiative, aiming to accurately record the quantity, distribution, characteristics, preservation status, and environmental context of historical sites. The methodology represents an in-depth, innovative fusion of several key technologies: UAV-based oblique photogrammetry, airborne LiDAR (Light Detection and Ranging) for point cloud acquisition, and handheld laser scanning systems. By synergistically employing “airborne survey” and “ground measurement” strategies, this integrated approach enables non-contact, high-precision, and efficient recording of cultural heritage assets, from expansive archaeological landscapes to intricate architectural details.
The initial phase involved the preparation of high-resolution satellite imagery basemaps. All known site coordinates were converted to the CGCS2000 national coordinate system and plotted onto these maps. These served as the foundational survey documents. Recognizing the diversity of heritage types—ancient cultural ruins, tombs, cave temples, rock carvings, and modern historical sites—customized technical plans were formulated for each category to determine both the core relic boundaries and their associated buffer zones for protection.
The cornerstone of the “airborne survey” strategy was the use of UAVs for rapid aerial data capture. For sites like ancient tombs or ruins with minimal surface structures, aircraft such as the DJI Phantom 4 RTK and Mavic 3 Enterprise were deployed. These aircraft are equipped with high-precision GNSS (Global Navigation Satellite System) receivers and high-resolution cameras. Key flight parameters for generating orthomosaics with approximately 2 cm ground sampling distance (GSD) are summarized below:
| Parameter | Flat Terrain (e.g., Xiaoshizhuang Tomb Complex) | Mountainous/Hilly Terrain (e.g., Wazhang Hill) |
|---|---|---|
| Flight Altitude | < 120 m | Variable (Terrain-following mode) |
| Forward Overlap | 60% | 65% |
| Side Overlap | 50% | 55% |
| Primary UAV Used | DJI Phantom 4 RTK | DJI Mavic 3 Enterprise |
The imagery was processed using software like DJI Terra or Pix4Dmapper to produce the required orthophoto maps, significantly aiding in the visual interpretation and delineation of site boundaries.
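The ~2 cm GSD target follows from the standard relation GSD = p · H / f (pixel pitch p, flight altitude H, focal length f). A minimal sketch, assuming nominal Phantom 4 RTK optics (8.8 mm focal length, ~2.41 µm pixel pitch on its 1″ sensor); these defaults are assumptions for illustration, not figures taken from the survey plan:

```python
def gsd_cm(altitude_m: float, focal_mm: float = 8.8, pixel_um: float = 2.41) -> float:
    """Ground sampling distance in cm: GSD = pixel_pitch * H / f."""
    return (pixel_um * 1e-6) * altitude_m / (focal_mm * 1e-3) * 100.0

def altitude_for_gsd(target_gsd_cm: float, focal_mm: float = 8.8,
                     pixel_um: float = 2.41) -> float:
    """Invert the relation: flight altitude (m) needed for a target GSD."""
    return (target_gsd_cm / 100.0) * (focal_mm * 1e-3) / (pixel_um * 1e-6)
```

With these assumed optics, a 2 cm GSD corresponds to roughly 73 m of altitude, comfortably inside the < 120 m ceiling listed in the table above.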
Overcoming Vegetation Challenges with Airborne LiDAR
Sites with dense tree cover, such as memorial gardens, posed a significant challenge for conventional photogrammetry, as optical cameras cannot penetrate foliage to capture the ground surface or obscured structures. For these environments, the project employed a DJI Matrice 350 RTK platform equipped with a Zenmuse L2 integrated LiDAR module, a combination particularly well suited to surveying forested areas. The Zenmuse L2 module incorporates a frame-scanning LiDAR, a high-precision IMU (Inertial Measurement Unit), and a 4/3″ CMOS mapping camera. The system actively emits laser pulses and measures the time delay of the returned signals to calculate distances. The resultant point cloud provides direct 3D measurements of the ground and objects beneath the canopy.
The LiDAR data acquisition was configured for high detail and accuracy. A five-return mode was used to capture multiple echoes from a single pulse, allowing the recording of returns from canopy layers, branches, and finally the ground. The system operated at a sampling frequency of 240 kHz (240,000 points per second). To ensure completeness and accuracy, the following mission parameters were set:
| Mission Parameter | LiDAR Setting | Visible Camera Setting |
|---|---|---|
| Forward Overlap | 60% | 70% |
| Side Overlap | 30% | 60% |
| Flight Speed | 15 m/s | |
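These mission settings imply a nominal single-pass point density of pulse rate / (speed × swath width). A rough sketch; the 70° field of view and 150 m altitude used below are illustrative assumptions, not values taken from the flight plan:

```python
import math

def nominal_density(pulse_rate_hz: float, speed_ms: float,
                    altitude_m: float, fov_deg: float) -> float:
    """Single-pass LiDAR point density (pts/m^2): rate / (speed * swath).
    Swath width = 2 * H * tan(FOV / 2) for a symmetric cross-track scan."""
    swath_m = 2.0 * altitude_m * math.tan(math.radians(fov_deg / 2.0))
    return pulse_rate_hz / (speed_ms * swath_m)
```

At 240 kHz and 15 m/s, the assumed geometry yields on the order of 75 pts/m² per pass, before any gains from multiple returns or overlapping strips.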
The point cloud precision can be characterized by its expected error, often modeled as a function of flight altitude (H), scan angle (θ), and system-specific errors. The vertical accuracy (σz) is typically more critical for terrain modeling and is often stated as:
$$\sigma_z = a + b \cdot H \cdot \tan(\theta)$$
where \(a\) represents fixed systematic errors and \(b\) captures altitude- and angle-dependent errors related to ranging and GNSS/IMU precision. The Zenmuse L2 system specifications quote a vertical accuracy of 4 cm and a horizontal accuracy of 5 cm under optimal conditions.
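The error model can be evaluated directly; the coefficients below are illustrative placeholders, not Zenmuse L2 calibration values:

```python
import math

def sigma_z(altitude_m: float, scan_angle_deg: float,
            a: float = 0.02, b: float = 0.0005) -> float:
    """Vertical error model sigma_z = a + b * H * tan(theta), in metres.
    a and b are placeholder coefficients, for illustration only."""
    return a + b * altitude_m * math.tan(math.radians(scan_angle_deg))
```

With these placeholders, a 100 m flight at a 15° scan angle gives σz ≈ 3.3 cm, the same order of magnitude as the quoted 4 cm specification.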

Data processing was performed in DJI Terra, which handles trajectory calculation and point cloud optimization. The classified point cloud, separating ground points from vegetation and buildings, was used to generate a Digital Elevation Model (DEM). This DEM, when combined with the simultaneously captured and colored orthophoto, provided a complete and accurate representation of the terrain and any cultural features hidden beneath trees.
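The step from classified ground points to a DEM can be sketched as a simple per-cell average of ground returns; production packages such as DJI Terra use more sophisticated filtering and interpolation, so this is a minimal illustration only:

```python
import numpy as np

def grid_dem(ground_pts: np.ndarray, cell: float = 0.5) -> np.ndarray:
    """Rasterize classified ground points (N x 3 array of x, y, z) into a
    DEM by averaging z within each grid cell; empty cells become NaN."""
    x, y, z = ground_pts.T
    cols = np.floor((x - x.min()) / cell).astype(int)
    rows = np.floor((y - y.min()) / cell).astype(int)
    total = np.zeros((rows.max() + 1, cols.max() + 1))
    count = np.zeros_like(total)
    np.add.at(total, (rows, cols), z)   # unbuffered accumulation per cell
    np.add.at(count, (rows, cols), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```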
High-Fidelity 3D Modeling via Oblique Photogrammetry
For heritage sites with complex above-ground structures like temples, halls, or statues, the project utilized oblique photogrammetry to create photo-realistic 3D models. This technique mounts a multi-sensor array (typically a five-camera rig) on a UAV to capture imagery from one nadir (vertical) and four oblique angles simultaneously, recording not only rooftops but also the textures and geometry of building facades, which are crucial for architectural documentation.
The primary setup for this task was the DJI Matrice 350 RTK carrying a Sanyou PSDK 102SV3 five-lens camera. To achieve a highly detailed model suitable for heritage recording, aggressive flight parameters were chosen to maximize image overlap and minimize the GSD:
| Parameter | Setting for Detailed Modeling |
|---|---|
| Flight Altitude | 60 m |
| Forward Overlap | 75% |
| Side Overlap | 70% |
| Flight Speed | 10 m/s |
Furthermore, manual “orbital” flights around key structures were conducted using a DJI Phantom 4 RTK. Flying at multiple elevations and maintaining a constant distance from the subject, these flights captured additional perspectives of facades, eaves, and recessed areas, ensuring complete coverage and higher texture quality for the 3D model.
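The orbital pattern can be generated programmatically in local site coordinates. A sketch; the ring count, radius, and waypoint spacing below are arbitrary examples, not the mission's actual settings:

```python
import math

def orbit_waypoints(center_xy, radius_m, ring_altitudes_m, n_per_ring=24):
    """Waypoints (x, y, z, heading_deg) for circular orbits around a
    structure, with the camera heading always facing the centre."""
    cx, cy = center_xy
    waypoints = []
    for z in ring_altitudes_m:
        for k in range(n_per_ring):
            a = 2.0 * math.pi * k / n_per_ring
            x = cx + radius_m * math.cos(a)
            y = cy + radius_m * math.sin(a)
            # Heading from the waypoint back toward the subject centre.
            heading = math.degrees(math.atan2(cy - y, cx - x)) % 360.0
            waypoints.append((x, y, z, heading))
    return waypoints
```

Flying several such rings at different altitudes approximates the constant-distance, multi-elevation coverage described above.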
The core of processing oblique imagery into a 3D model is Aerotriangulation (AT). Using software like ContextCapture (Smart3D), the process begins with feature detection and matching across all images. The mathematical foundation is the collinearity condition, which states that the exposure station, the image point, and the corresponding object point lie on a straight line. The condition equations for a point \(i\) in image \(j\) are:
$$
x_{ij} - x_p = -f \frac{m_{11}(X_i - X_{oj}) + m_{12}(Y_i - Y_{oj}) + m_{13}(Z_i - Z_{oj})}{m_{31}(X_i - X_{oj}) + m_{32}(Y_i - Y_{oj}) + m_{33}(Z_i - Z_{oj})}
$$
$$
y_{ij} - y_p = -f \frac{m_{21}(X_i - X_{oj}) + m_{22}(Y_i - Y_{oj}) + m_{23}(Z_i - Z_{oj})}{m_{31}(X_i - X_{oj}) + m_{32}(Y_i - Y_{oj}) + m_{33}(Z_i - Z_{oj})}
$$
Where:
- \((x_{ij}, y_{ij})\) are the image coordinates of point \(i\).
- \((x_p, y_p, f)\) are the principal point coordinates and focal length (interior orientation).
- \((X_i, Y_i, Z_i)\) are the object space coordinates of point \(i\).
- \((X_{oj}, Y_{oj}, Z_{oj})\) are the object space coordinates of the exposure station for image \(j\).
- \(m_{11} \dots m_{33}\) are the elements of the 3D rotation matrix for image \(j\), defined by its orientation angles \((\omega, \phi, \kappa)\).
A bundle block adjustment solves for all unknown exterior orientation parameters \((X_{oj}, Y_{oj}, Z_{oj}, \omega_j, \phi_j, \kappa_j)\) and object point coordinates \((X_i, Y_i, Z_i)\) simultaneously, minimizing the discrepancies between measured and projected image coordinates. After this adjustment, dense image matching algorithms generate the final 3D mesh and texture, which is often tiled for efficient storage and visualization.
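The collinearity equations can be evaluated directly as a forward projection. A sketch; rotation conventions differ between software packages, and the ω-φ-κ sequence used here is one common photogrammetric choice, not necessarily ContextCapture's internal one:

```python
import numpy as np

def rotation_matrix(omega: float, phi: float, kappa: float) -> np.ndarray:
    """M = R_kappa @ R_phi @ R_omega, angles in radians (one common convention)."""
    co, so = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ck, sk = np.cos(kappa), np.sin(kappa)
    Rw = np.array([[1.0, 0.0, 0.0], [0.0, co, so], [0.0, -so, co]])
    Rp = np.array([[cp, 0.0, -sp], [0.0, 1.0, 0.0], [sp, 0.0, cp]])
    Rk = np.array([[ck, sk, 0.0], [-sk, ck, 0.0], [0.0, 0.0, 1.0]])
    return Rk @ Rp @ Rw

def project(obj_pt, exposure, f, xp=0.0, yp=0.0, angles=(0.0, 0.0, 0.0)):
    """Collinearity forward projection: object point -> image coords (x, y)."""
    m = rotation_matrix(*angles)
    d = m @ (np.asarray(obj_pt, float) - np.asarray(exposure, float))
    return (xp - f * d[0] / d[2], yp - f * d[1] / d[2])
```

A bundle adjustment iteratively refines the orientation parameters and object coordinates so that these projected coordinates match the measured image coordinates in a least-squares sense.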
Bridging the Gap: Seamless Indoor-Outdoor Data Fusion
While UAVs excel at capturing exteriors, documenting interior spaces like temple halls requires a ground-based solution. For this, a handheld SLAM (Simultaneous Localization and Mapping) laser scanner, the Huace Rask RS series, was employed. Its key innovation is the deep fusion of SLAM positioning with high-precision RTK GNSS. The system starts outdoors, acquiring an absolute GNSS-RTK position (accurate to ~3 cm in the CGCS2000 frame). As the operator moves indoors, the system seamlessly transitions to LiDAR/visual SLAM for navigation, using the RTK-derived position and trajectory as a robust prior to constrain SLAM drift.
This fusion allows for “loop-free” operation indoors, meaning the operator does not need to return to the starting point to close a loop for error correction. The absolute accuracy for the entire indoor-outdoor point cloud is maintained at approximately 5 cm, all within a unified coordinate system. The data acquisition protocol involved starting outside a structure (e.g., a temple hall), allowing the system to fix an RTK position, and then moving slowly and steadily indoors, ensuring all corners and ceiling areas were covered. The point cloud density \(\rho\) is a function of scanner settings and motion:
$$\rho \approx \frac{f_{scan} \cdot t_{dwell}}{A_{coverage}}$$
where \(f_{scan}\) is the scanner’s pulse repetition rate, \(t_{dwell}\) is the time spent scanning, and \(A_{coverage}\) is the area covered. Slow, deliberate movement increases \(t_{dwell}\), resulting in a denser, more detailed point cloud of the interior space.
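The practical effect of walking speed follows from a small numeric sketch; the pulse rate and strip width below are illustrative values, not Huace Rask RS specifications:

```python
def interior_density(pulse_rate_hz: float, walk_speed_ms: float,
                     strip_width_m: float) -> float:
    """Approximate density (pts/m^2) when sweeping a strip of surface at
    walking speed: dwell time per metre of travel is 1/speed, and the
    coverage per metre of travel is the strip width."""
    return pulse_rate_hz / (walk_speed_ms * strip_width_m)
```

Halving the walking speed doubles the resulting density, which is why the acquisition protocol stresses slow, steady movement.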
Synthesis and Project Impact
The ultimate power of this methodology lies in the integration of these disparate data streams. The exterior oblique photogrammetry model, the LiDAR-derived terrain model for vegetated areas, and the indoor SLAM point cloud are all spatially aligned within the CGCS2000 coordinate system. This creates a holistic, multi-resolution digital twin of the heritage site. The following table summarizes the role and output of each technology in the workflow:
| Technology | Primary Platform | Key Application in Heritage Survey | Main Data Product |
|---|---|---|---|
| Nadir Photogrammetry | Phantom 4 RTK, Mavic 3 Enterprise | Site overview, boundary delineation, 2D planimetric mapping. | High-resolution Orthomosaic (2 cm GSD). |
| Airborne LiDAR | Matrice 350 RTK + Zenmuse L2 | Documenting terrain and features under dense vegetation canopy. | Classified 3D Point Cloud, Digital Elevation Model (DEM). |
| Oblique Photogrammetry | Matrice 350 RTK + five-lens camera | High-detail 3D modeling of buildings, facades, and external structures. | Textured 3D Mesh Model (Realistic Visual Appearance). |
| Handheld SLAM LiDAR | Huace Rask RS System | Capturing precise geometry of interior spaces and confined areas. | Dense, colored 3D Point Cloud (Indoor). |
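Because every product shares the CGCS2000 frame, cross-stream consistency can be verified with a simple residual estimate over a shared surface. Real workflows would refine alignment with ICP or similar; this translation-only sketch (function names are illustrative) is only a first-order check:

```python
import numpy as np

def residual_offset(cloud_a: np.ndarray, cloud_b: np.ndarray) -> np.ndarray:
    """Mean translation between two overlapping N x 3 clouds of the same
    surface, both nominally georeferenced in CGCS2000."""
    return cloud_b.mean(axis=0) - cloud_a.mean(axis=0)

def align(cloud: np.ndarray, offset: np.ndarray) -> np.ndarray:
    """Shift a cloud by the estimated residual to co-register it."""
    return cloud - offset
```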
This deeply integrated and innovative application of modern geospatial technologies revolutionized the fieldwork for the heritage census. The project team was able to complete the field data acquisition for over 200 heritage sites in approximately 20 days—a task that would have taken significantly longer using traditional surveying methods. The approach not only provided a permanent, accurate, and rich digital record for preservation and research but also demonstrated a new paradigm for efficient and comprehensive cultural heritage documentation. The synergy between different UAV platforms and ground-based sensors, all working within a unified geospatial framework, sets a new standard for archaeological and conservation surveying.
