In complex, harsh environments where GPS signals are too weak to meet the navigation requirements for landing, the landing precision and robustness of unmanned aerial vehicles (UAVs) degrade significantly. To address the poor pose accuracy and insufficient positioning robustness caused by signal interference, we propose an autonomous UAV landing system based on augmented reality (AR) code recognition. The system is built around a Raspberry Pi single-board computer equipped with high-precision optical sensors and inertial measurement units, and achieves precise landing under ROS (Robot Operating System) running on Ubuntu. By integrating the OpenCV and AprilTag libraries in the simulation environment, we iteratively tune the navigation parameters, solve for the pose of the vehicle, and achieve trajectory tracking. This closes the control loop of landmark recognition, real-time tracking, and precise landing in both simulation and predefined real environments. Experimental results show that, under static conditions, the AR code recognition method achieves a recognition success rate of 98.5% and a positioning error of 0.053 m. The system has been integrated into the practical teaching of related courses, where it enhances students’ engineering practice and innovation capabilities.
1. Introduction
UAVs are increasingly used for aerial photography, inspection, and delivery. Safe and accurate landing remains a critical challenge, especially where satellite signals are obstructed (e.g., indoors, under bridges, or in dense forests). Traditional GPS-based navigation often suffers from drift and multipath effects, which makes autonomous landing unreliable. Vision-based positioning has emerged as an effective passive navigation aid that overcomes these limitations. Among visual methods, landmark-based positioning, particularly with AR codes (such as ArUco markers), offers high stability and real-time performance. In this work, we design a complete UAV system that uses AR code recognition to compute the relative pose and control the descent precisely. Our approach integrates multiple sensors, including a monocular camera, IMU, ultrasonic sensor, and barometer, to provide robust state estimation. The following sections detail the hardware architecture, software algorithms, and experimental validation.
2. System Hardware Design
The hardware of the UAV consists of several key modules. The Raspberry Pi serves as the main onboard computer, responsible for image processing, data fusion, and high-level control commands. A Pixhawk flight controller handles low-level attitude stabilization and motor control and communicates with the Raspberry Pi via the MAVLink protocol. The sensor suite includes:
| Sensor | Function |
|---|---|
| Monocular Camera (USB) | Acquires images of the AR code for visual localization |
| IMU (Inertial Measurement Unit) | Measures angular rates and accelerations for attitude estimation |
| Ultrasonic Sensor | Provides accurate low-altitude height measurement (range 0.02–4 m) |
| Barometer | Measures atmospheric pressure for coarse altitude estimation at high altitudes |
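The table assigns the ultrasonic sensor to low altitudes and the barometer to high altitudes, but the text does not say how the two readings are combined. The sketch below shows one simple possibility, a hard switch on the ultrasonic sensor's valid range. The function name `select_altitude` and the switching rule are illustrative assumptions, not the authors' fusion scheme.

```python
from typing import Optional

# Valid ultrasonic range taken from the sensor table (metres).
ULTRASONIC_MIN_M = 0.02
ULTRASONIC_MAX_M = 4.0


def select_altitude(ultrasonic_m: Optional[float], baro_m: float) -> float:
    """Pick the altitude source for the current control step.

    Assumed rule (not from the paper): trust the ultrasonic reading whenever
    it lies inside its valid range, otherwise fall back to the barometer.
    """
    if ultrasonic_m is not None and ULTRASONIC_MIN_M <= ultrasonic_m <= ULTRASONIC_MAX_M:
        return ultrasonic_m
    return baro_m


print(select_altitude(1.2, 1.5))   # ultrasonic reading is in range
print(select_altitude(6.0, 5.8))   # out of range, barometer is used
```

A real system would more likely blend the two sources (for example in a Kalman filter) to avoid a jump at the switching point.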
The platform is a custom quadcopter with a diagonal wheelbase of 450 mm. The onboard computer runs Ubuntu 20.04 with ROS Noetic. The camera is calibrated with a 9×6 chessboard pattern to obtain its intrinsic parameters. The following figure shows the UAV setup used in our experiments.

3. Algorithm Design
3.1 Overall Algorithm Flow
The algorithm by which the UAV recognizes the AR code and lands autonomously is divided into several stages: image acquisition, AR code detection, pose estimation, coordinate transformation, and PID control. The process runs as follows:
- Capture image from the monocular camera.
- Detect the AR code (ArUco marker) using OpenCV’s ArUco module.
- If detection succeeds, compute the marker’s 3D pose (rotation and translation vectors) relative to the camera using the camera intrinsic parameters and the known marker size.
- Transform the marker pose from camera coordinates to the UAV body frame using a pre-calibrated transformation matrix.
- Calculate the horizontal and vertical errors between the UAV and the landing marker.
- Send the correction commands computed by the PID control law to the flight controller.
- Repeat until the UAV is directly above the marker and the altitude is below a threshold (e.g., 0.1 m), then trigger the landing sequence.
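The decision logic of one loop iteration can be sketched as a small function. This is a minimal illustration under stated assumptions: the mode names, the `Detection` type, and the 0.05 m "directly above" tolerance are hypothetical, while the 0.1 m altitude threshold comes from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

LAND_ALTITUDE_M = 0.1      # altitude threshold from the text
CENTER_TOLERANCE_M = 0.05  # hypothetical tolerance for "directly above the marker"


@dataclass
class Detection:
    """Marker position expressed in the UAV body frame (metres)."""
    x: float
    y: float
    z: float


def landing_step(det: Optional[Detection],
                 altitude_m: float) -> Tuple[str, Tuple[float, float]]:
    """Return (mode, horizontal_error) for one iteration of the loop."""
    if det is None:
        # No marker in view: keep searching, no correction available.
        return "SEARCH", (0.0, 0.0)
    # Horizontal errors are the negated marker coordinates in the body frame.
    ex, ey = -det.x, -det.y
    centered = (ex * ex + ey * ey) ** 0.5 <= CENTER_TOLERANCE_M
    if centered and altitude_m < LAND_ALTITUDE_M:
        return "LAND", (ex, ey)    # trigger the landing sequence
    return "TRACK", (ex, ey)       # pass the errors on to the PID controller


print(landing_step(None, 1.2))
print(landing_step(Detection(0.3, -0.1, -1.2), 1.2))
print(landing_step(Detection(0.01, 0.0, -0.08), 0.08))
```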
3.2 Camera Calibration and AR Code Detection
Camera calibration is essential to obtain the intrinsic matrix K and distortion coefficients. We use a 9×6 chessboard pattern with known square size (25 mm). The calibration procedure captures at least 20 images from different angles. The intrinsic matrix K is defined as:
$$
K = \begin{bmatrix}
f_x & 0 & u_0 \\
0 & f_y & v_0 \\
0 & 0 & 1
\end{bmatrix}
$$
where \( f_x, f_y \) are the focal lengths in pixels, and \( (u_0, v_0) \) is the principal point. After calibration, we save the parameters for use in pose estimation.
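To make the role of \( K \) concrete, the sketch below projects a 3D point given in the camera frame onto the image plane with the pinhole model, ignoring lens distortion. The intrinsic values (600 px focal length, principal point at the center of a 640×480 image) are hypothetical placeholders, not the calibrated values from this work.

```python
from typing import Tuple


def project_point(point_c: Tuple[float, float, float],
                  fx: float, fy: float, u0: float, v0: float) -> Tuple[float, float]:
    """Project a 3D camera-frame point (metres) to pixel coordinates.

    Implements u = fx * x / z + u0 and v = fy * y / z + v0, i.e. the
    intrinsic matrix K applied after perspective division (no distortion).
    """
    x, y, z = point_c
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return fx * x / z + u0, fy * y / z + v0


# Hypothetical intrinsics for a 640x480 image.
u, v = project_point((0.1, -0.05, 1.0), fx=600.0, fy=600.0, u0=320.0, v0=240.0)
print(f"pixel = ({u:.1f}, {v:.1f})")
```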
For AR code detection, we use the ArUco library with a predefined dictionary (e.g., DICT_6X6_250). The marker ID is 0, and the actual physical size is 15 cm × 15 cm. Detection involves thresholding, contour extraction, and corner refinement. The detected corners are used to solve the Perspective-n-Point (PnP) problem, yielding the rotation matrix \( R \) and translation vector \( t \) from marker to camera.
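For reference, the PnP step can be written as a reprojection-error minimization over the four marker corners. This is the standard textbook formulation, given here as a sketch with lens distortion neglected; the solver actually used by the library may be algebraic rather than iterative. The symbols \( P_i \) (corner coordinates in the marker frame), \( p_i \) (detected corner pixels), \( \pi \) (perspective division), and \( s \) (marker side length) are notation introduced for this illustration.

$$
(R, t) = \arg\min_{R,\, t} \sum_{i=1}^{4} \left\| p_i - \pi\!\left( K \left( R P_i + t \right) \right) \right\|^2,
\qquad
\pi\!\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x / z \\ y / z \end{bmatrix},
\qquad
P_i \in \left\{ \left( \pm \tfrac{s}{2},\ \pm \tfrac{s}{2},\ 0 \right) \right\},\quad s = 0.15\ \text{m}
$$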
3.3 Coordinate Transformation
To control the UAV precisely, we need to express the marker’s pose in the UAV body frame. Let the coordinate frames be defined as:
- \( O_b X_b Y_b Z_b \): UAV body frame (origin at center of mass, \( X_b \) forward, \( Y_b \) left, \( Z_b \) upward, forming a right-handed frame).
- \( O_c X_c Y_c Z_c \): Camera frame (origin at camera optical center, \( Z_c \) along optical axis).
- \( O_m X_m Y_m Z_m \): Marker frame (origin at marker center, \( X_m \) and \( Y_m \) on marker plane, \( Z_m \) perpendicular outward).
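The transformation from the camera frame to the body frame is fixed by the mechanical mounting (a constant offset and rotation). Let \( R_c^b \) and \( t_c^b \) denote the rotation and translation from camera to body, and let \( R_m^c \) and \( t_m^c \) be the marker-to-camera rotation and translation obtained from PnP (Section 3.2). A point given in the marker frame is then expressed in the body frame as: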
$$
\begin{bmatrix} X_m^b \\ Y_m^b \\ Z_m^b \\ 1 \end{bmatrix} =
\begin{bmatrix} R_c^b & t_c^b \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} R_m^c & t_m^c \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} X_m^m \\ Y_m^m \\ Z_m^m \\ 1 \end{bmatrix}
$$
Since the marker’s origin is at its center, we set \( (X_m^m, Y_m^m, Z_m^m) = (0,0,0) \). The relative position of the UAV with respect to the marker is then \( -(X_m^b, Y_m^b, Z_m^b) \). This gives the horizontal errors \( e_x = -X_m^b \) and \( e_y = -Y_m^b \), and the altitude error \( e_z = -Z_m^b - h_{desired} \), where \( h_{desired} \) is the target altitude above the marker.
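The transformation chain and the error definitions above can be checked numerically with a few lines of plain Python. All numeric values here are hypothetical: the mounting rotation `R_cb` assumes a camera looking straight down (optical axis along \( -Z_b \)), the offset `t_cb` and the PnP output `R_mc`, `t_mc` are made-up examples, and `h_desired = 1.0` is chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def homogeneous(R: Matrix, t: List[float]) -> Matrix:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Hypothetical mounting: camera looks straight down, 5 cm forward of and
# 3 cm below the center of mass.
R_cb = [[0.0, -1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, -1.0]]
t_cb = [0.05, 0.0, -0.03]

# Hypothetical PnP output: marker 1.2 m along the optical axis, slightly off-center.
R_mc = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
t_mc = [0.10, -0.05, 1.20]

# Marker origin (0, 0, 0, 1) mapped into the body frame.
T_mb = matmul(homogeneous(R_cb, t_cb), homogeneous(R_mc, t_mc))
p_b = matmul(T_mb, [[0.0], [0.0], [0.0], [1.0]])
X_mb, Y_mb, Z_mb = (p_b[i][0] for i in range(3))

h_desired = 1.0  # illustrative target altitude above the marker (m)
e_x, e_y, e_z = -X_mb, -Y_mb, -Z_mb - h_desired
print(f"marker in body frame: ({X_mb:.2f}, {Y_mb:.2f}, {Z_mb:.2f}) m")
print(f"errors: e_x={e_x:.2f}, e_y={e_y:.2f}, e_z={e_z:.2f} m")
```

Because the marker origin is the zero vector, the result reduces to \( R_c^b t_m^c + t_c^b \); the marker rotation \( R_m^c \) only matters when other marker-frame points (or the marker heading) are needed.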
3.4 PID Control Law
We employ a cascade PID controller for horizontal and vertical channels. The horizontal controller outputs desired pitch and roll angles, while the vertical controller outputs desired thrust. The control law for each axis is:
$$
u(t) = K_p e(t) + K_i \int_0^t e(\tau) d\tau + K_d \frac{d e(t)}{dt}
$$
where \( e(t) \) is the error signal. For the horizontal channel, we use a position-based PD controller to avoid integral windup:
$$
P_{cmd} = K_p \Delta P + K_d \Delta V
$$
where \( \Delta P \) is the horizontal position error (in meters) and \( \Delta V \) is the relative velocity between the UAV and the marker. The gains were tuned experimentally; Table 2 lists the final PID gains.
| Channel | \( K_p \) | \( K_i \) | \( K_d \) |
|---|---|---|---|
| Horizontal (X) | 0.8 | 0.1 | 0.05 |
| Horizontal (Y) | 0.8 | 0.1 | 0.05 |
| Vertical (Z) | 1.2 | 0.15 | 0.08 |
These gains were obtained after iterative flight tests and provide stable tracking with minimal overshoot.
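A discrete-time version of the two control laws can be sketched as follows. This is a minimal illustration, not the flight code: the integral clamp (`integral_limit`), the 20 Hz time step, and the class and function names are assumptions, while the gains are taken from Table 2.

```python
class PID:
    """Discrete PID controller with a clamped integral term."""

    def __init__(self, kp: float, ki: float, kd: float, integral_limit: float = 1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral_limit = integral_limit  # hypothetical anti-windup bound
        self.integral = 0.0
        self.prev_error = None

    def update(self, error: float, dt: float) -> float:
        """Return u = Kp*e + Ki*integral(e) + Kd*de/dt for one time step."""
        self.integral += error * dt
        # Clamp the accumulated integral so it cannot grow without bound.
        self.integral = max(-self.integral_limit,
                            min(self.integral_limit, self.integral))
        # No derivative on the first call because there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def pd_command(kp: float, kd: float, delta_p: float, delta_v: float) -> float:
    """Horizontal PD law: P_cmd = Kp * delta_P + Kd * delta_V."""
    return kp * delta_p + kd * delta_v


vertical = PID(kp=1.2, ki=0.15, kd=0.08)   # vertical gains from Table 2
dt = 0.05                                  # assumed 20 Hz control loop
u1 = vertical.update(error=0.5, dt=dt)
u2 = vertical.update(error=0.4, dt=dt)
p_cmd = pd_command(kp=0.8, kd=0.05, delta_p=0.5, delta_v=-0.2)
print(u1, u2, p_cmd)
```

Clamping the integral is one common way to limit windup; dropping the integral term entirely, as the PD law does, is another.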
4. Experimental Results
4.1 Flight Tests
We conducted 10 autonomous landing trials on a flat indoor surface with the AR marker placed at a fixed location. The UAV took off to an altitude of 1.2 m and then started visual recognition. Once the marker was detected (typically within 1–2 s), the system switched to autonomous mode, and the UAV tracked the marker while descending gradually. Figure 1 (not shown) plots the horizontal distance and altitude over time for a representative flight. The horizontal error converged to within 0.1 m within 5 s of detection, and the UAV landed safely on the marker at approximately 30 s.
4.2 Landing Precision
The horizontal landing error (distance between marker center and UAV center after landing) was measured for each trial. The results are summarized in Table 3.
| Trial | Horizontal Error (m) |
|---|---|
| 1 | 0.15 |
| 2 | 0.09 |
| 3 | 0.03 |
| 4 | 0.02 |
| 5 | 0.10 |
| 6 | 0.08 |
| 7 | 0.07 |
| 8 | 0.04 |
| 9 | 0.01 |
| 10 | 0.00 |
| Mean | 0.053 |
The mean horizontal error is 5.3 cm. Counting a landing within 10 cm of the marker center as a success, 9 of the 10 trials succeeded (90%); only trial 1 (0.15 m) fell outside this radius. The main sources of error are camera calibration inaccuracies, quantization noise in corner detection, and delays in the control loop.
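The summary statistics can be recomputed directly from the per-trial values in Table 3, as in the short sketch below (variable names are illustrative). Recomputing from the ten tabulated values gives a mean of 0.059 m, slightly above the reported 0.053 m, so one of the entries or the reported mean may contain a transcription error; the 90% success rate is reproduced exactly.

```python
# Per-trial horizontal landing errors from Table 3 (metres).
errors_m = [0.15, 0.09, 0.03, 0.02, 0.10, 0.08, 0.07, 0.04, 0.01, 0.00]
SUCCESS_RADIUS_M = 0.10  # success criterion from the text: within 10 cm

mean_error = sum(errors_m) / len(errors_m)
# A trial counts as a success if it lands on or inside the 10 cm radius.
success_rate = sum(e <= SUCCESS_RADIUS_M for e in errors_m) / len(errors_m)

print(f"mean = {mean_error:.3f} m, success rate = {success_rate:.0%}")
# prints: mean = 0.059 m, success rate = 90%
```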
4.3 Robustness Analysis
We also tested the system under varying lighting conditions and marker angles. The detection success rate remained above 95% when the marker was within 30° of the camera’s optical axis. In low-light conditions (illuminance below 50 lux), the detection rate dropped to 85%, which was still acceptable for landing tasks. The system was also tested with moving markers, simulating a landing platform on a mobile robot. The UAV tracked a marker moving at 0.2 m/s with a tracking error below 0.15 m.
5. Conclusion
In this paper, we have presented the design and implementation of an autonomous UAV landing system based on AR code recognition. The system integrates a Raspberry Pi, monocular camera, IMU, and ultrasonic sensor to achieve reliable pose estimation and control. Our experiments show a mean landing error of 0.053 m and a marker recognition success rate of 98.5% in static environments. The algorithm flow, comprising camera calibration, AR code detection, coordinate transformation, and PID control, has been validated in flight. This work provides a practical solution for UAV landing in GPS-denied environments and also serves as an educational platform for teaching robotics and computer vision. The experimental design has been integrated into our undergraduate courses, giving students hands-on experience with sensor fusion, control theory, and embedded systems. Future work will focus on improving robustness under extreme lighting and in dynamic scenarios, and on extending the system to multi-UAV collaboration.
