We present a study of photovoltaic (PV) module defect recognition based on drone-driven intelligent zonal inspection of large-scale solar farms. Our approach integrates high-resolution imaging from an unmanned aerial vehicle (UAV), convolutional neural network (CNN) feature extraction, and graph neural network (GNN)-based relational reasoning. The goal is robust and accurate defect identification under challenging outdoor conditions such as varying illumination, vegetation occlusion, and dust accumulation. In experiments on a 100 MWp mountainous PV farm, the proposed method reaches a mean Q correlation coefficient of 0.972 and a mean F1 score of 0.953, compared with 0.764 and 0.721 for a meta-learning baseline and 0.688 and 0.645 for a lightweight YOLOv5 baseline.

Introduction
The global energy transition has accelerated the deployment of photovoltaic systems, making them a cornerstone of renewable energy generation. Large-scale PV farms, often rated at hundreds of megawatts, operate under harsh environmental conditions. PV modules are exposed to continuous thermal stress, mechanical loads, and chemical degradation, leading to defects such as hot spots, micro-cracks, glass breakage, and diode failures. These defects not only reduce power output but also pose fire hazards. Traditional manual inspection, in which technicians check modules visually, is time-consuming, labor-intensive, and prone to human error. To address these limitations, drone technology has emerged as a transformative tool for PV farm inspection. Drones equipped with high-definition visible-light and infrared thermal cameras can rapidly cover vast areas and capture detailed imagery. However, simply deploying drones is insufficient: inspection also requires intelligent flight planning, image acquisition under variable conditions, and automated defect detection from massive datasets.
Existing studies have attempted to apply machine learning to defect detection in electrical equipment. For instance, some works use meta-learning with metric learning for small-sample defect detection in substation equipment, but such methods may be sensitive to feature metrics and fail under noise. Others employ lightweight YOLO-based networks for external defects in electrical equipment, but they are vulnerable to outdoor interference, resulting in low accuracy under realistic field conditions. In contrast, our approach leverages the synergy between drone technology, deep feature extraction, and graph-based relational modeling to overcome these challenges. We systematically divide the PV farm into inspection zones, collect multimodal images (visible and thermal) with optimized flight parameters, and then apply a CNN to extract hierarchical features. These features are subsequently embedded into a graph structure where nodes represent individual modules and edges capture spatial or electrical connections. A GNN then performs node classification to identify defective modules. This framework not only identifies individual anomalies but also leverages contextual relationships among modules, improving detection robustness.
Methodology
Drone Technology for Zonal Data Acquisition
To enable efficient and precise inspection using drone technology, we design a systematic image acquisition procedure. First, the PV farm is partitioned into multiple contiguous sub-zones based on actual layout, module type, and terrain. For each sub-zone, an optimal terrain-following flight path is computed to maintain a constant distance between the drone and the module surface. During flight, the UAV carries both a high-resolution visible-light camera (20 megapixels) and an infrared thermal camera (640×512 resolution, thermal sensitivity ≤0.05°C). The cameras operate with pre-set parameters: forward overlap 80%, side overlap 70%, flight height 10–15 m, ground speed 4 m/s, and camera pitch angle approximately -90° to minimize perspective distortion. The imaging principle is modeled as:
$$ I(x,y) = \beta \cdot K \cdot \iint \left[ \gamma \cdot f(x', y') \right] \cdot e^{i\theta} \, dx' \, dy' $$
where \( (x,y) \) represents the image coordinates, \( (x',y') \) the actual object coordinates, \( f(x',y') \) the object intensity distribution, \( \beta \) an imaging coefficient, \( \gamma \) the resolution, \( \theta \) the imaging angle between drone and target, and \( K \) the scaling factor between image and real area. The flight altitude \( H \) does not appear explicitly; it enters only through quantities such as \( K \) that depend on the imaging geometry. This equation captures the geometric transformation from the three-dimensional scene to the two-dimensional image plane under drone-based oblique or nadir viewing.
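The flight parameters listed above determine how far apart exposures and flight lines must be. The following is a minimal sketch of that arithmetic, assuming a nadir-pointing pinhole camera over locally flat ground. The field-of-view values and the function names `footprint_m` and `plan_spacing` are illustrative assumptions and are not taken from the camera specification or from our flight software.

```python
import math

def footprint_m(height_m: float, fov_deg: float) -> float:
    """Ground length covered along one image axis by a nadir-pointing camera."""
    return 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)

def plan_spacing(height_m, fov_along_deg, fov_across_deg,
                 forward_overlap, side_overlap, speed_mps, pixels_across):
    """Derive exposure and flight-line spacing from height, overlap, and speed."""
    along = footprint_m(height_m, fov_along_deg)    # footprint in flight direction
    across = footprint_m(height_m, fov_across_deg)  # footprint across the track
    trigger_distance = along * (1.0 - forward_overlap)
    return {
        "footprint_along_m": along,
        "footprint_across_m": across,
        "trigger_distance_m": trigger_distance,
        "trigger_interval_s": trigger_distance / speed_mps,
        "line_spacing_m": across * (1.0 - side_overlap),
        "gsd_m_per_px": across / pixels_across,  # ground sampling distance
    }

# Height, overlaps, and speed follow the text; the FOV angles are assumed,
# as is the choice of the 640-pixel thermal axis as the across-track axis.
plan = plan_spacing(height_m=12.0, fov_along_deg=40.0, fov_across_deg=50.0,
                    forward_overlap=0.80, side_overlap=0.70,
                    speed_mps=4.0, pixels_across=640)
for name, value in plan.items():
    print(f"{name}: {value:.3f}")
```

With terrain following, `height_m` is the distance to the module surface rather than the altitude above the take-off point, which is why a constant value can be used per sub-zone.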
Feature Extraction via Convolutional Neural Network
After obtaining multi-modal images, we apply a convolutional neural network (CNN) to extract feature maps that capture spatial structures and texture details of PV modules. The CNN simulates biological visual mechanisms through hierarchical layers. Initial convolutional layers detect low-level features such as edges, corners, and color blobs. Intermediate layers combine these to form local patterns like grid lines and cell boundaries. Deeper layers aggregate global context to characterize complex defect morphologies (e.g., hot spot shapes, crack orientations). Pooling operations interspersed between convolutions reduce dimensionality and enhance translation invariance. Specifically, the output of the \(i\)-th convolution layer is:
$$ S_i(a,b) = \Gamma \left[ \sum_{m=1}^{M} \sum_{n=1}^{N} Q_i(a+m, b+n) \cdot J_i(m,n) + \xi_i \right] $$
where \( Q_i(a,b) \) is the input (image at scale \(i\)), \( J_i(m,n) \) the convolution kernel, \( \Gamma \) the activation function (ReLU), \( \xi_i \) the bias term, and \( M \times N \) the kernel size. This operation is repeated across multiple layers to produce a feature vector for each module region.
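To make the layer equation concrete, the sketch below evaluates it directly in NumPy for a single-channel input, followed by a non-overlapping max pooling step. It is an illustration of the formula only, not our trained network: the function names `conv_layer` and `max_pool` and the toy input are assumptions, and the kernel offsets are 0-based in code whereas the equation counts from 1.

```python
import numpy as np

def conv_layer(Q, J, xi):
    """S(a,b) = ReLU(sum_{m,n} Q(a+m, b+n) * J(m,n) + xi) over valid positions."""
    M, N = J.shape
    rows, cols = Q.shape
    S = np.empty((rows - M + 1, cols - N + 1))
    for a in range(S.shape[0]):
        for b in range(S.shape[1]):
            # Element-wise product of the kernel with the patch at (a, b).
            S[a, b] = np.sum(Q[a:a + M, b:b + N] * J) + xi
    return np.maximum(S, 0.0)  # ReLU activation (Gamma in the equation)

def max_pool(S, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    r, c = S.shape[0] // size, S.shape[1] // size
    return S[:r * size, :c * size].reshape(r, size, c, size).max(axis=(1, 3))

# Toy input: dark left half, bright right half (a vertical edge).
Q = np.zeros((4, 4))
Q[:, 2:] = 1.0
J = np.array([[-1.0, 1.0]])  # horizontal-gradient kernel, M=1, N=2

S = conv_layer(Q, J, xi=0.0)
print(S)            # responds only at the edge column
print(max_pool(S))  # pooled response
```

The explicit loops mirror the double sum in the equation; a practical implementation would rely on a deep learning framework's vectorized convolution instead.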
Graph Neural Network for Defect Recognition
Once CNN features are extracted for each PV module in the farm, we construct a graph structure using graph neural network (GNN) principles. The entire PV farm is represented as an undirected weighted graph \( G = (R, T, U) \), where \( R \) is the set of nodes (each node corresponds to a module or a detection segment), \( T \) is the set of edges defined by physical adjacency or electrical series connection, and \( U \) is the adjacency matrix storing connection strengths. Node feature matrix \( C \) is formed by stacking CNN feature vectors for each module. Then a fully connected layer maps node features to a class space:
$$ \psi = \text{ReLU}(W \cdot C + E) $$
where \( W \) is the weight matrix and \( E \) the bias. Finally, a Softmax function computes the probability distribution over defect categories:
$$ P_{IO} = \frac{\exp(\psi_{IO})}{\sum_{k=1}^{K} \exp(\psi_{Ik})} $$
Here, \( P_{IO} \) denotes the probability that node \( I \) belongs to defect class \( O \), and \( K \) in the sum is the number of defect categories (not the scaling factor \( K \) of the imaging model). The class with maximum probability is assigned as the recognition result for that module. This graph-based approach exploits the relational context among modules: a thermal anomaly in one module may, for example, correlate with similar issues in adjacent modules due to shared electrical or environmental factors. Using this context improves detection consistency.
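The classification step can be sketched as follows. The code applies the fully connected mapping and Softmax exactly as written above, with node features stored row-wise so that \( W \cdot C \) becomes `C @ W`. The optional neighbor aggregation with a symmetrically normalized adjacency matrix is an assumption borrowed from standard graph convolution practice, since the equations above do not state how \( U \) enters the computation. All function names are illustrative.

```python
import numpy as np

def normalize_adjacency(U):
    """D^{-1/2} (U + I) D^{-1/2}: an assumed, commonly used normalization."""
    A = U + np.eye(U.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def classify_nodes(C, W, E, U=None):
    """C: (nodes, features), W: (features, classes), E: (classes,).

    Returns per-node class probabilities and the arg-max class index.
    If U is given, features are first averaged over graph neighbors.
    """
    H = C if U is None else normalize_adjacency(U) @ C
    psi = np.maximum(H @ W + E, 0.0)              # psi = ReLU(W C + E)
    z = psi - psi.max(axis=1, keepdims=True)      # shift for numerical stability
    P = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)  # Softmax over classes
    return P, P.argmax(axis=1)

# Toy example: 5 modules in a chain, 4 features, 3 defect classes.
rng = np.random.default_rng(0)
C = rng.normal(size=(5, 4))
W = rng.normal(size=(4, 3))
E = np.zeros(3)
U = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)  # chain adjacency

P, labels = classify_nodes(C, W, E, U)
print(P.round(3))
print(labels)
```

Passing `U=None` reproduces the per-node mapping without relational context, which is a convenient way to compare predictions with and without neighbor information.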
Experimental Design
We validate the proposed method on a large mountainous PV farm in southwestern China with a total installed capacity of 100 MWp and approximately 400,000 monocrystalline silicon modules spread over 2.5 km². The terrain is undulating with significant slope variations and shading issues. Modules are arranged east–west in rows and north–south in columns. The farm is divided into 15 inspection sub-zones, each containing 80–120 strings with 22 modules per string. The UAV platform is a DJI Matrice 30T carrying the visible-light and thermal cameras described above. Flight parameters are strictly controlled to ensure image quality. To establish ground truth, we artificially introduce and record typical defects, including hot spots (temperature rise >10°C), micro-cracks (length >15 cm), glass breakage (area >5 cm²), dust coverage (>30%), and vegetation occlusion. All defect locations and categories are precisely documented.
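For a layout of this kind, the weighted adjacency matrix \( U \) of the module graph could be assembled as in the sketch below. It assumes, purely for illustration, that modules sit on a regular row and column grid and that each string occupies 22 consecutive modules along a row. The edge weights and the function name `build_adjacency` are hypothetical and do not describe the actual wiring of the farm.

```python
import numpy as np

def build_adjacency(n_rows, n_cols, string_len=22,
                    w_adjacent=1.0, w_series=1.0):
    """Undirected weighted adjacency for modules on an n_rows x n_cols grid.

    Physically adjacent modules get weight w_adjacent; row neighbors that
    also belong to the same string get an extra w_series on top.
    """
    n = n_rows * n_cols
    U = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if r + 1 < n_rows:                      # neighbor in the next row
                j = (r + 1) * n_cols + c
                U[i, j] = U[j, i] = w_adjacent
            if c + 1 < n_cols:                      # neighbor in the same row
                j = r * n_cols + (c + 1)
                same_string = (c // string_len) == ((c + 1) // string_len)
                w = w_adjacent + (w_series if same_string else 0.0)
                U[i, j] = U[j, i] = w
    return U

# Two rows of 44 modules, i.e. two 22-module strings per row.
U = build_adjacency(n_rows=2, n_cols=44)
print(U.shape, np.count_nonzero(U) // 2, "edges")
```

A dense matrix is used here only for readability; at the scale of a full sub-zone a sparse representation would be the more practical choice.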
We compare our method against two baseline approaches: a meta-learning-based defect detection method (referred to as Baseline A) and a lightweight YOLOv5-based external defect detection method (Baseline B). Both are tested under identical conditions. Performance is quantified using the Q correlation coefficient and the F1 score.
Q Correlation Coefficient Evaluation
The Q correlation coefficient measures the linear correlation between true features \( X \) and extracted features \( Y \); as defined here, it coincides with the Pearson correlation coefficient. It ranges from -1 to 1. We treat values above 0.8 as strong correlation, values below 0.3 as weak correlation, and values in between as moderate correlation. The formula is:
$$ Q = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}[X] \cdot \text{Var}[Y]}} $$
We compute Q for each method over 20 experimental trials. The extracted features are the CNN feature vectors before GNN processing. Table 1 summarizes the average Q values along with standard deviations.
| Method | Mean Q | Std Dev |
|---|---|---|
| Proposed Method | 0.972 | 0.018 |
| Baseline A | 0.764 | 0.122 |
| Baseline B | 0.688 | 0.195 |
As shown, our method achieves a mean Q of 0.972 with a standard deviation of 0.018, well above Baseline A (0.764) and Baseline B (0.688), whose values also vary more across trials. This indicates that the CNN features extracted from imagery acquired with our zonal flight procedure capture the intrinsic characteristics of PV modules with little noise contamination.
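The Q coefficient defined above can be computed with a few lines of NumPy. The sketch below implements the covariance-over-variances formula on flattened feature vectors and then summarizes 20 synthetic trials by mean and sample standard deviation. The synthetic data and the function name `q_coefficient` are assumptions for illustration; they do not reproduce the values in Table 1.

```python
import numpy as np

def q_coefficient(x, y):
    """Q = Cov(X, Y) / sqrt(Var[X] * Var[Y]) on flattened feature vectors."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    xc = x - x.mean()
    yc = y - y.mean()
    # The 1/n factors of covariance and variances cancel in the ratio.
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Synthetic stand-in for 20 trials: "extracted" features are the
# "true" features plus noise.
rng = np.random.default_rng(0)
q_values = []
for _ in range(20):
    x_true = rng.normal(size=256)
    y_extracted = x_true + 0.2 * rng.normal(size=256)
    q_values.append(q_coefficient(x_true, y_extracted))

print("mean Q:", np.mean(q_values))
print("std dev:", np.std(q_values, ddof=1))  # sample standard deviation
```

The result matches `np.corrcoef(x, y)[0, 1]`, which can serve as an independent check.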
F1 Score Consistency Analysis
F1 score, the harmonic mean of precision and recall, evaluates the consistency between defect recognition results and ground truth. We compute F1 scores for each method over 20 independent runs. Table 2 presents the mean F1 scores and standard deviations.
| Method | Mean F1 | Std Dev |
|---|---|---|
| Proposed Method | 0.953 | 0.025 |
| Baseline A | 0.721 | 0.098 |
| Baseline B | 0.645 | 0.132 |
The proposed method maintains a high mean F1 score (0.953) with low variability (standard deviation 0.025), indicating stable and accurate defect identification. Baseline methods show lower average performance and wider fluctuations, particularly in challenging sub-zones with heavy vegetation or steep slopes. This suggests that our integration of drone technology with the CNN-GNN architecture is resilient to real-world interference.
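As a worked illustration of the metric, the sketch below computes per-class F1 from predicted and true labels using the equivalent form \( F_1 = 2TP / (2TP + FP + FN) \), and then averages over classes. Averaging with equal class weights (macro averaging) is an assumption here, as are the toy labels and the function name `f1_per_class`.

```python
import numpy as np

def f1_per_class(y_true, y_pred, n_classes):
    """Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 if a class never occurs."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = np.zeros(n_classes)
    for k in range(n_classes):
        tp = np.sum((y_pred == k) & (y_true == k))
        fp = np.sum((y_pred == k) & (y_true != k))
        fn = np.sum((y_pred != k) & (y_true == k))
        denom = 2 * tp + fp + fn
        scores[k] = 2 * tp / denom if denom > 0 else 0.0
    return scores

# Toy labels for six modules and three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

scores = f1_per_class(y_true, y_pred, n_classes=3)
macro_f1 = scores.mean()
print(scores)    # class 0: 1.0, class 1: 2/3, class 2: 0.8
print(macro_f1)
```

In this toy case one class-1 module is misread as class 2, which lowers recall for class 1 and precision for class 2 while leaving class 0 untouched.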
Results and Discussion
Our experimental results show that the proposed drone-driven inspection framework improves defect recognition over both baselines. The mean Q correlation coefficient of 0.972 indicates that the extracted features are highly representative of actual module conditions. This is critical because feature quality directly impacts subsequent classification. Furthermore, because F1 is the harmonic mean of precision and recall, the mean F1 score of 0.953 implies that both are high, so missed detections and false alarms are both rare. The low standard deviations across trials underline the repeatability and robustness of our method.
We attribute these improvements to three key factors. First, the intelligent zonal inspection strategy using drone technology ensures comprehensive coverage with optimal viewing angles and consistent resolution. Second, the CNN architecture is designed to extract multi-scale features that capture subtle defect signatures even in the presence of background noise (e.g., dust, shading). Third, the GNN leverages spatial and electrical relationships among modules, which helps to disambiguate genuine defects from artifacts. For instance, a single module showing a hot spot-like thermal pattern might be a false positive if considered in isolation, but when neighboring modules also exhibit elevated temperatures, the pattern becomes plausible. The GNN implicitly learns such contextual cues.
We also conducted ablation studies to isolate the contributions of drone technology versus the CNN and GNN components. Without drone-based adaptive flight (e.g., using fixed altitude), the F1 score drops by approximately 12%. Removing the GNN and using only CNN features with a simple classifier reduces F1 by 8%. These results confirm that each element plays a vital role.
Conclusion
In this work, we present a robust and practical solution for PV module defect recognition using advanced drone technology with intelligent zonal inspection. By systematically partitioning the farm, acquiring high-quality multimodal imagery, extracting deep features via CNN, and performing relational reasoning via GNN, we achieve superior defect detection performance. Our method maintains a Q correlation coefficient close to 1.0 and an F1 score above 0.95 under real-world mountainous terrain conditions. The proposed framework significantly enhances the automation and reliability of PV farm maintenance, reducing manual labor and operational costs while improving safety and energy yield. Future work will explore real-time edge processing on the UAV and integration with predictive maintenance systems to further advance the role of drone technology in sustainable energy infrastructure.
We believe that the synergy between drone technology and graph neural networks opens new avenues for large-scale asset inspection. The methodology can be extended to other types of renewable energy installations such as wind turbines and concentrated solar power plants. Our study underscores the importance of combining high-quality data acquisition with intelligent feature extraction and contextual reasoning to address the challenges of complex outdoor environments.
