Urban Land Cover Classification Using High-Resolution UAV Imagery

In modern urban development, accurate land cover classification plays a pivotal role in sustainable city management. As a researcher focused on geospatial analysis, I recognize that traditional methods often fall short of capturing the fine-grained detail required for precise urban planning. With the advent of high-resolution UAV imagery, we now have an unprecedented opportunity to improve classification accuracy and enable refined environmental monitoring. This study addresses these challenges by constructing a dedicated dataset and evaluating advanced deep learning models for urban land cover classification. UAV technology not only improves data acquisition efficiency but also provides rich, timely insight into urban dynamics, which is crucial for addressing issues such as urban sprawl, heat island effects, and biodiversity loss.

The significance of this work lies in its potential to transform how cities are managed. By leveraging UAVs, we can systematically analyze features such as buildings, vegetation, and impervious surfaces, informing policy decisions on infrastructure, conservation, and climate resilience. UAVs offer distinct advantages over satellite or manned aerial systems, including lower cost, higher flexibility, and the ability to operate in complex environments. In this paper, I present a comprehensive approach to urban land cover classification, from UAV data collection to model validation, with a focus on creating a high-quality dataset that outperforms existing benchmarks. The following sections detail the study area, UAV data acquisition, preprocessing, classification methodology, experimental results, and conclusions.

Study Area Description

The study area is located in a rapidly urbanizing region characterized by a mix of residential, commercial, and industrial zones. The area exhibits typical urban features, with well-defined functional districts and recent infrastructural developments, such as new transportation networks and public facilities. However, it also faces ecological challenges, including drought and land desertification, which necessitate continuous monitoring of land cover change. Using UAVs, we captured high-resolution imagery to analyze surface cover types, which can aid in mitigating environmental degradation and promoting sustainable development. The region’s diversity in land use makes it an ideal testbed for evaluating classification methods, as it includes dense built-up areas, green spaces, water bodies, and bare land.

From my perspective, selecting this area was strategic due to its representative urban landscape and environmental pressures. The UAV campaign was scheduled for summer, when vegetation is lush and features are most distinct, with flights at noon to minimize shadows. This careful planning ensured that the imagery would be suitable for detailed annotation and model training, highlighting the practicality of UAVs in urban studies.

Data Acquisition with UAVs

Data acquisition was carried out with the CW-20 hybrid electric vertical take-off and landing (VTOL) UAV platform equipped with a Carl Zeiss LX 2/35 lens. This platform was chosen for its reliability, efficiency, and precision in small- to medium-scale geographic information collection. The UAV operates in both VTOL and fixed-wing modes, reducing dependency on specific landing sites and airspace conditions, a key advantage in urban environments.

The flight mission was conducted in summer to capture rich land cover detail, with operations scheduled at midday to minimize shadow interference from tall structures. After setting up the ground station and assembling the UAV, we used CW Commander software for route planning. The photogrammetry parameters are summarized in the table below.

| Parameter | Value |
| --- | --- |
| Camera model | N7-RII |
| Flight altitude | 900 m |
| Forward overlap | 75% |
| Side overlap | 75% |
| Spatial resolution | 7.8 cm |
| Spectral bands | R-G-B |

Post-flight, we downloaded POS data, differential signals, and base station data for processing with JoPPS software to achieve centimeter-level positional accuracy. The flight captured 1,876 raw RGB images, each 7,952 × 5,304 pixels, meeting the overlap requirements. This underscores the capability of UAVs to generate extensive, high-resolution datasets for urban analysis.

Data Preprocessing Workflow

The raw UAV imagery contained significant overlap, leading to data redundancy that required preprocessing before annotation. We employed image stitching algorithms to create orthomosaics, which were then cropped into 1,024 × 1,024 pixel tiles to meet neural network input requirements. Edge regions with minimal features were excluded, yielding 480 high-quality samples for the dataset. The preprocessing workflow is: raw image acquisition → POS data processing → orthomosaic generation → tile segmentation → sample selection. The efficiency of UAV data collection provided a large initial pool, but careful curation was necessary to ensure dataset relevance.

To formalize the preprocessing, let $I_{raw}$ denote the raw image set and $I_{ortho}$ the orthomosaic. Tiling divides $I_{ortho}$ into $n$ tiles $T_i$ of size $1024 \times 1024$, $i = 1, 2, \dots, n$. Tiles with low feature diversity are filtered out, leaving a final set $S$ of annotated samples. This ensures the dataset is optimized for training deep learning models while retaining the high resolution the UAV provides.
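The tiling and filtering step can be sketched as follows. The paper does not specify how "low feature diversity" is measured, so the standard-deviation threshold `min_std` is an assumption for illustration:

```python
import numpy as np

def tile_image(ortho, tile=1024, min_std=5.0):
    """Split an orthomosaic into non-overlapping square tiles and drop
    low-diversity tiles (e.g. featureless edge regions)."""
    h, w = ortho.shape[:2]
    kept = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            t = ortho[y:y + tile, x:x + tile]
            # a low standard deviation suggests a near-uniform, featureless tile
            if t.std() >= min_std:
                kept.append(t)
    return kept
```

Incomplete tiles at the right and bottom edges are simply discarded, consistent with excluding edge regions during sample selection.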

Land Cover Classification Methodology

Classification Principles

Based on the UAV imagery, we defined six land cover categories: buildings, impervious surfaces, vegetation, bare land, water, and cars. Buildings appear as regular geometric shapes with distinct contours; vegetation shows deep green tones; impervious surfaces include roads and hardened areas; water bodies are bluish; bare land is yellowish; and cars are small, movable objects. The high resolution allows precise discrimination of these classes, which is essential for accurate urban mapping. In my view, this scheme aligns with practical urban management needs, such as monitoring impervious surfaces for flood risk or tracking vegetation for green space planning.

Annotation Principles

Annotation was performed at the pixel level under a consistent standard to ensure quality. Unlike existing datasets, our focus on UAV imagery required handling complex urban features with high precision. We converted annotations to a compatible format using json_to_dataset tools, but the manual process was time-consuming given the level of detail in the imagery. This highlights a challenge of UAV data: it offers rich information, but the annotation effort is intensive. Future work could explore semi-automated annotation methods that exploit UAVs' real-time capabilities.
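Conceptually, the conversion turns polygon annotations into per-pixel label masks. A minimal dependency-free sketch is shown below; the class-to-index mapping and the tuple-based annotation format are illustrative assumptions, not the actual TMSK annotation schema:

```python
# Illustrative class-index mapping; the actual label palette used for
# the dataset is defined by the annotation standard, not shown here.
CLASS_IDS = {"building": 1, "impervious": 2, "vegetation": 3,
             "bare_land": 4, "water": 5, "car": 6}

def point_in_polygon(x, y, pts):
    """Even-odd rule test of point (x, y) against polygon vertices pts."""
    inside = False
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        if (y1 <= y) != (y2 <= y):  # edge crosses the horizontal ray
            x_cross = x1 + (x2 - x1) * (y - y1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def polygons_to_mask(shapes, width, height):
    """Rasterize polygon annotations (class_name, [(x, y), ...]) into a
    2-D list of class indices; 0 marks unlabeled background."""
    mask = [[0] * width for _ in range(height)]
    for name, pts in shapes:
        cid = CLASS_IDS[name]
        for y in range(height):
            for x in range(width):
                if point_in_polygon(x, y, pts):
                    mask[y][x] = cid
    return mask
```

Production pipelines would use an optimized rasterizer rather than this per-pixel loop, but the output format, one class index per pixel, is the same as what the network consumes.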

Dataset Division

The 480 samples were split into training and validation sets in a 5:1 ratio, with 400 for training and 80 for validation. The dataset structure is shown below:

| Category | Path | Number of Images |
| --- | --- | --- |
| Training images | TMSK/train/images | 400 |
| Training labels | TMSK/train/label | 400 |
| Validation images | TMSK/validation/images | 80 |
| Validation labels | TMSK/validation/label | 80 |

A statistical analysis revealed an imbalanced class distribution: vegetation covers 39.10% of pixels, impervious surfaces 30.96%, buildings 16.63%, bare land 11.29%, water 1.39%, and cars 0.63%. Such imbalance, common in UAV datasets, poses challenges for model training, particularly for minority classes like cars and water. We address it through loss function adjustments, as discussed later.
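The 5:1 split can be reproduced with a short sketch. The paper does not state whether assignment was random, so the shuffling and the fixed seed here are assumptions:

```python
import random

def split_dataset(sample_ids, train_parts=5, val_parts=1, seed=42):
    """Shuffle sample IDs and split them train:val (5:1 in this study)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for reproducibility
    n_train = len(ids) * train_parts // (train_parts + val_parts)
    return ids[:n_train], ids[n_train:]
```

Applied to 480 sample IDs this yields 400 training and 80 validation samples, matching the table above.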

Existing Urban Land Cover Datasets

To contextualize our work, we compared it with public datasets such as ISPRS Potsdam and Vaihingen, which offer aerial imagery at 0.05 m and 0.09 m resolution, respectively. These datasets cover diverse landscapes but do not fully capture the characteristics of UAV imagery. In my assessment, UAVs provide superior clarity and flexibility, making our dataset better suited to fine-grained urban analysis. The table below summarizes the key differences:

| Dataset | Resolution | Source | Classes |
| --- | --- | --- | --- |
| Potsdam | 0.05 m | Aerial | 6 |
| Vaihingen | 0.09 m | Aerial | 6 |
| Our TMSK | 0.078 m | UAV | 6 |

Experimental Application and Model Training

We employed a Deep-UNet encoder-decoder architecture for semantic segmentation, training it on our TMSK dataset and comparing against Potsdam and Vaihingen. The model was trained for 300 epochs with a batch size of 2, using an OHEM loss combined with a Warmup learning rate strategy to handle class imbalance. The learning rate schedule used a power of 0.9, 10 warmup epochs, and a warmup ratio of 0.1. This setup leverages the high-resolution UAV input to improve edge detection and small-object recognition.

The training process minimizes a loss $L$ that combines pixel-wise cross-entropy with OHEM. Let $y_{i,c}$ be the one-hot ground truth and $\hat{y}_{i,c}$ the predicted probability for pixel $i$ and class $c$; the per-sample cross-entropy is:

$$ L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \log(\hat{y}_{i,c}) $$

where $N$ is the number of pixels and $C$ the number of classes. OHEM restricts this average to the hardest examples, which is crucial for UAV imagery with diverse features. The Warmup strategy gradually increases the learning rate $lr$ from an initial value $lr_0$ over $t_w$ warmup epochs:

$$ lr = lr_0 \cdot \left( \frac{t}{t_w} \right)^{0.9} $$

for $t \leq t_w$, then follows a decay schedule. This enhances model stability during the early epochs of training on UAV data.
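The warmup schedule above and the OHEM selection can be sketched together. The base learning rate, total epoch count, and OHEM keep ratio are illustrative assumptions; the text only fixes the power (0.9) and warmup epochs (10):

```python
import numpy as np

def learning_rate(t, lr0=0.01, t_w=10, total=300, power=0.9):
    """Warmup followed by polynomial decay; lr0 and total are assumed."""
    if t <= t_w:
        return lr0 * (t / t_w) ** power        # warmup phase from the text
    remain = 1.0 - (t - t_w) / (total - t_w)
    return lr0 * remain ** power               # "poly" decay afterwards

def ohem_mean_loss(pixel_losses, keep_ratio=0.25):
    """Online Hard Example Mining: average only the hardest (highest-loss)
    fraction of pixels. keep_ratio is an assumed hyperparameter."""
    losses = np.sort(np.ravel(pixel_losses))[::-1]  # sort descending
    k = max(1, int(len(losses) * keep_ratio))
    return float(losses[:k].mean())
```

By averaging only the highest-loss pixels, OHEM prevents the abundant easy pixels (e.g. large vegetation regions) from drowning out the gradient signal of minority classes such as cars and water.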

Accuracy Evaluation Results

We evaluated performance using accuracy (ACC), mean Intersection over Union (MIoU), and F1-score. These metrics are calculated as follows:

$$ ACC = \frac{TP + TN}{TP + TN + FP + FN} $$

$$ MIoU = \frac{1}{C} \sum_{c=1}^{C} \frac{TP_c}{TP_c + FP_c + FN_c} $$

$$ F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} $$

where $Precision = \frac{TP}{TP + FP}$ and $Recall = \frac{TP}{TP + FN}$. Here, $TP$, $TN$, $FP$, and $FN$ denote true positives, true negatives, false positives, and false negatives, respectively, and $C$ is the number of classes.
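These three metrics follow directly from a per-class confusion matrix, as in this minimal sketch:

```python
import numpy as np

def segmentation_metrics(conf):
    """ACC, mean IoU, and mean F1 from a C x C confusion matrix
    (rows = ground truth, columns = prediction). Assumes every class
    appears at least once; absent classes would need guarding."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp   # predicted as class c but actually other
    fn = conf.sum(axis=1) - tp   # actually class c but predicted other
    acc = tp.sum() / conf.sum()  # overall pixel accuracy
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, iou.mean(), f1.mean()
```

For multi-class segmentation, overall pixel accuracy (correct pixels over total pixels) is the standard realization of the ACC formula above, while MIoU and F1 are averaged over the classes.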

The results on our TMSK dataset and on Potsdam are compared in the table below:

| Metric | TMSK Dataset (%) | Potsdam Dataset (%) |
| --- | --- | --- |
| ACC | 96.7251 | 96.6428 |
| F1-score | 95.3622 | 95.2735 |
| MIoU | 90.3612 | 90.2463 |

These outcomes indicate that our UAV-derived dataset achieves slightly higher accuracy, which we attribute to better image clarity and annotation quality. The improvement in MIoU suggests better segmentation of class boundaries, which is vital for urban applications such as zoning and environmental assessment. The Vaihingen dataset yielded lower performance, likely due to its limited sample size, underscoring the value of extensive UAV data collection.

Discussion and Implications

The success of this study hinges on the effective use of UAVs for high-resolution imaging. UAVs enable rapid, cost-effective data acquisition that adapts to urban complexity, such as narrow streets or dense vegetation. From my perspective, the TMSK dataset fills a gap in urban land cover classification by providing a resource tailored to UAV imagery, which can be expanded with additional classes or temporal data. Challenges remain, however, including class imbalance and the computational demands of processing UAV data. Future work could integrate multispectral sensors or employ transfer learning to improve minority class recognition.

Moreover, UAV applications extend beyond classification to real-time monitoring, such as tracking urban heat islands or assessing disaster damage. By combining UAVs with deep learning, cities can implement dynamic management systems that respond to changing conditions. I believe that as UAV technology evolves, with advances in autonomy and sensor integration, its role in urban studies will become even more pivotal.

Conclusion

In this research, I have presented a comprehensive method for urban land cover classification using high-resolution UAV imagery. The TMSK dataset, comprising 480 annotated samples across six classes, provides a valuable resource for the remote sensing community. Through experiments with Deep-UNet and comparisons against benchmark datasets, we demonstrated that UAV-based data achieves higher accuracy, F1-score, and MIoU than the benchmarks. This validates the practicality of UAVs for fine-grained urban analysis and supports their integration into sustainable city management.

Looking ahead, I envision further exploration of UAV capabilities, such as 3D reconstruction from UAV data or fusion of multi-temporal imagery for change detection. Continued improvement in UAV technology will drive innovation in urban land cover classification, making cities more resilient and livable. By embracing UAVs, researchers and planners can unlock new insights into urban ecosystems, contributing to smarter, more sustainable urban futures.
