High-Precision Mangrove Species Classification with DJI Drones

The sustainable management and conservation of mangrove ecosystems are of paramount global ecological importance. These vital coastal forests, serving as crucial buffers between terrestrial and marine environments, are characterized by a unique but often fragile biodiversity. Traditional field-based surveys for mapping mangrove species are notoriously challenging due to the difficult, muddy terrain, making them time-consuming and inefficient for large-scale monitoring. Consequently, remote sensing has emerged as the dominant tool for observing these ecosystems. However, a persistent challenge in achieving fine-scale, species-level classification is the spectral similarity between different mangrove taxa: many species exhibit nearly identical reflectance characteristics in common optical bands, leading to significant misclassification when spectral information is used alone. Structural information, particularly canopy height, has proven to be a critical discriminator. While airborne LiDAR is the traditional source of such 3D data, its high cost and operational complexity limit widespread use. This study explores and validates a novel, cost-effective alternative: using a consumer-grade DJI drone, the Phantom 4 Multispectral, to simultaneously acquire both high-resolution spectral data and, through photogrammetry, detailed structural information for superior mangrove species classification.

The core innovation of this approach lies in the integrated sensor system of the DJI Phantom 4 Multispectral drone. This platform is equipped with a six-camera array: one RGB sensor and five monochromatic sensors capturing precise spectral bands essential for vegetation analysis—Blue (450±16 nm), Green (560±16 nm), Red (650±16 nm), Red-Edge (730±16 nm), and Near-Infrared (840±26 nm). A built-in sunlight sensor captures irradiance data, enabling radiometric calibration during processing to normalize lighting conditions. Crucially, a single flight mission collects both the multispectral imagery and the high-overlap RGB imagery necessary for 3D reconstruction. This dual-data acquisition capability is the foundation of the proposed methodology. The primary hypothesis is that a Canopy Height Model (CHM) derived from DJI drone RGB imagery via Structure-from-Motion (SfM) photogrammetry can effectively substitute for LiDAR-derived CHM. When combined with the multispectral data, this hybrid feature set should significantly enhance the accuracy of machine learning algorithms in discriminating between spectrally similar mangrove species.
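For later reference, the five spectral bands above can be captured in a small configuration mapping. This is purely a convention of the Python sketches in this article, not a DJI interface:

```python
# Center wavelength and half-bandwidth (nm) of the P4 Multispectral bands,
# as listed above; a reference for the band order used in the feature stack.
P4M_BANDS = {
    "blue":     (450, 16),
    "green":    (560, 16),
    "red":      (650, 16),
    "red_edge": (730, 16),
    "nir":      (840, 26),
}
```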

The methodological workflow can be formalized into a series of sequential processing stages, denoted as functions of the raw data. Let the raw data acquired by the DJI drone be represented as $D_{raw} = \{I_{RGB}, I_{MS}\}$, where $I_{RGB}$ is the set of RGB images and $I_{MS}$ is the set of co-registered multispectral band images. The complete processing pipeline $F$ is described as follows:

$$F(D_{raw}) = C_{species} = ML_{\theta}( \Phi( P(I_{RGB}), R(I_{MS}) ) )$$

Where:

  • $P(\cdot)$ is the Photogrammetric Processing function generating the Digital Surface Model (DSM).
  • $R(\cdot)$ is the Radiometric Processing and Reconstruction function generating the orthorectified multispectral image stack.
  • $\Phi(\cdot)$ is the Feature Engineering function that generates the CHM from the DSM and combines it with spectral features.
  • $ML_{\theta}(\cdot)$ is the Machine Learning Classifier with parameters $\theta$ that produces the final species classification map $C_{species}$.

The first stage, $P(I_{RGB})$, involves SfM processing. The high-overlap RGB images are used to perform aerial triangulation, dense image matching, and 3D point cloud generation. The output is a high-resolution raster Digital Surface Model (DSM), representing the elevation of the topmost surface, including tree canopies and the ground. The function can be summarized as:

$$DSM = P(I_{RGB}) = \text{SfM\_Pipeline}(I_{RGB}, \text{Overlap} \geq 75\%)$$

Concurrently, the second stage, $R(I_{MS})$, processes the multispectral data. Using calibration panel data and the recorded solar irradiance, the radiance values for each single-band image are corrected to ground reflectance. These corrected bands are then mosaicked and orthorectified using the geometry from the SfM process, producing a seamless, multi-band orthomosaic $M_{MS}$:

$$M_{MS} = R(I_{MS}) = \text{Orthomosaic}( \text{Radiometric\_Calibrate}(I_{MS}) )$$
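In code, this one-point (empirical line) correction reduces to scaling each band so that the calibration panel's observed digital numbers map to its known reflectance. The sketch below is a simplified stand-alone version; in practice this step is usually handled by photogrammetry software, and the panel values shown are illustrative assumptions:

```python
import numpy as np

def to_reflectance(band_dn, panel_dn_mean, panel_reflectance):
    """One-point empirical calibration: scale raw digital numbers (DN)
    so the calibration panel's mean DN maps to its known reflectance."""
    gain = panel_reflectance / panel_dn_mean
    return np.clip(band_dn * gain, 0.0, 1.0)

# Example with a synthetic 16-bit band and a panel of known 50% reflectance.
band = np.random.default_rng(0).integers(0, 65535, size=(100, 100)).astype(float)
calibrated = to_reflectance(band, panel_dn_mean=32000.0, panel_reflectance=0.50)
```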

The critical feature engineering stage, $\Phi$, involves generating the Canopy Height Model (CHM). The CHM is defined as the height of objects above the ground. Therefore, a Digital Terrain Model (DTM) representing the bare earth elevation is required. We employ a simple yet effective ground filtering algorithm based on elevation thresholding on the DSM. The process is defined by:

$$DTM = \text{Interpolate}(\{p_i \in DSM \, | \, \text{elevation}(p_i) < T\})$$
$$CHM = DSM - DTM$$

Here, $T$ is an empirically determined threshold that segregates ground points from vegetation points. The selected ground points are then interpolated (e.g., using Inverse Distance Weighting or Kriging) to create a continuous DTM surface at the same resolution as the DSM. The final CHM is obtained via simple raster subtraction. The combined feature set for classification is then a stacked raster where each pixel contains its spectral values from $M_{MS}$ and its height value from $CHM$:

$$F_{stack} = \Phi(DSM, M_{MS}) = \text{Stack}([M_{MS}^{(B)}, M_{MS}^{(G)}, M_{MS}^{(R)}, M_{MS}^{(RE)}, M_{MS}^{(NIR)}, CHM])$$
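These operations translate directly into raster arithmetic. The following NumPy/SciPy sketch is a minimal stand-in for $\Phi$; the threshold, grid sizes, and the use of linear interpolation (in place of IDW or kriging) are simplifying assumptions:

```python
import numpy as np
from scipy.interpolate import griddata

def build_feature_stack(dsm, ms_bands, ground_threshold):
    """Derive CHM = DSM - DTM via elevation thresholding, then stack the
    five calibrated spectral bands with the CHM into one (H, W, 6) raster."""
    rows, cols = np.indices(dsm.shape)
    ground = dsm < ground_threshold                      # candidate ground pixels
    # Interpolate a continuous DTM from the ground pixels (linear here;
    # IDW or kriging, as named in the text, are drop-in alternatives).
    dtm = griddata((rows[ground], cols[ground]), dsm[ground],
                   (rows, cols), method="linear")
    dtm = np.nan_to_num(dtm, nan=float(dsm[ground].mean()))
    chm = np.clip(dsm - dtm, 0.0, None)                  # negative heights are noise
    return np.dstack(list(ms_bands) + [chm])

# Example with synthetic rasters: a flat tidal plain plus one ~6 m canopy patch.
dsm = np.full((60, 60), 0.5)
dsm[20:40, 20:40] += 6.0
bands = [np.random.rand(60, 60) for _ in range(5)]       # B, G, R, RE, NIR
stack = build_feature_stack(dsm, bands, ground_threshold=1.0)
print(stack.shape)                                       # (60, 60, 6)
```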

For the classification stage $ML_{\theta}$, we employ a Gradient Boosting Decision Tree (GBDT) algorithm and compare its performance against other established classifiers. GBDT is an ensemble learning method that builds a strong predictive model by sequentially adding weak learners (typically decision trees) that correct the errors of the previous ensemble. The model aims to minimize a loss function $L$ (e.g., log-loss for classification). Given a dataset with $m$ training samples $\{(x_1, y_1), \ldots, (x_m, y_m)\}$, where $x_i$ is the feature vector from $F_{stack}$ and $y_i$ is the species label, the GBDT model $f(x)$ is built iteratively:

1. Initialize the model with a constant value: $f_0(x) = \arg \min_c \sum_{i=1}^m L(y_i, c)$.
2. For iteration $t = 1$ to $T$:
a. Compute pseudo-residuals: $r_{ti} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x)=f_{t-1}(x)}$.
b. Fit a regression tree $h_t(x)$ to the targets $r_{ti}$.
c. For each leaf node $j$ in tree $h_t$, compute the output value: $\gamma_{tj} = \arg \min_{\gamma} \sum_{x_i \in R_{tj}} L(y_i, f_{t-1}(x_i) + \gamma)$.
d. Update the model: $f_t(x) = f_{t-1}(x) + \eta \sum_{j} \gamma_{tj}\, \mathbb{1}(x \in R_{tj})$, where $\eta$ is the learning rate and $\mathbb{1}(\cdot)$ indicates membership in leaf region $R_{tj}$.

The final model is the additive combination of the initial constant and all weak learners: $f_T(x) = f_0(x) + \eta \sum_{t=1}^{T} h_t(x)$, where each $h_t$ carries its fitted leaf values $\gamma_{tj}$. The GBDT algorithm’s ability to model complex, non-linear relationships and its robustness to heterogeneous feature types make it highly suitable for this task. Key parameters to optimize include the number of trees ($T$), the learning rate ($\eta$), and the tree depth.
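For orientation, here is a minimal training sketch using scikit-learn's GradientBoostingClassifier on synthetic stand-in data; the hyperparameter values are illustrative, not the tuned values from this study:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for pixels drawn from F_stack: six features each
# (B, G, R, RE, NIR reflectance + CHM height) and five class labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 6))
y = rng.integers(0, 5, size=1000)            # 4 species + bare mudflat

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gbdt = GradientBoostingClassifier(
    n_estimators=200,   # T, the number of boosting iterations
    learning_rate=0.1,  # eta
    max_depth=3,        # depth of each weak-learner tree
)
gbdt.fit(X_tr, y_tr)
print(f"OA on held-out pixels: {gbdt.score(X_te, y_te):.3f}")
```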

The study was conducted in a representative mangrove reserve area. The DJI drone was flown at an altitude of 100 meters, yielding a ground sampling distance of approximately 5 cm for the multispectral imagery. Flight planning ensured high overlap (75% frontal, 50% lateral) for robust 3D reconstruction. The area contained four dominant planted mangrove species: Sonneratia apetala, Rhizophora stylosa, Avicennia marina, and Bruguiera gymnorhiza, alongside bare mudflat areas. Field surveys provided ground truth data for training and validation. Based on the visual characteristics observed in the ultra-high-resolution DJI drone imagery, spectral-spatial profiles for each class were established.

Table 1: Representative Spectral-Spatial Characteristics from DJI Drone Imagery

| Land Cover Class | Spectral Appearance (Reflectance) | Spatial/Textural Appearance | Typical Height Range |
| --- | --- | --- | --- |
| Avicennia marina | Moderate NIR, higher Green reflectance | Dense, fine-textured canopy; extensive contiguous patches | Low to Medium |
| Rhizophora stylosa | Lower overall reflectance, deeper Green | Clumped growth form; rounded canopy clusters | Medium |
| Bruguiera gymnorhiza | Bright NIR, uneven Green reflectance | Irregular canopy texture; often mixed with R. stylosa | Medium |
| Sonneratia apetala | High NIR, high Red-Edge | Very rough, heterogeneous canopy texture; tall emergent crowns | High |
| Bare Mudflat | Low, flat reflectance across all bands | Very smooth, homogeneous texture | Ground level (CHM ≈ 0) |

Following the pipeline $F$, the DSM was generated from the RGB imagery. A threshold filter successfully isolated ground points in the relatively flat tidal zone, allowing for the creation of a reliable DTM and subsequently, a detailed CHM. The CHM clearly delineated the vertical structure, with Sonneratia apetala stands appearing as distinct tall towers compared to the lower, more uniform canopy of Avicennia marina. The multispectral orthomosaic provided the five spectral bands. The combined 6-layer feature stack (5 spectral + 1 height) was then used for classification.

We evaluated four machine learning algorithms: k-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT). For each, two scenarios were tested: classification using only multispectral features (MS-only) and classification using the combined multispectral and CHM features (MS+CHM). A large and balanced sample set was used for training and validation. The performance was assessed using Overall Accuracy (OA) and the Kappa coefficient. The results conclusively demonstrate the value of the CHM derived from the DJI drone photogrammetry.
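Both metrics follow directly from the confusion matrix. A minimal sketch (the matrix counts below are made up for demonstration):

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """OA and Cohen's kappa from a confusion matrix
    (rows = reference labels, columns = predicted labels)."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    oa = np.trace(confusion) / n
    # Chance agreement expected from the row/column marginals.
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
    kappa = (oa - pe) / (1.0 - pe)
    return oa, kappa

# Illustrative 3-class example.
oa, kappa = overall_accuracy_and_kappa([[50, 3, 2], [4, 45, 6], [1, 5, 49]])
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```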

Table 2: Classification Accuracy Comparison for Different Feature Sets and Algorithms

| Classifier | Feature Set | Overall Accuracy (OA, %) | Kappa Coefficient | OA Gain from CHM (percentage points) |
| --- | --- | --- | --- | --- |
| k-Nearest Neighbors (KNN) | MS-only | 72.5 | 0.624 | |
| k-Nearest Neighbors (KNN) | MS+CHM | 83.6 | 0.768 | +11.1 |
| Decision Tree (DT) | MS-only | 70.7 | 0.603 | |
| Decision Tree (DT) | MS+CHM | 77.9 | 0.695 | +7.2 |
| Random Forest (RF) | MS-only | 71.9 | 0.622 | |
| Random Forest (RF) | MS+CHM | 84.1 | 0.777 | +12.2 |
| Gradient Boosting (GBDT) | MS-only | 72.5 | 0.622 | |
| Gradient Boosting (GBDT) | MS+CHM | 84.5 | 0.779 | +12.0 |

The GBDT classifier achieved the highest accuracy, 84.5%, when using the combined feature set. The quantitative analysis reveals several key findings. First, integrating the CHM feature substantially improved classification accuracy for all algorithms, with OA gains ranging from 7.2 to 12.2 percentage points. This unequivocally validates the central hypothesis that structural height is a powerful discriminator for mangrove species. Second, the CHM generated from the affordable DJI drone imagery was highly effective, demonstrating its potential as a practical substitute for expensive LiDAR data in providing crucial 3D information for vegetation mapping. Third, among the classifiers, the ensemble methods (RF and GBDT) outperformed the simpler models (KNN and DT), with GBDT showing a slight edge, likely because its boosting mechanism sequentially reduces bias in addition to variance.

A per-class analysis of the GBDT MS+CHM results provides deeper insight. The accuracy for the tall, structurally distinct species Sonneratia apetala saw the most dramatic improvement, exceeding 40 percentage points compared to the MS-only classification, reaching over 97% accuracy. This highlights the scenario where spectral confusion is high but height difference is significant. Species with similar heights and intermixing patterns, such as Rhizophora stylosa and Bruguiera gymnorhiza, remained the most challenging to separate, although their classification accuracy still improved with the addition of the CHM. The ubiquitous Avicennia marina was also mapped with higher confidence (over 92% accuracy). Misclassifications primarily occurred at the boundaries between species patches and in areas of complex canopy intermixing, which is a common limitation in pixel-based classification approaches.
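The per-class figures quoted above correspond to the diagonal of the confusion matrix normalized by row (producer's accuracy) or by column (user's accuracy). A small helper, again a sketch rather than the study's exact evaluation code:

```python
import numpy as np

def per_class_accuracy(confusion):
    """Producer's accuracy (per reference class, sensitive to omission)
    and user's accuracy (per mapped class, sensitive to commission)."""
    confusion = np.asarray(confusion, dtype=float)
    diag = np.diag(confusion)
    producers = diag / confusion.sum(axis=1)
    users = diag / confusion.sum(axis=0)
    return producers, users
```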

In conclusion, this study demonstrates a streamlined, cost-effective, and highly accurate workflow for fine-scale mangrove species classification. The integrated DJI Phantom 4 Multispectral drone platform is the cornerstone of this approach, enabling the synchronous collection of co-registered spectral and high-resolution RGB imagery in a single deployment. Photogrammetric processing of the RGB data to generate a detailed Canopy Height Model (CHM) has proven to be a viable alternative to airborne LiDAR for capturing essential vegetation structure. Combining this structural feature with the multispectral reflectance values creates a robust feature space that significantly mitigates the problem of spectral similarity among mangrove species, and the Gradient Boosting Decision Tree algorithm further optimizes the classification outcome. The high final accuracy (84.5% OA) underscores the operational viability of the method. This DJI drone-based framework offers a powerful tool for mangrove researchers and conservation managers, facilitating rapid, repeatable, and precise monitoring of species composition, which is fundamental for effective ecosystem assessment, restoration planning, and biodiversity conservation.
