Abstract
Mangroves provide essential ecological benefits, and accurate classification is vital for their protection. This study used 2023 Landsat 8 SR data within the Google Earth Engine (GEE) platform to classify mangrove and non-mangrove areas in the Farasan Islands Protected Area in Saudi Arabia. Three machine learning models, Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boost (GB), together with an ensemble approach, were employed using spectral indices such as NDVI, MNDWI, SR, GCVI, and LST. The ensemble model achieved an overall accuracy (OA) of 92.2% and a kappa coefficient (KC) of 0.84. Among the individual models, RF had an OA of 91.4% and a KC of 0.82, SVM had an OA of 88.3% and a KC of 0.76, and GB recorded an OA of 86.7% and a KC of 0.73. Ground truth cross-validation was conducted using high-resolution satellite imagery from Google Earth, combined with an NDVI overlay derived from Landsat 8 data. This approach confirmed the accuracy of the models in detecting dispersed mangrove patches, which are often missed in global datasets. This workflow can enhance conservation efforts and support sustainable mangrove management.
Introduction
Mangrove forests, found in the tropics and subtropics, are essential ecosystems that bridge terrestrial and marine environments1,2,3. These ecosystems are among the most productive and valuable ecosystems in the world4,5, providing crucial ecological services, including carbon storage, shoreline stabilization, and support for diverse biological communities6,7,8. Despite their importance, mangroves are under significant threat from human activities and climate change, leading to habitat loss and degradation9. In recent decades, regional and global pressures have further endangered marine ecosystems10.
Spalding et al. (2010) conducted a comprehensive global assessment and reported that mangroves covered approximately 152,000 square kilometers11 worldwide during the year 2000. This estimate has been widely referenced in subsequent studies and conservation efforts12,13. Their analysis, which is based on satellite imagery and ground truth data, remains one of the most cited sources of information on global mangrove extent. A later study by Giri et al. (2011), utilizing Landsat satellite data, provided an updated global mangrove distribution map, estimating the coverage at 137,760 square kilometers14. This study refined previous estimates by using higher-resolution imagery and a more consistent classification approach, highlighting the precision improvements in remote sensing technologies over time. Similarly, the Global Mangrove Watch (GMW) project estimated that the global mangrove extent was 137,600 km2 in 2016 and 147,359 km2 in 2020, covering 14.93% of the global coastline15. Using synthetic aperture radar (SAR) and optical imagery, GMW provided updated and more accurate data, as SAR can penetrate cloud cover and capture data under various environmental conditions, including during both day and night. In 2020, GMW reported that Saudi Arabia’s mangrove habitat was 77.1 km2, 11.86% of its 7141.5 km coastline. GMW also noted a 23 km2 decline in Saudi mangroves from 1996 to 2020. Despite these losses, Saudi Arabia’s two mangrove species are not classified as threatened on the IUCN Red List.
In Saudi Arabia, where mangrove habitats are primarily concentrated along the Red Sea coast, local environmental factors such as arid climates and coastal development pose unique challenges for mangrove conservation16. Relying solely on global datasets for mangrove extent data can lead to inaccuracies, as they may not fully capture the local ecological variations. ML techniques, such as RF, SVM, and GB, offer a more tailored solution by integrating high-resolution satellite data, such as Landsat or Sentinel imagery, with specific local training datasets to produce more accurate and up-to-date mangrove classifications17,18. These methods enable the detection of subtle changes in mangrove cover, providing essential insights for managing and protecting Saudi Arabia’s mangroves in the face of ongoing environmental pressures. By applying ML, researchers can develop more refined models that outperform global assessments, making them indispensable for conservation efforts.
This research aims to enhance the precision of mangrove classification on Farasan Island by integrating Landsat 8 Surface Reflectance (SR) with advanced ML techniques, including RF, SVM, and GB. An ensemble method based on majority voting, which combines predictions from multiple models, is employed to further improve classification accuracy. By comparing the ML-derived mangrove classifications with the global mangrove datasets, this study seeks to generate more accurate estimates of mangrove extent, offering a reliable foundation for ecosystem monitoring. The methodology encompasses comprehensive data cleaning, model training, validation, and accuracy assessment to ensure robust and scientifically credible results.
This study is motivated by the urgent need to detect undocumented fragmented mangrove patches. The Farasan Islands, with their arid climate and scattered mangrove stands, present a distinct challenge for remote sensing classification. By applying pixel-based machine learning classifiers and integrating multiple spectral and thermal indices, this research aims to enhance classification accuracy and support localized conservation strategies within a protected island environment.
Materials and methods
Study area
The study area covers the mangrove ecosystem of the Farasan Islands, located in the Red Sea off the southwestern coast of Saudi Arabia (see Fig. 1). The Farasan Islands were declared a protected marine area in 1996 and are managed for biodiversity conservation, particularly for their rich marine life and unique mangrove ecosystems19. The study area is centered at approximately 16°48’ N latitude and 41°59’ E longitude20. The islands are characterized by an arid, subtropical climate, with annual rainfall ranging from 50 to 100 mm and temperatures ranging from 21 °C (winter) to 40 °C (summer)21. The diverse ecosystems of the Farasan Islands, including coral reefs, mangroves, and seagrass beds, make them critical sites for ecological research22. The study area is divided into four regions: A, B, C, and D. This division helps in understanding the spatial variations in mangrove analysis, as shown in Table 1. Mangroves are primarily distributed along sheltered coastal areas, such as shallow bays, inlets, and intertidal zones, where they find favorable conditions to thrive.
Datasets
This section outlines the datasets used for mangrove classification. It begins by detailing the collection and preparation process for the training and reference samples, followed by a description of the satellite datasets.
Training and test samples
A total of 12,033 pixel samples were used to classify mangrove and non-mangrove areas (Table 1). These samples were divided into 60% for training and 40% for testing to ensure robust model performance. To improve classification accuracy, NDVI data overlaid with high-resolution imagery (Google Earth) was used for visual inspection of coastal areas during sample data collection. Non-mangrove samples included categories such as coastal seagrasses, open water, grassland, and areas with barren ground. A stratified random sampling method was employed to reduce bias. Training and testing results indicate model performance but do not guarantee perfect classification accuracy. For independent accuracy assessment, 385 reference sample points were generated using a simple random sampling design. These points were visually interpreted and classified into mangrove and non-mangrove classes using Google Earth.
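The 60/40 split with stratification described above can be illustrated with a short sketch. The text does not specify the exact sampling procedure, so the function name `stratified_split`, the fixed seed, and the toy labels below are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.6, seed=42):
    """Split sample indices so each class keeps the same train/test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # randomize within the class
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: 1 = mangrove, 0 = non-mangrove
labels = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class keeps the mangrove share identical in the training and test sets, which matters when one class is much rarer than the other.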
Satellite data
In this study, Landsat 8 Surface Reflectance (SR) OLI/TIRS23 images (Path 167, Row 48), covering the mangroves of the Farasan Islands, were utilized. A total of 23 scenes were processed in the Google Earth Engine (GEE) environment to generate an annual median mosaic for the year 2023, based on the satellite’s 16-day revisit interval. Atmospheric corrections were inherently applied in the Level 2 Tier 1 Surface Reflectance dataset. Cloud and shadow contamination were removed during preprocessing using Quality Assessment (QA) bands.
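As a rough illustration of the masking and compositing step, the sketch below applies a cloud and cloud-shadow mask to a per-pixel time series and takes the median of the remaining observations. It is plain Python rather than GEE code; the bit positions follow the Landsat Collection 2 `QA_PIXEL` convention (bit 3 for cloud, bit 4 for cloud shadow), and the helper names and sample values are hypothetical.

```python
from statistics import median

CLOUD_BIT = 1 << 3          # QA_PIXEL bit 3: cloud
CLOUD_SHADOW_BIT = 1 << 4   # QA_PIXEL bit 4: cloud shadow


def is_clear(qa_value):
    """True when neither the cloud nor the cloud-shadow bit is set."""
    return (qa_value & (CLOUD_BIT | CLOUD_SHADOW_BIT)) == 0


def median_composite(observations):
    """Median of clear observations for one pixel and one band.

    observations: list of (reflectance, qa_pixel) pairs, one per scene.
    Returns None when every observation is masked.
    """
    clear = [value for value, qa in observations if is_clear(qa)]
    return median(clear) if clear else None


# Hypothetical series for a single pixel across five scenes
series = [(0.21, 0), (0.65, CLOUD_BIT), (0.19, 0), (0.05, CLOUD_SHADOW_BIT), (0.23, 0)]
composite = median_composite(series)
```

The median is used here because it is insensitive to the occasional bright (residual cloud) or dark (residual shadow) outlier that survives masking.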
Although Sentinel-2's 10m spatial resolution offers advantages for detecting small and fragmented mangrove patches, Landsat 8 was selected for this study due to its thermal infrared bands, which are essential for deriving Land Surface Temperature (LST). Incorporating LST as an input variable enhances classification accuracy by highlighting thermal contrasts between mangroves and surrounding land, water, and non-mangrove vegetation.
Data analysis
The workflow depicted in Fig. 2 illustrates the process of mangrove classification using Landsat 8 data. The process begins with data cleaning, followed by the calculation of spectral indices, including the Normalized Difference Vegetation Index (NDVI), Modified Normalized Difference Water Index (MNDWI), Simple Ratio (SR), Green Chlorophyll Vegetation Index (GCVI), and Land Surface Temperature (LST). An annual median mosaic for 2023 was created by calculating the median value of each pixel for each spectral band, reducing seasonal variations and providing a consistent input for classification models. An elevation mask was used to exclude regions unsuitable for mangroves, focusing on coastal and intertidal zones where mangroves typically thrive, thereby reducing misclassification. The spectral indices were normalized to a 0–1 range (using min-max normalization) and used as input for the machine learning (ML) models to classify areas into mangrove and non-mangrove classes. To further improve classification accuracy, an ensemble approach using majority voting, which combines predictions from multiple models, was implemented.
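The min-max normalization mentioned above rescales each index so that its smallest value maps to 0 and its largest to 1. A minimal sketch follows; the function name and the sample NDVI values are illustrative assumptions, and the handling of a constant band (all zeros) is a choice made here, not something the text specifies.

```python
def min_max_normalize(values):
    """Rescale values linearly to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no spread to rescale, so return zeros (assumed convention)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical NDVI values for four pixels
ndvi_values = [-0.2, 0.1, 0.4, 0.8]
scaled = min_max_normalize(ndvi_values)
```

Putting all inputs on a common scale matters most for the SVM, whose RBF kernel is distance-based; tree-based models such as RF are largely insensitive to it.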
Spectral indices
Multiple standard spectral indices were generated to serve as input variables in the machine learning (ML) classification, each highlighting different land cover properties.
Normalized difference vegetation index (NDVI): The NDVI is widely used to assess vegetation photosynthetic activity24 and was calculated using Eq. (1):
Modified normalized difference water index (MNDWI): The MNDWI enhances open water detection by suppressing the reflectance of vegetation and soil25. It was calculated using Eq. (2):
Simple ratio (SR): The SR index provides a basic measure of vegetation density by calculating the ratio of NIR to red reflectance26, using Eq. (3):
Green chlorophyll vegetation index (GCVI): The GCVI is particularly sensitive to variations in chlorophyll concentration27, and was calculated using Eq. (4):
Land surface temperature (LST): LST was calculated by first converting the raw digital number (DN) to top-of-atmosphere (TOA) radiance using multiplicative and additive rescaling factors28. The TOA radiance was then converted to brightness temperature in Kelvin using Planck’s law and calibration constants. Finally, LST was calculated from the brightness temperature and surface emissivity29 using Eq. (5):
where Tb is the brightness temperature in Kelvin; λ is the wavelength of the emitted radiance; ρ is a constant; and \(\epsilon\) is the surface emissivity.
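The sketch below writes the five inputs as functions, using the standard definitions from the literature; it is assumed, not confirmed, that these match Eqs. (1) to (5). In particular, the use of the SWIR1 band in MNDWI, the value of ρ (h·c/σ ≈ 1.438 × 10⁻² m K), and the Band 10 centre wavelength of about 10.895 µm are assumptions introduced here.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def mndwi(green, swir1):
    """Modified normalized difference water index (SWIR1 band assumed)."""
    return (green - swir1) / (green + swir1)


def simple_ratio(nir, red):
    """Simple ratio of NIR to red reflectance."""
    return nir / red


def gcvi(nir, green):
    """Green chlorophyll vegetation index."""
    return nir / green - 1.0


RHO = 1.438e-2          # h*c/sigma in m K (assumed value of the constant rho)
WAVELENGTH = 10.895e-6  # assumed centre wavelength of Landsat 8 Band 10, in m


def lst_kelvin(tb, emissivity, wavelength=WAVELENGTH, rho=RHO):
    """Emissivity-corrected land surface temperature from brightness temperature Tb."""
    return tb / (1.0 + (wavelength * tb / rho) * math.log(emissivity))


# Hypothetical reflectances for a vegetated pixel
example_ndvi = ndvi(nir=0.40, red=0.10)
example_lst = lst_kelvin(tb=300.0, emissivity=0.98)
```

Because ln(ε) is negative for ε < 1, the correction raises LST slightly above the brightness temperature; for a perfect emitter (ε = 1) the two coincide.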
Classification
This study employed the RF30, SVM31, and GB32 classifiers, and an ensemble method using majority voting was applied to classify mangrove and non-mangrove pixels. Each classifier was optimized through hyperparameter tuning: RF utilized bootstrap aggregation, SVM employed a radial basis function kernel for nonlinear classification, and GB applied gradient boosting. For the RF classifier, the number of trees was varied from 5 to 200 in increments of 5 to identify the optimal count that maximizes classification accuracy while maintaining computational efficiency. Similarly, for the SVM and GB models, key parameters such as the RBF kernel in the SVM and the number of trees and learning rate in GB were fine-tuned to achieve the best results. Focal mode filtering was used to smooth the classification outputs. Although deep learning models like CNNs and U-Net are effective for image classification and segmentation, they were not used in this study as the classification was conducted on the Google Earth Engine (GEE) platform, which does not support deep learning frameworks like TensorFlow or PyTorch. GEE is optimized for scalable geospatial analysis, focusing on traditional machine learning models for tasks that involve spectral indices and large-scale environmental data processing.
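Majority voting over the three classifiers can be sketched as follows. This is a plain-Python illustration rather than the GEE implementation used in the study; the function name and the five-pixel predictions are hypothetical.

```python
def majority_vote(*predictions):
    """Pixel-wise majority vote over equal-length lists of 0/1 labels.

    A pixel is labelled mangrove (1) when more than half of the models agree.
    """
    n_models = len(predictions)
    return [1 if sum(votes) * 2 > n_models else 0 for votes in zip(*predictions)]


# Hypothetical predictions for five pixels (1 = mangrove, 0 = non-mangrove)
rf_pred = [1, 1, 0, 0, 1]
svm_pred = [1, 0, 0, 1, 1]
gb_pred = [0, 1, 0, 0, 1]

ensemble_pred = majority_vote(rf_pred, svm_pred, gb_pred)
```

With three voters and two classes there are no ties, so each pixel takes the label chosen by at least two of the three models.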
The models’ performance was evaluated using precision (Eq. 6), recall (Eq. 7), and the F1 score (Eq. 8) for the mangrove (M) and non-mangrove (NM) classes33.
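For reference, the three metrics can be computed from true positives (TP), false positives (FP), and false negatives (FN) for a given class, as in the sketch below. These are the standard definitions; the counts used are hypothetical and do not come from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard per-class precision, recall, and F1 score."""
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for the mangrove class
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=15)
```

The same function yields the non-mangrove scores when that class is treated as the positive one.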
Independent accuracy assessment
Both visual and statistical methods were employed to evaluate the accuracy of the mangrove ecosystem map. High-resolution satellite imagery from Google Earth, combined with an NDVI overlay derived from Landsat 8 data, was used for visual interpretation to validate classified mangrove areas. For the statistical accuracy assessment, 385 independent test samples were used to generate a confusion matrix, from which key metrics such as producer accuracy (PA), user accuracy (UA), overall accuracy (OA), and the kappa coefficient (KC) were calculated using Eqs. (9–12)34. The PA indicates how well mangrove patches are classified, whereas the UA reflects how accurately the map represents real conditions in the study area. The kappa coefficient measures classification accuracy while accounting for the possibility of random agreement.
where Po is the observed accuracy and Pε is the expected accuracy.
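The confusion-matrix metrics can be sketched as below, using the standard definitions of OA, PA, UA, and kappa. The row and column convention (rows as reference classes, columns as mapped classes) and the example matrix are assumptions for illustration only, not the study's results.

```python
def accuracy_metrics(matrix):
    """OA, per-class PA and UA, and kappa from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i mapped as class j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    ref_totals = [sum(row) for row in matrix]                             # row sums
    map_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # column sums
    correct = [matrix[i][i] for i in range(k)]

    oa = sum(correct) / n                                  # observed accuracy (Po)
    pa = [correct[i] / ref_totals[i] for i in range(k)]    # producer's accuracy
    ua = [correct[i] / map_totals[i] for i in range(k)]    # user's accuracy
    pe = sum(ref_totals[i] * map_totals[i] for i in range(k)) / n ** 2  # expected accuracy
    kappa = (oa - pe) / (1 - pe)
    return oa, pa, ua, kappa


# Hypothetical matrix: class 0 = mangrove, class 1 = non-mangrove
oa, pa, ua, kappa = accuracy_metrics([[45, 5], [10, 40]])
```

In this toy case the expected chance agreement is 0.5, so an observed accuracy of 0.85 corresponds to a kappa of 0.7.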
Gini index assessment
To calculate the Gini index35, we implemented Eq. (13) for the Gini coefficient based on the distribution of values for both the mangrove and non-mangrove classes.
where G is the Gini coefficient, n is the number of data points, xi represents the values of the variable sorted in ascending order, and i is the rank of each value in the sorted list.
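A common rank-based form of the Gini coefficient that uses exactly these symbols is G = 2 Σ(i · xi) / (n Σ xi) − (n + 1) / n. The sketch below implements that form; it is assumed, not confirmed, to correspond to Eq. (13), and it expects non-negative values with a positive sum.

```python
def gini(values):
    """Rank-based Gini coefficient for non-negative values with a positive sum."""
    xs = sorted(values)                  # ascending order, so rank i starts at 1
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical inputs: perfectly equal values versus a single non-zero value
equal_case = gini([1, 1, 1, 1])
unequal_case = gini([0, 0, 0, 1])
```

Identical values give G = 0, while concentrating the whole total in one of n points gives G = (n − 1) / n, which approaches 1 as n grows.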
Results
Mangroves class
The results presented in Table 2 show the variation in mangrove area across the four groups (A, B, C, and D). Using the ensemble method, the total mangrove area was estimated at 4.78 km2. Group B had the largest area at 3.08 km2, followed by Group C with 1.13 km2, Group D with 0.31 km2, and Group A with 0.26 km2.
Figure 3 illustrates the mangrove class estimated using machine learning (ML) models, with pixels classified as mangroves shown in red (RF), orange (SVM), and blue (GB). Regions where two classifiers agree are displayed in pink (RF + GB), yellow (RF + SVM), and cyan (SVM + GB). Areas where all three classifiers (RF, SVM, and GB) consistently classify the same mangrove pixels are highlighted in green. A visual comparison of these classifications with the reference GMW data demonstrates the strong reliability of the ML models. Notably, the higher concentration of green pixels, particularly in core mangrove regions, indicates strong consensus among the models.
The iteration results for the machine learning (ML) classifiers reveal consistent variation in the number of classified mangrove pixels across ten iterations. RF consistently detected the highest number of mangrove pixels, ranging from 5458 to 5562, while SVM detected between 5283 and 5349 pixels. Although GB classified fewer pixels, ranging from 5033 to 5144, it still demonstrated substantial accuracy. These results emphasize the robustness of the RF model for mangrove classification, with GB providing a more conservative estimate. The variability observed across classifiers highlights the importance of ensemble methods in improving mangrove mapping accuracy.
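Pixel counts such as these can be related to area under the assumption that each classified pixel is a 30 m × 30 m Landsat cell (900 m²). The sketch below makes that conversion explicit; the constant and function name are illustrative, and the study's own area calculation may differ (for example, after focal mode filtering).

```python
PIXEL_SIZE_M = 30.0  # assumed Landsat 8 pixel edge length in metres


def pixels_to_km2(pixel_count, pixel_size_m=PIXEL_SIZE_M):
    """Convert a count of square pixels to area in square kilometres."""
    return pixel_count * pixel_size_m ** 2 / 1_000_000


# Lower and upper RF pixel counts reported across the ten iterations
rf_low = pixels_to_km2(5458)
rf_high = pixels_to_km2(5562)
```

Under this assumption, the RF range of 5458 to 5562 pixels corresponds to roughly 4.91 to 5.01 km².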
For statistical validation of model comparisons, the Shapiro-Wilk test indicated that the performance differences between the ensemble model and individual classifiers were not normally distributed, as all p-values were significantly below 0.05. Due to this violation of the normality assumption, the Wilcoxon signed-rank test was used as a non-parametric alternative. The Wilcoxon test revealed statistically significant performance differences, with all p-values below 0.05, confirming that the ensemble model’s better performance was not due to random variation. These results provide strong evidence supporting the effectiveness of the ensemble approach for mangrove mapping, independent of normality assumptions.
Accuracy assessment
Model performances
Using 40% of the reference points, F1 scores, precision and recall were computed for each classifier. The RF model achieved F1 scores of 92.5% ± 1.3% (NM) and 89.4% ± 1.5% (M). SVM showed F1 scores of 90.7% ± 1.4% (NM) and 85.7% ± 1.8% (M). GB had F1 scores of 89.4% ± 1.4% (NM) and 83.7% ± 1.7% (M). The ensemble model improved F1 scores to 93.0% ± 1.3% (NM) and 89.7% ± 1.5% (M), indicating enhanced overall performance (Fig. 4).
Independent accuracy assessment
The ensemble model achieved the highest overall accuracy (92.2% ± 1.3%), followed by RF (91.4% ± 1.5%), SVM (88.3% ± 1.7%), and GB (86.7% ± 1.8%). The ensemble also recorded the highest Mangrove User’s Accuracy (M_UA) of 87.7% ± 1.7% and Mangrove Producer’s Accuracy (M_PA) of 92.9% ± 1.4%, demonstrating its superior capability in accurately classifying mangrove areas across different density gradients (Fig. 5).
The kappa coefficients are 0.82 ± 0.03 (RF), 0.76 ± 0.04 (SVM), 0.73 ± 0.04 (GB), and 0.84 ± 0.03 (ensemble), with the ensemble showing the highest agreement beyond random chance.
Cross-validation of mangrove classification at sites A and B using field data and high-resolution imagery
For cross-validation, two sites, (a) and (b), were selected (Fig. 3) to represent diverse mangrove density gradients, including core and sparse patches. Site (a) (Fig. 6) features mangroves predominantly thriving along the coastal boundary, with scattered patches extending inland. Site (b) (Fig. 7) is a small island where mangroves form coverage along the northwestern and southeastern regions, while narrower mangrove belts are visible in the northern and southern sections. These variations illustrate the spatial diversity in mangrove ecosystems captured by the classifiers. A total of 30 reference sample points were used at site (a) and 22 at site (b). These sites were analyzed via high-resolution GeoEye imagery to verify the accuracy of classification outputs. Ground-truth data collection targeted both mangrove and non-mangrove areas based on prior remote sensing analysis. To ensure data quality and minimize observer bias, the classified pixels were verified through imagery interpretation using multiple sources, including Google Earth, GeoEye imagery, and focus group discussions.
At site (a) (16°52’10"N, 41°58’53"E), our analysis revealed a mangrove extent of 754,400 square meters, compared to the 512,976 square meters reported by the GMW, a difference of 241,424 square meters. This underscores the importance of localized training samples, which enhance the ability of ML classifiers to detect dispersed mangroves across varying density gradients.
At site (b) (16°48’4"N, 41°59’46"E; Fig. 7), our analysis revealed a mangrove area of 524,400 square meters. Notably, on the northwestern side of the site, several scattered patches of mangroves that were not captured by the GMW data were successfully identified. However, on the northeastern side, a narrow strip of mangroves was not captured by either the GMW or our analysis. This omission is due to the resolution limitations of Landsat datasets, which hinder the detection of such small or fragmented patches.
Gini coefficient analysis
The Gini coefficient was calculated via the RF model to assess the distributional inequality of various environmental indices across mangrove and non-mangrove areas (Table 3).
For NDVI, the Gini coefficient was slightly higher in mangrove areas (0.038) than in non-mangrove areas (0.033), indicating greater variability in vegetation health within mangrove zones. The simple ratio (SR) showed greater variability in mangrove areas (0.090) than in non-mangrove areas (0.060). In contrast, chlorophyll content (GCVI) had higher variability in non-mangrove areas (0.083) than in mangrove areas (0.074). LST exhibited the lowest Gini values, reflecting more uniform temperatures in non-mangrove areas (0.014) than in mangrove areas (0.024).
Discussion
ML integrated with spectral indices for mangrove mapping
In this study, the ML classifiers RF, SVM, GB and the ensemble approach were employed for mangrove mapping using Landsat 8 Surface Reflectance (SR) OLI/TIRS data, covering the period from January to December 2023 on Farasan Island, Saudi Arabia. Data collection and processing were conducted within the Google Earth Engine (GEE) environment. Although deep learning models like CNNs are useful for mangrove classification, they were not used in this study due to the requirement of frameworks (e.g., TensorFlow, PyTorch), which Google Earth Engine does not support36. This study emphasizes the utility of spectral indices such as the NDVI, MNDWI, SR, GCVI, and LST. This approach of using spectral indices offers significant advantages over traditional methods such as land use/land cover (LULC), field surveys, and supervised classification. By leveraging the unique reflectance characteristics of mangroves, spectral indices enable more accurate and detailed classification, particularly in detecting scattered and low-density patches.
The proposed method achieved high overall accuracy (OA) and kappa coefficient (KC), demonstrating its effectiveness for mapping mangroves in remote areas such as Farasan Island, where extensive fieldwork is challenging. This approach is particularly valuable when ground truth data are difficult to obtain due to logistical constraints. While reference samples derived from high-resolution satellite imagery using Google Earth through visual interpretation are widely used, this technique can introduce uncertainties, especially in areas with high chlorophyll concentrations that may obscure mangrove boundaries37. In such cases, overlaying NDVI data proved highly beneficial, as the NDVI effectively distinguishes vegetation health and density, reducing misclassification and improving the accuracy of mangrove delineation38.
One of the key advantages of using spectral indices is their ability to detect mangroves with varying vegetation densities, including low-density and dispersed patches. This capability was demonstrated in research by Pham et al. (2020), who showed that machine learning (ML) algorithms combined with spectral indices outperformed traditional land use/land cover (LULC) mapping in terms of accuracy and precision, particularly in complex ecosystems such as mangroves39. By integrating spectral indices that capture vital information about vegetation health (NDVI), water content (MNDWI), and thermal properties (LST), the proposed approach provides a multidimensional understanding of mangrove ecosystems. This capability surpasses LULC maps, which often rely on broad land cover categories and fail to capture the subtle distinctions between mangrove and non-mangrove areas.
The application of an ensemble of ML models (RF, SVM, GB) enhances the robustness of the classification. As demonstrated by recent studies40, combining different models improves classification accuracy and minimizes errors, particularly in complex environments where single models may fall short. For example, in this study, the ensemble method produced an overall accuracy of 92.2%, outperforming individual classifiers such as RF (91.4%) and SVM (88.3%). This result underscores the benefit of using ensemble techniques in conjunction with spectral indices for detailed and accurate mangrove mapping.
Landsat 8 (SR) compared with other satellite data for mangrove classification
Landsat 8 Surface Reflectance (SR) OLI/TIRS data, with a 30-meter spatial resolution and a 16-day revisit cycle, provides a good balance between spatial detail and temporal frequency, making it well suited for ecosystem monitoring in remote areas such as the Farasan Islands. However, compared to Sentinel-2, which offers a finer 10-meter resolution, Landsat 8 may miss narrow and linear mangrove patches, as shown in yellow polygons (Fig. 8).
Fig. 8: Narrow mangrove patches (highlighted in yellow polygons) missed because of the limitations of Landsat’s 30-m spatial resolution. The classification results from multiple machine learning models (Random Forest [RF], Gradient Boosting [GB], Support Vector Machine [SVM], and Ensemble) are displayed over Sentinel-2 MSI imagery for sites A, B, C, and D, showing the impact of spatial resolution on mangrove mapping.
The higher resolution of Sentinel-2 allows for better detection of smaller vegetation structures41. Additionally, Sentinel-2’s five-day revisit cycle is significantly shorter than that of Landsat 8, making it more effective for detecting short-term environmental changes, which is crucial for monitoring dynamic ecosystems such as mangroves that may undergo rapid alterations due to environmental factors or human interference42. However, Sentinel-2 imagery lacks thermal bands, which limits its ability to capture the thermal properties of mangroves43, an essential aspect for differentiating mangroves from surrounding land and water environments. Mangroves are typically situated in complex transitional zones between land and water, where thermal data can provide valuable insights. Landsat 8’s Thermal Infrared Sensor (TIRS) plays a critical role in detecting these thermal variations, making it more suitable for comprehensive mangrove assessments, particularly in regions where land–water interactions are prominent44. Research has shown that mangrove areas often exhibit unique thermal properties due to their intertidal location and proximity to water bodies, which makes LST a valuable tool for monitoring these ecosystems45. Additionally, Landsat 8 benefits from atmospheric correction algorithms such as the Landsat surface reflectance Tier 1 product, which improves data quality by minimizing atmospheric interference, especially in regions such as the Farasan Islands, where arid conditions and coastal dynamics can affect image clarity46. In contrast, Sentinel-2, despite its finer resolution, is more prone to atmospheric contamination in areas with persistent cloud cover47.
Very high-resolution datasets, such as GeoEye and SPOT, provide even greater spatial detail, with resolutions as fine as 0.5 meters. This makes them excellent for capturing fine-scale variations in mangrove ecosystems and detecting smaller, fragmented patches that may be missed by Landsat or Sentinel imagery. However, these high-resolution datasets have significant limitations. First, they are expensive, especially for large-scale or long-term monitoring projects48. Second, their availability is typically limited to specific areas, and they do not provide regular global coverage, making them unsuitable for continuous monitoring or time series analysis49. Moreover, high-resolution data sources such as GeoEye and SPOT are not integrated into cloud-based platforms such as GEE, which limits their utility for large-scale automated analysis. The absence of thermal bands in these datasets further restricts their application in studies that require LST data, which is essential for assessing mangrove health and environmental stress50. These factors make high-resolution datasets less practical for operational monitoring, despite their superior spatial detail.
Model performance and results comparison with global mangrove maps
The classification results show variability in mangrove area estimates across groups and models, reflecting the strengths and limitations of each. The RF model identified the largest total mangrove area (4.92 km2), exceeding the SVM (4.76 km2) and GB (4.56 km2) estimates, while the ensemble model provided an intermediate estimate (4.78 km2). The scattered and fragmented nature of mangroves on Farasan Island poses challenges, particularly in low-density patches, where model misclassifications occurred. Additionally, medium-resolution Landsat 8 data limit the detection of fine-scale mangrove features, impacting boundary accuracy. In this study, biases were mitigated by adding training samples at locations where mangroves existed but were not initially classified. This approach helped improve model sensitivity to low-density and fragmented patches, reducing misclassification errors.
Among the individual classifiers, the RF model’s superior performance is supported by its F1-scores for both mangroves and non-mangroves, which were higher than those of GB and SVM. The robustness of RF can be attributed to its design, which minimizes overfitting by aggregating decisions from multiple decision trees, leading to better generalization. In the independent accuracy assessment, the RF model achieved the highest overall, User’s, and Producer’s accuracies for the mangrove class among the individual classifiers. This finding indicates that RF effectively balances precision and recall, reducing classification errors. The RF model was the most reliable single classifier, consistent with findings from other studies51,52,53,54. SVM and GB, on the other hand, achieved lower accuracies. RF’s ability to handle complex, nonlinear relationships between spectral indices and mangrove presence, along with its resilience to noise, makes it more reliable for mangrove mapping. This highlights the importance of using models like RF for ecological applications where accurate vegetation classification is crucial.
Comparing our results with global mangrove datasets helps highlight the strengths and limitations of the study. The Global Mangrove Watch (GMW) datasets feature a spatial resolution of ~25 meters, derived from multi-sensor satellite data. ALOS PALSAR radar (25 m) enables cloud-penetrating mangrove structure mapping, while Landsat optical imagery (30 m) supports spectral analysis of health and extent. We used 30-meter Landsat 8 (SR) data, deriving multiple spectral indices for input into machine learning classifiers, which allowed for more localized detection of dispersed mangrove patches. Figures 6 and 7 illustrate the classified mangrove areas, highlighting the additional extents identified by our ML approach. We also compared our results with the Global Mangrove Forests Distribution, 2000 dataset55, which used hybrid supervised and unsupervised classification techniques on Landsat-5 imagery at 30 m spatial resolution, and found that it estimated a significantly lower mangrove area for the Farasan Island region than our ensemble model.
These findings suggest that while global datasets offer valuable insights into large-scale mangrove distribution, local assessments using region-specific data are essential for accurately quantifying mangrove extent. This is particularly important in arid coastal regions such as Farasan Island, where mangrove ecosystems are scattered and sensitive to climate change.
Novel contributions relative to existing mangrove study in Saudi Arabia
Compared to the study conducted in the Asir region56, which focused on large-scale mangrove mapping and identifying priority reforestation areas based on spectral index trend analysis, this study emphasizes the fine-scale classification of fragmented mangrove patches in an island ecosystem, a geomorphologically and ecologically distinct setting where detection is often hindered by ecological heterogeneity. This locally calibrated approach addresses critical gaps in global assessments and supports more targeted conservation efforts in Saudi Arabia’s protected island ecosystem.
Future research directions
Given the critical role of mangroves in protecting Saudi Arabia’s coastal ecosystems, future research should focus on assessing the long-term impacts of climate change on mangrove health and identifying potential restoration strategies. One promising direction is leveraging multi-source remote sensing (e.g., Sentinel-1 SAR, ICESat-2 LiDAR) to analyze changes in mangrove biomass and canopy structure. Developing AI-driven mangrove restoration plans using deep learning to model optimal seedling survival rates under different hydrological and climatic conditions could revolutionize large-scale mangrove rehabilitation efforts, ensuring long-term sustainability.
Conclusions
This research demonstrates the effectiveness of integrating machine learning (ML) classifiers with spectral indices for accurate mangrove mapping, successfully identifying both dense and scattered patches. Classification based on multiple spectral indices enhances accuracy and reduces errors, particularly in remote areas. The ensemble model achieved the highest overall accuracy (OA 92.2%, KC 0.84), and Random Forest (RF) was the strongest individual classifier (OA 91.4%, KC 0.82), ahead of SVM and GB. Additionally, the study highlights the limitations of global datasets, emphasizing the importance of locally calibrated assessments for improving mangrove conservation efforts. These findings also support sustainable management of fisheries, carbon sequestration, and coastal protection.
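The overall accuracy (OA) and kappa coefficient (KC) used to compare the classifiers can both be derived from a confusion matrix. The sketch below applies the standard definitions. The 2 x 2 matrix (mangrove vs. non-mangrove) is a hypothetical example, not the study's validation data, and the function names are illustrative.

```python
# Illustrative sketch: the confusion matrix is invented for demonstration.
# Rows are reference (ground truth) classes, columns are predicted classes.


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_coefficient(cm):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Expected chance agreement from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical validation result: 50 mangrove (M) and 50 non-mangrove (NM) points.
confusion = [
    [45, 5],   # reference M:  45 predicted M, 5 predicted NM
    [5, 45],   # reference NM: 5 predicted M, 45 predicted NM
]

oa = overall_accuracy(confusion)   # 90 correct out of 100 samples
kc = kappa_coefficient(confusion)  # chance agreement is 0.5 for these marginals
```

Because kappa discounts chance agreement, it is always lower than OA for an imperfect classifier, which is why a model with OA above 90% can still report a KC in the low 0.8 range.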
Data availability
The data that support the findings of this study are available at the following GitHub repository: https://github.com/zubairgis/Farasan_Mangrove. This includes the mangrove classification data developed using machine learning classifiers. The Landsat-8 satellite images used in this study are publicly accessible from the USGS Earth Explorer platform.
Abbreviations
- ML: Machine learning
- RF: Random forest
- SVM: Support vector machine
- GB: eXtreme gradient boost
- GEE: Google Earth Engine
- LST: Land surface temperature
- MNDWI: Modified normalized difference water index
- NDVI: Normalized difference vegetation index
- SR: Simple ratio
- GCVI: Green chlorophyll vegetation index
- NIR: Near-infrared
- RED: Red band
- GREEN: Green band
- M: Mangrove
- NM: Non-mangrove
- QA_PIXEL: Quality assurance pixel
- GMW: Global Mangrove Watch
References
Alongi, D. M. The impact of climate change on mangrove forests. Curr. Clim. Change Rep. 1, 30–39. https://doi.org/10.1007/s40641-015-0002-x (2015).
Thomas, N. et al. Distribution and drivers of global mangrove forest change, 1996–2010. PLoS ONE 12, e0179302. https://doi.org/10.1371/journal.pone.0179302 (2017).
Wang, L., Jia, M., Yin, D. & Tian, J. A review of remote sensing for mangrove forests: 1956–2018. Remote Sens. Environ. 231, 111223. https://doi.org/10.1016/j.rse.2019.111223 (2019).
Alongi, D. M. Carbon cycling and storage in mangrove forests. Annu. Rev. Mar. Sci. 6, 195–219. https://doi.org/10.1146/annurev-marine-010213-135020 (2014).
Duke, N. C. et al. A world without mangroves? Science 317, 41–42. https://doi.org/10.1126/science.317.5834.41b (2007).
Tregarot, E. et al. Mangrove ecological services at the forefront of coastal change in the French overseas territories. Sci. Total Environ. 763, 143004. https://doi.org/10.1016/j.scitotenv.2020.143004 (2021).
Shing, Y. L. et al. Ecological role and services of tropical mangrove ecosystems: a reassessment. Glob. Ecol. Biogeogr. 23, 726–743. https://doi.org/10.1111/geb.12155 (2014).
Barbier, E. B. The protective service of mangrove ecosystems: A review of valuation methods. Mar. Pollut. Bull. 109, 676–681. https://doi.org/10.1016/j.marpolbul.2016.01.033 (2016).
Valiela, I., Bowen, J. L. & York, J. K. Mangrove forests: One of the world’s threatened major tropical environments. Bioscience 51, 807–815 (2001).
Hamilton, S. E. & Casey, D. Creation of a high spatiotemporal resolution global database of continuous mangrove forest cover for the 21st century (CGMFC-21). Glob. Ecol. Biogeogr. 25, 729–738. https://doi.org/10.1111/geb.12449 (2016).
Spalding, M., Kainuma, M. & Collins, L. World Atlas of Mangroves (Earthscan, London, 2010). https://doi.org/10.4324/9781849776608.
Saintilan, N., Wilson, N. C., Rogers, K., Anusha, R. & Krauss, K. W. Mangrove expansion and salt marsh decline at mangrove poleward limits. Glob. Change Biol. 20, 147–157. https://doi.org/10.1111/gcb.12341 (2013).
Ximenes, A. C. et al. A comparison of global mangrove maps: Assessing spatial and bioclimatic discrepancies at poleward range limits. Sci. Total Environ. 860, 160380. https://doi.org/10.1016/j.scitotenv.2022.160380 (2023).
Giri, C. et al. Status and distribution of mangrove forests of the world using earth observation satellite data. Glob. Ecol. Biogeogr. 20, 154–159. https://doi.org/10.1111/j.1466-8238.2010.00584.x (2010).
Bunting, P. et al. The global mangrove watch—A new 2010 global baseline of mangrove extent. Remote Sens. 10, 1669. https://doi.org/10.3390/rs10101669 (2018).
Elmahdy, S. I., Al-Kindi, S. & Taha, A. Y. Assessment of mangrove forests in arid climates using remote sensing and ML. Remote Sens. Appl. Soc. Environ. 25, 100662. https://doi.org/10.1016/j.rsase.2022.100662 (2022).
Belgiu, M. & Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 114, 24–31. https://doi.org/10.1016/j.isprsjprs.2016.01.011 (2016).
Pham, T. T. H., Yoshino, K. & Nguyen, H. M. Mangrove mapping and monitoring using ML algorithms and synthetic aperture radar (SAR) data: A case study of Quang Ninh, Vietnam. Remote Sens. 12, 3388. https://doi.org/10.3390/rs12203388 (2020).
Gladstone, W., Krupp, F. & Younis, M. Development and management of a network of marine protected areas in the Red Sea and Gulf of Aden region. Ocean Coast. Manag. 46, 741–761. https://doi.org/10.1016/s0964-5691(03)00065-6 (2003).
Al-Zahrany, A. A., Farouk, M. A. & Al-Yousef, A. A. Distribution of naturally occurring radioactivity and 137Cs in the marine sediment of Farasan Island, southern Red Sea, Saudi Arabia. Radiat. Prot. Dosimetry 152, 135–139. https://doi.org/10.1093/rpd/ncs207 (2012).
Pavlopoulos, K. et al. Geomorphological changes in the coastal area of Farasan Al-Kabir Island (Saudi Arabia) since mid holocene based on a multi-proxy approach. Quat. Int. 493, 198–211. https://doi.org/10.1016/j.quaint.2018.06.004 (2018).
UNESCO. Farasan Islands Protected Area - UNESCO World Heritage Centre. Unesco.org https://whc.unesco.org/en/tentativelists/6370/ (2018).
Landsat 8 Datasets in Earth Engine. Google for Developers https://developers.google.com/earth-engine/datasets/catalog/landsat-8 (2024).
Huang, C., Chen, Y. & Zhang, S. Integration of multisource satellite and climate data for NDVI time-series analysis to evaluate wetland vegetation coverage changes in the Sanjiang Plain. Remote Sens. 12, 2584. https://doi.org/10.3390/rs12162584 (2020).
Du, Y. et al. Water bodies’ mapping from sentinel-2 imagery with modified normalized difference water index at 10-m spatial resolution produced by sharpening the SWIR band. Remote Sens. 8, 354. https://doi.org/10.3390/rs8040354 (2016).
Salah El-Hendawy, N. et al. Combining genetic analysis and multivariate modeling to evaluate spectral reflectance indices as indirect selection tools in wheat breeding under water deficit stress conditions. Remote Sens. 12, 1480. https://doi.org/10.3390/rs12091480 (2020).
Zhang, L. et al. Integrating satellite-derived climatic and vegetation indices to predict smallholder maize yield using deep learning. Agric. For. Meteorol. 311, 108666. https://doi.org/10.1016/j.agrformet.2021.108666 (2021).
Ermida, S. L., Trigo, I. F., DaCamara, C. C., Pires, A. C. & Gonçalves, P. Google earth engine open-source code for land surface temperature estimation from the landsat series. Remote Sens. 12, 1471. https://doi.org/10.3390/rs12091471 (2020).
Sekertekin, A. & Bonafoni, S. Land surface temperature retrieval from landsat 5, 7, and 8 over rural areas: assessment of different retrieval algorithms and emissivity models and toolbox implementation. Remote Sens. 12, 294. https://doi.org/10.3390/rs12020294 (2020).
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Mountrakis, G., Im, J. & Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 66, 247–259. https://doi.org/10.1016/j.isprsjprs.2010.11.001 (2011).
Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (2016). https://doi.org/10.1145/2939672.2939785.
Alqahtani, A. F. & Ilyas, M. An ensemble-based multi-classification ML classifiers approach to detect multiple classes of cyberbullying. Mach. Learn. Knowl. Extr. 6, 156–170. https://doi.org/10.3390/make6010009 (2024).
Felicien, Nkomeje. Comparative Performance of Multi-Source Reference Data to Assess the Accuracy of Classified Remotely Sensed Imagery: Example of Landsat 8 OLI across Kigali City-Rwanda 2015. Zenodo (CERN European Organization for Nuclear Research) (2017). https://doi.org/10.5281/zenodo.268398.
Louppe, G., Wehenkel, L., Sutera, A. & Geurts, P. Understanding variable importances in forests of randomized trees. Adv. Neural Inf. Process. Syst. 26, 431–439. https://doi.org/10.48550/arXiv.1306.0260 (2013).
Gorelick, N. et al. Google earth engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 202, 18–27. https://doi.org/10.1016/j.rse.2017.06.031 (2017).
Giri, C. et al. Status and distribution of mangrove forests of the world using earth observation satellite data. Glob. Ecol. Biogeogr. 20, 154–159 (2010).
Huang, S., Tang, L., Hupy, J. P., Wang, Y. & Shao, G. A commentary review on the use of normalized difference vegetation index (NDVI) in the era of popular remote sensing. J. For. Res. 32, 1–6. https://doi.org/10.1007/s11676-020-01155-1 (2020).
Pham, T. V., Bui, H. Q. & Vu, T. H. A review of ML approaches in land use/land cover mapping of mangrove forests using optical and radar data. Viet. J. Earth Sci. 42, 207–221. https://doi.org/10.15625/0866-7187/42/3/15279 (2020).
Shen, H., Zhang, L. & Zhang, Y. Remote sensing image fusion using deep learning: A comprehensive review. IEEE Trans. Geosci. Remote Sens. 58, 5747–5768. https://doi.org/10.1109/TGRS.2020.2978927 (2020).
Drusch, M. et al. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sens. Environ. 120, 25–36. https://doi.org/10.1016/j.rse.2011.11.026 (2012).
Li, W., Guo, Q., Asner, G. P. & Mao, Q. Sentinel-2 improves mapping of tropical mangrove dynamics: A multiresolution and multitemporal method. Remote Sens. Environ. 259, 112416. https://doi.org/10.1016/j.rse.2021.112416 (2021).
Malakar, N. K. et al. An operational land surface temperature product for landsat thermal data: methodology and validation. IEEE Trans. Geosci. Remote Sens. 56, 5717–5735. https://doi.org/10.1109/tgrs.2018.2824828 (2018).
Ermida, S. L., Soares, P., Mantas, V., Gottsche, F. M. & Trigo, I. F. Google earth engine open-source code for land surface temperature estimation from the landsat series. Remote Sens. 12, 1471. https://doi.org/10.3390/rs12091471 (2020).
Wang, W. et al. Response of mangrove forest to climate change: A modeling study. Ecol. Indic. 103, 327–339. https://doi.org/10.1016/j.ecolind.2019.03.018 (2019).
Du, Y. et al. Water bodies’ extraction from Landsat ETM+ imagery using modified normalized difference water index (MNDWI). Remote Sens. Environ. 152, 184–192. https://doi.org/10.1016/j.rse.2014.06.016 (2016).
Frantz, D., Haß, E., Uhl, A., Stoffels, J. & Hill, J. Improvement of the Fmask algorithm for Sentinel-2 images: separating clouds from bright surfaces based on parallax effects. Remote Sens. Environ. 215, 471–481. https://doi.org/10.1016/j.rse.2018.03.017 (2018).
Myint, S. W. et al. Identifying mangrove species and their surrounding land use and land cover classes using an object-oriented approach with a Lacunarity spatial measure. GISci. Remote Sens. 48, 367–389. https://doi.org/10.2747/1548-1603.48.3.367 (2011).
Gong, P. et al. Fine-resolution mapping of global land cover by integrating multisource geospatial data. Remote Sens. Environ. 228, 203–220. https://doi.org/10.1016/j.rse.2019.04.022 (2019).
Lillesand, T., Kiefer, R. W. & Chipman, J. Remote Sensing and Image Interpretation (John Wiley & Sons, 2015).
Rodriguez-Galiano, V. F., Ghimire, B., Rogan, J., Chica-Olmo, M. & Rigol-Sanchez, J. P. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 67, 93–104. https://doi.org/10.1016/j.isprsjprs.2011.11.002 (2012).
Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 26, 217–222. https://doi.org/10.1080/01431160412331269698 (2005).
Ghosh, A., Sharma, R. & Joshi, P. K. Random forest classification of urban landscape using landsat archive and ancillary data: Combining seasonal maps with decision level fusion. Appl. Geogr. 48, 31–41. https://doi.org/10.1016/j.apgeog.2014.01.003 (2014).
Jiang, Y. et al. High-resolution mangrove forests classification with machine learning using worldview and UAV Hyperspectral Data. Remote Sens. 13, 1529. https://doi.org/10.3390/rs13081529 (2021).
Giri, C. et al. Global mangrove forests distribution, 2000 (Version 1.00). NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/H4J67DW8 (2013).
Al-Huqail, A. A., Islam, Z. & Al-Harbi, H. F. An ML-based ensemble approach for the precision classification of mangroves, trend analysis, and priority reforestation areas in Asir, Saudi Arabia. Sustainability 16, 10355. https://doi.org/10.3390/su162310355 (2024).
Acknowledgements
The authors are grateful to the Deanship of Scientific Research, King Saud University, for funding through the Vice Deanship of Scientific Research Chairs: Chair of Climate Change, Environmental Development and Vegetation Cover.
Funding
This research was funded by the Deanship of Scientific Research, King Saud University, through the Vice Deanship of Scientific Research Chairs: Chair of Climate Change, Environmental Development and Vegetation Cover.
Author information
Contributions
AA: Conceptualization, Data curation, Investigation, Visualization, Writing–review and editing. ZI: Data curation, Formal analysis, Methodology, Writing–original draft, Software, Validation, Writing–review and editing. HA-H: Data curation, Resources, Writing–review and editing. FK: Data curation, Resources, Writing–review and editing.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Ethical approval
All authors have read, understood, and complied as applicable with the statement on “Ethical responsibilities of Authors” as found in the Instructions for Authors.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Al-Huqail, A.A., Islam, Z., Al-Harbi, H.F. et al. AI-driven mangrove mapping on Farasan Islands, Saudi Arabia: enhancing the detection of dispersed patches with ML classifiers. Sci Rep 15, 19285 (2025). https://doi.org/10.1038/s41598-025-03280-5
DOI: https://doi.org/10.1038/s41598-025-03280-5