Affiliations 

  • 1 Department of Energy and Mineral Resources Engineering, Sejong University, 209 Neudong-ro, Gwangjin-gu, Seoul, 05006, Republic of Korea. [email protected]
  • 2 Department of Energy and Mineral Resources Engineering, Sejong University, 209 Neudong-ro, Gwangjin-gu, Seoul, 05006, Republic of Korea. [email protected]
  • 3 Ministry of Environment, Baghdad, Iraq
  • 4 Department of Geology, Faculty of Sciences, Ibn Zohr University, B.P. 8106, 80000, Agadir, Morocco
  • 5 Geospatial Analysis and Modelling (GAM) Research Laboratory, Department of Civil and Environmental Engineering, Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Perak, Malaysia
  • 6 Engineering Services and Asset Management, John Holland Group, Sydney, NSW, 2150, Australia
  • 7 Department of Energy and Mineral Resources Engineering, Sejong University, 209 Neudong-ro, Gwangjin-gu, Seoul, 05006, Republic of Korea
Environ Sci Pollut Res Int, 2021 Aug;28(32):43544-43566.
PMID: 33834339 DOI: 10.1007/s11356-021-13255-4

Abstract

This study investigates a source of uncertainty in machine learning that can arise when the importance of the independent variables varies widely from one predictor to another, especially when the receiver operating characteristic (ROC) curve fails to reflect the unbalanced effect of the prediction variables. A variable drop-off loop function, based on the concept of early termination for reduction of model capacity, regularization, and generalization control, was tested. A susceptibility index for airborne particulate matter of less than 10 μm in diameter (PM10) was modeled using monthly maximum values; spectral bands and indices from Landsat 8 imagery, together with OpenStreetMap data, were used to prepare a range of independent variables. Probability and classification index maps were prepared using extreme gradient boosting (XGBoost) and random forest (RF) algorithms. These maps were assessed against utility criteria such as overall accuracy from a confusion matrix, number of variables, processing delay, degree of overfitting, importance distribution, and area under the ROC curve.
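
The abstract does not spell out the variable drop-off loop, so the following is only a minimal sketch of one plausible reading: refit the model, drop the least important variable, and terminate early once the validation score falls noticeably below the best score seen. The function name `variable_drop_off`, the parameters `min_vars` and `tol`, the variable names, and the toy scorer are all assumptions made for illustration. In practice `fit_and_score` would wrap an RF or XGBoost fit and return its feature importances and a held-out score.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: given the kept variables, return
# (importance per variable, validation score such as area under the ROC curve).
FitFn = Callable[[List[str]], Tuple[Dict[str, float], float]]


def variable_drop_off(
    variables: List[str],
    fit_and_score: FitFn,
    min_vars: int = 2,
    tol: float = 0.01,
) -> Tuple[List[str], List[Tuple[int, float]]]:
    """Drop the least important variable each round; stop early when the
    score falls more than `tol` below the best score seen so far."""
    keep = list(variables)
    best_score = float("-inf")
    best_keep = list(keep)
    history: List[Tuple[int, float]] = []  # (number of variables, score)
    while True:
        importances, score = fit_and_score(keep)
        history.append((len(keep), score))
        if score >= best_score:
            # Ties favour the smaller subset, since later rounds have fewer variables.
            best_score, best_keep = score, list(keep)
        elif best_score - score > tol:
            break  # early termination: dropping more variables now hurts
        if len(keep) <= min_vars:
            break
        weakest = min(keep, key=lambda v: importances[v])
        keep.remove(weakest)
    return best_keep, history


# Toy stand-in for a real model (an assumption, not the study's data):
# each variable carries a fixed signal, and every extra variable costs a
# small penalty that mimics overfitting.
SIGNAL = {"ndvi": 0.20, "band4": 0.15, "road_density": 0.10,
          "noise_a": 0.0, "noise_b": 0.0}


def toy_fit_and_score(keep: List[str]) -> Tuple[Dict[str, float], float]:
    importances = {v: SIGNAL[v] for v in keep}
    score = 0.5 + sum(SIGNAL[v] for v in keep) - 0.01 * len(keep)
    return importances, score


selected, history = variable_drop_off(list(SIGNAL), toy_fit_and_score)
print(selected)  # ['ndvi', 'band4', 'road_density']
for n_vars, score in history:
    print(n_vars, round(score, 2))
```

In this toy run the two uninformative variables are dropped first, the score improves, and the loop stops as soon as removing an informative variable causes a clear decline. Recording the history also makes it possible to compare the number of variables against the score, in the spirit of the utility criteria listed above.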
