Air pollution events can be categorized as extreme or non-extreme according to their severity. High-risk extreme air pollution events can have disastrous effects on the environment, so public health and policy-making authorities must be able to determine the characteristics of these events. To address this need, this study proposes a probabilistic machine learning technique that classifies events as extreme or non-extreme on the basis of data features. The naïve Bayes model is proposed for predicting air pollution classes because of its simplicity, high accuracy, and efficiency. A case study was conducted on the air pollution index data of Klang, Malaysia, for the period from January 1, 1997, to August 31, 2020. The trained naïve Bayes model achieved high accuracy, sensitivity, and specificity on both the training and test datasets. The naïve Bayes model can therefore be readily applied in air pollution analysis and offers a promising approach for the accurate and efficient prediction of extreme and non-extreme air pollution events. The findings of this study provide reliable information to public authorities for monitoring and managing air quality sustainably over time.
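The classification step described above can be illustrated with a minimal sketch of a Gaussian naïve Bayes classifier, evaluated with accuracy, sensitivity, and specificity. This is not the study's implementation: the choice of the Gaussian variant, the two features (event duration and peak index value), the class proportions, and the synthetic data generator are all assumptions made only for illustration, and no real Klang data are used.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature mean/variance for each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(model, X, y, positive=1):
    """Compute accuracy, sensitivity, and specificity for a binary problem."""
    tp = tn = fp = fn = 0
    for row, label in zip(X, y):
        pred = predict(model, row)
        if label == positive:
            tp += pred == positive
            fn += pred != positive
        else:
            tn += pred != positive
            fp += pred == positive
    return {
        "accuracy": (tp + tn) / len(y),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def make_synthetic_events(n, rng):
    """Hypothetical data: [duration in hours, peak index]; 1 = extreme event."""
    X, y = [], []
    for _ in range(n):
        extreme = 1 if rng.random() < 0.2 else 0
        if extreme:
            row = [rng.gauss(30, 8), rng.gauss(180, 25)]
        else:
            row = [rng.gauss(8, 4), rng.gauss(110, 8)]
        X.append(row)
        y.append(extreme)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_events(400, rng)
X_test, y_test = make_synthetic_events(100, rng)

model = fit_gaussian_nb(X_train, y_train)
train_metrics = evaluate(model, X_train, y_train)
test_metrics = evaluate(model, X_test, y_test)
print("train:", train_metrics)
print("test:", test_metrics)
```

Sensitivity here is the share of extreme events that are correctly flagged, and specificity is the share of non-extreme events that are correctly left unflagged. Reporting both alongside accuracy matters when extreme events are much rarer than non-extreme ones.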