This study aims to predict next-day PM10 concentration using Bayesian regression models with non-informative
and conjugate priors. A descriptive analysis of PM10, temperature, relative humidity,
nitrogen dioxide (NO2), sulphur dioxide (SO2), carbon monoxide (CO) and ozone (O3) is also included. The case
study used two years of air quality monitoring data from three monitoring stations to predict future PM10
concentration from seven parameters (PM10, temperature, relative humidity, NO2, SO2, CO and O3). The descriptive
analysis showed that the highest mean PM10 concentration occurred at Klang station, in 2011 (71.30 µg/m3) followed
by 2012 (68.82 µg/m3). At Nilai, the mean PM10 concentration was highest in 2012 (68.86 µg/m3), followed by 2011
(66.29 µg/m3). The results showed that the Bayesian regression model with a conjugate normal-gamma prior was a
good model for predicting PM10 concentration at most study stations (R2 = 0.67 at Jerantut station, R2 = 0.61 at
Nilai station and R2 = 0.66 at Klang station), compared with the model with a non-informative prior.
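To make the conjugate-prior approach concrete, the following is a minimal Python sketch of Bayesian linear regression with a normal-gamma prior, assuming the standard textbook parameterisation (coefficients normal given the error variance, precision gamma-distributed). It is not the study's implementation: the data are synthetic, and the function names, prior hyperparameters (`m0`, `V0`, `a0`, `b0`) and predictor columns are illustrative assumptions.

```python
import numpy as np

def normal_gamma_posterior(X, y, m0, V0, a0, b0):
    """Closed-form posterior for y = X @ beta + e, e ~ N(0, sigma^2).

    Assumed prior: beta | sigma^2 ~ N(m0, sigma^2 * V0),
                   1 / sigma^2    ~ Gamma(shape=a0, rate=b0).
    """
    V0_inv = np.linalg.inv(V0)
    Vn_inv = V0_inv + X.T @ X                  # posterior precision (up to sigma^2)
    Vn = np.linalg.inv(Vn_inv)
    mn = Vn @ (V0_inv @ m0 + X.T @ y)          # posterior mean of beta
    an = a0 + len(y) / 2.0
    bn = b0 + 0.5 * (y @ y + m0 @ V0_inv @ m0 - mn @ Vn_inv @ mn)
    return mn, Vn, an, bn

def r_squared(y, y_hat):
    """Coefficient of determination between observations and predictions."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the monitoring data): intercept + two predictors.
rng = np.random.default_rng(0)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([50.0, 5.0, -3.0])
y = X @ beta_true + rng.normal(scale=2.0, size=n)

# Weakly informative hyperparameters, chosen only for illustration.
m0, V0, a0, b0 = np.zeros(p), 100.0 * np.eye(p), 1.0, 1.0
mn, Vn, an, bn = normal_gamma_posterior(X, y, m0, V0, a0, b0)

sigma2_mean = bn / (an - 1.0)   # posterior mean of sigma^2 (inverse-gamma mean)
r2 = r_squared(y, X @ mn)       # in-sample fit using the posterior mean of beta

# For comparison: as the prior becomes flat, the posterior mean approaches
# the ordinary least-squares estimate.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

With a diffuse `V0`, as here, the posterior mean `mn` is close to `beta_ols`; a tighter prior would pull the coefficients towards `m0`.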
One concern of air pollution studies is to compute the concentrations of one or more pollutant species in space and time in relation to independent variables such as emissions into the atmosphere and meteorological factors and parameters. Extreme value theory (EVT) is one of the most significant statistical disciplines developed for the applied sciences and many other fields over the last few decades. This study assesses the fit of the two-parameter Gumbel, two- and three-parameter Weibull, Generalized Extreme Value (GEV), and two- and three-parameter Generalized Pareto (GPD) distributions to the daily maximum PM10 concentrations recorded from 2010 to 2012 in Pasir Gudang, Johor; Bukit Rambai, Melaka; and Nilai, Negeri Sembilan. Parameters for all distributions are estimated using the Method of Moments (MOM) and Maximum Likelihood Estimation (MLE). Six performance indicators are used to assess the goodness of fit of each distribution: three accuracy measures, namely predictive accuracy (PA), coefficient of determination (R2) and Index of Agreement (IA), and three error measures, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Normalized Absolute Error (NAE). The best distribution is the one with the highest accuracy measures and the smallest error measures. The results showed that the GEV distribution gives the best fit to the daily maximum PM10 concentration at all monitoring stations. The analysis also demonstrates that the estimated number of days on which the PM10 concentration exceeded the Malaysian Ambient Air Quality Guidelines (MAAQG) limit of 150 µg/m3 is between ½ and 1½ days.
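The fitting and evaluation steps described above can be illustrated with a minimal Python sketch for the simplest case, a two-parameter Gumbel distribution fitted by the Method of Moments. It is an assumption-laden illustration rather than the study's procedure: the data are synthetic, the function names are hypothetical, fitted quantiles are compared with the sorted observations at plotting positions i/(n+1), and the indicator formulas follow common definitions that may differ in detail from those used in the study.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_mom(data):
    """Method-of-moments estimates (loc, scale) for the Gumbel distribution."""
    scale = statistics.stdev(data) * math.sqrt(6) / math.pi
    loc = statistics.fmean(data) - EULER_GAMMA * scale
    return loc, scale

def gumbel_quantile(p, loc, scale):
    """Inverse CDF of the Gumbel distribution for 0 < p < 1."""
    return loc - scale * math.log(-math.log(p))

def gumbel_exceedance(x, loc, scale):
    """P(X > x) under the fitted Gumbel distribution."""
    return 1.0 - math.exp(-math.exp(-(x - loc) / scale))

def fit_indicators(observed, predicted):
    """Error measures and Index of Agreement (common definitions, assumed)."""
    n = len(observed)
    o_bar = statistics.fmean(observed)
    abs_err = sum(abs(p - o) for o, p in zip(observed, predicted))
    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denom = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2
                for o, p in zip(observed, predicted))
    return {
        "RMSE": math.sqrt(sq_err / n),
        "MAE": abs_err / n,
        "NAE": abs_err / sum(observed),
        "IA": 1.0 - sq_err / denom,
    }

# Synthetic daily maxima (NOT the monitoring data): 1096 days, Gumbel(60, 20).
rng = random.Random(42)
data = [gumbel_quantile(rng.uniform(1e-9, 1 - 1e-9), 60.0, 20.0)
        for _ in range(1096)]

loc, scale = gumbel_mom(data)

# Compare sorted observations with fitted quantiles at positions i / (n + 1).
observed = sorted(data)
n = len(observed)
predicted = [gumbel_quantile(i / (n + 1), loc, scale) for i in range(1, n + 1)]
indicators = fit_indicators(observed, predicted)

# Estimated exceedance of a 150 ug/m3 limit, expressed as days per year.
p_exceed = gumbel_exceedance(150.0, loc, scale)
days_per_year = 365 * p_exceed
```

The same comparison could be repeated for each candidate distribution and estimation method, selecting the one with the highest accuracy measures and the smallest error measures.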