METHODS: This work identified the climatic factors that contribute most to dengue outbreaks. Correlation analyses were performed to determine these factors, which were then used as input parameters for machine learning models. The top five machine learning classification models (Bayes network (BN), support vector machine (SVM), RBF tree, decision table, and naive Bayes) were chosen based on past research. The models were then tested and evaluated on 4 years of data (January 2010 to December 2013) collected in Malaysia.
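As an illustration of the correlation screening step described above, the following minimal sketch ranks candidate climatic factors by their Pearson correlation with case counts and keeps those above a cutoff. The factor names, the monthly values, and the 0.5 threshold are all hypothetical and chosen only for demonstration; the study's actual data and selection criteria are not given in this abstract.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def select_factors(factors, target, threshold=0.5):
    """Keep factors whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(values, target) for name, values in factors.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


# Hypothetical monthly observations, invented purely for illustration.
cases = [12, 18, 25, 40, 38, 22]
factors = {
    "rainfall_mm": [80, 120, 160, 250, 240, 140],
    "mean_temp_c": [27.0, 27.5, 28.1, 29.0, 28.8, 27.9],
    "wind_speed": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
}

# Factors passing the (assumed) threshold would become classifier inputs.
selected = select_factors(factors, cases)
print(sorted(selected))
```

In this toy data, rainfall and temperature track the case counts closely while wind speed does not, so only the first two would be passed on as model inputs.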
RESULTS: This research has two major contributions. A new risk factor, called the TempeRain factor (TRF), was identified and used as an input parameter for the model of dengue outbreak prediction. Moreover, TRF was applied to demonstrate its strong impact on dengue outbreaks. Experimental results showed that the Bayes Network model with the new meteorological risk factor identified in this study increased accuracy to 92.35% for predicting dengue outbreaks.
CONCLUSIONS: This research explored the factors used in dengue outbreak prediction systems. The major contribution of this study is the identification of new significant factors that contribute to dengue outbreak prediction. The evaluation results showed a significant improvement in the accuracy of a machine learning model for dengue outbreak prediction.
SUBJECTS: Patients who were admitted to the University of Malaya Medical Centre due to cardiac events.
METHODS: Eight different machine learning models were evaluated. Each model was built on three different feature sets: the full feature set; the significant features from multiple logistic regression; and the features selected by a recursive feature extraction technique. The performance of the prediction models with each feature set was compared.
RESULTS: The AdaBoost model with the top 20 features obtained the highest performance score of 92.4% (area under the curve; AUC) compared with other prediction models.
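For readers unfamiliar with the AUC metric reported above, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are hypothetical and are not taken from the study; `auc` is an illustrative helper, not the evaluation code used by the authors.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = returned to work) and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]
result = auc(labels, scores)
print(round(result, 3))
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small illustration; rank-based formulations give the same value more efficiently on large datasets.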
CONCLUSION: The findings showed the potential of using machine learning models to predict return to work after cardiac rehabilitation.
METHODS: A systematic literature search was conducted for studies whose primary aim was to use online social networks (OSN) to detect and track a pandemic. Eligible English-language articles published between 2004 and 2015 were searched electronically in PubMed, IEEE Xplore, ACM Digital Library, Google Scholar, and Web of Science. First, the articles were screened on the basis of titles and abstracts. Second, the full texts were reviewed. All included studies were subjected to quality assessment.
RESULT: OSNs contain rich information that can be used to develop a near real-time pandemic surveillance system. The outcomes of OSN surveillance systems have demonstrated high correlations with the findings of official surveillance systems. However, the main limitation in using OSN to track a pandemic lies in collecting representative data with sufficient population coverage. This challenge is related to the characteristics of OSN data: the data are dynamic, large, and unstructured, and thus require advanced algorithms and computational linguistics.
CONCLUSIONS: OSN data contain significant information that can be used to track a pandemic. Unlike traditional surveys and clinical reports, for which data collection is time consuming and costly, OSN data can be collected almost in real time at a lower cost. Additionally, the geographical and temporal information supports exploratory analysis of the spatiotemporal dynamics of infectious disease spread. However, an OSN-based surveillance system requires comprehensive adoption, an enhanced geographical identification system, and advanced algorithms and computational linguistics to overcome its limitations and challenges. OSN will probably never replace traditional surveillance, but it can offer complementary data that work best when integrated with traditional data.
METHOD: Motivated by the gap in the literature, this paper proposes an algorithm that measures the strength of the significant features that contribute to heart disease prediction. The study aims to predict heart disease based on the scores of significant features using Weighted Associative Rule Mining.
RESULTS: A set of important feature scores and rules was identified for diagnosing heart disease, and cardiologists were consulted to confirm the validity of these rules. The experiments, performed on the UCI open dataset that is widely used for heart disease research, yielded the highest confidence score of 98% in predicting heart disease.
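To make the notion of a rule's confidence score concrete, the sketch below shows one simple way to weight association rules: each record is weighted by the mean weight of the items it contains, and support and confidence are computed over those weights. This weighting scheme, the item names, and the weight values are assumptions made for illustration; they are not the strength-score algorithm or the data used in the study.

```python
def transaction_weight(transaction, item_weights):
    """Weight of a record: mean weight of the items it contains (assumed scheme)."""
    return sum(item_weights[item] for item in transaction) / len(transaction)


def weighted_support(itemset, transactions, item_weights):
    """Share of total record weight held by records containing the itemset."""
    total = sum(transaction_weight(t, item_weights) for t in transactions)
    matched = sum(
        transaction_weight(t, item_weights) for t in transactions if itemset <= t
    )
    return matched / total


def weighted_confidence(antecedent, consequent, transactions, item_weights):
    """Confidence of the rule antecedent -> consequent under weighted support."""
    both = weighted_support(antecedent | consequent, transactions, item_weights)
    return both / weighted_support(antecedent, transactions, item_weights)


# Hypothetical feature weights and patient records, invented for illustration.
item_weights = {"chest_pain": 0.8, "high_bp": 0.6, "smoker": 0.4, "disease": 1.0}
records = [
    {"chest_pain", "high_bp", "disease"},
    {"chest_pain", "disease"},
    {"chest_pain", "smoker"},
    {"high_bp", "smoker"},
]

confidence = weighted_confidence({"chest_pain"}, {"disease"}, records, item_weights)
print(round(confidence, 3))
```

With all item weights set to the same value, this reduces to the ordinary unweighted confidence, which makes the effect of the weights easy to check.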
CONCLUSION: This study makes a significant contribution by computing strength scores for the significant predictors in heart disease prediction. From the evaluation results, we obtained important rules and achieved the highest confidence score by applying the computed strength scores of significant predictors to Weighted Associative Rule Mining for predicting heart disease.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11192-021-04046-2.