Displaying publications 81 - 88 of 88 in total

  1. Saqib SM, Zubair Asghar M, Iqbal M, Al-Rasheed A, Amir Khan M, Ghadi Y, et al.
    PeerJ Comput Sci, 2024;10:e1995.
    PMID: 38686004 DOI: 10.7717/peerj-cs.1995
    The detection of natural images, such as glaciers and mountains, has practical applications in transportation automation and outdoor activities. Convolutional neural networks (CNNs) have been widely employed for image recognition and classification tasks. While previous studies have focused on fruits, landslides, and medical images, further research is needed on the detection of natural images, particularly glaciers and mountains. To address the limitations of traditional CNNs, such as vanishing gradients and the need for many layers, the proposed work introduces a novel model called DenseHillNet, a CNN architecture with densely connected layers, to accurately classify images as glaciers or mountains. The model contributes to the development of automation technologies in transportation and outdoor activities. The dataset used in this study comprises 3,096 images in each of the "glacier" and "mountain" categories. Rigorous methodology was employed for dataset preparation and model training, ensuring the validity of the results. A comparison with previous work revealed that the proposed DenseHillNet model, trained on both glacier and mountain images, achieved higher accuracy (86%) than a CNN model that used only glacier images (72%). The article is aimed at researchers and graduate students.
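    The abstract does not reproduce the DenseHillNet architecture itself; the sketch below shows only the general idea of a densely connected CNN block feeding a binary glacier/mountain classifier, assuming Keras, with illustrative layer sizes, growth rate, and input shape rather than the published configuration.

    ```python
    # Minimal sketch of a densely connected CNN for binary image classification
    # (glacier vs. mountain). All hyperparameters are illustrative assumptions,
    # not the published DenseHillNet settings.
    from tensorflow.keras import layers, models

    def dense_block(x, num_layers=4, growth_rate=16):
        # Each layer receives the concatenated feature maps of all previous
        # layers, which shortens gradient paths and mitigates vanishing gradients.
        for _ in range(num_layers):
            y = layers.BatchNormalization()(x)
            y = layers.ReLU()(y)
            y = layers.Conv2D(growth_rate, 3, padding="same")(y)
            x = layers.Concatenate()([x, y])
        return x

    inputs = layers.Input(shape=(128, 128, 3))
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = dense_block(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # P(image is a glacier)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    ```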
  2. Al-Shalif SA, Senan N, Saeed F, Ghaban W, Ibrahim N, Aamir M, et al.
    PeerJ Comput Sci, 2024;10:e2084.
    PMID: 38983195 DOI: 10.7717/peerj-cs.2084
    Feature selection (FS) is a critical step in many data science-based applications, especially in text classification, as it involves selecting relevant and important features from an original feature set. This process can improve learning accuracy, shorten learning time, and simplify outcomes. In text classification, there are often many redundant and irrelevant features that impact the performance of the applied classifiers, and various techniques have been suggested to tackle this problem, categorized as traditional techniques and meta-heuristic (MH) techniques. To discover the optimal subset of features, FS processes require a search strategy, and MH techniques use various strategies to strike a balance between exploration and exploitation. The goal of this research article is to systematically analyze the MH techniques used for FS between 2015 and 2022, focusing on 108 primary studies from three databases (Scopus, Science Direct, and Google Scholar) to identify the techniques used, as well as their strengths and weaknesses. The findings indicate that MH techniques are efficient and outperform traditional techniques, with the potential for further exploration of MH techniques such as Ringed Seal Search (RSS) to improve FS in several applications.
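    As an illustration of the exploration/exploitation balance the review discusses, the sketch below performs wrapper-based feature selection with a simple stochastic local search; it is a stand-in for the surveyed meta-heuristics, and the synthetic dataset, classifier, and iteration budget are assumptions.

    ```python
    # Wrapper-based feature selection: evaluate candidate feature subsets with a
    # classifier and keep the best mask found by a simple stochastic search.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                               random_state=0)
    rng = np.random.default_rng(0)

    def fitness(mask):
        # Cross-validated accuracy of the classifier restricted to the subset.
        return cross_val_score(GaussianNB(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

    best_mask = rng.random(X.shape[1]) < 0.5          # random start (exploration)
    best_fit = fitness(best_mask)
    for _ in range(100):
        candidate = best_mask.copy()
        if rng.random() < 0.1:                        # occasional random jump (exploration)
            candidate = rng.random(X.shape[1]) < 0.5
        else:                                         # local move: flip one feature (exploitation)
            flip = rng.integers(X.shape[1])
            candidate[flip] = not candidate[flip]
        fit = fitness(candidate)
        if fit >= best_fit:
            best_mask, best_fit = candidate, fit

    print(f"selected {best_mask.sum()} features, CV accuracy {best_fit:.3f}")
    ```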
  3. Mohd Amin M, Sani NS, Nasrudin MF, Abdullah S, Chhabra A, Abd Kadir F
    PeerJ Comput Sci, 2024;10:e2019.
    PMID: 38983188 DOI: 10.7717/peerj-cs.2019
    With the rapid growth of online property rental and sale platforms, the prevalence of fake real estate listings has become a significant concern. These deceptive listings waste time and effort for buyers and sellers and pose potential risks. Therefore, developing effective methods to distinguish genuine from fake listings is crucial. Accurately identifying fake real estate listings is a critical challenge, and clustering analysis can significantly improve this process. While clustering has been widely used to detect fraud in various fields, its application in the real estate domain has been somewhat limited, primarily focused on auctions and property appraisals. This study aims to fill this gap by using clustering to classify properties into fake and genuine listings based on datasets curated by industry experts. A K-means model was developed to group properties into clusters, clearly distinguishing between fake and genuine listings. To ensure the quality of the training data, pre-processing procedures were performed on the raw dataset. Several techniques were used to determine the optimal value for each parameter of the K-means model. The number of clusters was selected using the Silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index. It was found that two clusters is optimal and that the Canberra distance outperforms the overlapping-similarity and Jaccard measures. The clustering results are assessed using two machine learning algorithms: Random Forest and Decision Tree. The results show that the optimized K-means significantly improves the accuracy of the Random Forest classification model, boosting it by an impressive 96%. Furthermore, this research demonstrates that clustering helps create a balanced dataset containing fake and genuine clusters. This balanced dataset holds promise for future investigations, particularly for deep learning models that require balanced data to perform optimally. This study presents a practical and effective way to identify fake real estate listings by harnessing the power of clustering analysis, ultimately contributing to a more trustworthy and secure real estate market.
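    The cluster-validity step described above can be illustrated briefly: fit K-means for several candidate values of k and compare the Silhouette coefficient, Calinski-Harabasz index, and Davies-Bouldin index. The synthetic blobs below stand in for the curated property-listing dataset, which is not public.

    ```python
    # Compare internal cluster-validity indices across candidate values of k.
    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans
    from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                                 davies_bouldin_score)

    X, _ = make_blobs(n_samples=600, centers=2, n_features=8, random_state=42)

    for k in range(2, 6):
        labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
        print(f"k={k}  silhouette={silhouette_score(X, labels):.3f}  "
              f"calinski-harabasz={calinski_harabasz_score(X, labels):.1f}  "
              f"davies-bouldin={davies_bouldin_score(X, labels):.3f}")
    # Higher Silhouette and Calinski-Harabasz values and a lower Davies-Bouldin
    # value indicate better-separated clusters; the study found two clusters optimal.
    ```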
  4. Al-Azawi RJ, Al-Saidi NMG, Jalab HA, Kahtan H, Ibrahim RW
    PeerJ Comput Sci, 2021;7:e553.
    PMID: 39545145 DOI: 10.7717/peerj-cs.553
    The exponential growth in computer technology throughout the past two decades has facilitated the development of advanced image analysis techniques that aid the field of medical imaging. Computed tomography (CT) is a widely used screening method for obtaining high-resolution images of the human body. CT has proven useful in screening for the virus responsible for the COVID-19 pandemic, allowing physicians to rule out suspected infections based on the appearance of the lungs in the scan. Based on this, we propose an intelligent yet efficient CT scan-based COVID-19 classification algorithm that discriminates negative from positive cases by evaluating the appearance of the lungs. The algorithm comprises four main steps: preprocessing, feature extraction, feature reduction, and classification. In preprocessing, we employ contrast-limited adaptive histogram equalization (CLAHE) to adjust the contrast and enhance the details of the input image. We then apply the q-transform method to extract features from the CT scan; this method measures the grey-level intensity of the pixels, which reflects the features of the image. In the feature reduction step, we compute the mean, skewness, and standard deviation to reduce overhead and improve the efficiency of the algorithm. Finally, k-nearest neighbor, decision tree, and support vector machine classifiers are used to classify the cases. The experimental results show accuracy rates of 98%, 98%, and 98.25% for the three classifiers, respectively. It is therefore concluded that the proposed method is efficient, accurate, and flexible, and we are confident that it can achieve high classification accuracy under different scenarios, making it suitable for implementation in real-world applications.
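    The preprocessing and feature-reduction steps can be sketched as follows, assuming OpenCV and SciPy. The q-transform feature extraction is specific to the paper and is omitted here, and the demo image is random, so this illustrates the shape of the pipeline rather than the published algorithm; the resulting feature vectors would then be passed to the k-nearest neighbor, decision tree, and support vector machine classifiers.

    ```python
    # CLAHE preprocessing followed by reduction of the grey-level distribution
    # to mean, standard deviation, and skewness (the paper's feature reduction).
    import cv2
    import numpy as np
    from scipy.stats import skew

    def extract_features(gray_image):
        # Contrast-limited adaptive histogram equalization enhances local detail.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        enhanced = clahe.apply(gray_image)
        pixels = enhanced.astype(np.float64).ravel()
        return np.array([pixels.mean(), pixels.std(), skew(pixels)])

    # Hypothetical 8-bit CT slice; in practice the loaded scan images are used.
    demo_slice = np.random.default_rng(0).integers(0, 256, (512, 512), dtype=np.uint8)
    print(extract_features(demo_slice))
    ```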
  5. Hussain MZ, Hanapi ZM, Abdullah A, Hussin M, Ninggal MIH
    PeerJ Comput Sci, 2024;10:e2231.
    PMID: 39145209 DOI: 10.7717/peerj-cs.2231
    In a modern digital market flooded with cyber-security hazards, sophisticated intrusion detection systems (IDS) are invaluable for defending against intricate security threats. The Sybil-Free Metric-based routing protocol for low-power and lossy networks (RPL) Trustworthiness Scheme (SF-MRTS) addresses the biggest threat to RPL, the Sybil attack. Sybil attacks pose a significant security challenge in RPL networks, where an attacker can distort multi-hop paths and disrupt network operations. We introduce a new way of calculating node reliability that evaluates parameters beyond conventional routing metrics, such as energy conservation and actuality. SF-MRTS works towards establishing a trusted network by applying these trust metrics to secure path selection, making the network more likely to withstand attacks. Simulations of SF-MRTS show that it meets the security risk management requirements needed to maintain network performance and stability. Its mechanisms are based on game-theoretic principles, rewarding nodes that cooperate and penalising those that do not, which discourages damage to the network and encourages collaboration between nodes. SF-MRTS is thus a security technique for emerging industrial Internet of Things (IoT) networks, effectively guaranteeing reliability and improving network resilience in different scenarios.
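    The abstract does not specify the SF-MRTS trust computation; the toy sketch below only illustrates the game-theoretic reward/penalty idea it describes, with made-up constants, and is not the published metric.

    ```python
    # Toy trust update: cooperating neighbours are rewarded, misbehaving ones
    # penalised, and routing would prefer parents above a trust threshold.
    def update_trust(trust, node, cooperated, reward=0.1, penalty=0.3):
        score = trust.get(node, 0.5)                       # neutral initial trust
        score = min(1.0, score + reward) if cooperated else max(0.0, score - penalty)
        trust[node] = score
        return score

    trust = {}
    for node, behaved in [("A", True), ("B", False), ("A", True), ("B", False)]:
        update_trust(trust, node, behaved)

    preferred_parents = [n for n, s in trust.items() if s >= 0.5]
    print(trust, preferred_parents)
    ```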
  6. Chin SY, Dong J, Hasikin K, Ngui R, Lai KW, Yeoh PSQ, et al.
    PeerJ Comput Sci, 2024;10:e2180.
    PMID: 39145215 DOI: 10.7717/peerj-cs.2180
    BACKGROUND: Bacterial image analysis plays a vital role in various fields, providing valuable information and insights for studying bacterial structural biology, diagnosing and treating infectious diseases caused by pathogenic bacteria, discovering and developing drugs that can combat bacterial infections, etc. As a result, it has prompted efforts to automate bacterial image analysis tasks. By automating analysis tasks and leveraging more advanced computational techniques, such as deep learning (DL) algorithms, bacterial image analysis can contribute to rapid, more accurate, efficient, reliable, and standardised analysis, leading to enhanced understanding, diagnosis, and control of bacterial-related phenomena.

    METHODS: Three DL object-detection networks, namely SSD-MobileNetV2, EfficientDet, and YOLOv4, were developed to automatically detect Escherichia coli (E. coli) bacteria in microscopic images. A multi-task DL framework was developed to classify the bacteria according to their respective growth stages: rod-shaped cells, dividing cells, and microcolonies. Data preprocessing steps, including image augmentation, image annotation, and data splitting, were carried out before training the object detection models. The performance of the DL techniques was evaluated quantitatively using mean average precision (mAP), precision, recall, and F1-score. The performance metrics of the models were compared and analysed, and the best DL model was then selected to perform multi-task object detection, identifying rod-shaped cells, dividing cells, and microcolonies.

    RESULTS: The output of the test images generated from the three proposed DL models displayed high detection accuracy, with YOLOv4 achieving the highest confidence score range of detection and being able to create different coloured bounding boxes for different growth stages of E. coli bacteria. In terms of statistical analysis, among the three proposed models, YOLOv4 demonstrates superior performance, achieving the highest mAP of 98% with the highest precision, recall, and F1-score of 86%, 97%, and 91%, respectively.

    CONCLUSIONS: This study has demonstrated the effectiveness, potential, and applicability of DL approaches in multi-task bacterial image analysis, focusing on automating the detection and classification of bacteria from microscopic images. The proposed models can output images with bounding boxes surrounding each detected E. coli bacteria, labelled with their growth stage and confidence level of detection. All proposed object detection models have achieved promising results, with YOLOv4 outperforming the other models.
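    The METHODS section above evaluates detections with mAP, precision, recall, and F1-score. The sketch below shows how precision, recall, and F1 are computed by matching predicted boxes to ground truth at an IoU threshold; the boxes are made-up examples, and mAP (which additionally averages precision over recall at many score thresholds) is omitted for brevity.

    ```python
    # Greedy IoU matching of predicted boxes (x1, y1, x2, y2) to ground truth,
    # followed by precision, recall, and F1.
    def iou(a, b):
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter) if inter else 0.0

    def detection_scores(preds, truths, iou_thr=0.5):
        matched, tp = set(), 0
        for p in preds:
            best = max(range(len(truths)), key=lambda i: iou(p, truths[i]), default=None)
            if best is not None and best not in matched and iou(p, truths[best]) >= iou_thr:
                matched.add(best)
                tp += 1
        fp, fn = len(preds) - tp, len(truths) - tp
        precision = tp / (tp + fp) if preds else 0.0
        recall = tp / (tp + fn) if truths else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    preds  = [(10, 10, 50, 50), (60, 60, 90, 90)]      # illustrative detections
    truths = [(12, 11, 52, 49), (200, 200, 240, 240)]  # illustrative ground truth
    print(detection_scores(preds, truths))             # -> (0.5, 0.5, 0.5)
    ```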

  7. Malik N, Bilal M
    PeerJ Comput Sci, 2024;10:e2203.
    PMID: 39145232 DOI: 10.7717/peerj-cs.2203
    In recent years, e-commerce platforms have become popular and transformed the way people buy and sell goods. People are rapidly adopting online shopping due to the convenience of purchasing from the comfort of their homes. Online review sites allow customers to share their thoughts on products and services, and customers and businesses increasingly rely on online reviews to assess and improve the quality of products. Existing literature uses natural language processing (NLP) to analyze customer reviews for different applications. Due to the growing importance of NLP for online customer reviews, this study provides a taxonomy of NLP applications based on existing literature. It also examines emerging methods, data sources, and research challenges by reviewing 154 publications from 2013 to 2023 that explore state-of-the-art approaches for diverse applications. Based on existing research, the taxonomy divides the literature into five categories: sentiment analysis and opinion mining, review analysis and management, customer experience and satisfaction, user profiling, and marketing and reputation management. Notably, the majority of existing research relies on Amazon user reviews. Additionally, recent research has encouraged the use of advanced techniques such as bidirectional encoder representations from transformers (BERT), long short-term memory (LSTM), and ensemble classifiers. The rising number of articles published each year indicates researchers' growing interest and the field's continued growth. This survey also addresses open issues and provides future directions for analyzing online customer reviews.
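    As a small illustration of the transformer-based review analysis the survey highlights, the snippet below runs off-the-shelf sentiment classification on a product review; the model is the library default and the review text is invented.

    ```python
    # Off-the-shelf sentiment classification of an online review.
    from transformers import pipeline

    sentiment = pipeline("sentiment-analysis")
    review = "The headphones arrived quickly, but the battery barely lasts two hours."
    print(sentiment(review))   # e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]
    ```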
  8. Xu D, Chan WH, Haron H
    PeerJ Comput Sci, 2024;10:e2217.
    PMID: 39145229 DOI: 10.7717/peerj-cs.2217
    As the pandemic continues to pose challenges to global public health, developing effective predictive models has become an urgent research topic. This study aims to explore the application of multi-objective optimization methods in selecting infectious disease prediction models and evaluate their impact on improving prediction accuracy, generalizability, and computational efficiency. In this study, the NSGA-II algorithm was used to compare models selected by multi-objective optimization with those selected by traditional single-objective optimization. The results indicate that decision tree (DT) and extreme gradient boosting regressor (XGBoost) models selected through multi-objective optimization methods outperform those selected by other methods in terms of accuracy, generalizability, and computational efficiency. Compared to the ridge regression model selected through single-objective optimization methods, the decision tree (DT) and XGBoost models demonstrate significantly lower root mean square error (RMSE) on real datasets. This finding highlights the potential advantages of multi-objective optimization in balancing multiple evaluation metrics. However, this study's limitations suggest future research directions, including algorithm improvements, expanded evaluation metrics, and the use of more diverse datasets. The conclusions of this study emphasize the theoretical and practical significance of multi-objective optimization methods in public health decision support systems, indicating their wide-ranging potential applications in selecting predictive models.
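    The NSGA-II search used in the study is not reproduced here; the sketch below only illustrates the underlying idea of multi-objective model selection by scoring candidate regressors on two objectives (cross-validated RMSE and fit time) and keeping the Pareto-optimal set. The data is synthetic, and GradientBoostingRegressor stands in for XGBoost.

    ```python
    # Score candidate models on two objectives and keep the Pareto-optimal set.
    import time
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import Ridge
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.ensemble import GradientBoostingRegressor  # stand-in for XGBoost

    X, y = make_regression(n_samples=400, n_features=10, noise=5.0, random_state=0)
    candidates = {"ridge": Ridge(),
                  "decision tree": DecisionTreeRegressor(random_state=0),
                  "boosting": GradientBoostingRegressor(random_state=0)}

    scores = {}
    for name, model in candidates.items():
        start = time.perf_counter()
        rmse = -cross_val_score(model, X, y, cv=5,
                                scoring="neg_root_mean_squared_error").mean()
        scores[name] = (rmse, time.perf_counter() - start)   # (RMSE, seconds)

    def dominates(b, a):
        # b dominates a if it is no worse on every objective and better on one.
        return all(bi <= ai for ai, bi in zip(a, b)) and any(bi < ai for ai, bi in zip(a, b))

    pareto = [n for n, s in scores.items()
              if not any(dominates(t, s) for m, t in scores.items() if m != n)]
    print(scores)
    print("Pareto-optimal models:", pareto)
    ```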