  1. Haw YH, Lai KW, Chuah JH, Bejo SK, Husin NA, Hum YC, et al.
    PeerJ Comput Sci, 2023;9:e1325.
    PMID: 37346512 DOI: 10.7717/peerj-cs.1325
    Oil palm is a key agricultural resource in Malaysia. However, palm disease, most prominently basal stem rot, causes at least RM 255 million in annual economic losses. Basal stem rot is caused by the fungus Ganoderma boninense. An infected tree shows few symptoms during the early stage of infection, yet it can suffer an 80% lifetime yield loss and may die within two years. Early detection of basal stem rot is therefore crucial so that disease control efforts can be undertaken. Laboratory BSR detection methods are effective, but they raise accuracy, biosafety, and cost concerns. This review covers scientific articles related to oil palm tree disease, basal stem rot, Ganoderma boninense, remote sensors, and deep learning indexed in the Web of Science since 2012. About 110 scientific articles related to these index terms were found, of which 60 research articles were relevant to the objective of this research and are therefore included in this review. The review found that the potential of deep learning methods has rarely been explored. Some studies reported unsatisfactory results owing to dataset limitations. However, based on studies of other plant diseases, deep learning combined with data augmentation techniques shows great potential, achieving remarkable detection accuracy. Therefore, the feasibility of analyzing oil palm remote sensor data using deep learning models together with data augmentation techniques should be studied. On a commercial scale, deep learning used together with remote sensors and unmanned aerial vehicle technologies shows great potential for the detection of basal stem rot disease.
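    As a hedged illustration of the data augmentation the review recommends pairing with deep learning, the following numpy-only sketch generates flipped, rotated, and noise-perturbed variants of a remote-sensing patch; the patch shape and noise level are illustrative assumptions, not values taken from the reviewed studies.

        import numpy as np

        def augment(image, noise_std=0.01, seed=0):
            """Return simple augmented variants of an (H, W, C) image array."""
            rng = np.random.default_rng(seed)
            return [
                np.flip(image, axis=1),             # horizontal flip
                np.flip(image, axis=0),             # vertical flip
                np.rot90(image, k=1, axes=(0, 1)),  # 90-degree rotation
                np.clip(image + rng.normal(0.0, noise_std, image.shape), 0.0, 1.0),  # sensor noise
            ]

        # Example: a dummy 64x64 patch with 4 spectral bands, values in [0, 1].
        patch = np.random.default_rng(1).random((64, 64, 4))
        variants = augment(patch)
        print(len(variants), variants[0].shape)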
  2. Nawaz NA, Ishaq K, Farooq U, Khalil A, Rasheed S, Abid A, et al.
    PeerJ Comput Sci, 2023;9:e1143.
    PMID: 37346522 DOI: 10.7717/peerj-cs.1143
    The term "cyber threats" refers to the new category of hazards that have emerged with the rapid development and widespread use of computing technologies, as well as our growing reliance on them. This article presents an in-depth study of a variety of security and privacy threats directed at different types of users of social media sites. Furthermore, it focuses on different risks while sharing multimedia content across social networking platforms, and discusses relevant prevention measures and techniques. It also shares methods, tools, and mechanisms for safer usage of online social media platforms, which have been categorized based on their providers including commercial, open source, and academic solutions.
  3. Alansari Z, Anuar NB, Kamsin A, Belgaum MR
    PeerJ Comput Sci, 2023;9:e1309.
    PMID: 37346586 DOI: 10.7717/peerj-cs.1309
    Routing protocols transmit vast amounts of sensor data between the Wireless Sensor Network (WSN) and the Internet of Things (IoT) gateway. One of these routing protocols is the Routing Protocol for Low Power and Lossy Networks (RPL). The Internet Engineering Task Force (IETF) defined RPL in March 2012 as a de facto distance-vector routing protocol for low-energy wireless communications. Although RPL messages use a cryptographic algorithm for security protection, this does not prevent internal attacks. These attacks drop some or all packets, as in blackhole or selective forwarding attacks, or alter data packets, as in grayhole attacks. The RPL protocol needs to be strengthened to address this issue, as only a limited number of studies have been conducted on detecting internal attacks. Moreover, earlier research has rarely considered the mobility framework, a vital feature of the IoT. This article presents a novel lightweight system for anomaly detection of grayhole, blackhole, and selective forwarding attacks. The study uses a trust model in the RPL protocol, considering attack detection under mobility frameworks. The proposed system, anomaly detection of three RPL attacks (RPLAD3), is designed in four layers and starts operating immediately after the initial state of the network. The experiments demonstrated that RPLAD3 outperforms the standard RPL protocol, defeating attacks with high accuracy and a high true positive ratio while lowering power and energy consumption. In addition, it significantly improves the packet delivery ratio and decreases the false positive ratio to zero.
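    RPLAD3 itself is not reproduced here; as a minimal sketch of the underlying idea of flagging nodes that drop some or all forwarded packets, the Python snippet below computes a per-node forwarding ratio from hypothetical traffic counters and flags nodes below an assumed trust threshold.

        # Minimal sketch: flag packet-dropping nodes by their forwarding ratio (hypothetical counters).
        def flag_suspect_nodes(stats, threshold=0.8):
            """stats maps node_id -> (packets_received, packets_forwarded)."""
            suspects = []
            for node_id, (received, forwarded) in stats.items():
                ratio = forwarded / received if received else 1.0
                if ratio < threshold:   # blackhole ~ 0, grayhole/selective forwarding in between
                    suspects.append((node_id, round(ratio, 2)))
            return suspects

        observed = {"n1": (120, 118), "n2": (95, 0), "n3": (80, 47)}
        print(flag_suspect_nodes(observed))   # [('n2', 0.0), ('n3', 0.59)]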
  4. Xie D, Yin C
    PeerJ Comput Sci, 2023;9:e1330.
    PMID: 37346562 DOI: 10.7717/peerj-cs.1330
    With the growth of the Internet and the digital information age, image retrieval has emerged as a popular research area in the dissemination of China's cultural digital images and in creative production. This study takes the shadow play imagery of Shaanxi culture as its research object, proposes a shadow image retrieval model based on CBAM-ResNet50, and implements it in an IoT system to achieve more effective retrieval of deep-level cultural information. First, ResNet50 is paired with an attention mechanism to enhance the network's capacity to extract high-level semantic features. Second, the IoT system's image acquisition, processing, and output modules are configured. The image processing module incorporates the CBAM-ResNet50 network to provide intelligent and effective shadow play image retrieval. The experimental results show that shadow play images can be retrieved on a GPU at the millisecond level. Both the first image and the first six images can be retrieved accurately, with a retrieval accuracy of 92.5 percent for the first image. This effectively communicates Chinese culture and makes it possible to retrieve detailed shadow play images.
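    The published CBAM-ResNet50 model and its weights are not reproduced here; under assumptions about layer sizes, the PyTorch sketch below shows how a CBAM-style channel-attention block could be applied to ResNet50 feature maps before they are used as retrieval descriptors.

        import torch
        import torch.nn as nn
        from torchvision import models

        class ChannelAttention(nn.Module):
            """CBAM-style channel attention: a shared MLP over average- and max-pooled features."""
            def __init__(self, channels, reduction=16):
                super().__init__()
                self.mlp = nn.Sequential(
                    nn.Linear(channels, channels // reduction),
                    nn.ReLU(inplace=True),
                    nn.Linear(channels // reduction, channels),
                )

            def forward(self, x):
                avg = self.mlp(x.mean(dim=(2, 3)))
                mx = self.mlp(x.amax(dim=(2, 3)))
                weights = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
                return x * weights

        backbone = nn.Sequential(*list(models.resnet50(weights=None).children())[:-2])  # conv layers only
        attention = ChannelAttention(2048)
        features = attention(backbone(torch.rand(1, 3, 224, 224)))   # (1, 2048, 7, 7) descriptor map
        print(features.shape)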
  5. Chen Y, Mustafa H, Zhang X, Liu J
    PeerJ Comput Sci, 2023;9:e1231.
    PMID: 37346728 DOI: 10.7717/peerj-cs.1231
    Constrained by new technologies, traditional financial accounting is unable to keep pace with market development. To make financial big data generate business value and improve the information application level of financial management, and to address the high error rate of current financial data classification systems, this article adopts a fuzzy clustering algorithm to classify financial data automatically and a local outlier factor algorithm with neighborhood relation (NLOF) to detect abnormal data. In addition, a financial data management platform based on a distributed Hadoop architecture is designed, which combines the MapReduce framework with the fuzzy clustering and local outlier factor (LOF) algorithms and uses MapReduce to run the two algorithms in parallel, thus improving both the performance and the accuracy of the algorithms and helping to improve the operational efficiency of enterprise financial data processing. Comparative experimental results show that the proposed platform achieves the best running efficiency and financial data classification accuracy among the compared methods, illustrating its effectiveness and superiority.
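    The NLOF variant described in the abstract is not specified here; as a minimal sketch under that caveat, the snippet below uses scikit-learn's standard LocalOutlierFactor to flag anomalous financial records, with synthetic amounts standing in for real ledger entries.

        import numpy as np
        from sklearn.neighbors import LocalOutlierFactor

        rng = np.random.default_rng(0)
        normal = rng.normal(loc=[1000.0, 0.2], scale=[50.0, 0.02], size=(200, 2))  # amount, expense ratio
        anomalies = np.array([[5000.0, 0.9], [10.0, 0.01]])                        # implausible records
        records = np.vstack([normal, anomalies])

        lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)
        labels = lof.fit_predict(records)     # -1 marks outliers, 1 marks inliers
        print(np.where(labels == -1)[0])      # indices of the flagged records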
  6. Abbas Q, Hina S, Sajjad H, Zaidi KS, Akbar R
    PeerJ Comput Sci, 2023;9:e1552.
    PMID: 37705624 DOI: 10.7717/peerj-cs.1552
    Network intrusion is one of the main threats to organizational networks and systems. Its timely detection is a profound challenge for the security of networks and systems. The situation is even more challenging for small and medium enterprises (SMEs) in developing countries, where limited resources and limited investment in deploying foreign security controls and developing indigenous security solutions are big hurdles. A robust yet cost-effective network intrusion detection system is required to secure traditional and Internet of Things (IoT) networks against such escalating security challenges in SMEs. In the present research, a novel hybrid ensemble model using the random forest-recursive feature elimination (RF-RFE) method is proposed to increase the predictive performance of an intrusion detection system (IDS). Compared to the deep learning paradigm, the proposed machine learning ensemble method can yield state-of-the-art results with lower computational cost and less training time. The evaluation of the proposed ensemble machine learning model shows 99%, 98.53% and 99.9% overall accuracy for the NSL-KDD, UNSW-NB15 and CSE-CIC-IDS2018 datasets, respectively. The results show that the proposed ensemble method successfully optimizes the performance of intrusion detection systems. The outcome of the research is significant and contributes to the performance efficiency of intrusion detection systems and to developing secure systems and applications.
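    A minimal scikit-learn sketch of the RF-RFE idea, with synthetic data standing in for the NSL-KDD, UNSW-NB15, and CSE-CIC-IDS2018 datasets; the feature counts and parameters below are illustrative assumptions.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import RFE
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=2000, n_features=40, n_informative=10, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

        # Recursive feature elimination driven by a random forest's feature importances.
        selector = RFE(RandomForestClassifier(n_estimators=100, random_state=0), n_features_to_select=10)
        selector.fit(X_train, y_train)

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(selector.transform(X_train), y_train)
        print(accuracy_score(y_test, clf.predict(selector.transform(X_test))))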
  7. Al-Ghuribi S, Mohd Noah SA, Mohammed M
    PeerJ Comput Sci, 2023;9:e1525.
    PMID: 37705634 DOI: 10.7717/peerj-cs.1525
    Collaborative filtering (CF) approaches generate user recommendations based on user similarities. These similarities are calculated based on the overall (explicit) user ratings. However, in some domains, such ratings may be sparse or unavailable. User reviews can play a significant role in such cases, as implicit ratings can be derived from the reviews using sentiment analysis, a natural language processing technique. However, most current studies calculate the implicit ratings by simply aggregating the scores of all sentiment words appearing in reviews and, thus, ignoring the elements of sentiment degrees and aspects of user reviews. This study addresses this issue by calculating the implicit rating differently, leveraging the rich information in user reviews by using both sentiment words and aspect-sentiment word pairs to enhance the CF performance. It proposes four methods to calculate the implicit ratings on large-scale datasets: the first considers the degree of sentiment words, while the second exploits the aspects by extracting aspect-sentiment word pairs to calculate the implicit ratings. The remaining two methods combine explicit ratings with the implicit ratings generated by the first two methods. The generated ratings are then incorporated into different CF rating prediction algorithms to evaluate their effectiveness in enhancing the CF performance. Evaluative experiments of the proposed methods are conducted on two large-scale datasets: Amazon and Yelp. Results of the experiments show that the proposed ratings improved the accuracy of CF rating prediction algorithms and outperformed the explicit ratings in terms of three predictive accuracy metrics.
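    A hedged sketch of the general idea of deriving an implicit rating from review sentiment; the toy lexicons, degree weights, and 1-5 rating scale below are illustrative assumptions rather than the article's exact method.

        # Toy lexicons; a real system would use full sentiment and degree-word lexicons.
        SENTIMENT = {"great": 1.0, "good": 0.6, "poor": -0.6, "terrible": -1.0}
        DEGREE = {"very": 1.5, "slightly": 0.5, "extremely": 2.0}

        def implicit_rating(review, scale=(1, 5)):
            """Map summed degree-weighted sentiment scores onto a rating scale."""
            score, weight = 0.0, 1.0
            for tok in review.lower().split():
                if tok in DEGREE:
                    weight = DEGREE[tok]
                elif tok in SENTIMENT:
                    score += weight * SENTIMENT[tok]
                    weight = 1.0
            lo, hi = scale
            midpoint = (lo + hi) / 2
            return max(lo, min(hi, midpoint + score * (hi - lo) / 4))

        print(implicit_rating("very good camera but terrible battery"))   # about 2.9, a middling rating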
  8. Su C, Wei J, Lei Y, Li J
    PeerJ Comput Sci, 2023;9:e1496.
    PMID: 37705669 DOI: 10.7717/peerj-cs.1496
    The rise of targeted advertising has led to frequent privacy data leaks, as advertisers are reluctant to share information to safeguard their interests. This has resulted in isolated data islands and model heterogeneity challenges. To address these issues, we have proposed a C-means clustering algorithm based on maximum average difference to improve the evaluation of the difference in distribution between local and global parameters. Additionally, we have introduced an innovative dynamic selection algorithm that leverages knowledge distillation and weight correction to reduce the impact of model heterogeneity. Our framework was tested on various datasets and its performance was evaluated using accuracy, loss, and AUC (area under the ROC curve) metrics. Results showed that the framework outperformed other models in terms of higher accuracy, lower loss, and better AUC while requiring the same computation time. Our research aims to provide a more reliable, controllable, and secure data sharing framework to enhance the efficiency and accuracy of targeted advertising.
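    The article's dynamic selection algorithm and clustering step are not reproduced here; as a minimal sketch of the knowledge-distillation component it builds on, the PyTorch snippet below computes a temperature-scaled distillation loss that blends soft teacher targets with hard labels; the temperature and mixing weight are assumed values.

        import torch
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
            """Blend hard-label cross-entropy with soft-label KL divergence from the teacher."""
            soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
            log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
            kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
            ce = F.cross_entropy(student_logits, labels)
            return alpha * kd + (1 - alpha) * ce

        student = torch.randn(8, 3, requires_grad=True)   # student logits for 8 samples, 3 classes
        teacher = torch.randn(8, 3)                       # teacher logits for the same samples
        labels = torch.randint(0, 3, (8,))
        print(distillation_loss(student, teacher, labels))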
  9. Magdy Mohamed Abdelaziz Barakat S, Sallehuddin R, Yuhaniz SS, R Khairuddin RF, Mahmood Y
    PeerJ Comput Sci, 2023;9:e1180.
    PMID: 37547391 DOI: 10.7717/peerj-cs.1180
    BACKGROUND: Advances in sequencing technology have increased the number of genomes being sequenced. However, obtaining a quality genome sequence remains a challenge for genome assembly, which must assemble a massive number of short strings (reads) in the presence of repetitive sequences (repeats). Computer algorithms for genome assembly construct the entire genome from reads using two approaches. The de novo approach concatenates the reads based on exact matches between their suffixes and prefixes (overlapping). The reference-guided approach orders the reads based on their offsets in a well-known reference genome (read alignment). The presence of repeats adds technical ambiguity, leaving the algorithm unable to distinguish the reads, which results in misassembly and affects the accuracy of either approach. At the same time, the massive number of reads poses a major assembly performance challenge.

    METHOD: Repeat identification methods address misassembly through prior identification of repetitive sequences, creating a repeat knowledge base that reduces ambiguity during the assembly process and thus enhances the accuracy of the assembled genome. Hybridization between the assembly approaches also yields a lower misassembly rate with the aid of the reference genome, while assembly performance is optimized through data structure indexing and parallelization. The primary aim and contribution of this article is to support researchers through an extensive review that eases the search for genome assembly studies. The study also highlights the most recent developments and limitations in genome assembly accuracy and performance optimization.

    RESULTS: Our findings show the limitations of the available repeat identification methods, which only allow the detection of repeats of specific lengths and may not perform well when various types of repeats are present in a genome. We also found that most hybrid assembly approaches, whether starting with de novo or reference-guided assembly, have limitations in handling repetitive sequences, as doing so is more computationally costly and time intensive. Although the hybrid approach was found to outperform the individual assembly approaches, optimizing its performance remains a challenge. In addition, parallelization of overlapping and read alignment has yet to be fully implemented in the hybrid assembly approach.

    CONCLUSION: We suggest combining multiple repeat identification methods to enhance the accuracy of identifying the repeats as an initial step to the hybrid assembly approach and combining genome indexing with parallelization for better optimization of its performance.
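    As a hedged illustration of the suffix-prefix overlap step that de novo assemblers rely on, and that repeats make ambiguous, the Python sketch below finds the longest exact overlap between two reads; real assemblers use indexed data structures rather than this simple scan.

        def longest_overlap(read_a, read_b, min_len=3):
            """Length of the longest suffix of read_a that exactly matches a prefix of read_b."""
            for length in range(min(len(read_a), len(read_b)), min_len - 1, -1):
                if read_a.endswith(read_b[:length]):
                    return length
            return 0

        # A short repeat ("ATAT") makes both extensions look equally valid, illustrating assembly ambiguity.
        print(longest_overlap("GGCATAT", "ATATCCG"))   # 4
        print(longest_overlap("GGCATAT", "ATATTTG"))   # 4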

  10. Awan MJ, Mohd Rahim MS, Salim N, Nobanee H, Asif AA, Attiq MO
    PeerJ Comput Sci, 2023;9:e1483.
    PMID: 37547408 DOI: 10.7717/peerj-cs.1483
    Anterior cruciate ligament (ACL) tears are a common knee injury that can have serious consequences and require medical intervention. Magnetic resonance imaging (MRI) is the preferred method for ACL tear diagnosis. However, manual segmentation of the ACL in MRI images is prone to human error and can be time-consuming. This study presents a new approach that uses a deep learning technique for localizing the ACL tear region in MRI images. The proposed multi-scale guided attention-based context aggregation (MGACA) method applies attention mechanisms at different scales within the DeepLabv3+ architecture to aggregate context information and achieve enhanced localization results. The model was trained and evaluated on a dataset of 917 knee MRI images, comprising 15,265 slices, and obtained state-of-the-art results with an accuracy of 98.63%, intersection over union (IOU) of 95.39%, Dice coefficient score (DCS) of 97.64%, recall of 97.5%, precision of 98.21%, and F1 score of 97.86% on the validation set. Moreover, our method performed well in terms of loss values, with binary cross entropy combined with Dice loss (BCE_Dice_loss) and Dice_loss values of 0.0564 and 0.0236, respectively, on the validation set. The findings suggest that MGACA provides an accurate and efficient solution for automating the localization of the ACL in knee MRI images, surpassing other state-of-the-art models in terms of accuracy and loss values. However, further research is needed to improve the robustness of the approach and to assess its performance on larger datasets.
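    The MGACA architecture itself is not reproduced here; as a minimal sketch of the combined loss the abstract reports (binary cross-entropy plus Dice loss), the PyTorch snippet below computes both terms for a predicted segmentation mask; the smoothing constant is an assumption.

        import torch
        import torch.nn.functional as F

        def bce_dice_loss(pred_logits, target_mask, smooth=1.0):
            """Binary cross-entropy combined with Dice loss for a segmentation mask."""
            bce = F.binary_cross_entropy_with_logits(pred_logits, target_mask)
            probs = torch.sigmoid(pred_logits)
            intersection = (probs * target_mask).sum()
            dice = 1 - (2 * intersection + smooth) / (probs.sum() + target_mask.sum() + smooth)
            return bce + dice

        pred = torch.randn(1, 1, 64, 64)                    # raw logits from a segmentation model
        target = (torch.rand(1, 1, 64, 64) > 0.7).float()   # stand-in ground-truth mask
        print(bce_dice_loss(pred, target))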
  11. Altarturi HHM, Saadoon M, Anuar NB
    PeerJ Comput Sci, 2023;9:e1459.
    PMID: 37547394 DOI: 10.7717/peerj-cs.1459
    An immense volume of digital documents exists online and offline with content that can offer useful information and insights. Topic modeling enhances the analysis and understanding of digital documents by discovering latent semantic structures, or topics, within a set of digital textual documents. Internet of Things, blockchain, recommender system, and search engine optimization applications use topic modeling to handle data mining tasks such as classification and clustering. The usefulness of topic models depends on the quality of the resulting term patterns and topics, and topic coherence is the standard metric for measuring that quality. Previous studies generally build topic models for conventional documents; these models are insufficient and underperform when applied to web content because of structural differences between conventional and HTML documents. Neglecting the unique structure of web content leads to missing otherwise coherent topics and, therefore, low topic quality. This study proposes an innovative topic model for learning coherent topics from web content data. We present the HTML Topic Model (HTM), a web content topic model that takes HTML tags into consideration to understand the structure of web pages. We conducted two series of experiments to demonstrate the limitations of existing topic models and to examine the topic coherence of the HTM against the widely used Latent Dirichlet Allocation (LDA) model and its variants, namely the Correlated Topic Model, Dirichlet Multinomial Regression, the Hierarchical Dirichlet Process, Hierarchical Latent Dirichlet Allocation, the pseudo-document based Topic Model, and Supervised Latent Dirichlet Allocation. The first experiment demonstrates the limitations of existing topic models when applied to web content data and, therefore, the essential need for a web content topic model: when applied to web data, overall performance dropped by a factor of five on average and, in some cases, by a factor of approximately 20 compared with conventional data. The second experiment then evaluates the effectiveness of the HTM model in discovering topics and term patterns in web content data. The HTM model achieved an overall 35% improvement in topic coherence compared to LDA.
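    The HTM model is not publicly packaged here; as a hedged sketch of the LDA baseline it is compared against, the snippet below fits scikit-learn's LatentDirichletAllocation on a toy corpus and prints the top terms per topic. The corpus and parameters are illustrative only.

        from sklearn.decomposition import LatentDirichletAllocation
        from sklearn.feature_extraction.text import CountVectorizer

        docs = [
            "sensor network routing protocol energy",
            "routing energy consumption wireless sensor",
            "image retrieval deep feature attention",
            "attention network image feature retrieval",
        ]
        vectorizer = CountVectorizer()
        X = vectorizer.fit_transform(docs)

        lda = LatentDirichletAllocation(n_components=2, random_state=0)
        lda.fit(X)

        terms = vectorizer.get_feature_names_out()
        for topic_idx, weights in enumerate(lda.components_):
            top = [terms[i] for i in weights.argsort()[::-1][:4]]   # four highest-weight terms
            print(f"topic {topic_idx}: {top}")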
  12. Lian J
    PeerJ Comput Sci, 2023;9:e1472.
    PMID: 37547395 DOI: 10.7717/peerj-cs.1472
    Music can serve as a potent tool for conveying emotions and regulating learners' moods, while the systematic application of emotional assessment can help to improve teaching efficiency. However, existing music emotion analysis methods based on Artificial Intelligence (AI) rely primarily on pre-marked content, such as lyrics, and fail to adequately account for the perception, transmission, and recognition of music signals. To address this limitation, this study first employs sound-level segmentation, data frame processing, and threshold determination to enable intelligent segmentation and recognition of notes. Next, based on the extracted audio features, a Radial Basis Function (RBF) model is used to construct a music emotion classifier. Finally, correlation feedback is used to further label the classification results and train the classifier. The study compares the music emotion classification method commonly used in Chinese music education with the Hevner emotion model and identifies four emotion categories, Quiet, Happy, Sad, and Excited, to classify performers' emotions. The testing results demonstrate that audio feature recognition takes a mere 0.004 min, with an accuracy rate of over 95%. Furthermore, classifying performers' emotions based on audio features is consistent with conventional human cognition.
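    The article's RBF classifier and its extracted audio features are not available here; the Radial Basis Function model may well be an RBF network, so scikit-learn's RBF-kernel SVC is used below only as an accessible stand-in, trained on synthetic feature vectors with assumed emotion labels.

        import numpy as np
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC

        # Synthetic stand-ins for audio features (e.g., tempo, energy, spectral statistics).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(400, 12))
        y = rng.integers(0, 4, size=400)          # 0=Quiet, 1=Happy, 2=Sad, 3=Excited

        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
        clf = SVC(kernel="rbf", gamma="scale")    # RBF-kernel classifier
        clf.fit(X_train, y_train)
        print(accuracy_score(y_test, clf.predict(X_test)))   # near chance on random labels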
  13. Neo EX, Hasikin K, Lai KW, Mokhtar MI, Azizan MM, Hizaddin HF, et al.
    PeerJ Comput Sci, 2023;9:e1306.
    PMID: 37346549 DOI: 10.7717/peerj-cs.1306
    BACKGROUND: Rapid urbanization has significantly impacted the environment, as reflected in climate change and pollution indicators. The Fourth Industrial Revolution (4IR) offers a potential solution for efficiently managing these impacts. Smart city ecosystems can provide well-designed, sustainable, and safe cities that enable holistic climate change and global warming solutions through various community-centred initiatives, including smart planning techniques, smart environment monitoring, and smart governance. An air quality intelligence platform, which operates as a complete measurement site for monitoring and governing air quality, has shown promising results in providing actionable insights. This article aims to highlight the potential of machine learning models in predicting air quality, providing data-driven strategic and sustainable solutions for smart cities.

    METHODS: This study proposed an end-to-end air quality predictive model for smart city applications, utilizing four machine learning techniques and two deep learning techniques: AdaBoost, support vector regression (SVR), random forest (RF), K-nearest neighbors (KNN), MLP regressor, and LSTM. The study was conducted in four urban cities in Selangor, Malaysia: Petaling Jaya, Banting, Klang, and Shah Alam. The model considered air quality data for pollution markers such as PM2.5, PM10, O3, and CO. Meteorological data, including wind speed and wind direction, were also considered, and their interactions with the pollutant markers were quantified. The study aimed to determine the correlation variance of the dependent variables in predicting air pollution and proposed a feature optimization process to reduce dimensionality and remove irrelevant features, enhancing the prediction of PM2.5 and improving the existing LSTM model. The study estimates pollutant concentrations in the air from the trained models and highlights the contribution of feature optimization to air quality prediction through feature dimension reduction.

    RESULTS: The results of predicting the concentrations of pollutants (PM2.5, PM10, O3, and CO) in the air are presented in terms of R2 and RMSE. In predicting PM10 and PM2.5 concentrations, LSTM performed best overall, with high R2 values of 0.998, 0.995, 0.918, and 0.993 at the Banting, Petaling, Klang and Shah Alam stations, respectively. The study indicated that among the studied pollution markers, PM2.5, PM10, NO2, wind speed and humidity are the most important elements to monitor. By reducing the number of features used in the model, the proposed feature optimization process can make the model more interpretable and provide insights into the most critical factors affecting air quality. Findings from this study can aid policymakers in understanding the underlying causes of air pollution and in developing more effective smart strategies for reducing pollution levels.
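    The station data are not public here; as a minimal sketch of the LSTM regression setup, the Keras snippet below trains a small LSTM to predict the next PM2.5 value from a sliding window of synthetic readings; the window length, layer size, and epochs are assumptions.

        import numpy as np
        import tensorflow as tf

        # Synthetic hourly PM2.5 series; replace with real station data.
        rng = np.random.default_rng(0)
        series = 40 + 10 * np.sin(np.linspace(0, 20, 500)) + rng.normal(0, 2, 500)

        window = 24
        X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
        y = series[window:]

        model = tf.keras.Sequential([
            tf.keras.layers.LSTM(32, input_shape=(window, 1)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")
        model.fit(X, y, epochs=5, batch_size=32, verbose=0)
        print(model.predict(X[-1:], verbose=0))   # next-hour PM2.5 estimate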

  14. Bari BS, Islam MN, Rashid M, Hasan MJ, Razman MAM, Musa RM, et al.
    PeerJ Comput Sci, 2021;7:e432.
    PMID: 33954231 DOI: 10.7717/peerj-cs.432
    Rice leaf diseases often threaten the sustainable production of rice, affecting many farmers around the world. Early diagnosis and appropriate treatment of rice leaf infection are crucial to facilitating the healthy growth of rice plants and ensuring adequate supply and food security for a rapidly increasing population. Machine-driven disease diagnosis systems could therefore mitigate the limitations of conventional leaf disease diagnosis methods, which are often time-consuming, inaccurate, and expensive. Nowadays, computer-assisted rice leaf disease diagnosis systems are becoming very popular. However, several limitations mar their efficacy and usage, including cluttered image backgrounds, vague symptom edges, varying weather during image capture, a lack of real-field rice leaf image data, variation in symptoms from the same infection, multiple infections producing similar symptoms, and the lack of an efficient real-time system. To mitigate these problems, a faster region-based convolutional neural network (Faster R-CNN) was employed for the real-time detection of rice leaf diseases in the present research. The Faster R-CNN algorithm introduces an advanced region proposal network (RPN) architecture that locates objects very precisely when generating candidate regions. The robustness of the Faster R-CNN model is enhanced by training it with publicly available online datasets and our own real-field rice leaf dataset. The proposed deep-learning-based approach was effective in the automatic diagnosis of three discriminative rice leaf diseases, rice blast, brown spot, and hispa, with accuracies of 98.09%, 98.85%, and 99.17%, respectively. Moreover, the model was able to identify a healthy rice leaf with an accuracy of 99.25%. The results demonstrate that the Faster R-CNN model offers a high-performing rice leaf infection identification system that can diagnose the most common rice diseases more precisely in real time.
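    The trained rice-leaf weights are not reproduced here; the sketch below shows, under assumed class names, how a Faster R-CNN detector with an RPN backbone can be instantiated and run in torchvision.

        import torch
        from torchvision.models.detection import fasterrcnn_resnet50_fpn

        # Classes assumed for illustration: background plus blast, brown spot, hispa, and healthy leaf.
        model = fasterrcnn_resnet50_fpn(weights=None, num_classes=5)
        model.eval()

        image = torch.rand(3, 600, 800)          # stand-in for a field photo of a rice leaf
        with torch.no_grad():
            prediction = model([image])[0]       # boxes, labels, and confidence scores
        print(prediction["boxes"].shape, prediction["labels"][:5])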
  15. Bhuiyan MR, Abdullah J, Hashim N, Al Farid F, Ahsanul Haque M, Uddin J, et al.
    PeerJ Comput Sci, 2022;8:e895.
    PMID: 35494812 DOI: 10.7717/peerj-cs.895
    This research enhances crowd analysis by focusing on excessive crowd analysis and crowd density predictions for the Hajj and Umrah pilgrimages. Crowd analysis usually estimates the number of objects within an image or a video frame and is regularly solved by estimating the density generated from object location annotations. However, it suffers from low accuracy when the crowd is far away from the surveillance camera. This research proposes an approach to overcome the problem of estimating crowd density captured by a surveillance camera at a distance. The proposed approach employs a fully convolutional neural network (FCNN)-based method for crowd analysis, especially for the classification of crowd density. This study addresses the current technological challenges faced in video analysis in a scenario involving the movement of large numbers of pilgrims at densities of 7 to 8 people per square meter. To address this challenge, the study develops a new dataset based on the Hajj pilgrimage scenario. To validate the proposed method, the proposed model is compared with existing models on existing datasets. The proposed FCNN-based method achieved final accuracies of 100%, 98%, and 98.16% on the proposed dataset, the UCSD dataset, and the JHU-CROWD dataset, respectively. Additionally, the ResNet-based method obtained final accuracies of 97%, 89%, and 97% on the proposed dataset, the UCSD dataset, and the JHU-CROWD dataset, respectively. The proposed Hajj-Crowd-2021 crowd analysis dataset and the model outperformed the other state-of-the-art datasets and models in most cases.
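    The Hajj-Crowd-2021 dataset is not included here; as a rough sketch of a convolutional classifier for crowd-density classes, the Keras snippet below defines a small model; the input size and class count are assumptions, and the layout is not the article's exact FCNN.

        import tensorflow as tf

        num_classes = 4   # assumed density classes (e.g., low / medium / high / very high)
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(128, 128, 3)),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
        model.summary()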
  16. Mahmoud Z, Li C, Zappatore M, Solyman A, Alfatemi A, Ibrahim AO, et al.
    PeerJ Comput Sci, 2023;9:e1639.
    PMID: 38077556 DOI: 10.7717/peerj-cs.1639
    The correction of grammatical errors in natural language processing is a crucial task as it aims to enhance the accuracy and intelligibility of written language. However, developing a grammatical error correction (GEC) framework for low-resource languages presents significant challenges due to the lack of available training data. This article proposes a novel GEC framework for low-resource languages, using Arabic as a case study. To generate more training data, we propose a semi-supervised confusion method called the equal distribution of synthetic errors (EDSE), which generates a wide range of parallel training data. Additionally, this article addresses two limitations of the classical seq2seq GEC model, which are unbalanced outputs due to the unidirectional decoder and exposure bias during inference. To overcome these limitations, we apply a knowledge distillation technique from neural machine translation. This method utilizes two decoders, a forward decoder right-to-left and a backward decoder left-to-right, and measures their agreement using Kullback-Leibler divergence as a regularization term. The experimental results on two benchmarks demonstrate that our proposed framework outperforms the Transformer baseline and two widely used bidirectional decoding techniques, namely asynchronous and synchronous bidirectional decoding. Furthermore, the proposed framework reported the highest F1 score, and generating synthetic data using the equal distribution technique for syntactic errors resulted in a significant improvement in performance. These findings demonstrate the effectiveness of the proposed framework for improving grammatical error correction for low-resource languages, particularly for the Arabic language.
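    The full seq2seq GEC framework is not reproduced here; as a minimal sketch of the agreement regularizer the abstract describes, the PyTorch snippet below computes a symmetric Kullback-Leibler term between the output distributions of a forward and a backward decoder at aligned positions; the vocabulary size and batch shapes are assumptions.

        import torch
        import torch.nn.functional as F

        def agreement_regularizer(forward_logits, backward_logits):
            """Symmetric KL divergence between two decoders' next-token distributions."""
            p_log = F.log_softmax(forward_logits, dim=-1)
            q_log = F.log_softmax(backward_logits, dim=-1)
            kl_pq = F.kl_div(q_log, p_log.exp(), reduction="batchmean")
            kl_qp = F.kl_div(p_log, q_log.exp(), reduction="batchmean")
            return 0.5 * (kl_pq + kl_qp)

        forward = torch.randn(4, 7, 32000)    # (batch, positions, vocab) from the forward decoder
        backward = torch.randn(4, 7, 32000)   # same positions, decoded in the opposite direction
        print(agreement_regularizer(forward, backward))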
  17. Dyah Irawati I, Budiman G, Saidah S, Rahmadiani S, Latip R
    PeerJ Comput Sci, 2023;9:e1551.
    PMID: 38077543 DOI: 10.7717/peerj-cs.1551
    Vegetables can be distinguished according to differences in color, shape, and texture. The deep learning convolutional neural network (CNN) method is a technique that can be used to classify types of vegetables for various applications in agriculture. This study proposes a vegetable classification technique that uses the CNN AlexNet model and applies compressive sensing (CS) to reduce computing time and save storage space. In CS, discrete cosine transform (DCT) is applied for the sparsing process, Gaussian distribution for sampling, and orthogonal matching pursuit (OMP) for reconstruction. Simulation results on 600 images for four types of vegetables showed a maximum test accuracy of 98% for the AlexNet method, while the combined block-based CS using the AlexNet method produced a maximum accuracy of 96.66% with a compression ratio of 2×. Our results indicated that AlexNet CNN architecture and block-based CS in AlexNet can classify vegetable images better than previous methods.
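    A hedged, one-dimensional sketch of the compressive-sensing pipeline the abstract names (DCT sparsification, Gaussian sampling, OMP reconstruction), using scipy and scikit-learn; the signal length, number of measurements, and sparsity are illustrative, not the article's block-based settings.

        import numpy as np
        from scipy.fft import idct
        from sklearn.linear_model import OrthogonalMatchingPursuit

        rng = np.random.default_rng(0)
        n, m, k = 256, 128, 10                        # signal length, measurements (2x compression), sparsity

        # A signal that is k-sparse in the DCT domain.
        coeffs = np.zeros(n)
        coeffs[rng.choice(n, k, replace=False)] = rng.normal(size=k)
        signal = idct(coeffs, norm="ortho")

        phi = rng.normal(size=(m, n)) / np.sqrt(m)    # Gaussian sampling matrix
        measurements = phi @ signal

        # Recover the sparse DCT coefficients from the compressed measurements.
        psi = idct(np.eye(n), axis=0, norm="ortho")   # inverse-DCT basis (columns = atoms)
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
        omp.fit(phi @ psi, measurements)
        recovered = idct(omp.coef_, norm="ortho")
        print(np.linalg.norm(recovered - signal) / np.linalg.norm(signal))   # small relative error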
  18. Rana SA, Azizul ZH, Awan AA
    PeerJ Comput Sci, 2023;9:e1630.
    PMID: 38077542 DOI: 10.7717/peerj-cs.1630
    Integrating artificial intelligence (AI) has transformed living standards. However, AI's progress is being hindered by concerns about the rise of biases and unfairness. This problem argues strongly for a strategy for tackling potential biases. This article thoroughly evaluates existing knowledge to enhance fairness management, which will serve as a foundation for creating a unified framework to address any bias and its subsequent mitigation throughout the AI development pipeline. We map the software development life cycle (SDLC), machine learning life cycle (MLLC) and cross industry standard process for data mining (CRISP-DM) together to build a general understanding of how the phases in these development processes relate to each other. The map should benefit researchers from multiple technical backgrounds. Biases are categorised into three distinct classes: pre-existing, technical and emergent bias. Three mitigation strategies are then identified: conceptual, empirical and technical, along with fairness management approaches: fairness sampling, learning and certification. The recommended practices for debiasing and overcoming the challenges encountered further set directions for successfully establishing a unified framework.
  19. Zerdoumi S, Jhanjhi NZ, Ariyaluran Habeeb RA, Hashem IAT
    PeerJ Comput Sci, 2023;9:e1465.
    PMID: 38192476 DOI: 10.7717/peerj-cs.1465
    Based on the results of this research, a new method for segmenting offline Arabic text is presented. The method finds the core splitter between the "Middle" and "Lower" zones by looking for sharp character degeneration in those zones. Beyond script localization and the essential role of determining which direction a starting point is pointing, the baseline also functions as a delimiter for horizontal projections. While the bottom half of the features is used to differentiate the modifiers in the zones, the top half is not. This method works best when the baseline can divide features into the bottom zone and the middle zone in complex patterns where the alphabet is hard to identify, as in ancient scripts. Furthermore, the technique performed well in distinguishing Arabic text, including calligraphy. The aim of the zoning system is to decrease the number of distinct element classes relative to the total number of alphabets used in Arabic cursive writing. The components are identified using the pixel value origin and center reign (CR) technique, which is combined with letter morphology to achieve complete word-level identification. Using the upper and lower baselines together, the proposed technique produces a consistent Arabic pattern, which is intended to improve identification rates by increasing the number of matches. For Mediterranean keywords (cities in Algeria and Tunisia), the suggested approach achieves correctness greater than 98.14 percent for the Othmani script and 90.16 percent for the Arabic script, based on 84 and 117 verses, respectively. The auditing method and the structure of the assessment section and software allowed the major problems to be identified, with a few of them specifically highlighted.
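    The full zoning pipeline is not reproduced here; as a hedged sketch of the horizontal-projection step commonly used to locate the baseline separating the middle and lower zones, the numpy snippet below finds the row with the highest ink density in a binarized text-line image; the synthetic image is an illustrative stand-in.

        import numpy as np

        def baseline_row(binary_line):
            """Index of the row with the highest ink density (peak of the horizontal projection)."""
            projection = binary_line.sum(axis=1)   # ink pixels per row
            return int(np.argmax(projection))

        # Synthetic 40-row text line: a dense band around rows 22-27 stands in for the baseline stroke.
        line = np.zeros((40, 200), dtype=int)
        line[22:28, 20:180] = 1                    # main horizontal stroke along the baseline
        line[10:22, 60:90] = 1                     # ascender strokes above the baseline
        print(baseline_row(line))                  # 22, the start of the densest band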
  20. Bhardwaj A, Vishnoi A, Bharany S, Abdelmaboud A, Ibrahim AO, Mamoun M, et al.
    PeerJ Comput Sci, 2023;9:e1771.
    PMID: 38192478 DOI: 10.7717/peerj-cs.1771
    Internet of Things (IoT) devices have a bootloader and applications responsible for initializing the device's hardware and loading the operating system or firmware. Ensuring the security of the bootloader is crucial to protect against malicious firmware or software being loaded onto the device. One way to increase the security of the bootloader is to use digital signature verification to ensure that only authorized firmware can be loaded onto the device. Additionally, implementing secure boot processes, such as a chain of trust, can prevent unauthorized access to the device's firmware and protect against tampering during the boot process. This research performs firmware bootloader and application dataflow taint analysis and a security assessment of IoT devices, the most critical step in ensuring the security and integrity of these devices. This process helps identify vulnerabilities and potential attack vectors that attackers could exploit and provides a foundation for developing effective remediation strategies.
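    As a hedged illustration of the digital-signature check a secure bootloader performs before loading firmware, the Python snippet below signs a firmware image with an Ed25519 key and verifies it using the cryptography library; a real bootloader would perform this check in constrained native code against a provisioned public key.

        from cryptography.exceptions import InvalidSignature
        from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

        firmware = b"\x7fELF...firmware image bytes..."   # stand-in for the real image

        # Vendor side: sign the firmware image with the private key.
        private_key = Ed25519PrivateKey.generate()
        signature = private_key.sign(firmware)

        # Device side: the bootloader holds only the public key and verifies before booting.
        public_key = private_key.public_key()
        try:
            public_key.verify(signature, firmware)
            print("signature valid: boot firmware")
        except InvalidSignature:
            print("signature invalid: refuse to boot")

        # Tampered firmware fails verification.
        try:
            public_key.verify(signature, firmware + b"\x00")
            print("unexpected: tampered image accepted")
        except InvalidSignature:
            print("tampered image rejected")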