Affiliations 

  • 1 Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang, Kuantan, Pahang, Malaysia. [email protected]
  • 2 Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang, Kuantan, Pahang, Malaysia
  • 3 Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu, Kelantan, Malaysia
Methods Mol Biol, 2019;1986:255-266.
PMID: 31115893 DOI: 10.1007/978-1-4939-9442-7_12

Abstract

In gene expression studies, missing values are a common problem with important consequences for the interpretation of the final data (Satija et al., Nat Biotechnol 33(5):495, 2015). Numerous bioinformatics examination tools are used for cancer prediction, including the data set matrix (Bailey et al., Cell 173(2):371-385, 2018); thus, it is necessary to resolve the problem of missing-values imputation. This chapter presents a review of the research on missing-values imputation approaches for gene expression data. By using local and global correlation of the data, we were able to focus mostly on the differences between the algorithms. We classified the algorithms as global, hybrid, local, or knowledge-based techniques. Additionally, this chapter presents suitable assessments of the different approaches. The purpose of this review is to focus on developments in the current techniques for scientists rather than applying different or newly developed algorithms with identical functional goals. The aim was to adapt the algorithms to the characteristics of the data.

* Title and MeSH Headings from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.