Although it is possible to recover the complete mitogenome directly from shotgun sequencing data, currently reported methods and pipelines are still relatively time-consuming and costly. Using a sample of the Australian freshwater crayfish Engaeus lengana, we demonstrate that a three-day turnaround time (four hours of hands-on time) from tissue sample to NCBI-ready submission file can be achieved by integrating the MiSeq sequencing platform, the Nextera sample preparation protocol, the MITObim assembly algorithm and the MITOS annotation pipeline.
Neuron cells are built from a myriad of axon and dendrite structures. They transmit electrochemical signals between the brain and the nervous system. Three-dimensional visualization of neuron structure could facilitate a deeper understanding of the neuron and its models. An accurate neuron model could aid understanding of the brain's functions, diagnosis, and knowledge of the entire nervous system. Existing neuron models have been found to lack realism. In the actual biological neuron, growth is continuous from the soma to the axon and the dendrites, whereas current neuron visualization models present it as disjointed segments, which greatly reduces realism. In this research, a new reconstruction model comprising the Bounding Cylinder, Curve Interpolation and Gouraud Shading is proposed to visualize the neuron and improve realism. The reconstructed model is used to design algorithms for generating neuron branching from neuron SWC data. The Bounding Cylinder and Curve Interpolation methods are used to improve the connections between segments of the neuron model using a series of cascaded cylinders along the neuron's connection path. Three control points are proposed between two adjacent neuron segments. Finally, the model is rendered with Gouraud Shading to smooth the model surface. This produces a near-perfect model of natural neurons with improved realism. The model is validated through the responses of a group of bioinformatics analysts to a predefined survey. The results show an acceptance and satisfaction rate of about 82%.
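The idea of joining two adjacent segments through three control points and a series of cascaded cylinders can be sketched as follows. This is only an illustration: the abstract does not state which curve is used, so a quadratic Bezier curve is assumed here, and the function names, the step count and the linear radius taper are all hypothetical choices.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, steps: int = 8) -> List[Point]:
    """Sample a quadratic Bezier curve defined by three control points.

    Assumption: the abstract only says "three control points"; the Bezier
    form is chosen here for illustration.
    """
    points = []
    for i in range(steps + 1):
        t = i / steps
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        points.append(tuple(a * u + b * v + c * w for u, v, w in zip(p0, p1, p2)))
    return points


def cascaded_cylinders(centres: List[Point], r_start: float, r_end: float):
    """Turn consecutive curve samples into short cylinders.

    Each cylinder is (start_point, end_point, radius); the radius tapers
    linearly from r_start towards r_end (a hypothetical choice).
    """
    n = len(centres) - 1
    cylinders = []
    for i in range(n):
        radius = r_start + (r_end - r_start) * (i / n)
        cylinders.append((centres[i], centres[i + 1], radius))
    return cylinders


# Two segment end points joined through a middle control point.
path = quadratic_bezier((0, 0, 0), (1, 1, 0), (2, 0, 0), steps=4)
cyls = cascaded_cylinders(path, r_start=1.0, r_end=0.5)
```

Increasing `steps` gives more, shorter cylinders and a smoother junction, at the cost of more geometry to render.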
In this chapter, a computational method for analysing cardiac cavity images is proposed. The method uses collinear and triangle equation algorithms to detect and reconstruct the boundary of the cardiac cavity. First, a high-boost filter is applied to enhance the high-frequency components without affecting the low-frequency components. Second, morphological and thresholding operators are applied to eliminate noise and convert the image into a binary image. Next, edge detection is performed using the negative Laplacian filter, followed by region filtering. Finally, the collinear and triangle equations are used to detect and reconstruct a more precise cavity boundary. The results indicate that this technique achieves better segmentation and detection of the cardiac cavity boundary in echocardiographic images.
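The first step, high-boost filtering, can be illustrated with a minimal pure-Python sketch. It assumes the common formulation "original plus k times (original minus blurred)" with a 3x3 mean filter as the low-pass stage; the kernel, the boost factor `k` and the function names are assumptions, not details taken from the chapter.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (an assumed low-pass stage)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def high_boost(img, k=1.5):
    """High-boost filter: original + k * (original - blurred).

    Flat (low-frequency) regions are left unchanged because the
    difference term is zero there; edges are amplified.
    """
    blurred = box_blur(img)
    return [
        [p + k * (p - b) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(img, blurred)
    ]


# A flat image is unchanged; a step edge gains overshoot on both sides.
flat = [[5] * 4 for _ in range(4)]
step = [[0, 0, 10, 10] for _ in range(4)]
flat_out = high_boost(flat)
step_out = high_boost(step)
```

In `step_out`, the pixels next to the edge move below 0 on the dark side and above 10 on the bright side, which is the contrast enhancement that makes later thresholding and edge detection easier.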
An improved genetic algorithm (GA) procedure is introduced in this work, based on the premise that the fittest parents (both male and female) are the most likely to produce the healthiest offspring. It avoids the destruction of near-optimal information and promotes further search around the promising region by encouraging the exchange of highly important information among the fittest solutions. A novel crossover technique called Segmented Multi-chromosome Crossover is also introduced. It maintains the information contained in gene segments and allows offspring to inherit information from multiple parent chromosomes. The improved GA is applied to the automatic and simultaneous parameter optimization and feature selection of a multi-layer perceptron network for medical disease diagnosis. Compared with previous works, the proposed algorithm achieves the best average accuracy among all algorithms on the diabetes and heart datasets, and the second best on the cancer dataset.
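One possible reading of a segment-preserving, multi-parent crossover is sketched below. The abstract does not give the operator's details, so this is a hypothetical interpretation: the chromosome is divided into fixed segments, and each segment of the child is copied whole from one randomly chosen parent. The function name, the segment boundaries and the selection rule are assumptions.

```python
import random


def segmented_multi_parent_crossover(parents, segment_bounds, rng):
    """Build one child by copying each gene segment intact from a random parent.

    parents        : list of equal-length chromosomes (lists of genes)
    segment_bounds : list of (start, end) index pairs covering the chromosome
    rng            : random.Random instance, passed in for reproducibility

    Hypothetical sketch; not the operator as defined by the authors.
    """
    child = []
    for start, end in segment_bounds:
        donor = rng.choice(parents)      # any of the fit parents may donate
        child.extend(donor[start:end])   # the segment is never split
    return child


# Three fit parents, three segments (for example: network parameters,
# learning settings and a feature mask; this layout is illustrative).
parents = [[0] * 6, [1] * 6, [2] * 6]
bounds = [(0, 2), (2, 4), (4, 6)]
child = segmented_multi_parent_crossover(parents, bounds, random.Random(42))
```

Because segments are copied whole, co-adapted genes inside a segment stay together, while different segments of the same child may come from different parents.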
Metabarcoding, the coupling of DNA-based species identification and high-throughput sequencing, offers enormous promise for arthropod biodiversity studies, but factors such as cost, speed and ease of use of bioinformatic pipelines, crucial for making the leap from demonstration studies to real-world application, have not yet been adequately addressed. Here, four published primer sets and one newly designed primer set were tested across a diverse set of 80 arthropod species, representing 11 orders, to establish optimal protocols for Illumina-based metabarcoding of tropical Malaise trap samples. The two primer sets that showed the highest amplification success with individual-specimen polymerase chain reaction (PCR, 98%) were used for bulk PCR and Illumina MiSeq sequencing. The sequencing outputs were subjected to both manual and simple metagenomics quality control and filtering pipelines. We obtained acceptable detection rates after bulk PCR and high-throughput sequencing (80-90% of input species), but analyses were complicated by putative heteroplasmic sequences and contamination. The manual pipeline produced similar or better outputs than the simple metagenomics pipeline (1.4 compared with 0.5 expected:unexpected Operational Taxonomic Units). Our study suggests that metabarcoding is slowly becoming as cheap, fast and easy as conventional DNA barcoding, and that Malaise trap metabarcoding may soon fulfill its potential, providing a thermometer for biodiversity.
Medical image fusion is the procedure of combining several images from one or multiple imaging modalities. Despite numerous attempts at automated ventricle segmentation and tracking in echocardiography, the problem remains challenging because of low-quality images with missing anatomical details, speckle noise and a restricted field of view. This paper presents a fusion method that specifically aims to increase the segmentability of echocardiographic features such as the endocardium and to improve image contrast. In addition, it seeks to expand the field of view, decrease the impact of noise and artifacts, and enhance the signal-to-noise ratio of the echo images. The proposed algorithm weights the image information according to an integration feature between all the overlapping images, using a combination of principal component analysis (PCA) and the discrete wavelet transform (DWT). For evaluation, the proposed method is compared with several well-known techniques, and different metrics are used to assess its performance. The presented pixel-based method, based on the integration of PCA and DWT, gives the best result for the segmentability of cardiac ultrasound images and better performance in all metrics.
Inherited peripheral neuropathies (IPNs) are a group of related diseases primarily affecting the peripheral motor and sensory neurons. They include the hereditary sensory neuropathies (HSN), hereditary motor neuropathies (HMN), and Charcot-Marie-Tooth disease (CMT). Using whole-exome sequencing (WES) to achieve a genetic diagnosis is particularly suited to IPNs, where over 80 genes are involved with weak genotype-phenotype correlations beyond the most common genes. We performed WES for 110 index patients with IPN where the genetic cause was undetermined after previous screening for mutations in common genes selected by phenotype and mode of inheritance. We identified 41 missense sequence variants in the known IPN genes in our cohort of 110 index patients. Nine variants (8%), identified in the genes MFN2, GJB1, BSCL2, and SETX, are previously reported mutations and considered to be pathogenic in these families. Twelve novel variants (11%) in the genes NEFL, TRPV4, KIF1B, BICD2, and SETX are implicated in the disease but require further evidence of pathogenicity. The remaining 20 variants were confirmed as polymorphisms (not causing the disease) and are detailed here to help interpret sequence variants identified in other family studies. Validation using segregation, normal controls, and bioinformatics tools was valuable as supporting evidence for sequence variants implicated in disease. In addition, we identified one SETX sequence variant (c.7640T>C), previously reported as a putative mutation, which we have confirmed as a nonpathogenic rare polymorphism. This study highlights the advantage of using WES for genetic diagnosis in highly heterogeneous diseases such as IPNs and has been particularly powerful in this cohort where genetic diagnosis could not be achieved due to phenotype and mode of inheritance not being previously obvious. However, first tier testing for common genes in clinically well-defined cases remains important and will account for most positive results.
Colorectal cancer refers to cancer that occurs in the colon and rectum. It has been established as the third most common cancer and the fourth leading cause of cancer mortality worldwide. Cancer is caused by mutations in several genes that are usually involved in the regulation of cell proliferation, growth and cell death. A mutation that leads to abnormal gene function, through either gain or loss of function, is termed a driver mutation, and genes carrying driver mutations are termed driver genes. The identification of driver genes provides insight into the mechanistic process of cancer development, and this information can be used to further understand how they cause dysregulation in signaling pathways. In this study, two bioinformatic tools, CGI and iCAGES, were used to predict potential driver genes from the genomes of eight colorectal cancer patients with annotated variant datasets. A total of 44 unique driver genes and 21 pathways were identified, including the p53 signaling, PI3K-AKT, endocrine resistance, MAPK and cell cycle pathways. The identification of these pathways can lead to the identification of potential drugs targeting them.
Mycobacterium tuberculosis strain SBH162 was isolated from a 49-year-old male with pulmonary tuberculosis. GeneXpert MDR/RIF identified the strain as rifampicin-resistant M. tuberculosis. Whole-genome sequencing was performed using the Illumina HiSeq 4000 system to further investigate and verify the mutation sites of the strain through genetic analyses, namely variant calling using bioinformatics tools. The de novo genome assembly generated 100 contigs with an N50 of 156,381 bp. The whole genome size was 4,343,911 bp with a G + C content of 65.58%, and the genome contained 4,306 predicted genes. The S450L mutation associated with rifampicin resistance was detected in the rpoB gene. Based on phylogenetic analysis using the Maximum Likelihood method, the strain was identified as belonging to the Europe America Africa lineage (Lineage 4). The genome dataset has been deposited at DDBJ/ENA/GenBank under the accession number SMOE00000000.
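For readers unfamiliar with the assembly statistics quoted above, the sketch below shows how N50 and G + C content are conventionally computed. It is a generic illustration on toy data, not the pipeline used for SBH162; the function names are hypothetical.

```python
def n50(contig_lengths):
    """Return N50: the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    total assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Toy assembly: total length 54, so half is 27; 10 + 9 + 8 reaches 27.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
toy_gc = gc_content("GGCCAT")
```

A higher N50 for the same total size generally indicates a more contiguous assembly.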
The most common quorum sensing (QS) system in Gram-negative bacteria consists of signaling molecules called N-acyl-homoserine lactones (AHLs), which are synthesized by an AHL synthase (LuxI) and detected by a transcriptional regulator (LuxR), whose genes are usually located in close proximity. However, many recent studies have also provided evidence of LuxR solos, which are LuxR-related proteins in Proteobacteria that lack a cognate LuxI AHL synthase. Pandoraea species are opportunistic pathogens frequently isolated from sputum specimens of cystic fibrosis (CF) patients. We have previously shown that P. pnomenusa strains possess QS activity. In this study, we examined the presence of QS activity in all type strains of Pandoraea species and acquired their complete genome sequences for holistic bioinformatics analyses of QS-related genes. Only four out of nine type strains (P. pnomenusa, P. sputorum, P. oxalativorans, and P. vervacti) showed QS activity, and C8-HSL was the only AHL detected. A total of 10 canonical luxIs with adjacent luxRs were predicted by bioinformatics from the complete genomes of the aforementioned species and publicly available Pandoraea genomes. No orphan luxI was identified in any of the genomes. However, genes for two LuxR solos (LuxR2 and LuxR3 solos) were identified in all Pandoraea genomes (except two draft genomes with one LuxR solo gene), and P. thiooxydans was the only species that harbored no QS-related activity or genes. Except for the canonical LuxR genes, the LuxIs and LuxR solos of Pandoraea species were distantly related to other well-characterized QS genes based on phylogenetic clustering. LuxR2 and LuxR3 solos might represent two novel evolutionary branches of the LuxR system, as they were found exclusively in this genus. As a few luxR solos were located in close proximity to prophage sequence regions in the genomes, we postulated that these luxR solos could have been transmitted into the genus Pandoraea by bacteriophage-mediated transduction. The bioinformatics approach developed in this study forms the basis for further characterization of closely related species. Overall, our findings improve the current understanding of QS in Pandoraea species, which is a potential pharmacological target in battling Pandoraea infections in CF patients.
The region encompassed by the Msb069 primer pair is believed to be associated with a quantitative trait locus (QTL) for dorsal fin length in the subgenus Poecilia. However, Msb069, which originated from Xiphophorus, has not been investigated in detail in the subgenus Poecilia. In this study, the full sequence of Msb069 was characterized by sequencing, bioinformatics analysis and gene expression analysis. Sequence analysis of the region encompassed by the Msb069 primer pair in three species of Poecilia revealed a higher number of microsatellite tandem repeats in Poecilia latipinna ((ATG)16) than in P. sphenops ((ATG)13-14). No notable pattern of ATG tandem repeats was found in the hybrids. The full sequence of Msb069 is 734 bp in length and showed a 233 bp conserved region between Xiphophorus and Poecilia. A BLAST search performed on this sequence revealed no significant similarities. Nonquantitative RT-PCR showed the presence of Msb069 transcripts in three different tissues in the subgenus Poecilia. Meanwhile, quantitative RT-PCR on two different tissues showed relatively higher expression of the Msb069 transcript in P. latipinna dorsal fin tissues in both male and female fish, suggesting a repressive function of this transcript with respect to dorsal fin length. However, the exact gene expression event of Msb069 is still unknown and requires further investigation.
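Counting microsatellite repeat units of the kind reported above can be sketched with a regular expression. The sequences below are synthetic placeholders, not Msb069 data, and the function name is hypothetical.

```python
import re


def longest_tandem_repeat(sequence, motif="ATG"):
    """Return the largest number of consecutive copies of `motif`
    found anywhere in `sequence` (0 if the motif is absent)."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        best = max(best, len(match.group(0)) // len(motif))
    return best


# Synthetic examples only: flanking bases around 16 and 13 ATG units.
long_allele = "CC" + "ATG" * 16 + "TT"
short_allele = "CC" + "ATG" * 13 + "TT"
long_count = longest_tandem_repeat(long_allele)
short_count = longest_tandem_repeat(short_allele)
```

The same function can be applied to each sequenced allele to tabulate repeat numbers per species or hybrid.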
Introduction: Cancer is one of the main causes of mortality globally, and its incidence has been rising over the years. Studies have shown that miRNAs have potential as cancer biomarkers. miR-130a has been reported to be upregulated in several types of cancer, which indicates that it plays important roles in cancer development and metastasis. The aim of this study is to identify potential target genes and to predict the regulatory function of miR-130a-3p and miR-130a-5p in cancer. Methods: Three bioinformatics platforms, namely miRWalk, the Database for Annotation, Visualization and Integrated Discovery (DAVID) Gene Functional Classification Tool and the miRanda-miRSVR analysis tools, were used to identify possible interactions between miR-130a and its targets. A protein-protein interaction (PPI) network for the predicted target genes was then constructed. Results: The analyses identified nine predicted target genes for miR-130a-3p (RAPGEF4, SOS2, NRP1, RPS6KB1, MET, IL15, ACVR1, RYR2 and ITPR1) and ten for miR-130a-5p (BCL11A, SPOPL, NLK, PPARGC1A, POU4F2, CPEB4, ST18, RSBN1L, ELF5 and ARID4B) that might play an important role in the development of cancer. The findings suggest that miR-130a may be involved in controlling the cancer-related genes MET, ACVR1 and BCL11A. miR-130a-3p may regulate MET, which is involved in apoptosis and metastasis, and ACVR1, which is involved in metastasis and angiogenesis. miR-130a-5p may regulate BCL11A, which is involved in apoptosis, proliferation and tumorigenesis. Conclusion: This study has highlighted the molecular interaction of miR-130a with associated genes and pathways, suggesting the therapeutic potential of miR-130a as a personalised targeted therapy for cancer.
The anticonvulsive potential of proteins extracted from Orthosiphon stamineus leaves (OSLP) has never been elucidated in zebrafish (Danio rerio). This study therefore aims to elucidate the anticonvulsive potential of OSLP in a pentylenetetrazol (PTZ)-induced seizure model. Physical changes (seizure score, seizure onset time, behavior and locomotor activity) and neurotransmitter levels were analysed to assess the pharmacological activity. The protective mechanism of OSLP in the brain was also studied using mass spectrometry-based label-free proteomic quantification (LFQ) and bioinformatics. OSLP was found to be safe up to 800 µg/kg, and pre-treatment with OSLP (800 µg/kg, i.p., 30 min) decreased the frequency of convulsive activities (lower seizure score and prolonged seizure onset time), improved locomotor behaviors (reduced erratic swimming movements and bottom-dwelling habit), and lowered the level of the excitatory neurotransmitter glutamate. Pre-treatment with OSLP increased the expression of the protein Complexin 2 (Cplx2) in the zebrafish brain. Cplx2 is an important regulator of the trans-SNARE complex, which is required during the vesicle priming phase of calcium-dependent synaptic vesicle exocytosis. The findings of this study collectively suggest that OSLP could regulate the release of neurotransmitters via calcium-dependent synaptic vesicle exocytosis mediated by the "Synaptic Vesicle Cycle" pathway. The anticonvulsive action of OSLP may differ from that of diazepam (DZP), and it might therefore not produce cognitive insults similar to those of DZP.
Marine sponges are acknowledged as a bacterial hotspot and a resource of novel natural products or genetic material with industrial or commercial potential. However, sponge-associated bacteria are difficult to cultivate, and the production of their desirable metabolites is inadequate in terms of rate and quantity; meanwhile, bioinformatics and metagenomics tools are steadily progressing. Bacterial diversity profiles of the high-microbial-abundance wild tropical marine sponges Aaptos aaptos and Xestospongia muta were obtained by sample collection at the Pulau Bidong and Pulau Redang islands, 16S rRNA amplicon sequencing on the Illumina HiSeq2500 platform (250 bp paired-end) and metagenomics analysis using the Ribosomal Database Project (RDP) classifier. Raw sequencing data in fastq format and relative abundance histograms of the 10 dominant species are available in the public repository Discover Mendeley Data (http://dx.doi.org/10.17632/zrcks5s8xp). Filtered operational taxonomic unit (OTU) sequencing data with chimeras removed are available at NCBI under accession numbers MT464469 to MT465036.
Introduction: Tuberculosis (TB), commonly caused by Mycobacterium tuberculosis (Mtb), is one of the ten leading causes of death worldwide. The gold standard, microbiological culture for the detection and differentiation of mycobacteria, is time-consuming and laborious. The use of fast, easy and sensitive nucleic acid amplification tests (NAATs) for the diagnosis of TB remains challenging because there is a high degree of homology among Mtb complex (MTBC) members and the target genes are absent from the genomes of some strains. This study aimed to identify a new candidate genetic marker and to design specific primers to detect Mtb using in silico methods. Methods: Using the Basic Local Alignment Search Tool (BLAST) program, the Mtb H37Rv chromosome reference genome sequence was compared with those of other MTBC members, and a single nucleotide polymorphism (SNP) at Rv1970 was found to be specific to Mtb strains. The mismatch amplification mutation assay (MAMA) combined with polymerase chain reaction (PCR) was used as an alternative method to detect the point mutation. MAMA primers targeting the SNP were designed using Primer-BLAST, and the PCR assay was optimized via the Taguchi method. Results: The assay amplified a 112 bp gene fragment and was able to detect all Mtb strains, but not the other MTBC members or non-tuberculous mycobacteria. The detection limit of the assay was 60 pg/μl. Conclusion: Bioinformatics has provided predictive identification of many new target markers. The designed primers were found to be highly specific at single-gene target resolution for the detection of Mtb.
Recent achievements in the study of plant microRNAs (miRNAs), a large class of small non-coding RNAs, are very exciting. A wide array of techniques, including forward genetics, molecular cloning, bioinformatic analysis and, most recently, deep sequencing, have greatly advanced miRNA discovery. A tiny miRNA sequence can target one or multiple mRNAs. Most miRNA targets are transcription factors (TFs), which are of paramount importance in regulating plant growth and development. Various families of TFs, which control a range of regulatory networks, may help plants grow under normal and stress conditions. This review focuses on the regulatory relationships between miRNAs and different families of TFs, such as NF-Y, MYB, AP2, TCP, WRKY, NAC, GRF, and SPL. For instance, NF-Y plays an important role in drought tolerance and flower development; MYB is involved in signal transduction and the biosynthesis of secondary metabolites; AP2 regulates floral development and nodule formation; TCP directs leaf development and growth hormone signaling; WRKY has known roles in multiple stress tolerances; NAC regulates lateral root formation; GRF is involved in root growth and in flower and seed development; and SPL regulates the plant transition from juvenile to adult. We also studied the relationship between miRNAs and TFs by consolidating research findings from different plant species, which will help plant scientists understand the mechanisms of action and the interactions between these regulators in plant growth and development under normal and stress conditions.
Transboundary emissions of smoke-haze from land and forest fires have recurred annually during the dry period (June to October) over the past few decades in South East Asia. Hazardous air quality has been recorded in Malaysia during these episodes. Agricultural practices such as slash-and-burn of biomass, and peat fires, particularly in Sumatera and Kalimantan, Indonesia, have been implicated as the major causes of the haze. Past findings have shown that a diversity of microbes can thrive in air, including smoke-haze polluted air. In this study, metagenomic data were generated to reveal the diversity of microorganisms in air during days with and without haze. Air samples were collected during non-haze (2013A01) and two haze (2013A04 and 2013A05) periods in the month of June 2013. DNA was extracted from the samples and subjected to Multiple Displacement Amplification and whole genome sequencing (Next Generation Sequencing) using the HiSeq 2000 Platform. Extensive bioinformatic analyses of the raw sequence data then followed. Raw reads from these six air samples were deposited in the NCBI SRA databases under Bioproject PRJNA662021 with accession numbers SRX9087478, SRX9087479 and SRX9087480.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a pandemic involving so far more than 22 million infections and 776,157 deaths. Effective vaccines are urgently needed to prevent SARS-CoV-2 infections. No vaccines have yet been approved for licensure by regulatory agencies. Even though host immune responses to SARS-CoV-2 infections are beginning to be unravelled, effective clearance of virus will depend on both humoral and cellular immunity. Additionally, the presence of Spike (S)-glycoprotein reactive CD4+ T-cells in the majority of convalescent patients is consistent with its significant role in stimulating B and CD8+ T-cells. The search for immunodominant epitopes relies on experimental evaluation of peptides representing the epitopes from overlapping peptide libraries which can be costly and labor-intensive. Recent advancements in B- and T-cell epitope predictions by bioinformatic analysis have led to epitope identifications. Assessing which peptide epitope can induce potent neutralizing antibodies and robust T-cell responses is a prerequisite for the selection of effective epitopes to be incorporated in peptide-based vaccines. This review discusses the roles of B- and T-cells in SARS-CoV-2 infections and experimental validations for the selection of B-, CD4+ and CD8+ T-cell epitopes which could lead to the construction of a multi-epitope peptide vaccine. Peptide-based vaccines are known for their low immunogenicity which could be overcome by incorporating immunostimulatory adjuvants and nanoparticles such as Poly Lactic-co-Glycolic Acid (PLGA) or chitosan.
A bioinformatics tool is a software program designed to extract meaningful information from the mass of molecular biology data or biological databases and to carry out sequence or structural analysis. The method of determining the order of nucleotides within a deoxyribonucleic acid (DNA) molecule is known as DNA sequencing. This analysis is intended for commercial, factory-made (pasteurised) goat's milk from various states in Malaysia, to determine whether the milk is pure or mixed with foreign substances from other animals. The main objective is to compare the DNA sequences of commercial goat's milk and raw goat's milk (hand-milked and non-pasteurised). To achieve this, ClustalX is used to align and compare the DNA sequences obtained from both milk samples. ClustalX provides an automated system for performing multiple alignments of sequences and profiles and for evaluating the results. ClustalX is helpful because it is cost-effective and user-friendly and offers high accuracy of analysis.
Burkholderia pseudomallei is a soil-dwelling bacterium that causes a globally emerging disease called melioidosis. Approximately one third of the in silico annotated genes in its genome are classified as hypothetical genes. This group of genes is difficult to characterise functionally, partly because of the absence of noticeable phenotypes under conventional laboratory settings. A bioinformatic survey of hypothetical genes revealed a gene designated BPSL3393 that putatively encodes a small protein of 11 kDa with a CoA-binding domain. BPSL3393 is conserved in all B. pseudomallei genomes as well as in various other species within the genus Burkholderia. Given that CoA plays a ubiquitous metabolic role in all life forms, characterisation of BPSL3393 may uncover a previously overlooked metabolic feature of B. pseudomallei. The gene was deleted from the genome using a double homologous recombination approach, yielding a null mutant. The BPSL3393 mutant showed no difference in growth rate from the wild type under rich and minimal growth conditions. An extensive metabolic phenotyping test was performed involving 95 metabolic substrates. The BPSL3393 deletion mutant was severely impaired in its ethanolamine metabolism. The growth rate of the mutant was attenuated when ethanolamine was used as the sole carbon source. A transcriptional analysis of the ethanolamine metabolism genes showed that they were down-regulated in the BPSL3393 mutant. This suggests that BPSL3393 functions as a positive regulator of ethanolamine metabolism.