Displaying all 6 publications

  1. Sarpan N, Taranenko E, Ooi SE, Low EL, Espinoza A, Tatarinova TV, et al.
    Plant Cell Rep, 2020 Sep;39(9):1219-1233.
    PMID: 32591850 DOI: 10.1007/s00299-020-02561-9
    KEY MESSAGE: Several hypomethylated sites within the Karma region of EgDEF1 and hotspot regions in chromosomes 1, 2, 3, and 5 may be associated with mantling. One of the main challenges faced by the oil palm industry is fruit abnormalities, such as the "mantled" phenotype that can lead to reduced yields. This clonal abnormality is an epigenetic phenomenon and has been linked to the hypomethylation of a transposable element within the EgDEF1 gene. To understand the epigenome changes in clones, methylomes of clonal oil palms were compared to methylomes of seedling-derived oil palms. Whole-genome bisulfite sequencing data from seedlings, normal, and mantled clones were analyzed to determine and compare the context-specific DNA methylomes. In seedlings, coding and regulatory regions are generally hypomethylated while introns and repeats are extensively methylated. Genes with a low number of guanines and cytosines in the third position of codons (GC3-poor genes) were increasingly methylated towards their 3' region, while GC3-rich genes remain demethylated, similar to patterns in other eukaryotic species. Predicted promoter regions were generally hypomethylated in seedlings. In clones, CG, CHG, and CHH methylation levels generally decreased in functionally important regions, such as promoters, 5' UTRs, and coding regions. Although random regions were found to be hypomethylated in clonal genomes, hypomethylation of certain hotspot regions may be associated with the clonal mantling phenotype. Our findings therefore suggest that other hypomethylated CHG sites within the Karma region of EgDEF1, and hypomethylated hotspot regions in chromosomes 1, 2, 3, and 5, are associated with mantling.
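
The CG, CHG, and CHH contexts named in the abstract above follow the standard convention in which H stands for A, C, or T. As a rough illustration only (not the authors' pipeline), the sketch below classifies forward-strand cytosines in a plain sequence string. The function names `cytosine_context` and `count_contexts` are hypothetical, and reverse-strand cytosines and bisulfite read handling are deliberately left out.

```python
# Minimal sketch: classify cytosines by sequence context (CG, CHG, CHH).
# Assumptions: forward strand only; H means A, C, or T; any other base
# (for example N) makes the context undetermined. Function names are
# illustrative and do not come from the cited paper.

H_BASES = "ACT"


def cytosine_context(seq, i):
    """Return 'CG', 'CHG', 'CHH', or None for the cytosine at index i."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    following = seq[i + 1:i + 3]  # up to two bases downstream
    if len(following) >= 1 and following[0] == "G":
        return "CG"
    if len(following) < 2:
        return None  # too close to the end of the sequence to classify
    if following[0] in H_BASES and following[1] == "G":
        return "CHG"
    if following[0] in H_BASES and following[1] in H_BASES:
        return "CHH"
    return None  # ambiguous base downstream


def count_contexts(seq):
    """Count forward-strand cytosines in each context."""
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i, base in enumerate(seq.upper()):
        if base == "C":
            context = cytosine_context(seq, i)
            if context is not None:
                counts[context] += 1
    return counts
```

For example, in the toy sequence `CGCAGCTT` the cytosines at indices 0, 2, and 5 fall in the CG, CHG, and CHH contexts respectively.
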
  2. Chan KL, Rosli R, Tatarinova TV, Hogan M, Firdaus-Raih M, Low EL
    BMC Bioinformatics, 2017 Jan 27;18(Suppl 1):1426.
    PMID: 28466793 DOI: 10.1186/s12859-016-1426-6
    BACKGROUND: Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion.

    RESULTS: We present an automated gene prediction pipeline, Seqping, that uses self-training HMMs and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using the GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure).

    CONCLUSIONS: Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.

  3. Sanusi NSNM, Rosli R, Halim MAA, Chan KL, Nagappan J, Azizi N, et al.
    Database (Oxford), 2018 Jan 01;2018.
    PMID: 30239681 DOI: 10.1093/database/bay095
    A set of Elaeis guineensis genes had been generated by combining two gene prediction pipelines: Fgenesh++ developed by Softberry and Seqping by the Malaysian Palm Oil Board. PalmXplore was developed to provide a scalable data repository and a user-friendly search engine system to efficiently store, manage and retrieve the oil palm gene sequences and annotations. Information deposited in PalmXplore includes predicted genes, their genomic coordinates, as well as the annotations derived from external databases, such as Pfam, Gene Ontology and Kyoto Encyclopedia of Genes and Genomes. Information about genes related to important traits, such as those involved in fatty acid biosynthesis (FAB) and disease resistance, is also provided. The system offers Basic Local Alignment Search Tool homology search, where the results can be downloaded or visualized in the oil palm genome browser (MYPalmViewer). PalmXplore is regularly updated offering new features, improvements to genome annotation and new genomic sequences. The system is freely accessible at http://palmxplore.mpob.gov.my.
  4. Ooi SE, Sarpan N, Taranenko E, Feshah I, Nuraziyan A, Roowi SH, et al.
    Plant Mol Biol, 2023 Mar;111(4-5):345-363.
    PMID: 36609897 DOI: 10.1007/s11103-022-01330-4
    The mantled phenotype is an abnormal somaclonal variant arising from the oil palm cloning process, and severe phenotypes lead to oil yield losses. Hypomethylation of the Karma retrotransposon within the B-type MADS-box EgDEF1 gene has been associated with this phenotype. While abnormal Karma-EgDEF1 hypomethylation was detected in mantled clones, we examined the methylation state of Karma in ortets that gave rise to high mantling rates in their clones. Small RNAs (sRNAs) were proposed to play a role in Karma hypomethylation as part of the RNA-directed DNA methylation process, hence differential expression analysis of sRNAs between the ortet groups was conducted. While no sRNA was differentially expressed at the Karma-EgDEF1 region, three sRNA clusters were differentially regulated in high-mantling ortets. The first two down-regulated clusters were possibly derived from long non-coding RNAs while the third up-regulated cluster was derived from the intron of a DnaJ chaperone gene. Several predicted mRNA targets for the first two sRNA clusters conversely displayed increased expression in high-mantling relative to low-mantling ortets. These predicted mRNA targets may be associated with defense or pathogenesis response. In addition, several differentially methylated regions (DMRs) were identified in Karma and its surrounding regions, mainly comprising subtle CHH hypomethylation in high-mantling ortets. Four of the 12 DMRs were located in a region corresponding to hypomethylated areas at the 3' end of Karma previously reported in mantled clones. Further investigations on these sRNAs and DMRs may indicate the predisposition of certain ortets towards mantled somaclonal variation.
  5. Chan KL, Tatarinova TV, Rosli R, Amiruddin N, Azizi N, Halim MAA, et al.
    Biol. Direct, 2017 Sep 08;12(1):21.
    PMID: 28886750 DOI: 10.1186/s13062-017-0191-4
    BACKGROUND: Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.

    RESULTS: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.

    CONCLUSIONS: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database ( http://palmxplore.mpob.gov.my ), will provide important resources for studies on the genomes of oil palm and related crops.

    REVIEWERS: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.
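
The RESULTS above define GC3 as the fraction of cytosine and guanine in the third position of a codon and use GC3 ≥ 0.75286 as the cutoff for GC3-rich genes. A minimal sketch of that calculation follows. The function names `gc3` and `is_gc3_rich` are hypothetical, and the sketch assumes an in-frame coding sequence and counts every codon as given (including any stop codon), which may differ from how the authors computed it.

```python
# Minimal sketch: GC3 of a coding sequence, i.e. the fraction of G or C
# at the third position of each codon.
# Assumptions: the input is an in-frame CDS whose length is a multiple of 3;
# every codon is counted as given. The 0.75286 cutoff is the GC3-rich
# threshold quoted in the abstract; the function names are illustrative.

GC3_RICH_THRESHOLD = 0.75286


def gc3(cds):
    """Return the fraction of third-codon-position bases that are G or C."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    third_positions = cds[2::3]  # bases at indices 2, 5, 8, ...
    gc_count = sum(1 for base in third_positions if base in "GC")
    return gc_count / len(third_positions)


def is_gc3_rich(cds, threshold=GC3_RICH_THRESHOLD):
    """Apply the GC3-rich cutoff reported in the abstract."""
    return gc3(cds) >= threshold
```

For the toy sequence `ATGGCCAAA` the third-position bases are G, C, and A, so GC3 is 2/3 and the sequence falls below the GC3-rich cutoff.
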

  6. Low EL, Chan KL, Zaki NM, Taranenko E, Ordway JM, Wischmeyer C, et al.
    G3 (Bethesda), 2024 Sep 04;14(9).
    PMID: 38918881 DOI: 10.1093/g3journal/jkae135
    Elaeis guineensis and E. oleifera are the two species of oil palm. E. guineensis is the most widely cultivated commercial species, and introgression of desirable traits from E. oleifera is ongoing. We report an improved E. guineensis genome assembly with substantially increased continuity and completeness, as well as the first chromosome-scale E. oleifera genome assembly. Each assembly was obtained by integration of long-read sequencing, proximity ligation sequencing, optical mapping, and genetic mapping. High interspecific genome conservation is observed between the two species. The study provides the most extensive gene annotation to date, including 46,697 E. guineensis and 38,658 E. oleifera gene predictions. Analyses of repetitive element families further resolve the DNA repeat architecture of both genomes. Comparative genomic analyses identified experimentally validated small structural variants between the oil palm species and resolved the mechanism of chromosomal fusions responsible for the evolutionary descending dysploidy from 18 to 16 chromosomes.