METHODS: A total of 322 samples of mainly human origin were analysed using eight protocols, applying a wide variety of laboratory components. Many samples (60% of human specimens) were processed using more than one protocol. In total, 712 sequencing libraries were investigated for viral sequence contamination.
RESULTS: Among sequences showing similarity to viruses, 493 were significantly associated with the use of laboratory components. Each of these viral sequences appeared sporadically, being identified in only a subset of the samples treated with the linked laboratory component, and some were not identified in the non-template control samples. Remarkably, more than 65% of all viral sequences identified were within viral clusters linked to the use of laboratory components.
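The kind of association described above can be illustrated with a one-sided Fisher's exact test on a 2x2 table of libraries (component used or not, sequence detected or not). This is only a sketch: the abstract does not state which statistical test was applied, so the choice of test, the function name `fisher_one_sided`, and the counts below are assumptions made for illustration, not values from the study.

```python
from math import comb


def fisher_one_sided(with_pos, with_neg, without_pos, without_neg):
    """One-sided Fisher's exact test for enrichment.

    Arguments are library counts (hypothetical layout):
      with_pos / with_neg       -- component used, sequence detected / not detected
      without_pos / without_neg -- component not used, sequence detected / not detected
    Returns P(at least `with_pos` positives among component libraries)
    under the hypergeometric null of no association.
    """
    n_with = with_pos + with_neg
    n_pos = with_pos + without_pos
    total = n_with + without_pos + without_neg
    denom = comb(total, n_with)
    upper = min(n_with, n_pos)
    tail = sum(
        comb(n_pos, k) * comb(total - n_pos, n_with - k)
        for k in range(with_pos, upper + 1)
    )
    return tail / denom


# Illustrative counts only: a sequence seen in 12 of 40 libraries that used
# a given component and in none of 60 libraries that did not.
p_value = fisher_one_sided(12, 28, 0, 60)
print(f"one-sided p = {p_value:.3g}")
```

Note that a sequence can be strongly associated with a component even when it is detected in only a minority of the libraries that used it, which is consistent with the sporadic appearance reported here.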
CONCLUSIONS: We show that a high prevalence of contaminating viral sequences can be expected in HTS-based virome data and provide an extensive list of novel contaminating viral sequences that can be used for evaluation of viral findings in future virome and metagenome studies. Moreover, we show that detection can be problematic due to stochastic appearance and limited non-template controls. Although the exact origin of these viral sequences requires further research, our results support laboratory-component-linked viral sequence contamination of both biological and synthetic origin.
FINDINGS: Our high-throughput workflow minimizes these risks via a 4-step strategy: (i) technical replication with 2 PCR replicates and 2 extraction replicates; (ii) use of multiple markers (12S, 16S, CytB); (iii) a "twin-tagging," 2-step PCR protocol; and (iv) use of the probabilistic taxonomic assignment method PROTAX, which can account for incomplete reference databases. Because annotation errors in the reference sequences can result in taxonomic misassignment, we supply a protocol for curating sequence datasets. For some taxonomic groups and some markers, curation resulted in >50% of sequences being deleted from public reference databases, owing to (i) limited overlap between our target amplicon and reference sequences, (ii) mislabelling of reference sequences, and (iii) redundancy. Finally, we provide a bioinformatic pipeline to process amplicons and conduct PROTAX assignment, and we tested it on an invertebrate-derived DNA dataset from 1,532 leeches from Sabah, Malaysia. Twin-tagging allowed us to detect and exclude sequences with non-matching tags. The smallest DNA fragment (16S) amplified most frequently for all samples but was less powerful for discriminating at species rank. Using stringent and lax acceptance criteria, we found 162 (stringent) and 190 (lax) vertebrate detections in 95 (stringent) and 109 (lax) leech samples.
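The twin-tag check and a replicate-based acceptance rule can be sketched in a few lines. This is a minimal illustration, not the published pipeline: the record fields (`fwd_tag`, `rev_tag`, `replicate`, `taxon`), the function names, and the toy reads are hypothetical. The abstract also does not define the stringent and lax criteria, so the "minimum number of replicates" threshold used here is an assumption.

```python
def split_by_twin_tag(reads):
    """Separate reads whose forward and reverse tags match from those that do not."""
    kept, rejected = [], []
    for read in reads:
        if read["fwd_tag"] == read["rev_tag"]:
            kept.append(read)
        else:
            rejected.append(read)  # non-matching tags: excluded from assignment
    return kept, rejected


def accept_taxa(reads, min_replicates):
    """Return taxa observed in at least `min_replicates` distinct replicates."""
    replicates_per_taxon = {}
    for read in reads:
        replicates_per_taxon.setdefault(read["taxon"], set()).add(read["replicate"])
    return sorted(
        taxon
        for taxon, replicates in replicates_per_taxon.items()
        if len(replicates) >= min_replicates
    )


# Toy reads with placeholder taxon labels (not data from the study).
reads = [
    {"fwd_tag": "ACGT", "rev_tag": "ACGT", "replicate": "A1", "taxon": "taxon_a"},
    {"fwd_tag": "ACGT", "rev_tag": "ACGT", "replicate": "A2", "taxon": "taxon_a"},
    {"fwd_tag": "ACGT", "rev_tag": "TTGA", "replicate": "A1", "taxon": "taxon_c"},
    {"fwd_tag": "GGCA", "rev_tag": "GGCA", "replicate": "B1", "taxon": "taxon_b"},
]

kept, rejected = split_by_twin_tag(reads)
stringent = accept_taxa(kept, min_replicates=2)  # assumed stringent rule
lax = accept_taxa(kept, min_replicates=1)        # assumed lax rule
print(len(rejected), stringent, lax)
```

Filtering on tags before counting replicates means a read with mismatched tags can never contribute to a detection, whichever acceptance threshold is applied afterwards.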
CONCLUSIONS: Our metabarcoding workflow should help research groups increase the robustness of their results and therefore facilitate wider use of environmental and invertebrate-derived DNA, which is turning into a valuable source of ecological and conservation information on tetrapods.
METHODS: A subcohort of 201 children with behavioural outcome measures was identified within a longitudinal, Australian birth-cohort study. The faecal microbiota were analysed at 1, 6, and 12 months of age. Behavioural outcomes were measured at 2 years of age.
FINDINGS: In an unselected birth cohort, we found a clear association between decreased normalised abundance of Prevotella in faecal samples collected at 12 months of age and increased behavioural problems at 2 years, in particular Internalizing Problem scores. This association appeared independent of multiple potentially confounding variables, including maternal mental health. Recent exposure to antibiotics was the best predictor of decreased Prevotella.
INTERPRETATION: Our findings demonstrate a strong association between the composition of the gut microbiota in infancy and subsequent behavioural outcomes, and support the importance of responsible use of antibiotics during early life.
FUNDING: This study was funded by the National Health and Medical Research Council of Australia (1082307, 1147980, 1129813), The Murdoch Children's Research Institute, Barwon Health, Deakin University, Perpetual Trustees, and The Shepherd Foundation. The funders had no involvement in the data collection, analysis or interpretation, trial design, recruitment or any other aspect pertinent to the study.