The aim was to develop stable, complete and accessible bioinformatic pipelines for the analysis of HTS metabarcoding data, and to benchmark best-practice approaches for HTS data analysis by comparing different pipelines. The team also wanted to investigate the effects of different field techniques on HTS output. During the stay, the results of the bioinformatic pipeline P.E.M.A. (recently developed at IMBBC-HCMR), obtained with different sets of parameters, were compared with those from QIIME2, a commonly used bioinformatic pipeline. This STSM produced a Singularity image of P.E.M.A. that is now available on the Singularity Hub. All the scripts developed during this STSM have been included in P.E.M.A.’s page on GitHub. A total of 22,198,605 reads derived from 73 paired-end samples, including two mock communities, were analyzed. Although an attempt was made to use similar parameters in the two pipelines, the final outcome differed considerably because certain steps are executed differently in P.E.M.A. and QIIME2. To compare the two pipelines, the taxonomies assigned to the OTUs still need to be compared, especially those from the mock communities. Initial findings indicated that mock community taxa were identified in the output of most methods tested, regardless of how the reads were merged. However, mock community taxa were not found using the Naive Bayes classifier implemented in QIIME2 against the Midori database, which warrants further investigation.
Download newsletter in PDF: DNAqua-Net_Newsletter4