Use case title (contact person)
Area 52 lab metabarcoding (Haris Zafeiropoulos, IMBBC, HCMR)
In brief: description of the analysis that was carried out
P.E.M.A. was used to analyse paired-end Illumina amplicon reads in the context of eDNA (environmental DNA) metabarcoding, a technique widely used for biodiversity assessment.
In brief: description of the pertinent project (and funding)
During a Short Term Scientific Mission at the Area 52 lab, University College Dublin, estuary samples were selected and analysed with two pipelines: QIIME2 and P.E.M.A. Our goal was twofold: to check whether specific species expected in those samples were actually detected, and to benchmark the two pipelines by comparing their outputs. Funding: DNAqua.net COST Action CA15219
Comments on how the cluster supported this work (e.g. available software, memory, storage). Requirements can be described in plain language, e.g. ¼ of the “batch” partition for 2 weeks
Technical language is also possible (e.g. CPU hours, workflow details)
There were ~22 million reads from 73 samples. About 15 different runs were tested over a two-week period, and each run used 1 node of the Zorba cluster. Depending on the parameter set, a job could last from 14 hours to 5 days. P.E.M.A. invokes a series of bioinformatics tools (FastQC, Trimmomatic, PANDAseq, etc.), which is why some steps used all requested resources while others used only a single CPU.
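As a rough illustration only, the figures above can be turned into lower and upper bounds on node-hours. This is a minimal sketch that assumes exactly 15 runs (the text says "about 15"), one node per run, and that every run took either the shortest or the longest reported time. The actual consumption was not recorded here and lies somewhere in between. The function name is hypothetical.

```python
# Rough node-hour bounds derived from the figures quoted in this section.
RUNS = 15             # "about 15" runs, treated as exact (assumption)
NODES_PER_RUN = 1     # each run used one node
MIN_HOURS = 14        # shortest reported job duration
MAX_HOURS = 5 * 24    # longest reported job duration (5 days)


def node_hour_bounds(runs, nodes_per_run, min_hours, max_hours):
    """Return (lower, upper) node-hours if all runs took the min or max time."""
    per_hour = runs * nodes_per_run
    return per_hour * min_hours, per_hour * max_hours


low, high = node_hour_bounds(RUNS, NODES_PER_RUN, MIN_HOURS, MAX_HOURS)
print(f"{low} to {high} node-hours")
```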
Relevant project/dataset web site if available
http://www.ucd.ie/area52/ , https://github.com/hariszaf/pema