Question: Trimmomatic output after filtering paired end data - fastqsanger vs fastqsanger.gz
After running Trimmomatic on a paired-end PolyA RNA-seq dataset where both the forward and reverse reads were in fastqsanger.gz format, trimming the forward reads produced fastqsanger.gz files, while trimming the reverse reads produced fastqsanger files. Q1. Is this normal output from the public Galaxy implementation of Trimmomatic? Q2. Can I simply uncompress the *.gz files so that both the forward and reverse trimmed datasets are ready for mapping (HISAT2)?

Thanks for any advice,

Bill Gerthoffer

If the input is compressed (gz), some tools keep the output compressed and others write it uncompressed. That said, I wouldn't expect this tool, or any other, to produce different output formats from inputs that are in the same format.

Maybe double-check the inputs and outputs to make certain of the datatype? The dataset name is just a name; the assigned datatype is what matters. The assignment can sometimes be incorrect and need to be changed. You can also compress or uncompress existing data within the Galaxy history.

Click the pencil icon on a dataset to reach the Edit Attributes forms. From there you can autodetect the metadata, including the datatype (the best choice if you are not sure of the actual format); convert to a new format (compress/uncompress); or directly assign a new datatype if the current assignment is incorrect or too general (for example, assigned as tabular when the data is actually a more specific tabular format such as pileup).
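
If you download the trimmed files and want to confirm outside Galaxy which ones are really compressed, a small Python sketch like the one below can help. It checks the first two bytes of the file for the gzip signature rather than trusting the file name, and writes an uncompressed copy either way. This assumes local copies of the files; the function names are illustrative only and are not part of Galaxy or Trimmomatic.

```python
import gzip
import shutil

# Every gzip stream starts with these two bytes, whatever the file is called.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file content starts with the gzip signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def decompress_if_needed(src, dest):
    """Write an uncompressed copy of src to dest (hypothetical helper).

    Works whether src is gzip-compressed or already plain text.
    """
    opener = gzip.open if is_gzipped(src) else open
    with opener(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

For example, calling `is_gzipped()` on the forward and reverse trimmed files would show whether the two really differ in compression or only in name or assigned datatype.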

Please see these FAQs for details:

Thanks! Jen, Galaxy team

