Hello, I have 3 paired-end FASTQ datasets, and when I load one into Galaxy through EBI-SRA, it shows three files: xxx.fastq.gz, xxx_1.fastq.gz and xxx_2.fastq.gz. I was wondering what the first file (the one without a number) is for?
1. When I run FastQC on these reads, the report shows some low quality. After trimming the sequences or clipping adapters, the reads still don't pass the FastQC tests. What could be the reason?
2. Is FastQC necessary for RNA-Seq data?
The xxx_1.fastq.gz and xxx_2.fastq.gz files are (likely already trimmed) paired-end files. The xxx.fastq.gz file is likely the file of "orphans" or "singletons" that resulted from the trimming process discarding one of the reads in a pair. In this dataset, it appears that the submitter didn't upload the raw files, but rather what was produced after trimming.
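If you want to check this on your own files, here is a minimal Python sketch that compares the read IDs in the three files. The function names (`read_ids`, `summarize`) are made up for illustration, and it assumes standard 4-line FASTQ records where the mate is marked, if at all, by a trailing `/1` or `/2` on the read ID. It keeps every ID in memory, so for large files you may want to try it on a subsample first.

```python
import gzip
from itertools import islice


def read_ids(path):
    """Yield the read ID of each 4-line FASTQ record in a gzipped file."""
    with gzip.open(path, "rt") as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            # Header looks like "@ID optional description"; keep only the ID.
            read_id = record[0][1:].split()[0]
            # Assumption: mates are distinguished by a trailing /1 or /2.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def summarize(mate1, mate2, singles):
    """Compare read IDs across the two mate files and the unnumbered file."""
    ids1 = set(read_ids(mate1))
    ids2 = set(read_ids(mate2))
    ids_single = set(read_ids(singles))
    return {
        # True if both mate files contain exactly the same set of read IDs.
        "pairs_match": ids1 == ids2,
        "n_pairs": len(ids1),
        "n_singles": len(ids_single),
        # Orphans should not also appear in the paired files.
        "singles_also_paired": len(ids_single & ids1),
    }
```

If the unnumbered file really holds orphans, you would expect `pairs_match` to be true and `singles_also_paired` to be 0, e.g. from `summarize("xxx_1.fastq.gz", "xxx_2.fastq.gz", "xxx.fastq.gz")`.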
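Whether reads will pass all of the FastQC tests will depend largely on the type of dataset. For example, RNA-seq datasets will ALWAYS fail some tests: typically "Per base sequence content", because random-hexamer priming biases the first bases of each read, and often "Sequence Duplication Levels", because highly expressed transcripts produce many identical reads. Trimming or adapter clipping won't change those results. In fact, the only type of dataset that should routinely pass all of the FastQC checks is whole-genome sequencing. So FastQC is still worth running on RNA-seq data, just don't expect every module to pass. If you're worried about a particular test, then post the test result and the EBI/SRA accession number so we can see what type of experiment the data is from.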
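To pull out just the tests worth posting, something like the following sketch may help. It assumes the tab-separated `summary.txt` that FastQC writes into its output archive (status, module name, file name on each line). The function name `failed_modules` and the sample text are made up for illustration and are not results from your data.

```python
def failed_modules(summary_text):
    """Return {module: status} for every FastQC module that did not PASS."""
    flagged = {}
    for line in summary_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, module = parts[0], parts[1]
        if status != "PASS":
            flagged[module] = status
    return flagged


# Illustrative input only; not taken from a real report.
example = (
    "PASS\tBasic Statistics\txxx_1.fastq.gz\n"
    "FAIL\tPer base sequence content\txxx_1.fastq.gz\n"
    "WARN\tSequence Duplication Levels\txxx_1.fastq.gz\n"
)
print(failed_modules(example))
```

For a real report, read `summary.txt` from the unzipped FastQC output and pass its contents to the function.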