I am new to Galaxy and have been trying to figure out how to align RNA-seq reads to a genome in order to observe differential gene expression. I have been experimenting with Drosophila melanogaster reads from the EBI SRA site.
Most experiments are labeled as paired-end reads; however, when I go to download the data there is usually only one file (some do have two). Does the one file mean that both ends are in the same file, and if so, in what format? Are the mates aligned next to each other in one long read, or stacked on top of each other? Is there a way to check whether the reads in a single file are paired-end, or to split paired-end reads that are all in one file into single-end reads?
Thank you in advance for any help with this issue.
Hi, could you give examples (experiment accessions will be fine) of a couple of cases which are described as paired-end, but only have a single data file?
These are the two specific files I have been working with:
DRX013093: 454 GS Junior paired end sequencing; Transcriptome analysis of uninfected 3rd instar larvae of D. melanogaster.
DRX013094: 454 GS Junior paired end sequencing; Transcriptome analysis of Penicillium-fungus infected 3rd instar larvae of D. melanogaster.
But here are two more examples of files I have looked at that specifically say paired end in the title and yet only have one file:
: Illumina Genome Analyzer IIx paired end sequencing; SRig10098
ERX006127: AB SOLiD System 3.0 paired end sequencing; RNAi of signal transduction components in Drosophila S2 cells
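On the question of checking whether a single file holds both ends: here is a minimal sketch in Python, assuming the download is a plain FASTQ file and that, if it is paired, the mates are interleaved (mate 1 then mate 2 as consecutive records, with names ending in /1 and /2, or with a "1:" / "2:" comment after the read name). Other platforms may store pairs differently, and this does not cover those layouts. The function names (read_fastq, mate_id, looks_interleaved, split_interleaved) are made up for illustration and are not part of Galaxy or any SRA tool.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) for each 4-line FASTQ record."""
    while True:
        block = list(islice(handle, 4))
        if len(block) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in block)
        yield header[1:], seq, qual  # drop the leading '@'


def mate_id(name):
    """Return (base_name, mate_number); mate_number is None if unlabeled."""
    ident, _, comment = name.partition(" ")
    if ident.endswith("/1") or ident.endswith("/2"):
        return ident[:-2], int(ident[-1])
    if comment[:2] in ("1:", "2:"):
        return ident, int(comment[0])
    return ident, None


def looks_interleaved(handle, max_pairs=1000):
    """Check that the first records alternate mate 1 / mate 2 with equal names."""
    records = read_fastq(handle)
    checked = 0
    while checked < max_pairs:
        first = next(records, None)
        second = next(records, None)
        if first is None:
            break
        if second is None:
            return False  # odd number of records
        base1, mate1 = mate_id(first[0])
        base2, mate2 = mate_id(second[0])
        if base1 != base2 or (mate1, mate2) != (1, 2):
            return False
        checked += 1
    return checked > 0


def split_interleaved(handle, out1, out2):
    """Write even-indexed records to out1 and odd-indexed records to out2."""
    for i, (name, seq, qual) in enumerate(read_fastq(handle)):
        out = out1 if i % 2 == 0 else out2
        out.write(f"@{name}\n{seq}\n+\n{qual}\n")


# Tiny made-up example: one interleaved pair.
demo = "@r1/1\nACGT\n+\nIIII\n@r1/2\nTGCA\n+\nIIII\n"
print(looks_interleaved(io.StringIO(demo)))
```

For a real download you would pass an open file handle instead of the io.StringIO demo, and only call split_interleaved if looks_interleaved returns True. If the check fails, the file may be genuinely single-end or may use a different pairing layout, so it is worth looking at the first few records by eye.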