Question: sam to fastq
4.3 years ago, mduffy wrote:


I mapped paired-end reads to a custom-built genome using bwa, converted the SAM output to BAM, and filtered out unpaired reads and reads that didn't align. I then converted the filtered BAM back to SAM and tried to convert that SAM to FASTQ, but got the error

Ignoring SAM validation error due to lenient parsing:
Error parsing text SAM file. Empty sequence dictionary.; Line 1

This appears at the start of each line, and at the end of the file I get

Exception in thread "main" net.sf.picard.PicardException: Found 1014 unpaired mates
	at net.sf.picard.sam.SamToFastq.doWork(
	at net.sf.picard.cmdline.CommandLineProgram.instanceMain(
	at net.sf.picard.sam.SamToFastq.main(

The same SAM file runs OK in Cufflinks. I want the FASTQ files so I can attempt de novo assembly of the mapped reads with Velvet, because the reads were mapped to variant genes with some homology but are predicted to have recombined into novel arrangements.

Thanks for any tips



4.3 years ago, Jennifer Hillman Jackson (United States) wrote:


Update: I failed to read this fully. You are working with unmapped sequences. BWA does not report these individually, but Tophat2 will, as an advanced option ("Write unaligned reads to seperate file(s)"). To my knowledge, SAM->Fastq will convert unmapped reads from any SAM. But do not attempt to convert SAM->Interval, as there will be no mappings to report (this is not useful for assembly, but the info may aid someone else reading).
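
To illustrate what a SAM->Fastq conversion has to do, and where a "Found N unpaired mates" complaint can come from, here is a minimal Python sketch. It is not Picard's implementation: the function names `split_mates` and `to_fastq` are made up for this example, and it assumes tab-separated SAM text with the standard flag bits (0x10 reverse strand, 0x80 second mate, 0x100/0x800 secondary/supplementary). Read names that do not have both mates are returned in a separate list instead of raising an error.

```python
from collections import defaultdict

# Translation table used to reverse-complement reads stored on the reverse strand.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def split_mates(sam_lines):
    """Group primary SAM records by read name into mate 1 / mate 2 slots."""
    mates = defaultdict(dict)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        if flag & 0x10:
            # SAM stores reverse-strand reads reverse-complemented;
            # restore the original read orientation for FASTQ output.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        mates[name][2 if flag & 0x80 else 1] = (seq, qual)
    return mates


def to_fastq(mates):
    """Return (fastq for mate 1, fastq for mate 2, names missing a mate)."""
    fq1, fq2, unpaired = [], [], []
    for name, slots in mates.items():
        if 1 in slots and 2 in slots:
            fq1.append(f"@{name}/1\n{slots[1][0]}\n+\n{slots[1][1]}\n")
            fq2.append(f"@{name}/2\n{slots[2][0]}\n+\n{slots[2][1]}\n")
        else:
            unpaired.append(name)
    return "".join(fq1), "".join(fq2), unpaired


# Toy input: r1 has both mates, r2 has lost its mate (e.g. through filtering).
example = [
    "@SQ\tSN:chr1\tLN:100",
    "r1\t99\tchr1\t1\t60\t4M\t=\t10\t13\tACGT\tIIII",
    "r1\t147\tchr1\t10\t60\t4M\t=\t1\t-13\tAACC\tABCD",
    "r2\t73\tchr1\t5\t60\t4M\t*\t0\t0\tTTTT\tIIII",
]
fastq1, fastq2, missing_mate = to_fastq(split_mates(example))
```

In this toy case `missing_mate` holds the one read name whose partner is absent. Reads like that are what a paired conversion has to either drop or report.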

-- original ->

This tool requires the SAM header lines. Filtering steps will usually remove these, as will converting BAM->SAM without selecting the option to include headers. If needed, Picard has a tool for replacing headers from a donor dataset. Please give this a try.
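
The "Empty sequence dictionary" message is consistent with a SAM file that has no `@SQ` header lines. A quick check before running the conversion could look like the Python sketch below. `count_sq_lines` is a hypothetical helper written for this example, not part of Picard or samtools. On the command line, `samtools view -h` keeps the header when writing SAM from BAM.

```python
def count_sq_lines(sam_lines):
    """Count @SQ (reference sequence) header lines at the top of a SAM file."""
    count = 0
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines always come before the alignment records
        if line.startswith("@SQ\t"):
            count += 1
    return count


# Toy inputs: one SAM with a header, one with the header stripped.
with_header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:100",
    "r1\t99\tchr1\t1\t60\t4M\t=\t10\t13\tACGT\tIIII",
]
without_header = with_header[2:]

header_ok = count_sq_lines(with_header) > 0
header_missing = count_sq_lines(without_header) == 0
```

To run it on a real file, pass an open file handle, for example `count_sq_lines(open("filtered.sam"))`, where `filtered.sam` is a placeholder path.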

Best, Jen, Galaxy team



As you said, I thought I could extract unaligned reads as an advanced option when I align RNA-seq reads with Tophat2. I cannot find that option in Tophat2. Am I missing something?

Reply written 4.2 years ago by hanah.ng