Question: Non-compatible file format for SAM tools Filter Pileup
macksm (United States) wrote 4.5 years ago:

Hi,

We are attempting a paired-end mapping, following the instructions in one of the Vimeo videos titled "qk12". Everything went well until we completed the Generate pileup step: the Filter pileup tool will not accept the resulting pileup as input. The error message says: History does not include a dataset of the required format / build.


Could you please let us know how to make our pileup file compatible with this tool, or whether there is a step we need to add in between? That would be great.

Thank you so much,

Savannah

 

Tags: rna-seq, samtools
Jennifer Hillman Jackson (United States) wrote 4.5 years ago:

Hello,

That quickie video was created some time ago, and some adjustments are now needed for certain use cases. Does the input file have the datatype "pileup" assigned? I am guessing not. If the format is in fact pileup (either output type from the pileup tool, see the tool form for a description, but NOT the bcf option from mpileup), you can assign the datatype directly. Here is how:
http://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset
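
Before assigning the datatype by hand, it can help to confirm that the file really has a pileup layout. The following is a minimal sketch, not Galaxy's own format check. It assumes the two SAMtools pileup layouts: six tab-separated columns (chromosome, position, reference base, coverage, read bases, base qualities) or ten columns with consensus fields, where coverage moves to column 8. The function name `looks_like_pileup` is hypothetical.

```python
def looks_like_pileup(line):
    """Return True if a tab-separated line fits the 6- or 10-column pileup layout.

    Assumption: position is column 2; coverage is column 4 in the
    six-column layout and column 8 in the ten-column (consensus) layout.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 6:
        position, coverage = fields[1], fields[3]
    elif len(fields) == 10:
        position, coverage = fields[1], fields[7]
    else:
        return False
    # Both values must be plain non-negative integers.
    return position.isdigit() and coverage.isdigit()


# Made-up example rows, for illustration only.
six_column = "chr1\t100\tA\t3\t...\tIII"
vcf_like = "chr1\t100\t.\tA\tG\t50\t.\tDP=3\tGT\t0/1"
print(looks_like_pileup(six_column))
print(looks_like_pileup(vcf_like))
```

A variant-call style row, such as text derived from the mpileup bcf option, fails this check even when it happens to have ten columns, because its eighth column is not an integer coverage value.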

Hopefully this helps, but if not, let us know and we can share a history (directly, to maintain privacy) if reproducible on http://usegalaxy.org

Best, Jen, Galaxy team

 


This worked! Thank you so much for the fast reply.

Savannah

written 4.5 years ago by macksm

Now that the filter step has completed, it says that it is an empty data file. Could you possibly send me the history that you mentioned above so that I can compare steps? Thanks so much.
 

written 4.5 years ago by macksm

Was this intended for this thread? If so, can you clarify? Thanks, Jen

written 4.5 years ago by Jennifer Hillman Jackson

Hi,

Sorry I wasn't specific enough. The original question about file compatibility was solved by your response: I was able to change the format to pileup. We then ran the "Filter Pileup" step, but when it completed, it said that the data file was empty and would not allow us to view it. We are in the middle of a rerun, but can you think of a reason it would show up this way?

written 4.5 years ago by macksm

A few of the options will exclude certain bases from the report for cause (e.g. quality lower than the minimum set, or reporting only variants when there are none). Reducing the stringency to match your data and goals is a good first step when doing a re-run.
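
To see how a threshold can empty the output, here is a rough sketch that counts how many rows survive a minimum-coverage cutoff at several stringency levels. It is not the Filter pileup tool's algorithm. It assumes a six-column pileup with coverage in column 4, and the function name `count_passing` and the sample rows are made up for illustration.

```python
def count_passing(pileup_lines, min_coverage):
    """Count six-column pileup rows whose coverage (column 4) meets the minimum."""
    kept = 0
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or not fields[3].isdigit():
            continue  # skip rows that do not fit the assumed layout
        if int(fields[3]) >= min_coverage:
            kept += 1
    return kept


# Made-up rows: chromosome, position, reference base, coverage, bases, qualities.
rows = [
    "chr1\t100\tA\t2\t..\tII",
    "chr1\t101\tC\t5\t..,.,\tIIIII",
    "chr1\t102\tG\t1\t.\tI",
]

# Try progressively less stringent cutoffs and report how many rows remain.
for threshold in (10, 3, 1):
    print(threshold, count_passing(rows, threshold))
```

If the strictest cutoff keeps zero rows while a relaxed one keeps some, the empty result is more likely a stringency issue than a format problem.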

A few QA issues could also be a factor; some produce errors with the tool and some do not. Two important ones are:

1. Were the quality scaling and datatype assigned to the input sequences correct before mapping (did you run FastQC to verify)?
https://wiki.galaxyproject.org/Support#FASTQ_Datatype_QA

2. Is the dataset assigned to the correct genome, and do the identifiers in your file match those in the built-in index in Galaxy? (They should if mapping was run on this same server; they can differ if the data was mapped elsewhere and uploaded.)
https://wiki.galaxyproject.org/Support#Reference_genomes
https://wiki.galaxyproject.org/Support/ChromIdentifiers
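
Both checks can be approximated outside Galaxy with a short script. This is a hedged sketch and no substitute for FastQC or the wiki pages above. The offset guess is a common heuristic: any quality character below ASCII 59 can only come from Phred+33 encoding, while the absence of such characters merely suggests Phred+64. The function names and sample values are hypothetical.

```python
def guess_phred_offset(quality_strings):
    """Heuristically guess the FASTQ quality offset (33 or 64).

    Returns None when there are no quality characters to inspect.
    A result of 64 is only a suggestion: high-quality Phred+33 data
    can also lack low ASCII characters.
    """
    codes = [ord(ch) for qualities in quality_strings for ch in qualities]
    if not codes:
        return None
    return 33 if min(codes) < 59 else 64


def unmatched_identifiers(pileup_lines, reference_names):
    """List chromosome identifiers (column 1) that are absent from the reference."""
    seen = {line.split("\t", 1)[0] for line in pileup_lines if line.strip()}
    return sorted(seen - set(reference_names))


# Made-up inputs for illustration.
print(guess_phred_offset(["IIII#", "IIHG5"]))
print(unmatched_identifiers(["1\t100\tA\t3\t...\tIII"], ["chr1", "chr2"]))
```

An identifier such as `1` in the data against `chr1` in the reference is the kind of mismatch the second function reports.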

Give these a check and let us know if you still have an issue. Thanks! Jen

written 4.5 years ago by Jennifer Hillman Jackson