Having a few teething problems on my first use of Galaxy (surprise!)
Workflow as follows:
Upload fastq files (forward and reverse)
Fastq groomer
Trimmomatic
BWA
SAM to BAM
MPileup
ANNOVAR
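For reference, the workflow above corresponds roughly to the following command-line sketch (outside Galaxy). This is only a sketch under assumptions: the file names (`sample_R1.fastq`, `hg19.fa`, etc.) are placeholders, and modern `bcftools mpileup`/`call` stands in for the older Galaxy MPileup tool. The script exits early if the inputs or tools are not present, since it is illustrative rather than a tested recipe:

```shell
#!/bin/sh
# Sketch only: placeholder file names; exits early if inputs/tools are absent.
[ -f sample_R1.fastq ] || { echo "inputs not present; nothing to do"; exit 0; }
for tool in trimmomatic bwa samtools bcftools; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not installed"; exit 0; }
done

# 1. Quality/adapter trimming (Trimmomatic, paired-end mode)
trimmomatic PE sample_R1.fastq sample_R2.fastq \
    trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq \
    SLIDINGWINDOW:4:20 MINLEN:36

# 2. Map with BWA-MEM, then sort and index the BAM
bwa mem hg19.fa trimmed_R1.fastq trimmed_R2.fastq | samtools sort -o sample.bam -
samtools index sample.bam

# 3. Pile up and call variants; "-v" keeps variant sites only, so the VCF
#    is not one record per covered base
bcftools mpileup -f hg19.fa sample.bam | bcftools call -mv -Ov -o sample.vcf
```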
ANNOVAR gives an empty VCF file and, on closer inspection, MPileup gives 55,000,000 lines from 3,600,000 lines in the SAM.
MPileup appears to have called every base, with each line looking like this:
chr10  1812199  .  T  <X>  0  .  DP=1;I16=1,0,0,0,27,729,0,0,0,0,0,0,0,0,0,0;QS=1,0;MQ0F=1  PL  0,3,4
The ALT base is <X> on every line. I assumed it was a reference genome mismatch, but I have tried several reference genomes and get the same (or a similar) problem.
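For what it's worth, `<X>` (written `<*>` in newer samtools/bcftools versions) is mpileup's placeholder for an unobserved alternate allele, so a record per covered base is expected unless a variant-calling step filters the output down to real variants. A minimal sketch of that filtering in Python, with the VCF column layout assumed from the record above and the sample lines invented for illustration:

```python
# Sketch: keep only VCF records whose ALT is a real allele, dropping the
# per-base <X>/<*> placeholder records that mpileup emits for every position.

def real_variants(vcf_lines):
    """Yield header lines and records whose ALT column is a real allele."""
    for line in vcf_lines:
        if line.startswith("#"):           # keep header lines untouched
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        alt = fields[4]                    # VCF column 5 is ALT
        if alt not in (".", "<X>", "<*>"):
            yield line

# Invented sample data mimicking the record quoted above (tab-separated VCF)
sample = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr10\t1812199\t.\tT\t<X>\t0\t.\tDP=1\tPL\t0,3,4",
    "chr10\t1812300\t.\tG\tA\t60\t.\tDP=20\tPL\t120,0,90",
]
kept = list(real_variants(sample))
print(len(kept))  # → 2 (the header plus the one real variant)
```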
Any offers?
If you can reproduce this at http://usegalaxy.org, please send in a bug report (from any error dataset; it doesn't have to be from this particular analysis, just a tool in the same history). Or create a shared history link and email that to galaxy-bugs@lists.galaxyproject.org. Please note the dataset numbers involved.
Make sure that all datasets in the analysis are undeleted for at least one complete pass from start to end.
Include a link to this Biostars post so we can cross-reference the two (in the email or bug report comments).
So that you know, ANNOVAR is only pre-cached with one genome at Galaxy Main (hg19). MPileup's advanced settings could also be a factor. Sharing is the best way to see everything at once and get to the root of the problem.
Thanks, Jen, Galaxy team
The history has become pretty messy, as I have been trying everything, first to identify the point where it goes wrong and then to try to solve it.
https://usegalaxy.org/u/kelly-hunter/h/galaxy-presentation
Thank you kindly!
I'll take a look and get back to you by early next week. Thanks!
P.S. Sharing this way means that everyone has public access to your data. If you don't want that, unshare it and share directly with me instead. Send an email to the galaxy-bugs list to arrange that.
Jen
It's fine, it's public data anyway!
Any help at all by Monday would be greatly appreciated if possible and thanks again either way.
P.S. In order not to interfere with that history, is it possible for me to use the data uploaded to this history from another one?
Yes, use the "copy datasets" function to create a clone in another history.
I did notice that the inputs do not quite match up (the inputs are data 1 and data 2, but all other analysis is based on non-existent data 3 and data 4). That means the first jobs are "data 3 acting on data 3" and "data 4 acting on data 4". I'm not sure how this could occur if everything was executed in the same history, but it may not matter.