Question: Line Estimation For Pileup Generation
7.3 years ago, Austin Paul wrote:
Hello, I am curious whether the line estimate shown in the history window for pileup generation is at all accurate. I am using the pileup files to generate expression data from bwa mapping to look at differential expression, but I am having some trouble understanding the line estimates. For example, for one pileup file, when I cut the reference id column and the number of hits column (columns 1 and 4), the estimated number of lines in the cut file is about 25% of that for the pileup file, and for another file it is 5000%. How can the number of lines grow 50x when I am just cutting columns from the file? Shouldn't the line estimate be the same? Thanks, Austin
7.3 years ago, Dannon Baker (United States) wrote:
As a first step, please confirm an exact line count for the files. See the "Line/Word/Character count" tool in the Text Manipulation section to do this. If the estimate is significantly off, please share the history with me and I'll take a look to see what happened with those particular datasets. Thanks! -Dannon
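For anyone who wants to double-check outside Galaxy, here is a minimal sketch of the same idea: count lines exactly, cut columns 1 and 4, and confirm the line count does not change. It assumes a tab-separated pileup file; the helper names `count_lines` and `cut_columns` and the sample rows are made up for illustration and are not part of any Galaxy tool.

```python
import io


def count_lines(handle):
    """Return the exact number of lines by reading the whole stream."""
    return sum(1 for _ in handle)


def cut_columns(handle, columns=(1, 4), sep="\t"):
    """Yield the selected 1-based columns, one output line per input line."""
    for line in handle:
        fields = line.rstrip("\n").split(sep)
        yield sep.join(fields[c - 1] for c in columns)


# Made-up pileup-like rows (reference id, position, base, coverage, ...),
# used here only so the sketch runs without a real file.
sample = (
    "chr1\t100\tA\t3\t...\t~~~\n"
    "chr1\t101\tC\t5\t.....\t~~~~~\n"
    "chr2\t7\tG\t1\t.\t~\n"
)

original_count = count_lines(io.StringIO(sample))
cut_lines = list(cut_columns(io.StringIO(sample)))

# Cutting columns never adds or drops lines, so both counts should match.
print(original_count, len(cut_lines))
```

With a real file you would pass `open("your_file.pileup")` in place of the `io.StringIO` objects. An exact count has to read every line, which is presumably why the history window shows only an estimate.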
Hi Dannon, Thanks for telling me about that count tool. I had not used it before. So, it seems the line estimates in the history window are a bit screwy. For one of the pileup files I mentioned, the estimate was ~4,000,000 lines and the count tool showed 988,000. For the other, the estimate was ~200,000 and the count tool showed 6,382,447. The line totals on the cut files were off as well, but the count tool showed consistent numbers between the pileup files and the cut files, so I feel better. Thanks again. Austin
Reply written 7.3 years ago by Austin Paul
Sure, no problem. Those estimates are indeed way off; ideally they're within about 10% of the actual count. Would you mind sharing the history with me at this email address so that I might take a look and figure out where the estimation went wrong? Thanks! -Dannon
Reply written 7.3 years ago by Dannon Baker
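To put numbers on "way off", here is a small sketch that compares each estimate from this thread with the exact count and checks it against the roughly 10% target mentioned above. The function names and the `tolerance` parameter are hypothetical and chosen only for this example; this is not how Galaxy itself computes or validates its estimates.

```python
def relative_error(estimate, actual):
    """Return |estimate - actual| as a fraction of the actual count."""
    return abs(estimate - actual) / actual


def within_tolerance(estimate, actual, tolerance=0.10):
    """True if the estimate is within the given fraction of the actual count."""
    return relative_error(estimate, actual) <= tolerance


# (estimated lines, exact lines) for the two pileup files reported in the thread.
reported = [
    (4_000_000, 988_000),
    (200_000, 6_382_447),
]

for estimate, actual in reported:
    error = relative_error(estimate, actual)
    status = "ok" if within_tolerance(estimate, actual) else "off"
    print(f"estimate={estimate:,} actual={actual:,} error={error:.0%} {status}")
```

Both files fall far outside the 10% band, one as an overestimate and the other as an underestimate.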