Question: Error Running The Picard Tool?
4.8 years ago by
European Union
Dottorini, Tania wrote:
Hi,

I am trying to determine the mean inner distance between mate pairs, but I am getting odd results. Briefly, I mapped paired-end data (2x100 bp reads) with Bowtie against the reference transcriptome and used the Picard Insert Size Metrics tool to calculate the mean insert size. I then subtracted the combined read length (2*100) from the mean insert size to obtain the mean inner distance between mate pairs. In all cases studied (4 samples and 4 controls), I obtained negative mean inner distance values.

In addition, in some cases I got the following error when running the Picard tool:

"Unable to find expected pdf file /galaxy/main/jobdir/006/471/.…../InsertSizeHist.pdf …This always happens if single ended data was provided to this tool"

However, I can confirm that I provided paired-end NGS data in all cases. For some of the runs that gave this failure log, I reran the analysis mapping to the genome instead of the transcriptome. In that case it worked, but it still gave negative distance values.

I would like to know whether this is the correct procedure, whether there are other approaches I can use to find these distance values, and whether I can use such negative distance values in Tophat.

Thank you for your help,
Tania
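For illustration, the subtraction described above can be written as a short Python sketch. The function name and the mean insert sizes below are hypothetical and are not values from these runs. A negative result simply means that, on average, the fragment is shorter than the two reads combined, so the mates overlap.

```python
def mean_inner_distance(mean_insert_size: float, read_length: int) -> float:
    """Inner distance = fragment (insert) size minus the two read lengths."""
    return mean_insert_size - 2 * read_length


# Hypothetical mean insert sizes for 2x100 bp reads (not data from this thread).
for mean_insert in (350.0, 200.0, 160.0):
    inner = mean_inner_distance(mean_insert, read_length=100)
    if inner < 0:
        note = "mates overlap on average"
    else:
        note = "mates do not overlap on average"
    print(f"mean insert {mean_insert:.0f} bp, inner distance {inner:.0f} bp ({note})")
```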
alignment bowtie • 1.2k views
4.8 years ago by
United States
Jennifer Hillman Jackson wrote:
Hi Tania,

Is the data RNA? Did you filter for properly mapped pairs first ('Filter SAM')? I am guessing not, and that is probably where the error about single-end data is coming from. Unfiltered pairs may also contribute insert size values that skew the mean.

If you plan on using the Tuxedo pipeline, you might be able to skip Picard and just use a sample of the data to obtain this number. The tool authors suggest running Tophat on a sample (a few hundred pairs), as explained in the link below. A mean (and other calculations) can be generated on any tabular column of data using the tools 'Group', 'Compute', or 'Summary Statistics' (use the tool search to find these in the tool panel):

http://tophat.cbcb.umd.edu/faq.shtml#mate_inner_dist

Hopefully one of these options works for you,

Jen
Galaxy team

-- Jennifer Hillman-Jackson http://galaxyproject.org
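Outside Galaxy, the same idea (keep only properly paired alignments, then take the mean of the insert size column) could be sketched in plain Python as below. This is only an illustration and not what the Galaxy tools run internally. It assumes standard SAM columns (FLAG in column 2, TLEN in column 9); the helper names and the records are made up for the example.

```python
from statistics import mean

PROPER_PAIR = 0x2      # SAM FLAG bit: pair aligned properly
SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


def insert_sizes(sam_lines):
    """Yield TLEN once per properly paired fragment (the mate with TLEN > 0)."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, tlen = int(fields[1]), int(fields[8])
        if not flag & PROPER_PAIR:
            continue
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if tlen > 0:
            yield tlen


def record(name, flag, pos, mate_pos, tlen):
    """Build a minimal made-up SAM line (SEQ and QUAL left as '*')."""
    cols = [name, flag, "chr1", pos, 60, "100M", "=", mate_pos, tlen, "*", "*"]
    return "\t".join(str(c) for c in cols)


# Made-up records: two proper pairs and one read whose mate is unmapped.
sam = [
    "@HD\tVN:1.6",
    record("r1", 99, 100, 170, 170), record("r1", 147, 170, 100, -170),
    record("r2", 99, 500, 590, 190), record("r2", 147, 590, 500, -190),
    record("r3", 73, 900, 900, 0),
]

read_length = 100
mean_insert = mean(insert_sizes(sam))
inner = mean_insert - 2 * read_length
print(f"mean insert size {mean_insert} bp, mean inner distance {inner} bp")
```

Taking only the mate with a positive TLEN counts each fragment once; without that step every pair would contribute one positive and one negative value and the mean would collapse towards zero.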
