Question: using mPileup to examine allelic imbalance in hybrids
Asked 4.5 years ago by Suzanne Gomes (Canada):

Hi,

I have some RNA-seq data for two different species and their hybrids. I want to look at the expression levels of the allele from each parent in the hybrids. I am trying to use mPileup for this. First, I want to find all positions with fixed differences between the two parent species. 

My problem is that I want to select only positions supported by a certain minimum number of reads. But mPileup outputs a lot of > and < symbols, which I have read represent large gaps in a read (reference skips, so presumably introns spanned by a read). These symbols count towards the reported read depth, so after filtering I still get many positions that actually have fewer reads covering them than the minimum I want. Is there a way to get mPileup not to report these gaps? Or can anyone think of a way to filter them out after the fact?
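One way to filter after the fact is to recount the depth from the read-bases column while ignoring the > and < symbols. The following is only a minimal sketch, assuming the standard six-column pileup layout (chromosome, position, reference base, depth, read bases, qualities); the names `real_depth` and `filter_pileup` are made up for illustration and are not part of samtools or Galaxy.

```python
def real_depth(read_bases: str) -> int:
    """Count reads that carry an actual base at this position.

    Reference skips (> and <), deletion placeholders (*), read-start
    markers (^ plus a mapping-quality character), read-end markers ($)
    and indel descriptions (+2AG, -1T) are not counted.
    """
    depth = 0
    i = 0
    n = len(read_bases)
    while i < n:
        c = read_bases[i]
        if c == "^":
            # Read start: skip the marker and the mapping-quality character.
            i += 2
        elif c in "+-":
            # Indel: skip the sign, the length digits and the indel sequence.
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            i = j + int(read_bases[i + 1:j] or 0)
        else:
            if c in ".,ACGTNacgtn":
                depth += 1
            i += 1
    return depth


def filter_pileup(lines, min_depth):
    """Yield pileup lines whose recounted depth is at least min_depth."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete pileup record
        if real_depth(fields[4]) >= min_depth:
            yield line
```

For a real file, `filter_pileup(open("parent.pileup"), 10)` would iterate over the retained lines; the file name and the threshold of 10 are placeholders.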

My plan after mPileup and filtering for quality and minimum read count is to then find positions where only a single base is represented in the reads. I'll do this for each parent, and then compare which base occurs in each parent to find any fixed differences between the two.
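The comparison step could look roughly like this. It is a sketch under the same assumption about the pileup columns, it ignores N calls and indels, and `observed_bases`, `fixed_sites` and `fixed_differences` are hypothetical helper names, not functions from any existing tool.

```python
def observed_bases(ref, read_bases):
    """Return the set of distinct bases (A/C/G/T) seen at one position."""
    seen = set()
    i, n = 0, len(read_bases)
    while i < n:
        c = read_bases[i]
        if c == "^":
            i += 2  # read start marker plus mapping-quality character
            continue
        if c in "+-":
            # Skip the indel length and the indel sequence itself.
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            i = j + int(read_bases[i + 1:j] or 0)
            continue
        if c in ".,":
            seen.add(ref.upper())  # match to the reference base
        elif c.upper() in "ACGT":
            seen.add(c.upper())    # mismatch on either strand
        i += 1
    return seen


def fixed_sites(pileup_lines):
    """Map (chrom, pos) to the base at positions where all reads agree."""
    sites = {}
    for line in pileup_lines:
        chrom, pos, ref, _depth, bases = line.rstrip("\n").split("\t")[:5]
        seen = observed_bases(ref, bases)
        if len(seen) == 1:
            sites[(chrom, int(pos))] = seen.pop()
    return sites


def fixed_differences(sites_a, sites_b):
    """List positions fixed in both parents but for different bases."""
    shared = sorted(set(sites_a) & set(sites_b))
    return [(chrom, pos, sites_a[(chrom, pos)], sites_b[(chrom, pos)])
            for chrom, pos in shared
            if sites_a[(chrom, pos)] != sites_b[(chrom, pos)]]
```

Each parent's pileup would be depth-filtered first, then passed through `fixed_sites`, and the two resulting dictionaries compared with `fixed_differences`.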

After I have the list of fixed differences between the parents, I want to do a pileup of just those positions in the hybrids. I see there is an option 'List of regions or sites on which to operate' in mPileup, which seems to require a BED format file. How would I convert the list of positions in the parents to a BED file? 
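For the BED conversion, the main thing to watch is the coordinate system: pileup positions are 1-based, while BED intervals are 0-based and half-open, so a single site at position p becomes start p-1 and end p. A minimal sketch, assuming the sites are available as (chromosome, position) pairs; `positions_to_bed` is a hypothetical name chosen for illustration.

```python
def positions_to_bed(sites):
    """Convert 1-based (chrom, pos) sites to three-column BED lines."""
    bed_lines = []
    for chrom, pos in sorted(sites):
        start = pos - 1  # BED start is 0-based
        end = pos        # BED end is exclusive, so it equals the 1-based position
        bed_lines.append(f"{chrom}\t{start}\t{end}")
    return bed_lines


def write_bed(sites, path):
    """Write the BED lines for the given sites to a file."""
    with open(path, "w") as handle:
        for line in positions_to_bed(sites):
            handle.write(line + "\n")
```

For example, `write_bed([("chr1", 10), ("chr2", 5)], "fixed_sites.bed")` would write two one-base intervals; the output file name is a placeholder.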

I'd appreciate any help/suggestions! 

 

rna-seq snp • 1.5k views
Answer, written 4.4 years ago by Jennifer Hillman Jackson (United States):

Hello,

This is probably not the best tool for the job. Have you considered the tools "Naive Variant Caller" and "Variant Annotator"? Other tools in the same group, NGS: Variant Analysis, may also be of interest. What you are looking for is to work in VCF format, with tools that have the option to retain all data (including sites *without* variation) and provide depth information. These two will do that, so they are a good place to start.

Others are welcome to add to the advice and recommend alternative tools. There are almost certainly many solutions to this question, even just using the tools on the public Main Galaxy server (http://usegalaxy.org), and even more in the Tool Shed (http://usegalaxy.org/toolshed) for local/cloud use.

Jen, Galaxy team

 
