Question: Mapping (with HISAT2 / Bowtie) on custom genomes always yields 0 matches
stoyan.velkov wrote (3 months ago):

Hi,

I have the following problem. I'm using HISAT2 and Bowtie with a 1200-nucleotide gene that is present in hg38. When I align my FASTQ data (reads of around 250 nucleotides) to hg38, 99%+ of the reads map to that gene. When I use the gene itself as a FASTA genome for mapping, I get 0% mapping. The reads all lie within those 1200 bases, and if I run BLAST on some of the reads I get a 100% match. Can someone explain what I'm doing wrong?

I can see that the script recognized the gene that I'm using as a genome:

@SQ SN:HPD_xdna_1182    LN:1182

Could the issue be there? Nothing changes even if I rename the sequence.

Dataset peek HISAT2 summary stats:

Total reads: 86715

    Aligned 0 time: 86715 (100.00%)

    Aligned 1 time: 0 (0.00%)

    Aligned >1 times: 0 (0.00%)
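
One way to double-check, independently of any aligner, the observation that the reads lie within the gene is an exact substring test on both strands. The following is only a minimal sketch: the function names (`reverse_complement`, `strand_of_exact_match`) and the toy sequences are hypothetical and not taken from the thread, and an exact test will miss reads that contain even a single mismatch.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_of_exact_match(read, reference):
    """Return '+' if the read occurs verbatim in the reference,
    '-' if its reverse complement does, and None otherwise."""
    read = read.upper()
    reference = reference.upper()
    if read in reference:
        return "+"
    if reverse_complement(read) in reference:
        return "-"
    return None


# Toy data for illustration only (not the real gene or reads).
toy_reference = "AACCGTTAGGCATCGA"
for toy_read in ("GTTAGG", "CCTAAC", "TTTTTT"):
    print(toy_read, strand_of_exact_match(toy_read, toy_reference))
```

If most reads pass such a check against the custom FASTA, the reference sequence itself is probably fine, and the alignment settings or the reference formatting become the more likely suspects.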
Tags: alignment, dna, rna, galaxy, spliced
Jennifer Hillman Jackson (United States) wrote (3 months ago):

Hello,

It sounds like you need to tune the alignment parameters: spliced mapping settings, overlap, overhang, mismatches, and the like. I suspect the current settings are too strict.

And just in case, this is the FAQ for how to format a custom reference genome (same rules apply to transcriptome data): https://galaxyproject.org/support/ >> Preparing and using a Custom Reference Genome or Build

Spliced mapping:

  • Query RNA, target DNA (genome or full transcript/gene bound including introns)

Unspliced mapping:

  • Query RNA, target RNA (transcript only mRNA, no introns)
  • Query DNA, target RNA
  • Query DNA, target DNA

Exception: most bacterial genomes would be unspliced for all RNA/DNA combinations.

Thanks! Jen, Galaxy team
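
The spliced/unspliced rules listed above can be written down as a small lookup. This is a sketch only: the function name `mapping_mode` and its `bacterial` flag are hypothetical, and the bacterial exception is described in the answer as applying to "most" bacterial genomes, not all of them.

```python
def mapping_mode(query, target, bacterial=False):
    """Suggest 'spliced' or 'unspliced' mapping for a query/target pair.

    query, target: 'RNA' or 'DNA' (case-insensitive).
    bacterial: treat the target as a typical bacterial genome (assumption:
    unspliced for every combination, per the exception in the answer).
    """
    query, target = query.upper(), target.upper()
    for kind in (query, target):
        if kind not in {"RNA", "DNA"}:
            raise ValueError(f"expected 'RNA' or 'DNA', got {kind!r}")
    if bacterial:
        return "unspliced"
    # Only RNA reads against a DNA target that still contains introns
    # need spliced alignment; every other combination is unspliced.
    if query == "RNA" and target == "DNA":
        return "spliced"
    return "unspliced"


for q, t in (("RNA", "DNA"), ("RNA", "RNA"), ("DNA", "RNA"), ("DNA", "DNA")):
    print(f"query {q}, target {t}: {mapping_mode(q, t)}")
```

For the DNA-on-DNA case this suggests unspliced mapping; on the HISAT2 command line, spliced alignment can be turned off with the `--no-spliced-alignment` option.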

stoyan.velkov wrote (3 months ago):

Hi Jen,

Thanks for your answer. So far, the only step from the FAQ that I had done is this one:

"Did you use NormalizeFasta with the options to wrap sequence lines at 80 bases and to trim the title line at the first whitespace?"

I will keep trying. I also tried different settings, but have had no luck so far. At least I now know which possible issues to look for. My case is a DNA query against a DNA target.

Greetings, Stoyan
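
The two formatting steps quoted from the FAQ (wrap sequence lines at 80 bases, trim the title line at the first whitespace) can be illustrated with a short sketch. It is not the NormalizeFasta tool itself; the function name `normalize_fasta` and the example record are hypothetical, and the sketch also drops blank lines, which is an added assumption.

```python
def normalize_fasta(text, width=80):
    """Rewrap FASTA sequence lines to `width` bases and cut each title
    line at its first whitespace. Blank lines are dropped."""
    out = []
    seq_parts = []

    def flush():
        # Emit the sequence collected so far, rewrapped to `width`.
        if seq_parts:
            joined = "".join(seq_parts)
            out.extend(joined[i:i + width] for i in range(0, len(joined), width))
            seq_parts.clear()

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            out.append(line.split()[0])  # keep only the identifier
        else:
            seq_parts.append(line)
    flush()
    return "\n".join(out) + "\n" if out else ""


# Illustrative record: a description after the identifier and one long line.
example = ">HPD_xdna_1182 extra description text\n" + "ACGT" * 50 + "\n"
print(normalize_fasta(example))
```

After this kind of normalization the title line contains only the identifier, which is what appears as `SN:` in the `@SQ` header line of the alignment output.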
