Alignment rate differs using hg19 and hg38


Question: Alignment rate differs using hg19 and hg38

fate.gh • 10 wrote, 2.2 years ago:

I have some RNA-seq FASTQ files, and I'm using HISAT to align them against the built-in reference genomes in Galaxy. When I align them to hg19, the alignment rate is much higher than when I align them to hg38. What causes such a big difference?

Here is an example:

* Alignment to hg19:

format    bam
database  hg19

37328991 reads; of these:
  37328991 (100.00%) were unpaired; of these:
    3767939 (10.09%) aligned 0 times
    16947652 (45.40%) aligned exactly 1 time
    16613400 (44.51%) aligned >1 times
89.91% overall alignment rate
[bam_sort_core] merging from 24

* Alignment to hg38:

format    bam
database  hg38

37328991 reads; of these:
  37328991 (100.00%) were unpaired; of these:
    11718927 (31.39%) aligned 0 times
    15954103 (42.74%) aligned exactly 1 time
    9655961 (25.87%) aligned >1 times
68.61% overall alignment rate
[bam_sort_core] merging from 24

What should I do?

Will it cause a problem if I want to obtain read counts for DE analysis using htseq?
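As a side note on the two summaries above, the per-category read counts can be subtracted directly to see which category the extra unaligned reads came from. The following is a minimal Python sketch and not part of the original post. The function name `parse_summary` and the variable names are made up for illustration, and the patterns assume the unpaired-read summary layout shown in the post (leading whitespace is ignored).

```python
import re

# Summaries copied from the post above (hg19 first, then hg38).
HG19 = """\
37328991 reads; of these:
37328991 (100.00%) were unpaired; of these:
3767939 (10.09%) aligned 0 times
16947652 (45.40%) aligned exactly 1 time
16613400 (44.51%) aligned >1 times
89.91% overall alignment rate
"""

HG38 = """\
37328991 reads; of these:
37328991 (100.00%) were unpaired; of these:
11718927 (31.39%) aligned 0 times
15954103 (42.74%) aligned exactly 1 time
9655961 (25.87%) aligned >1 times
68.61% overall alignment rate
"""


def parse_summary(text):
    """Return read counts per category from an unpaired alignment summary."""
    patterns = {
        "total": r"(\d+) reads; of these:",
        "unaligned": r"(\d+) \([\d.]+%\) aligned 0 times",
        "unique": r"(\d+) \([\d.]+%\) aligned exactly 1 time",
        "multi": r"(\d+) \([\d.]+%\) aligned >1 times",
    }
    return {name: int(re.search(p, text).group(1)) for name, p in patterns.items()}


hg19 = parse_summary(HG19)
hg38 = parse_summary(HG38)

# Net change per category when moving from hg19 to hg38.
delta = {name: hg38[name] - hg19[name] for name in hg19}
for name, change in delta.items():
    print(f"{name:>10}: {change:+,}")
```

With the numbers posted above, the hg38 run has 7,950,988 more unaligned reads, while the multi-mapped category shrinks by 6,957,439 and the uniquely mapped category by 993,549. These are net changes per category, so they do not prove that the same individual reads moved, but they suggest that most of the drop involves reads that aligned to more than one location in hg19.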

hisat hg19 hg39 alignment rate • 975 views


modified 2.2 years ago by Jennifer Hillman Jackson ♦ 25k • written 2.2 years ago by fate.gh • 10

Jennifer Hillman Jackson ♦ 25k (United States) wrote, 2.2 years ago:

Hello,

Are these the exact same inputs and parameters?

That seems odd, but it is possible; it depends on the read content. hg38 is much more polished than hg19.

Perhaps run htseq-count on both and compare? It might provide a clue about where the reads that are unaligned against hg38 were previously mapped in hg19. As for what those sequences contain, you could isolate them and run FastQC on them as a QA check.
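One way to isolate the unaligned reads, sketched here in Python as an illustration rather than as the method used in this thread: in SAM records, bit 0x4 of the FLAG field marks a read as unmapped, so those records can be written back out as FASTQ and passed to FastQC. The function name `unaligned_to_fastq` and the example records are hypothetical, and the sketch assumes the aligner kept unaligned reads in its output and that the BAM has been converted to SAM text.

```python
def unaligned_to_fastq(sam_lines):
    """Yield FASTQ records for SAM lines whose FLAG has the 0x4 (unmapped) bit set."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x4:
            yield f"@{name}\n{seq}\n+\n{qual}\n"


# Tiny made-up example: one aligned read and one unaligned read.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tFFFF",
]

records = list(unaligned_to_fastq(example_sam))
print("".join(records), end="")
```

On a real file, `samtools view -f 4` selects the same records far more efficiently; the sketch only shows what is being selected.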

Jen, Galaxy team

written 2.2 years ago by Jennifer Hillman Jackson ♦ 25k

Hi,

Yes, these are the exact same inputs and parameters. I ran htseq-count and compared the results, but since the GTF files are different (different Ensembl versions to match hg19 and hg38), the results are not the same.
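Because the two annotations differ, one way to make the comparison more meaningful is to restrict it to gene IDs present in both count tables and to look at the special counters separately. The Python sketch below is an illustration under assumptions: it expects the usual two-column, tab-separated htseq-count output in which summary rows such as `__no_feature` and `__not_aligned` start with a double underscore, and the gene IDs, counts, and function names are invented for the example.

```python
def read_counts(lines):
    """Split htseq-count style lines into gene counts and special counters."""
    genes, special = {}, {}
    for line in lines:
        key, value = line.rstrip("\n").split("\t")
        target = special if key.startswith("__") else genes
        target[key] = int(value)
    return genes, special


def compare(genes_a, genes_b):
    """Pair up counts for shared gene IDs and list IDs unique to each table."""
    shared = sorted(genes_a.keys() & genes_b.keys())
    return {
        "shared": [(gene, genes_a[gene], genes_b[gene]) for gene in shared],
        "only_a": sorted(genes_a.keys() - genes_b.keys()),
        "only_b": sorted(genes_b.keys() - genes_a.keys()),
    }


# Invented example tables; real input would be the two htseq-count outputs.
hg19_lines = ["GENE_A\t120", "GENE_B\t30", "GENE_OLD\t5",
              "__no_feature\t1000", "__not_aligned\t10"]
hg38_lines = ["GENE_A\t115", "GENE_B\t12", "GENE_NEW\t7",
              "__no_feature\t900", "__not_aligned\t40"]

genes19, special19 = read_counts(hg19_lines)
genes38, special38 = read_counts(hg38_lines)
result = compare(genes19, genes38)

for gene, count19, count38 in result["shared"]:
    print(f"{gene}\thg19={count19}\thg38={count38}")
print("only in hg19 table:", result["only_a"])
print("only in hg38 table:", result["only_b"])
```

Genes whose counts drop sharply between the two tables are candidates for where the reads lost in hg38 had been assigned, although identifiers can also change between Ensembl releases, which this sketch does not handle.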

modified 2.2 years ago • written 2.2 years ago by fate.gh • 10
