Tophat Mapping And Cufflinks Output Issues

Question: Tophat Mapping And Cufflinks Output Issues

5.9 years ago, Jim Cohen wrote:

Hello Galaxy users,

I've been using the Main Galaxy server for an RNA-Seq project on a non-model plant, and my output from Tophat and Cufflinks is not as good as I'd like. I have a reference transcriptome assembled with Trinity from the same Illumina 100 bp reads that I'm now trying to map to it.

When I use Tophat to map the reads to the reference transcriptome (after trimming the reads and filtering out the low-quality ones), only about 10% of them map: I go from 30,000,000 reads before mapping to 3,000,000 mapped reads, so I feel like I'm losing a lot of data. When I change the parameters to allow more mismatches, not many more reads map, and in many cases the Tophat run fails with this error message:

    Settings:
      Output files: "/tmp/ 3030460.cyberstar.psu.edu/tmpWbxTnm/dataset_5530451.*.ebwt"
      Line rate: 6 (line is 64 bytes)
      Lines per side: 1 (side is 64 bytes)
      Offset rate: 5 (one in 32)
      FTable chars: 10
      Strings: unpacked
      Max bucket size: def

I've had similar numbers of reads map with Bowtie by itself and with BWA. I've also tried mapping the reads to the assembled isoforms (contigs) of the transcriptome, and this results in many more reads (close to 90%) being mapped. Therefore, I figure the reads should map to the reference transcriptome, and I'm not sure why this isn't happening.

The other issue is that in Cuffdiff only about 4,800 genes appear in the output files as tested for differential expression. There are approximately 100,000 genes in the reference transcriptome, so I expected more than ca. 4,800 to be tested. Should each gene be tested? Does Cuffdiff just not report some genes that are not differentially expressed, or is the program not testing all of the genes?

If anyone can provide some help, guidance, or a suggestion, I'd greatly appreciate it.

Thanks, and take care.
Jim
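
For anyone wanting to check a mapping rate like the roughly 10% quoted above (3,000,000 of 30,000,000 reads) directly from an aligner's SAM output, here is a minimal sketch. It relies only on the SAM FLAG field, where bit 0x4 marks an unmapped read and bits 0x100 and 0x800 mark secondary and supplementary records. The records in `SAM_TEXT` and the function name `mapping_rate` are made up for illustration; they are not from the original post. For paired-end data each mate is its own record, so this counts reads rather than pairs.

```python
# Hypothetical toy SAM records, for illustration only.
SAM_TEXT = """\
@HD\tVN:1.0\tSO:unsorted
@SQ\tSN:contig_1\tLN:1200
read1\t0\tcontig_1\t10\t255\t100M\t*\t0\t0\t*\t*
read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*
read3\t16\tcontig_1\t300\t255\t100M\t*\t0\t0\t*\t*
read4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*
read3\t272\tcontig_1\t900\t0\t100M\t*\t0\t0\t*\t*
"""

# FLAG bits defined by the SAM format.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def mapping_rate(sam_lines):
    """Return (mapped, total, fraction) over primary SAM records."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # Count each read once: ignore secondary/supplementary alignments.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    fraction = mapped / total if total else 0.0
    return mapped, total, fraction


mapped, total, rate = mapping_rate(SAM_TEXT.splitlines())
print(f"{mapped}/{total} reads mapped ({rate:.1%})")
```

Running the same count on the transcriptome-mapped and contig-mapped outputs would make the 10% versus 90% comparison concrete.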

rna-seq bwa alignment bowtie cufflinks • 1.7k views

5.9 years ago, Jeremy Goecks wrote:

Tophat is designed for mapping reads to a genome, not a transcriptome. Because you're mapping your reads to a transcriptome assembled with Trinity, Bowtie and BWA are good choices. This also changes your downstream analyses, because Cufflinks does not work well on reads mapped to a transcriptome. Tools for quantifying transcriptome-mapped reads include RSEM and eXpress.

Good luck,
J.
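
As a rough illustration of the RSEM route mentioned above, the sketch below only assembles the two command lines and prints them; it does not run anything. It assumes single-end reads and RSEM's default aligner (Bowtie), neither of which is confirmed by the thread. The file names, reference name, sample name, and the helper `rsem_commands` are hypothetical placeholders. `--transcript-to-gene-map` is optional and is used when gene-level estimates are wanted for a Trinity assembly.

```python
import shlex


def rsem_commands(transcripts_fasta, reads_fastq, ref_name="trinity_ref",
                  sample_name="sample1", threads=4, gene_map=None):
    """Build (not run) RSEM command lines for a transcriptome reference.

    All default names here are hypothetical placeholders.
    """
    # Step 1: index the assembled transcripts as an RSEM reference.
    prepare = ["rsem-prepare-reference"]
    if gene_map is not None:
        # Optional file mapping each transcript to its gene.
        prepare += ["--transcript-to-gene-map", gene_map]
    prepare += [transcripts_fasta, ref_name]

    # Step 2: align reads to that reference and estimate abundances.
    # Assumes single-end reads; paired-end data needs --paired-end
    # and two read files instead.
    quantify = ["rsem-calculate-expression", "-p", str(threads),
                reads_fastq, ref_name, sample_name]
    return prepare, quantify


for cmd in rsem_commands("Trinity.fasta", "reads_trimmed.fastq"):
    # Quote each argument so the printed line is safe to paste in a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Check the options against the RSEM version installed on your system before running the printed commands.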
