I am trying to extract a FASTA file from Galaxy after a Cuffcompare analysis, which gave me the combined transcripts in GTF format. This file does not contain any sequence information. How can I get the FASTA data for this GTF file? Thank you.
Hello,
A GTF file can be used as input to the tool Fetch Alignments/Sequences: Extract Genomic DNA.
Note that the output will not include any variation from your original FASTQ data, unless that variation changed the overall splicing, because the sequence is taken from the reference genome. Only exons will be extracted if the input represents spliced features (a transcript GTF from Cuffcompare does represent that type of content).
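For anyone who wants to see what this kind of extraction amounts to outside of Galaxy, here is a minimal Python sketch. It is not the code behind Extract Genomic DNA, only an illustration under some assumptions: the GTF has the standard nine tab-separated columns with a transcript_id attribute on each exon line, the reference genome is small enough to hold in memory, and the function names (read_fasta, transcript_sequences, and so on) are made up for this example. It joins the exons of each transcript in genomic order and reverse-complements transcripts on the minus strand.

```python
from collections import defaultdict

# Complement table for reverse-complementing minus-strand transcripts.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_attributes(field):
    """Parse the GTF attribute column (key "value"; pairs) into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def read_fasta(lines):
    """Read FASTA lines into {sequence_name: sequence}."""
    genome, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def transcript_sequences(gtf_lines, genome):
    """Return {transcript_id: spliced sequence} built from exon lines only."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        tid = parse_attributes(cols[8]).get("transcript_id")
        if tid is None:
            continue
        # GTF coordinates are 1-based and inclusive.
        exons[tid].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))

    result = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda e: e[1])  # genomic order
        chrom, strand = parts[0][0], parts[0][3]
        seq = "".join(genome[chrom][start - 1:end] for _, start, end, _ in parts)
        if strand == "-":
            seq = reverse_complement(seq)
        result[tid] = seq
    return result
```

To use it, you would pass open file handles (or lists of lines) for the reference FASTA and the combined transcripts GTF, then write each entry of the returned dictionary out as a FASTA record. As with the Galaxy tool, the sequences come from the reference, not from your reads.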
To be clear, the Tuxedo pipeline does not fully "assemble" the reads. Rather, it builds up scaffolds and keeps track of splices. The output is a description of a set of transcript start/stop/internal-splices, how these transcripts group together, and how these transcripts and transcript groups (genes) compare to each other with respect to differential abundance (aka expression in this case) based on the associated reads.
To assemble reads into consensus sequences, there is a new suite of tools on http://usegalaxy.org in the group NGS: Du Novo. Or you can review the tools in the Tool Shed under Assembly for use in a local/cloud Galaxy.
Thanks, Jen, Galaxy team