Question: Extract Genomic DNA tool behavior
6 months ago by vidar.blikstad wrote:

Hi, I have downloaded RNA-seq data as FASTQ files, aligned the reads against FASTA reference sequences, and then generated GTF files. However, the transcript coordinates given by Cufflinks do not match the lengths of the sequences output by the Extract Genomic DNA tool. These sequences, which represent repeated stretches, seem to be nearly perfect duplications of the sequences predicted by Cufflinks. My question is whether one can trust the Extract Genomic DNA tool to correctly output assembled sequences when it comes to repeated sequences.

Vidar

fasta genome gtf cufflinks extract • 217 views
6 months ago by Jennifer Hillman Jackson (United States) wrote:

Hello,

The tool extracts genomic sequence based on the coordinates provided, producing one FASTA sequence per input line (BED or GTF). Repeats are soft-masked for most genomes indexed at https://usegalaxy.org and can be extracted.
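To illustrate the one-record-per-line behavior, here is a minimal Python sketch. It is not the Galaxy tool's code: the function name `extract_intervals`, the toy genome, and the FASTA header format are all assumptions made for the example. It assumes BED-style 0-based, half-open coordinates and leaves lowercase (soft-masked) bases untouched.

```python
def extract_intervals(genome, bed_lines):
    """Return one (header, sequence) pair per BED line.

    genome: dict mapping chromosome name -> sequence string (hypothetical toy input).
    bed_lines: iterable of tab-separated BED lines (chrom, start, end, ...).
    BED coordinates are 0-based and half-open, so they map directly to Python slices.
    """
    records = []
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        seq = genome[chrom][start:end]  # soft-masked (lowercase) bases are preserved
        header = f"{chrom}:{start}-{end}"  # illustrative header, not the tool's format
        records.append((header, seq))
    return records


genome = {"chr1": "ACGTacgtacgtGGCC"}  # lowercase = soft-masked repeat (made-up data)
bed = ["chr1\t2\t10", "chr1\t8\t16"]
for header, seq in extract_intervals(genome, bed):
    print(f">{header}\n{seq}")
```

Note that two overlapping or repeated input intervals simply yield two output records. The sequence returned depends only on the coordinates, not on whether the region is repetitive.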

If the coordinates are provided in GTF or BED12 format, the result will be spliced FASTA output. This differs from the start/end coordinates of GTF lines labeled as "transcript": those coordinates cover the total genomic span, that is, exons plus introns plus possibly 5'/3' UTR.
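The length difference can be checked by hand. The following Python sketch compares the genomic span of a transcript with the summed length of its exons. The function name `transcript_lengths` and the three GTF lines are invented for illustration, and real Cufflinks output carries more attributes. GTF coordinates are 1-based and inclusive, so a feature's length is end - start + 1.

```python
from collections import defaultdict


def transcript_lengths(gtf_lines):
    """Map transcript_id -> (genomic_span, spliced_length) using GTF exon lines."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if fields[2] != "exon":
            continue  # only exon features contribute to the spliced sequence
        start, end = int(fields[3]), int(fields[4])
        # Pull transcript_id out of the attribute column (9th field).
        tid = fields[8].split('transcript_id "')[1].split('"')[0]
        exons[tid].append((start, end))

    result = {}
    for tid, parts in exons.items():
        # Span: first exon start to last exon end, introns included.
        span = max(e for _, e in parts) - min(s for s, _ in parts) + 1
        # Spliced length: exons only.
        spliced = sum(e - s + 1 for s, e in parts)
        result[tid] = (span, spliced)
    return result


# Hypothetical two-exon transcript with a 500 bp intron.
gtf = [
    'chr1\tCufflinks\ttranscript\t101\t1000\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tCufflinks\texon\t101\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tCufflinks\texon\t801\t1000\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
print(transcript_lengths(gtf))  # {'T1': (900, 400)}
```

In this toy case the transcript line spans 900 bases while the spliced sequence is only 400 bases long, which is the kind of mismatch described above.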

And if you are trying to compare a public annotation GTF with a Cufflinks GTF result, differences are expected, as Cufflinks generates transcripts based on the actual reads given as input, and may contain variation or novel content.

Thanks, Jen, Galaxy

6 months ago by vidar.blikstad wrote:

Hello

It seems clear to me now - thanks a lot

Vidar
