Question: RNA-star mapped.bam problems
a.walne wrote, 8 months ago:

I'm trying to use RNA STAR but keep getting the following error message: "An error occurred setting the metadata for this dataset. Set it manually or retry auto-detection". It appears on the dataset "RNA STAR on data 11, data 10, and others: mapped.bam". In this example, data 10 is ftp://ftp.ensembl.org/pub/release-91/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

Data 11 is ftp://ftp.ensembl.org/pub/release-91/gff3/homo_sapiens/Homo_sapiens.GRCh38.91.gff3.gz. I assume "others" refers to the paired-end fastq files.

I have tried auto-detection but this doesn't help. Any suggestions? Thanks


Hello - The public Galaxy server https://usegalaxy.org was updated this morning. I am running some tests to see if this is a usage error or a server-side error.

This includes a direct rerun and a few reruns with the Custom genome fasta and annotation GFF3 inputs cleaned up. Both inputs are slightly out of specification, which might be a factor (some tools are pickier about formats than others). I also noticed that you assigned a database metadata attribute to your datasets even though they are not based on the built-in genome and indexes that use that same database name. Using a Custom Build is a better choice. Even if these are not the root problem with this run, the datasets should be cleaned up so that they work properly with all tools (including tools downstream of RNA STAR, once it is working and returning proper alignment results).
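One common cleanup for a custom genome fasta is trimming each title line down to the sequence identifier alone. Here is a minimal Python sketch of that step. It assumes the description text after the first whitespace on each ">" line is what is out of specification; the function and file names are hypothetical, and the Support FAQs cover the full format requirements.

```python
import gzip


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def strip_fasta_descriptions(src, dst):
    """Copy a fasta file, keeping only the identifier on each '>' title line.

    Returns the number of title lines seen. Sequence lines are copied as-is.
    """
    n_titles = 0
    with open_text(src) as fin, open_text(dst, "wt") as fout:
        for line in fin:
            if line.startswith(">"):
                # e.g. ">1 dna:chromosome ..." becomes ">1"
                fields = line[1:].split()
                line = ">" + (fields[0] if fields else "") + "\n"
                n_titles += 1
            fout.write(line)
    return n_titles
```

For example, `strip_fasta_descriptions("genome.fa.gz", "genome.clean.fa")` would read the compressed download and write an uncompressed copy with short title lines. Other format problems (blank lines, inconsistent line wrapping) are not handled here.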

Support FAQs: https://galaxyproject.org/support/

(Comment by Jennifer Hillman Jackson, 8 months ago)
Jennifer Hillman Jackson (United States) wrote, 8 months ago:

Hello,

The custom genome/annotation data is too large to process with the RNA STAR mapper at Galaxy Main (https://usegalaxy.org), even with the input corrections. The job produces an empty BAM output that cannot be indexed, and that is what triggers the metadata error.

This means that you'll need to do one of these:

  1. use HISAT2 instead
  2. run the RNA STAR job using an indexed built-in genome at Galaxy Main (along with a reference annotation dataset that is a match, e.g. same genome build and common chromosome identifiers)
  3. consider starting up your own Galaxy server and providing it with enough memory to run the job

For options 2 and 3, please be aware that the job may still be too large to execute, even when using a built-in genome or when given more resources on your own Galaxy server. RNA STAR uses much more memory during job execution than the other mapping tools, whether it is used in Galaxy or not.
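For option 2, the "match" between genome and annotation can be checked before submitting a job. Here is a minimal Python sketch that compares the sequence identifiers in a fasta file with the first column of a GFF3/GTF file. The function names are hypothetical, and it assumes a tab-separated annotation with "#" comment lines.

```python
def fasta_ids(lines):
    """Collect the identifier (first word after '>') from each fasta title line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def annotation_ids(lines):
    """Collect the sequence names from column 1 of a GFF3/GTF file."""
    ids = set()
    for line in lines:
        # Skip comments, directives such as "##gff-version 3", and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.rstrip("\n").split("\t", 1)[0])
    return ids


def compare_ids(genome_lines, annotation_lines):
    """Report which chromosome identifiers are shared and which are not."""
    g = fasta_ids(genome_lines)
    a = annotation_ids(annotation_lines)
    return {
        "shared": sorted(g & a),
        "annotation_only": sorted(a - g),
        "genome_only": sorted(g - a),
    }
```

Both arguments can be open file handles or lists of lines. A non-empty "annotation_only" list (for example "chr1" in the annotation versus "1" in the genome) suggests the two inputs do not share chromosome identifiers.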

I tested HISAT2 with your data and the job completed successfully. What I did:

  • corrected the custom genome format
  • removed the database assignment from the fastq inputs
  • changed the datatype for the fastq inputs to fastqsanger.gz
  • no reference annotation was used (HISAT2 accepts GTF formatted annotation, not GFF3)

How to do the above, and where to obtain reference annotation in GTF format, is covered in the FAQs I linked in the original comment.

Galaxy tutorials for RNA-seq with workflow/tool example usage: https://galaxyproject.org/learn/

Thanks! Jen, Galaxy team
