Question: Fastqsanger to FastQ
5 months ago, sdbaney wrote:

I am trying to run a utility called transrate to assess my Trinity-assembled files. I want to use my trimmed reads (Trimmomatic output downloaded from Galaxy), which are in fastqsanger format, but once downloaded they appear to be plain text documents rather than any kind of sequence file. Can someone help me? Is this a problem with the download from Galaxy?

Written 5 months ago by sdbaney (modified 5 months ago)

(Error output, continued)

Completed first pass through the alignment file.
Total # of mapped reads : 0
# of uniquely mapped reads : 0
# ambiguously mapped reads : 0

[2018-06-18 12:23:50.923] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2018-06-18 12:23:50.923] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2018-06-18 12:23:52.059] [jointLog] [warning] Only 0 fragments were mapped, but the number of burn-in fragments was set to 5000000. The effective lengths have been computed using the observed mappings.
[2018-06-18 12:23:52.059] [jointLog] [warning] Since only 0 (< 5000000) fragments were observed, modeling of the fragment start position distribution has been disabled
[2018-06-18 12:23:52.069] [jointLog] [info] starting optimizer
[2018-06-18 12:23:52.083] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2018-06-18 12:23:52.083] [jointLog] [info] iteration = 0 | max rel diff. = -1.79769e+308
[2018-06-18 12:23:52.088] [jointLog] [info] iteration = 50 | max rel diff. = -1.79769e+308
[2018-06-18 12:23:52.088] [jointLog] [error] Total alpha weight was too small! Make sure you ran salmon correclty.

Freeing memory used by read queue . . . Joined parsing thread . . . "/Users/admin/Desktop/transrate-1.0.3-osx/transrate_results/transcripts30/read1.fastq.read2.fastq.transcripts30.bam" Closed all files . . . Emptied frag queue. . . Emptied Alignemnt Group Pool. . Emptied Alignment Group Queue. . . done
[2018-06-18 12:23:53.037] [jointLog] [error] Quantification was un-successful. Please check the log for information about why quantification failed. If this problem persists, please report this issue on GitHub.

[ERROR] 2018-06-18 12:23:53 : postSample.bam not created

Written 5 months ago by sdbaney

Here is the log the error is referring to:

[2018-06-18 10:49:57.487] [fileLog] [info] quantification processed 0 fragments so far

[2018-06-18 10:50:07.385] [fileLog] [info] quantification processed 1000000 fragments so far

[2018-06-18 10:50:21.124] [fileLog] [info] quantification processed 2000000 fragments so far

[2018-06-18 10:55:53.920] [fileLog] [info] quantification processed 3000000 fragments so far

[2018-06-18 10:57:31.755] [fileLog] [info] quantification processed 3164579 fragments so far

[2018-06-18 12:23:50.784] [jointLog] [info]
Completed first pass through the alignment file.
Total # of mapped reads : 0
# of uniquely mapped reads : 0
# ambiguously mapped reads : 0

[2018-06-18 12:23:50.923] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2018-06-18 12:23:50.923] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2018-06-18 12:23:52.059] [jointLog] [warning] Only 0 fragments were mapped, but the number of burn-in fragments was set to 5000000. The effective lengths have been computed using the observed mappings.
[2018-06-18 12:23:52.059] [jointLog] [warning] Since only 0 (< 5000000) fragments were observed, modeling of the fragment start position distribution has been disabled
[2018-06-18 12:23:52.069] [jointLog] [info] starting optimizer
[2018-06-18 12:23:52.083] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2018-06-18 12:23:52.083] [jointLog] [info] iteration = 0 | max rel diff. = -1.79769e+308
[2018-06-18 12:23:52.088] [jointLog] [info] iteration = 50 | max rel diff. = -1.79769e+308
[2018-06-18 12:23:52.088] [jointLog] [error] Total alpha weight was too small! Make sure you ran salmon correclty.
[2018-06-18 12:23:53.037] [jointLog] [error] Quantification was un-successful. Please check the log for information about why quantification failed. If this problem persists, please report this issue on GitHub.

Written 5 months ago by sdbaney
5 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

It sounds like there was a problem downloading the Trimmomatic fastq data.

For smaller files, you can just click on the disc icon. For larger data, use curl or wget with the dataset's URL (copied from the disc icon).
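For anyone who would rather script the download than call curl or wget directly, the same idea can be sketched in Python. This is only an illustration: `download_dataset` is a hypothetical helper, the URL has to be the one copied from the dataset's disc icon, and the sketch assumes that link can be fetched without logging in.

```python
import shutil
import urllib.request
from pathlib import Path


def download_dataset(url, dest, chunk_size=1 << 20):
    """Stream the content at `url` into the file `dest`.

    The data is copied in chunks so that large FASTQ files are never held
    in memory all at once. Returns the number of bytes written to disk.
    """
    dest_path = Path(dest)
    with urllib.request.urlopen(url) as response, dest_path.open("wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path.stat().st_size
```

A call would look like `download_dataset(dataset_url, "read1.fastq")`, where `dataset_url` stands for the copied link. The returned byte count can be compared with the dataset size that Galaxy reports.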

FAQ: https://galaxyproject.org/support/

If that doesn't help, please explain how you are downloading the data, if other files download oddly or just this one, and which Galaxy server you are working on.

Thanks! Jen, Galaxy team

ADD COMMENTlink written 5 months ago by Jennifer Hillman Jackson25k

I was able to download it and rename the file to end with .fastq, which seemed to recover the data (before, it was a seemingly empty text document). When I run transrate with the raw sequencing reads (Illumina RNA-seq) mapped against the Trinity assembly, it runs fine (although it gives poor statistics because the reads are untrimmed and still contain adapter sequences, etc.).

I was able to get transrate to recognize the trimmed files downloaded from Galaxy (using the save icon), but this is the error I now get, which makes me think there is a problem with the trimmed files.

caput301-it3561:transrate-1.0.3-osx admin$ bash transrate --assembly=/Users/admin/Desktop/transcripts30.fasta --left=/Users/admin/Desktop/read1.fastq --right=/Users/admin/Desktop/read2.fastq
[ INFO] 2018-06-18 12:23:03 : Loading assembly: /Users/admin/Desktop/transcripts30.fasta
[ INFO] 2018-06-18 12:23:06 : Analysing assembly: /Users/admin/Desktop/transcripts30.fasta
[ INFO] 2018-06-18 12:23:06 : Results will be saved in /Users/admin/Desktop/transrate-1.0.3-osx/transrate_results/transcripts30
[ INFO] 2018-06-18 12:23:06 : Calculating contig metrics...
[ INFO] 2018-06-18 12:23:09 : Contig metrics:
[ INFO] 2018-06-18 12:23:09 : -----------------------------------
[ INFO] 2018-06-18 12:23:09 : n seqs 46955
[ INFO] 2018-06-18 12:23:09 : smallest 201
[ INFO] 2018-06-18 12:23:09 : largest 17234
[ INFO] 2018-06-18 12:23:09 : n bases 25555784
[ INFO] 2018-06-18 12:23:09 : mean len 544.26
[ INFO] 2018-06-18 12:23:09 : n under 200 0
[ INFO] 2018-06-18 12:23:09 : n over 1k 4959
[ INFO] 2018-06-18 12:23:09 : n over 10k 26
[ INFO] 2018-06-18 12:23:09 : n with orf 8661
[ INFO] 2018-06-18 12:23:09 : mean orf percent 69.58
[ INFO] 2018-06-18 12:23:09 : n90 244
[ INFO] 2018-06-18 12:23:09 : n70 379
[ INFO] 2018-06-18 12:23:09 : n50 695
[ INFO] 2018-06-18 12:23:09 : n30 1481
[ INFO] 2018-06-18 12:23:09 : n10 3576
[ INFO] 2018-06-18 12:23:09 : gc 0.4
[ INFO] 2018-06-18 12:23:09 : bases n 0
[ INFO] 2018-06-18 12:23:09 : proportion n 0.0
[ INFO] 2018-06-18 12:23:09 : Contig metrics done in 3 seconds
[ INFO] 2018-06-18 12:23:09 : Calculating read diagnostics...
[ERROR] 2018-06-18 12:23:53 : Version Info: ### A newer version of Salmon is available. ####

#

The newest version, available at https://github.com/COMBINE-lab/salmon/releases contains important bug fixes and improvements; please upgrade at your earliest convenience.

#

salmon (alignment-based) v0.6.0

[ program ] => salmon

[ command ] => quant

[ libType ] => { IU }

[ alignments ] => { /Users/admin/Desktop/transrate-1.0.3-osx/transrate_results/transcripts30/read1.fastq.read2.fastq.transcripts30.bam }

[ targets ] => { /Users/admin/Desktop/transcripts30.fasta }

[ threads ] => { 8 }

[ sampleOut ] => { }

[ sampleUnaligned ] => { }

[ output ] => { . }

[ useErrorModel ] => { }

[ biasCorrect ] => { }

[ noEffectiveLengthCorrection ] => { }

[ useFSPD ] => { }

Library format { type:paired end, relative orientation:inward, strandedness:unstranded }
Logs will be written to ./logs
numQuantThreads = 4
parseThreads = 4
Checking that provided alignment files have consistent headers . . . done
Populating targets from aln = "/Users/admin/Desktop/transrate-1.0.3-osx/transrate_results/transcripts30/read1.fastq.read2.fastq.transcripts30.bam", fasta = "/Users/admin/Desktop/transcripts30.fasta" . . .replaced 0 non-ACGT nucleotides with random nucleotides done

[2018-06-18 12:23:50.784] [jointLog] [info]
Written 5 months ago by sdbaney

If the inputs are formatted correctly, then there is either a content problem or a tool problem. The tool author would be able to help: http://hibberdlab.com/transrate/
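One way to rule out a formatting or content problem before contacting the tool author is to check the two read files against each other. The sketch below is not part of transrate or Galaxy. It assumes uncompressed FASTQ with four lines per record, and `read_fastq_ids` and `check_pairs` are hypothetical helper names. It checks each record's layout and that the left and right files list the same reads in the same order.

```python
from itertools import zip_longest


def read_fastq_ids(path):
    """Yield the read ID of every 4-line FASTQ record, checking its layout."""
    with open(path, "r") as handle:
        while True:
            header = handle.readline()
            if not header:
                return  # end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"{path}: malformed record near {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"{path}: sequence/quality lengths differ in {header!r}")
            # Use the ID up to the first whitespace, without a trailing /1 or /2.
            fields = header[1:].split()
            read_id = fields[0] if fields else ""
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def check_pairs(left_path, right_path):
    """Return the number of read pairs; raise ValueError if the files disagree."""
    count = 0
    ids = zip_longest(read_fastq_ids(left_path), read_fastq_ids(right_path))
    for left_id, right_id in ids:
        if left_id is None or right_id is None:
            raise ValueError(f"read counts differ (files diverge after {count} pairs)")
        if left_id != right_id:
            raise ValueError(f"pair {count + 1}: IDs differ ({left_id!r} vs {right_id!r})")
        count += 1
    return count
```

Calling `check_pairs("read1.fastq", "read2.fastq")` either returns the number of pairs or raises an error that names the first problem found. A clean result does not prove the reads will map, but it narrows the cause down to content or the tool.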

You can check whether the data was downloaded completely by comparing the dataset in Galaxy with the downloaded copy. An md5 checksum computed locally should match the one generated in Galaxy by running the tool Secure Hash / Message Digest on the dataset.
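The local half of that comparison can be done with a few lines of Python, shown here as a minimal sketch. `md5sum` is a hypothetical helper name, and the value it returns still has to be compared by eye (or in a script) with the digest produced in Galaxy.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in binary mode and in chunks, so large FASTQ files
    do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `md5sum("read1.fastq")` differs from the Galaxy value, the download was incomplete or altered and should be repeated.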

Datasets downloaded from Galaxy are given an extension that matches the assigned datatype. Renaming the file after download does not change the dataset's size or content, but it can change how your computer interprets the content (for example, which GUI utilities will open it). Command-line text editors and commands (for example, vim and head) ignore the file extension; everything is read as plain text.
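To illustrate that the content, not the extension, determines what a file is, here is a small sketch that guesses the format from the first non-blank line. `sniff_sequence_format` is a hypothetical helper, and the check is only a heuristic: other formats, such as SAM headers, also begin with "@".

```python
def sniff_sequence_format(path):
    """Guess a sequence file's format from its first non-blank line.

    Returns "fastq" if that line starts with "@", "fasta" if it starts
    with ">", "unknown" for anything else, and "empty" when the file has
    no non-blank lines. The file name and extension are never consulted.
    """
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith("@"):
                return "fastq"
            if line.startswith(">"):
                return "fasta"
            return "unknown"
    return "empty"
```

The function gives the same answer for the same data whether the file is saved as `read1.fastqsanger`, `read1.txt`, or `read1.fastq`.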

Written 5 months ago by Jennifer Hillman Jackson