Question: StringTie output from multiple samples into an FPKM matrix

20 months ago by

vaughandy • 10 wrote:

Hi all,

Does anyone have a good set of steps to take the StringTie transcript-level expression output from multiple samples and put it into an FPKM matrix? Do I need to first normalize between the samples with something like Cuffmerge? I thought that StringTie merge would do this, but it doesn't seem to generate any FPKM values for the samples.

If I can just take the FPKM values from the multiple samples and compare them at face value, then I'm probably fine and can build the matrix with some of the text manipulation tools in Galaxy. But I'm guessing a comparison across samples needs another step first to be appropriate.

Thanks!!!

merge fpkm stringtie • 1.5k views


modified 20 months ago by Mo Heydarian ♦ 830 • written 20 months ago by vaughandy • 10

20 months ago by

Mo Heydarian ♦ 830 (United States) wrote:

Hello,

This is a common scenario. You first have to choose an analysis strategy: do you want to quantify expression of only the annotated transcripts in a reference database, or of all the transcripts present in your experiment?

The first strategy is a reference-based transcript evaluation pipeline: after mapping the reads, you quantify expression levels (and differential expression) relative to a reference transcriptome database (for example, RefSeq). The second strategy is a de novo transcriptome reconstruction pipeline: after mapping, the reads are assembled into transcript structures (in the absence of a reference annotation) to provide a comprehensive view of the transcriptome in each sample. These de novo transcript structures are then passed to a tool like Cuffmerge or StringTie merge to generate an experiment-specific transcriptome database, which is then used as the reference to generate expression values and differential expression estimates with tools like Cuffdiff and featureCounts/DESeq2.
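Once every sample has been quantified against the same transcriptome (for example, by re-running StringTie with `-e` and `-G` pointed at the merged GTF, so that transcript IDs match across samples), the per-sample FPKM values can be collected into a matrix. Here is a minimal Python sketch of that step. It assumes each sample's output is a StringTie GTF whose `transcript` rows carry `transcript_id` and `FPKM` attributes. The function names, the `name=path` command-line convention, and the choice to fill missing transcripts with 0.0 are all assumptions for illustration, not part of StringTie or Galaxy.

```python
import re
import sys

# Attribute patterns for the ninth GTF column of StringTie transcript rows.
_TID = re.compile(r'transcript_id "([^"]+)"')
_FPKM = re.compile(r'FPKM "([^"]+)"')


def parse_fpkm(gtf_lines):
    """Return {transcript_id: FPKM} from the transcript rows of one GTF."""
    fpkm = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # exon rows do not carry FPKM
        tid = _TID.search(fields[8])
        val = _FPKM.search(fields[8])
        if tid and val:
            fpkm[tid.group(1)] = float(val.group(1))
    return fpkm


def build_matrix(samples):
    """samples: {sample_name: iterable of GTF lines}. Returns (header, rows)."""
    per_sample = {name: parse_fpkm(lines) for name, lines in samples.items()}
    names = list(per_sample)
    transcripts = sorted({t for d in per_sample.values() for t in d})
    header = ["transcript_id"] + names
    # Assumption: a transcript missing from a sample is reported as 0.0.
    rows = [[t] + [per_sample[n].get(t, 0.0) for n in names] for t in transcripts]
    return header, rows


def main(args):
    # Hypothetical usage: python fpkm_matrix.py s1=s1.gtf s2=s2.gtf > matrix.tsv
    samples = {}
    for arg in args:
        name, path = arg.split("=", 1)
        with open(path) as handle:
            samples[name] = handle.readlines()
    header, rows = build_matrix(samples)
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(v) for v in row))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Note that this only tabulates the values. FPKM is normalized within each sample, so whether the columns can be compared at face value still depends on the downstream analysis. For differential expression, count-based tools such as DESeq2 apply their own between-sample normalization.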

Have a look at the two strategies using the provided links. These will take you to step-by-step tutorials that will introduce you to both strategies and commonly used tools (with parameter recommendations) to achieve your goals.

Hope this helps!

Thanks for using Galaxy!

Cheers, Mo Heydarian
