Hello, I am doing RNA-seq analysis on a local Galaxy server. I'm following the new Tuxedo protocol: HISAT > StringTie > Ballgown. I have a question regarding the Ballgown input. Apart from the 5 StringTie output files, Ballgown requires a phenotype data file as input. Is there a reference phenotype data file, or do I have to write the file myself using a text editor?

I have two experiment groups of Tumor and Normal with 30 matching samples in each group. Can anyone kindly give me an idea about how to arrange the phenotype data file?

This input file is a two-column file created by the analyst (you) that contains the sample name and phenotype. The order of the samples in this file (pData) must be in the same exact order as the samples given in other inputs or the tool will error.
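As a rough sketch of how such a file could be put together for the 30 Tumor / 30 Normal case described above, the Python below builds the two columns and writes them as CSV. The sample names (`tumor_01`, `normal_01`, ...), the header labels (`ids`, `group`), and the output filename `phenodata.csv` are all placeholders I made up for illustration; substitute the names that match your own inputs. Whether the wrapper expects a header row or a particular delimiter is also an assumption here, so adjust if the tool complains.

```python
import csv
import io


def build_pdata_rows(sample_ids, phenotypes):
    """Pair each sample name with its phenotype, keeping the given order."""
    if len(sample_ids) != len(phenotypes):
        raise ValueError("need exactly one phenotype per sample")
    # Header labels are hypothetical; rename or drop them as your tool requires.
    return [("ids", "group")] + list(zip(sample_ids, phenotypes))


def rows_to_csv(rows):
    """Render the rows as CSV text, one sample per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample names; list them in the same order as the other inputs.
tumor_ids = [f"tumor_{i:02d}" for i in range(1, 31)]
normal_ids = [f"normal_{i:02d}" for i in range(1, 31)]
sample_ids = tumor_ids + normal_ids
phenotypes = ["Tumor"] * 30 + ["Normal"] * 30

rows = build_pdata_rows(sample_ids, phenotypes)
pdata_text = rows_to_csv(rows)

if __name__ == "__main__":
    # Write the file that would then be uploaded to the Galaxy history.
    with open("phenodata.csv", "w", newline="") as handle:
        handle.write(pdata_text)
```

The important part is that `sample_ids` is a single ordered list, so the rows come out in exactly the order you supply the samples elsewhere. A plain text editor or spreadsheet export works just as well for a file this small.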

The Ballgown wrapper was last updated some time ago, so I am not sure how well it works with the latest versions of Galaxy. If you have problems, contact the tool wrapper author (aka repository owner) through the Tool Shed to get help or to report them: log in > search for the tool > click into the tool and find the contact option in the top right menu named "Repository Actions".

The Ballgown tool (and the existing wrapper) may be reviewed again before or during the GCC Hackathon in late June.

Hope this helps! Others who have used the tool recently in the latest Galaxy release are encouraged to add more.

Thanks, Jen, Galaxy team

Thank you for the kind advice and suggestion.

Hi, may I ask a simple question related to this? What is the phenotype? I have two samples and two replicates for each sample. I am not sure what the phenotype should be when I try to generate the pData file.

The "phenotype" is a description of the sample. To view an example, download the test data included with the primary publication here:

Specifically, the sample pData input chrX_data/geuvadis_phenodata.csv looks like this:

I have a most basic question: Where do the id names for the Ballgown phenotype csv table come from?

For example my dataset names in the right-side history panel are: "193: Stringtie on data 2 and data 11:exon to transcript mapping" or "data 11" or "Stringtie on data 2 and data 11"

Hello @rjames, did you get to the bottom of this? Did you use the history name, "193: Stringtie on data 2 and data 11:exon to transcript mapping", or the dataset file name as stored in folders, e.g. "dataset_2010.dat"? I'm lost on this too. The pheno data looks straightforward, but the sample naming is a bit tricky. Ryan

