Question: Phenotype data for Ballgown tool
6 months ago by
tamrin.chowdhury40 wrote:

Hello, I am doing RNA-seq analysis on a local Galaxy server, following the new Tuxedo protocol (HISAT > StringTie > Ballgown). I have a question about Ballgown input. Apart from the 5 StringTie output files, Ballgown requires a phenotype data file as input. Is there a reference phenotype data file, or do I have to write the file myself in a text editor?

I have two experimental groups, Tumor and Normal, with 30 matching samples in each group. Can anyone kindly give me an idea of how to arrange the phenotype data file?

rna-seq galaxy ballgown
6 months ago by
United States
Jennifer Hillman Jackson23k wrote:


This input file is a two-column file created by the analyst (you) that contains the sample name and the phenotype. The samples in this file (pData) must be listed in exactly the same order as the samples given in the other inputs, or the tool will error.
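For a design like the one in the question (Tumor and Normal, 30 samples each), the file can also be generated with a short script instead of typed by hand. The following is only a minimal sketch: the sample names (`Tumor_01`, ...), the column headers `ids` and `group`, and the choice of comma-separated output with a header row are all assumptions for illustration, so replace them with your real sample names in the order you pass them to the tool.

```python
import csv
import io


def build_phenotype_rows(groups, samples_per_group):
    """Build a header and rows for a two-column phenotype table.

    The sample names generated here are hypothetical placeholders.
    """
    header = ["ids", "group"]  # assumed column names, not required by the tool
    rows = []
    for group in groups:
        for i in range(1, samples_per_group + 1):
            rows.append([f"{group}_{i:02d}", group])
    return header, rows


def write_phenotype_csv(header, rows, handle):
    """Write the table as comma-separated text to an open file-like handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def same_order(rows, input_sample_names):
    """Return True if the phenotype rows list samples in the given order."""
    return [row[0] for row in rows] == list(input_sample_names)


header, rows = build_phenotype_rows(["Tumor", "Normal"], 30)

# Write to an in-memory buffer here; use open("phenotype.csv", "w", newline="")
# to write a real file instead.
buffer = io.StringIO()
write_phenotype_csv(header, rows, buffer)
phenotype_csv = buffer.getvalue()

# Show the header and the first two sample lines.
for line in phenotype_csv.splitlines()[:3]:
    print(line)
```

The `same_order` helper can be called with the list of sample names in the order you select the other inputs, to catch an ordering mismatch before running the tool.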

The Ballgown wrapper was last updated some time ago, so I am not sure how well it works with the latest versions of Galaxy. If you have problems, contact the tool wrapper author (the repository owner) through the Tool Shed to get help or to report them: log in > search for the tool > click into the tool > find the contact option in the top-right menu named "Repository Actions".

The Ballgown tool (and the existing wrapper) may be reviewed again before or during the GCC Hackathon in late June.

Hope this helps! Others who have used the tool recently in the latest Galaxy release are encouraged to add more.

Thanks, Jen, Galaxy team


Thank you for the kind advice and suggestion.

written 6 months ago by tamrin.chowdhury40

Hi, may I ask a simple question related to this? What is the phenotype? I have two samples with two replicates each, and I am not sure what the phenotype should be when I try to generate the pData file.

written 4 months ago by sophialovechan10

The "phenotype" is a description of the sample. To view an example, download the test data included in the primary publication here:

Specifically, the sample pData input chrX_data/geuvadis_phenodata.csv looks like this:

modified 4 months ago • written 4 months ago by Jennifer Hillman Jackson23k

I have a very basic question: where do the id names for the Ballgown phenotype csv table come from?

For example, my dataset names in the right-side history panel are "193: Stringtie on data 2 and data 11:exon to transcript mapping", "data 11", or "Stringtie on data 2 and data 11".

written 3 months ago by rjames0