Hi,
I am struggling to find a proper annotation file for the human genome that includes the gene name (e.g. ACTB) and not some x#*! I need this when running Cufflinks, to annotate the genes from my TopHat output. On the UCSC Table Browser webpage I can find the RefSeq annotation I need, but when I export it to Galaxy as GTF, only some columns are included, unfortunately excluding the gene name. If I do a custom export (from UCSC), I get what I want, but I cannot have it as a GTF file (obviously because the information in the custom exported file is not (fully) tab-delimited)!
Any ideas?
Thanks! Andrei
Gene names aren't unique, which is why they aren't normally included in the output. Why not use BioMart to import the ID-to-gene-name conversion table and annotate the Cufflinks results with that? This is how we normally teach people to do it in our trainings, since it doesn't break downstream analyses.
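For what it's worth, here is a minimal sketch of that annotation step in plain Python, in case you want to do it outside Galaxy. It assumes the conversion table is a two-column, tab-delimited file with a header row (ID first, gene name second) and that the results table is tab-delimited with a header. The function names, the `tracking_id` and `gene_name` column names, and the sample rows are only illustrative, so check the headers of your own files. The original ID column is kept and the name is added as an extra column, so anything downstream that keys on the ID still works.

```python
import csv


def load_id_to_name(mapping_lines):
    """Build {id: gene name} from tab-delimited lines with a header row."""
    reader = csv.reader(mapping_lines, delimiter="\t")
    next(reader, None)  # skip the header row
    mapping = {}
    for row in reader:
        if len(row) >= 2 and row[0]:
            mapping[row[0]] = row[1]
    return mapping


def annotate(results_lines, mapping, id_column, missing="-"):
    """Return the result rows as dicts with an added 'gene_name' column."""
    reader = csv.DictReader(results_lines, delimiter="\t")
    rows = []
    for row in reader:
        # Keep the ID untouched; unmapped or empty names get a placeholder.
        row["gene_name"] = mapping.get(row[id_column], "") or missing
        rows.append(row)
    return rows


# Illustrative in-memory data; real input would come from open(...) handles.
mapping_text = "id\tname\nNM_001101\tACTB\n"
results_text = "tracking_id\tFPKM\nNM_001101\t12.5\nNM_UNKNOWN\t0.7\n"

mapping = load_id_to_name(mapping_text.splitlines())
rows = annotate(results_text.splitlines(), mapping, id_column="tracking_id")

for row in rows:
    print(row["tracking_id"], row["gene_name"], row["FPKM"], sep="\t")
```

With real files you would pass open file handles instead of the split strings, and write the rows back out with `csv.DictWriter` using the same tab delimiter.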