This is somewhat Galaxy-related, but it is more of a general question.
I did all of my mapping and initial analysis for scRNA-Seq in Galaxy using TopHat and Cufflinks. At this point, I've downloaded the data and generated an expression matrix.
However, I found that many of the reference annotation entries from my .GTF file share the same common gene name (i.e., there are multiple rows in my matrix with the same common gene name). What is an appropriate way of handling this if my main interest is to look at gene expression for clustering? Should I simply sum all the rows with the same common gene name, or is there a more appropriate transformation that keeps my data accurate?
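To make the summing option concrete, here is a minimal sketch of what I have in mind. The function name `sum_rows_by_gene`, the gene names, and the values are all made up for illustration, and I'm assuming the matrix is held as (gene name, per-cell values) rows. It only shows the operation I'm asking about; it isn't meant as a claim that summing is the right choice.

```python
def sum_rows_by_gene(rows):
    """Collapse rows that share a gene name by summing their per-cell values.

    `rows` is a list of (gene_name, [value_for_cell_1, value_for_cell_2, ...]).
    Returns a dict mapping each gene name to one summed row.
    """
    totals = {}
    for gene, values in rows:
        if gene not in totals:
            # First row seen for this gene: copy it so the input is not modified.
            totals[gene] = list(values)
        else:
            if len(values) != len(totals[gene]):
                raise ValueError(f"row length mismatch for {gene}")
            # Later rows with the same name: add element-wise, cell by cell.
            totals[gene] = [a + b for a, b in zip(totals[gene], values)]
    return totals


# Toy matrix with three cells; GENE_A appears twice (hypothetical data).
rows = [
    ("GENE_A", [1.0, 0.0, 3.0]),
    ("GENE_A", [2.0, 5.0, 0.0]),
    ("GENE_B", [0.0, 4.0, 1.0]),
]
collapsed = sum_rows_by_gene(rows)
```

In this toy case the two `GENE_A` rows end up as a single row, and `GENE_B` is left unchanged.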
If more information is needed: my samples are human, and I used hg38 with the UCSC gene names for my reference.