I am new to RNA-seq data analysis. I am using Cygwin to get a Linux-like environment on my Windows machine.
I have indexed a reference genome and am running the alignment with HISAT2. After that, I plan to use Cuffdiff to run a differential expression analysis between data sets.
The output files from the alignment are in SAM format. However, I understand these files must be reformatted before they can be passed to Cuffdiff. How do I reformat these files for Cufflinks? Can someone please explain how this is done using Samtools (and include the command/script)?
I am also aware that a gene annotation file is required as input for these steps. There is a gene annotation file in the correct format included with the reference genome I downloaded. Is this okay to use, or is it better (or even possible) to obtain one using Cufflinks?
Could someone also write a brief synopsis of the programs I mentioned (Cufflinks/Cuffdiff and Samtools) and, if possible, include the command for Cuffdiff?
Also, roughly how long should I expect these steps to take, given the software I am using?
Please answer as much as you can as soon as possible.