I am analysing both sRNA and mRNA data from bacteria for differential expression. For mRNA-seq I used the pipeline Bowtie + StringTie + Ballgown. For sRNA I am planning to use Bowtie + htseq-count + DESeq2. I opted for htseq-count + DESeq2 because I didn't find any previous references that used StringTie for small RNAs. Which one is the better option? Will using StringTie give more accurate results than htseq-count + DESeq2?
Hi there!
I was wondering what sRNA means, but I suppose these are small RNA-Seq reads used to find small non-coding RNAs? For small RNAs I would not use StringTie as the assembler: it tries to assemble genes and their intron/exon structure, and that's not what you will be sequencing with small RNA-Seq. In small RNA-Seq you will be sequencing miRNAs and other ncRNA-derived fragments up to a size of ~35bp. If you want to find miRNAs, you could go for tools like miRDeep, but those are highly specific to miRNAs. If you want a small ncRNA assembler that shouldn't be biased towards a certain subtype of small ncRNAs, you could go for BlockBuster or (okay, this will obviously be biased as I wrote the tool myself) FlaiMapper, which I believe works better, in particular for snoRNAs. I am rewriting that package to make it compatible with entire reference genomes, so you could use it on bacteria as well. This should be finished sometime this week or next.
After assembly it is fine to use htseq-count + DESeq2 if you like. edgeR and/or featureCounts will be fine too; it doesn't matter too much which of those you use.
I gave a talk on small RNA-Seq two days ago. You can have a look if you're interested:
https://humgenprojects.lumc.nl/trac/humgenprojects/wiki/RNA-seq-course#Day2
all best,
Youri
Is it advisable to use different methods of expression analysis for the mRNAs and sRNAs if I want to compare them and extract oppositely regulated pairs?
Hi there,
I can interpret your question in two ways so I'll try to cover them both.
First, you need to estimate read counts for both the mRNA and sRNA data. I would argue for using the same counting method consistently, to avoid or reduce tool-specific count bias. However, if you compare anything from one source with the other, I don't think you can distinguish between covariates due to library prep and actual expression differences. I would also stay as consistent as possible with settings, although sometimes it's necessary to use different settings (e.g. multi-mapping reads affect sRNAs much more than mRNAs).
If you have done expression analysis on mRNA and sRNA separately and want to correlate P-values/FDR, you should try to use the same method for estimating the P-values, to avoid artefacts specific to the normalization method. You will find them; take a look at the following picture: https://github.com/galaxyproject/tools-iuc/pull/20#issuecomment-143682829. This is a correlation of P-values for exactly the same data...
The general problem is that each tool (and even each version, each setting, and the versions of its dependencies) has its own characteristic behaviour, resulting in biases of all kinds that are virtually impossible to correct for. If you just use the same tool (and version, and settings) consistently, you are at least not introducing such biases. In your case I would choose one tool for counting (htseq-count, featureCounts, ...) and do your statistics in just one other tool. Which tool is up to you, of course...
It's just my view, not a gold standard, so please take it with a grain of salt.
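To make the "oppositely regulated pairs" idea concrete, here is a minimal Python sketch, not a recommended method. It assumes both analyses were run with the same statistics tool and that each result has been reduced to a log2 fold change and an adjusted P-value (DESeq2 reports these as `log2FoldChange` and `padj`). The function name `opposite_pairs`, the `candidate_pairs` list (e.g. from a target prediction) and the `alpha` cutoff are all hypothetical choices for illustration.

```python
def opposite_pairs(srna_results, mrna_results, candidate_pairs, alpha=0.05):
    """Return (sRNA, mRNA) pairs that are both significant and change in opposite directions.

    srna_results / mrna_results: dict mapping feature name -> (log2_fold_change, padj)
    candidate_pairs: iterable of (srna_name, mrna_name) tuples (hypothetical input)
    alpha: adjusted P-value cutoff (an assumption, pick your own)
    """
    hits = []
    for srna, mrna in candidate_pairs:
        s = srna_results.get(srna)
        m = mrna_results.get(mrna)
        if s is None or m is None:
            continue  # feature missing from one of the result tables
        s_lfc, s_padj = s
        m_lfc, m_padj = m
        both_significant = s_padj < alpha and m_padj < alpha
        opposite_direction = s_lfc * m_lfc < 0  # signs differ
        if both_significant and opposite_direction:
            hits.append((srna, mrna))
    return hits


# Toy values, made up purely to show the call
srna = {"sRNA1": (2.1, 0.001), "sRNA2": (-1.5, 0.20)}
mrna = {"geneA": (-1.8, 0.004), "geneB": (1.2, 0.01)}
pairs = [("sRNA1", "geneA"), ("sRNA2", "geneB"), ("sRNA1", "geneB")]
result = opposite_pairs(srna, mrna, pairs)
```

Keep in mind the caveat above: library prep differences between the sRNA and mRNA data are not accounted for by a simple sign comparison like this.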
Thank you for your detailed answer. In fact I am working on non-coding RNAs in bacteria, and I currently focus only on differential expression and not on the identification of novel sRNAs (that part was done earlier by another team and I am using their annotation for my work).
If you do have an annotation of existing sRNAs, you can indeed skip the assembly part and proceed with htseq-count + DESeq2. Good luck!
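As a rough illustration of the step between counting and statistics, the per-sample htseq-count outputs can be merged into one count table. htseq-count writes tab-separated `feature<TAB>count` lines followed by summary lines whose names start with `__` (such as `__no_feature`). The sketch below is only an assumption about how you might combine them; the function name `merge_htseq_counts` and the sample names are hypothetical, and it takes the file contents as strings so it stays self-contained.

```python
def merge_htseq_counts(samples):
    """Merge htseq-count outputs into one table.

    samples: dict mapping sample name -> text content of that sample's htseq-count output
    Returns (sample_names, table) where table maps feature -> list of counts,
    ordered like sample_names. Summary lines starting with '__' are skipped.
    """
    names = list(samples)
    table = {}
    for idx, name in enumerate(names):
        for line in samples[name].splitlines():
            if not line.strip():
                continue  # ignore blank lines
            feature, count = line.split("\t")
            if feature.startswith("__"):
                continue  # htseq-count summary line, not a feature
            row = table.setdefault(feature, [0] * len(names))
            row[idx] = int(count)
    return names, table


# Toy input, made up to show the expected file layout
names, table = merge_htseq_counts({
    "ctrl": "sRNA1\t10\nsRNA2\t0\n__no_feature\t5\n",
    "treat": "sRNA1\t3\nsRNA2\t7\n__ambiguous\t1\n",
})
```

A table like this, written out as a matrix, is the kind of input DESeq2 expects; using the same merging step for the mRNA and sRNA counts keeps the two analyses consistent.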
It appears that y.hoogstrate helped to solve your problem. If you could accept this answer, that will help others with the same type of inquiry find the correct answer. This also helps you, as accepted answers encourage others to do the same, building this resource and community. Thanks! Jen, Galaxy team