Question: Normalization of RNA-seq data
18 months ago, alice.mcgovern wrote:

Hi,

I want to normalize my RNA-seq data without running a differential gene expression analysis on any of my samples. I can easily run HISAT2 and then count reads from the SAM/BAM files to get a count matrix, but I assume these are raw counts? How do I then, at the very least, normalize those counts to account for library size, etc.?

Thanks

rna-seq normalization • 1.2k views

One generally normalizes in comparison to other samples (e.g., with edgeR or DESeq2 in R), which is usually the most robust method.

written 18 months ago by Devon Ryan
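
To illustrate why normalization is usually done relative to other samples, here is a minimal base-R sketch of the median-of-ratios idea that DESeq2's size factors are based on. The function name `median_of_ratios` and the toy matrix are made up for illustration; this is not DESeq2's own code, and a real analysis should use the package itself.

```r
# Sketch of median-of-ratios size factors (illustrative, not DESeq2's code).
median_of_ratios <- function(counts) {
  log_counts <- log(counts)
  # Per-gene log geometric mean across all samples.
  log_geo_means <- rowMeans(log_counts)
  # A gene with a zero in any sample has a log geometric mean of -Inf; drop it.
  keep <- is.finite(log_geo_means)
  # Size factor of a sample: median ratio of its counts to the geometric means.
  apply(log_counts[keep, , drop = FALSE], 2, function(col) {
    exp(median(col - log_geo_means[keep]))
  })
}

# Toy counts: every nonzero s2 count is exactly twice the s1 count.
toy <- matrix(
  c(10, 20,
    100, 200,
    5, 10,
    0, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("s1", "s2"))
)

sf <- median_of_ratios(toy)
normalized <- sweep(toy, 2, sf, "/")  # divide each column by its size factor
print(sf)
print(normalized)
```

Each size factor depends on every sample through the per-gene geometric mean, which is why a single sample cannot be normalized this way in isolation.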

Thanks for your response. Basically, I have RNA-seq FASTQ files from single cells and I don't want to draw any comparisons between them yet. I'm just hoping to normalize and come up with a table of gene expression. Do edgeR or DESeq2 do that without running differential comparisons? Sorry if this is a really naive question.

written 18 months ago by alice.mcgovern

18 months ago, Devon Ryan (Germany) wrote:

DESeq2 can do that. Randomly assign your samples to two groups (it doesn't matter which samples end up in which group) and run dds = DESeq(...) on that. Then get a matrix of normalized counts with counts(dds, normalized=TRUE). The DESeq2 Galaxy wrapper has an "Output normalized counts table" option; if you set it to "Yes", you can use that file and ignore the rest of the output.

BTW, that's certainly not a naive question :)
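
For anyone working in R directly, the idea above can be sketched as follows. As an alternative to assigning random groups, this sketch assumes an intercept-only design (~ 1) and calls estimateSizeFactors() in place of the full DESeq(), since normalized counts only depend on the size factors. The toy matrix and the names `counts_mat`, `coldata`, and `norm_counts` are placeholders, not anything from the original workflow.

```r
library(DESeq2)

# Toy raw count matrix (genes x samples); replace with your own counts.
counts_mat <- matrix(
  c(10L, 20L, 30L, 40L,
    100L, 210L, 290L, 405L,
    5L, 9L, 16L, 21L,
    50L, 95L, 160L, 190L),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4))
)

# Minimal sample table; an intercept-only design needs no group column.
coldata <- data.frame(sample = colnames(counts_mat),
                      row.names = colnames(counts_mat))

dds <- DESeqDataSetFromMatrix(countData = counts_mat,
                              colData = coldata,
                              design = ~ 1)

# Estimate per-sample size factors; no differential testing is run.
dds <- estimateSizeFactors(dds)

# Raw counts divided by each sample's size factor.
norm_counts <- counts(dds, normalized = TRUE)
print(sizeFactors(dds))
print(norm_counts)
```

With sparse single-cell counts, where every gene may have a zero in at least one cell, the default size factor estimator can fail; estimateSizeFactors() has a type = "poscounts" option that may be worth trying in that case.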


Hi, thanks again. Sorry to be annoying, but for some reason when I try to run DESeq2 on my files I get an error. My sequence of steps is: run HISAT2 on my FASTQ files, run BAM to DGE count matrix individually on each HISAT2 output, then run DESeq2 as you suggested. The error I get is:

Tool execution generated the following error message:

Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Warning messages:
1: multiple methods tables found for 'arbind'
2: multiple methods tables found for 'acbind'
3: replacing previous import 'IRanges::arbind' by 'SummarizedExperiment::arbind' when loading 'DESeq2'
4: replacing previous import 'IRanges::acbind' by 'SummarizedExperiment::acbind' when loading 'DESeq2'
Error in Ops.factor(a$V1, l[[1]]$V1) : level sets of factors are different
Calls: DESeqDataSetFromHTSeqCount -> sapply -> sapply -> lapply -> FUN -> Ops.factor
Warning message:
In is.na(e1) | is.na(e2) : longer object length is not a multiple of shorter object length

The tool produced the following additional output:

DESeq2 run information

sample table:
A
dataset_53446.dat A2
dataset_53447.dat A2
dataset_53448.dat A1
dataset_53449.dat A1

design formula: ~A

primary factor: A


written 18 months ago by alice.mcgovern

Click on the bug report button in the history item so your local admin can have a look. This is one of those things that takes 30 minutes or so to debug locally but days remotely.

written 18 months ago by Devon Ryan
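
One possible reading of the error above, offered as an assumption and not a confirmed diagnosis: the call chain ends in a comparison of a$V1 with l[[1]]$V1 inside DESeqDataSetFromHTSeqCount, which suggests that the first column (the feature IDs) of at least one count file differs from that of the first file. A minimal base-R sketch for checking this follows. The helper `check_gene_ids` is hypothetical, and it assumes tab-separated count files with no header and the IDs in the first column.

```r
# Hypothetical helper: return the count files whose first column (feature IDs)
# is not identical, in content and order, to that of the first file.
check_gene_ids <- function(paths) {
  ids <- lapply(paths, function(p) {
    read.table(p, header = FALSE, sep = "\t", stringsAsFactors = FALSE)$V1
  })
  ref <- ids[[1]]
  mismatched <- !vapply(ids, function(x) identical(x, ref), logical(1))
  paths[mismatched]
}

# Self-contained demo with two temporary files; the second lacks geneC.
f1 <- tempfile(fileext = ".tsv")
f2 <- tempfile(fileext = ".tsv")
writeLines(c("geneA\t10", "geneB\t0", "geneC\t7"), f1)
writeLines(c("geneA\t12", "geneB\t3"), f2)

bad <- check_gene_ids(c(f1, f2))
print(bad)  # path of the second temporary file
```

If a check like this flags any files, counting every sample against the same annotation so that all files list the same features in the same order would be a reasonable thing to try before rerunning DESeq2.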