Question: Count Reads Mapping To Introns, Extragenic Regions
Alex Koeppel • 10 wrote:
Hello everyone,
I have some SAM/BAM files containing the alignments of some RNA-seq
reads to hg19. I'm interested in calculating some mapping statistics,
specifically the percentage of reads mapping to exons, introns, and
extragenic regions.
I gather that this can be done with bedtools, but I'm a little stuck
figuring out which files I need to get this information. I believe I
need a GTF (or possibly GFF) file, and I downloaded one from the UCSC
browser using the settings in the attached image.
The first couple of lines of the resulting file are pasted below. I see
that the file has exon start and end sites. Is there a way to get what
I need with this file, or do I need something else?
Any assistance would be much appreciated,
Thanks
Alex
cat gencode.gtf | head -3
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
0 ENST00000237247.6 chr1 + 66999065 67210057 67000041 67208778 27 66999065,66999928,67091529,67098752,67099762,67105459,67108492,67109226,67126195,67133212,67136677,67137626,67138963,67142686,67145360,67147551,67149789,67154830,67155872,67161116,67184976,67194946,67199430,67205017,67206340,67206954,67208755, 66999090,67000051,67091593,67098777,67099846,67105516,67108547,67109402,67126207,67133224,67136702,67137678,67139049,67142779,67145435,67148052,67149870,67154958,67155999,67161176,67185088,67195102,67199563,67205220,67206405,67207119,67210057, 0 SGIP1 cmpl cmpl -1,0,1,2,0,0,0,1,0,0,0,1,2,1,1,1,1,1,0,1,1,2,2,0,2,1,1,
0 ENST00000371039.1 chr1 + 66999274 67210768 67000041 67208778 22 66999274,66999928,67091529,67098752,67105459,67108492,67109226,67136677,67137626,67138963,67142686,67145360,67154830,67155872,67160121,67184976,67194946,67199430,67205017,67206340,67206954,67208755, 66999355,67000051,67091593,67098777,67105516,67108547,67109402,67136702,67137678,67139049,67142779,67145435,67154958,67155999,67160187,67185088,67195102,67199563,67205220,67206405,67207119,67210768, 0 SGIP1 cmpl cmpl -1,0,1,2,0,0,1,0,1,2,1,1,1,0,1,1,2,2,0,2,1,1,
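For illustration only, here is a minimal Python sketch of how the exonStarts and exonEnds columns in a table like the one above could be turned into exon and intron intervals. It assumes whitespace-separated fields in the column order given by the header line, and 0-based, half-open coordinates (the convention UCSC tables use, which matches BED). The function names and the toy row are made up for the example; they do not come from the downloaded file or from any library. Extragenic regions are not handled here; they would be whatever lies outside all txStart-txEnd spans.

```python
# Sketch: derive exon and intron intervals from one row of a UCSC
# genePred-style table. Field positions follow the header shown above:
# index 1 = name, 2 = chrom, 3 = strand, 9 = exonStarts, 10 = exonEnds.

def parse_coords(field):
    """Turn a comma-terminated list such as '100,200,' into [100, 200]."""
    return [int(x) for x in field.split(",") if x]

def exons_and_introns(row):
    """Return (name, chrom, strand, exons, introns) for one table row."""
    fields = row.split()
    name, chrom, strand = fields[1], fields[2], fields[3]
    starts = parse_coords(fields[9])
    ends = parse_coords(fields[10])
    exons = list(zip(starts, ends))
    # An intron runs from the end of one exon to the start of the next.
    introns = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]
    return name, chrom, strand, exons, introns

# Toy row with invented coordinates (hypothetical, not real annotation).
toy_row = "0 tx1 chr1 + 100 500 150 450 3 100,200,400, 150,300,500, 0 GENE1 cmpl cmpl 0,1,2,"

name, chrom, strand, exons, introns = exons_and_introns(toy_row)

# Emit BED-like lines: chrom, start, end, label, score placeholder, strand.
for kind, intervals in (("exon", exons), ("intron", introns)):
    for start, end in intervals:
        print(chrom, start, end, f"{name}_{kind}", 0, strand, sep="\t")
```

Intervals written this way could then be saved as BED files and compared against the alignments with an interval tool. Overlapping transcripts of the same gene (as with the two SGIP1 rows above) would need to be merged first so that reads are not counted twice.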
modified 5.7 years ago by Jennifer Hillman Jackson ♦ 25k • written 5.8 years ago by Alex Koeppel • 10