Hi everyone! I have a question and I hope you can help me out. I have a list of genes for which I'd like to obtain the 5'-UTR and 3'-UTR sequences. I work with Medicago truncatula, and I have RNA-seq data to work with. Any ideas? Thanks so much.
This is easiest to do in R (within Galaxy, I presume you could use one of the interactive environments) with the GenomicFeatures and rtracklayer Bioconductor packages:
    library(GenomicFeatures)
    library(rtracklayer)
    txdb = makeTxDbFromGFF("Mt4.0v1_genes_20130731_1800.gff3")
    utr5p = fiveUTRsByTranscript(txdb, use.names=TRUE)
    utr3p = threeUTRsByTranscript(txdb, use.names=TRUE)
    export.bed(utr5p, "5pUTR.bed")
    export.bed(utr3p, "3pUTR.bed")
Both output files (5pUTR.bed and 3pUTR.bed) are BED12 files, though you could export GFF3 or GTF files instead if you prefer. I'll take the liberty of uploading the 3' and 5' BED12 files so you can also just download them.
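Since the question asks for sequences rather than coordinates, here is a rough sketch of one way to go from the UTR ranges to FASTA files, restricted to a gene list. It assumes a genome FASTA whose sequence names match the chromosome names in the GFF3. The file name "genome.fa" and the gene ID below are placeholders I made up, not real inputs. In recent Bioconductor releases, makeTxDbFromGFF may need to be loaded from the txdbmaker package instead.

```r
library(GenomicFeatures)
library(Rsamtools)    # FaFile(), indexFa()
library(Biostrings)   # writeXStringSet()

# ASSUMPTION: placeholder path to the genome FASTA matching the annotation;
# its sequence names must be identical to the seqnames in the GFF3.
genome_path <- "genome.fa"
indexFa(genome_path)               # writes genome.fa.fai next to the FASTA
genome <- FaFile(genome_path)

txdb  <- makeTxDbFromGFF("Mt4.0v1_genes_20130731_1800.gff3")
utr5p <- fiveUTRsByTranscript(txdb, use.names = TRUE)
utr3p <- threeUTRsByTranscript(txdb, use.names = TRUE)

# ASSUMPTION: hypothetical gene ID; replace with your own list. The IDs must
# match the gene IDs stored in the TxDb (inspect them with keys(txdb, "GENEID")).
genes_of_interest <- c("Medtr1g004940")

# Map gene IDs to transcript names, then keep only those transcripts.
tx <- AnnotationDbi::select(txdb, keys = genes_of_interest,
                            columns = "TXNAME", keytype = "GENEID")
utr5p <- utr5p[names(utr5p) %in% tx$TXNAME]
utr3p <- utr3p[names(utr3p) %in% tx$TXNAME]

# Concatenate the UTR exons of each transcript in 5'-to-3' order
# (reverse-complemented for minus-strand transcripts).
utr5p_seqs <- extractTranscriptSeqs(genome, utr5p)
utr3p_seqs <- extractTranscriptSeqs(genome, utr3p)

writeXStringSet(utr5p_seqs, "5pUTR.fa")
writeXStringSet(utr3p_seqs, "3pUTR.fa")
```

Transcripts without an annotated UTR simply won't appear in the output, so it is worth comparing the number of sequences written against the number of transcripts you expected.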