Question: sort reads by start position
c.zijlstra wrote, 2.1 years ago:

I did RNA sequencing, and many reads map to tRNA regions. Specific tRNAs are completely covered by the reads that map to them, but many reads are not full-length tRNA (some reads start at the 5' end, while others start, for example, at base 20). To find out which part of the tRNA the reads correspond to, I would like to sort (and count) reads by their start position. (I already used the Intersect tool to determine the fraction of tRNA overlap.) Does anyone know which tool on the Galaxy platform I can use to do this?

Thanks in advance. Carla

rna-seq • 664 views
Jennifer Hillman Jackson (United States) wrote, 2.1 years ago:

Hello,

Use the tool Group. Summarize by c1 (the default), then add the c2 (start) column with the operation "count distinct". The output can be sorted any way you want with the tool Sort, which is in the same tool group.

Thanks! Jen, Galaxy team
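
Editor's note: for readers who want to check what this grouping produces outside Galaxy, here is a minimal Python sketch. It is not the Group tool's implementation. The sample rows, the three-column layout (c1 = chromosome, c2 = start, c3 = end), and both function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical BED-like rows: chrom (c1), start (c2), end (c3).
reads = [
    ("chr1", 100, 172),
    ("chr1", 100, 150),
    ("chr1", 120, 172),
    ("chr2", 500, 560),
]


def distinct_starts_per_group(rows):
    """Group by c1 and count the distinct c2 values in each group."""
    starts = defaultdict(set)
    for chrom, start, _end in rows:
        starts[chrom].add(start)
    return {chrom: len(values) for chrom, values in starts.items()}


def reads_per_start(rows):
    """Count how many reads share each (chrom, start) pair."""
    return Counter((chrom, start) for chrom, start, _end in rows)


distinct = distinct_starts_per_group(reads)
per_start = reads_per_start(reads)

# Print the per-start counts sorted by chromosome, then start position.
for (chrom, start), n in sorted(per_start.items()):
    print(chrom, start, n)
```

Note that "count distinct" on c2 gives one number per chromosome (how many different start values occur), while the second function gives the number of reads at each individual start position.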


Thanks Jennifer! I tried the Group tool as you suggested, but then I only get the number of reads per chromosome (not per gene) that start exactly at the beginning of the regions defined in column 2. I would also like to sort reads that start at positions further towards the 3' end of the region of interest. Is that also possible?

Cheers! carla

written 2.1 years ago by c.zijlstra

Hi Carla, the sort by start coordinate is consistent throughout the sorted file. You could also sort by stop coordinate, or use the tool Filter to find regions larger and/or smaller than a particular value. Tools in the groups Operate on Genomic Intervals and BED Tools have other options if you want to compare hit region coordinates to defined/annotated regions in the genome.
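
Editor's note: the three operations mentioned here (sorting by start, filtering by region size, and comparing read coordinates to an annotated region) can be sketched in Python as follows. This is an illustration under stated assumptions, not what the Galaxy tools do internally. The six-column layout (read chrom, read start, read end, region name, region start, region end), the region names, and the function names are hypothetical.

```python
# Hypothetical intersect-style rows:
# read chrom, read start, read end, region name, region start, region end.
hits = [
    ("chr1", 120, 172, "tRNA-Ala", 100, 172),
    ("chr1", 100, 172, "tRNA-Ala", 100, 172),
    ("chr1", 100, 130, "tRNA-Ala", 100, 172),
    ("chr2", 510, 560, "tRNA-Gly", 500, 571),
]


def sort_by_start(rows):
    """Sort by chromosome, then by read start coordinate."""
    return sorted(rows, key=lambda r: (r[0], r[1]))


def filter_by_length(rows, min_len):
    """Keep reads whose length (end - start) is at least min_len."""
    return [r for r in rows if r[2] - r[1] >= min_len]


def start_offsets(rows):
    """Offset of each read start from its region start.

    Strand is ignored here; for a minus-strand feature the 5' end is at
    the region end, so the offset would need to be measured from there.
    """
    return [(r[3], r[1] - r[4]) for r in rows]


sorted_hits = sort_by_start(hits)
long_hits = filter_by_length(hits, 50)
offsets = start_offsets(sorted_hits)

for name, offset in offsets:
    print(name, offset)
```

Counting the offsets per region name would then give the number of reads per gene at each start position, rather than per chromosome.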

modified 2.1 years ago • written 2.1 years ago by Jennifer Hillman Jackson

Hi Jennifer, I finally managed to do the filtering and sorting I wanted by using the tool 'Filter data on any column..'. So thanks again for your advice!

Carla

written 2.1 years ago by c.zijlstra