Remove sequences with duplicate chromosome start position

Question: Remove sequences with duplicate chromosome start position

lem (USA/Chicago) wrote, 4.1 years ago:

I used Galaxy to extract the 1000 bp upstream of all UCSC genes (i.e. promoters). I sorted the data by chromosome and then by start position (i.e. by c1, then by c2). For any gene with multiple isoforms that use the same start site, there will be duplicate chromosome/start coordinates, and I want to remove these.

Essentially, column 2 contains the start coordinate. I want to remove all lines with a duplicate start coordinate (for a given chromosome).

Thank you in advance for your wonderful help to a student who is still learning the computational basics.

remove duplicate column • 1.0k views

modified 4.1 years ago by Bjoern Gruening ♦ 5.1k • written 4.1 years ago by lem • 0

Bjoern Gruening (Germany) wrote, 4.1 years ago:

Hi Lem,

You can use the tool "Unique occurrences of each record" and apply it to c1 and c2 only. This setting is under the advanced options.

Cheers,

Bjoern
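
For readers who cannot find the tool option, the same idea (keep one record per unique c1/c2 pair) can be sketched outside Galaxy. The snippet below is only an illustration, not the implementation of the Galaxy tool: the function name `unique_by_chrom_start` and the sample rows are hypothetical, and it assumes a tab-delimited file with the chromosome in column 1 and the start coordinate in column 2. It keeps the first line seen for each pair, so which isoform survives depends on the file order.

```python
def unique_by_chrom_start(lines):
    """Keep only the first line seen for each (chromosome, start) pair.

    Assumes tab-delimited rows with chromosome in column 1 (c1)
    and start coordinate in column 2 (c2).
    """
    seen = set()
    kept = []
    for line in lines:
        row = line.rstrip("\n")
        if not row:
            continue  # skip blank lines
        fields = row.split("\t")
        if len(fields) < 2:
            kept.append(row)  # pass through rows that lack a start column
            continue
        key = (fields[0], fields[1])
        if key in seen:
            continue  # same chromosome and start already kept
        seen.add(key)
        kept.append(row)
    return kept


# Hypothetical sample rows: two isoforms of geneA share a start site.
sample_rows = [
    "chr1\t1000\t2000\tgeneA.1",
    "chr1\t1000\t2000\tgeneA.2",
    "chr1\t5000\t6000\tgeneB",
    "chr2\t1000\t2000\tgeneC",
]

for row in unique_by_chrom_start(sample_rows):
    print(row)
```

Because the key combines c1 and c2, the chr2 row with start 1000 is kept even though chr1 has the same start value. The input does not need to be sorted for this approach to work.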

Thanks, Bjoern. I do not see this option, though, nor am I able to find an 'advanced options' tab.

(reply written 4.1 years ago by lem)
