Hi, Taka,
I noticed that the full Manhattan plot looks odd in the history I have
shared with you, and I think it's because the offsets for some of your
SNPs are wrong.
For example, the very last marker on chr1 in your data is rs11488669.
In your data, its offset is 2147483647, which is way beyond the end of
chr1 - the whole genome is only about 3 billion base pairs - so the
Manhattan plot looks clumpy instead of uniform.
According to genome.ucsc.edu, it is at chr1:153517269-153517769.
I'm going to guess that your data (e.g. the map file) has at some stage
been edited using spreadsheet software such as Excel, which can easily
do strange things to numeric columns.
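
If you want to check a map file for this kind of damage before
uploading, here is a rough Python sketch. It assumes the standard
4-column plink .map layout (chromosome, marker id, genetic distance,
base-pair position). The function name, the one-entry table of
chromosome lengths (hg19 chr1 only) and the second example marker are
made up for illustration. It is worth noting that 2147483647 is the
largest signed 32-bit integer (2^31 - 1), so the sketch flags that
value explicitly.

```python
# Sketch only: flag .map rows whose base-pair position cannot be real.
# Assumes 4 whitespace-separated columns: chrom, marker, cM, bp position.

INT32_MAX = 2**31 - 1  # 2147483647, the offset seen for rs11488669

# Illustrative table: hg19 length of chr1 only. Extend it for the
# chromosomes and genome build you actually use.
CHROM_LENGTHS = {"1": 249250621}


def find_suspect_positions(map_lines, chrom_lengths=CHROM_LENGTHS):
    """Return (marker, chrom, position, reason) for rows that look damaged."""
    suspects = []
    for line in map_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # this sketch simply skips blank or malformed rows
        chrom, marker, _genetic_distance, pos_text = fields
        try:
            pos = int(pos_text)
        except ValueError:
            # e.g. a spreadsheet rewrote the number as 1.54E+08
            suspects.append((marker, chrom, pos_text, "not an integer"))
            continue
        limit = chrom_lengths.get(chrom)
        if pos == INT32_MAX:
            suspects.append((marker, chrom, pos, "equals 32-bit integer maximum"))
        elif limit is not None and pos > limit:
            suspects.append((marker, chrom, pos, "beyond end of chromosome"))
    return suspects


example = [
    "1 rs11488669 0 2147483647",
    "1 rs0000001 0 153517269",  # hypothetical marker with a plausible position
]
for row in find_suspect_positions(example):
    print(row)
```

Any marker it reports is worth looking up at genome.ucsc.edu before
you trust the plot.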
If all your processing is inside Galaxy, these kinds of errors can be
prevented. I can see you have tried unsuccessfully to upload some
plink lped files in the history you shared - here's some information
that might help you from a previous enquiry on galaxy-user a few weeks
ago:
==============================================
Hi, Sylvian,
The plink/rgenetics lped and pbed (compressed) formats are special
'composite' Galaxy datatypes because the map and pedigree/genotype
files need to be kept together correctly inside Galaxy. As a result,
the upload tool requires that the file type be specified so all of the
components can be properly uploaded and stored together.
For example, to upload pbed data from your local desktop, choose
'Upload file' from the Get Data tools.
When the upload form appears, the trick is that, as the very first
step, you *must* change the default 'Autodetect' in the first
(filetype) select box to the specific rgenetics datatype: 'pbed' for
compressed plink data or 'lped' for uncompressed plink genotype data.
Type the first few letters into the first box, and select the right
one from the list that appears.
Once this is done, you will see the upload tool form change. For pbed
it shows three separate file upload inputs - one each for the plink
xxx.bim, xxx.bed and xxx.fam files, where xxx is the name you set when
you ran plink to create the files. For uncompressed linkage format
(lped) it shows two separate file upload inputs - one each for the
plink .ped and .map files.
Now you can browse for the corresponding file for each input box from
your local machine - be careful not to mix them up because,
unfortunately, the upload tool cannot tell them apart.
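
Since the upload tool cannot check this for you, a quick local sanity
check can catch mixed-up files. The Python sketch below is not part of
Galaxy or plink; the function names are invented here. It relies on
the documented plink layouts: a .bed file starts with the bytes 0x6C
0x1B and, in SNP-major mode (third byte 0x01), holds
ceil(samples / 4) bytes per marker, while each .ped row has 6 leading
columns plus 2 alleles per marker in the .map file.

```python
import math

BED_MAGIC = bytes([0x6C, 0x1B])  # first two bytes of a plink binary .bed


def count_rows(text):
    """Count the non-blank lines of a .bim, .fam or .map file."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_pbed(bed_bytes, bim_text, fam_text):
    """Return a list of problems with a .bed/.bim/.fam trio (empty if none)."""
    if bed_bytes[:2] != BED_MAGIC:
        return ["bed file does not start with the PLINK magic bytes"]
    if bed_bytes[2:3] != b"\x01":
        return ["bed file is not SNP-major; size check skipped in this sketch"]
    n_markers = count_rows(bim_text)   # one .bim row per marker
    n_samples = count_rows(fam_text)   # one .fam row per individual
    # 3 header bytes, then each marker packs 4 genotypes into one byte.
    expected = 3 + n_markers * math.ceil(n_samples / 4)
    if len(bed_bytes) != expected:
        return [f"bed is {len(bed_bytes)} bytes, expected {expected} "
                f"for {n_markers} markers and {n_samples} samples"]
    return []


def check_lped(ped_text, map_text):
    """Return a list of problems with a .ped/.map pair (empty if none)."""
    expected = 6 + 2 * count_rows(map_text)
    problems = []
    for number, line in enumerate(ped_text.splitlines(), start=1):
        if not line.strip():
            continue
        n_fields = len(line.split())
        if n_fields != expected:
            problems.append(
                f"ped line {number} has {n_fields} fields, expected {expected}")
    return problems


# Tiny made-up example: 3 markers, 5 individuals.
bim = "1 rs1 0 100 A G\n1 rs2 0 200 A G\n1 rs3 0 300 A G\n"
fam = "".join(f"F{i} I{i} 0 0 1 1\n" for i in range(5))
bed = bytes([0x6C, 0x1B, 0x01]) + bytes(3 * 2)
print(check_pbed(bed, bim, fam))   # files in the right slots
print(check_pbed(bed, fam, bim))   # .bim and .fam swapped
```

Swapping .bim and .fam usually changes the expected .bed size, so the
second call reports a mismatch. The check cannot catch every mix-up
(for example when the marker and sample counts happen to give the same
size), so treat a clean result as reassuring rather than conclusive.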
At the bottom of the form, I suggest you then change the genome build
to the appropriate one (e.g. hg18 or hg19).
Finally, I'd recommend that you change the 'metadata value for
basename' (which will be the new dataset name) to something that will
remind you what the data are - something more meaningful than the
default 'rgenetics'.
Click 'execute' to upload the data and create the new dataset in your
history. Compressed (pbed) format is preferred so the upload is
quicker.
Note that some tools will autoconvert between lped and pbed, so there
is a delay the first time some tools are run on a new dataset. There
are also built-in converters (use the pencil icon) if you need them.
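
If you would rather convert lped to pbed on your own machine before
uploading, plink itself can do it with its --file, --make-bed and
--out options. The Python wrapper below is only a sketch: the function
name and the "mystudy" basename are placeholders, and it assumes plink
is installed and on your PATH. It builds the command without running
it.

```python
def plink_make_bed_command(basename, plink="plink"):
    """Build the plink command turning basename.ped/.map into .bed/.bim/.fam."""
    return [plink, "--file", basename, "--make-bed", "--out", basename]


cmd = plink_make_bed_command("mystudy")  # "mystudy" is a placeholder basename
print(" ".join(cmd))
# To actually run it (requires plink to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The three output files share the basename given to --out, which makes
it easier to pick the right file for each upload input.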
I hope this helps - thanks for using Galaxy and Rgenetics - please let
us know how you go and feel free to contact me if you have other
questions.
On Fri, Feb 18, 2011 at 9:26 AM, Ross Lazarus
--
Ross Lazarus MBBS MPH
Associate Professor, HMS; Director of Bioinformatics, Channing
Laboratory;
181 Longwood Ave., Boston MA 02115, USA. Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; PO Box 6492, St Kilda Rd
Central;
Melbourne, VIC 8008, Australia; Tel: +61 385321444