My name is Andreia and I am writing to you because I am having trouble mapping ChIP-seq samples to the human genome reference using a local Galaxy. I have installed Bowtie2, but it does not include the genome reference, and as an alternative I tried to upload the reference via FTP, unsuccessfully.

Could you please tell me whether I need to install a genome reference package, or how I can upload a file larger than 2 GB?

I would suggest using Data Managers (DMs), since you are running a local instance. Custom genomes take up more resources because they are indexed again for every job.

The source FASTA can be loaded from a history if it is not on a public server, and then used with DMs (install them from the Tool Shed, then run them as Admin). There are two FASTA installers: use the one that includes a "dbkey" if you are creating your own genome build, meaning one that is not already in the genome list shown under Upload or Edit Attributes.

To load large data (genomes, other experimental data), enable FTP on the instance, or load larger files directly into a Data Library from the file system and then import them from the library into a history for use. Or do both as needed!
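
If you prefer to script the Library route, here is a minimal sketch. It assumes the BioBlend Python client is installed and that `gi` is a `bioblend.galaxy.GalaxyInstance` created with an admin API key; as far as I know, BioBlend's file-system upload also needs path pasting to be allowed in the Galaxy config. The function name, the library name and the example path are made up for illustration.

```python
def load_fasta_via_library(gi, server_path, history_id,
                           library_name="Reference genomes", dbkey="?"):
    """Register a file that already sits on the Galaxy server in a Data
    Library, then import it into a history.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance (admin API key).
    `server_path` is a path on the Galaxy server's own file system.
    """
    # Create a library to hold the genome (hypothetical name).
    library = gi.libraries.create_library(library_name)

    # Link to the file in place instead of copying it, which avoids
    # pushing a multi-GB file through the browser upload.
    uploaded = gi.libraries.upload_from_galaxy_filesystem(
        library["id"],
        server_path,
        file_type="fasta",
        dbkey=dbkey,
        link_data_only="link_to_files",
    )

    # Move the data from the library into the history for use.
    lib_dataset_id = uploaded[0]["id"]
    return gi.histories.upload_dataset_from_library(history_id, lib_dataset_id)
```

You would call it with something like `load_fasta_via_library(gi, "/data/genomes/hg38.fa", history_id)`, where the path is only an example.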

Best, Jen, Galaxy team

A bit more help if you go the DM route. Install/run the indexes in this order:

  1. Fetch fasta (allow to complete). If pulling from UCSC, only retrieve 2 at a time or the jobs will fail (UCSC has a concurrent-access limit). Should that happen by mistake, just rerun.
  2. Ensure that 1 is intact, then run the Samtools indexes (allow to complete)
  3. Ensure that 2 is intact, then run the Picard indexes (allow to complete)
  4. Ensure that 1, 2, and 3 are intact, then launch the twoBit indexes (step 5, below, can be run while this is running). ALWAYS do the first 4 for every genome, and do them first.
  5. Install any other indexes that you want to. If you run into memory/resource failures, run one at a time.
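
The ordering above can be sketched in a few lines of Python if you script it. This is only an illustration of the sequencing logic: the step labels are placeholders and not real Data Manager tool IDs, and `run_step` is a hypothetical callback that you would implement yourself to launch a DM and report whether its output is intact.

```python
from typing import Callable, List

# Labels for this sketch only; they are not real Galaxy tool IDs.
CORE_STEPS = ["fetch_fasta", "samtools_index", "picard_index", "twobit_index"]


def batches_of_two(dbkeys: List[str]) -> List[List[str]]:
    """Group genomes in pairs, so that no more than 2 fetches hit UCSC at once."""
    return [dbkeys[i:i + 2] for i in range(0, len(dbkeys), 2)]


def run_core_steps(dbkey: str, run_step: Callable[[str, str], bool]) -> List[str]:
    """Run the four core steps strictly in order for one genome.

    `run_step(step, dbkey)` is a hypothetical callback that must return True
    only once the step has completed and its output is intact.
    """
    completed: List[str] = []
    for step in CORE_STEPS:
        if not run_step(step, dbkey):
            # Stop here: later steps depend on this one being intact.
            raise RuntimeError(
                f"{step} failed for {dbkey}; completed so far: {completed}"
            )
        completed.append(step)
    return completed
```

This version waits for twoBit before returning, which is stricter than needed since the optional indexes can be started while twoBit is still running. A failed fetch just raises an error, after which you can rerun.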

PS: The Bowtie2 DM has an option to create Tophat indexes at the same time. This is almost always a good idea: running this DM again later just to create Tophat indexes is difficult and can create conflicts/duplicates in your instance.

