Question: Concatenating four NextSeq fastq files
3 months ago
alice.mcgovern10


Please excuse my naivety, but my sequencing was run on a NextSeq with single-end reads, so I have 4 fastq files per sample and need to combine them. I have been using the concatenate tool, but I end up with 4 separate files, each called "concatenate datasets on data XX". Which of the four should I use for my alignment and analysis? Or do I need to do something else?

3 months ago
United States
Jennifer Hillman Jackson


The tool is designed so that it can be used for single or batch operations when datasets are added to both "Datasets to concatenate" and "Dataset > Insert dataset". Example of a batch operation: there are multiple input datasets to merge into multiple output datasets. However, for your single operation usage, try this method instead:

  1. Datasets to concatenate = add just one of the datasets
  2. Dataset, click on "Insert dataset" = add just one of the other datasets
  3. Repeat step 2 until all four are added individually
  4. The output will be one dataset with all four datasets merged. Use this merged output with downstream tools.

If you instead add all four datasets in step 1 (only), then an individual job is launched per dataset. This effectively creates four distinct new datasets, each with the content of just one of the original datasets (no dataset with merged/concatenated output is created).
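
For anyone who wants to check the merged result outside Galaxy, the single-operation merge described above amounts to appending the four files end to end. Here is a minimal Python sketch of that idea. It is not the Galaxy tool's own code, and the file names and the helpers `concatenate_fastq` and `count_reads` are hypothetical, made up for illustration. It assumes uncompressed fastq files with four lines per read, each ending in a newline.

```python
import shutil
from pathlib import Path


def concatenate_fastq(inputs, output):
    """Append each input file to `output`, in the order given."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                # Stream the bytes across without loading the file into memory.
                shutil.copyfileobj(src, out)


def count_reads(path):
    """Count records, assuming the plain four-lines-per-read fastq layout."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle) // 4


if __name__ == "__main__":
    # Hypothetical per-lane file names; replace with your own.
    lanes = [f"sample_L00{i}_R1.fastq" for i in range(1, 5)]
    if all(Path(p).exists() for p in lanes):
        concatenate_fastq(lanes, "sample_merged.fastq")
        print("reads in merged file:", count_reads("sample_merged.fastq"))
    else:
        print("lane files not found; edit the names above")
```

A quick sanity check on any merged dataset is that its read count equals the sum of the read counts of the four inputs. If it matches only one input, the tool was most likely run in batch mode as described above.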

Hope this clears up the usage for the tool! Jen, Galaxy team

