Question: MACS did not work - invalid literal for int() with base 10: '1:N:0:CAGATCTG'
dorota.komar (Spain) wrote, 2.6 years ago:

Hi, I tried to call peaks on my ChIP-seq data with MACS, but an error appeared:

INFO @ Thu, 21 Apr 2016 15:31:51:

ARGUMENTS LIST:

name = MACS_in_Galaxy

format = SAM

ChIP-seq file = /galaxy-repl/main/files/015/336/dataset_15336509.dat

control file = /galaxy-repl/main/files/015/336/dataset_15336520.dat

effective genome size = 2.70e+09

tag size = 35

band width = 300

model fold = 32

pvalue cutoff = 1.00e-05

Ranges for calculating regional lambda are : peak_region,1000,5000,10000

INFO @ Thu, 21 Apr 2016 15:31:51: #1 read tag files...
INFO @ Thu, 21 Apr 2016 15:31:51: #1 read treatment tags...
Traceback (most recent call last):
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 273, in <module>
    main()
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 57, in main
    (treat, control) = load_tag_files_options (options)
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 252, in load_tag_files_options
    treat = options.build(open2(options.tfile, gzip_flag=options.gzip_flag))
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/lib/python/MACS/IO/__init__.py", line 1480, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/lib/python/MACS/IO/__init__.py", line 1500, in __fw_parse_line
    bwflag = int(thisfields[1])
ValueError: invalid literal for int() with base 10: '1:N:0:CAGATCTG'

Any ideas on what I have done wrong? :(

Jennifer Hillman Jackson (United States) wrote, 2.6 years ago:

Hello,

Double-check your inputs. If they are in SAM format, convert them to BAM (SAMtools) and re-run. MACS fails this way when the sequence identifier names in a SAM file contain spaces.

Thanks, Jen, Galaxy team

dorota.komar (Spain) wrote, 2.6 years ago:

Thank you very much for your response. I converted the input (control) files to BAM, but the peak calling still did not work:

INFO @ Fri, 22 Apr 2016 08:33:00:

ARGUMENTS LIST:

name = MACS_Inflorescence_1

format = SAM

ChIP-seq file = /galaxy-repl/main/files/015/336/dataset_15336509.dat

control file = /galaxy-repl/main/files/015/389/dataset_15389523.dat

effective genome size = 2.70e+09

tag size = 35

band width = 300

model fold = 32

pvalue cutoff = 1.00e-05

Ranges for calculating regional lambda are : peak_region,1000,5000,10000

INFO @ Fri, 22 Apr 2016 08:33:00: #1 read tag files...
INFO @ Fri, 22 Apr 2016 08:33:00: #1 read treatment tags...
Traceback (most recent call last):
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 273, in <module>
    main()
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 57, in main
    (treat, control) = load_tag_files_options (options)
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/bin/macs", line 252, in load_tag_files_options
    treat = options.build(open2(options.tfile, gzip_flag=options.gzip_flag))
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/lib/python/MACS/IO/__init__.py", line 1480, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/galaxy/main/deps/macs/1.3.7.1/devteam/package_macs_1_3_7_1/a7ea583a35d2/lib/python/MACS/IO/__init__.py", line 1500, in __fw_parse_line
    bwflag = int(thisfields[1])
ValueError: invalid literal for int() with base 10: '1:N:0:CAGATCTG'

So I also converted the ChIP files to BAM format, and it failed once again:

INFO @ Fri, 22 Apr 2016 10:16:43:

ARGUMENTS LIST:

name = MACS_in_Galaxy

format = BAM

ChIP-seq file = /galaxy-repl/main/files/015/392/dataset_15392040.dat

control file = /galaxy-repl/main/files/015/389/dataset_15389523.dat

effective genome size = 2.70e+09

tag size = 35

band width = 300

model fold = 32

pvalue cutoff = 1.00e-05

Ranges for calculating regional lambda are : peak_region,1000,5000,10000

INFO @ Fri, 22 Apr 2016 10:16:43: #1 read tag files...
INFO @ Fri, 22 Apr 2016 10:16:43: #1 read treatment tags...
INFO @ Fri, 22 Apr 2016 10:16:44: #1.2 read input tags...
INFO @ Fri, 22 Apr 2016 10:16:54: 1000000
INFO @ Fri, 22 Apr 2016 10:17:04: 2000000
INFO @ Fri, 22 Apr 2016 10:17:15: 3000000
INFO @ Fri, 22 Apr 2016 10:17:25: 4000000
INFO @ Fri, 22 Apr 2016 10:17:40: 5000000
INFO @ Fri, 22 Apr 2016 10:17:56: 6000000
INFO @ Fri, 22 Apr 2016 10:18:05: 7000000
INFO @ Fri, 22 Apr 2016 10:18:13: 8000000
INFO @ Fri, 22 Apr 2016 10:18:17: #1 Background Redundant rate: 1.00
INFO @ Fri, 22 Apr 2016 10:18:17: #1 finished!
INFO @ Fri, 22 Apr 2016 10:18:17: #2 Build Peak Model...
INFO @ Fri, 22 Apr 2016 10:18:17: #2 number of paired peaks: 13
WARNING @ Fri, 22 Apr 2016 10:18:17: Too few paired peaks (13) so I can not build the model! Lower your MFOLD parameter may erase this error.
WARNING @ Fri, 22 Apr 2016 10:18:17: Process is terminated!

Do you have any other ideas for how to fix it?

Thanking you in advance, Dorota


Hello,

This is now a data/parameter issue, not a tool error. The data appears to be sparse.

First, check the mapping results to ensure that process went OK. If mapping rates are low, then go back further to the fastq data and confirm the datatype, sequence quality, that the treatment/control were entered on the tool form in the right order, etc.
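
As a rough way to check the mapping rate, the sketch below counts SAM records whose FLAG does not have the 0x4 (unmapped) bit set. The function name mapping_rate and the example records are hypothetical, and secondary or supplementary alignments are not filtered out, so it counts records rather than reads. samtools flagstat reports comparable numbers and is the more complete option.

```python
def mapping_rate(lines):
    """Return (mapped, total, fraction) over SAM alignment lines."""
    mapped = 0
    total = 0
    for line in lines:
        # Skip SAM header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")  # SAM fields are tab-separated
        flag = int(fields[1])
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the record is unmapped
            mapped += 1
    fraction = mapped / float(total) if total else 0.0
    return mapped, total, fraction


# Hypothetical records: two mapped (FLAG 0 and 16), two unmapped (FLAG 4).
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t255\t35M\t*\t0\t0\t" + "A" * 35 + "\t" + "I" * 35,
    "r2\t16\tchr1\t200\t255\t35M\t*\t0\t0\t" + "A" * 35 + "\t" + "I" * 35,
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t" + "A" * 35 + "\t" + "I" * 35,
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t" + "A" * 35 + "\t" + "I" * 35,
]
print(mapping_rate(records))
```

A low fraction for the treatment file would point back to the fastq data or the mapping step rather than to MACS settings.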

After that, following the warning message's advice, MFOLD is one option to adjust and test. There are others. See the complete MACS documentation for the parameters and their impact, test a few variations, consider visualizing the results, and determine the options that produce useful peaks with your data.

Best, Jen, Galaxy team

Reply written 2.6 years ago by Jennifer Hillman Jackson