
not enough memory for gatk3_join #15

@alextidd

Description

Hi!
I have run scan2 on 91 BAM files (median coverage 18X), with the genome split into 100,000 chunks via the --regions-file. Here is my command:

scan2 run \
  --joblimit 800 \
  --cluster \
    "bsub -q basement -M {resources.mem_mb} -R'span[hosts=1] select[mem>{resources.mem_mb}] rusage[mem={resources.mem_mb}]' \
      -n {threads} -o %logdir/%J.out -e %logdir/%J.err"

...and I got this error:

[Mon Nov  3 16:32:18 2025]
Error in rule gatk_scatter:
    jobid: 53403
    input: plate3_wellA2_dna_run49882.bam, [...], PD63118b_lo0001.sample.dupmarked.bam
    output: gatk/hc_raw.mmq1_chunk24572.vcf, gatk/hc_raw.mmq1_chunk24572.vcf.idx
    shell:
        gatk3 -Xmx3500M -Xms3500M    -T HaplotypeCaller    -R data/scan2/GRCh37/genome.fa    --dontUseSoftClippedBases -l INFO    --dbsnp reference/dbsnp/GRCh37/common_all_20180423.vcf    -rf BadCigar     -mmq 1    -I plate3_wellA2_dna_run49882.bam [...] -I PD63118b_lo0001.sample.dupmarked.bam    -L 16:46400001-46500000    -o gatk/hc_raw.mmq1_chunk24572.vcf
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: Job <983205> is submitted to queue <basement>.

Error executing rule gatk_scatter on cluster (jobid: 53403, external: Job <983205> is submitted to queue <basement>., jobscript: PD63118/.snakemake/tmp.j0wo9x64/snakejob.gatk_scatter.53403.sh). For error details see the cluster log and the log files of the involved rule(s).
[Mon Nov  3 16:32:18 2025]
Error in rule gatk_scatter:
    jobid: 53402
    input: plate3_wellA2_dna_run49882.bam, [...], PD63118b_lo0001.sample.dupmarked.bam
    output: gatk/hc_raw.mmq1_chunk24571.vcf, gatk/hc_raw.mmq1_chunk24571.vcf.idx
    shell:
        gatk3 -Xmx3500M -Xms3500M    -T HaplotypeCaller    -R /nfs/casm/team268im/at31/projects/hashimoto_thyroiditis/data/scan2/GRCh37/genome.fa    --dontUseSoftClippedBases -l INFO    --dbsnp /nfs/casm/team268im/at31/reference/dbsnp/GRCh37/common_all_20180423.vcf    -rf BadCigar     -mmq 1    -I plate3_wellA2_dna_run49882.bam [...] -I PD63118b_lo0001.sample.dupmarked.bam    -L 16:46300001-46400000    -o gatk/hc_raw.mmq1_chunk24571.vcf
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: Job <983255> is submitted to queue <basement>.

Error executing rule gatk_scatter on cluster (jobid: 53402, external: Job <983255> is submitted to queue <basement>., jobscript: /lustre/scratch125/casm/teams/team268/at31/projects/hashimoto_thyroiditis/out/resolveome/scan2/PD63118/.snakemake/tmp.j0wo9x64/snakejob.gatk_scatter.53402.sh). For error details see the cluster log and the log files of the involved rule(s).
Submitted job 9591 with external jobid 'Job <983594> is submitted to queue <basement>.'.

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2025-11-03T112035.866671.snakemake.log

The actual error in 983205.err is an out-of-memory error:

##### ERROR MESSAGE: An error occurred because you did not provide enough memory to run this program. 
You can use the -Xmx argument (before the -jar argument) to adjust the maximum heap size provided to Java.

The job allocates 3.5 GB of heap to GATK (gatk3 -Xmx3500M -Xms3500M), which seems to be insufficient for 91 BAMs. I see that the memory allocation scales with the number of BAMs in the Sentieon implementation, but it doesn't in the GATK3 implementation, and I can't see how to increase the memory allocation for this rule. Do you have any idea how I can prevent this from failing?
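To illustrate the kind of scaling I mean, a per-BAM memory formula could be sketched as a Snakemake-style callable resource. The names (gatk_mem_mb, BASE_MB, MB_PER_BAM) and the per-BAM increment are purely illustrative assumptions on my part, not SCAN2's actual Sentieon settings:

```python
# Sketch: scale the Java heap with the number of input BAMs, in the same
# spirit as the Sentieon rule. All names and constants below are assumed
# for illustration; they are not SCAN2's actual configuration.
BASE_MB = 3500      # the fixed heap the GATK3 rule currently uses
MB_PER_BAM = 150    # assumed extra headroom per input BAM

def gatk_mem_mb(n_bams: int) -> int:
    """Return a heap size (in MB) that grows with the number of BAMs."""
    return BASE_MB + MB_PER_BAM * n_bams

# For 91 BAMs this gives ~17 GB rather than the fixed 3.5 GB:
print(gatk_mem_mb(91))  # 17150
```

In a Snakemake rule this could then be wired up as something like `resources: mem_mb=lambda wc, input: gatk_mem_mb(len(input))`, so the bsub -M/-R requests and the -Xmx flag grow together.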
Thanks so much!
