Illegal Instruction Error When Running Apptainer on GPU Nodes

Symptom

Running apptainer exec on a GPU node produces:

Illegal instruction (core dumped)

This typically occurs when running containers that invoke compiled binaries (e.g., PyTorch, TensorFlow).

Cause

The default module path provides packages compiled for Intel Cascade Lake (cascadelake) CPUs. The GPU nodes use AMD EPYC (Zen 2) processors, which do not support AVX-512. Binaries compiled with Cascade Lake-specific instructions (AVX-512, etc.) therefore crash with "Illegal instruction" on the GPU nodes.
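
To confirm the mismatch on a given node, you can inspect the CPU flags it reports. The following is a minimal sketch, assuming a Linux node that exposes /proc/cpuinfo; the helper name has_avx512 is illustrative and not part of any site tooling.

```shell
# Report whether a CPU flags string includes any AVX-512 feature.
# The flags are passed as an argument so the check works on any text.
has_avx512() {
    case " $1 " in
        *" avx512"*) echo yes ;;
        *) echo no ;;
    esac
}

# On Linux, the flags of the current node are listed in /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "AVX-512 supported: $(has_avx512 "$flags")"
fi
```

A Zen 2 GPU node is expected to report "no" here, which is why binaries built for cascadelake fail on it.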

Fix

Before loading apptainer (or any other module for use on GPU nodes), switch to the Zen 2 module stack:

module use /usr/local/spack/share/spack/modules/linux-ubuntu22.04-zen2
module load apptainer

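You can verify that the Zen 2 stack is visible with:

module avail apptainer

The listing should show apptainer under the zen2 path. After loading, confirm which modulefile is actually in use with:

module show apptainer

The modulefile path it reports should be under zen2, not cascadelake.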
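
As an additional check after loading, you can look at where the apptainer executable itself resolves. This sketch assumes the Spack install prefix contains the target name (zen2 or cascadelake), which is Spack's default layout but may differ on this system; target_of_path is a hypothetical helper.

```shell
# Classify a path by the Spack target name it contains (assumed layout).
target_of_path() {
    case "$1" in
        *zen2*) echo zen2 ;;
        *cascadelake*) echo cascadelake ;;
        *) echo unknown ;;
    esac
}

# Only report if apptainer is actually on PATH in this shell.
if command -v apptainer >/dev/null 2>&1; then
    bin=$(command -v apptainer)
    echo "apptainer resolves to $bin (target: $(target_of_path "$bin"))"
fi
```

If the reported target is cascadelake, the Zen 2 module path was probably added after apptainer was already loaded; unload it and repeat the steps above.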

Notes

  • This applies to any compiled software loaded via modules on GPU nodes, not just Apptainer.
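
Because the same issue affects every module on the GPU nodes, a job script can select the stack from the CPU it lands on. This is a hedged sketch, not a site-provided script: it assumes the GPU nodes are the only AMD nodes, that /proc/cpuinfo is readable, and that the module command is available in the job shell. The names ZEN2_MODULES and stack_for_vendor are illustrative.

```shell
# Module path taken from the Fix section above.
ZEN2_MODULES=/usr/local/spack/share/spack/modules/linux-ubuntu22.04-zen2

# Print the Zen 2 module path for AMD CPUs and nothing for other vendors.
stack_for_vendor() {
    case "$1" in
        AuthenticAMD) echo "$ZEN2_MODULES" ;;
        *) ;;
    esac
}

# Read the CPU vendor string of the current node.
vendor=$(awk -F': ' '/^vendor_id/ {print $2; exit}' /proc/cpuinfo 2>/dev/null)
stack=$(stack_for_vendor "$vendor")

# Add the Zen 2 stack only on AMD nodes, and only if `module` is defined.
if [ -n "$stack" ] && command -v module >/dev/null 2>&1; then
    module use "$stack"
fi
```

On Intel nodes the snippet leaves the default module path untouched, so the same script can run on either node type.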

Details

Article ID: 21522
Created: Thu 3/26/26 2:39 PM
Modified: Thu 3/26/26 2:51 PM
