Efficient CPU memory usage helps ensure that the shared cluster resources remain available for all users. Requesting too much memory can lead to longer queue times (for you and others), while requesting too little may cause jobs to fail.

Aim to request an appropriate amount of memory for all of your jobs.
• Target utilization: ~80–90% of requested memory

If you are running many similar jobs (e.g., job arrays, parameter sweeps, workflows processing many different samples, etc.), it is especially important to estimate memory needs before scaling up.

Why this matters

Submitting hundreds or thousands of jobs with overestimated memory can:
• Compound the wasted memory across every job in the set
• Increase queue wait times
• Reduce overall cluster throughput
• Leave a significant number of CPUs unusable (a node's remaining cores cannot be scheduled once its memory is fully reserved)


Requesting Memory in Slurm

You can request memory in two main ways:
• Total memory for the job:
#SBATCH --mem=16G
• Memory per CPU core:
#SBATCH --mem-per-cpu=4G
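As a sketch, either form goes in a job script header like the one below; the job name, CPU count, time limit, and program are illustrative placeholders, not recommendations:

```shell
#!/bin/bash
# Illustrative sbatch header; substitute values measured for your own workload.
#SBATCH --job-name=memtest        # hypothetical job name
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G                 # total memory for the whole job
#SBATCH --time=01:00:00

# With --mem-per-cpu=4G instead of --mem=16G, Slurm would grant
# 4 GB x 4 CPUs = 16 GB, i.e. the same total in this case.

# ./my_analysis                   # placeholder for your actual program
```

Use one form or the other, not both: `--mem` and `--mem-per-cpu` are mutually exclusive in a single request.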


How to Check Memory Usage

  1. seff (after job completes)
    Provides a quick summary of efficiency:
    seff <jobid>
    Example output:
    Memory Utilized: 1.2 GB
    Memory Efficiency: 7.5% of 16.0 GB

  2. jobstats (during or after job)
    More detailed and works for running jobs:
    module load jobstats
    jobstats <jobid>
    Provides:
    • Memory usage (maximum)
    • CPU utilization

  3. Grafana Dashboard (during or after job)
    Provides interactive monitoring of job performance, including:
    • Memory usage over time
    • CPU utilization trends
    • Node-level resource usage
    Best for:
    • Visualizing spikes vs steady usage
    • Identifying peak memory requirements
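The seff numbers from step 1 can also be checked programmatically. The sketch below assumes the exact "Memory Efficiency" line format shown above, which can vary between seff versions; the sample text stands in for real `seff <jobid>` output:

```shell
# Sketch: recover utilized vs requested memory from seff's efficiency line.
seff_output="Memory Utilized: 1.2 GB
Memory Efficiency: 7.5% of 16.0 GB"

echo "$seff_output" | awk '/Memory Efficiency/ {
    gsub("%", "", $3)                    # strip the percent sign from "7.5%"
    printf "utilized: %.1f GB of %s GB requested\n", $5 * $3 / 100, $5
}'
```

Here 7.5% of 16.0 GB works out to 1.2 GB utilized, a strong hint that the job could be resubmitted with a much smaller request.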


Simplified Workflow for Right-Sizing Memory

✔ Run a few test jobs first
✔ Measure peak memory usage
✔ Account for input variability
✔ Add a modest safety buffer
✔ Then scale to full production
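The steps above can be sketched as a one-liner: take the largest peak observed across your test jobs, add a modest buffer (20% here, an arbitrary choice), and round up to a whole GB. The peak values are hypothetical; substitute your own measurements from seff or jobstats:

```shell
# Sketch: derive a --mem request from measured peaks (GB) of a few test jobs.
peaks="1.2 1.5 1.4"                      # hypothetical peak memory per test job

request=$(echo "$peaks" | tr ' ' '\n' | sort -n | tail -1 \
          | awk '{ v = $1 * 1.2; r = int(v); if (r < v) r++; printf "%d", r }')
echo "suggested request: --mem=${request}G"
```

With these sample peaks, the worst case is 1.5 GB; adding 20% gives 1.8 GB, which rounds up to a 2G request.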


Common Pitfalls

• Over-requesting “just to be safe”
→ leads to longer queue times and wasted resources
• Forgetting memory scales with CPUs
→ --mem-per-cpu × CPUs can unintentionally request large totals
• Ignoring peak vs average usage
→ short spikes matter (Grafana helps here)
• Not re-tuning after pipeline changes
→ different steps may have very different memory needs
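The second pitfall is simple arithmetic worth checking before you submit: the total Slurm reserves with --mem-per-cpu is the per-core value times the core count. The values below are examples only:

```shell
# Sketch: total memory implied by --mem-per-cpu=4G with --cpus-per-task=16.
cpus=16
mem_per_cpu_gb=4
echo "total memory requested: $((cpus * mem_per_cpu_gb))G"
```

A request that looks like a modest 4G per core quietly becomes 64G once it is multiplied across 16 cores.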


Need Help?

If you’re unsure how much memory your workflow requires, Research Computing is happy to help:

• Review job scripts
• Analyze seff/jobstats/Grafana output
• Recommend optimized resource requests

https://www.rc.virginia.edu/support/