cpu vs largemem
For jobs that require a lot of memory and a small number of cores, Vega has a largemem partition.
The partitions have the following specifications:
Physical Partition | Slurm Partition | Nodes | Threads per node (Slurm CPUs**) | Memory per node | Threads/memory | Slurm TRESBillingWeights parameter |
---|---|---|---|---|---|---|
CPU | cpu, longcpu* | 768 | 256 | 256 GB | 1CPU/1GB | CPU=1,Mem=1G |
CPU LM | largemem | 192 | 256 | 1024 GB | 1CPU/4GB | CPU=1,Mem=0.25G |
*The cpu and longcpu Slurm partitions include nodes from the physical CPU LM partition.
** Hyperthreading is ON, so 1 CPU = 1 thread in Slurm. For billing, 1 CPU core has 2 threads (Slurm CPU-hours are divided by 2 to get core-hours).
For a job with the following requirements:
Number of threads: 12
Amount of memory: 32GB
Time: 1h
SBATCH example for the cpu partition:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=cpu
#SBATCH --cpus-per-task=12
#SBATCH --mem=32GB
#SBATCH --time=01:00:00
Billing for this job on the cpu partition: memory dominates (32 GB × 1 per GB = 32, compared with 12 threads × 1 = 12), so the result is 32 divided by 2 (thread-to-core ratio) for 1 hour: 16 core-hours.
The SBATCH example for the largemem partition is the same except for the partition parameter:
#SBATCH --partition=largemem
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=largemem
#SBATCH --cpus-per-task=12
#SBATCH --mem=32GB
#SBATCH --time=01:00:00
Billing for this job on the largemem partition: threads dominate (12 threads × 1 = 12, compared with 32 GB × 0.25 per GB = 8), so the result is 12 divided by 2 (thread-to-core ratio) for 1 hour: 6 core-hours.
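The two billing calculations above can be reproduced with a small shell sketch. It assumes that the billed value is the larger of the weighted thread count and the weighted memory request, which matches both worked examples but is not confirmed here as the exact site configuration. The function name core_hours and its argument order are hypothetical and are not part of Slurm.

```bash
#!/bin/bash
# Estimate billed core-hours for a job.
# Assumption: billing = max(threads * cpu_weight, memory_GB * mem_weight),
# which agrees with the two worked examples in this section.
#
# Arguments: threads, memory in GB, CPU weight, memory weight per GB, hours
core_hours() {
    awk -v t="$1" -v m="$2" -v wc="$3" -v wm="$4" -v h="$5" 'BEGIN {
        cpu = t * wc                        # weighted thread count
        mem = m * wm                        # weighted memory request
        billing = (cpu > mem) ? cpu : mem   # larger of the two is billed
        # Each core has 2 hardware threads, so divide by 2 for core-hours.
        print billing * h / 2
    }'
}

core_hours 12 32 1 1    1   # cpu partition (CPU=1,Mem=1G): prints 16
core_hours 12 32 1 0.25 1   # largemem partition (CPU=1,Mem=0.25G): prints 6
```

The weights passed in are the TRESBillingWeights values from the table, so the same function can be used to compare other thread and memory combinations before submitting.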
This means that such jobs get the requested resources faster and cost less in terms of billing. Users are advised to use the largemem partition for similar jobs with a high memory-to-thread ratio.
More information about billing is available on the Billing page.
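As a rough rule of thumb derived from the table weights, a job bills less on largemem once its memory request in GB exceeds its thread count, because the memory term is then the larger one on cpu and is weighted four times lower on largemem. The sketch below encodes that rule under the same assumption that the larger weighted term is billed. The function name suggest_partition is hypothetical, and the rule ignores queue load and node availability.

```bash
#!/bin/bash
# Suggest a partition from the requested threads and memory (whole GB).
# Assumption: billing is the larger of threads * 1 and memory_GB * weight,
# with weights 1 per GB on cpu and 0.25 per GB on largemem (from the table).
suggest_partition() {
    local threads=$1 mem_gb=$2
    if [ "$mem_gb" -gt "$threads" ]; then
        # Memory dominates on cpu, so the lower memory weight pays off.
        echo largemem
    else
        # Threads dominate on both partitions, so billing is identical;
        # fall back to the general-purpose partition.
        echo cpu
    fi
}

suggest_partition 12 32    # prints: largemem
suggest_partition 12 8     # prints: cpu
```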