Memory Options

We recommend setting both mem and vmem options when submitting batch jobs.

For single-node jobs, the mem option directs the scheduler to put your job on a node with at least mem bytes of memory. If your job exceeds mem, then your job will swap but continue to run.  The vmem option directs the scheduler to reserve vmem bytes of memory+swap on a node.  If your job exceeds vmem, the system will begin killing your processes until the memory usage is reduced below vmem.  So, set mem to the total amount of memory you expect your job to use at any one time and set vmem to 1.5 * mem.

-l mem=16gb,vmem=24gb
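
The 1.5x rule of thumb above can be computed in a submit script. A minimal sketch, assuming you pick mem yourself and use shell integer arithmetic (multiplying by 3 and dividing by 2 stands in for 1.5x):

```shell
# Derive a vmem request from a chosen mem value using the 1.5x rule of thumb.
mem_gb=16
vmem_gb=$(( mem_gb * 3 / 2 ))   # 1.5 * mem, done with integer arithmetic
echo "-l mem=${mem_gb}gb,vmem=${vmem_gb}gb"   # -l mem=16gb,vmem=24gb
```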

If you are submitting a multi-node batch job, the scheduler will divide mem and vmem by the number of nodes requested and put your job on nodes with at least mem/nodes bytes of memory.  If your job exceeds mem/nodes bytes it will start swapping but continue to run.  If it exceeds vmem/nodes, the system will begin killing your processes until the memory usage is reduced below vmem/nodes.  So, set mem to the total amount of memory you expect your job to use at any one time and set vmem to 1.5 * mem.

-l nodes=2:ppn=4,mem=64gb,vmem=96gb
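
For that multi-node request, the per-node limits the scheduler effectively enforces can be sketched with a little shell arithmetic:

```shell
# Per-node limits for -l nodes=2:ppn=4,mem=64gb,vmem=96gb:
# the scheduler divides mem and vmem by the number of nodes requested.
nodes=2
mem_gb=64
vmem_gb=96
echo "per-node memory limit:      $(( mem_gb / nodes ))gb"    # 32gb
echo "per-node memory+swap limit: $(( vmem_gb / nodes ))gb"   # 48gb
```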

The default for all of the memory options, such as mem and vmem, is unlimited.  This is why we ask that you set limits on your jobs, so that resources remain available for other users.

In the examples below, we use a fictional program called memgrab, whose -s option is the amount of memory it allocates in gigabytes.  It is meant as a stand-in for a real memory-hungry application such as CASA.

 

-l mem

For single-node jobs, mem is the maximum amount of memory expected to be used by all the processes in the job combined. If the total amount of memory used by all the processes in the job combined exceeds mem, those processes will swap.

For multi-node jobs, the scheduler divides mem by the number of nodes requested and will use that as the maximum amount of memory expected to be used by all the processes on each node in the job.  If the amount of memory used by the job on any given node exceeds mem/nodes then those processes will swap.

  • Scheduler will not kill the job if any process or the whole job exceeds mem, but it will swap.
  • Sets "data seg size" in ulimit a.k.a. "Max data size" in /proc/$$/limits only if nodes=1 (the default).
  • Sets "max memory size" in ulimit a.k.a. "Max resident set" in /proc/$$/limits only if nodes=1 (the default).
  • Sets "memory.limit_in_bytes" in the memory cgroup to mem/nodes.
#!/bin/sh

#PBS -l nodes=1:ppn=2,mem=10gb,vmem=15gb

# Creates a cgroup on each node requested (E.g. 1) with a memory limit of mem/nodes (E.g. 10gb).

memgrab -s 8 &       # 8GB < 10GB so it doesn't swap, yet
memgrab -s 8         # 16GB > 10GB so both processes start swapping
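
Since the bullets above describe what mem sets in ulimit and /proc/$$/limits, you can verify the limits a job actually received from inside the job. A minimal sketch, assuming a Linux node (the labels are kernel-provided):

```shell
# Inspect the limits the batch system applied to this shell.
ulimit -a | grep -i 'data seg size'
grep -E 'Max (data size|resident set|address space)' /proc/$$/limits
```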

 

-l vmem

For single-node jobs, vmem is the maximum amount of memory+swap allowed for all processes in the job combined.  If any process exceeds vmem, the scheduler will kill it.  If the total amount of memory used by all the processes in the job combined exceeds vmem, then oom-killer will kill processes until the usage falls below vmem.

For multi-node jobs, the scheduler divides vmem by the number of nodes requested and will use that as the maximum amount of memory+swap allowed by all the processes on each node in the job.  If any process exceeds vmem/nodes, the scheduler will kill it.  If the amount of memory used by the job on any given node exceeds vmem/nodes then oom-killer will kill processes in that job on that node until the usage falls below vmem/nodes.

  • Linux will kill any process that exceeds vmem with a Segmentation fault.  This may end the job with "Exit_status=-10".
  • Either the Scheduler or the Linux oom-killer will kill the job if the whole job exceeds vmem/nodes.
  • Sets "virtual memory" in ulimit a.k.a. "Max address space" in /proc/$$/limits to vmem.
  • Sets "memory.limit_in_bytes" and "memory.memsw.limit_in_bytes" in memory cgroup to vmem/nodes.
#!/bin/sh

#PBS -l nodes=1:ppn=2,mem=10gb,vmem=15gb

# Creates a cgroup on each node requested (E.g. 1) with a memory+swap limit of vmem/nodes (E.g. 15gb).

memgrab -s 12 &      # 12GB > 10GB mem so it swaps, but < 15GB vmem so it isn't killed, yet
memgrab -s 12        # both memgrabs total 24GB > 15GB so oom-killer will kill one

 

-l pmem

For both single-node and multi-node jobs, pmem is the maximum amount of memory expected to be used per processor, per node.  If asking for multiple processors (via ppn) then the scheduler will multiply pmem by the number of processors requested and look for that much available memory.  For example, if you use -l nodes=1,ppn=2,pmem=3gb then the scheduler will look for one node with 6GB of memory available.  If you use -l nodes=2,ppn=4,pmem=3gb then the scheduler will look for two nodes, each with 12GB of memory available.

If the total amount of memory used by all the processes combined on any node in the job exceeds pmem*ppn, then those processes on that node will swap.  Processes on a node can exceed pmem without swapping as long as the total stays under pmem*ppn.
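
The pmem*ppn totals the scheduler looks for in the two example requests above can be sketched as:

```shell
# Memory the scheduler seeks per node under pmem: pmem * ppn.
pmem_gb=3
ppn=2; echo "-l nodes=1,ppn=2,pmem=3gb -> $(( pmem_gb * ppn ))gb per node"   # 6gb
ppn=4; echo "-l nodes=2,ppn=4,pmem=3gb -> $(( pmem_gb * ppn ))gb per node"   # 12gb
```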

  • Scheduler will not kill the job if any process or the whole job exceeds pmem*ppn, but it will swap.
  • Sets "data seg size" in ulimit a.k.a. "Max data size" in /proc/$$/limits to pmem.
  • Sets "max memory size" in ulimit a.k.a. "Max resident set" in /proc/$$/limits to pmem.
  • Sets "memory.limit_in_bytes" in memory cgroup to pmem*ppn.
#!/bin/sh

#PBS -l nodes=1:ppn=2,pmem=16gb

# Creates a cgroup on each node requested (E.g. 1) with a memory limit of pmem*ppn (E.g. 32gb).

memgrab -s 17 &       # 17GB < 32GB so it doesn't swap, yet
memgrab -s 17        # 34GB > 32GB so both processes start swapping

 

-l pvmem

For both single-node and multi-node jobs, pvmem is the maximum amount of memory+swap allowed by any single process in the job.  If asking for multiple processors (via ppn) then the scheduler will multiply pvmem by the number of processors requested and look for that much available memory+swap.

If any process exceeds pvmem (not pvmem*ppn), it will be killed but the job will continue.

If the total amount of memory used by all the processes combined in the job exceeds pvmem*ppn, then oom-killer will kill processes until the usage falls below pvmem*ppn.

  • Linux will kill any process that exceeds pvmem with a Segmentation fault.  This may end the job with "Exit_status=137" or "Exit_status=139" or "Exit_status=255".
  • Sets "virtual memory" in ulimit a.k.a. "Max address space" in /proc/$$/limits to pvmem.
  • Sets "memory.limit_in_bytes" and "memory.memsw.limit_in_bytes" in memory cgroup to pvmem*ppn.
#!/bin/sh

#PBS -l nodes=1:ppn=2,pvmem=16gb

# Creates a cgroup on each node requested (E.g. 1) with a memory+swap limit of pvmem*ppn (E.g. 32gb).

memgrab -s 17 &       # 17GB > 16GB so this is killed
memgrab -s 15 &       # 15GB < 16GB so this isn't killed, yet
memgrab -s 15 &      # 15GB < 16GB and 30GB < 32GB so this isn't killed, yet
memgrab -s 15        # 15GB < 16GB but 45GB > 32GB so oom-killer kills a memgrab process
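
The per-process "Max address space" cap that pvmem configures can be observed outside the batch system with ulimit -v. A rough sketch, assuming a Linux machine with python3 available as the allocator (inside a real job the scheduler sets this limit for you):

```shell
# Emulate pvmem's per-process address-space cap with ulimit -v (value in KB),
# then try to allocate past it; the oversized allocation fails.
(
  ulimit -v 524288                                        # cap at 512 MB
  python3 -c 'b = bytearray(1024 * 1024 * 1024)' 2>/dev/null   # try 1 GB
  echo "oversized allocation exit status: $?"             # nonzero: the cap held
)
```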

 

-L memory

Using this option requires the -L syntax (E.g. -L tasks=1:lprocs=2:memory=10gb) instead of the -l syntax.

Maximum amount of memory expected to be used by all the processes in the job combined.  If the total amount of memory used by all the processes in the job combined exceeds memory, then those processes will swap.

Essentially the same as the mem option but does not get divided by the number of nodes requested.

  • Scheduler will not kill the job if any process or the whole job exceeds memory, but it will swap.
  • Doesn't set anything in ulimit or /proc/$$/limits.
  • Sets "memory.limit_in_bytes" in the memory cgroup to memory.
#!/bin/sh

#PBS -L tasks=1:lprocs=2:memory=10gb

# Creates a cgroup on each node requested (E.g. 1) with a memory limit of memory (E.g. 10gb).

memgrab -s 8 &       # 8GB < 10GB so it doesn't swap, yet
memgrab -s 8         # 16GB > 10GB so both processes swap
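
The contrast with -l mem for a multi-node job can be sketched as arithmetic, assuming the behavior described above (mem divided across nodes, memory applied as-is on each node):

```shell
# Per-node limit for a 2-node job under each option.
nodes=2
req_gb=10
echo "-l mem=${req_gb}gb over ${nodes} nodes -> $(( req_gb / nodes ))gb per node"   # 5gb
echo "-L memory=${req_gb}gb                  -> ${req_gb}gb per node"               # 10gb
```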