sbatch
Create a batch script my-script.sh like the following and submit with sbatch my-script.sh:
#!/bin/bash
#SBATCH --time=0-00:01:00
#SBATCH --nodes=1
#SBATCH --partition=maxcpu
#SBATCH --job-name=slurm-01
unset LD_PRELOAD                  # useful on max-display nodes, harmless on others
source /etc/profile.d/modules.sh  # make the module command available
...                               # your actual job
That covers the core settings you will probably want to keep in every script. Note: never put a #SBATCH directive after a regular command; it will be ignored like any other comment.
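As an illustration of this pitfall (a sketch, not a real job), the second directive below is never seen by sbatch because it follows a regular command:

#!/bin/bash
#SBATCH --time=0-00:01:00        # parsed by sbatch
echo "setting up"                # first regular command
#SBATCH --partition=maxcpu       # too late: treated as an ordinary comment and ignored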
A simple example for a Mathematica job:
#!/bin/bash
#SBATCH --time=0-00:01:00
#SBATCH --nodes=1
#SBATCH --partition=allcpu
#SBATCH --job-name=mathematica
unset LD_PRELOAD
source /etc/profile.d/modules.sh
module purge
module load mathematica
export nprocs=$((`/usr/bin/nproc` / 2))   # we have hyperthreading enabled. nprocs==number of physical cores
math -noprompt -run '<<math-trivial.m'

# sample math-trivial.m:
tmp = Environment["nprocs"]
nprocs = FromDigits[tmp]
LaunchKernels[nprocs]
Do[Pause[1];f[i],{i,nprocs}] // AbsoluteTiming >> "math-trivial.out"
ParallelDo[Pause[1];f[i],{i,nprocs}] // AbsoluteTiming >>> "math-trivial.out"
Quit[]
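To try the example, something along these lines should work (the script name is only an assumption; by default sbatch writes stdout to slurm-&lt;jobid&gt;.out):

sbatch mathematica-job.sh      # prints "Submitted batch job <jobid>"
squeue -u $USER                # watch the job until it finishes
cat math-trivial.out           # timings written by the Do/ParallelDo loops above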
salloc
salloc uses the same syntax as sbatch.
# request one node with a P100 GPU for 8 hours in the allcpu partition:
salloc --nodes=1 --partition=allcpu --constraint=P100 --time=08:00:00

# start an interactive graphical matlab session on the allocated host:
ssh -t -Y $SLURM_JOB_NODELIST matlab_R2018a

# the allocation won't disappear when idle; you have to terminate the session yourself:
exit
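Instead of ssh you can also run commands on the allocated node with srun from within the allocation; a minimal sketch:

salloc --nodes=1 --partition=allcpu --time=01:00:00
srun hostname      # runs on the allocated node, not on the login node
srun nproc         # check the number of logical cores available there
exit               # release the allocation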
scancel
scancel 1234                 # cancel job 1234
scancel -u $USER             # cancel all my jobs
scancel -u $USER -t PENDING  # cancel all my pending jobs
scancel --name myjob         # cancel a named job
scancel 1234_3               # cancel an indexed job in a job array
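scancel can also deliver a signal instead of terminating the job, e.g. to let a job trigger its own cleanup or checkpoint handler (a sketch; only useful if the job traps the signal):

scancel --signal=USR1 --batch 1234   # send SIGUSR1 to the batch shell of job 1234 only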
sinfo
sinfo               # basic list of partitions
sinfo -N -p allcpu  # list all nodes and their state in the allcpu partition
sinfo -N -p petra4 -o "%10P %.6D %8c %8L %12l %8m %30f %N"  # list all nodes with limits and features in the petra4 partition
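Two further invocations that are often handy (standard sinfo options, not Maxwell-specific):

sinfo -p allcpu -t idle   # show only idle nodes in the allcpu partition
sinfo -R                  # list down/drained nodes together with the reason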
squeue
squeue                              # show all jobs
squeue -u $USER                     # show all jobs of a user
squeue -u $USER -p upex -t PENDING  # all pending jobs of a user in the upex partition
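For pending jobs, squeue can also show the scheduler's start-time estimate, and the output columns can be customized (standard squeue options; the format string is just an example):

squeue -u $USER --start                                   # estimated start times of pending jobs
squeue -u $USER -o "%.10i %.9P %.20j %.8T %.10M %.6D %R"  # jobid, partition, name, state, runtime, nodes, reason/nodelist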
sacct
Provides accounting information. Never use it for time spans exceeding a month!
sacct -j 1628456   # accounting information for a jobid
sacct -u $USER     # today's jobs

# get detailed information about all my jobs since 2019-01-01 and grep for all that FAILED:
sacct -u $USER --format="partition,jobid,state,start,end,nodeList,CPUTime,MaxRSS" --starttime 2019-01-01 | grep FAILED
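To keep queries within the recommended one-month window you can compute the start time on the fly; a sketch using GNU date:

# accounting for the last 7 days only, one line per job allocation (-X)
sacct -u $USER -X --starttime $(date -d '7 days ago' +%F) \
      --format="jobid,jobname,partition,state,elapsed,exitcode"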
scontrol
Displays information about currently running/pending jobs and the configuration of partitions and nodes. It also allows altering the characteristics of pending jobs.
scontrol show job 12345       # show information about job 12345. Shows nothing once the job has finished.
scontrol show reservation     # list current and future reservations
scontrol update jobid=12345 partition=allcpu  # move pending job 12345 to partition allcpu
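Other common operations on pending jobs (standard scontrol subcommands):

scontrol hold 12345                              # keep pending job 12345 from starting
scontrol release 12345                           # let it be scheduled again
scontrol update jobid=12345 timelimit=04:00:00   # lower the requested walltime of a pending job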
slurm
module load maxwell tools slurm

#Show or watch job queue:
slurm [watch] queue       # show own jobs
slurm [watch] q <user>    # show user's jobs
slurm [watch] quick       # show quick overview of own jobs
slurm [watch] shorter     # sort and compact entire queue by job size
slurm [watch] short       # sort and compact entire queue by priority
slurm [watch] full        # show everything
slurm [w] [q|qq|ss|s|f]   # shorthands for the above!
slurm qos                 # show job service classes
slurm top [queue|all]     # show summary of active users

#Show detailed information about jobs:
slurm prio [all|short]    # show priority components
slurm j|job <jobid>       # show everything else
slurm steps <jobid>       # show memory usage of running srun job steps

#Show usage and fair-share values from accounting database:
slurm h|history <time>    # show jobs finished since, e.g. "1day" (default)
slurm shares

#Show nodes and resources in the cluster:
slurm p|partitions        # all partitions
slurm n|nodes             # all cluster nodes
slurm c|cpus              # total cpu cores in use
slurm cpus <partition>    # cores available to partition, allocated and free
slurm cpus jobs           # cores/memory reserved by running jobs
slurm cpus queue          # cores/memory required by pending jobs
slurm features            # list features and GRES
slurm brief_features      # list features with node counts
slurm matrix_features     # list possible combinations of features with node counts
Ensuring minimum memory per core
The Maxwell cluster is not configured for consumable resources like memory. For an MPI job running on heterogeneous hardware, you therefore have to prepare your batch script to tailor the number of cores used on each node to the memory available there. A simple example:
#!/bin/bash
#SBATCH --partition=maxcpu
unset LD_PRELOAD
source /etc/profile.d/modules.sh
module purge
module load mpi/openmpi-x86_64

# set hostfile
HOSTFILE=/tmp/hosts.$SLURM_JOB_ID
rm -f $HOSTFILE

# set minimum 40GB per core
mem_per_core=$((40*1024))

# generate hostfile
for node in $(srun hostname -s | sort -u) ; do
    mem=$(sinfo -n $node --noheader -o '%m')
    cores=$(sinfo -n $node --noheader -o '%c')
    slots=$(( $mem / $mem_per_core ))
    slots=$(( $cores < $slots ? $cores : $slots ))
    echo $node slots=$slots >> $HOSTFILE
done

# run ...
mpirun --hostfile $HOSTFILE
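With 40 GB per rank the generated hostfile might look like the following (node names and memory sizes are purely illustrative):

# /tmp/hosts.<jobid> -- illustrative content only
max-wna001 slots=12    # 512 GB node: 524288 MB / 40960 MB per core = 12 slots
max-wna002 slots=6     # 256 GB node: limited to 6 slots by memory, not by core count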
For a homogeneous set of nodes, life becomes much easier:
#!/bin/bash
#SBATCH --partition=allcpu,maxcpu
#SBATCH --constraint='[(EPYC&7402)|Gold-6240|Gold-6140]'
#SBATCH --nodes=8
unset LD_PRELOAD
source /etc/profile.d/modules.sh
module purge
module load mpi/openmpi-x86_64

# only use physical cores. Since all nodes are identical (constraint) this fits for all nodes
nprocs=$(( $(nproc) / 2 ))

# -N ensures $nprocs processes per node
mpirun -N $nprocs hostname | sort | uniq -c   # should show the same count for each node
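The square brackets in the constraint ask SLURM to pick exactly one of the listed alternatives for all allocated nodes, which is what makes the set homogeneous. If you want to double-check inside the job which CPU type was actually selected, a sketch:

# one task per node: print the CPU model and count how often each model occurs
srun --ntasks-per-node=1 grep -m1 'model name' /proc/cpuinfo | sort | uniq -c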