Created by Frank Schluenzen, last modified on Jan 24, 2023 22:40
SLURM has no role model that would allow certain tasks to be delegated to individual users: you are either a SLURM admin or you are not. Creating and managing reservations is one of the tasks that would benefit greatly from delegation, for example to reserve a compute node for a beamtime. We have therefore implemented a web service which does exactly that (but please note that it is work in progress!).
The web service is accessible (from within the DESY network) at https://max-portal.desy.de/reservation/. It requires DESY credentials to log in, but you won't be able to do anything unless you have been added to the list of authorized accounts. If you would like to use the web service, please get in touch with maxwell.service@desy.de.
The reservation tool has a number of nice features:
- it allows authorization to be set per partition
- it allows the consumable resources to be limited per partition; for example, a limit can ensure that a partition never has more than N nodes reserved at a time
- it supports constraints; it guides you through the set of constraints and makes it impossible to create an invalid combination of constraints
- it nicely handles groups and users
- it comes with a REST API
The REST API has been used to create a couple of Python scriptlets, which make it possible to perform most of the web service's tasks directly from the command line.
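The per-partition node limit mentioned in the feature list can be pictured with a small sketch. This is not the service's actual implementation, which is not shown here; it only illustrates one plausible way to check that a new reservation would keep a partition at or below N reserved nodes at every moment. The function names and the `limit` parameter are hypothetical; the record fields (`start_time`, `end_time`, `node_cnt`) follow the output format of the CLI described in the next section.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # timestamp format used in the reservation records


def peak_reserved_nodes(reservations, start, end):
    """Largest number of nodes reserved at any one moment within [start, end)."""
    events = []
    for res in reservations:
        s = max(datetime.strptime(res["start_time"], FMT), start)
        e = min(datetime.strptime(res["end_time"], FMT), end)
        if s < e:  # the reservation overlaps the window
            events.append((s, res["node_cnt"]))
            events.append((e, -res["node_cnt"]))
    peak = current = 0
    # At equal timestamps releases (negative deltas) sort before new claims.
    for _, delta in sorted(events):
        current += delta
        peak = max(peak, current)
    return peak


def fits_limit(reservations, start, end, node_cnt, limit):
    """True if adding node_cnt nodes in [start, end) never exceeds limit (hypothetical rule)."""
    return peak_reserved_nodes(reservations, start, end) + node_cnt <= limit


existing = [{"start_time": "2021-07-19T12:00:00",
             "end_time": "2021-07-19T14:00:00",
             "node_cnt": 1}]
new_start = datetime(2021, 7, 19, 13, 0)
new_end = datetime(2021, 7, 19, 15, 0)
print(fits_limit(existing, new_start, new_end, node_cnt=1, limit=1))  # False
print(fits_limit(existing, new_start, new_end, node_cnt=1, limit=2))  # True
```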
SLURM RESERVATION CLI
The Python modules to handle SLURM reservations can be found on Maxwell under /software/tools/lib/python3.6/slurmres. The modules are not bound to Maxwell and should work on any machine (e.g. they would allow you to create a reservation from a beamline PC).
As with the web service: without account authorization none of the modules will work. Assuming that you are authorized to manage reservations for partition allcpu, the CLI works as follows.
TOKEN
To work conveniently with the CLI you'll need a token:
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmrestoken.py -h
usage: slurmrestoken.py [-h] [-t TOKEN_PATH | -r]
Creates a new token
optional arguments:
-h, --help show this help message and exit
-t TOKEN_PATH, --token TOKEN_PATH
local path to token file
-r, --revoke revoke token on server
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmrestoken.py
Username: user
Password:
Token generated at /home/user/slurm_res/slurm_res_token.dat
LIST
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmreslist.py -h
usage: slurmreslist.py [-h] [-t TOKEN] [-p PARTITION]
List all reservations or all reservations of a partition.
optional arguments:
-h, --help show this help message and exit
-t TOKEN, --token TOKEN
the path to the token
-p PARTITION, --partition PARTITION
show reservations only for a specific partition
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmreslist.py -p allcpu
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_001',
'node_cnt': 1,
'node_list': 'max-cfel023',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=40'],
'users': ['user1', 'user2']}
1 reservation found
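The listing is printed as Python dictionary literals followed by a summary line, so it can be read back into Python if further processing is needed. The following is a minimal sketch, assuming the output always has the shape shown above; `parse_reservations` is a hypothetical helper and not part of the slurmres modules.

```python
import ast


def parse_reservations(text):
    """Collect the dict literals from a slurmreslist-style listing."""
    records, buffer = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not buffer and not stripped.startswith("{"):
            continue  # skip anything outside a record, e.g. "1 reservation found"
        buffer.append(stripped)
        if stripped.endswith("}"):
            try:
                records.append(ast.literal_eval(" ".join(buffer)))
                buffer = []
            except SyntaxError:
                pass  # record not complete yet, keep collecting lines
    return records


sample = """{'accounts': [],
'core_cnt': 20,
'licenses': {},
'name': 'res_test_001',
'node_cnt': 1,
'node_list': 'max-cfel023',
'users': ['user1', 'user2']}
1 reservation found"""

print([res["name"] for res in parse_reservations(sample)])  # ['res_test_001']
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text coming from another program.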
CREATE
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresnew.py -h
usage: slurmresnew.py -n NAME -p PARTITION -c COUNT -u user [user ...] -s
START -e END [-h] [-f feature [feature ...] | -N node
[node ...]] [-i | -P | -k] [-t TOKEN_PATH]
Create a reservation in a partition.
required arguments:
-n NAME, --name NAME the name of the reservation
-p PARTITION, --partition PARTITION
name of partition the reservation is to be created in
-c COUNT, --count COUNT
the amount of nodes
-u user [user ...], --users user [user ...]
a list of users
-s START, --start START
the start date of the reservation [Y-M-DTH:M]
-e END, --end END the end date of the reservation [Y-M-DTH:M]
optional arguments:
-h, --help show this help message and exit
-f feature [feature ...], --features feature [feature ...]
optional features
-N node [node ...], --nodes node [node ...]
optional specified nodes
-i, --ignore_jobs ignore currently running jobs
-P, --preempt_jobs kill currently running jobs if preemptable
-k, --kill_jobs kill currently running jobs
-t TOKEN_PATH, --token TOKEN_PATH
local path to token file
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresnew.py -p allcpu -n res_test_002 -c 1 -s 2021-07-19T12:00 -e 2021-07-19T14:00 -u user1,user2
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_001',
'node_cnt': 1,
'node_list': 'max-cfel023',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=40'],
'users': ['user1', 'user2']}
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_002',
'node_cnt': 1,
'node_list': 'max-cfel024',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=40'],
'users': ['user1', 'user2']}
Reservation successfully created
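The start and end times have to be given as [Y-M-DTH:M], for example 2021-07-19T12:00. When the times are computed rather than typed, they can be formatted as in the sketch below. It assumes that the format corresponds to the strftime pattern `%Y-%m-%dT%H:%M`, which matches the example above; `reservation_window` is a hypothetical helper.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M"  # assumed equivalent of [Y-M-DTH:M]


def reservation_window(start, hours):
    """Return (start, end) strings for a reservation lasting the given number of hours."""
    end = start + timedelta(hours=hours)
    return start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)


start, end = reservation_window(datetime(2021, 7, 19, 12, 0), hours=2)
print(start, end)  # 2021-07-19T12:00 2021-07-19T14:00
```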
EDIT
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresedit.py -h
usage: slurmresedit.py -n NAME -p PARTITION [-h] [-c COUNT]
[-u user [user ...]] [-s [START]] [-e [END]]
[-N node [node ...]] [-i | -P | -k] [-t TOKEN_PATH]
Edit a reservation in a partition.
required arguments:
-n NAME, --name NAME the name of the reservation
-p PARTITION, --partition PARTITION
name of the reservations partition
optional arguments:
-h, --help show this help message and exit
-c COUNT, --count COUNT
the amount of nodes
-u user [user ...], --users user [user ...]
a list of users
-s [START], --start [START]
the start date of the reservation [Y-M-DTH:M]
-e [END], --end [END]
the end date of the reservation [Y-M-DTH:M]
-N node [node ...], --nodes node [node ...]
optional specified nodes
-i, --ignore_jobs ignore currently running jobs
-P, --preempt_jobs kill currently running jobs if preemptable
-k, --kill_jobs kill currently running jobs
-t TOKEN_PATH, --token TOKEN_PATH
local path to token file
# change nodecount and list of users:
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresedit.py -n res_test_002 -c 2 -u user1,user2,user3 -p allcpu
[...]
{'accounts': [],
'burst_buffer': [],
'core_cnt': 40,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_002',
'node_cnt': 2,
'node_list': 'max-cfel[024-025]',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=80'],
'users': ['user1', 'user2', 'user3']}
Reservation successfully edited
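Once a reservation spans more than one node, node_list is reported in compressed form, e.g. max-cfel[024-025]. On a machine with the SLURM client tools, `scontrol show hostnames` expands such an expression. Where those tools are not available (for instance on a beamline PC), a simple expansion can be sketched in Python. The helper below is hypothetical and only handles a single bracket group with comma-separated numbers and ranges, not the full SLURM hostlist syntax.

```python
import re


def expand_node_list(node_list):
    """Expand e.g. 'max-cfel[024-025]' into individual host names."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", node_list)
    if match is None:
        return [node_list]  # no bracket group: treat as a single host name
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        first, _, last = part.partition("-")
        last = last or first  # a single number is a range of length one
        width = len(first)    # keep the zero padding of the original
        for number in range(int(first), int(last) + 1):
            hosts.append(f"{prefix}{number:0{width}d}{suffix}")
    return hosts


print(expand_node_list("max-cfel[024-025]"))  # ['max-cfel024', 'max-cfel025']
```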
DELETE
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresdelete.py -h
usage: slurmresdelete.py [-h] -n NAME -p PARTITION [-t TOKEN]
Delete a reservation in a partition.
optional arguments:
-h, --help show this help message and exit
-n NAME, --name NAME name of reservation
-p PARTITION, --partition PARTITION
name of partition the reservation is in
-t TOKEN, --token TOKEN
the path to the token
@max-wgse001:~$ python3 /software/tools/lib/python3.6/slurmres/slurmresdelete.py -n res_test_002 -p allcpu
Reservation successfully deleted
WRAPPER
Most people will presumably make use of the Python code. For convenience there is also a wrapper which invokes the corresponding Python module; the arguments are identical:
@max-wgse001:~$ /software/tools/sbin/slurmreservation
usage: slurmreservation token|list|create|edit|delete
@max-wgse001:~$ /software/tools/sbin/slurmreservation create -n res_test_004 -c 1 -u user1,user2 -p allcpu -s 2021-07-19T14:50 -e 2021-07-19T16:00
{'accounts': [],
'burst_buffer': [],
'core_cnt': 48,
'end_time': '2021-07-19T16:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_004',
'node_cnt': 1,
'node_list': 'max-wn096',
'partition': 'allcpu',
'start_time': '2021-07-19T14:50:00',
'tres_str': ['cpu=96'],
'users': ['user1', 'user2']}
Reservation successfully created
@max-wgse001:~$ /software/tools/sbin/slurmreservation list
{'accounts': [],
'burst_buffer': [],
'core_cnt': 48,
'end_time': '2021-07-19T16:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_004',
'node_cnt': 1,
'node_list': 'max-wn096',
'partition': 'allcpu',
'start_time': '2021-07-19T14:50:00',
'tres_str': ['cpu=96'],
'users': ['user1', 'user2']}
1 reservation found
@max-wgse001:~$ /software/tools/sbin/slurmreservation delete -n res_test_004 -p allcpu
Reservation successfully deleted
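The wrapper can also be driven from a script, for example to create a reservation as part of a beamtime setup. The sketch below only uses the options documented above and passes the users as a comma-separated string, as in the examples. `build_create_command` and `run` are hypothetical helpers; it is an assumption that the wrapper exits with a non-zero status on failure, which is what `check=True` relies on.

```python
import subprocess

WRAPPER = "/software/tools/sbin/slurmreservation"


def build_create_command(name, partition, count, users, start, end):
    """Assemble the argument list for 'slurmreservation create'."""
    return [
        WRAPPER, "create",
        "-n", name,
        "-p", partition,
        "-c", str(count),
        "-u", ",".join(users),  # comma-separated, as in the examples above
        "-s", start,
        "-e", end,
    ]


def run(command):
    """Run the wrapper and return its output (raises CalledProcessError on failure)."""
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


command = build_create_command("res_test_004", "allcpu", 1, ["user1", "user2"],
                               "2021-07-19T14:50", "2021-07-19T16:00")
print(" ".join(command))
# To actually create the reservation (requires authorization and a valid token):
# print(run(command))
```

Passing the arguments as a list avoids shell quoting issues, and `stdout=subprocess.PIPE` with `universal_newlines=True` also works on Python 3.6.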