Computing : Python Bindings

HTCondor has a powerful Python API, so you can integrate BIRD usage into your Python scripts.

Documentation

http://research.cs.wisc.edu/htcondor/manual/v8.6/6_7Python_Bindings.html

https://htcondor-python.readthedocs.io/en/latest/index.html

Basic Preparations

Import the two modules

>>> import htcondor, classad

to talk to Condor and to use its ClassAds, which are key=value stores of all the information in the system.

Connect to the collector, which collects all information about the cluster (it optionally takes the target collector's FQDN as an argument)

>>> collector = htcondor.Collector()

Scheduler

Get the default scheduler to submit jobs to

>>> mySchedd = htcondor.Schedd()

For more information, you can ask the collector for the default scheduler's ClassAd

>>> myScheddClassAds = collector.locate(htcondor.DaemonTypes.Schedd)

To get a list of all schedulers, query the collector (check the documentation or ask the admins which schedulers are intended for end users; otherwise jobs might get lost)

>>> collector.query(htcondor.AdTypes.Schedd, projection=["Name"])
[[ Name = "bird-htc-sched01.desy.de" ], [ Name = "bird-htc-sched02.desy.de" ]]

Submitting a job (currently not possible, use an execution of condor_submit instead)

create a job object with some executable to run

>>> myjob = htcondor.Submit()
>>> myjob["executable"] = "~/TestJobs/mypayload.sh"
>>> myjob["arguments"] = "baz"

create a scheduler object, which we will talk to in order to submit the job

>>> mySchedd = htcondor.Schedd()

and submit the job (use the 'with' block, as otherwise the transaction seems to be left dangling)

>>> with mySchedd.transaction() as myTransaction:
...     myjob.queue(myTransaction)
1234567
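Since the heading above notes that submission through the bindings is currently not possible, the following is a minimal sketch of calling condor_submit from Python instead. The helper names build_submit_description and run_condor_submit are hypothetical, and the sketch assumes that condor_submit is on the PATH and reads the submit description from standard input when given "-" as the file argument; check this against the installed version.

```python
import subprocess


def build_submit_description(executable, arguments=""):
    """Return a minimal submit description as text (hypothetical helper)."""
    lines = ["executable = {}".format(executable)]
    if arguments:
        lines.append("arguments = {}".format(arguments))
    lines.append("queue")
    return "\n".join(lines) + "\n"


def run_condor_submit(description, command=("condor_submit", "-")):
    """Pipe the description to condor_submit and return its standard output.

    Assumption: condor_submit is on PATH and reads the description from
    standard input when "-" is given as the file argument.
    Raises subprocess.CalledProcessError if condor_submit exits non-zero.
    """
    result = subprocess.run(
        list(command),
        input=description,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


# Example (needs a working HTCondor installation):
# print(run_condor_submit(build_submit_description("mypayload.sh", "baz")))
```

Keeping the text generation separate from the subprocess call makes it easy to print and inspect the description before anything is actually submitted.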

Querying ClassAd Information

Most HTCondor daemon objects support queries, so that you can ask them for information in the form of ClassAds.

To ask the collector for a full list of all scheduler daemons' info

>>> collector.query(htcondor.AdTypes.Schedd)

As optional additional arguments, query() also takes a constraint to select specific ads and a projection to return only specific ClassAd attributes.
For example, the following uses a regular expression to limit the matching scheduler daemons to those with '1' or '2' in the name, followed by a projection onto just the name and the number of jobs they have started

>>> collector.query(htcondor.AdTypes.Schedd, 'regexp(".*[1-2].*", Name)',projection=["Name","JobsStarted"])
[[ Name = "bird-htc-sched01.desy.de"; JobsStarted = 721605 ], [ Name = "bird-htc-sched02.desy.de"; JobsStarted = 601639 ]]

or, a bit simpler, just get the names of the schedulers that have started fewer than 1000 jobs

>>> collector.query(htcondor.AdTypes.Schedd,'JobsStarted<1000', projection=["Name","JobsStarted","JobsRunning"])
[[ Name = "bird-htc-sched05.desy.de"; JobsStarted = 0; JobsRunning = 0 ]]
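The query results above are lists of ClassAds. As a rough sketch of how they could be processed further, the following copies the projected attributes into plain dictionaries and sorts them. The helper names ads_to_rows, least_used and query_schedd_rows are hypothetical; the sketch only assumes that each ad offers a dict-like get(), which also lets you try the helpers with plain dicts when no pool is reachable.

```python
def ads_to_rows(ads, attributes):
    """Copy the chosen attributes of each ad into a plain dict.

    Works with anything offering .get(), so ClassAds and plain dicts both fit.
    Attributes missing from an ad come back as None.
    """
    return [{name: ad.get(name) for name in attributes} for ad in ads]


def least_used(rows, key="JobsStarted"):
    """Sort rows by a numeric attribute, treating missing values as 0."""
    return sorted(rows, key=lambda row: row.get(key) or 0)


def query_schedd_rows(attributes=("Name", "JobsStarted"), constraint="true"):
    """Query the collector for scheduler ads and return them as rows.

    Needs the htcondor module and a reachable pool, so the import is
    done lazily inside the function.
    """
    import htcondor

    collector = htcondor.Collector()
    ads = collector.query(
        htcondor.AdTypes.Schedd, constraint, projection=list(attributes)
    )
    return ads_to_rows(ads, attributes)


# Example (needs a reachable pool):
# for row in least_used(query_schedd_rows()):
#     print(row["Name"], row["JobsStarted"])
```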

btw: the scheduler daemons just serve as examples here and are not intended for manual scheduler selection. Check the documentation or ask the admins which schedulers are intended for end users; otherwise jobs might get lost. If in doubt, just use the default scheduler object that htcondor.Schedd() gives you


btw: there are two types of 'queries'

daemon.query(...) → gives the full list [result1, result2, ...]

daemon.xquery(...) → gives just a lighter iterator to loop over instead of the full list; much faster and slimmer if you only want to iterate anyway
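One practical consequence of the two query types is that an iterator can only be consumed once. The sketch below therefore does its work in a single pass, so it accepts either the list from query() or the iterator from xquery(). The helper names count_by_attribute and jobs_per_owner are hypothetical, and the sketch assumes that the installed bindings still provide Schedd.xquery() with a projection keyword.

```python
from collections import Counter


def count_by_attribute(ads, attribute="Owner"):
    """Count ads per attribute value in one pass over any iterable."""
    counts = Counter()
    for ad in ads:
        counts[ad.get(attribute)] += 1
    return counts


def jobs_per_owner():
    """Count the jobs in the default scheduler's queue per owner.

    Needs the htcondor module and a reachable scheduler, so the import is
    done lazily. Assumption: Schedd.xquery() is available in the installed
    version of the bindings.
    """
    import htcondor

    schedd = htcondor.Schedd()
    return count_by_attribute(schedd.xquery(projection=["Owner"]))
```

If the results are needed more than once, use query() or wrap the iterator in list() first.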