Computing : Submitting Jobs

Login and submit nodes

You can find an overview of login nodes with HTCondor available for submission into the NAF here: NAF Login, WGS and remote Desktop


If you want to have a quick look around on a worker node, try an interactive batch session using 'condor_submit -i'; see details here: Interactive batch session
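On the command line this could look as follows (a sketch; the submit file name is just an illustrative placeholder):

```shell
# Request an interactive slot; condor_submit blocks until a slot
# becomes available and then opens a shell on the worker node.
condor_submit -i

# A submit file can be passed as well to request specific resources,
# e.g. more memory or cores (file name is hypothetical):
# condor_submit -i my_resources.sub
```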


DESY specific submit file entries

If you are an experienced HTC user but have not used the NAF before, here are the few DESY-specific submit file entries you should be aware of (all other users, please see the detailed explanations underneath):

  • The default OS is that of your submit host. Since Scientific Linux 6 reached its end of life in 2020, only CentOS7 (EL7) is available. To explicitly request a specific OS/release, add: 'Requirements = ( OpSysAndVer == "CentOS7" )'
    (since only EL7 is currently supported, the OS requirement is dispensable, but it can be useful in other Condor environments with more systems)
  • You need an accounting group for successful job submission; the default accounting group is set on the WGS at login time. If you want or need to alter the accounting group, put:

    '+MyProject = "<accounting group>"'

  • A default job gets 1 core, 2 GB of RAM and 3 h of run time (the job gets killed after 3 hours); here are the submit file entries to alter these defaults:
    • 'Request_Cpus = <num>' # number of requested cpu-cores
    • 'Request_Memory = <quantity>' # memory, in MiB by default, e.g. 512, 1GB etc.
    • '+RequestRuntime = <seconds>' # requested run time in seconds

Note: By omitting the alteration of requested CPU cores, run time and memory you create a 'standard' job with 1 core, 2 GB of memory and 3 h run time. These jobs can run on opportunistic quota and give you a good chance to get a lot of CPU cycles that are currently not used by their respective owners!
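Putting the entries above together, a submit file fragment that alters all three defaults and sets an accounting group could look like this (the accounting group name is a placeholder, not a real group):

```
# Sketch only: "my.group" is a placeholder accounting group
Requirements    = ( OpSysAndVer == "CentOS7" )
+MyProject      = "my.group"
Request_Cpus    = 4        # four CPU cores
Request_Memory  = 4096     # 4 GiB of RAM (value in MiB)
+RequestRuntime = 21600    # six hours, given in seconds
```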

Simple submit file

The submit file is the definition of your job; here is something very simple for a start:

Simple HTC submit file using AFS
Executable  = $ENV(HOME)/<path>/<to>/<your>/<executable>
Log         = $ENV(HOME)/log_$(Cluster)_$(Process).txt
Output      = $ENV(HOME)/out_$(Cluster)_$(Process).txt
Error       = $ENV(HOME)/error_$(Cluster)_$(Process).txt
Queue 1

Explained:

Executable = $ENV(HOME)/<path>/<to>/<your>/<executable>

Path to your executable that is supposed to run inside the job slot (hence it must have the 'x-bit' set). Of course it is wise to test the executable from the command line on the submit host first, to make sure everything runs as expected. In the example the executable sits in AFS; it could also be located on DUST or on your local disk. In the latter case use:

Transfer_Executable = true

to make sure your executable gets transferred with your job!
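A sketch of such a submit file with a locally stored executable might look like this (file names are illustrative):

```
# Hypothetical example: the executable lives on the submit host's
# local disk, so HTCondor has to copy it to the worker node.
Executable          = my_analysis.sh
Transfer_Executable = true
Log    = log_$(Cluster)_$(Process).txt
Output = out_$(Cluster)_$(Process).txt
Error  = error_$(Cluster)_$(Process).txt
Queue 1
```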

Log = $ENV(HOME)/log_$(Cluster)_$(Process).txt

Path to your log file, the place where you can find information about the job's lifetime. The log file gets appended to and hence grows very big after a while; therefore, in the example, '$(Cluster)_$(Process)' will be expanded to the cluster and job ID, resulting in one log file per job.

If you omit the absolute path to the log file, it will show up in the directory where the submit took place.

Remember: while the batch system is happy to create files for you, it will never create a missing directory; instead the job will go into 'hold' state and not run.

Error = $ENV(HOME)/error_$(Cluster)_$(Process).txt

Path to your error file, the place where you can find information about the things that went wrong during the job's lifetime. '$(Cluster)_$(Process)' will be expanded to the cluster and job ID, resulting in one error file per job.

As with the log file: if you omit the absolute path, the error file will show up in the directory where the submit took place, and a missing directory will put the job into 'hold' state instead of running.

Queue 1

The 'Queue' command is a powerful way to create array jobs, instead of looping inside a script and creating single jobs. In the example it just creates one job. See More sophisticated job submit file examples for proper usage of the flexibility of 'Queue'!
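As a small taste of that flexibility, here is a sketch of two common 'Queue' variants (the input file pattern is hypothetical):

```
# Create ten jobs at once; each gets its own $(Process) id 0..9:
Queue 10

# Or create one job per matching input file; inside the job,
# $(input_file) expands to the respective file name:
# Queue input_file matching files input_*.dat
```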