Computing : large number of jobs

If you want to submit more than a couple of thousand jobs at a time (e.g. 10,000), there are some basic rules to follow so that you do not put a very high load on the scheduler, which in turn embarrasses you and others through slow 'condor_q' response times, failing condor_submit calls and even job loss.

As a basic rule, condor is much more forgiving when a large number of jobs is submitted through a single condor_submit command using the 'queue <N>' syntax, as opposed to an individual condor_submit command for every single job.
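A minimal sketch of such a submit file, creating 1000 jobs with a single condor_submit call (the script name and the out/ paths are placeholders, not part of any real setup):

# one job per $(Process) index 0..999; my_analysis.sh and out/ are placeholders
executable = my_analysis.sh
arguments  = $(Process)
output     = out/$(Cluster)_$(Process).out
error      = out/$(Cluster)_$(Process).err
log        = cluster.log
queue 1000

Submitting this once with condor_submit creates one cluster containing 1000 jobs instead of 1000 separate clusters.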

The submit file syntax allows you to traverse directories and produce a job for every file of a certain type that is found; even script output can be used from inside the submit file to create large numbers of jobs. Please have a look at the chapter 'more sophisticated submit files' in this section.
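For instance, a sketch of a submit file that creates one job per matching input file could look like this (the script name, the 'infile' variable and the input/*.dat pattern are only assumptions for illustration):

# one job per file matching input/*.dat; names and paths are placeholders
executable = my_analysis.sh
arguments  = $(infile)
transfer_input_files = $(infile)
output     = out/$(Cluster)_$(Process).out
error      = out/$(Cluster)_$(Process).err
log        = cluster.log
queue infile matching files input/*.dat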

If you have to submit more than 10,000 jobs at a time, there is an additional feature called 'late materialization':


The late materialization options in your submit file tell condor to throttle the jobs of your current submission.

For example:

materialize_max_idle = 10

queue 10000

Having these two lines in your submit file means: submit 10,000 jobs, but always keep only 10 jobs idle in the schedd queue (when a job leaves the queue and the number of idle queued jobs drops to 9, the next job from your submission will get 'materialized' in the queue).

Using this feature, the scheduler is not busy writing and maintaining 100,000 ClassAds for user XY's jobs that will not run until tomorrow anyway. Technically the feature is very similar to array jobs in SGE; it makes the schedulers behave in a much healthier way if you use it, and it will not slow down your throughput in any way.
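Put together, a throttled submission could look like the following sketch (all file names are placeholders):

# 10,000 jobs, but never more than 10 of them idle in the queue at once
executable           = my_analysis.sh
arguments            = $(Process)
output               = out/$(Cluster)_$(Process).out
error                = out/$(Cluster)_$(Process).err
log                  = cluster.log
materialize_max_idle = 10
queue 10000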


Another option of this feature is to limit the number of active jobs of one kind, for example if these jobs put a heavy load on another resource such as a filesystem:


max_materialize = <n>


Will keep the number of jobs of your submission that are materialized in the queue at or below <n> at all times.
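As an illustration, assuming the two throttles can be combined in one submit file (the numbers 200 and 20 are purely made up for this sketch):

# at most 200 jobs of this submission in the queue at any time,
# and of those at most 20 idle
max_materialize      = 200
materialize_max_idle = 20
queue 10000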