Grid : Condor CE Node

Setting up a CondorCE Scheduler on a Condor node, to submit incoming Grid jobs as batch jobs to the local Condor cluster

HTCondor CE Basics

We are running an HTCondor CE on a dedicated VM. The CondorCE receives Grid jobs through the CE interface, tries to map the incoming job's Grid proxy DN to a local user:group ID and submits the jobs as batch jobs to the local HTCondor cluster.
Currently, we have also put a load balancer in front of the CE, with the initial idea of rotating CEs later on. However, it might be a problem that the returned job ID is not a full URN as it is for ARC CE job IDs, where the resolved A/AAAA/CNAME record of the actual node is returned in the job ID URN.

Installation

We use HTCondor-CE version 4, which as of writing is available in the HTCondor development repository (take care not to mix HTCondor-CE v3 and v4 config files). The following packages are needed:

  • htcondor-ce
  • htcondor-ce-apel
  • condor
  • python3-condor
  • fetch-crl
  • ca-policy-egi-core
  • ca-policy-lcg
  • voms
  • OpenJDK if necessary

Please note that a number of packages might still require Python 2.7 etc., so an installation with CentOS 8 as the base OS might not work.
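
To install, something along these lines should pull in everything listed above (assuming the HTCondor development repository and the EGI trust-anchor repositories are already configured; the OpenJDK package name depends on the OS release):

# install the CE, the local batch tools and the Grid trust/CRL machinery
yum install htcondor-ce htcondor-ce-apel condor python3-condor \
            fetch-crl ca-policy-egi-core ca-policy-lcg voms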

Host Certificate

The host certificate and key go to the default location under /etc/grid-security/.
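
A minimal sketch of putting the files into place with the usual grid-security names and permissions (names and modes follow the common convention, adjust to what your CA delivers):

# certificate world-readable, key only readable by root
install -o root -g root -m 644 hostcert.pem /etc/grid-security/hostcert.pem
install -o root -g root -m 400 hostkey.pem /etc/grid-security/hostkey.pem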

Paths

The CondorCE uses the ordinary Condor tools to manage the incoming Grid jobs in its own pool and to submit them to the local batch system. The CondorCE therefore replicates the Condor directory layout, so that in a CondorCE-to-Condor setup the paths look familiar and exist in parallel to the corresponding Condor paths (when using a different local batch system instead of HTCondor, the local batch paths will of course differ):

Condor-CE                     Condor
/etc/condor-ce/config.d       /etc/condor/config.d
/var/log/condor-ce            /var/log/condor
/var/lib/condor-ce/spool      /var/lib/condor/spool
/etc/sysconfig/condor-ce
...                           ...


Configs

To avoid clashes with updated default configs from package updates, we do not modify/overwrite the shipped default configs but override them (where necessary) with our own configs. E.g.,

/etc/condor-ce/config.d/02-ce-condor.conf        (untouched default)
/etc/condor-ce/config.d/90_02-ce-condor.conf  (our own modifications to some class ads of the default)

With the default configs and a set of adapted ClassAds, the CondorCE should be in a working state. Our basic additional config templates in Puppet look like

/etc/condor-ce/config.d/90_50-ce-apel.conf
/etc/condor-ce/config.d/90_10-logging.conf
/etc/condor-ce/config.d/90_03-ce-routes.conf
/etc/condor-ce/config.d/90_01-ce-condor.conf
/etc/condor-ce/config.d/90_01-ce-auth.conf

where one has to fill in the local values like FQDNs and check if all paths exist.
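
For illustration, a hypothetical override: HTCondor-CE reads the files in /etc/condor-ce/config.d in lexicographic order, so a knob set in a 90_* file wins over the value shipped in 02-ce-condor.conf (the knob names below exist in the shipped defaults, the values here are only placeholders):

# /etc/condor-ce/config.d/90_02-ce-condor.conf  (example override)
# point the job router at the local batch schedd and central manager
JOB_ROUTER_SCHEDD2_NAME = batch-schedd.example.org
JOB_ROUTER_SCHEDD2_POOL = batch-cm.example.org:9618

Which file finally defines a knob can be checked with "condor_ce_config_val -verbose <KNOB>".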

DN to UID:GID mapping

Since nowadays mainly pilot jobs are circulating in the Grid, the need for pool accounts has decreased. So, we decided to go for a simpler static mapping of Grid proxy DNs to local users. Our Puppet template, which gets filled with values from Hiera, is /etc/condor-ce/condor_mapfile.erb, with the resulting mapping looking like

...
GSI ".*,\/desy\/Role=lcgadmin\/Capability=NULL" desylocaluser001
GSI ".*,\/desy\/Role=lcgadmin" desylocaluser002
GSI "(/CN=[-.A-Za-z0-9/= ]+)" \1@unmapped.htcondor.org
CLAIMTOBE .* anonymous@claimtobe
FS (.*) \1

Ensure that these users exist on your CE and pool nodes, so that Condor can switch to them.
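
With a valid VOMS proxy, the effective mapping can be checked directly against the CE's authentication; a sketch (condor_ce_ping ships with htcondor-ce, the hostname is only an example):

# on a machine with a valid proxy, ask the CE which identity we get mapped to
voms-proxy-init --voms desy
condor_ce_ping -verbose -name grid-htcondorce1.zeuthen.desy.de \
    -pool grid-htcondorce1.zeuthen.desy.de:9619 WRITE
# the "Remote Mapping:" line in the output should show the expected local user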

HTCondor Batch

CondorCE Sched

The config for the Condor schedd submitting into the local Condor batch cluster in /etc/condor/config.d is the same as for other scheduler nodes in the cluster, with additions for the APEL accounting and for the impersonation of all users; see /etc/condor/config.d/90-condor-ce.conf.erb.
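
A minimal sketch of the impersonation part (the knob is a standard HTCondor setting; the attached 90-condor-ce.conf.erb template may contain more, e.g. the APEL-related pieces):

# /etc/condor/config.d/90-condor-ce.conf (sketch)
# allow the CE's daemon user to submit and manage jobs on behalf of all mapped local users
QUEUE_SUPER_USER_MAY_IMPERSONATE = .*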

Each Batch Node

For the accounting, each batch node has to know/report its HS06 benchmark value back to the CondorCE.
For reporting to APEL, an average value for the whole cluster has to be published towards APEL. On a batch node, the ApelScaling attribute can then be used to correct the power of this node relative to the cluster average. Thus, after benchmarking we drop an additional config with the HS06 values on each batch node as /etc/condor/config.d/40_apel_hs06.conf, e.g.,

# averaged HS06 power per core over the whole cluster
ClusterAvgCoreHS06 = 11.22

CoresTotalCores = 12345

# actual HS06 value benchmarked on this node (additionally normalizing it to the node's (HT) core count as well for our internal statistics)
HS06 = 456

HS06PerSlot =  eval( real(HS06)/real(TotalCpus) )
# calculating the ratio of the actual core HS06 value and the averaged one
ApelScaledPerSlot = eval( real(HS06PerSlot)/real(ClusterAvgCoreHS06) )
ApelScaling = ApelScaledPerSlot

# adding all class ads defined before to the startd, so that they can be evaluated later on
STARTD_ATTRS = $(STARTD_ATTRS) ApelScaling ApelScaledPerSlot HS06 HS06PerSlot HS06perWatt CoresTotalCores ClusterAvgCoreHS06
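
As a worked example with the numbers above and a hypothetical TotalCpus = 48: HS06PerSlot = 456/48 = 9.5 and ApelScaling = 9.5/11.22 ≈ 0.85, i.e. this node is accounted with about 85% of the cluster-average core power. After a condor_reconfig of the startd, the advertised attributes can be checked from a central manager, e.g.:

# show the advertised benchmark/scaling attributes per machine
condor_status -af:h Machine HS06 HS06PerSlot ApelScaling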

APEL Accounting

Unfortunately, APEL accounting with HTCondor-CEs is rather new and rather poorly documented. The best docs (as of the day this page was written) are here: https://twiki.cern.ch/twiki/bin/viewauth/LCG/HtCondorCeAccounting

Due to scaling/locking issues, it seems not to be feasible to use one database for all CEs. For a new CE, ask the database group to create a new DB and set it in the CE's host YAML:

Hiera CE db name
grid::apel::db:
  name : 'grid_condor_newceid'

The RPM "htcondor-ce-apel" contains 2 scripts: /usr/share/condor-ce/condor_blah.sh & /usr/share/condor-ce/condor_batch.sh which generate accounting records from HTCondor + HTCondor-CE under /var/lib/condor-ce/apel/ (or whatever path you define as APEL_OUTPUT_DIR in /etc/condor-ce/config.d/50-ce-apel.conf).

/usr/share/condor-ce/condor_blah.sh mainly invokes condor_history with adapted options. The last field in the output file is supposed to be "ApelScaling". However, it comes out as "undefined" (probably because of a wrong/misunderstood setup?) and needs to be replaced by a correct value (in the simplest case: 1). This field is used to multiply the job usage value, relative to the "default" HS06 value specified in APEL's client.cfg.
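
Until the root cause is understood, a crude fix-up of the generated records is possible; a sketch (the file glob under the APEL output dir is a guess and has to be adapted to the actual record file names):

# replace a trailing "undefined" ApelScaling field by 1 in the generated records
sed -i 's/undefined$/1/' /var/lib/condor-ce/apel/*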

apelclient is unable to read GLUE2 LDAP records from the information system. Therefore, the SPEC values of the cluster need to be put into the accounting DB manually. It should look like this (in /etc/apel/client.cfg):

[spec_updater]
enabled = true
# The GOCDB site name
site_name = DESY-ZN
lrms_server = grid-htcondor.zeuthen.desy.de
spec_type = HEPSPEC
spec_value = 20.39
manual_spec1 = grid-htcondorce1.zeuthen.desy.de:9619/grid-htcondor.zeuthen.desy.de-condor,HEPSPEC,20.39
manual_spec2 = grid-htcondorce2.zeuthen.desy.de:9619/grid-htcondor.zeuthen.desy.de-condor,HEPSPEC,20.39
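
The publishing chain then is: the two condor-ce scripts produce the records, apelparser loads them into the accounting DB, apelclient builds the summaries, and ssmsend ships them to the APEL broker. A periodic (e.g. cron-driven) run could look roughly like this (command names from the apel-client/apel-ssm packages; see the twiki linked above for the exact recipe):

# daily accounting run (sketch)
/usr/share/condor-ce/condor_blah.sh
/usr/share/condor-ce/condor_batch.sh
apelparser
apelclient
ssmsend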

Publishing information towards BDII

See BDII integration on how to start a local BDII on the CE and query its information from a GIIS/BDII.


Job Submission

See CondorCE Test Job Submission on how to submit some test jobs directly to your CE.
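
For a quick end-to-end smoke test from a machine with a valid proxy, condor_ce_trace can be used (the hostname is only an example):

# submits a trivial test job through the CE and follows it into the batch system
condor_ce_trace --debug grid-htcondorce1.zeuthen.desy.de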

Attachments:

90_50-ce-apel.conf.erb (application/octet-stream)
90_10-logging.conf.erb (application/octet-stream)
90_03-ce-routes.conf.erb (application/octet-stream)
90_01-ce-condor.conf.erb (application/octet-stream)
90_01-ce-auth.conf.erb (application/octet-stream)
condor_mapfile.erb (application/octet-stream)
90-condor-ce.conf.erb (application/octet-stream)
40_apel_hs06.conf.erb (application/octet-stream)
2019-09-26.htcondorce-bdii-apel.pdf (application/pdf)