Draining a CondorCE
Submission of new jobs should be prevented with
MAX_JOBS_PER_OWNER=0
MAX_JOBS_SUBMITTED=0
MAX_JOBS_PER_SUBMISSION=0
Setting MAX_JOBS_RUNNING is dangerous, as it also affects running jobs.
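As a minimal sketch, the three knobs above could be placed in a drop-in file so that the drain is easy to revert by deleting the file again. The directory `/etc/condor-ce/config.d` as default, the file name `99-drain.conf` and the helper name `write_drain_config` are assumptions for illustration and not part of the original notes; adjust them to the local setup.

```shell
#!/bin/sh
# Sketch: write the drain knobs into a config drop-in file.
# Directory default and file name are assumptions, not site conventions.
write_drain_config() {
    conf_dir="${1:-/etc/condor-ce/config.d}"
    conf_file="$conf_dir/99-drain.conf"   # hypothetical file name
    cat > "$conf_file" <<'EOF'
# Refuse new submissions while draining.
MAX_JOBS_PER_OWNER = 0
MAX_JOBS_SUBMITTED = 0
MAX_JOBS_PER_SUBMISSION = 0
EOF
    # Print the path so the caller can log or remove the file later.
    echo "$conf_file"
}

# Usage (as root on the CE), followed by a config re-read:
#   write_drain_config && condor_ce_reconfig
```

Whether `condor_ce_reconfig` alone is enough for the new limits to take effect is an assumption here; if the limits are not picked up, a restart of the CE may be needed.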
ATLAS
For ATLAS, detach the CE from the PQ (Panda Queue) before starting to drain. Otherwise the factory may still have jobs targeted at the CE, and Harvester does not switch over solely to another working CE.
Update the associated queues for the Panda Queues under: https://atlas-cric.cern.ch/atlas/pandaqueue/detail/DESY-HH/
- site note: Services in the ATLAS CRIC https://atlas-cric.cern.ch/core/service/list/
- site note: CEs in ATLAS CRIC under: https://atlas-cric.cern.ch/core/cequeue/list/
CondorCE service unit fails to restart the daemons
Symptoms
Reloading or restarting condor-ce through the systemd service unit fails because the condor user cannot communicate with the daemons to send them commands.
> systemctl restart condor-ce.service
~~> journalctl
-- Unit condor-ce.service has finished shutting down.
Mar 15 15:11:47 grid-htcondorce-dev.desy.de systemd[1]: Unit condor-ce.service entered failed state.
Mar 15 15:11:47 grid-htcondorce-dev.desy.de systemd[1]: condor-ce.service failed.
Mar 15 15:11:47 grid-htcondorce-dev.desy.de systemd[1]: Starting HTCondor CE...
-- Subject: Unit condor-ce.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Running the unit's ExecReload command manually fails as well:
> /usr/bin/condor_ce_restart
ERROR
SECMAN:2010:Received "DENIED" from server for user condor@users.htcondor.org using method FS.
Can't send Restart command to local master
Solution
The user is probably not mapped correctly and is therefore not authorized, from the CondorCE's point of view, to send commands to the other daemons (although the daemons do seem to get killed for good at some point anyway). Check that `/etc/condor-ce/condor_mapfile` contains a mapping from the root and condor users to the right role:
> tail -n 4 /etc/condor-ce/condor_mapfile
GSI "(/CN=[-.A-Za-z0-9/= ]+)" \1@unmapped.htcondor.org
CLAIMTOBE .* anonymous@claimtobe
FS "^(root|condor)$" \1@daemon.htcondor.org
FS (.*) \1
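
The check can also be scripted. The following is a minimal sketch that only tests whether the FS line mapping root and condor to daemon.htcondor.org, as shown in the excerpt above, is present in the mapfile. The function name `has_daemon_fs_mapping` is hypothetical, and the sketch does not verify that the line comes before the catch-all `FS (.*) \1` entry, which should still be checked by eye.

```shell
#!/bin/sh
# Sketch: succeed if the mapfile contains the FS mapping of the local
# root and condor users to the daemon identity.
# The function name is hypothetical; the default path is taken from the notes.
has_daemon_fs_mapping() {
    mapfile="${1:-/etc/condor-ce/condor_mapfile}"
    # Match the literal line: FS "^(root|condor)$" \1@daemon.htcondor.org
    grep -Eq '^FS[[:space:]]+"\^\(root\|condor\)\$"[[:space:]]+\\1@daemon\.htcondor\.org' "$mapfile"
}

# Usage:
#   has_daemon_fs_mapping || echo "FS daemon mapping missing in mapfile"
```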