Minimal submit node
Have the usual GSI tooling (voms-proxy-init etc.) available, e.g. via CVMFS, and have the HTCondor packages installed. You do not necessarily need the htcondor-ce-client packages, but the CE client package brings tools like condor_ce_trace for debugging. Submission to a Condor CE works via the local scheduler, so most existing Condor submit nodes should already be able to submit to a remote Condor CE. However, remote submit hosts, i.e., nodes that do not run a schedd daemon themselves but submit/forward jobs to a schedd host, probably do not work, as the Grid proxy needs to be forwarded etc.
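For example, on a RHEL-like system with the HTCondor yum repository already configured, the installation could look like the following sketch (package names as in the upstream HTCondor repositories; adapt to your distribution):

[submitnode] /root # yum install -y condor
[submitnode] /root # yum install -y htcondor-ce-client    # optional, brings condor_ce_trace etc.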
Minimal HTCondor configuration in /etc/condor/config.d/ce-submit.conf:
use ROLE: Submit
AUTH_SSL_CLIENT_CADIR = /cvmfs/grid.cern.ch/etc/grid-security/certificates
GSI_DAEMON_TRUSTED_CA_DIR = /cvmfs/grid.cern.ch/etc/grid-security/certificates
Otherwise the Condor client will expect the Grid CA certificates under /etc/grid-security/certificates.
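A quick sanity check that the CVMFS-provided CA directory is actually mounted and visible on the node (just an illustrative listing, any file access will do):

[submitnode] ~ % ls /cvmfs/grid.cern.ch/etc/grid-security/certificates | head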
Start the condor unit and enable it as a service so it comes up again after future reboots:
[submitnode] /root # systemctl start condor.service
[submitnode] /root # systemctl enable condor.service
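To verify that the local daemons came up, you can check the unit state and query the (still empty) local queue, for example:

[submitnode] /root # systemctl status condor.service
[submitnode] ~ % condor_q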
In case you need the Condor-CE debugging tools, install the htcondor-ce-client package. Minimal configuration in /etc/condor-ce/config.d/ce-client.conf:
GSI_DAEMON_TRUSTED_CA_DIR = /cvmfs/grid.cern.ch/etc/grid-security/certificates
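You can verify that the CE client picks up this setting with condor_ce_config_val from the htcondor-ce-client package (assuming the configuration above is in place):

[submitnode] ~ % condor_ce_config_val GSI_DAEMON_TRUSTED_CA_DIR
/cvmfs/grid.cern.ch/etc/grid-security/certificates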
Testing the submission to the CE
Request a proxy. Check that you are using a tool version that works: the HTCondor CE client explicitly pulls in the C++ flavour of the VOMS tools, so you might want to source an environment with the Java version referenced from the Grid (requires a Java JVM installed locally).
[submitnode] ~ % voms-proxy-init -voms dteam
Enter GRID pass phrase:
Your identity: /C=DE/O=GermanGrid/OU=DESY/CN=Andreas Haupt
Creating temporary proxy ............................................................................................... Done
Contacting voms2.hellasgrid.gr:15004 [/C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms2.hellasgrid.gr] "dteam" Done
Creating proxy ................................... Done
Your proxy is valid until Wed Aug 12 03:17:07 2020
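Inspect the resulting proxy to confirm the VO attributes and the remaining lifetime, e.g.:

[submitnode] ~ % voms-proxy-info -all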
Testing the DN mapping and authorization
Make sure that a mapping exists on the CE for your proxy DN, i.e., ping the Condor CE and test the authorization:
[submitnode] ~ % condor_ce_ping -verbose -name grid-htcondorce0.desy.de -pool grid-htcondorce0.desy.de:9619 WRITE
Remote Version: $CondorVersion: 8.9.7 May 19 2020 BuildID: 504263 PackageID: 8.9.7-1 $
Local  Version: $CondorVersion: 8.9.7 May 19 2020 BuildID: 504263 PackageID: 8.9.7-1 $
Session ID: grid-htcondorce0:1834216:1597156581:9167
Instruction: WRITE
Command: 60021
Encryption: none
Integrity: MD5
Authenticated using: GSI
All authentication methods: FS,TOKEN,SCITOKENS,GSI
Remote Mapping: SOMELOCALMAPPEDUSERHERE@users.htcondor.org
Authorized: TRUE
Information about authentication methods that were attempted but failed:
    AUTHENTICATE:1004:Failed to authenticate using SCITOKENS
    AUTHENTICATE:1004:Failed to authenticate using IDTOKENS
    AUTHENTICATE:1004:Failed to authenticate using FS
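If the ping succeeds, you can also query the CE's collector directly with the plain Condor tools, for example (port 9619 as above):

[submitnode] ~ % condor_status -pool grid-htcondorce0.desy.de:9619 -any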
Sending a trace job for debugging
Send a trace job to the CE as a predefined debugging job:
[submitnode] ~ % condor_ce_trace grid-htcondorce0.desy.de
Testing HTCondor-CE authorization...
Verified READ access for collector daemon at <131.169.223.129:9619?addrs=131.169.223.129-9619+[2001-638-700-10df--1-81]-9619&alias=grid-htcondorce0.desy.de&noUDP&sock=collector>
Verified WRITE access for scheduler daemon at <131.169.223.129:9619?addrs=131.169.223.129-9619+[2001-638-700-10df--1-81]-9619&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1834144_00e5>
Submitting job to schedd <131.169.223.129:9619?addrs=131.169.223.129-9619+[2001-638-700-10df--1-81]-9619&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1834144_00e5>
- Successful submission; cluster ID 3263
Resulting job ad:
    [
        ClusterId = 3263;
        [...]
        CommittedSuspensionTime = 0
    ]
Spooling cluster 3263 files to schedd <131.169.223.129:9619?addrs=131.169.223.129-9619+[2001-638-700-10df--1-81]-9619&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1834144_00e5>
- Successful spooling
Job status: Held
Job transitioned from Held to Idle
Job transitioned from Idle to Completed
- Job was successful
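If the trace job gets stuck or fails, condor_ce_trace can be run with its --debug flag for more verbose output:

[submitnode] ~ % condor_ce_trace --debug grid-htcondorce0.desy.de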
If everything works, you should be able to submit a dedicated job to the Condor CE by sending it as a grid-universe job to the local schedd, which forwards it to the CE.
Submitting a real job
The job description file and an executable payload to run:
[submitnode] > cat HTCondorCE.submit
universe = grid
use_x509userproxy = true
#+Owner = undefined
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619

# Files
executable = mypayload.sh
output = stdout
error = stderr
log = logs

# File transfer behavior
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT

# Optional resource requests
#+xcount = 4            # Request 4 cores
#+maxMemory = 4000      # Request 4GB of RAM
#+maxWallTime = 120     # Request 2 hrs of wall clock time
#+remote_queue = "osg"  # Request the OSG queue

# Run job once
queue

[submitnode] > cat mypayload.sh
#!/bin/sh
DATE=$(date +%s)
...
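The payload above is truncated; for illustration, a trivial but complete payload could look like the following hypothetical sketch (not the actual mypayload.sh):

#!/bin/sh
# Hypothetical example payload: print some job context, wait a bit, exit cleanly.
DATE=$(date +%s)
echo "Job started at ${DATE} on $(hostname -f)"
sleep 60
echo "Job finished at $(date +%s)"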
Submit the job to the local scheduler, which should evaluate the 'grid_resource' attribute and contact the CE:
[submitnode] > condor_submit -debug HTCondorCE.submit
Submitting job(s)
08/11/20 16:43:17 Can't open directory "/etc/condor/passwords.d" as PRIV_UNKNOWN, errno: 13 (Permission denied)
08/11/20 16:43:17 Can't open directory "/home/hartmath/.condor/tokens.d" as PRIV_UNKNOWN, errno: 2 (No such file or directory)
.
1 job(s) submitted to cluster 4.
You might see some warnings about missing password/token directories; these should not be critical, since you are not authenticating locally but with the proxy against the remote CE.
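After submission you can follow the job through the local queue; for grid-universe jobs the schedd also tracks the remote state, e.g. (cluster ID 4 taken from the example above):

[submitnode] > condor_q -nobatch
[submitnode] > condor_q 4 -af JobStatus GridJobStatus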