
The HTCondor batch system currently consists of the following components:

Master Nodes

The master nodes, condor01 and condor02, host the HTCondor daemons schedd, defrag, negotiator, master, gangliad and collector.

[details see: Master Node Setup in HTCondor]

Worker Nodes

About 180 worker nodes with ~7200 HT-cores in total, each running the local daemons startd and master.

[details see: Worker Node Setup in HTCondor]

Submit Hosts

Hosts for job submission. Grid jobs are submitted through the ARC CE nodes, local jobs through the NAF workgroup servers.

ARC CE nodes

Two of them are load-balanced behind grid-arcce.desy.de. They run the daemons schedd and master.

[details see: ARC CE Setup for HTCondor]

NAF Workgroupservers

Used for local submissions. They run the daemons schedd and master.

[details see: Submit Host in HTCondor]
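The daemon layout described above can be summarized per node role. The following is a minimal sketch, not the actual DESY configuration: it maps each role to the daemons listed on this page and prints a matching DAEMON_LIST line, the HTCondor configuration macro that selects which daemons condor_master starts. The role names ("master", "worker", "submit") and the helper function are assumptions made for illustration. The real settings are described on the linked setup pages and may differ.

```python
# Daemons per node role, taken from the overview on this page.
# The role keys are hypothetical labels, not HTCondor terms.
ROLE_DAEMONS = {
    "master": ["master", "collector", "negotiator", "schedd", "defrag", "gangliad"],
    "worker": ["master", "startd"],
    "submit": ["master", "schedd"],  # ARC CE nodes and NAF workgroup servers
}


def daemon_list_line(role: str) -> str:
    """Return a DAEMON_LIST configuration line for the given role."""
    daemons = ROLE_DAEMONS[role]
    # DAEMON_LIST entries are conventionally written in upper case.
    return "DAEMON_LIST = " + ", ".join(d.upper() for d in daemons)


if __name__ == "__main__":
    for role in ROLE_DAEMONS:
        print(f"# {role} node")
        print(daemon_list_line(role))
```

Every role includes the master daemon, because condor_master starts and supervises the other daemons on a node.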

Monitoring

DESY uses different tools for basic system and hardware monitoring and for monitoring the cluster status.

InfluxDB/Grafana

The HTCondor cluster and job status are monitored in InfluxDB and visualized with Grafana.

[details see: Job/Cluster Monitoring]
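To illustrate how a job status sample could reach InfluxDB, the sketch below formats job counts in InfluxDB line protocol, the text format InfluxDB accepts for writes. This page does not say how the data is actually collected or written, so everything here is an assumption: the measurement name condor_jobs, the tag schedd, the field names and the helper function are all hypothetical. Only integer fields are handled and escaping of special characters is omitted.

```python
def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one sample as an InfluxDB line protocol record.

    Sketch only: integer fields only, no escaping of spaces or commas.
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    # Integer field values carry an 'i' suffix in line protocol.
    field_part = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    # Hypothetical sample: job counts as seen by one schedd.
    line = to_line_protocol(
        "condor_jobs",
        {"schedd": "example-schedd"},
        {"running": 10, "idle": 5},
        1700000000000000000,
    )
    print(line)
    # prints: condor_jobs,schedd=example-schedd idle=5i,running=10i 1700000000000000000
```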

Icinga

Node states and system states are monitored in Icinga

[details see: Infrastructure Monitoring]

HTCondor Documentation
ARC CE
GridPP

GridPP has a very detailed description of their setup(s), from which we have benefited significantly.

Mailing Lists

 
