Grid : Parsing Condor Event Logs as XML with Logstash

See also https://gitlab.desy.de/thomas.hartmann/condor-xml-logstash (there is probably an earlier git repo somewhere).


Condor can be told to write all its daemons' events to an event log, which is easier to parse by machine than the 'normal' logs.

Condor Event Log config
EVENT_LOG = /var/log/condor/EventLog.xml
# EVENT_LOG = /var/log/condor-ce/EventLog.xml
EVENT_LOG_MAX_SIZE = 500000000
EVENT_LOG_MAX_ROTATIONS = 4
EVENT_LOG_JOB_AD_INFORMATION_ATTRS = x509UserProxyEmail, x509UserProxyFQAN, x509UserProxyVOName
EVENT_LOG_USE_XML = True

With this configuration, the daemons' events are written as XML (XML is only an optional output format, but we used it here for our education). As the example below shows, each event is wrapped in a <c> tag and each attribute in an <a> tag, whose n attribute holds the actual key name. The value itself is wrapped in a type tag: <i> for integer, <s> for string, <r> for real (floating point). Booleans are expected as <b v="t"/> or <b v="f"/>, which is the form the Ruby filter further down checks for.

Condor XML Event Log Example
<c>
    <a n="Proc"><i>0</i></a>
    <a n="Cluster"><i>64871</i></a>
    <a n="EventTime"><s>2020-09-17T16:40:17.845</s></a>
    <a n="MyType"><s>ExecuteEvent</s></a>
    <a n="ExecuteHost"><s>79660.0</s></a>
root@grid-htcondorce0: [~/bin] head -n50 /var/log/condor-ce/EventLog
<c>
    <a n="Info"><s>Global JobLog: ctime=1600353617 id=0.1158487.1600353617.819171.1.1600353617.820011 sequence=1 size=0 events=0 offset=0 event_off=0 max_rotation=4 creator_name=<>                                                                                               </s></a>
    <a n="EventTime"><s>2020-09-17T16:40:17.820</s></a>
    <a n="MyType"><s>GenericEvent</s></a>
    <a n="EventTypeNumber"><i>8</i></a>
</c>
<c>
    <a n="Proc"><i>0</i></a>
    <a n="Cluster"><i>64880</i></a>
    <a n="EventTime"><s>2020-09-17T16:40:17.819</s></a>
    <a n="MyType"><s>ExecuteEvent</s></a>
    <a n="ExecuteHost"><s>79670.0</s></a>
    <a n="Subproc"><i>0</i></a>
    <a n="EventTypeNumber"><i>1</i></a>
</c>
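
To make the mapping from type tags to values concrete, here is a minimal Ruby sketch that turns one event into a flat hash. It is only a quick check and not part of the Logstash setup: the names ATTR_LINE, parse_event and SAMPLE are made up for this illustration, and the regular expression assumes the one-attribute-per-line layout shown above rather than being a general XML parser.

```ruby
# Quick check only: relies on the one-attribute-per-line layout of the
# event log shown above; this is not a general XML parser.
ATTR_LINE = %r{<a n="([^"]+)"><([isr])>(.*)</\2></a>}

# Turn the text of one <c>...</c> event into a flat Hash of key => value.
def parse_event(xml_text)
  xml_text.each_line.with_object({}) do |line, event|
    match = ATTR_LINE.match(line)
    next unless match              # skips <c>, </c> and anything unexpected

    name, tag, raw = match.captures
    event[name] = case tag
                  when "i" then raw.to_i   # <i> integer
                  when "r" then raw.to_f   # <r> real
                  else raw                 # <s> string, kept as-is
                  end
  end
end

# Sample event copied from the log excerpt above (shortened).
SAMPLE = <<~XML
  <c>
      <a n="Proc"><i>0</i></a>
      <a n="Cluster"><i>64880</i></a>
      <a n="EventTime"><s>2020-09-17T16:40:17.819</s></a>
      <a n="MyType"><s>ExecuteEvent</s></a>
      <a n="EventTypeNumber"><i>1</i></a>
  </c>
XML

p parse_event(SAMPLE)
```

The resulting hash has the attribute names as keys, with Cluster and EventTypeNumber as integers and the remaining values as strings, which is the shape we want the Logstash event to have in the end.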

To turn the attributes into keys of the resulting Logstash event JSON and to strip the type tags from the values, a bit of Ruby magic is needed in the filter section. The multiline codec first collects the lines of one event, from <c> to </c>, into a single message. The xml filter parses that message into the xmlparse field, and the ruby filter then walks through it and sets each attribute as a new key/value on the event. Once Ruby is done, the xmlparse field is removed.
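
How the lines of the log are grouped into events can be sketched in plain Ruby. This only imitates what the multiline codec settings used in the config below (pattern "<c>", negate true, what "previous") are meant to do; group_events is a hypothetical helper for illustration, not something Logstash provides.

```ruby
# Imitates: pattern => "<c>", negate => "true", what => "previous".
# A line containing <c> starts a new event; every other line is
# appended to the event that is currently being collected.
def group_events(lines)
  events = []
  lines.each do |line|
    if line.include?("<c>") || events.empty?
      events << [line]
    else
      events.last << line
    end
  end
  events.map { |chunk| chunk.join("\n") }
end

lines = [
  "<c>",
  '    <a n="Proc"><i>0</i></a>',
  "</c>",
  "<c>",
  '    <a n="Proc"><i>1</i></a>',
  "</c>"
]

group_events(lines).each { |event| puts event, "---" }
```

One consequence of this grouping, as far as we understand the codec: an event only counts as complete once the next <c> line shows up, so the most recent event in the log may stay buffered for a while. The codec's auto_flush_interval option is probably worth a look if that matters.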


Logstash Pipeline Example
input {
  file {
    path => "/var/log/condor-ce/EventLog"
    start_position => "beginning"
    sincedb_path => "/var/log/condor-ce/.EventLog.sincedb"
    exclude => "*.gz"
    type => "xml"
    codec => multiline {
      pattern => "<c>"
      negate => "true"
      what => "previous"
    }
  }
}
filter{
    xml{
        source => "message"
        target => "xmlparse"
        force_array => false
#        store_xml => true
        namespaces => {
          "xsl" => "http://www.w3.org/1999/XSL/Transform"
          "xhtml" => "http://www.w3.org/1999/xhtml"
        }
#        add_tag => [ "xmltag" ]
    }
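    ruby {
        # Copy every <a n="key"> entry of the parsed event to a top-level field.
        code => '
            e = event.get("xmlparse")
            if e.is_a? Hash
                attrs = e["a"]
                # with force_array => false a lone <a> arrives as a Hash, not an Array
                attrs = [attrs] unless attrs.is_a? Array
                attrs.each { |x|
                    next unless x.is_a?(Hash) && x["n"]
                    key = x["n"]
                    value = nil
                    if x["s"]
                        value = x["s"]
                    elsif x["i"]
                        value = x["i"].to_i
                    elsif x["r"]
                        value = x["r"].to_f
                    elsif x["b"].is_a? Hash
                        # booleans are expected as <b v="t"/> or <b v="f"/>
                        value = (x["b"]["v"] == "t")
                    end
                    event.set(key, value)
                }
            end
        '
#        add_tag => [ "rubytag" ]
        add_tag => [ "condorce","eventlog","grid" ]
        remove_field => [ "xmlparse" ]
    }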
}

output {
  stdout{
    codec => "json"
  }
  file {
    path => "/tmp/logstash.eventxml.debug"
    codec => "json_lines"
  }
}
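
The conversion done in the ruby filter can also be tried outside Logstash. The sketch below is a standalone variant for experimenting, not the code Logstash runs: flatten_attrs is a made-up name, and the input hashes are our assumption of what the xml filter hands over with force_array => false (an "a" key holding an array of hashes, or a single hash when the event has only one attribute).

```ruby
# Standalone variant of the ruby filter logic, for experimenting.
# ASSUMPTION: `parsed` has the shape the xml filter is expected to write
# to the xmlparse field when force_array => false is set.
def flatten_attrs(parsed)
  return {} unless parsed.is_a?(Hash)

  attrs = parsed["a"]
  attrs = [attrs] unless attrs.is_a?(Array)   # a lone <a> is not wrapped in an array

  attrs.each_with_object({}) do |x, out|
    next unless x.is_a?(Hash) && x["n"]

    value = nil
    if x["s"]
      value = x["s"]                 # string, kept as-is
    elsif x["i"]
      value = x["i"].to_i            # integer
    elsif x["r"]
      value = x["r"].to_f            # real
    elsif x["b"].is_a?(Hash)
      value = (x["b"]["v"] == "t")   # boolean given as <b v="t"/>
    end
    out[x["n"]] = value
  end
end

many = { "a" => [{ "n" => "Proc", "i" => "0" },
                 { "n" => "MyType", "s" => "ExecuteEvent" }] }
one  = { "a" => { "n" => "Proc", "i" => "0" } }

p flatten_attrs(many)
p flatten_attrs(one)
```

The second call covers the case of an event with a single attribute, where iterating directly over the hash instead of wrapping it in an array would fail.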

condor_eventxml.logstash

