The configuration is split into two parts:

  • The base configuration, which is static for the whole facility (conf/base_sender.yaml)
  • The detector/application-specific configuration (conf/datamanager.yaml)

By default the base config contains settings like the log file size, the LDAP URI, HiDRA-internal ports, etc.

The specific config settings are added on top of the base ones, meaning that if a config parameter is set in both, the specific config overwrites the base config. The same is true for config parameters given on the command line.

 base config --overwritten by-- specific config --overwritten by-- command line arguments
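
For example, if log_size is set in both files, the value from the specific config is used, and a value for log_size given on the command line would in turn overwrite both. A minimal sketch of this precedence (the 5242880 override below is only an illustration, not a value taken from the examples further down):

base_sender.yaml
general:
    log_size: 10485760    # facility-wide default

datamanager.yaml
general:
    log_size: 5242880     # overwrites the value from the base config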

Configuration

  • conf/datamanager.yaml is used as the default configuration file
    → either modify this file or create a new config file and use --config_file <config_file> when starting the datamanager
  • Set up logging:
    • log_path: Path where the log file will be created
    • log_name: Filename used for logging
    • log_size: File size in bytes before rollover (Linux only)

  • Set up general settings:
    • username:
      • If systemd is used: user HiDRA is started and stopped as
    • procname:
      • Name with which the service should be running
    • ext_ip:
      • IP/DNS name of the interface to bind to for external communication
    • ldapuri:
      • LDAP node and port needed to check the whitelist (format: <node>:<port>)
    • com_port:
      • Port number to receive signals from (should be set to 50000; if this is changed, all data receivers have to be adjusted as well)
    • request_port:
      • ZMQ port to get new requests from (should be set to 50001; if this is changed, all data receivers have to be adjusted as well)
    • whitelist:
      • List of hosts allowed to connect (see the sketch after this parameter list)
        • empty list: nobody is allowed to get data (except data_stream_targets if use_data_stream is enabled)
        • None: everybody gets data
    • For Windows, additional ports have to be configured for internal process communication (on Linux this is done via the IPC protocol):
      • request_fw_port:
        • ZMQ port to forward requests internally (needed only if running on Windows)
      • control_pub_port:
        • ZMQ port to distribute control signals (needed only if running on Windows)
      • control_sub_port:
        • ZMQ port to distribute control signals (needed only if running on Windows)
  • Set up event detector:
    • event_detector_type:
      • Module specifying how events are detected (options are "inotifyx_events", "watchdog_events", "http_events", "hidra_events", "zmq_events")
    • monitored_dir:
      • Directory to be monitored for changes. Inside this directory only the subdirectories "commissioning", "current" and "local" are monitored (needed if event_detector_type is inotifyx_events or watchdog_events)
    • fix_subdirs:
      • Subdirectories to be monitored and to store data to. These directories have to exist when HiDRA is started and should not be removed during the run. (needed if event_detector_type is inotifyx_events or watchdog_events, or if data_fetcher_type is file_fetcher)
    • create_fix_subdirs:
      • Flag describing if the subdirectories should be created if they do not exist. This does not affect http_events, hidra_events and zmq_events since they do not use the monitored_dir variable.
    • monitored_events:
      • Event type of files (options are: IN_CLOSE_WRITE, IN_MOVED_TO, ...) and the formats to be monitored; files in other formats will be neglected (needed if event_detector_type is inotifyx_events or watchdog_events)
    • history_size:
      • Number of events stored to look for duplicates (needed if event_detector_type is inotifyx_events or http_events)
    • use_cleanup:
      • Flag describing if a cleanup thread, which regularly checks whether some files were missed, should be activated (needed if event_detector_type is inotifyx_events)
    • action_time:
      • Interval (in seconds) used for the cleanup or the checking of events (only needed if event_detector_type is inotifyx_events with use_cleanup enabled, or if event_detector_type is watchdog_events)
    • time_till_closed:
      • Time (in seconds) since the last modification after which a file is considered closed (needed if event_detector_type is inotifyx_events (for cleanup) or watchdog_events)
    • ext_data_port:
      • ZMQ port to get incoming data from (needed if event_detector_type is hidra_events)
    • event_det_port:
      • ZMQ port to get events from (needed if event_detector_type is zmq_events)
    • det_ip:
      • IP/DNS name of the detector (needed if event_detector_type is http_events)
    • det_api_version:
      • API version of the detector (needed if event_detector_type is http_events)
    • dirs_not_to_create:
      • Subdirectories which should not be created when the create_fix_subdirs option is enabled.

  • Set up data fetcher:
    • data_fetcher_type:
      • Module with methods specifying how to get the data (options are "file_fetcher", "zmq_fetcher", "http_fetcher").
    • data_fetcher_port:
      • If "zmq_fetcher" is specified as data_fetcher_type, it needs a port to listen to (needed if event_detector_type is zmq_events).
    • use_data_stream:
      • Enable ZMQ pipe into the storage system (uses data_stream_targets).
    • data_stream_targets:
      • Fixed host and port to send the data to with highest priority.
    • number_of_streams:
      • Number of parallel data streams.
    • status_check_resp_port:
      • Test signal port to notify the data sender about problems (needed if event_detector_type is hidra_events).
    • confirmation_resp_port:
      • Confirmation socket to send a confirmation for each data message sent (needed if event_detector_type is hidra_events).
    • chunksize:
      • Chunk size of the file parts sent via ZMQ.
    • router_port:
      • ZMQ router port which coordinates the load balancing to the worker processes (needed if running on Windows).
    • status_check_port:
      • Test signal port to check if the data receiver is having problems.
    • store_data:
      • Flag describing if the data should be stored in local_target (needed if data_fetcher_type is file_fetcher or http_fetcher).
    • local_target:
      • Target to move the files into (needed if store_data is enabled).
    • remove_data:
      • Flag describing if the files should be removed from the source (needed if data_fetcher_type is file_fetcher or http_fetcher). Options are:
        • False - data stays on the source
        • True - data is removed from the source after processing it
        • with_confirmation - only supported if use_data_stream is enabled; data is removed from the source only after the target sent a confirmation (see the sketch after this list)
    • cleaner_port:
      • ZMQ port to communicate with the cleaner process (needed if running on Windows).
    • cleaner_trigger_port:
      • ZMQ port to communicate with the cleaner process (needed if running on Windows).
    • confirmation_port:
      • Confirmation socket to get a confirmation for each data message sent.
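
As referenced above, here is a minimal sketch combining an explicit whitelist with remove_data set to with_confirmation, a combination not shown in the worked examples below. The host names are placeholders, not real targets:

datamanager.yaml (sketch only)
general:
    # an empty list [] would allow nobody, None would allow everybody (see above)
    whitelist: ["analysis-node.desy.de"]

datafetcher:
    use_data_stream: True
    data_stream_targets: [["storage-node.desy.de", 50100]]

    # keep files on the source until the target has confirmed each data
    # message (only valid together with use_data_stream)
    remove_data: with_confirmation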

Examples:

The base config is the same for all examples:

base_sender.yaml
general:
    log_size: 10485760
    com_port: 50000
    ldapuri: ldap_host.desy.de:1234
    request_port: 50001

    # for windows
    request_fw_port: 50002
    control_pub_port: 50005
    control_sub_port: 50006

eventdetector:
    ext_data_port: 50101
    event_det_port: 50003
    dirs_not_to_create: []

datafetcher:
    data_fetcher_port: 50010
    status_check_resp_port: 50011
    confirmation_resp_port: 50012
    chunksize: 10485760
    status_check_port: 50050
    confirmation_port: 50053

    # for windows
    router_port: 50004
    cleaner_port: 50051
    cleaner_trigger_port: 50052


Example 1:

  • Linux
  • Logs should be written to /var/log/hidra
  • The detector writes data into a local ramdisk: /ramdisk
  • Inside the ramdisk only the directories commissioning, current, and local should be monitored, and they should be created when HiDRA starts if they do not exist
  • Only tif and cbf file types should be processed as soon as inotify gives an IN_MOVED_TO event, and all log files on IN_CLOSE_WRITE (this is the pattern of the Pilatus detector)
  • Run with Python 2.7
  • HiDRA should run with 1 worker process only
  • Data should be moved over ZMQ to the node myfixedtarget.desy.de on port 50100
  • Data should not be stored locally anywhere else
  • Data should be removed from the ramdisk after processing
  • No additional data requester should be allowed to connect and receive data


The configuration then looks like this:


datamanager.yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra

    whitelist: []

eventdetector:
    type: inotifyx_events

    inotifyx_events:
        monitored_dir: /ramdisk
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        create_fix_subdirs: True
        monitored_events: {"IN_MOVED_TO": [".tif", ".cbf", ".nxs"],
                            "IN_CLOSE_WRITE": [".log"]}

        history_size: 0
        use_cleanup: False

datafetcher:
    type: file_fetcher
    use_data_stream: True
    number_of_streams: 1
    data_stream_targets: [["myfixedtarget.desy.de", 50100]]
    store_data: False
    remove_data: True

    file_fetcher:
        fix_subdirs: *fix_subdirs

Example 2:

  • Same as Example 1, but now on Windows

  • Because there are no on-close events on Windows, a file is considered "closed" if it has not been modified for more than 2 seconds; this check runs every 2 seconds


Config:

datamanager.yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra
    whitelist: []

eventdetector:
    type: watchdog_events

    watchdog_events:
        monitored_dir: /ramdisk
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        create_fix_subdirs: True
        monitored_events: {"IN_MOVED_TO": [".tif", ".cbf", ".nxs"],
                            "IN_CLOSE_WRITE": [".log"]}

        action_time: 2
        time_till_closed: 2

datafetcher:
    type: file_fetcher
    use_data_stream: True
    number_of_streams: 1
    data_stream_targets: [["my_fixed_target.desy.de", 50100]]
    store_data: False
    remove_data: True

    file_fetcher:
        fix_subdirs: *fix_subdirs

Example 3:

  • Linux
  • Data is fetched from an Eiger detector via HTTP GET
  • The DNS name of the Eiger detector is my-eiger.desy.de and its filewriter uses API version 1.6.0
  • HiDRA should run with 16 worker processes
  • The data received from the Eiger should be stored in /storage_system
  • After processing the data should be removed from the Eiger
  • All connections from data-requester.desy.de are allowed


Config:

datamanager.yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra
    whitelist: ["data-requester.desy.de"]

eventdetector:
    type: http_events

    http_events:
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        history_size: 0
        det_ip: my-eiger.desy.de
        det_api_version: 1.6.0

datafetcher:
    type: http_fetcher
    use_data_stream: False
    number_of_streams: 16
    store_data: True
    local_target: /storage_system
    remove_data: True

    http_fetcher:
        fix_subdirs: *fix_subdirs


