The configuration is split into two parts:
- The base configuration which is static for the whole facility (conf/base_sender.yaml)
- The detector/application specific configuration (conf/datamanager.yaml)
By default, the base config contains settings such as the log file size, the LDAP URI and the HiDRA internal ports.
The specific config settings are applied on top of the base ones: if a parameter is set in both, the value in the specific config overrides the one in the base config. In the same way, parameters given on the command line override both config files.
```
base config --overwritten by-- specific config --overwritten by-- command line arguments
```
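The precedence chain above can be illustrated with a small layered merge. This is a minimal sketch under stated assumptions, not HiDRA's actual loading code: the function name `merge_config` and all values are hypothetical, and the command line arguments are assumed to have been converted into the same nested structure as the config files.

```python
import copy


def merge_config(base, override):
    """Return a new dict in which values from `override` replace those in `base`.

    Nested dictionaries are merged key by key; any other value is replaced.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values, for illustration only
base_config = {"general": {"log_size": 10485760, "com_port": 50000}}
specific_config = {"general": {"log_size": 20971520, "log_name": "datamanager.log"}}
command_line = {"general": {"log_name": "debug.log"}}

# base config, overwritten by specific config, overwritten by command line
effective = merge_config(merge_config(base_config, specific_config), command_line)
print(effective)
```

In this sketch `log_size` ends up with the value from the specific config, `log_name` with the value from the command line, and `com_port` keeps its base value because nothing overrides it.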
Configuration
- conf/datamanager.yaml is used as the default configuration file
  → either modify this file or create a new config file and use --config_file <config_file> when starting the datamanager
- Set up logging:
  - log_path: Path where the log file will be created
  - log_name: Filename used for logging
  - log_size: File size before rollover in bytes (Linux only)
- Set up general settings:
  - username:
    - If systemd is used: user that HiDRA is started and stopped as
  - procname:
    - Name under which the service should be running
  - ext_ip:
    - IP/DNS name of the interface to bind to for external communication
  - ldapuri:
    - LDAP node and port needed to check the whitelist (format: <node>:<port>)
  - com_port:
    - Port number to receive signals from (should be set to 50000; if this is changed, all data receivers have to be adjusted as well)
  - request_port:
    - ZMQ port to get new requests (should be set to 50001; if this is changed, all data receivers have to be adjusted as well)
  - whitelist:
    - List of hosts allowed to connect
      - empty list: nobody is allowed to get data (except data_stream_targets if use_data_stream is enabled)
      - None: everybody gets data
  - For Windows, additional ports have to be configured for internal process communication (on Linux this is done with the IPC protocol):
  - request_fw_port:
    - ZMQ port to forward requests internally (only needed if running on Windows)
  - control_pub_port:
    - ZMQ port to distribute control signals (only needed if running on Windows)
  - control_sub_port:
    - ZMQ port to distribute control signals (only needed if running on Windows)
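The whitelist semantics described above (empty list versus None) are easy to mix up, so here is a minimal sketch of the check. The function name `is_allowed` is hypothetical and not part of HiDRA; the sketch also does not model the LDAP lookup or the exception for data_stream_targets.

```python
def is_allowed(host, whitelist):
    """Decide whether `host` may receive data under the given whitelist.

    None       -> everybody gets data
    empty list -> nobody is allowed
    otherwise  -> only the listed hosts are allowed
    """
    if whitelist is None:
        return True
    return host in whitelist


# Hypothetical host names, for illustration only
print(is_allowed("any-host.desy.de", None))
print(is_allowed("any-host.desy.de", []))
print(is_allowed("data-requester.desy.de", ["data-requester.desy.de"]))
```

Note that in a YAML config an empty list (`[]`) and a missing or null value are different things, which is why the two cases behave so differently.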
- Set up event detector:
  - event_detector_type:
    - Type of event detector to use (options are: inotifyx_events, watchdog_events, zmq_events, http_events)
    - inotifyx is not Python 3 compatible, see https://bugs.launchpad.net/inotifyx/+bug/1006053
  - monitored_dir:
    - Directory to be monitored for changes. Inside this directory only the subdirectories "commissioning", "current" and "local" are monitored (needed if event_detector_type is inotifyx_events or watchdog_events)
  - fix_subdirs:
    - Subdirectories to be monitored and to store data to. These directories have to exist when HiDRA is started and should not be removed during the run (needed if event_detector_type is inotifyx_events or watchdog_events, or if data_fetcher_type is file_fetcher)
  - create_fix_subdirs:
    - Flag describing if the subdirectories should be created if they do not exist. This does not affect http_events, hidra_events and zmq_events since they do not use the monitored_dir variable.
  - monitored_events:
    - Event types (options are: IN_CLOSE_WRITE, IN_MOVED_TO, ...) and, for each of them, the file formats to be monitored; files in any other format are ignored (needed if event_detector_type is inotifyx_events or watchdog_events)
  - history_size:
    - Number of events stored to look for duplicates (needed if event_detector_type is inotifyx_events or http_events)
  - use_cleanup:
    - Flag describing if a clean up thread, which regularly checks if some files were missed, should be activated (needed if event_detector_type is inotifyx_events)
  - action_time:
    - Interval (in seconds) used for clean up or for checking of events (only needed if event_detector_type is inotifyx_events together with use_cleanup enabled, or if event_detector_type is watchdog_events)
  - time_till_closed:
    - Time (in seconds) since the last modification after which a file is considered closed (needed if event_detector_type is inotifyx_events (for clean up) or watchdog_events)
  - ext_data_port:
    - ZMQ port to get incoming data from (needed if event_detector_type is hidra_events)
  - event_det_port:
    - ZMQ port to get events from (needed if event_detector_type is zmq_events)
  - det_ip:
    - IP/DNS name of the detector (needed if event_detector_type is http_events)
  - det_api_version:
    - API version of the detector (needed if event_detector_type is http_events)
  - dirs_not_to_create:
    - Subdirectories which should not be created when the create_fix_subdirs option is enabled
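To make the monitored_events setting more concrete, the following sketch shows how a mapping from event type to file suffixes could be used to decide whether a file event is processed. It is an illustration only: the function name `should_process` is hypothetical and this is not HiDRA's actual event handling.

```python
def should_process(event_type, filename, monitored_events):
    """Return True if `filename` has a suffix monitored for `event_type`."""
    suffixes = monitored_events.get(event_type, [])
    # str.endswith accepts a tuple of suffixes; an empty tuple never matches
    return filename.endswith(tuple(suffixes))


# Same shape as the monitored_events value used in the examples on this page
monitored_events = {
    "IN_MOVED_TO": [".tif", ".cbf"],
    "IN_CLOSE_WRITE": [".log"],
}

print(should_process("IN_MOVED_TO", "scan_0001.cbf", monitored_events))
print(should_process("IN_CLOSE_WRITE", "scan_0001.cbf", monitored_events))
print(should_process("IN_CLOSE_WRITE", "run.log", monitored_events))
```

A .cbf file is only picked up when it is moved into place, not when it is closed after writing, which is the Pilatus pattern described in Example 1.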
- Set up data fetcher:
  - data_fetcher_type:
    - Module with methods specifying how to get the data (options are "file_fetcher", "zmq_fetcher", "http_fetcher")
  - data_fetcher_port:
    - If "zmq_fetcher" is specified as data_fetcher_type, it needs a port to listen to (needed if event_detector_type is zmq_events)
  - use_data_stream:
    - Enable ZMQ pipe into storage system (uses data_stream_targets)
  - data_stream_targets:
    - Fixed host and port to send the data to with highest priority
  - number_of_streams:
    - Number of parallel data streams
  - status_check_resp_port:
    - Test signal port to notify the data sender about problems (needed if event_detector_type is hidra_events)
  - confirmation_resp_port:
    - Confirmation socket to send a confirmation for each data message sent (needed if event_detector_type is hidra_events)
  - chunksize:
    - Chunk size of the file parts sent via ZMQ
  - router_port:
    - ZMQ router port which coordinates the load balancing to the worker processes (needed if running on Windows)
  - status_check_port:
    - Test signal port to check if the data receiver is having problems
  - store_data:
    - Flag describing if the data should be stored in local_target (needed if data_fetcher_type is file_fetcher or http_fetcher)
  - local_target:
    - Target to move the files into (needed if store_data is enabled)
  - remove_data:
    - Flag describing if the files should be removed from the source (needed if data_fetcher_type is file_fetcher or http_fetcher). Options are:
      - False: data stays on the source
      - True: data is removed from the source after processing it
      - with_confirmation: only supported if use_data_stream is enabled; data is removed from the source only after the target sent a verification
  - cleaner_port:
    - ZMQ port to communicate with the cleaner process (needed if running on Windows)
  - cleaner_trigger_port:
    - ZMQ port to communicate with the cleaner process (needed if running on Windows)
  - confirmation_port:
    - Confirmation socket to get a confirmation for each data message sent
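The effect of the chunksize setting can be sketched as follows: a file is read in pieces of at most chunksize bytes, and each piece would then be sent as one message. The generator name `iter_chunks` is hypothetical, the file content is made up, and the actual ZMQ sending is left out, so this is only an illustration of the splitting step.

```python
import os
import tempfile


def iter_chunks(path, chunksize):
    """Yield the content of `path` in pieces of at most `chunksize` bytes."""
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunksize)
            if not chunk:
                break
            yield chunk


# Write a small 25-byte test file (hypothetical content, illustration only)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 25)
    path = tmp.name

sizes = [len(chunk) for chunk in iter_chunks(path, chunksize=10)]
os.remove(path)
print(sizes)  # [10, 10, 5]
```

With the chunksize of 10485760 from the base config, a file smaller than 10 MiB would be sent as a single part.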
Examples:
The base config is the same for all examples:
```yaml
general:
    log_size: 10485760
    com_port: 50000
    ldapuri: ldap_host.desy.de:1234
    request_port: 50001
    # for windows
    request_fw_port: 50002
    control_pub_port: 50005
    control_sub_port: 50006

eventdetector:
    ext_data_port: 50101
    event_det_port: 50003
    dirs_not_to_create: []

datafetcher:
    data_fetcher_port: 50010
    status_check_resp_port: 50011
    confirmation_resp_port: 50012
    chunksize: 10485760
    status_check_port: 50050
    confirmation_port: 50053
    # for windows
    router_port: 50004
    cleaner_port: 50051
    cleaner_trigger_port: 50052
```
Example 1:
- Linux
- Logs should be written to /var/log/hidra
- The detector writes data into a local ramdisk: /ramdisk
- Inside the ramdisk only the directories commissioning, current and local should be monitored; they should be created when HiDRA starts if they do not exist
- Only tif and cbf files should be processed as soon as inotify gives an IN_MOVED_TO event, and all log files on IN_CLOSE_WRITE (this is the pattern of the Pilatus detector)
- Run with Python 2.7
- HiDRA should run with 1 worker process only
- Data should be sent over ZMQ to the node myfixedtarget.desy.de, port 50100
- Data should not be stored locally anywhere else
- Data should be removed from the ramdisk after processing
- No additional data requester should be allowed to connect and receive data
Config:
```yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra
    whitelist: []

eventdetector:
    type: inotifyx_events
    inotifyx_events:
        monitored_dir: /ramdisk
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        create_fix_subdirs: False
        monitored_events: {"IN_MOVED_TO": [".tif", ".cbf", ".nxs"], "IN_CLOSE_WRITE": [".log"]}
        history_size: 0
        use_cleanup: False

datafetcher:
    type: file_fetcher
    use_data_stream: True
    number_of_streams: 1
    data_stream_targets: [["myfixedtarget.desy.de", 50100]]
    store_data: False
    remove_data: True
    file_fetcher: *fix_subdirs
```
Example 2:
- Same as example 1, but now on Windows
- Because there are no on_close events on Windows, a file is considered "closed" if it has not been modified for more than 2 seconds; this check runs every 2 seconds
Config:
```yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra
    whitelist: []

eventdetector:
    type: watchdog_events
    watchdog_events:
        monitored_dir: /ramdisk
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        create_fix_subdirs: False
        monitored_events: {"IN_MOVED_TO": [".tif", ".cbf", ".nxs"], "IN_CLOSE_WRITE": [".log"]}
        action_time: 2
        time_till_closed: 2

datafetcher:
    type: file_fetcher
    use_data_stream: True
    number_of_streams: 1
    data_stream_targets: [["myfixedtarget.desy.de", 50100]]
    store_data: False
    remove_data: True
    file_fetcher: *fix_subdirs
```
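The "closed" heuristic used in Example 2 can be sketched as a comparison between the current time and a file's last modification time. This is only an illustration under stated assumptions: the function name `is_closed` is hypothetical, and the periodic check every action_time seconds is not shown.

```python
import os
import tempfile
import time


def is_closed(path, time_till_closed, now=None):
    """Return True if `path` was last modified more than `time_till_closed` seconds ago."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path) > time_till_closed


# Create an empty test file (illustration only)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

now = time.time()

# Pretend the file was last modified 5 seconds ago
os.utime(path, (now - 5, now - 5))
old_file_closed = is_closed(path, time_till_closed=2, now=now)

# Pretend the file was modified just now
os.utime(path, (now, now))
fresh_file_closed = is_closed(path, time_till_closed=2, now=now)

os.remove(path)
print(old_file_closed, fresh_file_closed)
```

With action_time: 2 and time_till_closed: 2 as in the config above, a check like this would run every 2 seconds and treat a file as closed once it has been untouched for more than 2 seconds.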
Example 3:
- Linux
- Data is fetched from an Eiger detector via HTTP GET
- The DNS name of the Eiger detector is my-eiger.desy.de, where the Eiger filewriter uses API version 1.6.0
- HiDRA should run with 16 worker processes
- The data received from the Eiger should be stored in /storage_system
- After processing the data should be removed from the Eiger
- All connections from data-requester.desy.de are allowed
Config:
```yaml
general:
    log_path: /var/log/hidra
    log_name: datamanager.log
    procname: hidra
    whitelist: ["data-requester.desy.de"]

eventdetector:
    type: http_events
    http_events:
        fix_subdirs: &fix_subdirs ["commissioning", "current", "local"]
        history_size: 0
        det_ip: my-eiger.desy.de
        det_api_version: 1.6.0

datafetcher:
    type: http_fetcher
    use_data_stream: False
    number_of_streams: 16
    store_data: True
    local_target: /storage_system
    remove_data: True
    http_fetcher: *fix_subdirs
```