Local Environment Tuning
To set up your environment with the DESY-HH specific settings, run
source ${VO_ATLAS_SW_DIR}/local/setup.sh -s DESY-HH
in your sessions and jobs.
Rucio to local paths
Rucio handles datasets somewhat like an object store, so a dataset or file name does not directly give the file path on the local storage. The path has to be translated.
A Rucio dataset consists of a scope and a name, which together give a Data Identifier (DID).
To avoid lumping files into a few heavily used directories, Rucio spreads the files across the storage. Each file is placed in two nested subdirectories whose names are derived from the md5 hash of its DID (the first and second pair of hex characters of the hash of 'scope:name').
Since the mapping is deterministic, the paths can be derived locally from the DIDs.
For example, the file/dataset with
- scope: user.thartman
- name: 2.txt
- on the RSE: DESY-HH_LOCALGROUPDISK
can be found under
/pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/thartman/b4/0d/2.txt
where the two "hash" subdirectories 'b4/0d/' are deterministically derived from the DID.
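Because the mapping is deterministic, it can also be inverted to check whether a given local path is consistent with its DID. The following is a minimal sketch and not part of the Rucio tooling: the function name `naf2rucio` is hypothetical, and it assumes that the path contains a `/rucio/` component and that the file name itself contains no `/`.

```python
import hashlib

def naf2rucio(path):
    """Recover (scope, name) from a local path and verify the hash directories.

    Assumption: everything after '/rucio/' has the layout
    <scope directories>/<hash1>/<hash2>/<name>.
    """
    _, sep, tail = path.partition('/rucio/')
    if not sep:
        raise ValueError('no /rucio/ component in path: %s' % path)
    parts = tail.split('/')
    if len(parts) < 4:
        raise ValueError('path too short for scope, hash dirs and name: %s' % path)
    name = parts[-1]
    h1, h2 = parts[-3], parts[-2]
    # user/group scopes were split on '.', so joining with '.' restores them
    scope = '.'.join(parts[:-3])
    digest = hashlib.md5(('%s:%s' % (scope, name)).encode('utf-8')).hexdigest()
    if (digest[0:2], digest[2:4]) != (h1, h2):
        raise ValueError('hash directories %s/%s do not match DID %s:%s'
                         % (h1, h2, scope, name))
    return scope, name
```

Calling `naf2rucio()` on the example path above returns the scope and name if the hash directories match the DID, and raises a `ValueError` otherwise.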
Helpers
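The following helper snippets in bash and Python translate DIDs to paths locally (thanks to Mario). Note that the bash snippet prints only the part of the path below the 'rucio' directory, while the Python function returns the full path.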
# bash example
scope='user.thartman'
name='2.txt'
echo "/$(echo $scope | sed -E 's/^(user|group)\./\1\//')/$(echo -n $scope:$name | md5sum | sed -E 's/(..)(..).*/\1\/\2/')/$name"
## python example
import hashlib, os

def rucio2naf(scope, name, RSE='localgroupdisk', basePath='/pnfs/desy.de/atlas/dq2/'):
    nameHash = hashlib.md5(('%s:%s' % (scope, name)).encode('utf-8')).hexdigest()
    if scope.startswith('user') or scope.startswith('group'):
        scope = scope.replace('.', '/')
    pfnsPath = '%s/%s/%s/%s' % (scope, nameHash[0:2], nameHash[2:4], name)
    return os.path.join(basePath, 'atlas' + RSE.lower(), 'rucio', pfnsPath)
Symlinks
For the files of a given dataset on a storage element, you can also create symbolic links in your current directory that point to the files in the namespace of the storage element:
rucio list-file-replicas --rse DESY-HH_DATADISK --protocols=root --pfns --link "/pnfs:/pnfs" user.ivukotic:user.ivukotic.xrootd.desy-hh-1M
This creates symlinks to the files of the dataset 'user.ivukotic:user.ivukotic.xrootd.desy-hh-1M'. The links point to the paths under which the storage endpoint 'DESY-HH_DATADISK' sees the files locally, i.e., they only make sense if you have access to the same namespace as the storage endpoint. The protocol 'root' seems to be necessary; with other protocols, nonsensical symlinks to the URLs of the files are created.
Please note that this is not a purely local operation: the command queries the remote Rucio instance.
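If the file DIDs are already known, similar links can be built purely locally from the deterministic mapping, without contacting Rucio. The following is a minimal sketch under stated assumptions: `did_to_path` and `link_dids` are hypothetical names, the default base path and RSE directory are those of the LOCALGROUPDISK example on this page (adjust them for other storage elements), and the list of file DIDs has to be supplied by you, since the paths are derived per file.

```python
import hashlib
import os

def did_to_path(did, rse_dir='atlaslocalgroupdisk', base='/pnfs/desy.de/atlas/dq2/'):
    """Translate a file DID 'scope:name' into its deterministic local path."""
    scope, name = did.split(':', 1)
    digest = hashlib.md5(('%s:%s' % (scope, name)).encode('utf-8')).hexdigest()
    # user and group scopes are spread over subdirectories, e.g. user/thartman
    if scope.startswith('user') or scope.startswith('group'):
        scope = scope.replace('.', '/')
    rel = '%s/%s/%s/%s' % (scope, digest[0:2], digest[2:4], name)
    return os.path.join(base, rse_dir, 'rucio', rel)

def link_dids(dids, target_dir='.', **kwargs):
    """Create one symlink per file DID in target_dir; return the link paths."""
    links = []
    for did in dids:
        src = did_to_path(did, **kwargs)
        dst = os.path.join(target_dir, os.path.basename(src))
        # do not overwrite existing files or links of the same name
        if not os.path.lexists(dst):
            os.symlink(src, dst)
        links.append(dst)
    return links
```

For example, `link_dids(['user.thartman:2.txt'])` creates a link named `2.txt` in the current directory. The sketch does not check that the target exists, so the links are only useful where the /pnfs namespace is mounted and the file really has a replica on that storage element.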