Computing : ATLAS@NAF

Local Environment Tuning

To set up your environment to use DESY-HH specifics, run

 source ${VO_ATLAS_SW_DIR}/local/setup.sh -s DESY-HH 

in your sessions and jobs.

Rucio to local paths

Rucio handles its data much like an object store, so a dataset or file name has to be translated into the actual file path on the local storage.

A Rucio dataset or file is identified by a scope and a name, which together form its Data Identifier (DID), written as scope:name.
To avoid lumping files into a few heavily used directories, Rucio spreads them across the storage: each file is placed under two subdirectories that are derived from the md5 hash of its DID.

Since the mapping is deterministic, the local paths can be derived from the DIDs without querying Rucio.


For example, the file with

  • scope : user.thartman
  • name :  2.txt
  • on the RSE: DESY-HH_LOCALGROUPDISK

can be found under
  /pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/thartman/b4/0d/2.txt

where the two "hash" subdirectories 'b4/0d/' are deterministically derived from the DID 'user.thartman:2.txt'.

Helpers

Helper scripts/methods in bash or Python that can be used to translate DIDs to paths locally (thanks to Mario):

Resolving Rucio DIDs to local
# bash example 
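scope='user.thartman'
name='2.txt'
# prints the path below the RSE's rucio/ directory: /<scope dirs>/<md5[0:2]>/<md5[2:4]>/<name>
# only user* and group* scopes are split into subdirectories ('.' -> '/')
scope_dir=$(printf '%s' "$scope" | sed -E '/^(user|group)/ s#\.#/#g')
hash_dirs=$(printf '%s' "$scope:$name" | md5sum | sed -E 's#^(..)(..).*#\1/\2#')
echo "/$scope_dir/$hash_dirs/$name"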


Rucio DIDs to local in Python
## python example

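import hashlib
import os

def rucio2naf(scope, name, RSE='localgroupdisk', basePath='/pnfs/desy.de/atlas/dq2/'):
    """Return the local path of the file scope:name.

    RSE is the suffix of the storage directory, e.g. 'localgroupdisk' for .../atlaslocalgroupdisk/.
    """
    # the two hash directories are the first two character pairs of md5('scope:name')
    nameHash = hashlib.md5(('%s:%s' % (scope, name)).encode('utf-8')).hexdigest()
    # only user* and group* scopes are split into subdirectories
    if scope.startswith('user') or scope.startswith('group'):
        scope = scope.replace('.', '/')
    pfnsPath = '%s/%s/%s/%s' % (scope, nameHash[0:2], nameHash[2:4], name)
    return os.path.join(basePath, 'atlas' + RSE.lower(), 'rucio', pfnsPath)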
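
The mapping can also be read backwards, which may be handy as a sanity check for a path found on disk. The following is only a minimal sketch and not part of the helpers above: the function name naf2rucio is hypothetical, and it assumes that the path contains a 'rucio' directory followed by the scope directories, the two hash directories and the file name.

```python
import hashlib

def naf2rucio(path):
    """Hypothetical inverse mapping: return (scope, name) for a local path.

    Raises ValueError if the path has no 'rucio' component, is too short,
    or if its two hash directories do not match md5('scope:name').
    """
    parts = path.strip('/').split('/')
    # everything after the 'rucio' component is <scope dirs>/<h1>/<h2>/<name>
    tail = parts[parts.index('rucio') + 1:]
    if len(tail) < 4:
        raise ValueError('path too short: %s' % path)
    scope = '.'.join(tail[:-3])  # e.g. user/thartman -> user.thartman
    h1, h2, name = tail[-3:]
    digest = hashlib.md5(('%s:%s' % (scope, name)).encode('utf-8')).hexdigest()
    if (h1, h2) != (digest[0:2], digest[2:4]):
        raise ValueError('hash directories do not match the DID: %s' % path)
    return scope, name
```

Called with a path like the example above, the function returns the scope and the name as a tuple, provided the hash directories are consistent with the DID.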


For the files of a given dataset on a storage element, you can also create, in your current directory, a set of symbolic links to the files in the namespace of the storage element:

rucio list-file-replicas --rse DESY-HH_DATADISK --protocols=root --pfns --link "/pnfs:/pnfs" user.ivukotic:user.ivukotic.xrootd.desy-hh-1M

This creates symlinks to the files of the dataset 'user.ivukotic:user.ivukotic.xrootd.desy-hh-1M'. The links point to the paths under which the storage endpoint 'DESY-HH_DATADISK' sees the files locally, i.e., they only make sense if you have access to the same namespace as the storage endpoint. The protocol 'root' seems to be necessary, as with other protocols nonsensical symlinks to the URLs of the files are created.

Please note that this is not a local operation: the command queries the remote Rucio instance.
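
If the DIDs of the individual files are already known, similar links could be built purely locally from the deterministic mapping. The following is a rough sketch under stated assumptions, not an established tool: the function name link_dids and its parameters are hypothetical, the defaults are taken from the DESY-HH_LOCALGROUPDISK example path above, and the code neither resolves dataset DIDs into their files nor checks that a replica actually exists at the target path.

```python
import hashlib
import os

def link_dids(dids, link_dir='.', rse_dir='atlaslocalgroupdisk',
              base_path='/pnfs/desy.de/atlas/dq2'):
    """Hypothetical helper: symlink file DIDs ('scope:name') into link_dir.

    Returns the list of links that were newly created.
    """
    created = []
    for did in dids:
        scope, sep, name = did.partition(':')
        if not (scope and sep and name):
            raise ValueError("expected 'scope:name', got %r" % did)
        # hash directories: first two character pairs of md5('scope:name')
        digest = hashlib.md5(did.encode('utf-8')).hexdigest()
        # only user* and group* scopes are split into subdirectories
        if scope.startswith(('user', 'group')):
            scope = scope.replace('.', '/')
        target = '/'.join([base_path.rstrip('/'), rse_dir, 'rucio',
                           scope, digest[0:2], digest[2:4], name])
        link = os.path.join(link_dir, name)
        if not os.path.lexists(link):  # leave existing entries untouched
            os.symlink(target, link)
            created.append(link)
    return created
```

As with the links created by Rucio, the result is only useful on hosts that mount the same namespace as the storage endpoint.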