Overview
The NAF offers several storage systems with different access methods, so that you can choose the most suitable storage for each kind of data (scripts, histograms, large data sets, ...). In turn, this means you need to think about the access pattern of each kind of data you have.
In the table below and in the more extensive descriptions (see links) you can find advice on how best to use the different storage systems.
In general, keep in mind that many small files are bad on some file systems and even worse on others: please try to avoid them by packing log files (stdout, stderr, .aux, etc.) into tarballs, as in the sketch below.
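A minimal sketch, assuming your jobs leave their logs as loose files in the working directory (the file patterns are just examples; adapt them to your job output):

# pack the small log files into a single archive, then remove the loose originals
tar czf logs.tar.gz *.stdout *.stderr *.aux
rm -f *.stdout *.stderr *.aux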
Storage | Brief storage overview | Usage pattern
AFS | is the well-known wide-area file system. It is optimal for small login scripts, possibly some code, and small ntuples. With quota and backup. | DO keep scripts and small important files that need backup. DO NOT have more than 65k entries in a directory.
dCache | is the main (and largest) storage system; it is the entry point for all data and the exchange point with the Grid world. dCache can be accessed via the pNFS mount (/pnfs/desy.de/…) or via the dCap, XRootD, or GridFTP protocols. | DO store analysis output and large files that should stay and possibly be visible from the Grid. DO NOT put small files here.
DUST | is a scratch "playground" area and a cluster file system. With quota, no backup, no snapshots. View your quota and usage in AMFORA or with the CLI (see DUST: Quota reporting and management below). | DO read and write large files with high bandwidth. DO NOT store too many small files (consider tarring directories that contain many small files).
Other | Of course, you also have local disks, mainly as temporary local scratch space. | DO copy data that your job reads over repeatedly to the local disk of the worker node (see the sketch below). DO NOT leave data there, as it will vanish at the end of your job.
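A minimal sketch of the local-disk pattern inside a batch job, assuming the batch system exports a per-job scratch directory in $TMPDIR (a common convention, not confirmed here) and using hypothetical file and program names:

#!/bin/bash
# copy a frequently read input file to the worker node's local scratch once,
# process it there, and copy only the result back to durable storage
cp /nfs/dust/YOUREXP/user/YOURUSER/input.root "$TMPDIR/"
./my_analysis "$TMPDIR/input.root" "$TMPDIR/output.root"   # hypothetical analysis program
cp "$TMPDIR/output.root" /nfs/dust/YOUREXP/user/YOURUSER/
# no cleanup needed: the local scratch is removed when the job ends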
AFS: Renewing Kerberos and AFS credentials
If you are logged in for a long time, your credentials can expire. Check the current Kerberos credentials:
>klist
Ticket cache: FILE:/tmp/krb5cc_14937_Ln2tcr
Default principal: donald@DESY.DE

Valid starting     Expires            Service principal
04/11/14 13:15:11  04/12/14 13:15:11  krbtgt/DESY.DE@DESY.DE
        renew until 04/13/14 13:15:11
And check the current AFS token:
>tokens
Tokens held by the Cache Manager:

Tokens for afs@desy.de [Expires Apr 12 13:15]
   --End of list--
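If a script needs to test whether an AFS token is still held, one possible sketch is to grep the tokens output (the pattern is an assumption based on the output format shown above):

# warn when no AFS token for the desy.de cell is present
if ! tokens | grep -q "Tokens for afs@desy.de"; then
    echo "No valid AFS token; run kinit followed by aklog." >&2
fi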
To renew both on a NAF WGS, please use the following:
kinit
will give you new Kerberos credentials.
aklog
will then take these credentials and give you a new AFS token.
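The two steps are typically run back to back; a minimal sketch:

# renew the Kerberos ticket and, on success, derive a fresh AFS token from it
kinit && aklog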
DUST: Quota reporting and management
DUST directories are limited by a quota. You can either visit AMFORA (only reachable from within the DESY network) or use my-dust-quota on any interactive login node (exception: nafhh-x1|x2):
[myuser]~% my-dust-quota
Fileset Name          Usage (TB)   Limit (TB)   Use (%)   File Usage
user.myexp.myuser     0.043        0.4          10.76     80610
By default, only quotas for your personal DUST directories are displayed. To include quotas for accessible group directories, add the -g flag:
[myuser]~% my-dust-quota -g
Fileset Name          Usage (TB)   Limit (TB)   Use (%)   File Usage
user.myexp.myuser     0.043        0.4          10.76     80610
group.myexp.mygrpdir  12.9         15.0         86.0      878863
The quota information is updated hourly.
If you want to access the raw information, e.g. for more advanced scripts, you can read the raw quota usage file:
cat /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv
cat /nfs/dust/YOUREXP/group/GROUPDIR/.quota-usage.csv
or for simple pretty printing:
column -s ";" -t /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv
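As a sketch of scripted use, assuming the semicolon separator implied by the column command above and a hypothetical field layout (verify the field numbers against the header line of your own .quota-usage.csv):

# print a warning for every fileset above 90% usage; field 4 ("Use (%)") and
# field 1 ("Fileset Name") are assumptions -- check your file's header line
awk -F';' 'NR > 1 && $4+0 > 90 { print "quota warning:", $1, "at", $4 "%" }' \
    /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv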
Users within the DESY network can also use AMFORA; it is also the place where admins change quotas.