Storage on and around the NAF


Overview

The NAF offers different access methods for different kinds of data. Users can choose the most suitable type of storage for each kind of data (scripts, histograms, large data sets, ...). In turn, this means that users need to think about the access pattern of each kind of data.

The table below and the more extensive descriptions (see links) give some advice on how best to use the different storage systems.

In general, keep in mind that many small files are bad on some file systems and even worse on others: please try to avoid them, e.g. by packing log files (stdout, stderr, .aux, etc.) into tarballs, as shown in the sketch below.
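To illustrate, a minimal sketch (the directory name joblogs/ is just a hypothetical placeholder):

# pack a directory full of small log files into one compressed tarball
# and remove the originals (joblogs/ is an example directory, not a NAF path)
> tar czf joblogs.tar.gz joblogs/ && rm -rf joblogs/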



Brief storage overview and usage patterns

AFS is the well-known wide-area file system. It is optimal for holding small login scripts, possibly some code and small ntuples. With quota and backup.

DO keep scripts and small important files that need backup
DO NOT have more than 65k entries in a directory

dCache is the main (and largest) storage system; it is the entry point for all data and the exchange point with the Grid world. dCache can be accessed using the pNFS mount (/pnfs/desy.de/…) or using the dCap, XROOTD or GridFTP protocols (see the example after this overview).

DO store analysis output and large files which should stay and possibly be visible from the Grid
DO NOT put small files here

DUST is a scratch "playground" area and a cluster file system. With quota, no backup, no snapshots. View your quota and usage at AMFORA or use the CLI (see below: DUST: Quota reporting and management).

DO read and write large files with high bandwidth
DO NOT store too many small files (you might want to tar directories that contain many small files)

Other: of course, there are also local disks, mainly as temporary local scratch area.

DO copy data to the local disk of the worker node (WN) if a job needs to read it over and over again
DO NOT leave data there, as it will vanish at the end of your job
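As a hedged illustration of the dCache access methods listed above (the file paths and the door hostname are hypothetical placeholders, not actual NAF endpoints):

# read a file through the pNFS mount (placeholder path)
> cp /pnfs/desy.de/myexp/mydataset/file.root /tmp/

# fetch the same file via the XROOTD protocol with xrdcp;
# "dcache-door.desy.de" is a hypothetical door hostname
> xrdcp root://dcache-door.desy.de//pnfs/desy.de/myexp/mydataset/file.root /tmp/file.root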


AFS: Renewing Kerberos and AFS credentials

If you stay logged in for a long time, your credentials can expire. Check your current Kerberos credentials with klist:

>klist
Ticket cache: FILE:/tmp/krb5cc_14937_Ln2tcr
Default principal: donald@DESY.DE
 
Valid starting     Expires            Service principal
04/11/14 13:15:11  04/12/14 13:15:11  krbtgt/DESY.DE@DESY.DE
	renew until 04/13/14 13:15:11

And check the current AFS token:

>tokens
 
Tokens held by the Cache Manager:
 
Tokens for afs@desy.de [Expires Apr 12 13:15]
   --End of list--

To renew both on a NAF WGS, run kinit followed by aklog: kinit will give you new Kerberos credentials, and aklog will then take these credentials and obtain a new AFS token.
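In the simplest case both commands need no arguments (kinit will prompt for your password):

# renew Kerberos credentials, then derive a fresh AFS token from them
> kinit
> aklog

Afterwards, klist and tokens should show the new expiry times.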

DUST: Quota reporting and management

DUST directories are limited by a quota. You can either visit AMFORA (only reachable from within the DESY network) or use my-dust-quota on any interactive login node (exception: nafhh-x1|x2):

[myuser]~% my-dust-quota
Fileset Name       Usage (TB)  Limit (TB)  Use (%)  File Usage
user.myexp.myuser  0.043       0.4         10.76    80610

By default, only quotas for your personal DUST directories are displayed. To include quotas for accessible group directories, add the -g flag:

[myuser]~% my-dust-quota -g
Fileset Name          Usage (TB)  Limit (TB)  Use (%)  File Usage
user.myexp.myuser     0.043       0.4         10.76    80610
group.myexp.mygrpdir  12.9        15.0        86.0     878863

The quota information is updated hourly.

If you want the raw information, e.g. for advanced scripts, you can read the quota file directly:

cat /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv
cat /nfs/dust/YOUREXP/group/GROUPDIR/.quota-usage.csv

or for simple pretty printing:

column -s ";" -t /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv 
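For such scripts, here is an illustrative sketch (the column layout of .quota-usage.csv is an assumption inferred from the my-dust-quota output above, not a documented format):

# warn about filesets above 90% quota usage;
# ASSUMPTION: a header row and semicolon-separated columns in the order
# fileset;usage_tb;limit_tb;use_pct;file_usage (as in the my-dust-quota output)
> awk -F';' 'NR > 1 && $4 + 0 > 90 { print $1, "is at", $4 "% of its quota" }' \
      /nfs/dust/YOUREXP/user/YOURUSER/.quota-usage.csv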

Users within the DESY network can also use AMFORA; this is also the place for admins to change quotas.