
A new analysis resource has been set up for Photon Science, in addition to the already known workgroup server pool p3wgs (formerly known as p3wgs6). The new system is intended for demanding analysis/processing tasks arising from modern detectors and currently offers access to data taken within the new GPFS storage environment at PETRA III.

Summary

  • For login to a CPU workgroup server connect to max-fsc.desy.de. Don't use individual node names.
    • Be aware that you’re not alone on the systems
  • For login to a GPU workgroup server connect to max-fsg.desy.de. Don't use individual node names.
    • Be aware that you’re not alone on the systems
  • If you cannot access the system (not a member of netgroup @hasy-users), please contact fs-ec@desy.de
    Note: Users having a so-called Scientific account (or the registry resource psx) have to access this part of the Maxwell resource by connecting from inside DESY to desy-ps-cpu.desy.de or desy-ps-gpu.desy.de.
    From outside DESY one has to connect (or tunnel) through the firewall desy-ps-ext.desy.de to reach the resources mentioned above.
  • If you are missing software or need support, contact unix@desy.de (with andre.rothkirch@desy.de in cc)
  • dCache instances petra3 and flash1 are mounted on max-fsc/g (/pnfs/desy.de/; no InfiniBand)
  • A GPFS scratch folder is available at /gpfs/petra3/scratch/ (approx. 11 TB, no snapshots, no backup, automatic clean-up [will come]).
    It is scratch space in the classical sense, i.e. for temporary data. This space is used by default as the temporary folder for Gaussian, and you can create a folder for yourself (e.g. named after your account) for other purposes; see the example below this list. It is shared among all users, so you are asked to check your usage (in particular with regard to Gaussian) and to clean up / free scratch disk space when it is no longer needed.
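
A minimal sketch for creating and switching to a personal scratch folder named after your account (the folder name is up to you):

[user@max-p3a001 ~]$ mkdir -p /gpfs/petra3/scratch/$USER
[user@max-p3a001 ~]$ cd /gpfs/petra3/scratch/$USER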

Currently you can use the resource without scheduling (exclusive usage would limit the resource to 11 users in parallel). We will monitor how it goes without such a scheduling measure. In case of ‘crossfire’, i.e. users disturbing each other, we have the option to put a scheduler similar to the one used for HPC or the Maxwell core (i.e. SLURM) in front of the resource; how and in which way would then have to be discussed within Photon Science (e.g. a scheduler only for max-fsg, or for max-fsg and 50% of max-fsc, or ….).


Hardware environment

The photon science (FS) maxwell resource consists currently of 11 nodes:

  • 8 workgroup servers equipped with 32 cores and 512 GB RAM.
    • CPUs are Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
    • check the benchmarks
  • 3 workgroup servers / compute nodes with an additional CUDA card for GPU computing

The workgroup servers are connected via InfiniBand (FDR, 56 Gb/s) to the newly established GPFS storage system for PETRA III and should thus allow e.g. for faster access to and analysis of larger amounts of data.

Personal environment

To access this additional analysis resource, you can use the following aliases for an ssh connection with your DESY account (similar to p3wgs, a load balancer decides on which server you end up):

  • max-fsc.desy.de    to access one of the CPU nodes
  • max-fsg.desy.de    to access one of the nodes with a CUDA card
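
For example, from a Linux machine inside DESY (a minimal sketch; replace <user> with your DESY account name):

ssh <user>@max-fsc.desy.de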

The home directory is not the usual AFS home but resides on network storage (currently hosted on GPFS), located at /home/<user>:

[user@max-p3a001 ~]$ echo $HOME
/home/user

This working directory resides on GPFS, so you will have the same “home” on all workgroup servers belonging to the new resource. Please note that this “working directory” should NOT be considered a replacement for your AFS or Windows home directory. In particular, the home directory on max-fs is NOT backed up!

The usual AFS environment will not be available on max-fs! However, upon login you will obtain an AFS token and a Kerberos ticket:

[user@max-p3a001 ~]$ tokens
Tokens held by the Cache Manager:
User's (AFS ID 3904) tokens for afs@desy.de [Expires Jun 20 09:40]
   --End of list--

[user@max-p3a001 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_9999_4GphrK1aA2
Default principal: user@DESY.DE
Valid starting       Expires              Service principal
06/19/2015 11:01:51  06/20/2015 09:40:08  krbtgt/DESY.DE@DESY.DE
	renew until 06/21/2015 09:40:08
06/19/2015 11:01:51  06/20/2015 09:40:08  afs/desy.de@DESY.DE
	renew until 06/21/2015 09:40:08

Tokens and tickets expire after about 24 h. In contrast to AFS homes, accessing the home directory does not require tokens or tickets, which means that long-running jobs will continue after token expiry; just ensure not to introduce dependencies on your AFS home. If needed, tokens and tickets can be renewed as usual, e.g. with:

[user@max-p3a001 ~]$ k5log -tmp

Your home directory is subject to a non-extendable, hard quota of 20 GB. To check the quota:

[user@max-p3a001 ~]$ mmlsquota max-home
                         Block Limits                                    |     File Limits
Filesystem type             KB      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
max-home   USR           76576          0   10485760          0     none |      520       0        0        0     none core.desy.de


Software environment

The systems are set up with CentOS 7.x (a Red Hat derivative) and provide common software packages as well as some packages of particular interest for photon science (e.g. ccp4 [xds], conuss, fdmnes, phenix or xrt).

An overview of software available can be found here (no confluence account necessary for reading): https://confluence.desy.de/display/IS/Software

Some of the software can be executed directly; some packages are NOT available out of the box and have to be loaded via module before you can use them (similar to the p3wgs6 pool). Which package to load (if necessary), how to load it, and additional comments are given in Confluence as well (see for example the subsection https://confluence.desy.de/display/IS/xds). In case software is missing, please let us know (email to unix@desy.de and, until the end of August, in cc: andre.rothkirch@desy.de).
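
As a sketch of the module mechanism (the module name xds is only an example; check the Confluence software pages for the actual names):

[user@max-p3a001 ~]$ module avail
[user@max-p3a001 ~]$ module load xds
[user@max-p3a001 ~]$ module list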

Because the system is meant for analysing data with demands for parallel computing, no office software like pine, thunderbird, firefox, etc. will be available (such things you can usually find on your office PC or on resources like p3wgs6, PAL or PAUL).

Currently, logging in to the machines requires that your account has the resource (netgroup) “hasy-users”, which most, but not all, FS staff accounts get by default. In case you cannot log into the resource please contact fs-ec@desy.de.

Storage environment

The GPFS core filesystem is mounted on the system as

 /asap3

Below /asap3, you will find several facilities, e.g. /asap3/petra3/gpfs or /asap3/flash/gpfs

Underneath you will find data taken recently at the facilities, sorted into subfolders by beamline with an additional substructure of year/type/tag. For example, data taken in 2015 at beamline p00 during beamtime AppID 12345678 would reside in /asap3/petra3/gpfs/p00/2015/data/12345678/, followed by the directory tree as given/created during the beamtime. The folder logic is the same as during the beamtime, i.e. assuming you have been named a participant of a beamtime and are granted access to the data (controlled by ACLs [access control lists]), you can read data from the subfolder “raw” and store analysis results in “scratch_cc” (temporary/testing results) or “processed” (final results).
For further details concerning the folders and their meaning, please see the subsections in the ASAP3 Confluence Space.
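
As a sketch using the example above (beamline p00, beamtime AppID 12345678): provided you are a registered participant, you can list the beamtime folder and its subfolders; the getfacl command shows the access control entries (depending on how the ACLs are stored, the GPFS command mmgetacl may be needed instead):

[user@max-p3a001 ~]$ ls /asap3/petra3/gpfs/p00/2015/data/12345678/
[user@max-p3a001 ~]$ ls /asap3/petra3/gpfs/p00/2015/data/12345678/raw/
[user@max-p3a001 ~]$ getfacl /asap3/petra3/gpfs/p00/2015/data/12345678/raw/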

The folder exported read-only to the beamline PCs, foreseen for documentation, macros etc., can be found at /asap3/<facility>/gpfs/common/<beamline>.

AFS is fully available on all max-nodes. 

dCache is currently not available on max-fs nodes. If access to dCache hosted data is required, use the P3WGS6 nodes, which mount both dCache and GPFS. 

Remarks

Please note that these servers are currently accessible without a scheduling system as used for HPC. The systems (or a single WGS) are thus resources shared among all of you, and multiple users can be logged in consuming the available RAM and cores in parallel. Please keep this in mind and pay attention to e.g. core consumption and RAM demands when running your own multithreaded jobs (i.e. restrict your jobs so they do not use all cores/RAM available on a single server), and do not try to ‘optimise’ program execution the way you might if a WGS were exclusively yours. You are also advised not to distribute jobs across multiple nodes.
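
A minimal sketch for a multithreaded program that honours the usual OpenMP setting (the value 8 and the program name are only examples; choose a thread count that leaves cores free for other users):

[user@max-p3a001 ~]$ export OMP_NUM_THREADS=8
[user@max-p3a001 ~]$ ./my_analysis_program    # hypothetical multithreaded job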

With respect to the GPU (CUDA) nodes, please keep in mind that the GPU cards currently do not allow jobs from e.g. different users to run in parallel. Therefore do not start GPU processing while the card is already in use; this will lead to crashes. You can check the CUDA status by calling “nvidia-smi” from the shell; among other things it reports on running processes.
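
For example, before starting your own GPU job (the second command is a sketch that, if supported by the installed driver version, lists only the compute processes currently using the card):

[user@max-p3a001 ~]$ nvidia-smi
[user@max-p3a001 ~]$ nvidia-smi --query-compute-apps=pid,process_name --format=csv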

The environment, including the storage system mounts, will evolve further. Next to come: data from external beamtimes carried out by DESY staff will be put onto GPFS, and a “scratch” space will be added to allow processing of previously taken data (to be copied from dCache) or of data not originating from a PETRA beamtime (e.g. simulations/theory).

Though the nodes are all connected via InfiniBand, please do not run any processes across multiple hosts (for example with MPI). Due to current constraints, the workgroup servers are split over two InfiniBand fabrics, which limits the possibilities for multi-host processes.


 
