Summary

Source: https://www.q-chem.com/

License:  commercial https://www.q-chem.com/purchase/license/

Path:  /software/qchem/[variant]

Documentation: https://www.q-chem.com/learn/

Description: Q-Chem is a comprehensive ab initio quantum chemistry software for accurate predictions of molecular structures, reactivities, and vibrational, electronic and NMR spectra.

NOTE: qchem is currently licensed for a single DESY group and accessible only to named members of that group. Please let us know if you'd like to use qchem but are currently not entitled to do so.

Q-Chem is a comprehensive ab initio quantum chemistry program designed to solve computational problems faster, more accurately and less expensively than ever before possible. Q-Chem's capabilities facilitate applications in pharmaceuticals, materials science, biochemistry and other fields. Q-Chem also provides users with the highest level of technical support. Q-Chem was established in 1993, and there have been five major releases to date. (from: https://www.q-chem.com/about/)

Using qchem

qchem is available in two variants, as a shared memory (shm) version and as a cluster/MPI version (see the note below on the MPI implementation). The environment is best set up using the module command:

[max]% module load maxwell qchem/shm                 # shared memory version; there is rarely a need to use it
[max]% module load maxwell qchem/mpi                 # cluster/MPI-enabled version
[max]% module load maxwell qchem                     # the default, identical to the cluster/MPI-enabled version

Note: the modules define QCSCRATCH as

  • /scratch/$USER/qcscratch - for the shm version
  • /beegfs/desy/user/$USER/qcscratch - for the mpi version

Adjust the environment variable if you prefer a different location. Make sure /beegfs/desy/user/$USER exists before launching jobs (running mk-beegfs will do).
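
For example, to create the BeeGFS user directory and point QCSCRATCH at a different location (the directory name my-qcscratch below is just an illustration):

[max]% mk-beegfs                                                  # creates /beegfs/desy/user/$USER if it doesn't exist yet
[max]% module load maxwell qchem/mpi
[max]% export QCSCRATCH=/beegfs/desy/user/$USER/my-qcscratch      # override the default set by the module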

Note: the mpi module uses mpi/mpich-3.2-x86_64 by default. If you prefer another MPI implementation, swap the module, e.g.:

[max]% module load maxwell qchem        
[max]% module unload mpi/mpich-3.2-x86_64 
[max]% module load <your favorite MPI>
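
A quick way to check which modules, and hence which MPI implementation, ended up in the environment:

[max]% module list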

Sample job script

The qchem setup is a bit tricky. Here is a sample script which basically works. Note that, as the test runs below show, there is hardly any benefit in using more than one node.

#!/bin/bash
#SBATCH --time=0-08:00:00
#SBATCH --partition=upex     # change!
#SBATCH --constraint=7542    # change!
#SBATCH --nodes=4            # change!
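#  drop any preloaded libraries which might interfere with the MPI startup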
unset LD_PRELOAD
#
#  qchem setup
#
source /etc/profile.d/modules.sh
module purge
module load maxwell qchem/mpi

#
#  use sample for naming files
#
sample=N2O-mod

#
#  use one mpi-process per node
#  use number of physical cores as number of threads, assuming identical core count across nodes
#
nc=$(( $(nproc) / 2 ))
np=$SLURM_NNODES
nt=$nc

#
#  generate a hostfile. This one uses 1 MPI process per node (:1)
# 
sinfo --noheader -n $SLURM_JOB_NODELIST  -o "%n:1"  > hostfile
export QCMACHINEFILE=$PWD/hostfile
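#  QCMACHINEFILE points the qchem wrapper at the hostfile generated above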

#
#  use 75% of available memory (on the node with lowest memory). Allow 50G for other applications (e.g. GPFS)
#  add memory setting to input file
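#  (mem is extracted in GB; MEM_TOTAL expects MB, so the factor 750 = 0.75 * 1000 applies the 75% and converts GB to MB)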
#
mem=$( scontrol show node $SLURM_JOB_NODELIST | grep -o 'mem=.*G' | cut -d= -f2 | tr -d G | sort -u | head -1)
mem=$(( ($mem - 50) * 750 ))
perl -p -e "s|MEM_TOTAL.*|MEM_TOTAL $mem|" $sample.tmpl > $sample.in

#
#  just to have some info in the slurm-log as well
#
echo "$(date) sample: $sample threads: $nt mpi: $np mem_total: $mem "

s=$(date +%s)
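#  -np sets the number of MPI processes, -nt the threads per process; wrapper output goes to $sample.log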
qchem -mpi -np $np -nt $nt $sample.in $sample.out > $sample.log 2>&1
e=$(date +%s)
elapsed=$(( $e - $s ))

#
#  all done 
#
echo "$(date) sample: $sample threads: $nt mpi: $np elapsed: $elapsed "

exit
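
The script reads the Q-Chem input from a template ($sample.tmpl, here N2O-mod.tmpl), which has to contain a MEM_TOTAL line for the perl substitution to work on. Assuming the script was saved as e.g. qchem-job.sh (the file name is arbitrary), submission works as usual:

[max]% ls
N2O-mod.tmpl  qchem-job.sh
[max]% sbatch qchem-job.sh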

Setting up partitions and constraints

In general you want to use a set of fairly identical nodes for MPI jobs. To find a suitable set of constraints (see the example after this list):

  • use my-partitions to determine which partition to use. Don't use the jhub partition for jobs!
  • use something like sinfo -p <favorite partitions> -o '%n %f %t' to determine what kind of constraints are available in which partition
  • use savail -f <constraint> to show real availability of nodes
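
Putting this together for the partition and constraint used in the sample script above (upex and 7542 are just the values from the script; replace them with whatever matches your access and node type):

[max]% my-partitions                            # partitions you are allowed to submit to
[max]% sinfo -p upex -o '%n %f %t'              # node names, features (constraints) and states
[max]% savail -f 7542                           # actual availability of nodes with that feature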

Test run

Some test runs on 1 to 8 nodes using the MPI version show that, with the above job script (which might not be appropriate), using multiple nodes barely improves performance, if at all. That might well be due to the size of the problem. The jobs do, however, make intense use of the filesystems, and neither BeeGFS nor GPFS is particularly happy with that: the jobs running on GPFS all failed by exceeding the time limit of 8 hours, and some jobs using BeeGFS didn't terminate properly, being unable to close files. Using the INCORE backend, the jobs ran seemingly smoothly and almost a factor of 10 faster.

The resulting wall clock times in seconds, per job step and number of nodes (1, 2, 4, 8), for the three configurations listed below:

                        QCSCRATCH on BeeGFS (1)                 QCSCRATCH on GPFS (scratch) (2)         INCORE option (3)
job step / nodes        1         2         4         8         1         2         4         8         1         2         4         8
SCF                     16.00     15.00     17.00     16.00     18.00     17.00     17.00     18.00     17.00     17.00     15.00     16.00
Import integrals        236.04    239.83    208.27    265.10    109.38    107.11    104.63    105.51    299.44    412.77    207.69    354.21
MP2 amplitudes          3.24      2.84      2.96      2.99      2.89      3.21      2.82      2.83      2.97      2.80      2.91      3.00
CCSD calculation        777.79    766.90    642.72    4313.87   1212.88   1295.29   1576.83   1647.26   248.76    247.57    244.63    248.73
CVS-EOMEE-CCSD          17763.41  17825.04  13405.08  13953.12  -         -         -         -         1696.24   1692.50   1726.34   1751.49
Transition Properties   258.19    149.08    142.63    124.34    -         -         -         -         29.39     27.32     27.55     28.96
Total ccman2 time       19053.61  18997.27  14415.67  18675.25  -         -         -         -         2288.29   2389.93   2216.67   2394.26
Total job time          19076.31  19019.52  14441.59  18700.22  n/a       n/a       n/a       n/a       2313.55   2413.60   2236.64   2416.81

(-: job step not reached; the GPFS runs exceeded the 8 hour time limit before completing these steps)
Configurations

Configuration (1), QCSCRATCH on BeeGFS:
  • 1 MPI process per node
  • 64 threads per process
  • MEM_TOTAL 337500
  • MEM_STATIC 1000
  • CC_MEMORY 30000
  • Default CC_BACKEND

Configuration (2), QCSCRATCH on GPFS (Petra3 scratch):
  • 1 MPI process per node
  • 64 threads per process
  • MEM_TOTAL 337500
  • MEM_STATIC 1000
  • CC_MEMORY 30000
  • Default CC_BACKEND

Configuration (3), QCSCRATCH on BeeGFS with the INCORE backend:
  • 1 MPI process per node
  • 64 threads per process
  • MEM_TOTAL 337500
  • MEM_STATIC 1000
  • CC_MEMORY 160000
  • CC_BACKEND INCORE