This page briefly describes a local installation of AlphaFold 2.2.2 (no Docker or Singularity involved).
Running AlphaFold 2.2.2 (no container)
- Create a batch script; a sample is pasted below. Customize it to use the proper partitions and limits.
- Use /software/alphafold/2.2.2L/alphafold.sh or customize it according to your needs.
- For multimers: use AF_preset=multimer ...; the default is monomer.
- For multimers: each monomer has to be a separate entry with its full sequence in the FASTA file, even if all monomers are identical.
- Almost all parameters can be customized; see the table below for details.
- Submit the job: sbatch <your-alphafold-script>
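The preset rule from the list above can be sketched as a small pre-submission check. count_fasta_entries and suggest_preset are hypothetical helpers, not part of the installation, and the sketch assumes that a FASTA file with exactly one entry is meant to be run as a monomer:

```shell
# Hypothetical helpers (not shipped with the installation).

# Count the header lines (those starting with '>') in a FASTA file.
count_fasta_entries() {
    grep -c '^>' "$1"
}

# Print "multimer" for more than one entry, otherwise "monomer" (the default).
suggest_preset() {
    if [ "$(count_fasta_entries "$1")" -gt 1 ]; then
        echo multimer
    else
        echo monomer
    fi
}

# Demo with a throw-away two-chain FASTA file (placeholder sequences).
demo_fasta=$(mktemp)
printf '>chain_A\nMKT\n>chain_B\nMKT\n' > "$demo_fasta"
echo "entries: $(count_fasta_entries "$demo_fasta"), preset: $(suggest_preset "$demo_fasta")"
rm -f "$demo_fasta"
```

In a batch script the result could be exported before alphafold.sh is called, for example export AF_preset=$(suggest_preset my.fasta), where my.fasta stands for your own input file.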
Sample batch script
/software/alphafold/2.2.2L/sbatch-alphafold.sh
#!/bin/bash
# this is just a sample batch-script
#SBATCH --partition=allgpu
#SBATCH --constraint='A100|V100'
#SBATCH --time=0-12:00
#SBATCH --job-name=T1050-dimer
#SBATCH --output=slurm.T1050-dimer.out
unset LD_PRELOAD

export AF_preset=multimer
export AF_outdir=/beegfs/desy/user/$USER/ALPHAFOLD2.2.2
/software/alphafold/2.2.2L/alphafold.sh --fasta_paths=/software/alphafold/2.2.2L/T1050-2.fasta
Sample run script
/software/alphafold/2.2.2L/alphafold.sh
#!/bin/bash

# basic setup
unset LD_PRELOAD
source /etc/profile.d/modules.sh
module purge
module load maxwell cuda/11.3

# alphafold basics
export PATH=/software/alphafold/2.2.2L/envs/af2.2/bin:$PATH
export TF_FORCE_UNIFIED_MEMORY=1
export AF_datadir=${AF_datadir:-/beegfs/desy/group/it/ReferenceData/alphafold}

# databases
AF_uniref90=${AF_uniref90:-$AF_datadir/uniref90/uniref90.fasta}
AF_bfd=${AF_bfd:-$AF_datadir/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt}
AF_mmcif=${AF_mmcif:-$AF_datadir/pdb_mmcif/mmcif_files}
AF_obsolete=${AF_obsolete:-$AF_datadir/pdb_mmcif/obsolete.dat}
AF_pdb70=${AF_pdb70:-$AF_datadir/pdb70/pdb70}
AF_mgnify=${AF_mgnify:-$AF_datadir/mgnify/mgy_clusters.fa}
AF_uniclust30=${AF_uniclust30:-$AF_datadir/uniclust30/uniclust30_2018_08/uniclust30_2018_08}
AF_uniprot=${AF_uniprot:-$AF_datadir/uniprot/uniprot.fasta}
AF_pdbseqres=${AF_pdbseqres:-$AF_datadir/pdb_seqres/pdb_seqres.txt}
AF_template_date=${AF_template_date:-$(date +%Y-%m-%d)}

# make sure they all exist
for e in $( /usr/bin/env | grep "$AF_datadir" | cut -d= -f2 ) ; do
    if [[ ! -e $e ]]; then
        echo "missing $e -- check your environment "
        exit 1
    fi
done

export AF_preset="${AF_preset:-monomer}"

# monomer presets use pdb70; multimer uses uniprot and pdb_seqres instead
if [[ $AF_preset =~ monomer ]]; then
    export AF_dbs="--uniref90_database_path=$AF_uniref90 --bfd_database_path=$AF_bfd --template_mmcif_dir=$AF_mmcif"
    export AF_dbs="$AF_dbs --obsolete_pdbs_path=$AF_obsolete --pdb70_database_path=$AF_pdb70 --mgnify_database_path=$AF_mgnify"
    export AF_dbs="$AF_dbs --uniclust30_database_path=$AF_uniclust30"
else
    export AF_dbs="--uniref90_database_path=$AF_uniref90 --bfd_database_path=$AF_bfd --template_mmcif_dir=$AF_mmcif"
    export AF_dbs="$AF_dbs --obsolete_pdbs_path=$AF_obsolete --mgnify_database_path=$AF_mgnify"
    export AF_dbs="$AF_dbs --uniclust30_database_path=$AF_uniclust30 --uniprot_database_path=$AF_uniprot --pdb_seqres_database_path=$AF_pdbseqres"
fi

# user customizable setup
export AF_outdir="${AF_outdir:-/tmp/alphafold}"

# report the setup and the command that is about to be executed
cat <<EOF

AlphaFold Setup
----------------------------------------------------------------------------------------------------
AF_datadir.: $AF_datadir
AF_outdir..: $AF_outdir
AF_preset..: $AF_preset

Hardware Setup
----------------------------------------------------------------------------------------------------
Host.......: $(hostname)
CPU........: $(grep "model name" /proc/cpuinfo | head -1 | cut -d: -f2 | grep -o '[A-Za-z].*')
GPU........: $(nvidia-smi -L |cut -d'(' -f1 | tr '\n' ' ')
Cores......: $(nproc)
Memory.....: $(free -g | grep Mem | awk '{print $2}')
Time.......: $(date)

Execute:
----------------------------------------------------------------------------------------------------
python3 /software/alphafold/2.2.2L/alphafold/run_alphafold.py \
    --output_dir=$AF_outdir \
    --data_dir=$AF_datadir \
    --model_preset=$AF_preset \
    --max_template_date=$AF_template_date \
    $AF_dbs \
    --use_gpu_relax "$@"

EOF

python3 /software/alphafold/2.2.2L/alphafold/run_alphafold.py \
    --output_dir=$AF_outdir \
    --data_dir=$AF_datadir \
    --model_preset=$AF_preset \
    --max_template_date=$AF_template_date \
    $AF_dbs \
    --use_gpu_relax "$@"
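The existence check in the run script lists variables via /usr/bin/env, so as written it appears to cover only exported variables (AF_datadir by default). In addition, the AF_bfd, AF_pdb70 and AF_uniclust30 settings are HH-suite database prefixes rather than files, so a plain -e test would not find them. A minimal sketch of a check that handles both cases follows; db_exists and missing_dbs are hypothetical helpers, not part of the installed script:

```shell
# Hypothetical helper: succeed if a database path exists as a file or
# directory, or as a prefix of files named <prefix>_* (HH-suite style).
db_exists() {
    [ -e "$1" ] && return 0
    set -- "$1"_*        # expand the prefix; stays literal if nothing matches
    [ -e "$1" ]
}

# Print every path from the argument list that cannot be found.
missing_dbs() {
    for db in "$@"; do
        db_exists "$db" || echo "$db"
    done
}

# Demo in a scratch directory: one plain file, one prefix, one missing entry.
demo_dir=$(mktemp -d)
touch "$demo_dir/uniref90.fasta" "$demo_dir/pdb70_a3m.ffdata"
missing_dbs "$demo_dir/uniref90.fasta" "$demo_dir/pdb70" "$demo_dir/mgnify.fa"
rm -rf "$demo_dir"
```

In the demo only the mgnify.fa path is reported as missing. In a customized run script the function could be called with the resolved AF_* database variables before run_alphafold.py starts.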
Databases
Databases can be found in /beegfs/desy/group/it/ReferenceData/alphafold/, but feel free to use your own set of databases. small_bfd is not defined in the sample script, but is available at /beegfs/desy/group/it/ReferenceData/alphafold/small_bfd/bfd-first_non_consensus_sequences.fasta. Last update of the databases: mid-November 2021.
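The sample run script resolves each database location with the ${VAR:-default} pattern, so an individual path can be replaced without editing the script. The sketch below isolates that pattern for one database; resolve_uniref90 is a hypothetical stand-in for the corresponding lines of the script, and the /my/dbs paths are placeholders:

```shell
# Hypothetical stand-in for the uniref90 lines of the sample run script.
resolve_uniref90() {
    AF_datadir=${AF_datadir:-/beegfs/desy/group/it/ReferenceData/alphafold}
    echo "${AF_uniref90:-$AF_datadir/uniref90/uniref90.fasta}"
}

# Nothing set: the site-wide default is used.
( unset AF_datadir AF_uniref90; resolve_uniref90 )
# -> /beegfs/desy/group/it/ReferenceData/alphafold/uniref90/uniref90.fasta

# Only the data directory changed: the database path follows it.
( unset AF_uniref90; AF_datadir=/my/dbs; resolve_uniref90 )
# -> /my/dbs/uniref90/uniref90.fasta

# A single database overridden: it wins over the data directory.
( AF_datadir=/my/dbs; AF_uniref90=/my/dbs/uniref90_2022.fasta; resolve_uniref90 )
# -> /my/dbs/uniref90_2022.fasta
```

Because alphafold.sh runs as a separate process, an override in a batch script has to be exported (export AF_uniref90=...) before the script is called.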
Multimers
Note: the FASTA file has to contain each chain as a separate entry, even if all sequences are identical. For the 1WUF sample it looks like this:
>1WUF_1|Chains A|hypothetical protein lin2664|Listeria innocua (272626)
GHHHHHHHHHHGLVPRGSHMYFQKARLIHAELPLLAPFKTSYGELKSKDFYIIELINEEGIHGYGELEAFPLPDYTEETLSSAILIIKEQLLPLLAQRKIRKPEEIQELFSWIQGNEMAKAAVELAVWDAFAKMEKRSLAKMIGATKESIKVGVSIGLQQNVETLLQLVNQYVDQGYERVKLKIAPNKDIQFVEAVRKSFPKLSLMADANSAYNREDFLLLKELDQYDLEMIEQPFGTKDFVDHAWLQKQLKTRICLDENIRSVKDVEQAHSIGSCRAINLKLARVGGMSSALKIAEYCALNEILVWCGGMLEAGVGRAHNIALAARNEFVFPGDISASNRFFAEDIVTPAFELNQGRLKVPTNEGIGVTLDLKVLKKYTKSTEEILLNKGWS
>1WUF_2|Chains B|hypothetical protein lin2664|Listeria innocua (272626)
GHHHHHHHHHHGLVPRGSHMYFQKARLIHAELPLLAPFKTSYGELKSKDFYIIELINEEGIHGYGELEAFPLPDYTEETLSSAILIIKEQLLPLLAQRKIRKPEEIQELFSWIQGNEMAKAAVELAVWDAFAKMEKRSLAKMIGATKESIKVGVSIGLQQNVETLLQLVNQYVDQGYERVKLKIAPNKDIQFVEAVRKSFPKLSLMADANSAYNREDFLLLKELDQYDLEMIEQPFGTKDFVDHAWLQKQLKTRICLDENIRSVKDVEQAHSIGSCRAINLKLARVGGMSSALKIAEYCALNEILVWCGGMLEAGVGRAHNIALAARNEFVFPGDISASNRFFAEDIVTPAFELNQGRLKVPTNEGIGVTLDLKVLKKYTKSTEEILLNKGWS
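For homo-oligomers, writing the same sequence several times by hand is error-prone. A minimal sketch of a generator follows; make_homomer_fasta is a hypothetical helper, and the header format name_index is an arbitrary choice for illustration:

```shell
# Hypothetical helper: print a FASTA file with N identical chains,
# one entry per chain. Arguments: name, sequence, number of copies.
make_homomer_fasta() {
    name=$1
    sequence=$2
    copies=$3
    i=1
    while [ "$i" -le "$copies" ]; do
        printf '>%s_%d\n%s\n' "$name" "$i" "$sequence"
        i=$((i + 1))
    done
}

# Demo: a dimer of a short placeholder sequence (not a real protein).
make_homomer_fasta demo MKTAYIAKQR 2
```

The demo prints two entries, >demo_1 and >demo_2, each followed by the same sequence line. Redirect the output to a file and pass that file via --fasta_paths together with AF_preset=multimer.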
Installation
tmpdir=/scratch/$USER
inst_dir=/software/alphafold/2.2.2L

pushd $tmpdir
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
sh Miniforge3-Linux-x86_64.sh -b -p $inst_dir
export PATH=$inst_dir/bin:$PATH

# create a mamba-init, conda-init
. conda-init

# note: the sample run script expects the environment under $inst_dir/envs/af2.2;
# keep the environment name and the PATH setting there consistent
conda create -y -n af2.2.2 python=3.9
conda activate af2.2.2
conda install -y -c nvidia cudatoolkit=11.1 cudnn==8.0.4
conda install -y -c bioconda hmmer hhsuite==3.3.0 kalign3
conda install -y -c conda-forge openmm=7.5.1 pdbfixer pip

#
# alphafold itself
#
wget https://github.com/deepmind/alphafold/archive/refs/tags/v2.2.2.tar.gz -O alphafold-2.2.2.tar.gz
tar xf alphafold-2.2.2.tar.gz
mv alphafold-2.2.2 $inst_dir/alphafold
rm alphafold-2.2.2.tar.gz

# both jax and jaxlib versions have to be explicit, will cause problems otherwise
python3 -m pip install --upgrade jax==0.2.21 jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html
popd

#
# patch
#
# ask the active python for its site-packages directory, so the path
# matches the python version of the environment
pushd "$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
patch -p0 < $inst_dir/alphafold/docker/openmm.patch
popd

#
# get stereo_chemical_props.txt
#
wget -P $inst_dir/alphafold/common/ https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt --no-check-certificate

conda deactivate
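After the installation it can be useful to confirm that the alignment tools from the hmmer, hhsuite and kalign3 packages are reachable. The sketch below is a hypothetical post-install check, not part of the original procedure; it assumes the environment's bin directory is already on PATH (as the sample run script arranges):

```shell
# Hypothetical post-install check: print every tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Binaries provided by the hmmer, hhsuite and kalign3 packages.
missing=$(missing_tools jackhmmer hmmsearch hmmbuild hhblits hhsearch kalign)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing
else
    echo "all alignment tools found"
fi
```

An empty result from missing_tools means every listed binary was found; it does not verify versions or GPU support.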