The use of anaconda® is deprecated. Please use mamba instead. Do NOT use the anaconda or miniconda installer; use miniforge or mambaforge instead.
For existing installations or environments, please see the Python page for instructions on how to disable the non-free anaconda channel.
We strongly recommend replacing all anaconda/miniconda installations with miniforge or mambaforge.
Summary
Source: mamba https://github.com/mamba-org/mamba
Source: miniforge/mambaforge https://github.com/conda-forge/miniforge
License: 3-clause BSD
Path: /software/mamba
Documentation: https://docs.conda.io/projects/conda/en/latest/
conda/mamba is a package manager that makes it easy to install and manage the Python/R packages used for data science and machine learning.
Working with conda/mamba on Maxwell
We offer basic conda installations together with a number of environments, kernels, and labextensions. The conda installation serves as the base environment for the jupyter-hub, which might limit options to upgrade to the newest versions. You can, however, easily install your own conda version, and your group or institute (Eu.XFEL, CSSB, etc.) most likely has separate conda installations tailored to its specific applications. When working with conda, be aware that
- running conda init or mamba init is a bad idea, because it modifies your login environment
- your home directory on Maxwell has a hard limit of 30GB, and conda can easily fill up your entire quota
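To keep an eye on the quota mentioned above, it can help to check how much space conda occupies in your home directory. The following is a minimal sketch: the helper name conda_usage is made up for illustration, and the default location ~/.conda is an assumption based on the paths used on this page.

```shell
# Hypothetical helper: print the disk usage of a conda directory.
# Defaults to ~/.conda (assumed location of pkgs and envs).
conda_usage() {
    local dir="${1:-$HOME/.conda}"
    if [ -d "$dir" ]; then
        du -sh "$dir"      # human-readable total for the directory
    else
        echo "no conda directory at $dir"
    fi
}

conda_usage
```

Passing a path, e.g. conda_usage ~/.conda/pkgs, shows the package cache alone, which is often the largest part.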
The simplest way to set up conda python is:
    module load maxwell mamba   # or use module load maxwell conda, which does exactly the same thing
    . mamba-init
    # afterwards you can use mamba or conda. For clarity we recommend using mamba.
- module load maxwell mamba initializes python=3.9
- . mamba-init does the same as the "mamba init" block in your login environment, but without side-effects
- if you encounter permission problems, please run the following once (you might want to choose a different location if space in your home directory gets too tight):
    cat <<eof >> ~/.condarc
    pkgs_dirs:
      - /home/$USER/.conda/pkgs
    eof
Note: the latest version of the conda/3.9 module automatically generates .condarc if none is present!
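Because the snippet above appends to ~/.condarc, running it more than once leaves duplicate entries. As a hedged sketch, the append can be guarded so that it only happens when no pkgs_dirs key is present yet. The function name add_pkgs_dir is hypothetical and not part of conda or the Maxwell modules.

```shell
# Hypothetical helper: append a pkgs_dirs entry to the given condarc file,
# but only if the file does not define pkgs_dirs yet.
add_pkgs_dir() {
    condarc="$1"
    if ! grep -q '^pkgs_dirs:' "$condarc" 2>/dev/null; then
        printf 'pkgs_dirs:\n  - %s\n' "$HOME/.conda/pkgs" >> "$condarc"
    fi
}
```

Calling add_pkgs_dir ~/.condarc repeatedly then leaves a single pkgs_dirs entry in the file.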
Available environments
    # list available environments
    mamba env list   # conda env list will work as well
    # conda environments:
    #
    base                  *  /software/mamba/2022.06
    rapids-22.04             /software/mamba/2022.06/envs/rapids-22.04

    # activate an environment
    mamba activate rapids-22.04
Using conda environments
conda environments let you install python versions and packages in a self-contained way. For example:
    module load maxwell conda/3.9   # you can replace mamba by conda in the following steps
    . mamba-init
    mamba create -n hexrd python=3.8
    mamba activate hexrd
    mamba install -c hexrd -c conda-forge hexrd
That will produce an environment (located in ~/.conda/envs) containing hexrd and all dependencies.
You can create a kernel to be used in the jupyterhub from an environment:
    module load maxwell conda/3.9
    . mamba-init
    mamba activate hexrd
    mamba install ipykernel -c conda-forge
    python -m ipykernel install --user --name=hexrd
Kernel definitions are very simple json files. If creating the kernel with ipykernel fails for any reason, you can create one manually. For example:
    mkdir -p ~/.local/share/jupyter/kernels/hexrd
    cat <<eof > ~/.local/share/jupyter/kernels/hexrd/kernel.json
    {
     "argv": [
      "/home/<username>/.conda/envs/hexrd/bin/python",
      "-m",
      "ipykernel_launcher",
      "-f",
      "{connection_file}"
     ],
     "display_name": "hexrd",
     "language": "python",
     "metadata": {
      "debugger": true
     }
    }
    eof
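A hand-written kernel.json is easy to get wrong, for example by dropping a closing brace. As a minimal sketch, the file can be checked with the json.tool module from Python's standard library before it is used. The helper name check_kernel_json is hypothetical, and the kernel path simply follows the hexrd example on this page.

```shell
# Hypothetical helper: succeed only if the given file exists and is valid JSON.
check_kernel_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

kernel=~/.local/share/jupyter/kernels/hexrd/kernel.json
if check_kernel_json "$kernel"; then
    echo "$kernel is valid JSON"
else
    echo "$kernel is missing or not valid JSON"
fi
```

This only checks the JSON syntax. It does not verify that the python path listed in argv exists.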
Conda environments in BeeGFS
As mentioned, conda can quickly consume the 30GB quota of your home directory. You can install conda environments in other locations like BeeGFS, but this is not entirely free of problems. The BeeGFS setup simply lacks the hardware to cope with myriads of metadata requests (anything that does a "stat" on a file or directory, such as ls -lR, is very expensive). Unfortunately, conda touches a huge number of files, and spawning multiple processes magnifies the problem, so conda environments on BeeGFS might become quite slow. Preparing singularity images with conda environments might be a good alternative.
Be aware: if you install conda-environments in BeeGFS the conda pkgs also have to reside in BeeGFS! Mixing GPFS and BeeGFS will inevitably result in broken environments. There is however a simple way around the problem. For example
- create ~/.condarc to use your home-directory for environment installations
- create ~/.condarc.beegfs to use BeeGFS for environment installations
.condarc
    auto_activate_base: false
    channels:
      - conda-forge
    channel_priority: disabled
    pkgs_dirs:
      - ~/.conda/pkgs
    envs_dirs:
      - ~/.conda/envs

.condarc.beegfs
    auto_activate_base: false
    channels:
      - conda-forge
    channel_priority: disabled
    pkgs_dirs:
      - /beegfs/desy/user/<username>/.conda/pkgs
    envs_dirs:
      - /beegfs/desy/user/<username>/.conda/envs
You can then switch between environments in $HOME and in BeeGFS:
    # install environments in BeeGFS
    export CONDARC=~/.condarc.beegfs
    mamba create -n env-in-beegfs python=3.10
    mamba activate env-in-beegfs
    [...]

    # install environments in $HOME
    unset CONDARC
    mamba create -n env-in-home python=3.10
    mamba activate env-in-home
    [...]
    module load maxwell mamba/3.9
    . mamba-init
    mk-beegfs                           # create your beegfs folder if you don't have one yet
    mkdir -p /tmp/$USER/spack-stage/    # the mamba module will usually do this for you
    export CONDARC=~/.condarc.beegfs
    mamba create -n hexrd python=3.8
    mamba activate hexrd
    mamba install -c hexrd -c conda-forge hexrd

    # create a jupyter kernel
    mamba install ipykernel -c conda-forge
    python -m ipykernel install --user --name=hexrd

    # The environment now resides in /beegfs/desy/user/<username>/.conda/envs/hexrd
    # In most cases you don't need to activate the environment to run the code, but just set the PATH:
    export PATH=/beegfs/desy/user/<username>/.conda/envs/hexrd/bin:$PATH
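Switching between the two configurations comes down to setting or unsetting CONDARC, so it can be wrapped in two small shell functions. This is only a sketch: the function names conda_use_beegfs and conda_use_home are hypothetical, and it assumes ~/.condarc.beegfs exists as described above.

```shell
# Hypothetical helpers to switch between the two conda configurations.
conda_use_beegfs() {
    export CONDARC="$HOME/.condarc.beegfs"   # environments and pkgs go to BeeGFS
}

conda_use_home() {
    unset CONDARC                            # conda falls back to ~/.condarc
}

conda_use_beegfs
echo "CONDARC=${CONDARC:-<unset>}"
conda_use_home
echo "CONDARC=${CONDARC:-<unset>}"
```

Defining the functions in a file that you source explicitly, rather than in your login scripts, keeps the login environment free of conda-related side effects.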
Moving existing conda environments to BeeGFS
It's not possible to simply move a conda environment from GPFS (e.g. $HOME) to BeeGFS. It is, however, relatively easy to clone a conda environment. Let's assume you've created ~/.condarc.beegfs as outlined above:
    export CONDARC=~/.condarc.beegfs
    module load maxwell mamba/3.9
    . mamba-init
    mamba create --prefix /beegfs/desy/user/$USER/.conda/envs/my-new-env --clone /home/$USER/.conda/envs/my-old-env

    # activate the environment
    mamba activate my-new-env   # should work as long as CONDARC is set. If it fails:
    mamba activate /beegfs/desy/user/$USER/.conda/envs/my-new-env

    # alternatively just set the PATH, works in most cases:
    export PATH=/beegfs/desy/user/$USER/.conda/envs/my-new-env/bin:$PATH
Adding packages globally (for your account)
pip is usually the easiest way to install packages for all your python environments, but be aware that this can quickly lead to inconsistencies:
    module load maxwell mamba/3.9
    . conda-init
    python3 -m pip install --user --upgrade numpy
    # it works exactly the same way when working with the system python 3.6.
Note:
- packages will install in ~/.local/bin and ~/.local/lib/python3.9/site-packages
- you will need to add ~/.local/bin to your PATH or use a full path to execute commands installed there
- ~/.local takes precedence over packages installed in any environment. It hence can easily break dependencies. conda or virtual environments are the better choice.
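To see whether packages installed with --user could be shadowing an environment, you can ask Python where its per-user site-packages directory is and whether it contains anything. A minimal sketch, using only the standard site module:

```shell
# Print the per-user site-packages directory of the current python3
# and report whether anything is installed there.
user_site=$(python3 -m site --user-site)
echo "user site-packages: $user_site"
if [ -d "$user_site" ] && [ -n "$(ls -A "$user_site")" ]; then
    echo "packages installed with --user may shadow environment packages"
else
    echo "no --user packages found"
fi
```

If such packages get in the way, setting the environment variable PYTHONNOUSERSITE=1 makes Python ignore the per-user site-packages directory for that session.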
Making your own mamba installation
https://github.com/conda-forge/miniforge offers a lightweight installer that exclusively uses the conda-forge channel. A simple installation could look like this:
    # fetch installer
    wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh

    # silent install, use -h for help
    PREFIX=/beegfs/desy/user/$USER/minitest
    /bin/bash Mambaforge-Linux-x86_64.sh -b -s -p $PREFIX

    # setup PATH
    export PATH=$PREFIX/bin:$PATH
    # create mamba-init
    cat <<eof > $PREFIX/bin/mamba-init
    # >>> conda initialize >>>
    # !! Contents within this block are managed by 'conda init' !!
    __conda_setup="\$('$PREFIX/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
    if [ \$? -eq 0 ]; then
        eval "\$__conda_setup"
    else
        if [ -f "$PREFIX/etc/profile.d/conda.sh" ]; then
            . "$PREFIX/etc/profile.d/conda.sh"
        else
            export PATH="$PREFIX/bin:\$PATH"
        fi
    fi
    unset __conda_setup

    if [ -f "$PREFIX/etc/profile.d/mamba.sh" ]; then
        . "$PREFIX/etc/profile.d/mamba.sh"
    fi
    # <<< conda initialize <<<
    eof
    # Note: if you just use an editor to create the file, replace \$ by $!

    # Note: if you don't have write permission in - or don't want to modify - the mamba installation,
    # you need to define the pkgs-directory (also see above), e.g.
    cat <<eof >> ~/.condarc
    auto_activate_base: false
    pkgs_dirs:
      - /home/$USER/.conda/pkgs
    eof
Now initialize mamba and continue with package installations and environments:
    . mamba-init
    mamba install -y numpy scipy matplotlib

    # create a python 3.7 test environment:
    mkdir -p /tmp/$USER/spack-stage/
    mamba create -n py37 python=3.7

    # use the environment:
    mamba env list
    # conda environments:
    #
    base                  *  /beegfs/desy/user/schluenz/minitest
    py37                     /beegfs/desy/user/schluenz/minitest/envs/py37

    mamba activate py37
    mamba list                # installed packages
    mamba install numpy ...   # add packages to py37 environment
Note: when creating environments in beegfs, the packages also have to be in beegfs! There are basically two ways to achieve that:
    # Option 1.
    cd
    rm -rf ~/.conda   # removes all your conda stuff!
    mkdir -p /beegfs/desy/user/$USER/.conda
    ln -s /beegfs/desy/user/$USER/.conda .
    mamba create -n my-conda-env python=3.8
    # This way .conda, pkgs-dir and environments will all reside in beegfs. No need to use prefixes.

    # ----------------------------------------------------------------------------------------------
    #
    # Option 2.
    mkdir -p /beegfs/desy/user/$USER/.conda/pkgs
    conda config --add pkgs_dirs /beegfs/desy/user/$USER/.conda/pkgs
    mamba create --prefix=/beegfs/desy/user/$USER/my-conda-env python=3.8
    # This will place packages and environment into beegfs, but will leave for example environments.txt in ~/.conda/.
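Since the package cache and the environments must live on the same filesystem, a quick check of where two paths are mounted can catch a mixed GPFS/BeeGFS setup before anything is installed. The following is a rough sketch that compares mount points as reported by df. The helper name same_filesystem is hypothetical, and the two paths are just the example locations used on this page.

```shell
# Hypothetical helper: succeed if both paths live on the same mounted filesystem.
same_filesystem() {
    local a b
    a=$(df -P "$1" 2>/dev/null | awk 'NR==2 {print $NF}')   # mount point of first path
    b=$(df -P "$2" 2>/dev/null | awk 'NR==2 {print $NF}')   # mount point of second path
    [ -n "$a" ] && [ "$a" = "$b" ]
}

pkgs=/beegfs/desy/user/$USER/.conda/pkgs
envs=/beegfs/desy/user/$USER/.conda/envs
if same_filesystem "$pkgs" "$envs"; then
    echo "pkgs and envs share a filesystem"
else
    echo "pkgs and envs are on different filesystems (or a path is missing)"
fi
```

The check assumes mount points without spaces in their names, which is enough for a quick sanity test.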