Using Python on the Cluster

Many tools, such as scipy, numpy, pandas, and skimage, require a newer version of Python than the system one. Because the system Python is a core system component, we cannot upgrade it on our cluster. Luckily, Anaconda bundles those tools along with many other packages.

To use Python 3.8 on the cluster, run the following command in your shell, or add it to your .bashrc:

$ source /sw/anaconda/3.8-2020.07/thisconda.sh

To use Python 3.7 on the cluster, run the following command in your shell, or add it to your .bashrc:

$ source /sw/anaconda/3.7-2020.02/thisconda.sh

For Python 2.7, run:

$ source /sw/anaconda/2.7-2019.10/thisconda.sh
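After sourcing one of the scripts above, it can be useful to confirm which interpreter your shell actually picks up. The following is a minimal sketch, not part of the cluster setup: the function name describe_interpreter is hypothetical, and the code avoids f-strings so that it also runs under Python 2.7.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this code."""
    # sys.version_info holds (major, minor, micro, ...); keep the first three.
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return sys.executable, version


if __name__ == "__main__":
    path, version = describe_interpreter()
    print("interpreter: {}".format(path))
    print("version: {}".format(version))
```

If the Anaconda script was sourced successfully, the printed path should point under /sw/anaconda rather than to the system Python.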

Conda Environments

If the tools you need are not included in the standard Anaconda installation, try using a conda environment, which can have a different version of Python and/or different packages. To create an environment with Python 3.7:

$ conda create --name mytestenv python=3.7

Activate this new environment:

$ conda activate mytestenv

Then use conda or pip to install new packages.
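To see which packages still need to be installed in the active environment, a short check like the one below may help. It is only a sketch: the function name missing_packages and the package list are illustrative assumptions, it handles top-level package names only, and importlib.util requires Python 3.4 or newer.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    # find_spec returns None when no importable module with that name exists.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative list; replace it with the packages your work needs.
    wanted = ["numpy", "scipy", "pandas", "skimage"]
    missing = missing_packages(wanted)
    if missing:
        print("not installed: " + ", ".join(missing))
    else:
        print("all requested packages are available")
```

Run it with the environment activated, so that the check uses that environment's interpreter.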

For more details, see Managing Conda Environments.

Conda Environments in HTCondor

You can also use conda environments in HTCondor. The trick is to use the Python interpreter that comes with the new environment. For example, the python binary of the mytestenv environment above is under /home/username/.conda/envs/mytestenv/bin. The condor job file looks like this:

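# Use the Python interpreter from the conda environment, not the system one
Executable = /home/username/.conda/envs/mytestenv/bin/python
Universe = vanilla
Notification = Never
request_cpus = 1
# The log directory must exist before the job is submitted
Output = log/job.out
Error  = log/job.err
Log    = log/job.log
# The script and its arguments are passed to the interpreter above
Arguments = my_python_script.py arg1 arg2 arg3
Queue 1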
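For a first test job, my_python_script.py could simply report which interpreter ran it and which arguments it received. The sketch below is a hypothetical stand-in for your real script, and the function name summarize is an assumption made for illustration.

```python
import sys


def summarize(argv):
    """Build a short report of the interpreter and the job arguments."""
    lines = ["interpreter: {}".format(sys.executable)]
    # argv[0] is the script name; the job arguments start at index 1.
    for index, value in enumerate(argv[1:], start=1):
        lines.append("argument {}: {}".format(index, value))
    return lines


if __name__ == "__main__":
    print("\n".join(summarize(sys.argv)))
```

With the job file above, this report is written to log/job.out, and the interpreter line should show the path inside the mytestenv environment.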