You can run RELION jobs on the HPC using a Singularity container. The containers are maintained by NVIDIA on NGC, and the image tags correspond to the RELION version.
| Tag   | Image                    |
| ----- | ------------------------ |
| 3.1.3 | nvcr.io/hpc/relion:3.1.3 |
| 3.1.2 | nvcr.io/hpc/relion:3.1.2 |
| 3.1.0 | nvcr.io/hpc/relion:3.1.0 |
(These examples are largely adapted from NVIDIA's NGC documentation for RELION.)
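If you would like to fetch the image ahead of time instead of letting Singularity pull it at run time, a minimal sketch looks like this (the output filename is an arbitrary choice):

```bash
# Pull the RELION 3.1.3 image from NGC and store it as a local SIF file.
singularity pull relion_3.1.3.sif docker://nvcr.io/hpc/relion:3.1.3
```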
The examples below use the RELION benchmark data. This is a very large dataset, so you may want to submit the download as a batch job rather than running it interactively (see the sketch after the commands below).
```bash
# get the data
wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion_benchmark.tar.gz
tar -xzvf relion_benchmark.tar.gz
export BENCHMARK_DIR=$PWD/relion_benchmark
cd ${BENCHMARK_DIR}

# get NVIDIA's single-node run script
wget https://gitlab.com/NVHPC/ngc-examples/-/raw/master/relion/single-node/run_relion.sh
chmod +x run_relion.sh
```
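If you would rather run the download as a batch job, a minimal sketch is below; the partition name and time limit are assumptions and should be adjusted for your cluster:

```bash
#!/usr/bin/env bash
# download-benchmark.sh -- submit with: sbatch download-benchmark.sh
#SBATCH --time 08:00:00        # assumed limit; the archive takes a while to fetch
#SBATCH --partition standard   # hypothetical CPU partition name

wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion_benchmark.tar.gz
tar -xzvf relion_benchmark.tar.gz
```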
The single-node job below uses the benchmark data downloaded above.
If you need a job that has access to GPUs, you'll need to submit it to a queue that has GPU nodes. These are the nodes whose names begin with `g` or `p`.
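If you are not sure which partitions provide GPU nodes, listing the partitions together with their generic resources (GRES) can help, for example:

```bash
# Show each partition, its GRES (GPU partitions list a gpu:... entry), and its nodes.
sinfo -o "%P %G %N"
```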
```bash
#!/usr/bin/env bash
# submit-relion.sh
#############################################
# submit this script with: sbatch submit-relion.sh
#############################################
#SBATCH --time 04:00:00
#SBATCH --exclusive
#SBATCH --partition gpu-single-g5
#SBATCH --constraint g5xlarge

export SINGULARITY_BIND="/shared:/shared,/apps:/apps"
export PWD=$(pwd)
export IMAGE="docker://nvcr.io/hpc/relion:3.1.3"

singularity run --nv --pwd ${PWD} ${IMAGE} ./run_relion.sh
```
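Submission and monitoring then follow the usual Slurm workflow, for example:

```bash
sbatch submit-relion.sh   # queue the single-node job
squeue -u $USER           # check its state while it waits or runs
```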
The multi-node variant below runs `relion_refine_mpi` across two GPU nodes with Intel MPI. As with the single-node job, it must be submitted to a queue that has GPU nodes (node names beginning with `g` or `p`).
```bash
#!/bin/bash
# submit-relion-mpi.sh
#############################################
# submit this script with: sbatch submit-relion-mpi.sh
#############################################
#SBATCH --time 04:00:00
#SBATCH --exclusive
#SBATCH --partition gpu-single-g5
#SBATCH --constraint g5xlarge
#SBATCH --nodes 2
set -e; set -o pipefail
module load intelmpi
# Set cluster/experiment specific variables
export SINGULARITY_BIND="/shared:/shared,/apps:/apps"
export PWD=$(pwd)
export IMAGE="docker://nvcr.io/hpc/relion:3.1.3"
export relion_sif=${IMAGE}
readonly gpus_per_node=${SLURM_GPUS_ON_NODE}
readonly benchmark_dir="BENCHMARK_RESULTS"
readonly procs_per_gpu=1
readonly cpus_per_node=${SLURM_CPUS_ON_NODE}
readonly iter=10
readonly pool=100
readonly output_dir="run.$(date +%Y.%m.%d.%H.%M)"
export RESULTS="${PWD}/${benchmark_dir}/${output_dir}"
mkdir -p "${RESULTS}"
# Set Relion 3D classification experiment flags
relion_opts="--gpu \
--i Particles/shiny_2sets.star \
--ref emd_2660.map:mrc \
--firstiter_cc \
--ini_high 60 \
--ctf \
--ctf_corrected_ref \
--tau2_fudge 4 \
--K 6 \
--flatten_solvent \
--healpix_order 2 \
--sym C1 \
--iter ${iter} \
--particle_diameter 360 \
--zero_mask \
--oversampling 1 \
--offset_range 5 \
--offset_step 2 \
--norm \
--scale \
--random_seed 0 \
--pool ${pool} \
--dont_combine_weights_via_disc \
--o ${RESULTS}"
# Attempt to use as many CPU cores as possible
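# One MPI rank per GPU plus one extra rank per node (RELION's leader rank does not use a GPU)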
readonly procs_per_node=$(((gpus_per_node*procs_per_gpu)+1))
readonly tpg_max=6
readonly tpg=$(( cpus_per_node/procs_per_node ))
readonly threads_per_proc=$(( tpg <= tpg_max ? tpg : tpg_max))
relion_opts+=" --j ${threads_per_proc}"
echo "INFO: Running RELION with:"
echo " ${SLURM_JOB_NUM_NODES:-$SLURM_NNODES} Nodes"
echo " ${gpus_per_node} GPUs per node"
echo " ${procs_per_node} MPI processes per node"
echo " ${procs_per_gpu} MPI processes per GPU"
echo " ${threads_per_proc} threads per worker process"
echo "Running relion_refine_mpi"
srun --mpi=pmix \
--ntasks-per-node=${procs_per_node} \
singularity run --nv --workdir ${PWD} \
${relion_sif} \
relion_refine_mpi \
${relion_opts}
```
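A typical submit-and-follow sequence might look like this (the job ID in the output filename will differ):

```bash
sbatch submit-relion-mpi.sh   # queue the two-node MPI job
tail -f slurm-<jobid>.out     # follow Slurm's default output file as the job runs
```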
You should see output that looks like this:
```
Loading intelmpi version 2021.9.0
INFO: Running RELION with:
2 Nodes
1 GPUs per node
2 MPI processes per node
1 MPI processes per GPU
2 threads per worker process
Running relion_refine_mpi
RELION version: 3.1.3-commit-3ee3b6
Precision: BASE=double, CUDA-ACC=single
=== RELION MPI setup ===
+ Number of MPI processes = 4
+ Number of threads per MPI process = 2
+ Total number of threads therefore = 8
+ Leader (0) runs on host = gpu-single-g5-dy-g5xlarge-1
+ Follower 1 runs on host = gpu-single-g5-dy-g5xlarge-1
+ Follower 2 runs on host = gpu-single-g5-dy-g5xlarge-2
+ Follower 3 runs on host = gpu-single-g5-dy-g5xlarge-2
=================
uniqueHost gpu-single-g5-dy-g5xlarge-1 has 1 ranks.
uniqueHost gpu-single-g5-dy-g5xlarge-2 has 2 ranks.
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 1 mapped to device 0
Thread 1 on follower 1 mapped to device 0
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 2 mapped to device 0
Thread 1 on follower 2 mapped to device 0
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 3 mapped to device 0
Thread 1 on follower 3 mapped to device 0
Device 0 on gpu-single-g5-dy-g5xlarge-2 is split between 2 followers
```