AlphaFold normally runs under Docker; this page covers running it under Singularity, the other container technology.
Upstream project: alphafold_singularity https://github.com/prehensilecode/alphafold_singularity
Here we build this alphafold_singularity.
AlphaFold ships a Dockerfile, the recipe for Docker. Likewise, the alphafold_singularity git repository has a def file, the recipe for Singularity.
The two are compared below: first the alphafold Dockerfile, then the alphafold_singularity Singularity.def.

alphafold (20240610, alphafold/docker/Dockerfile, dbe2a43)

# Copyright 2021 DeepMind Technologies Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

ARG CUDA=12.2.2
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu20.04
# FROM directive resets ARGS, so we specify again (the value is retained if
# previously set).
ARG CUDA

# Use bash to support string substitution.
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
        build-essential \
        cmake \
        cuda-command-line-tools-$(cut -f1,2 -d- <<< ${CUDA//./-}) \
        git \
        hmmer \
        kalign \
        tzdata \
        wget \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get autoremove -y \
    && apt-get clean

# Compile HHsuite from source.
RUN git clone --branch v3.3.0 https://github.com/soedinglab/hh-suite.git /tmp/hh-suite \
    && mkdir /tmp/hh-suite/build \
    && pushd /tmp/hh-suite/build \
    && cmake -DCMAKE_INSTALL_PREFIX=/opt/hhsuite .. \
    && make -j 4 && make install \
    && ln -s /opt/hhsuite/bin/* /usr/bin \
    && popd \
    && rm -rf /tmp/hh-suite

# Install Miniconda package manager.
RUN wget -q -P /tmp \
    https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
    && rm /tmp/Miniconda3-latest-Linux-x86_64.sh

# Install conda packages.
ENV PATH="/opt/conda/bin:$PATH"
ENV LD_LIBRARY_PATH="/opt/conda/lib:$LD_LIBRARY_PATH"
RUN conda install -qy conda==24.1.2 pip python=3.11 \
    && conda install -y -c nvidia cuda=${CUDA_VERSION} \
    && conda install -y -c conda-forge openmm=8.0.0 pdbfixer \
    && conda clean --all --force-pkgs-dirs --yes

COPY . /app/alphafold
RUN wget -q -P /app/alphafold/alphafold/common/ \
    https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c49(omitted)4ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt

# Install pip packages.
RUN pip3 install --upgrade pip --no-cache-dir \
    && pip3 install -r /app/alphafold/requirements.txt --no-cache-dir \
    && pip3 install --upgrade --no-cache-dir \
        jax==0.4.26 \
        jaxlib==0.4.26+cuda12.cudnn89 \
        -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

# Add SETUID bit to the ldconfig binary so that non-root users can run it.
RUN chmod u+s /sbin/ldconfig.real

# Currently needed to avoid undefined_symbol error.
RUN ln -sf /usr/lib/x86_64-linux-gnu/libffi.so.7 /opt/conda/lib/libffi.so.7

# We need to run `ldconfig` first to ensure GPUs are visible, due to some quirk
# with Debian. See https://github.com/NVIDIA/nvidia-docker/issues/1399 for
# details.
# ENTRYPOINT does not support easily running multiple commands, so instead we
# write a shell script to wrap them up.
WORKDIR /app/alphafold
RUN echo $'#!/bin/bash\n\
ldconfig\n\
python /app/alphafold/run_alphafold.py "$@"' > /app/run_alphafold.sh \
    && chmod +x /app/run_alphafold.sh
ENTRYPOINT ["/app/run_alphafold.sh"]

alphafold_singularity (20240610, alphafold_singularity/Singularity.def, 86dba48)

Bootstrap: docker
From: nvidia/cuda:11.1.1-cudnn8-runtime-ubuntu18.04
Stage: spython-base

%files
. /app/alphafold

%post
# Copyright 2023 David Chin
#
# This file is part of alphafold_singularity.
#
# alphafold_singularity is free software: you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# alphafold_singularity is distributed in the hope that it will be
# useful, but WITHOUT ANY WARRANTY; without even the implied warranty
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with alphafold_singularity. If not, see <https://www.gnu.org/licenses/>.

# FROM directive resets ARGS, so we specify again (the value is retained if
# previously set).

apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
        build-essential \
        cmake \
        cuda-command-line-tools-11-1 \
        git \
        hmmer \
        kalign \
        tzdata \
        wget \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get autoremove -y \
    && apt-get clean

# Compile HHsuite from source.
/bin/rm -rf /tmp/hh-suite \
    && git clone --branch v3.3.0 https://github.com/soedinglab/hh-suite.git /tmp/hh-suite \
    && mkdir /tmp/hh-suite/build \
    && cd /tmp/hh-suite/build \
    && cmake -DCMAKE_INSTALL_PREFIX=/opt/hhsuite .. \
    && make -j 4 && make install \
    && ln -s /opt/hhsuite/bin/* /usr/bin \
    && cd / \
    && /bin/rm -rf /tmp/hh-suite

# Install Miniconda package manager.
wget -q -P /tmp \
    https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
    && rm /tmp/Miniconda3-latest-Linux-x86_64.sh

# Install conda packages.
PATH="/opt/conda/bin:/usr/local/cuda-11.1/bin:$PATH"
conda install -qy conda==23.5.2 \
    && conda install -y -c conda-forge \
        openmm=7.7.0 \
        cudatoolkit==11.1.1 \
        pdbfixer \
        pip \
        python=3.10 \
    && conda clean --all --force-pkgs-dirs --yes

### /bin/cp -r . /app/alphafold

wget -q -P /app/alphafold/alphafold/common/ \
    https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94(omitted)

# Install pip packages.
# N.B. The URL specifies the list of jaxlib releases.
pip3 install --upgrade pip --no-cache-dir \
    && pip3 install -r /app/alphafold/requirements.txt --no-cache-dir \
    && pip3 install --upgrade --no-cache-dir \
        jax==0.3.25 \
        jaxlib==0.3.25+cuda11.cudnn805 \
        -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

# Add SETUID bit to the ldconfig binary so that non-root users can run it.
chmod u+s /sbin/ldconfig.real

### SETUID bit does not matter: Apptainer does not allow suid commands
### Workaround below is to use /mnt/out/ld.so.cache for the ld cache file

%environment
export PATH="/opt/conda/bin:/usr/local/cuda-11.1/bin:$PATH"
%runscript
cd /app/alphafold
ldconfig -C /mnt/output/ld.so.cache
exec python /app/alphafold/run_alphafold.py "$@"
# %startscript
# cd /app/alphafold
# exec python /app/alphafold/run_alphafold.py "$@"
The alphafold side appears to have moved to a CUDA 12.2 / ubuntu20.04 base.
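The Dockerfile derives the apt package suffix from its CUDA argument, whereas Singularity.def hardcodes 11-1. A minimal sketch of what that expression evaluates to, rewritten with tr and cut so that it also runs outside bash. The function name cuda_pkg_suffix is made up for this illustration and is not part of either recipe.

```shell
# Hypothetical helper: POSIX equivalent of the Dockerfile's bash expression
#   $(cut -f1,2 -d- <<< ${CUDA//./-})
cuda_pkg_suffix() {
    # e.g. 12.2.2 -> 12-2-2 -> 12-2
    printf '%s\n' "$1" | tr '.' '-' | cut -f1,2 -d-
}

CUDA=12.2.2
echo "cuda-command-line-tools-$(cuda_pkg_suffix "$CUDA")"
```

When updating Singularity.def by hand, this is the value that would replace the hardcoded 11-1 suffix.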
I modified the Singularity side as follows.
|
The environment looks like this:
[root@rockylinux ~]# cat /etc/redhat-release
Rocky Linux release 8.7 (Green Obsidian)
[root@rockylinux ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.105.01 Mon Feb 27 12:49:44 UTC 2023
GCC version: gcc version 8.5.0 20210514 (Red Hat 8.5.0-15) (GCC)
[root@rockylinux ~]#
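Since the container now targets CUDA 12.2, it may be worth checking the host driver first. The sketch below parses the driver version from the NVRM line and compares its major number against 525. That threshold is my assumption for the minimum driver branch for CUDA 12.x, so check it against NVIDIA's compatibility table. The helper name nvidia_driver_version is made up for this illustration.

```shell
# Hypothetical helper: pull the driver version out of the NVRM line on stdin.
nvidia_driver_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Sample line copied from the output above; on a real host, feed
# /proc/driver/nvidia/version to the function instead.
line='NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.105.01 Mon Feb 27 12:49:44 UTC 2023'
driver=$(printf '%s\n' "$line" | nvidia_driver_version)
major=${driver%%.*}

# Assumption: 525 is taken as the minimum driver branch for CUDA 12.x.
if [ "$major" -ge 525 ]; then
    echo "driver $driver: should be new enough for a CUDA 12.x image"
else
    echo "driver $driver: likely too old for a CUDA 12.x image"
fi
```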
Prepare AlphaFold and alphafold_singularity.
[root@rockylinux ~]# cd /apps/
[root@rockylinux apps]# git clone https://github.com/google-deepmind/alphafold
[root@rockylinux apps]# cd alphafold
[root@rockylinux alphafold]# git clone https://github.com/prehensilecode/alphafold_singularity
[root@rockylinux alphafold]# mv alphafold_singularity singularity
[root@rockylinux alphafold]# ls -lF
total 92
drwxr-xr-x 2 root root 23 Feb 16 01:09 afdb
drwxr-xr-x 7 root root 112 Feb 16 01:09 alphafold
-rw-r--r-- 1 root root 973 Feb 16 01:09 CONTRIBUTING.md
drwxr-xr-x 2 root root 69 Feb 16 01:09 docker
drwxr-xr-x 2 root root 68 Feb 16 01:09 docs
drwxr-xr-x 2 root root 54 Feb 16 01:09 imgs
-rw-r--r-- 1 root root 11358 Feb 16 01:09 LICENSE
drwxr-xr-x 2 root root 29 Feb 16 01:09 notebooks
-rw-r--r-- 1 root root 35017 Feb 16 01:09 README.md
-rw-r--r-- 1 root root 209 Feb 16 01:09 requirements.txt
-rw-r--r-- 1 root root 22575 Feb 16 01:09 run_alphafold.py
-rw-r--r-- 1 root root 4284 Feb 16 01:09 run_alphafold_test.py
drwxr-xr-x 2 root root 315 Feb 17 13:08 scripts
-rw-r--r-- 1 root root 2131 Feb 16 01:09 setup.py
drwxr-xr-x 3 root root 197 Feb 17 20:45 singularity
[root@rockylinux alphafold]#
From here, build the runtime environment via pyenv/anaconda.
If an anaconda environment already exists (from topaz, for example), use that one.
git clone https://github.com/yyuu/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
pyenv install anaconda3-2023.09-0
pyenv global anaconda3-2023.09-0
source /apps/pyenv/versions/anaconda3-2023.09-0/etc/profile.d/conda.sh
conda update --all
(If an anaconda environment already exists)
source /apps/pyenv/versions/anaconda3-2023.09-0/etc/profile.d/conda.sh
The environment created here is named alphafold_singularity.
[root@rockylinux alphafold]# conda create -n alphafold_singularity pip -y
[root@rockylinux alphafold]# conda activate alphafold_singularity
(alphafold_singularity) [root@rockylinux alphafold]#
Continue the build in this state.
(alphafold_singularity) [root@rockylinux alphafold]# which pip
/apps/pyenv/versions/anaconda3-2023.09-0/envs/alphafold_singularity/bin/pip
(alphafold_singularity) [root@rockylinux alphafold]# cat singularity/requirements.txt
(omitted)
absl-py==1.0.0
spython==0.3.0
(alphafold_singularity) [root@rockylinux alphafold]# pip install -r singularity/requirements.txt
Next, install the Singularity package (Apptainer here),
(alphafold_singularity) [root@rockylinux alphafold]# dnf install apptainer apptainer-suid
then build the AlphaFold Singularity container image.
(alphafold_singularity) [root@rockylinux alphafold]# singularity build alphafold.sif singularity/Singularity.def
(alphafold_singularity) [root@rockylinux alphafold]# ls -lh alphafold.sif
-rwxr-xr-x 1 root root 4.6G Feb 17 21:29 alphafold.sif
(alphafold_singularity) [root@rockylinux alphafold]# conda deactivate
[root@rockylinux alphafold]#
The image came out at about 4.6 GB.
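If an error occurs here, compare alphafold/docker/Dockerfile with alphafold_singularity/Singularity.def
and modify singularity/Singularity.def so that alphafold/docker/Dockerfile takes precedence.
alphafold and alphafold_singularity are not kept in sync with each other. Also, the method described in alphafold_singularity looks at first glance as if it keeps things aligned by version number, but it fails. Comparing the two files by eye and fixing the def file is faster.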
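As an aid to the by-eye comparison, the version pins in both recipes can be listed and diffed. The following is only a sketch: list_pins is a made-up helper, its regular expression only catches name=version and name==version patterns, and the sample files hold a few lines copied from the two recipes shown earlier.

```shell
# Hypothetical helper: list "name=version" and "name==version" pins in a recipe file.
list_pins() {
    grep -oE '[A-Za-z][A-Za-z0-9_-]*==?[0-9][0-9A-Za-z.+]*' "$1" \
        | sed 's/==/=/' | sort -u
}

# Sample input: a few lines copied from the two recipes.
# On a real checkout, point list_pins at docker/Dockerfile and
# singularity/Singularity.def instead.
tmp=$(mktemp -d)
cat > "$tmp/Dockerfile" <<'EOF'
RUN conda install -qy conda==24.1.2 pip python=3.11 \
    && conda install -y -c conda-forge openmm=8.0.0 pdbfixer \
      jax==0.4.26 \
      jaxlib==0.4.26+cuda12.cudnn89 \
EOF
cat > "$tmp/Singularity.def" <<'EOF'
conda install -qy conda==23.5.2 \
    openmm=7.7.0 \
    python=3.10 \
    jax==0.3.25 \
    jaxlib==0.3.25+cuda11.cudnn805 \
EOF

list_pins "$tmp/Dockerfile" > "$tmp/docker.pins"
list_pins "$tmp/Singularity.def" > "$tmp/singularity.pins"

# diff exits non-zero when the files differ, so do not treat that as an error.
diff "$tmp/docker.pins" "$tmp/singularity.pins" || true
```

Each differing line is a candidate for an update in Singularity.def. Base image tags and hardcoded paths such as /usr/local/cuda-11.1 still need a manual look.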
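Next is the script used to run calculations, "/apps/alphafold/singularity/run_singularity.py".
I want calculations to work just by putting "/apps/alphafold/singularity" on PATH and running "run_singularity.py",
so put the conda python on the first line of the script (the shebang).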
|
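One way to set that first line without opening an editor is sketched below. The interpreter path is an assumption derived from the `which pip` output earlier, set_shebang is a made-up helper, and it assumes the script's first line is already a shebang, because it replaces that line.

```shell
# Assumption: the interpreter sits next to the pip shown by `which pip` above.
PY=/apps/pyenv/versions/anaconda3-2023.09-0/envs/alphafold_singularity/bin/python

# Hypothetical helper: replace the first line of a script with "#!<interpreter>".
set_shebang() {
    # $1 = script path, $2 = interpreter path
    tmpfile=$(mktemp)
    { printf '#!%s\n' "$2"; tail -n +2 "$1"; } > "$tmpfile"
    cat "$tmpfile" > "$1"   # write back in place to keep the file's permissions
    rm -f "$tmpfile"
}

# Demo on a throwaway file; on the real system the target would be
# /apps/alphafold/singularity/run_singularity.py (also made executable).
demo=$(mktemp)
printf '%s\n' '#!/usr/bin/env python3' 'print("hello")' > "$demo"
set_shebang "$demo" "$PY"
head -n 1 "$demo"
```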
That completes the runtime environment. Next comes the usual Environment Modules setup.
As always, it is written as "/apps/modulefiles/alphafold_singularity".
[root@rockylinux ~]# vi /apps/modulefiles/alphafold_singularity
#%Module1.0
#
set root /apps/pyenv/versions/anaconda3-2023.09-0/envs/alphafold_singularity
set af /apps/alphafold
prepend-path PATH $af/singularity:$root/bin
setenv ALPHAFOLD_DIR $af
[root@rockylinux ~]#
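For hosts without Environment Modules, roughly the same effect can be had from plain shell. This is a sketch of what the modulefile above sets, not something `module` generates.

```shell
# Rough shell equivalent of the modulefile above.
root=/apps/pyenv/versions/anaconda3-2023.09-0/envs/alphafold_singularity
af=/apps/alphafold

# prepend-path PATH $af/singularity:$root/bin
PATH="$af/singularity:$root/bin:$PATH"
# setenv ALPHAFOLD_DIR $af
ALPHAFOLD_DIR="$af"
export PATH ALPHAFOLD_DIR

echo "$ALPHAFOLD_DIR"
```

Unlike `module unload`, nothing here undoes the change, so it is best sourced in a dedicated shell or job script.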
[illya@rockylinux ~]$ module use --append /apps/modulefiles/
[illya@rockylinux ~]$ module load alphafold_singularity
[illya@rockylinux ~]$ run_singularity.py
/apps/alphafold/alphafold.sif
FATAL Flags parsing error:
flag --data_dir=None: Flag --data_dir must have a value other than None.
flag --fasta_paths=None: Flag --fasta_paths must have a value other than None.
flag --max_template_date=None: Flag --max_template_date must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.
[illya@rockylinux ~]$
[illya@rockylinux ~]$ vi query.fasta
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE
[illya@rockylinux ~]$ mkdir out
[illya@rockylinux ~]$ run_singularity.py --fasta_paths=query.fasta --max_template_date=2020-05-14 --data_dir=/AlphaFold --db_preset=reduced_dbs --output_dir=out
:
:
[illya@rockylinux ~]$ ls -lF ./out
total 24
-rw-r--r-- 1 illya illya 20104 Feb 17 21:47 ld.so.cache
drwxr-xr-x 3 illya illya 4096 Feb 17 22:16 query/
[illya@rockylinux ~]$
The results are placed in $HOME/out. With --output_dir=. a folder named after the input file is created: query.fasta --> a "query" folder.
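The folder name can be predicted from the input file name, which is handy when a script needs to find the results afterwards. A minimal sketch, assuming inputs end in .fasta; result_dir_name is a made-up helper.

```shell
# Hypothetical helper: folder name expected for a given FASTA file,
# i.e. the file name without its directory and without the .fasta suffix.
result_dir_name() {
    basename "$1" .fasta
}

result_dir_name query.fasta
result_dir_name /home/illya/dimer.fasta
```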
Next, a test with a dimer.
[illya@rockylinux ~]$ vi dimer.fasta
>XP_009313165.1.1
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
>XP_009313165.1.2
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
[illya@rockylinux ~]$ run_singularity.py --fasta_paths=dimer.fasta --max_template_date=2020-05-14 --data_dir=/AlphaFold \
--db_preset=reduced_dbs --model_preset=multimer --output_dir=.
[illya@rockylinux ~]$ ls -lFd dimer*
drwxr-xr-x 3 illya illya 12288 Feb 17 23:19 dimer/
-rw-r--r-- 1 illya illya 448 Feb 17 22:19 dimer.fasta
[illya@rockylinux ~]$ ls -l dimer/ | head -n 5
total 2586692
-rw-r--r-- 1 illya illya 5595 Feb 17 22:43 confidence_model_1_multimer_v3_pred_0.json
-rw-r--r-- 1 illya illya 5598 Feb 17 22:44 confidence_model_1_multimer_v3_pred_1.json
-rw-r--r-- 1 illya illya 5598 Feb 17 22:45 confidence_model_1_multimer_v3_pred_2.json
-rw-r--r-- 1 illya illya 5604 Feb 17 22:47 confidence_model_1_multimer_v3_pred_3.json
[illya@rockylinux ~]$
Because everything is in a single Singularity image file, it is relatively easy to use on cluster machines and the like.
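For batch use, the per-file command line can be generated in a loop and pasted into whatever job script the cluster uses. The sketch below is a dry run that only prints commands. af_command and another.fasta are made up for this illustration, and the flags are simply the monomer ones used in the runs above.

```shell
# Hypothetical helper: print the run_singularity.py command for one FASTA file.
# $1 = FASTA file, $2 = output directory
af_command() {
    printf 'run_singularity.py --fasta_paths=%s --max_template_date=2020-05-14 --data_dir=/AlphaFold --db_preset=reduced_dbs --output_dir=%s\n' "$1" "$2"
}

# Dry run: print one command per input instead of executing it.
for fasta in query.fasta another.fasta; do
    af_command "$fasta" out
done
```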