AlphaFold normally runs under Docker; this page covers running it with Singularity, another container technology.
Upstream project: alphafold_singularity https://github.com/prehensilecode/alphafold_singularity

Here I build this alphafold_singularity setup myself.
For reference, a prebuilt Singularity image of alphafold:2.2.4 is distributed at https://cloud.sylabs.io/library/prehensilecode/alphafold_singularity/alphafold

AlphaFold ships a Dockerfile, the recipe for Docker. Likewise, the alphafold_singularity git repository has a def file, the recipe for Singularity.

The two are compared side by side below: the alphafold Dockerfile on the left, the alphafold_singularity Singularity.def on the right.

# Copyright 2021 DeepMind Technologies Limited                                                      Bootstrap: docker
#                                                                                                   From: nvidia/cuda:11.1.1-cudnn8-runtime-ubuntu18.04
# Licensed under the Apache License, Version 2.0 (the "License");                                   Stage: spython-base
# you may not use this file except in compliance with the License.                                  
# You may obtain a copy of the License at                                                           %files
#                                                                                                   . /app/alphafold
#      http://www.apache.org/licenses/LICENSE-2.0                                                   %post
#                                                                                                   # Copyright 2021 DeepMind Technologies Limited
# Unless required by applicable law or agreed to in writing, software                               #
# distributed under the License is distributed on an "AS IS" BASIS,                                 # Licensed under the Apache License, Version 2.0 (the "License");
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.                          # you may not use this file except in compliance with the License.
# See the License for the specific language governing permissions and                               # You may obtain a copy of the License at
# limitations under the License.                                                                    #
                                                                                                    #      http://www.apache.org/licenses/LICENSE-2.0
ARG CUDA=11.1.1                                                                                     #
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04                                                 # Unless required by applicable law or agreed to in writing, software
# FROM directive resets ARGS, so we specify again (the value is retained if                         # distributed under the License is distributed on an "AS IS" BASIS,
# previously set).                                                                                  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
ARG CUDA                                                                                            # See the License for the specific language governing permissions and
                                                                                                    # limitations under the License.
# Use bash to support string substitution.                                                          
SHELL ["/bin/bash", "-o", "pipefail", "-c"]                                                         # FROM directive resets ARGS, so we specify again (the value is retained if
                                                                                                    # previously set).
RUN apt-get update \                                                                                
    && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \                  apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
        build-essential \                                                                           build-essential \
        cmake \                                                                                     cmake \
        cuda-command-line-tools-$(cut -f1,2 -d- <<< ${CUDA//./-}) \                                 cuda-command-line-tools-11-1 \
        git \                                                                                       git \
        hmmer \                                                                                     hmmer \
        kalign \                                                                                    kalign \
        tzdata \                                                                                    tzdata \
        wget \                                                                                      wget \
    && rm -rf /var/lib/apt/lists/* \                                                                && rm -rf /var/lib/apt/lists/* \
    && apt-get autoremove -y \                                                                      && apt-get autoremove -y \
    && apt-get clean                                                                                && apt-get clean
                                                                                                    
# Compile HHsuite from source.                                                                      # Compile HHsuite from source.
RUN git clone --branch v3.3.0 https://github.com/soedinglab/hh-suite.git /tmp/hh-suite \            /bin/rm -rf /tmp/hh-suite \
    && mkdir /tmp/hh-suite/build \                                                                  && git clone --branch v3.3.0 https://github.com/soedinglab/hh-suite.git /tmp/hh-suite \
    && pushd /tmp/hh-suite/build \                                                                  && mkdir /tmp/hh-suite/build \
    && cmake -DCMAKE_INSTALL_PREFIX=/opt/hhsuite .. \                                               && cd /tmp/hh-suite/build \
    && make -j 4 && make install \                                                                  && cmake -DCMAKE_INSTALL_PREFIX=/opt/hhsuite .. \
    && ln -s /opt/hhsuite/bin/* /usr/bin \                                                          && make -j 4 && make install \
    && popd \                                                                                       && ln -s /opt/hhsuite/bin/* /usr/bin \
    && rm -rf /tmp/hh-suite                                                                         && cd / \
                                                                                                    && /bin/rm -rf /tmp/hh-suite
# Install Miniconda package manager.                                                                
RUN wget -q -P /tmp \                                                                               # Install Miniconda package manager.
  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \                           wget -q -P /tmp \
    && bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \                               https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh \
    && rm /tmp/Miniconda3-latest-Linux-x86_64.sh                                                    && bash /tmp/Miniconda3-py37_4.12.0-Linux-x86_64.sh -b -p /opt/conda \
                                                                                                    && rm /tmp/Miniconda3-py37_4.12.0-Linux-x86_64.sh
# Install conda packages.                                                                           
ENV PATH="/opt/conda/bin:$PATH"                                                                     # Install conda packages.
RUN conda install -qy conda==4.13.0 \                                                               PATH="/opt/conda/bin:/usr/local/cuda-11.1/bin:$PATH"
    && conda install -y -c conda-forge \                                                            conda install -qy conda==4.13.0 \
      openmm=7.5.1 \                                                                                && conda install -y -c conda-forge \
      cudatoolkit==${CUDA_VERSION} \                                                                openmm=7.5.1 \
      pdbfixer \                                                                                    cudatoolkit==11.1.1 \
      pip \                                                                                         pdbfixer \
      python=3.8 \                                                                                  pip \
      && conda clean --all --force-pkgs-dirs --yes                                                  python=3.7 \
                                                                                                    && conda clean --all --force-pkgs-dirs --yes
COPY . /app/alphafold                                                                               
RUN wget -q -P /app/alphafold/alphafold/common/ \                                                   ### /bin/cp -r . /app/alphafold
  https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b (truncated)
                                                                                                    wget -q -P /app/alphafold/alphafold/common/ \
# Install pip packages.                                                                             https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b (truncated)
RUN pip3 install --upgrade pip --no-cache-dir \                                                     
    && pip3 install -r /app/alphafold/requirements.txt --no-cache-dir \                             # Install pip packages.
    && pip3 install --upgrade --no-cache-dir \                                                      # N.B. The URL specifies the list of jaxlib releases.
      jax==0.3.25 \                                                                                 pip3 install --upgrade pip  --no-cache-dir \
      jaxlib==0.3.25+cuda11.cudnn805 \                                                              && pip3 install -r /app/alphafold/requirements.txt --no-cache-dir \
      -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html                         && pip3 install --upgrade --no-cache-dir \
                                                                                                    jax==0.3.17 \
# Apply OpenMM patch.                                                                               jaxlib==0.3.15+cuda11.cudnn805 \
WORKDIR /opt/conda/lib/python3.8/site-packages                                                      -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
RUN patch -p0 < /app/alphafold/docker/openmm.patch                                                  
                                                                                                    # Apply OpenMM patch.
# Add SETUID bit to the ldconfig binary so that non-root users can run it.                          cd /opt/conda/lib/python3.7/site-packages
RUN chmod u+s /sbin/ldconfig.real                                                                   patch -p0 < /app/alphafold/docker/openmm.patch
                                                                                                    
# We need to run `ldconfig` first to ensure GPUs are visible, due to some quirk                     # Add SETUID bit to the ldconfig binary so that non-root users can run it.
# with Debian. See https://github.com/NVIDIA/nvidia-docker/issues/1399 for                          chmod u+s /sbin/ldconfig.real
# details.                                                                                          
# ENTRYPOINT does not support easily running multiple commands, so instead we                       %environment
# write a shell script to wrap them up.                                                             export PATH="/opt/conda/bin:/usr/local/cuda-11.1/bin:$PATH"
WORKDIR /app/alphafold                                                                              %runscript
RUN echo $'#!/bin/bash\n\                                                                           cd /app/alphafold
ldconfig\n\                                                                                         ldconfig
python /app/alphafold/run_alphafold.py "$@"' > /app/run_alphafold.sh \                              exec python /app/alphafold/run_alphafold.py "$@"
  && chmod +x /app/run_alphafold.sh                                                                 # %startscript
ENTRYPOINT ["/app/run_alphafold.sh"]                                                                # cd /app/alphafold
                                                                                                    # exec python /app/alphafold/run_alphafold.py "$@"
 

They are almost identical. Note that alphafold was updated to v2.3 (2022.12.11) while alphafold_singularity was still on the earlier version, so I patch it later on.
The same holds for alphafold 2.3.1: almost identical.
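
The few real differences in the listing above are version pins. A minimal sketch for pulling them out of both recipes so they can be compared; the `extract_pins` helper and its regular expression are my own illustration and not part of either repository, and the two sample strings are copied from the listing.

```python
import re

# Hypothetical helper: collect "name=version" / "name==version" pins
# that start a line of recipe text.
PIN_RE = re.compile(r"^\s*(python|openmm|jaxlib|jax)={1,2}([^\s\\]+)", re.MULTILINE)

def extract_pins(recipe_text):
    """Return {package: version} for the pins found in recipe_text."""
    return {name: version for name, version in PIN_RE.findall(recipe_text)}

# Lines copied from the Dockerfile column of the comparison.
dockerfile_part = """
      openmm=7.5.1 \\
      python=3.8 \\
      jax==0.3.25 \\
      jaxlib==0.3.25+cuda11.cudnn805 \\
"""
# Lines copied from the Singularity.def column of the comparison.
def_part = """
openmm=7.5.1 \\
python=3.7 \\
jax==0.3.17 \\
jaxlib==0.3.15+cuda11.cudnn805 \\
"""

docker_pins = extract_pins(dockerfile_part)
def_pins = extract_pins(def_part)

# Keep only the packages whose pin differs: {name: (def version, Dockerfile version)}
changed = {k: (def_pins.get(k), v) for k, v in docker_pins.items() if def_pins.get(k) != v}
for name, (old, new) in sorted(changed.items()):
    print(f"{name}: {old} -> {new}")
```

Running it against the full files instead of these excerpts should point at the same lines that need touching in the def file.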

First, set up the runtime environment and build the Singularity container.

The host looks like this:

[root@rockylinux ~]# cat /etc/redhat-release
Rocky Linux release 8.6 (Green Obsidian)
 
[root@rockylinux ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  525.85.05  Sat Jan 14 00:49:50 UTC 2023
GCC version:  gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)
 
[root@rockylinux ~]#
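
The image is based on a CUDA 11.1 runtime, so the host driver has to be recent enough for it. A small sketch for reading the driver version out of /proc/driver/nvidia/version from a script; `parse_nvrm_version` is a made-up helper, and the pattern only covers the line format shown above.

```python
import re

def parse_nvrm_version(text):
    """Return the driver version found in /proc/driver/nvidia/version text, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None

# Sample line copied from the transcript above.
sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  525.85.05  Sat Jan 14 00:49:50 UTC 2023"
print(parse_nvrm_version(sample))
```

On a real host the text would come from `open("/proc/driver/nvidia/version").read()`.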

The runtime environment is built with pyenv/anaconda.
If an anaconda environment already exists (for topaz and the like), use that one.

git clone https://github.com/yyuu/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
pyenv install anaconda3-2022.10
pyenv global anaconda3-2022.10
 
export PATH=$PYENV_ROOT/versions/anaconda3-2022.10/bin/:$PATH
conda update --all
 
(if an anaconda environment already exists)
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
export PATH=$PYENV_ROOT/versions/anaconda3-2022.10/bin/:$PATH

The conda environment is named alphafold_singularity.

[root@rockylinux ~]# conda create -n alphafold_singularity pip -y
 
[root@rockylinux ~]# source  activate alphafold_singularity
(alphafold_singularity) [root@rockylinux ~]#

Continue the build with this environment active.

(alphafold_singularity) [root@rockylinux ~]# cd /apps/
(alphafold_singularity) [root@rockylinux apps]# git clone https://github.com/deepmind/alphafold
(alphafold_singularity) [root@rockylinux apps]# cd alphafold/
(alphafold_singularity) [root@rockylinux alphafold]#
(alphafold_singularity) [root@rockylinux alphafold]# git tag | tail -1
v2.3.1
(alphafold_singularity) [root@rockylinux alphafold]# git branch              <-- with "git checkout v2.3.1" you get an error on models_to_relax..
* main
(alphafold_singularity) [root@rockylinux alphafold]#
 
* Place "singularity" (alphafold_singularity, renamed) inside "/apps/alphafold"
(alphafold_singularity) [root@rockylinux alphafold]# git clone https://github.com/prehensilecode/alphafold_singularity singularity
 
(alphafold_singularity) [root@rockylinux alphafold]# which pip
/apps/pyenv/versions/anaconda3-2022.10/envs/alphafold_singularity/bin/pip
 
(alphafold_singularity) [root@rockylinux alphafold]# cat singularity/requirements.txt
# Dependencies necessary to execute run_singularity.py
absl-py==0.13.0
spython==0.1.16
 
(alphafold_singularity) [root@rockylinux alphafold]# pip install -r singularity/requirements.txt

Next, build the Singularity container image.
As of 2023-02-04 the def file does not yet reflect alphafold v2.3.1, so I modified it myself as follows.

--- singularity/Singularity.def.orig    2023-02-04 00:58:07.116444402 +0900
+++ singularity/Singularity.def 2023-02-04 01:00:05.119492586 +0900
@@ -48,9 +48,9 @@
 
 # Install Miniconda package manager.
 wget -q -P /tmp \
-https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh \
-&& bash /tmp/Miniconda3-py37_4.12.0-Linux-x86_64.sh -b -p /opt/conda \
-&& rm /tmp/Miniconda3-py37_4.12.0-Linux-x86_64.sh
+https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
+&& bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
+&& rm /tmp/Miniconda3-latest-Linux-x86_64.sh
 
 # Install conda packages.
 PATH="/opt/conda/bin:/usr/local/cuda-11.1/bin:$PATH"
@@ -60,7 +60,7 @@
 cudatoolkit==11.1.1 \
 pdbfixer \
 pip \
-python=3.7 \
+python=3.8 \
 && conda clean --all --force-pkgs-dirs --yes
 
 ### /bin/cp -r . /app/alphafold
@@ -73,12 +73,12 @@
 pip3 install --upgrade pip  --no-cache-dir \
 && pip3 install -r /app/alphafold/requirements.txt --no-cache-dir \
 && pip3 install --upgrade --no-cache-dir \
-jax==0.3.17 \
-jaxlib==0.3.15+cuda11.cudnn805 \
+jax==0.3.25 \
+jaxlib==0.3.25+cuda11.cudnn805 \
 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 
 # Apply OpenMM patch.
-cd /opt/conda/lib/python3.7/site-packages
+cd /opt/conda/lib/python3.8/site-packages
 patch -p0 < /app/alphafold/docker/openmm.patch
 
 # Add SETUID bit to the ldconfig binary so that non-root users can run it.
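
The patch touches the Python version in two places (the conda pin and the site-packages directory used for the OpenMM patch), and they have to agree. A minimal sketch of a consistency check, assuming the def file keeps the same line shapes as in the diff; `check_python_consistency` is a hypothetical helper.

```python
import re

def check_python_consistency(def_text):
    """Return (conda_pin, patch_dir_version, ok) for the text of a Singularity.def."""
    pin = re.search(r"^\s*python=(\d+\.\d+)", def_text, re.MULTILINE)
    patch_dir = re.search(r"/opt/conda/lib/python(\d+\.\d+)/site-packages", def_text)
    pin_version = pin.group(1) if pin else None
    dir_version = patch_dir.group(1) if patch_dir else None
    ok = pin_version is not None and pin_version == dir_version
    return pin_version, dir_version, ok

# A half-edited def file: the pin was bumped but the patch directory was not.
half_edited = "python=3.8 \\\ncd /opt/conda/lib/python3.7/site-packages\n"
print(check_python_consistency(half_edited))
```

If the two disagree, the `cd` in %post would presumably fail during the build, so checking before the long build saves a round trip.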

Use this def file to build the Singularity image file (SIF).

(alphafold_singularity) [root@rockylinux alphafold]# singularity build alphafold.sif singularity/Singularity.def      <--- must be run in "/apps/alphafold" (%files copies "." into the image)
 
(alphafold_singularity) [root@rockylinux alphafold]# ls -lh alphafold.sif
-rwxr-xr-x. 1 root root 4.5G Feb  4 01:10 alphafold.sif
 
(alphafold_singularity) [root@rockylinux alphafold]# conda deactivate
[root@rockylinux alphafold]#
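
The working directory matters because the def file's %files section copies "." to /app/alphafold. A sketch of a wrapper that refuses to build from the wrong place; `build_image` and its defaults are my own naming, and it only wraps the same `singularity build` command used above.

```python
import os
import subprocess

def build_image(repo_root, sif_name="alphafold.sif",
                def_rel="singularity/Singularity.def", dry_run=True):
    """Build the image with repo_root as the working directory.

    Returns the command list; nothing is executed when dry_run is True.
    """
    # The alphafold checkout must contain both the entry script and the def file.
    for required in ("run_alphafold.py", def_rel):
        if not os.path.exists(os.path.join(repo_root, required)):
            raise FileNotFoundError(f"{required} not found under {repo_root}")
    cmd = ["singularity", "build", sif_name, def_rel]
    if not dry_run:
        # cwd=repo_root makes "." in %files resolve to the alphafold checkout.
        subprocess.run(cmd, cwd=repo_root, check=True)
    return cmd
```

For the layout on this page the call would be `build_image("/apps/alphafold", dry_run=False)`.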

The image came out at about 4.5 GB.

The script used to run predictions is "/apps/alphafold/singularity/run_singularity.py",
but it does not support alphafold v2.3.1 either, so modify it as follows.

--- singularity/run_singularity.py.orig 2023-02-04 01:14:20.425092745 +0900
+++ singularity/run_singularity.py      2023-02-04 01:17:07.223577644 +0900
@@ -1,3 +1,4 @@
+#!/apps/pyenv/versions/anaconda3-2022.10/envs/alphafold_singularity/bin/python
 # Copyright 2021 DeepMind Technologies Limited
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
@@ -44,12 +45,15 @@
 
 flags.DEFINE_bool(
     'use_gpu', True, 'Enable NVIDIA runtime to run with GPUs.')
-flags.DEFINE_boolean(
-    'run_relax', True,
-    'Whether to run the final relaxation step on the predicted models. Turning '
-    'relax off might result in predictions with distracting stereochemical '
-    'violations but might help in case you are having issues with the '
-    'relaxation stage.')
+flags.DEFINE_enum('models_to_relax', 'best', ['best', 'all', 'none'],
+                  'The models to run the final relaxation step on. '
+                  'If `all`, all models are relaxed, which may be time '
+                  'consuming. If `best`, only the most confident model is '
+                  'relaxed. If `none`, relaxation is not run. Turning off '
+                  'relaxation might result in predictions with '
+                  'distracting stereochemical violations but might help '
+                  'in case you are having issues with the relaxation '
+                  'stage.')
 flags.DEFINE_bool(
     'enable_gpu_relax', True, 'Run relax on GPU if GPU is enabled.')
 flags.DEFINE_string(
@@ -145,7 +149,7 @@
 
   # Path to the MGnify database for use by JackHMMER.
   mgnify_database_path = os.path.join(
-      FLAGS.data_dir, 'mgnify', 'mgy_clusters_2018_12.fa')
+      FLAGS.data_dir, 'mgnify', 'mgy_clusters_2022_05.fa')
 
   # Path to the BFD database for use by HHblits.
   bfd_database_path = os.path.join(
@@ -156,9 +160,9 @@
   small_bfd_database_path = os.path.join(
       FLAGS.data_dir, 'small_bfd', 'bfd-first_non_consensus_sequences.fasta')
 
-  # Path to the Uniclust30 database for use by HHblits.
-  uniclust30_database_path = os.path.join(
-      FLAGS.data_dir, 'uniclust30', 'uniclust30_2018_08', 'uniclust30_2018_08')
+  # Path to the Uniref30 database for use by HHblits.
+  uniref30_database_path = os.path.join(
+      FLAGS.data_dir, 'uniref30', 'UniRef30_2021_03')
 
   # Path to the PDB70 database for use by HHsearch.
   pdb70_database_path = os.path.join(FLAGS.data_dir, 'pdb70', 'pdb70')
@@ -211,7 +215,7 @@
     database_paths.append(('small_bfd_database_path', small_bfd_database_path))
   else:
     database_paths.extend([
-        ('uniclust30_database_path', uniclust30_database_path),
+        ('uniref30_database_path', uniref30_database_path),
         ('bfd_database_path', bfd_database_path),
     ])
   for name, path in database_paths:
@@ -221,7 +225,7 @@
       command_args.append(f'--{name}={target_path}')
 
   output_target_path = os.path.join(_ROOT_MOUNT_DIRECTORY, 'output')
-  binds.append(f'{output_dir}:{output_target_path}')
+  binds.append(f'{FLAGS.output_dir}:{output_target_path}')
 
   use_gpu_relax = FLAGS.enable_gpu_relax and FLAGS.use_gpu
 
@@ -233,7 +237,7 @@
       f'--benchmark={FLAGS.benchmark}',
       f'--use_precomputed_msas={FLAGS.use_precomputed_msas}',
       f'--num_multimer_predictions_per_model={FLAGS.num_multimer_predictions_per_model}',
-      f'--run_relax={FLAGS.run_relax}',
+      f'--models_to_relax={FLAGS.models_to_relax}',
       f'--use_gpu_relax={use_gpu_relax}',
       '--logtostderr',
   ])
@@ -242,6 +246,7 @@
     '--bind', f'{",".join(binds)}',
     '--env', 'TF_FORCE_UNIFIED_MEMORY=1',
     '--env', 'XLA_PYTHON_CLIENT_MEM_FRACTION=4.0',
+    '--env', f'NVIDIA_VISIBLE_DEVICES={FLAGS.gpu_devices}',
     '--env', 'OPENMM_CPU_THREADS=12'
   ]
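
The path changes in the diff follow the v2.3 database layout, so an older download under data_dir will no longer match. A sketch for checking data_dir before starting a long run; it only covers the four paths visible in the diff, the helper names are hypothetical, and I am assuming the uniref30 and pdb70 entries are file-name prefixes rather than single files (hence the glob).

```python
import glob
import os

def v23_database_paths(data_dir, db_preset="full_dbs"):
    """Return the database paths from the diff above for the given preset."""
    paths = {
        "mgnify_database_path": os.path.join(data_dir, "mgnify", "mgy_clusters_2022_05.fa"),
        "pdb70_database_path": os.path.join(data_dir, "pdb70", "pdb70"),
    }
    if db_preset == "reduced_dbs":
        paths["small_bfd_database_path"] = os.path.join(
            data_dir, "small_bfd", "bfd-first_non_consensus_sequences.fasta")
    else:
        paths["uniref30_database_path"] = os.path.join(
            data_dir, "uniref30", "UniRef30_2021_03")
    return paths

def missing_databases(paths):
    """Return the names whose path (or path prefix) matches nothing on disk."""
    # A trailing * lets prefix-style entries count as present.
    return sorted(name for name, path in paths.items() if not glob.glob(path + "*"))

reduced = v23_database_paths("/AlphaFold", "reduced_dbs")
for name, path in sorted(reduced.items()):
    print(name, path)
```

`missing_databases(reduced)` returning a non-empty list would mean the databases need to be downloaded again for v2.3.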

A python shebang line is added at the top. Then make the script executable.

[root@rockylinux alphafold]# chmod +x singularity/run_singularity.py

The runtime environment is now complete. Next, as usual, Environment Modules.

[root@rockylinux ~]# vi /apps/modulefiles/alphafold_singularity
#%Module1.0
#
set          root /apps/pyenv/versions/anaconda3-2022.10/envs/alphafold_singularity
set          af   /apps/alphafold
prepend-path PATH $af/singularity:$root/bin
 
setenv ALPHAFOLD_DIR $af
 
[root@rockylinux ~]#
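
A wrong directory in a `set` line of the modulefile goes unnoticed until a command is not found. A small sketch that reads the `set name value` lines and reports values that are not existing directories; both helpers are made up for illustration and only understand the simple modulefile form used above.

```python
import os
import re

def modulefile_vars(text):
    """Collect `set name value` pairs from modulefile text."""
    return dict(re.findall(r"^\s*set\s+(\S+)\s+(\S+)\s*$", text, re.MULTILINE))

def missing_dirs(text):
    """Return the `set` variables whose value is not an existing directory."""
    return {k: v for k, v in modulefile_vars(text).items() if not os.path.isdir(v)}

# Placeholder paths for the demo; use the real modulefile text in practice.
sample_modulefile = """#%Module1.0
set          root /no/such/env
set          af   /
prepend-path PATH $af/singularity:$root/bin
setenv ALPHAFOLD_DIR $af
"""
print(missing_dirs(sample_modulefile))
```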

Now a test run.

[illya@rockylinux ~]$ module use --append /apps/modulefiles/
[illya@rockylinux ~]$ module load alphafold_singularity
 
[illya@rockylinux ~]$ run_singularity.py
/apps/alphafold/alphafold.sif
FATAL Flags parsing error:
  flag --data_dir=None: Flag --data_dir must have a value other than None.
  flag --fasta_paths=None: Flag --fasta_paths must have a value other than None.
  flag --max_template_date=None: Flag --max_template_date must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.
 
[illya@rockylinux ~]$
 
[illya@rockylinux ~]$ vi query.fasta
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE
 
[illya@rockylinux ~]$ mkdir out
 
[illya@rockylinux ~]$ run_singularity.py --fasta_paths=query.fasta  --max_template_date=2020-05-14 --data_dir=/AlphaFold --db_preset=reduced_dbs --output_dir=out
                      
(Results are placed in $HOME/out. With --output_dir=. a folder named after the input file is created: query.fasta --> a "query" folder.)
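
Going by the note above, the results folder takes its name from the input file. A small sketch of that naming rule, assuming the folder is simply the FASTA file name without its extension; `expected_output_dir` is a made-up helper, not something run_singularity.py provides.

```python
import os
import pathlib

def expected_output_dir(output_dir, fasta_path):
    """Guess the results folder for one FASTA file under output_dir."""
    stem = pathlib.Path(fasta_path).stem  # "query.fasta" -> "query"
    return os.path.normpath(os.path.join(output_dir, stem))

print(expected_output_dir(".", "query.fasta"))
print(expected_output_dir(".", "/home/illya/dimer.fasta"))
```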
 
(Next, a test with a dimer)
 
[illya@rockylinux ~]$ cat dimer.fasta
>XP_009313165.1.1
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
>XP_009313165.1.2
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
 
[illya@rockylinux ~]$ run_singularity.py --fasta_paths=dimer.fasta --max_template_date=2020-05-14 --data_dir=/AlphaFold  \
--db_preset=reduced_dbs --model_preset=multimer --output_dir=.
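
For the multimer preset the input FASTA carries one record per chain; dimer.fasta above holds the same sequence twice. A sketch for checking an input file before submitting it; `read_fasta` and `looks_like_homomer` are hypothetical helpers, and the demo uses a shortened sequence.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def looks_like_homomer(records):
    """True when there are several chains and all sequences are identical."""
    return len(records) > 1 and len({seq for _, seq in records}) == 1

demo = ">chain.1\nMRAAFAEARA\nALAEGEV\n>chain.2\nMRAAFAEARAALAEGEV\n"
records = read_fasta(demo)
print(len(records), looks_like_homomer(records))
```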

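Notes

Since the computation runs from a single Singularity image file, this should also be usable on cluster machines.

However, whenever an AlphaFold update adds input parameters, run_singularity.py has to be modified to keep up.
That is the tedious part, I think.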
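
One way to spot that kind of drift early is to compare the flag names defined in run_alphafold.py with those in run_singularity.py. A rough sketch, assuming both scripts define flags with `flags.DEFINE_*('name', ...)` calls and single-quoted names as in the diff above; the helper names are mine.

```python
import re

# Matches the first argument of calls such as flags.DEFINE_enum('models_to_relax', ...)
FLAG_RE = re.compile(r"flags\.DEFINE_\w+\(\s*'([^']+)'")

def defined_flags(source_text):
    """Return the set of flag names defined in a script's source text."""
    return set(FLAG_RE.findall(source_text))

def flag_drift(upstream_src, wrapper_src):
    """Return (only_in_upstream, only_in_wrapper) as sorted lists."""
    upstream = defined_flags(upstream_src)
    wrapper = defined_flags(wrapper_src)
    return sorted(upstream - wrapper), sorted(wrapper - upstream)

# Excerpts taken from the diff above.
upstream_excerpt = "flags.DEFINE_enum('models_to_relax', 'best', ['best', 'all', 'none'],\n"
wrapper_excerpt = "flags.DEFINE_boolean(\n    'run_relax', True,\n"
print(flag_drift(upstream_excerpt, wrapper_excerpt))
```

Some wrapper-only flags (for example `use_gpu`) are expected, so the first list is the one worth watching after an update.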

Then again, one could simply use the original Docker setup.


Last-modified: 2023-02-04 (Sat) 01:25:37