AlphaFold: a protein 3D structure prediction program.

Upstream repository: https://github.com/deepmind/alphafold

This page builds version 2.3.0. The upstream project assumes Docker.
For a local build that does not use Docker, see alphafold_non_docker.
To use Singularity, another container technology, in place of Docker, see Alphafold/Singularity.

Docker here means docker-ce, with the NVIDIA Container Toolkit added to it.

Preparing Docker

First, a look at the machine used for the installation.
The NVIDIA driver is installed, but the CUDA libraries are not.

[root@alphafold ~]# cat /etc/redhat-release
Rocky Linux release 9.1 (Blue Onyx)
 
[root@alphafold ~]# getenforce
Enforcing
 
[root@alphafold ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  525.60.11  Wed Nov 23 23:04:03 UTC 2022
GCC version:  gcc version 11.3.1 20220421 (Red Hat 11.3.1-2) (GCC)
 
[root@alphafold ~]# nvidia-smi -L
GPU 0: NVIDIA RTX A2000 (UUID: GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084)
 
[root@alphafold ~]# ls -l /usr/local/cuda
ls: cannot access '/usr/local/cuda': No such file or directory
[root@alphafold ~]#
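
The driver check above can also be scripted. The following is a minimal sketch, assuming the version file has the layout shown in the transcript; the function name nvidia_driver_version and its optional file argument are hypothetical and exist only for illustration.

```shell
# Print the NVIDIA kernel driver version, or fail if the version file is missing.
# The optional argument is a hypothetical override; it defaults to the file
# inspected in the transcript above.
nvidia_driver_version() {
    local version_file="${1:-/proc/driver/nvidia/version}"
    if [ ! -r "$version_file" ]; then
        echo "NVIDIA driver not loaded: $version_file is missing" >&2
        return 1
    fi
    # Take the first dotted number on the "NVRM version" line.
    grep -m1 '^NVRM version' "$version_file" \
        | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' \
        | head -n1
}
```

Calling nvidia_driver_version with no argument reads the live /proc file, which is enough to confirm the driver is loaded before Docker is set up.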

Next, add the docker-ce repository.

[root@alphafold ~]# dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
 
[root@alphafold ~]# head /etc/yum.repos.d/docker-ce.repo              <--- the contents look like this
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
 
[docker-ce-stable-debuginfo]
name=Docker CE Stable - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/debug-$basearch/stable
[root@alphafold ~]#
 
(Install docker-ce)
[root@alphafold ~]# dnf -y install docker-ce docker-ce-cli containerd.io docker-compose-plugin
 
[root@alphafold ~]# systemctl status docker
○ docker.service - Docker Application Container Engine
     Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
     Active: inactive (dead)
TriggeredBy: ○ docker.socket
       Docs: https://docs.docker.com
 
[root@alphafold ~]#
 
(Start docker-ce as a daemon)
[root@alphafold ~]# systemctl enable docker --now
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[root@alphafold ~]#

Installing the NVIDIA Container Toolkit

Consult https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
and pick the repository to fetch. This machine runs Rocky Linux 9.1, so the choice is "rhel9.0".

[root@alphafold ~]# curl -s -o /etc/yum.repos.d/nvidia-docker.repo https://nvidia.github.io/nvidia-docker/rhel9.0/nvidia-docker.repo
 
(Check the contents)
[root@alphafold ~]# head /etc/yum.repos.d/nvidia-docker.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/stable/centos8/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
 
[root@alphafold ~]# dnf install nvidia-docker2     (Previously I specified nvidia-container-toolkit and nvidia-container-runtime, but nvidia-docker2 appears to pull in both.)
 
[root@alphafold ~]# systemctl restart docker

Test

[root@alphafold ~]# nvidia-container-cli info
NVRM version:   525.60.11
CUDA version:   12.0
 
Device Index:   0
Device Minor:   0
Model:          NVIDIA RTX A2000
Brand:          NvidiaRTX
GPU UUID:       GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084
Bus Location:   00000000:0b:00.0
Architecture:   8.6
 
[root@alphafold ~]#
 
[root@alphafold ~]# docker run --gpus all --rm nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
Unable to find image 'nvidia/cuda:11.8.0-base-ubuntu22.04' locally
11.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
6e3729cf69e0: Pull complete
33effac16366: Pull complete
49118e74c29b: Pull complete
b40dd12f6d8e: Pull complete
23773815605e: Pull complete
Digest: sha256:7d667ce4e95c299f701074715138bce548a3c51b07e3b64acd29a971c557c5d8
Status: Downloaded newer image for nvidia/cuda:11.8.0-base-ubuntu22.04
Fri Dec 16 10:19:19 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A2000    Off  | 00000000:0B:00.0 Off |                  Off |
| 30%   52C    P0    N/A /  70W |      0MiB /  6138MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
 
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[root@alphafold ~]#
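
The same container test can be wrapped so that it yields a pass/fail status. This is a rough sketch under stated assumptions: the function name gpu_visible_in_container and the DOCKER override variable are hypothetical, and the image tag is the one used in the transcript above.

```shell
# Run "nvidia-smi -L" inside a CUDA base container and succeed only if at
# least one GPU line is listed. DOCKER is a hypothetical override so the
# check can be exercised with a stub; it defaults to the real docker CLI.
gpu_visible_in_container() {
    local image="${1:-nvidia/cuda:11.8.0-base-ubuntu22.04}"
    local docker_cmd="${DOCKER:-docker}"
    local listing
    listing="$("$docker_cmd" run --gpus all --rm "$image" nvidia-smi -L)" || return 1
    printf '%s\n' "$listing"
    printf '%s\n' "$listing" | grep -q '^GPU [0-9]'
}
```

A typical call would be "gpu_visible_in_container && echo ok", run by any user who is allowed to talk to the Docker daemon.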

Works nicely.
If a proxy blocks direct outbound access, configure the Docker daemon as follows:

[root@alphafold ~]# EDITOR=vim systemctl edit docker.service
[Service]
Environment = 'http_proxy=http://proxy.sybyl.local:10080' 'https_proxy=http://proxy.sybyl.local:10080'
 
[root@alphafold ~]# systemctl restart docker

With this in place, the Docker daemon goes out through the proxy.
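
For unattended setups, the same override can be written as a drop-in file without opening an editor. A minimal sketch, assuming the conventional systemd drop-in directory for docker.service; the function name write_docker_proxy_dropin and the file name http-proxy.conf are illustrative choices, not something this page prescribes.

```shell
# Write a systemd drop-in that passes proxy settings to the Docker daemon.
# Arguments: the proxy URL, then (optionally) the drop-in directory.
write_docker_proxy_dropin() {
    local proxy_url="$1"
    local dropin_dir="${2:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment="http_proxy=%s"\nEnvironment="https_proxy=%s"\n' \
        "$proxy_url" "$proxy_url" > "$dropin_dir/http-proxy.conf"
}
# Unlike "systemctl edit", a hand-written drop-in needs an explicit reload:
#   systemctl daemon-reload && systemctl restart docker
```

For the proxy used on this page the call would be "write_docker_proxy_dropin http://proxy.sybyl.local:10080", run as root.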

Next, add each user who will run Docker to the docker group.

[root@alphafold ~]# useradd -m illya
[root@alphafold ~]# usermod -aG docker illya
[root@alphafold ~]# id illya
uid=1000(illya) gid=1000(illya) groups=1000(illya),986(docker)
 
[root@alphafold ~]# su - illya
 
 
[illya@alphafold ~]$ docker run --gpus all --rm nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi -L
GPU 0: NVIDIA RTX A2000 (UUID: GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084)
[illya@alphafold ~]$

That confirms it works for a regular user.
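
Group membership can be checked before a user tries Docker at all. The helper below is a small sketch; groups_contain is a hypothetical name, and it only inspects a space-separated list such as the output of "id -nG".

```shell
# Succeed if the wanted group appears in a space-separated group list,
# for example the output of "id -nG <user>".
groups_contain() {
    local wanted="$1" group_list="$2"
    printf '%s\n' "$group_list" | tr ' ' '\n' | grep -qxF -- "$wanted"
}
# Example:
#   groups_contain docker "$(id -nG illya)" || echo "illya is not in the docker group"
```

Note that a user who was added to the group while logged in needs a fresh login session before the new membership takes effect.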

Building the AlphaFold Docker image

[root@alphafold ~]# mkdir /apps
[root@alphafold ~]# cd /apps
[root@alphafold apps]# git clone https://github.com/deepmind/alphafold
 
[root@alphafold apps]# cd alphafold/
[root@alphafold alphafold]# ls -CF
afdb/       CONTRIBUTING.md  docs/  LICENSE     README.md         run_alphafold.py       scripts/
alphafold/  docker/          imgs/  notebooks/  requirements.txt  run_alphafold_test.py  setup.py
 
[root@alphafold alphafold]#
 
[root@alphafold alphafold]# docker build -f docker/Dockerfile -t alphafold .
 :
 :
[root@alphafold alphafold]# docker images
REPOSITORY    TAG                                 IMAGE ID       CREATED         SIZE
alphafold     latest                              fcc5a1618a22   2 minutes ago   9.93GB
nvidia/cuda   11.8.0-base-ubuntu22.04             06a52e1c2be9   41 hours ago    239MB
nvidia/cuda   11.1.1-cudnn8-runtime-ubuntu18.04   cd358584cc21   7 weeks ago     4.65GB
[root@alphafold alphafold]#

Obtaining the "Genetic databases" and "model parameters"

See Alphafold/database for this part. On this machine the databases live in "/AlphafoldData".

Creating the runtime environment

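The previous step produced the container image that runs AlphaFold. The next step is to prepare the script that launches calculations with that container.
It could be installed directly on the OS, but I want to load it with "module load" through EnvironmentModules, so pyenv and Anaconda are set up first: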

git clone https://github.com/yyuu/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
 
pyenv install anaconda3-2022.05
pyenv global anaconda3-2022.05
export PATH=$PYENV_ROOT/versions/anaconda3-2022.05/bin/:$PATH
conda update --all

With the groundwork done, create the environment for AlphaFold.

[root@alphafold alphafold]# conda create -n alphafold-docker python=3.8
 
[root@alphafold alphafold]# source activate alphafold-docker
 
(alphafold-docker) [root@alphafold alphafold]# cat /apps/alphafold/docker/requirements.txt
# Dependencies necessary to execute run_docker.py
absl-py==1.0.0
docker==5.0.0
 
(alphafold-docker) [root@alphafold alphafold]# pip install -r /apps/alphafold/docker/requirements.txt
 
(alphafold-docker) [root@alphafold alphafold]# conda deactivate
[root@alphafold alphafold]#

Next, create a modulefile for EnvironmentModules at "/apps/modulefiles/alphafold-docker".

[root@alphafold ~]# mkdir -p /apps/modulefiles
[root@alphafold ~]# vi /apps/modulefiles/alphafold-docker
#%Module1.0
set          root /apps/pyenv/versions/anaconda3-2022.05/envs/alphafold-docker
prepend-path PATH  $root/bin
 
[root@alphafold ~]#

Trying it out

[illya@alphafold ~]$ vi query.fasta
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE
[illya@alphafold ~]$
[illya@alphafold ~]$ mkdir /tmp/alphafold  <-- first time only
 
[illya@alphafold ~]$ module use --append /apps/modulefiles/
[illya@alphafold ~]$ module load alphafold-docker
 
[illya@alphafold ~]$ which python
/apps/pyenv/versions/anaconda3-2022.05/envs/alphafold-docker/bin/python
 
[illya@alphafold ~]$ python /apps/alphafold/docker/run_docker.py
FATAL Flags parsing error:
  flag --data_dir=None: Flag --data_dir must have a value other than None.
  flag --fasta_paths=None: Flag --fasta_paths must have a value other than None.
  flag --max_template_date=None: Flag --max_template_date must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.
 
[illya@alphafold ~]$
 
[illya@alphafold ~]$ python /apps/alphafold/docker/run_docker.py --fasta_paths=query.fasta --max_template_date=2020-05-14 --data_dir=/AlphafoldData  --db_preset=reduced_dbs

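Results are written to "/tmp/alphafold/". To write them somewhere else, create that folder beforehand and pass it as an absolute path, for example "--output_dir=/home/illya/out".
To keep a log of the run, appending "2>&1 | tee query.log" to the end of the command line should do.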
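
The points above (pre-created output folder, absolute path, log capture) can be combined into one wrapper. This is only a sketch under the assumptions of this page: run_alphafold and DRY_RUN are hypothetical names, and the paths, date and preset are copied from the command shown earlier.

```shell
# Build and run a run_docker.py command, or just print it when DRY_RUN=1.
# run_alphafold is a hypothetical helper; paths follow this page's layout.
run_alphafold() {
    local fasta="$1" output_dir="$2"
    case "$output_dir" in
        /*) ;;
        *) echo "output_dir must be an absolute path: $output_dir" >&2; return 1 ;;
    esac
    # Reuse the positional parameters as the command line to execute.
    set -- python /apps/alphafold/docker/run_docker.py \
        "--fasta_paths=$fasta" \
        --max_template_date=2020-05-14 \
        --data_dir=/AlphafoldData \
        --db_preset=reduced_dbs \
        "--output_dir=$output_dir"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
        return 0
    fi
    mkdir -p "$output_dir" || return 1
    # Keep a copy of stdout and stderr next to the results.
    # The exit status is that of tee, so check run.log for failures.
    "$@" 2>&1 | tee "$output_dir/run.log"
}
```

After "module load alphafold-docker", a call such as "run_alphafold query.fasta /home/illya/out" would create the folder, start the prediction, and leave run.log beside the results.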

To predict a complex of multiple subunits:

[illya@alphafold ~]$ cat dimaer.fasta
>XP_009313165.1
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
>XP_009313165.2
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
 
[illya@alphafold ~]$ mkdir dimaer
[illya@alphafold ~]$ python /apps/alphafold/docker/run_docker.py --fasta_paths=dimaer.fasta --max_template_date=2020-05-14 --data_dir=/AlphafoldData  --db_preset=reduced_dbs --model_preset=multimer --output_dir=/home/illya/dimaer

The output directory must be created beforehand.
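
Whether to pass "--model_preset=multimer" depends on how many chains the FASTA file holds. The helper below is a minimal sketch that counts header lines; the function names are hypothetical, and it only distinguishes the default monomer preset from multimer.

```shell
# Count FASTA records, that is, header lines starting with ">".
fasta_record_count() {
    grep -c '^>' "$1"
}

# Suggest a --model_preset value: multimer for two or more records,
# monomer otherwise. Fails if the file cannot be read.
suggest_model_preset() {
    local n
    [ -r "$1" ] || return 1
    n="$(fasta_record_count "$1")"
    if [ "$n" -ge 2 ]; then
        echo multimer
    else
        echo monomer
    fi
}
```

It could be used as "--model_preset=$(suggest_model_preset dimaer.fasta)" on the run_docker.py command line.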

memo

To explore the contents of the built docker image: "alphafold/docker/Dockerfile" sets an ENTRYPOINT on the image, so override it like this:

docker run -it --rm --entrypoint /bin/bash alphafold:latest

This opens a shell inside the container so you can look around.
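
To see which ENTRYPOINT is being overridden, the image metadata can be queried without starting a container. A short sketch, assuming the image tag built on this page; image_entrypoint and the DOCKER override are hypothetical names.

```shell
# Print the ENTRYPOINT recorded in an image as JSON.
# DOCKER is a hypothetical override for testing; it defaults to docker.
image_entrypoint() {
    local image="${1:-alphafold:latest}"
    "${DOCKER:-docker}" inspect --format '{{json .Config.Entrypoint}}' "$image"
}
```

Running "image_entrypoint" with no argument queries alphafold:latest.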


Last-modified: 2022-12-17 (Sat) 20:13:46