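AlphaFold: a protein 3D structure prediction program.
Upstream repository: https://github.com/deepmind/alphafold
This page builds version 2.3.0. Upstream assumes that you run it through Docker.
For a native build that does not use Docker (the alphafold_non_docker approach), see alphafold_non_docker
To use Singularity, another container technology, in place of Docker, see Alphafold/Singularity
Docker here means docker-ce, with the NVIDIA Container Toolkit installed on top of it.
First, a look at the machine we are installing on.
The NVIDIA driver is already installed, but the CUDA libraries are not.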
[root@alphafold ~]# cat /etc/redhat-release
Rocky Linux release 9.1 (Blue Onyx)
[root@alphafold ~]# getenforce
Enforcing
[root@alphafold ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 525.60.11 Wed Nov 23 23:04:03 UTC 2022
GCC version: gcc version 11.3.1 20220421 (Red Hat 11.3.1-2) (GCC)
[root@alphafold ~]# nvidia-smi -L
GPU 0: NVIDIA RTX A2000 (UUID: GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084)
[root@alphafold ~]# ls -l /usr/local/cuda
ls: cannot access '/usr/local/cuda': No such file or directory
[root@alphafold ~]#
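The same two host checks (driver present, no CUDA toolkit under /usr/local/cuda) can be scripted. The following is only a minimal sketch: the function names are hypothetical, and the regular expression assumes the "Kernel Module <version>" line format shown in the transcript above, which other driver flavours may not follow.

```python
import re
from pathlib import Path
from typing import Optional

# Assumes the "... Kernel Module 525.60.11 ..." layout seen in the transcript.
NVRM_RE = re.compile(r"Kernel Module\s+(\d+(?:\.\d+)+)")


def parse_driver_version(text: str) -> Optional[str]:
    """Return the driver version found in /proc/driver/nvidia/version text, or None."""
    match = NVRM_RE.search(text)
    return match.group(1) if match else None


def host_summary(proc_file: str = "/proc/driver/nvidia/version",
                 cuda_dir: str = "/usr/local/cuda") -> dict:
    """Collect the two facts checked by hand above."""
    path = Path(proc_file)
    text = path.read_text() if path.exists() else ""
    return {
        "driver_version": parse_driver_version(text),
        "cuda_toolkit_present": Path(cuda_dir).exists(),
    }
```

On the machine described here, `host_summary()` would be expected to report a driver version and `cuda_toolkit_present` as False, which is all the Docker route needs from the host.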
Next, add the docker-ce repository.
[root@alphafold ~]# dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
[root@alphafold ~]# head /etc/yum.repos.d/docker-ce.repo <--- the contents look like this
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-stable-debuginfo]
name=Docker CE Stable - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/debug-$basearch/stable
[root@alphafold ~]#
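If you would rather check the repo file programmatically than eyeball it, the `.repo` format is INI-like and Python's `configparser` can read it. This is a sketch with a hypothetical helper name; note that the `head` output above is truncated, so run it on the full file.

```python
import configparser


def enabled_repos(repo_text: str) -> dict:
    """Map each enabled section of a .repo file to its baseurl and gpgcheck flag."""
    # interpolation=None keeps characters such as '%' in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    result = {}
    for section in parser.sections():
        # dnf treats a missing "enabled" key as enabled=1.
        if parser.get(section, "enabled", fallback="1").strip() != "1":
            continue
        result[section] = {
            "baseurl": parser.get(section, "baseurl", fallback=None),
            "gpgcheck": parser.get(section, "gpgcheck", fallback=None) == "1",
        }
    return result
```

Usage would be along the lines of `enabled_repos(open("/etc/yum.repos.d/docker-ce.repo").read())`; seeing `docker-ce-stable` with `gpgcheck` set to True matches the listing above.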
(Install docker-ce)
[root@alphafold ~]# dnf -y install docker-ce docker-ce-cli containerd.io docker-compose-plugin
[root@alphafold ~]# systemctl status docker
○ docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: inactive (dead)
TriggeredBy: ○ docker.socket
Docs: https://docs.docker.com
[root@alphafold ~]#
(Start docker-ce as a daemon and enable it at boot)
[root@alphafold ~]# systemctl enable docker --now
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[root@alphafold ~]#
Consult https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
and pick the repository to fetch. This machine runs Rocky Linux 9.1, so "rhel9.0" is the one to use.
[root@alphafold ~]# curl -s -o /etc/yum.repos.d/nvidia-docker.repo https://nvidia.github.io/nvidia-docker/rhel9.0/nvidia-docker.repo
(Check the contents)
[root@alphafold ~]# head /etc/yum.repos.d/nvidia-docker.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/stable/centos8/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[root@alphafold ~]# dnf install nvidia-docker2 (I used to list nvidia-container-toolkit and nvidia-container-runtime explicitly, but nvidia-docker2 appears to pull in both)
[root@alphafold ~]# systemctl restart docker
Test
[root@alphafold ~]# nvidia-container-cli info
NVRM version: 525.60.11
CUDA version: 12.0
Device Index: 0
Device Minor: 0
Model: NVIDIA RTX A2000
Brand: NvidiaRTX
GPU UUID: GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084
Bus Location: 00000000:0b:00.0
Architecture: 8.6
[root@alphafold ~]#
[root@alphafold ~]# docker run --gpus all --rm nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
Unable to find image 'nvidia/cuda:11.8.0-base-ubuntu22.04' locally
11.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
6e3729cf69e0: Pull complete
33effac16366: Pull complete
49118e74c29b: Pull complete
b40dd12f6d8e: Pull complete
23773815605e: Pull complete
Digest: sha256:7d667ce4e95c299f701074715138bce548a3c51b07e3b64acd29a971c557c5d8
Status: Downloaded newer image for nvidia/cuda:11.8.0-base-ubuntu22.04
Fri Dec 16 10:19:19 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11 Driver Version: 525.60.11 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA RTX A2000 Off | 00000000:0B:00.0 Off | Off |
| 30% 52C P0 N/A / 70W | 0MiB / 6138MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
[root@alphafold ~]#
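The test above runs an 11.8.0 image on a driver that reports CUDA 12.0. A conservative rule of thumb (my assumption, not something this page states) is that the CUDA version reported by the driver should be at least the CUDA version of the image. The sketch below encodes only that comparison; the function names are hypothetical, and a failing check does not necessarily mean the image cannot run.

```python
import re
from typing import Optional, Tuple


def driver_cuda_version(nvidia_smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'CUDA Version: X.Y' header of nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def image_cuda_version(image: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from an nvidia/cuda tag like '11.8.0-base-ubuntu22.04'."""
    tag = image.rsplit(":", 1)[-1]
    match = re.match(r"(\d+)\.(\d+)", tag)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_covers_image(nvidia_smi_output: str, image: str) -> bool:
    """True when both versions parse and the driver's CUDA version >= the image's."""
    drv = driver_cuda_version(nvidia_smi_output)
    img = image_cuda_version(image)
    return drv is not None and img is not None and drv >= img
```

Feeding it the `nvidia-smi` header from the transcript and the image name `nvidia/cuda:11.8.0-base-ubuntu22.04` gives a passing check, consistent with the successful run above.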
Looks good.
If the machine sits behind a proxy and cannot reach the outside directly, add a drop-in for the Docker service:
[root@alphafold ~]# EDITOR=vim systemctl edit docker.service
[Service]
Environment = 'http_proxy=http://proxy.sybyl.local:10080' 'https_proxy=http://proxy.sybyl.local:10080'
[root@alphafold ~]# systemctl restart docker
After the restart, the Docker daemon pulls images through the proxy.
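When several machines need the same proxy setting, generating the drop-in text is less error-prone than typing it into the editor each time. A minimal sketch follows; the helper name is hypothetical, and it keeps the lowercase `http_proxy`/`https_proxy` variable names and single-quote style used on this page. `systemctl edit docker.service` normally stores its result in /etc/systemd/system/docker.service.d/override.conf, so that is where the text would go.

```python
def render_proxy_dropin(proxy_url: str) -> str:
    """Return a [Service] drop-in exporting http_proxy/https_proxy to dockerd."""
    if not proxy_url.startswith(("http://", "https://")):
        raise ValueError(
            f"proxy URL needs an http:// or https:// scheme: {proxy_url!r}"
        )
    return (
        "[Service]\n"
        f"Environment='http_proxy={proxy_url}' 'https_proxy={proxy_url}'\n"
    )
```

`print(render_proxy_dropin("http://proxy.sybyl.local:10080"))` reproduces the two lines entered by hand above; `systemctl restart docker` is still needed afterwards.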
Next, add each user who will run Docker to the docker group.
[root@alphafold ~]# useradd -m illya
[root@alphafold ~]# usermod -aG docker illya
[root@alphafold ~]# id illya
uid=1000(illya) gid=1000(illya) groups=1000(illya),986(docker)
[root@alphafold ~]# su - illya
[illya@alphafold ~]$ docker run --gpus all --rm nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi -L
GPU 0: NVIDIA RTX A2000 (UUID: GPU-23cc3ee7-31d3-a068-2f61-5aa00052d084)
[illya@alphafold ~]$
That works.
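To confirm that a non-root user sees the same GPUs inside the container as root sees on the host, the two `nvidia-smi -L` outputs can be parsed and compared. This is a sketch only: the names are hypothetical, and the pattern assumes the "GPU N: name (UUID: GPU-...)" line format shown above.

```python
import re
from typing import List, NamedTuple


class Gpu(NamedTuple):
    index: int
    name: str
    uuid: str


# Matches lines such as "GPU 0: NVIDIA RTX A2000 (UUID: GPU-23cc...)".
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)\s*$")


def parse_gpu_list(output: str) -> List[Gpu]:
    """Parse `nvidia-smi -L` output into Gpu records, ignoring unrelated lines."""
    gpus = []
    for line in output.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append(Gpu(int(match.group(1)), match.group(2), match.group(3)))
    return gpus
```

If `parse_gpu_list(host_output) == parse_gpu_list(container_output)` holds, the group membership and the `--gpus all` passthrough are both doing their job.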
[root@alphafold ~]# mkdir /apps
[root@alphafold ~]# cd /apps
[root@alphafold apps]# git clone https://github.com/deepmind/alphafold
[root@alphafold apps]# cd alphafold/
[root@alphafold alphafold]# ls -CF
afdb/ CONTRIBUTING.md docs/ LICENSE README.md run_alphafold.py scripts/
alphafold/ docker/ imgs/ notebooks/ requirements.txt run_alphafold_test.py setup.py
[root@alphafold alphafold]#
[root@alphafold alphafold]# docker build -f docker/Dockerfile -t alphafold .
:
:
[root@alphafold alphafold]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
alphafold latest fcc5a1618a22 2 minutes ago 9.93GB
nvidia/cuda 11.8.0-base-ubuntu22.04 06a52e1c2be9 41 hours ago 239MB
nvidia/cuda 11.1.1-cudnn8-runtime-ubuntu18.04 cd358584cc21 7 weeks ago 4.65GB
[root@alphafold alphafold]#
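For the databases, see Alphafold/database. Here the databases are stored under "/AlphafoldData".
The steps so far produced the container image that runs AlphaFold. Next we prepare the script that launches calculations with that container.
It could be installed directly on the OS, but I want to use it via "module load" with Environment Modules, so I first set up pyenv and Anaconda: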
git clone https://github.com/yyuu/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
pyenv install anaconda3-2022.05
pyenv global anaconda3-2022.05
export PATH=$PYENV_ROOT/versions/anaconda3-2022.05/bin/:$PATH
conda update --all
With the groundwork done, create a conda environment for AlphaFold.
[root@alphafold alphafold]# conda create -n alphafold-docker python=3.8
[root@alphafold alphafold]# source activate alphafold-docker
(alphafold-docker) [root@alphafold alphafold]# cat /apps/alphafold/docker/requirements.txt
# Dependencies necessary to execute run_docker.py
absl-py==1.0.0
docker==5.0.0
(alphafold-docker) [root@alphafold alphafold]# pip install -r /apps/alphafold/docker/requirements.txt
(alphafold-docker) [root@alphafold alphafold]# conda deactivate
[root@alphafold alphafold]#
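To double-check that the active environment really satisfies the pins in docker/requirements.txt, something like the following can be run inside the `alphafold-docker` environment. It is a sketch with hypothetical function names; it only understands plain `name==version` lines, which is all that file contains here, and it relies on `importlib.metadata` from Python 3.8 or later.

```python
from importlib import metadata
from typing import Dict


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Return {package: version} for 'name==version' lines, skipping comments."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def missing_or_mismatched(pins: Dict[str, str]) -> Dict[str, str]:
    """Return {package: problem} for pins the active environment does not satisfy."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if found != wanted:
            problems[name] = f"installed {found}, pinned {wanted}"
    return problems
```

An empty dictionary from `missing_or_mismatched(parse_pins(open("/apps/alphafold/docker/requirements.txt").read()))` means the `pip install -r` step above took effect.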
Next, create a modulefile for Environment Modules at "/apps/modulefiles/alphafold-docker".
[root@alphafold ~]# mkdir -p /apps/modulefiles
[root@alphafold ~]# vi /apps/modulefiles/alphafold-docker
#%Module1.0
set root /apps/pyenv/versions/anaconda3-2022.05/envs/alphafold-docker
prepend-path PATH $root/bin
[root@alphafold ~]#
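The modulefile is short enough to generate, which helps if more conda environments get the same treatment later. The sketch below (hypothetical helper names) writes the same three lines as above, but first checks that the environment actually contains `bin/python`, so a typo in the path fails early rather than at `module load` time.

```python
from pathlib import Path


def render_modulefile(env_root: str) -> str:
    """Return the modulefile text that prepends <env_root>/bin to PATH."""
    lines = [
        "#%Module1.0",
        f"set root {env_root}",
        "prepend-path PATH $root/bin",
    ]
    return "\n".join(lines) + "\n"


def write_modulefile(env_root: str, modulefile: str) -> Path:
    """Write the modulefile after checking that the env really has bin/python."""
    if not (Path(env_root) / "bin" / "python").exists():
        raise FileNotFoundError(f"no bin/python under {env_root}")
    target = Path(modulefile)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_modulefile(env_root))
    return target
```

For this page the call would be `write_modulefile("/apps/pyenv/versions/anaconda3-2022.05/envs/alphafold-docker", "/apps/modulefiles/alphafold-docker")`.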
[illya@alphafold ~]$ vi query.fasta
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE
[illya@alphafold ~]$
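Since a run takes a long time, it can be worth sanity-checking the FASTA file before launching. Below is a minimal reader plus a residue check. The names are hypothetical, and flagging everything outside the 20 standard amino-acid letters is my own conservative assumption, not a rule taken from AlphaFold.

```python
from typing import List, Tuple

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Return [(header, sequence), ...]; sequence lines are joined and upper-cased."""
    records = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence: str) -> List[str]:
    """Return the sorted characters in sequence that are not standard residues."""
    return sorted(set(sequence) - AMINO_ACIDS)
```

For query.fasta above, `read_fasta` yields one record named `dummy_sequence` and `invalid_residues` returns an empty list for it.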
[illya@alphafold ~]$ mkdir /tmp/alphafold <-- first time only
[illya@alphafold ~]$ module use --append /apps/modulefiles/
[illya@alphafold ~]$ module load alphafold-docker
[illya@alphafold ~]$ which python
/apps/pyenv/versions/anaconda3-2022.05/envs/alphafold-docker/bin/python
[illya@alphafold ~]$ python /apps/alphafold/docker/run_docker.py
FATAL Flags parsing error:
flag --data_dir=None: Flag --data_dir must have a value other than None.
flag --fasta_paths=None: Flag --fasta_paths must have a value other than None.
flag --max_template_date=None: Flag --max_template_date must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.
[illya@alphafold ~]$
[illya@alphafold ~]$ python /apps/alphafold/docker/run_docker.py --fasta_paths=query.fasta --max_template_date=2020-05-14 --data_dir=/AlphafoldData --db_preset=reduced_dbs
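Results are written to "/tmp/alphafold/". To write them somewhere else, create that directory in advance and pass it as an absolute path, e.g. "--output_dir=/home/illya/out".
To keep a log of the run, appending "2>&1 | tee query.log" to the command line should do.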
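The command line and the `tee`-style logging can also be wrapped in a small launcher. This is a sketch under assumptions: the helper names are hypothetical, only flags that appear on this page are used, and `run_and_tee` merges stderr into stdout in the same way as `2>&1 | tee`.

```python
import subprocess
import sys
from typing import List, Optional

RUN_DOCKER = "/apps/alphafold/docker/run_docker.py"  # path used on this page


def build_argv(fasta: str, data_dir: str, max_template_date: str,
               db_preset: str = "reduced_dbs",
               model_preset: Optional[str] = None,
               output_dir: Optional[str] = None) -> List[str]:
    """Assemble the run_docker.py command line from the flags shown above."""
    argv = [
        sys.executable, RUN_DOCKER,
        f"--fasta_paths={fasta}",
        f"--max_template_date={max_template_date}",
        f"--data_dir={data_dir}",
        f"--db_preset={db_preset}",
    ]
    if model_preset:
        argv.append(f"--model_preset={model_preset}")
    if output_dir:
        argv.append(f"--output_dir={output_dir}")
    return argv


def run_and_tee(argv: List[str], log_path: str) -> int:
    """Run argv, echoing merged stdout/stderr to the terminal and to log_path."""
    with open(log_path, "w") as log:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
        return proc.wait()
```

After `module load alphafold-docker`, `run_and_tee(build_argv("query.fasta", "/AlphafoldData", "2020-05-14"), "query.log")` would correspond to the command above with logging added; the return value is the exit status of run_docker.py.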
To predict a complex with multiple subunits (multimer):
[illya@centos7 ~]$ cat dimaer.fasta
>XP_009313165.1
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
>XP_009313165.2
MRAAFAEARAALAEGEVPVGCVLVPVDASCAANAQLAADDDDDENKSKGSSNSNNSKKNDAVERLIAARG
RNATNREHHALAHAEFVAVEALLRELAANGQQRPASLAGYVLYVVVEPCIMCAAMLLYNRVQKVFFGCGN
PRFGGNGTVLAVHTAAGCSAPGYESSGGHRADEAVALLQEFYRHENTNAPGHKRRRKCECLNN
[illya@centos7 ~]$ mkdir dimaer
[illya@centos7 ~]$ python /apps/alphafold/docker/run_docker.py --fasta_paths=dimaer.fasta --max_template_date=2020-05-14 --data_dir=/AlphafoldData --db_preset=reduced_dbs --model_preset=multimer --output_dir=/home/illya/dimaer
Note that the output directory has to exist before the run.
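Both points above, picking the multimer preset when the FASTA holds more than one chain and creating the output directory first, can be handled by a small pre-flight step. This is a sketch with hypothetical names. Using `monomer` as the single-chain choice is based on my recollection of the upstream `--model_preset` values, so confirm it with `--helpfull`.

```python
import os
from typing import Tuple


def count_chains(fasta_text: str) -> int:
    """Count FASTA records by counting '>' header lines."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def plan_run(fasta_path: str, output_dir: str) -> Tuple[str, str]:
    """Create output_dir if needed and return (model_preset, absolute_output_dir)."""
    with open(fasta_path) as handle:
        chains = count_chains(handle.read())
    if chains == 0:
        raise ValueError(f"{fasta_path} contains no '>' header lines")
    # Assumption: more than one record means a complex, so use the multimer preset.
    preset = "multimer" if chains > 1 else "monomer"
    absolute = os.path.abspath(output_dir)
    os.makedirs(absolute, exist_ok=True)
    return preset, absolute
```

The returned pair maps directly onto `--model_preset=` and `--output_dir=`; for the two-chain file above it yields `multimer` and an absolute path that already exists.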
Let's look inside the docker image we built. "alphafold/docker/Dockerfile" sets an ENTRYPOINT on the image, so override it with --entrypoint:
docker run -it --rm --entrypoint /bin/bash alphafold:latest
This starts a shell inside the image, where you can inspect its contents.