Upstream: https://github.com/YoshitakaMo/localcolabfold
localcolabfold lets you run ColabFold ("ColabFold: AlphaFold2 using MMseqs2") locally, on your own machine.
The ColabFold authors' own overview of ColabFold is here:
https://docs.google.com/presentation/d/1mnffk23ev2QMDzGZ5w1skXEadTe54l8-Uei6ACce8eI/edit#slide=id.p
Note (May 21, 2024)
Since current GPU-supported jax > 0.4.26 requires CUDA 12.1 or later and cudnn 9, please upgrade or install your CUDA driver and cudnn. CUDA 12.4 is recommended.
The upstream README also says "GNU compiler version is 9.0 or later".
So, checking the current environment:
[root@rockylinux ~]# cat /etc/redhat-release
Rocky Linux release 8.10 (Green Obsidian)
[root@rockylinux ~]#
[root@rockylinux ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.183.01 Sun May 12 19:39:15 UTC 2024
GCC version: gcc version 8.5.0 20210514 (Red Hat 8.5.0-22) (GCC)
[root@rockylinux ~]#
[root@rockylinux ~]# /usr/local/cuda/bin/nvcc --version
-bash: /usr/local/cuda/bin/nvcc: No such file or directory
[root@rockylinux ~]#
No CUDA toolkit such as /usr/local/cuda-12.4 is installed here; as the conda list further below shows, the installer seems to pull in the CUDA 12 runtime as pip packages (nvidia-*-cu12), so only the NVIDIA driver has to be new enough.
The stock gcc on Rocky Linux 8 is the 8.5.0 series, which does not meet the requirement, so raise it with gcc-toolset:
[root@rockylinux ~]# dnf install gcc-toolset-13
:
[root@rockylinux ~]# source scl_source enable gcc-toolset-13
[root@rockylinux ~]# gcc --version
gcc (GCC) 13.2.1 20231205 (Red Hat 13.2.1-6)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[root@rockylinux ~]#
[root@rockylinux ~]# echo $PATH
/opt/rh/gcc-toolset-13/root/usr/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
[root@rockylinux ~]# echo $LD_LIBRARY_PATH
/opt/rh/gcc-toolset-13/root/usr/lib64:/opt/rh/gcc-toolset-13/root/usr/lib
[root@rockylinux ~]#
That should satisfy the compiler requirement, so let's try building with this. Note that "source scl_source enable gcc-toolset-13" only affects the current shell, so run the installer from the same shell.
Clone the source from GitHub, then run the installer from the directory under which you want localcolabfold installed (here /apps):
[root@rockylinux ~]# mkdir -p /apps/src && cd /apps/src/
[root@rockylinux src]# git clone https://github.com/YoshitakaMo/localcolabfold
[root@rockylinux src]# cd localcolabfold/
[root@rockylinux localcolabfold]# git log -1
commit 07e87ed3cc809ff779d0f367a5adeb34df9dfa3c (HEAD -> main, origin/main, origin/HEAD)
Author: Yoshitaka Moriwaki <virgospica93@gmail.com>
Date: Tue Jun 18 16:56:59 2024 +0900
fix to JAX[cuda12]==0.4.28
[root@rockylinux localcolabfold]#
[root@rockylinux localcolabfold]# cd /apps
[root@rockylinux apps]# ./src/localcolabfold/install_colabbatch_linux.sh
Now test whether the GPU can actually be used from the installed environment:
[root@rockylinux ~]# source /apps/localcolabfold/conda/etc/profile.d/conda.sh
[root@rockylinux ~]# conda env list
# conda environments:
#
/apps/localcolabfold/colabfold-conda
base /apps/localcolabfold/conda
[root@rockylinux ~]#
[root@rockylinux ~]# conda activate /apps/localcolabfold/colabfold-conda
(/apps/localcolabfold/colabfold-conda) [root@rockylinux ~]# conda list
:
cudatoolkit 11.8.0 h4ba93d1_13 conda-forge
:
jax 0.4.28 pypi_0 pypi
jax-cuda12-pjrt 0.4.28 pypi_0 pypi
jax-cuda12-plugin 0.4.28 pypi_0 pypi
jaxlib 0.4.28 pypi_0 pypi
:
matplotlib 3.9.0 pypi_0 pypi
:
nvidia-cublas-cu12 12.5.2.13 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.5.39 pypi_0 pypi
nvidia-cuda-nvcc-cu12 12.5.40 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.5.40 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.5.39 pypi_0 pypi
nvidia-cudnn-cu12 8.9.7.29 pypi_0 pypi
nvidia-cufft-cu12 11.2.3.18 pypi_0 pypi
nvidia-cusolver-cu12 11.6.2.40 pypi_0 pypi
nvidia-cusparse-cu12 12.4.1.24 pypi_0 pypi
nvidia-nccl-cu12 2.22.3 pypi_0 pypi
nvidia-nvjitlink-cu12 12.5.40 pypi_0 pypi
:
tensorflow 2.16.1 pypi_0 pypi
tensorflow-cpu 2.16.1 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.37.0 pypi_0 pypi
:
(/apps/localcolabfold/colabfold-conda) [root@rockylinux ~]# python
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from jax.lib import xla_bridge
>>>
>>> print(xla_bridge.get_backend().platform)
gpu
>>>
>>> quit();
(/apps/localcolabfold/colabfold-conda) [root@rockylinux ~]#
So the GPU seems to be recognized.
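As an extra sanity check (a minimal sketch, assuming the colabfold-conda environment is still activated), the same thing can be done as a shell one-liner; jax.devices() is the standard JAX call for listing the devices it can see, and a GPU build should report CUDA devices rather than only a CPU device:
# hypothetical one-liner; prints the list of devices JAX detects
python -c "import jax; print(jax.devices())"
If that looks fine, prepare an Environment Modules file so that users can load localcolabfold: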
[root@rockylinux ~]# vi /apps/modulefiles/localcolabfold
#%Module1.0
set root /apps/localcolabfold/colabfold-conda
prepend-path PATH $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
[root@rockylinux ~]#
The query sequence is supplied as a separate FASTA file. Load the module and check the command-line options:
[saber@rockylinux test]$ module use /apps/modulefiles
[saber@rockylinux test]$ module load localcolabfold
[saber@rockylinux test]$ colabfold_batch -h
usage: colabfold_batch [-h] [--msa-only]
[--msa-mode {mmseqs2_uniref_env,mmseqs2_uniref_env_envpair,mmseqs2_uniref,single_sequence}]
[--pair-mode {unpaired,paired,unpaired_paired}] [--pair-strategy {complete,greedy}] [--templates]
[--custom-template-path CUSTOM_TEMPLATE_PATH] [--pdb-hit-file PDB_HIT_FILE]
[--local-pdb-path LOCAL_PDB_PATH] [--num-recycle NUM_RECYCLE]
[--recycle-early-stop-tolerance RECYCLE_EARLY_STOP_TOLERANCE] [--num-ensemble NUM_ENSEMBLE]
[--num-seeds NUM_SEEDS] [--random-seed RANDOM_SEED] [--num-models {1,2,3,4,5}]
[--model-type {auto,alphafold2,alphafold2_ptm,alphafold2_multimer_v1,alphafold2_multimer_v2,alphafold2_multimer_v3,deepfold_v1}]
[--model-order MODEL_ORDER] [--use-dropout] [--max-seq MAX_SEQ] [--max-extra-seq MAX_EXTRA_SEQ]
[--max-msa MAX_MSA] [--disable-cluster-profile] [--data DATA] [--amber] [--num-relax NUM_RELAX]
[--relax-max-iterations RELAX_MAX_ITERATIONS] [--relax-tolerance RELAX_TOLERANCE]
[--relax-stiffness RELAX_STIFFNESS] [--relax-max-outer-iterations RELAX_MAX_OUTER_ITERATIONS]
[--use-gpu-relax] [--rank {auto,plddt,ptm,iptm,multimer}] [--stop-at-score STOP_AT_SCORE]
[--jobname-prefix JOBNAME_PREFIX] [--save-all] [--save-recycles] [--save-single-representations]
[--save-pair-representations] [--overwrite-existing-results] [--zip]
[--sort-queries-by {none,length,random}] [--host-url HOST_URL] [--disable-unified-memory]
[--recompile-padding RECOMPILE_PADDING]
input results
:
:
[illya@rockylinux test]$ vi query.fasta
>sample
PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK
[saber@rockylinux test]$ colabfold_batch --amber --templates --num-recycle 3 --use-gpu-relax ./query.fasta ./out
2024-06-21 15:58:10,322 Running colabfold 1.5.5 (1648d2335943f9a483b6a803ebaea3e76162c788)
2024-06-21 15:58:10,410 Running on GPU
2024-06-21 15:58:11,297 Failed to extract font properties from /usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: In FT2Font: Can not load face (unknown file format; error code 0x2)
2024-06-21 15:58:11,326 generated new fontManager
2024-06-21 15:58:11,539 Found 9 citations for tools or databases
2024-06-21 15:58:11,540 Query 1/1: sample (length 59)
COMPLETE: 100%|
2024-06-21 15:58:25,690 Sequence 0 found templates: ['3mb2_C', '6bgn_C', '2fm7_B', '4fdx_A', '1otf_D', '3ry0_B', '1bjp_A', '7m59_B', '6fps_P', '5cln_I',
'6fps_R', '5clo_C', '3abf_B', '4faz_C', '7xuy_A', '7puo_F', '4x1c_F', '6ogm_L', '7puo_F', '2op8_A']
2024-06-21 15:58:26,053 Setting max_seq=512, max_extra_seq=5120
2024-06-21 15:59:17,880 alphafold2_ptm_model_1_seed_000 recycle=0 pLDDT=97.9 pTM=0.786
2024-06-21 15:59:21,973 alphafold2_ptm_model_1_seed_000 recycle=1 pLDDT=97.9 pTM=0.792 tol=0.108
2024-06-21 15:59:26,082 alphafold2_ptm_model_1_seed_000 recycle=2 pLDDT=97.9 pTM=0.791 tol=0.0391
2024-06-21 15:59:30,185 alphafold2_ptm_model_1_seed_000 recycle=3 pLDDT=97.8 pTM=0.788 tol=0.0233
2024-06-21 15:59:30,185 alphafold2_ptm_model_1_seed_000 took 58.9s (3 recycles)
2024-06-21 15:59:34,303 alphafold2_ptm_model_2_seed_000 recycle=0 pLDDT=97.9 pTM=0.795
2024-06-21 15:59:38,420 alphafold2_ptm_model_2_seed_000 recycle=1 pLDDT=97.9 pTM=0.802 tol=0.0654
2024-06-21 15:59:42,544 alphafold2_ptm_model_2_seed_000 recycle=2 pLDDT=97.9 pTM=0.8 tol=0.0286
2024-06-21 15:59:46,674 alphafold2_ptm_model_2_seed_000 recycle=3 pLDDT=97.8 pTM=0.8 tol=0.0254
2024-06-21 15:59:46,674 alphafold2_ptm_model_2_seed_000 took 16.5s (3 recycles)
2024-06-21 16:00:04,602 alphafold2_ptm_model_3_seed_000 recycle=0 pLDDT=97.2 pTM=0.774
2024-06-21 16:00:08,700 alphafold2_ptm_model_3_seed_000 recycle=1 pLDDT=97.4 pTM=0.783 tol=0.29
2024-06-21 16:00:12,803 alphafold2_ptm_model_3_seed_000 recycle=2 pLDDT=97.4 pTM=0.783 tol=0.081
2024-06-21 16:00:16,911 alphafold2_ptm_model_3_seed_000 recycle=3 pLDDT=97.4 pTM=0.783 tol=0.0637
2024-06-21 16:00:16,912 alphafold2_ptm_model_3_seed_000 took 30.2s (3 recycles)
2024-06-21 16:00:21,027 alphafold2_ptm_model_4_seed_000 recycle=0 pLDDT=97.3 pTM=0.772
2024-06-21 16:00:25,151 alphafold2_ptm_model_4_seed_000 recycle=1 pLDDT=97.4 pTM=0.781 tol=0.248
2024-06-21 16:00:29,280 alphafold2_ptm_model_4_seed_000 recycle=2 pLDDT=97.2 pTM=0.779 tol=0.0473
2024-06-21 16:00:33,415 alphafold2_ptm_model_4_seed_000 recycle=3 pLDDT=97 pTM=0.778 tol=0.0413
2024-06-21 16:00:33,415 alphafold2_ptm_model_4_seed_000 took 16.5s (3 recycles)
2024-06-21 16:00:37,563 alphafold2_ptm_model_5_seed_000 recycle=0 pLDDT=97.4 pTM=0.783
2024-06-21 16:00:41,715 alphafold2_ptm_model_5_seed_000 recycle=1 pLDDT=97 pTM=0.785 tol=0.237
2024-06-21 16:00:45,862 alphafold2_ptm_model_5_seed_000 recycle=2 pLDDT=96.3 pTM=0.778 tol=0.175
2024-06-21 16:00:50,017 alphafold2_ptm_model_5_seed_000 recycle=3 pLDDT=96.1 pTM=0.776 tol=0.134
2024-06-21 16:00:50,018 alphafold2_ptm_model_5_seed_000 took 16.6s (3 recycles)
2024-06-21 16:00:50,024 reranking models by 'plddt' metric
2024-06-21 16:00:50,186 Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
2024-06-21 16:00:55,138 Relaxation took 5.1s
2024-06-21 16:00:55,139 rank_001_alphafold2_ptm_model_2_seed_000 pLDDT=97.8 pTM=0.8
2024-06-21 16:00:56,272 Relaxation took 1.1s
2024-06-21 16:00:56,272 rank_002_alphafold2_ptm_model_1_seed_000 pLDDT=97.8 pTM=0.788
2024-06-21 16:00:57,413 Relaxation took 1.1s
2024-06-21 16:00:57,413 rank_003_alphafold2_ptm_model_3_seed_000 pLDDT=97.4 pTM=0.783
2024-06-21 16:00:58,291 Relaxation took 0.9s
2024-06-21 16:00:58,291 rank_004_alphafold2_ptm_model_4_seed_000 pLDDT=97 pTM=0.778
2024-06-21 16:00:59,407 Relaxation took 1.1s
2024-06-21 16:00:59,407 rank_005_alphafold2_ptm_model_5_seed_000 pLDDT=96.1 pTM=0.776
2024-06-21 16:01:00,057 Done
[saber@rockylinux test]$
If GPU resources are properly managed by Slurm or OpenPBS this is not a problem, but without such a scheduler, colabfold_batch run as-is appears to grab every GPU in the machine.
All GPUs look occupied, but in practice the computation seems to run on just one of them.
On a machine with two GPUs, if cryoSPARC is running and localcolabfold is started on top of it, the job will probably land on the GPU cryoSPARC is already using; with two cards it would be better to keep them separated.
In that case the environment variable CUDA_VISIBLE_DEVICES can apparently be used to choose which GPU is used:
CUDA_VISIBLE_DEVICES=1 colabfold_batch --amber --templates --num-recycle 3 --use-gpu-relax ./query.fasta ./out
CUDA_VISIBLE_DEVICES numbering starts at 0, so the second card is "1". These indices appear to match the GPU numbers shown by nvidia-smi.
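Incidentally, if the node is managed by Slurm with the GPUs configured as a GRES (i.e. listed in gres.conf), Slurm sets CUDA_VISIBLE_DEVICES for each job by itself and the manual pinning above is unnecessary. A minimal job-script sketch, where the partition name "gpu" and the resource counts are assumptions to adjust to the actual cluster:
#!/bin/bash
#SBATCH --job-name=colabfold
#SBATCH --partition=gpu      # assumed partition name
#SBATCH --gres=gpu:1         # request one GPU; Slurm exports CUDA_VISIBLE_DEVICES for the job
#SBATCH --cpus-per-task=8
module use /apps/modulefiles
module load localcolabfold
colabfold_batch --amber --templates --num-recycle 3 --use-gpu-relax ./query.fasta ./out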
At run time the message
"Failed to extract font properties from /usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: In FT2Font: Can not load face (unknown file format; error code 0x2)"
is printed. It appears to be harmless: matplotlib is just rebuilding its font cache (hence the "generated new fontManager" line), so the message only shows up on the first run.