Under revision
Upstream project: https://github.com/YoshitakaMo/localcolabfold
localcolabfold lets you run ColabFold's "ColabFold: AlphaFold2 using MMseqs2" locally, on your own machine.
The ColabFold authors' own introduction to ColabFold is here:
https://docs.google.com/presentation/d/1mnffk23ev2QMDzGZ5w1skXEadTe54l8-Uei6ACce8eI/edit#slide=id.p
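I assumed that deploying the colabfold repository from GitHub locally would amount to the same thing as localcolabfold, but apparently it does not:
some adjustment is needed. localcolabfold, by contrast, is said to be built with local execution in mind.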
Note (May 21, 2024)
Since current GPU-supported jax > 0.4.26 requires CUDA 12.1 or later and cudnn 9, please upgrade or install your CUDA driver and cudnn. CUDA 12.4 is recommended.
The upstream README also states "GNU compiler version is 9.0 or later".
So, on this machine:
[root@rockylinux9 ~]# cat /etc/redhat-release
Rocky Linux release 9.6 (Blue Onyx)
[root@rockylinux9 ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 570.181 Release Build (dvs-builder@U22-I3-AF02-20-5) Wed Jul 30 18:41:07 UTC 2025
GCC version: gcc version 11.5.0 20240719 (Red Hat 11.5.0-5) (GCC)
[root@rockylinux9 ~]# /usr/local/cuda/bin/nvcc --version
-bash: /usr/local/cuda/bin/nvcc: No such file or directory
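[root@rockylinux9 ~]#
CUDA toolkit libraries such as /usr/local/cuda-12.4 are not installed.
On Rocky Linux 8 the default gcc is the 8.5.0 series, which does not meet the requirement, so there it would have to be raised with gcc-toolset. On this Rocky Linux 9 host the default gcc already satisfies it: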
[root@rockylinux9 ~]# gcc --version
gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[root@rockylinux9 ~]#
That is the current state. I will try building with this setup.
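The compiler requirement can also be checked mechanically. The following is a minimal sketch, not part of localcolabfold: the function names are made up for illustration, and it assumes the first line of `gcc --version` contains a dotted version number as in the output above.

```python
import re

# Minimum compiler version quoted from the upstream README ("9.0 or later").
MIN_GCC = (9, 0)


def parse_gcc_version(first_line):
    """Return the first dotted version number in a `gcc --version` line as a tuple."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", first_line)
    if match is None:
        raise ValueError(f"no version number found in: {first_line!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def gcc_is_new_enough(first_line, minimum=MIN_GCC):
    """Compare only as many version components as the minimum specifies."""
    return parse_gcc_version(first_line)[: len(minimum)] >= minimum
```

To use it, pass the first line printed by `gcc --version`, for example the text captured with `subprocess.run(["gcc", "--version"], capture_output=True, text=True)`.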
Clone the source from GitHub, then run the installer from the directory under which localcolabfold should be installed.
[root@rockylinux9 ~]# mkdir -p /apps/src && cd /apps/src/
[root@rockylinux9 src]# git clone https://github.com/YoshitakaMo/localcolabfold
[root@rockylinux9 src]# cd localcolabfold/
[root@rockylinux9 localcolabfold]# git log -1
commit 930cbcd724d3a68bc66622605dccdf7be6456210 (HEAD -> main, origin/main, origin/HEAD)
Author: Kazuya Ujihara <ujihara.kazuya@gmail.com>
Date: Thu Jun 26 23:54:03 2025 +0900
fix directory name in README (#303)
[root@rockylinux9 localcolabfold]#
[root@rockylinux9 localcolabfold]# cd /apps/
[root@rockylinux9 apps]#
[root@rockylinux9 apps]# ./src/localcolabfold/install_colabbatch_linux.sh
[root@rockylinux9 apps]# ls -l localcolabfold/
total 12
drwxr-xr-x. 3 root root 20 Nov 16 02:45 colabfold
drwxr-xr-x. 18 root root 4096 Nov 16 02:40 colabfold-conda
drwxr-xr-x. 18 root root 4096 Nov 16 02:39 conda
-rwxr-xr-x. 1 root root 1536 Nov 16 02:45 update_linux.sh
[root@rockylinux9 apps]#
This completes the installation. Next, prepare a modulefile for it:
[root@rockylinux9 ~]# vi /apps/modulefiles/localcolabfold
#%Module1.0
set root /apps/localcolabfold/colabfold-conda
prepend-path PATH $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
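[root@rockylinux9 ~]#
The sequence to predict is apparently supplied in a separate file. First load the module and check the command-line help: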
[saber@rockylinux test]$ module use /apps/modulefiles
[saber@rockylinux test]$ module load localcolabfold
[saber@rockylinux test]$ colabfold_batch -h
usage: colabfold_batch [-h] [--msa-only]
[--msa-mode {mmseqs2_uniref_env,mmseqs2_uniref_env_envpair,mmseqs2_uniref,single_sequence}]
[--pair-mode {unpaired,paired,unpaired_paired}] [--pair-strategy {complete,greedy}] [--templates]
[--custom-template-path CUSTOM_TEMPLATE_PATH] [--pdb-hit-file PDB_HIT_FILE]
[--local-pdb-path LOCAL_PDB_PATH] [--num-recycle NUM_RECYCLE]
[--recycle-early-stop-tolerance RECYCLE_EARLY_STOP_TOLERANCE] [--num-ensemble NUM_ENSEMBLE]
[--num-seeds NUM_SEEDS] [--random-seed RANDOM_SEED] [--num-models {1,2,3,4,5}]
[--model-type {auto,alphafold2,alphafold2_ptm,alphafold2_multimer_v1,alphafold2_multimer_v2,alphafold2_multimer_v3,deepfold_v1}]
[--model-order MODEL_ORDER] [--use-dropout] [--max-seq MAX_SEQ] [--max-extra-seq MAX_EXTRA_SEQ]
[--max-msa MAX_MSA] [--disable-cluster-profile] [--data DATA] [--amber] [--num-relax NUM_RELAX]
[--relax-max-iterations RELAX_MAX_ITERATIONS] [--relax-tolerance RELAX_TOLERANCE]
[--relax-stiffness RELAX_STIFFNESS] [--relax-max-outer-iterations RELAX_MAX_OUTER_ITERATIONS]
[--use-gpu-relax] [--rank {auto,plddt,ptm,iptm,multimer}] [--stop-at-score STOP_AT_SCORE]
[--jobname-prefix JOBNAME_PREFIX] [--save-all] [--save-recycles] [--save-single-representations]
[--save-pair-representations] [--overwrite-existing-results] [--zip]
[--sort-queries-by {none,length,random}] [--host-url HOST_URL] [--disable-unified-memory]
[--recompile-padding RECOMPILE_PADDING]
input results
:
:
[illya@rockylinux test]$ vi query.fasta
>sample
PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK
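The query file can also be generated from a script instead of an editor. This is a minimal sketch for a single-chain query only; `write_fasta` and its validation against the 20 standard amino-acid letters are illustrative assumptions, not something colabfold_batch provides.

```python
from pathlib import Path

# The 20 standard amino-acid one-letter codes; anything else is rejected here.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def write_fasta(path, name, sequence, width=60):
    """Write one sequence as FASTA and return its length in residues."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and newlines
    bad = sorted(set(seq) - STANDARD_AA)
    if not seq or bad:
        raise ValueError(f"empty sequence or unexpected characters: {bad}")
    # Header line followed by the sequence wrapped at `width` characters.
    lines = [f">{name}"] + [seq[i:i + width] for i in range(0, len(seq), width)]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(seq)
```

Calling `write_fasta("query.fasta", "sample", "PIAQIHILEG...")` with the full sequence above produces the same two-line file.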
[saber@rockylinux test]$ colabfold_batch --amber --templates --num-recycle 3 --use-gpu-relax ./query.fasta ./out
2024-06-21 15:58:10,322 Running colabfold 1.5.5 (1648d2335943f9a483b6a803ebaea3e76162c788)
2024-06-21 15:58:10,410 Running on GPU
2024-06-21 15:58:11,297 Failed to extract font properties from /usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: In FT2Font: Can not load face (unknown file format; error code 0x2)
2024-06-21 15:58:11,326 generated new fontManager
2024-06-21 15:58:11,539 Found 9 citations for tools or databases
2024-06-21 15:58:11,540 Query 1/1: sample (length 59)
COMPLETE: 100%|
2024-06-21 15:58:25,690 Sequence 0 found templates: ['3mb2_C', '6bgn_C', '2fm7_B', '4fdx_A', '1otf_D', '3ry0_B', '1bjp_A', '7m59_B', '6fps_P', '5cln_I',
'6fps_R', '5clo_C', '3abf_B', '4faz_C', '7xuy_A', '7puo_F', '4x1c_F', '6ogm_L', '7puo_F', '2op8_A']
2024-06-21 15:58:26,053 Setting max_seq=512, max_extra_seq=5120
2024-06-21 15:59:17,880 alphafold2_ptm_model_1_seed_000 recycle=0 pLDDT=97.9 pTM=0.786
2024-06-21 15:59:21,973 alphafold2_ptm_model_1_seed_000 recycle=1 pLDDT=97.9 pTM=0.792 tol=0.108
2024-06-21 15:59:26,082 alphafold2_ptm_model_1_seed_000 recycle=2 pLDDT=97.9 pTM=0.791 tol=0.0391
2024-06-21 15:59:30,185 alphafold2_ptm_model_1_seed_000 recycle=3 pLDDT=97.8 pTM=0.788 tol=0.0233
2024-06-21 15:59:30,185 alphafold2_ptm_model_1_seed_000 took 58.9s (3 recycles)
2024-06-21 15:59:34,303 alphafold2_ptm_model_2_seed_000 recycle=0 pLDDT=97.9 pTM=0.795
2024-06-21 15:59:38,420 alphafold2_ptm_model_2_seed_000 recycle=1 pLDDT=97.9 pTM=0.802 tol=0.0654
2024-06-21 15:59:42,544 alphafold2_ptm_model_2_seed_000 recycle=2 pLDDT=97.9 pTM=0.8 tol=0.0286
2024-06-21 15:59:46,674 alphafold2_ptm_model_2_seed_000 recycle=3 pLDDT=97.8 pTM=0.8 tol=0.0254
2024-06-21 15:59:46,674 alphafold2_ptm_model_2_seed_000 took 16.5s (3 recycles)
2024-06-21 16:00:04,602 alphafold2_ptm_model_3_seed_000 recycle=0 pLDDT=97.2 pTM=0.774
2024-06-21 16:00:08,700 alphafold2_ptm_model_3_seed_000 recycle=1 pLDDT=97.4 pTM=0.783 tol=0.29
2024-06-21 16:00:12,803 alphafold2_ptm_model_3_seed_000 recycle=2 pLDDT=97.4 pTM=0.783 tol=0.081
2024-06-21 16:00:16,911 alphafold2_ptm_model_3_seed_000 recycle=3 pLDDT=97.4 pTM=0.783 tol=0.0637
2024-06-21 16:00:16,912 alphafold2_ptm_model_3_seed_000 took 30.2s (3 recycles)
2024-06-21 16:00:21,027 alphafold2_ptm_model_4_seed_000 recycle=0 pLDDT=97.3 pTM=0.772
2024-06-21 16:00:25,151 alphafold2_ptm_model_4_seed_000 recycle=1 pLDDT=97.4 pTM=0.781 tol=0.248
2024-06-21 16:00:29,280 alphafold2_ptm_model_4_seed_000 recycle=2 pLDDT=97.2 pTM=0.779 tol=0.0473
2024-06-21 16:00:33,415 alphafold2_ptm_model_4_seed_000 recycle=3 pLDDT=97 pTM=0.778 tol=0.0413
2024-06-21 16:00:33,415 alphafold2_ptm_model_4_seed_000 took 16.5s (3 recycles)
2024-06-21 16:00:37,563 alphafold2_ptm_model_5_seed_000 recycle=0 pLDDT=97.4 pTM=0.783
2024-06-21 16:00:41,715 alphafold2_ptm_model_5_seed_000 recycle=1 pLDDT=97 pTM=0.785 tol=0.237
2024-06-21 16:00:45,862 alphafold2_ptm_model_5_seed_000 recycle=2 pLDDT=96.3 pTM=0.778 tol=0.175
2024-06-21 16:00:50,017 alphafold2_ptm_model_5_seed_000 recycle=3 pLDDT=96.1 pTM=0.776 tol=0.134
2024-06-21 16:00:50,018 alphafold2_ptm_model_5_seed_000 took 16.6s (3 recycles)
2024-06-21 16:00:50,024 reranking models by 'plddt' metric
2024-06-21 16:00:50,186 Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
2024-06-21 16:00:55,138 Relaxation took 5.1s
2024-06-21 16:00:55,139 rank_001_alphafold2_ptm_model_2_seed_000 pLDDT=97.8 pTM=0.8
2024-06-21 16:00:56,272 Relaxation took 1.1s
2024-06-21 16:00:56,272 rank_002_alphafold2_ptm_model_1_seed_000 pLDDT=97.8 pTM=0.788
2024-06-21 16:00:57,413 Relaxation took 1.1s
2024-06-21 16:00:57,413 rank_003_alphafold2_ptm_model_3_seed_000 pLDDT=97.4 pTM=0.783
2024-06-21 16:00:58,291 Relaxation took 0.9s
2024-06-21 16:00:58,291 rank_004_alphafold2_ptm_model_4_seed_000 pLDDT=97 pTM=0.778
2024-06-21 16:00:59,407 Relaxation took 1.1s
2024-06-21 16:00:59,407 rank_005_alphafold2_ptm_model_5_seed_000 pLDDT=96.1 pTM=0.776
2024-06-21 16:01:00,057 Done
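[saber@rockylinux test]$
If GPU resources are properly managed by a scheduler such as Slurm or OpenPBS, this is not an issue. Otherwise, running the command as-is appears to use every GPU in the machine.
More precisely, it looks as if all GPUs are in use, but the actual processing seems to run on a single GPU.
On a machine with two GPUs, if cryoSPARC is running and you then start localcolabfold, the job will probably also land on the GPU that cryoSPARC is using. With two cards available, it would be better to keep the two apart.
In that case, the environment variable CUDA_VISIBLE_DEVICES can apparently be used to select which GPU to use.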
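CUDA_VISIBLE_DEVICES=1 colabfold_batch --amber --templates --num-recycle 3 --use-gpu-relax ./query.fasta ./out
CUDA_VISIBLE_DEVICES is zero-based, so the second card is "1". Here these indices seemed to match the GPU numbers shown by nvidia-smi. Note that CUDA's default device ordering is not guaranteed to match nvidia-smi; setting CUDA_DEVICE_ORDER=PCI_BUS_ID makes CUDA number the devices in PCI bus order, as nvidia-smi does.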
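When jobs are launched from a script rather than by hand, the same GPU pinning can be expressed by passing a modified environment to the child process. The sketch below assumes the colabfold_batch options used in the run above. The helper names are hypothetical, and setting CUDA_DEVICE_ORDER is an extra added here, not taken from the original command.

```python
import os
import subprocess


def gpu_env(gpu_index, base_env=None):
    """Copy the environment and restrict CUDA to one GPU (zero-based index)."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Number devices in PCI bus order, the order nvidia-smi lists them in.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    return env


def colabfold_command(query, outdir):
    """Argument list mirroring the colabfold_batch invocation shown above."""
    return ["colabfold_batch", "--amber", "--templates", "--num-recycle", "3",
            "--use-gpu-relax", str(query), str(outdir)]


def run_on_gpu(gpu_index, query, outdir):
    """Run colabfold_batch so that it only sees the chosen GPU."""
    return subprocess.run(colabfold_command(query, outdir),
                          env=gpu_env(gpu_index), check=True)
```

For example, `run_on_gpu(1, "./query.fasta", "./out")` would leave GPU 0 free for cryoSPARC.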
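The query sequence is sent to the external server api.colabfold.com, where the MSA is generated, so an internet connection is required.
Unlike AlphaFold2, no local database setup is needed, which is convenient. To keep query sequences from leaving the site and do everything locally, however, the databases have to be set up.
localcolabfold does not include a tool for building them, but ColabFold does: setup_databases.sh.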
[root@rockylinux9 ~]# cd /apps/src/
[root@rockylinux9 src]# git clone https://github.com/sokrypton/ColabFold
[root@rockylinux9 src]# head -n 5 ColabFold/setup_databases.sh
#!/bin/bash -e
# Setup everything for using mmseqs locally
# Set MMSEQS_NO_INDEX to skip the index creation step (not useful for colabfold_search in most cases)
ARIA_NUM_CONN=8
WORKDIR="${1:-$(pwd)}"
[root@rockylinux9 src]#
If the databases should go under /db/localcolabfold:
[root@rockylinux9 src]#
[root@rockylinux9 src]# MMSEQS_NO_INDEX=1 ColabFold/setup_databases.sh /db/localcolabfold
(three days and three nights later...)
[root@rockylinux9 src]# ls -lh /db/localcolabfold
total 817G
-rw-r--r--. 1 root root 0 Nov 17 15:16 COLABDB_READY
-rw-rw-r--. 1 1004 1005 36G Mar 7 2025 colabfold_envdb_202108_db
-rw-rw-r--. 1 1004 1005 45G Mar 7 2025 colabfold_envdb_202108_db_aln
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 colabfold_envdb_202108_db_aln.dbtype
-rw-rw-r--. 1 1004 1005 4.8G Mar 7 2025 colabfold_envdb_202108_db_aln.index
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 colabfold_envdb_202108_db.dbtype
-rw-rw-r--. 1 1004 1005 0 Mar 7 2025 colabfold_envdb_202108_db.GPU_READY
-rw-rw-r--. 1 1004 1005 9.0G Mar 7 2025 colabfold_envdb_202108_db_h
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 colabfold_envdb_202108_db_h.dbtype
-rw-rw-r--. 1 1004 1005 4.6G Mar 7 2025 colabfold_envdb_202108_db_h.index
-rw-rw-r--. 1 1004 1005 4.8G Mar 7 2025 colabfold_envdb_202108_db.index
-rw-rw-r--. 1 1004 1005 7.7G Mar 7 2025 colabfold_envdb_202108_db.lookup
-rw-rw-r--. 1 1004 1005 122G Mar 7 2025 colabfold_envdb_202108_db_seq
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 colabfold_envdb_202108_db_seq.dbtype
-rw-rw-r--. 1 1004 1005 24G Mar 7 2025 colabfold_envdb_202108_db_seq_h
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 colabfold_envdb_202108_db_seq_h.dbtype
-rw-rw-r--. 1 1004 1005 17G Mar 7 2025 colabfold_envdb_202108_db_seq_h.index
-rw-rw-r--. 1 1004 1005 18G Mar 7 2025 colabfold_envdb_202108_db_seq.index
-rw-r--r--. 1 root root 120G Apr 10 2025 colabfold_envdb_202108.tar.gz
-rw-r--r--. 1 root root 0 Nov 17 07:11 DOWNLOADS_READY
drwxr-xr-x. 4 root root 4.0K Nov 17 07:11 pdb
-rw-r--r--. 1 root root 62M Nov 17 15:16 pdb100_230517
-rw-r--r--. 1 root root 4 Nov 17 15:16 pdb100_230517.dbtype
-rw-r--r--. 1 root root 28M Nov 17 06:19 pdb100_230517.fasta.gz
-rw-r--r--. 1 root root 27M Nov 17 15:16 pdb100_230517_h
-rw-r--r--. 1 root root 4 Nov 17 15:16 pdb100_230517_h.dbtype
-rw-r--r--. 1 root root 5.9M Nov 17 15:16 pdb100_230517_h.index
-rw-r--r--. 1 root root 5.9M Nov 17 15:16 pdb100_230517.index
-rw-r--r--. 1 root root 6.5M Nov 17 15:16 pdb100_230517.lookup
-rw-r--r--. 1 root root 27M Nov 17 15:16 pdb100_230517_tmp_h
-rw-r--r--. 1 root root 4 Nov 17 15:16 pdb100_230517_tmp_h.dbtype
-rw-r--r--. 1 root root 5.9M Nov 17 15:16 pdb100_230517_tmp_h.index
-rw-rw-r--. 1 1004 1002 60G Jun 13 2023 pdb100_a3m.ffdata
-rw-rw-r--. 1 1004 1002 6.1M Jun 13 2023 pdb100_a3m.ffindex
-rw-r--r--. 1 root root 18G Nov 17 07:11 pdb100_foldseek_230517.tar.gz
-rw-r--r--. 1 root root 0 Nov 17 15:32 PDB100_READY
-rw-r--r--. 1 root root 0 Nov 17 12:42 PDB_MMCIF_READY
-rw-r--r--. 1 root root 0 Nov 17 15:16 PDB_READY
-rw-rw-r--. 1 1004 1005 8.2G Mar 7 2025 uniref30_2302_db
-rw-rw-r--. 1 1004 1005 26G Mar 7 2025 uniref30_2302_db_aln
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 uniref30_2302_db_aln.dbtype
-rw-rw-r--. 1 1004 1005 796M Mar 7 2025 uniref30_2302_db_aln.index
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 uniref30_2302_db.dbtype
-rw-rw-r--. 1 1004 1005 0 Mar 7 2025 uniref30_2302_db.GPU_READY
-rw-rw-r--. 1 1004 1005 4.1G Mar 7 2025 uniref30_2302_db_h
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 uniref30_2302_db_h.dbtype
-rw-rw-r--. 1 1004 1005 811M Mar 7 2025 uniref30_2302_db_h.index
lrwxrwxrwx. 1 root root 24 Nov 17 13:52 uniref30_2302_db.idx_mapping -> uniref30_2302_db_mapping
lrwxrwxrwx. 1 root root 25 Nov 17 13:52 uniref30_2302_db.idx_taxonomy -> uniref30_2302_db_taxonomy
-rw-rw-r--. 1 1004 1005 795M Mar 7 2025 uniref30_2302_db.index
-rw-rw-r--. 1 1004 1005 1.4G Mar 7 2025 uniref30_2302_db.lookup
-rw-r--r--. 1 root root 2.7G Nov 17 13:52 uniref30_2302_db_mapping
-rw-rw-r--. 1 1004 1005 125G Mar 7 2025 uniref30_2302_db_seq
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 uniref30_2302_db_seq.dbtype
-rw-rw-r--. 1 1004 1005 41G Mar 7 2025 uniref30_2302_db_seq_h
-rw-rw-r--. 1 1004 1005 4 Mar 7 2025 uniref30_2302_db_seq_h.dbtype
-rw-rw-r--. 1 1004 1005 8.3G Mar 7 2025 uniref30_2302_db_seq_h.index
-rw-rw-r--. 1 1004 1005 8.5G Mar 7 2025 uniref30_2302_db_seq.index
-rw-rw-r--. 1 root root 654M Aug 4 01:51 uniref30_2302_db_taxonomy
-rw-r--r--. 1 root root 1.9G Nov 17 06:19 uniref30_2302_newtaxonomy.tar.gz
-rw-r--r--. 1 root root 100G Nov 16 17:21 uniref30_2302.tar.gz
-rw-r--r--. 1 root root 0 Nov 17 13:52 UNIREF30_READY
[root@rockylinux9 src]#
This is the resulting layout.
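The zero-byte *_READY files in the listing look like per-stage completion markers, so they offer a quick way to see whether the setup ran to the end. A minimal sketch under that assumption follows. The marker names are copied from the listing above, and `missing_markers` is a hypothetical helper, not a ColabFold tool.

```python
from pathlib import Path

# Marker files seen in the directory listing above (assumed to signal completed stages).
MARKERS = (
    "DOWNLOADS_READY",
    "UNIREF30_READY",
    "COLABDB_READY",
    "PDB_READY",
    "PDB100_READY",
    "PDB_MMCIF_READY",
)


def missing_markers(db_dir, markers=MARKERS):
    """Return the marker files that do not exist under db_dir."""
    root = Path(db_dir)
    return [name for name in markers if not (root / name).is_file()]
```

An empty list from `missing_markers("/db/localcolabfold")` would suggest that every stage finished. A non-empty list names the stages to look at.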
Next, let's use these locally downloaded databases to run localcolabfold entirely locally, without the internet.
At run time the message
"Failed to extract font properties from /usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: In FT2Font: Can not load face (unknown file format; error code 0x2)"
is displayed. It appears to be harmless and seems to show up only on the first run.
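The offline run itself is not shown in these notes. As a rough sketch, assuming the two-step workflow described in the ColabFold README (colabfold_search builds the MSAs from the local databases, then colabfold_batch predicts from them), the commands could be assembled as below. The directory names are placeholders, the helper functions are hypothetical, and this has not been verified on this machine.

```python
import subprocess


def offline_commands(query, db_dir, msa_dir, out_dir):
    """Build the two command lines: local MSA search, then structure prediction."""
    # Step 1: search the local databases; positional arguments are
    # query file, database directory, output directory for the MSAs.
    search = ["colabfold_search", str(query), str(db_dir), str(msa_dir)]
    # Step 2: predict from the generated MSAs instead of the raw FASTA file.
    predict = ["colabfold_batch", str(msa_dir), str(out_dir)]
    return [search, predict]


def run_offline(query, db_dir, msa_dir, out_dir):
    """Run both steps in order, stopping at the first failure."""
    for command in offline_commands(query, db_dir, msa_dir, out_dir):
        subprocess.run(command, check=True)
```

With the paths used here this would be `run_offline("./query.fasta", "/db/localcolabfold", "./msas", "./out")`. Options such as --amber can be appended to the second command as in the online run.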
Now let's test whether the GPU is usable.
[root@rockylinux9 ~]# source /apps/localcolabfold/conda/etc/profile.d/conda.sh
[root@rockylinux9 ~]# conda env list
# conda environments:
#
# * -> active
# + -> frozen
/apps/localcolabfold/colabfold-conda
base /apps/localcolabfold/conda
[root@rockylinux9 ~]# conda activate /apps/localcolabfold/colabfold-conda
(/apps/localcolabfold/colabfold-conda) [root@rockylinux9 ~]# conda list
:
colabfold 1.5.5 pypi_0 pypi
contourpy 1.3.2 pypi_0 pypi
cudatoolkit 11.8.0 h4ba93d1_13 conda-forge
:
jax 0.5.3 pypi_0 pypi
jax-cuda12-pjrt 0.5.3 pypi_0 pypi
jax-cuda12-plugin 0.5.3 pypi_0 pypi
jaxlib 0.5.3 pypi_0 pypi
:
keras 3.12.0 pypi_0 pypi
:
mmseqs2 18.8cc5c hd6d6fdc_0 bioconda
:
python 3.10.19 h3c07f61_2_cpython conda-forge
:
tensorflow 2.20.0 pypi_0 pypi
tensorflow-cpu 2.20.0 pypi_0 pypi
:
(/apps/localcolabfold/colabfold-conda) [root@rockylinux ~]# python
Python 3.10.19 | packaged by conda-forge | (main, Oct 22 2025, 22:29:10) [GCC 14.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from jax.lib import xla_bridge
>>> print(xla_bridge.get_backend().platform)
gpu
>>>
>>> quit();
(/apps/localcolabfold/colabfold-conda) [root@rockylinux9 ~]#
So the GPU appears to be recognized.
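The same check can be written against JAX's public top-level API instead of the internal `jax.lib.xla_bridge` module. A minimal sketch, with `describe_backend` as a made-up helper name; it has to run inside the colabfold-conda environment to say anything about this installation.

```python
def describe_backend():
    """Report which JAX backend is active and which devices it sees."""
    try:
        import jax
    except ImportError:
        return "jax is not installed in this environment"
    # jax.default_backend() gives the platform name (e.g. "gpu" or "cpu");
    # jax.devices() lists the devices visible to that backend.
    devices = ", ".join(str(device) for device in jax.devices())
    return f"backend={jax.default_backend()} devices=[{devices}]"
```

`print(describe_backend())` combined with CUDA_VISIBLE_DEVICES also makes it easy to confirm which cards a job would see.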