Upstream project: https://github.com/YoshitakaMo/localcolabfold

Running ColabFold's "AlphaFold2_advanced" locally.
The ColabFold authors' own introduction to ColabFold is here:
https://docs.google.com/presentation/d/1mnffk23ev2QMDzGZ5w1skXEadTe54l8-Uei6ACce8eI/edit#slide=id.p

Installation (method 1)

Download the localcolabfold installer script and run it; that is all the setup required.

[illya@s ~]$ wget https://raw.githubusercontent.com/YoshitakaMo/localcolabfold/main/install_colabfold_linux.sh
 
[illya@s ~]$ bash ./install_colabfold_linux.sh
wget is /bin/wget
curl is /bin/curl
downloading the original alphafold as /home/illya/colabfold...
Cloning into '/home/illya/colabfold'...
remote: Enumerating objects: 286, done.
remote: Counting objects: 100% (125/125), done.
remote: Compressing objects: 100% (73/73), done.
remote: Total 286 (delta 67), reused 70 (delta 50), pack-reused 161
Receiving objects: 100% (286/286), 5.65 MiB | 656.00 KiB/s, done.
Resolving deltas: 100% (144/144), done.
Applying several patches to be Alphafold2_advanced...
patching file alphafold/common/protein.py
patching file alphafold/model/model.py
patching file alphafold/model/modules.py
patching file alphafold/model/config.py
Downloading AlphaFold2 trained parameters...
Installing Miniconda3 for Linux...
PREFIX=/home/illya/colabfold/conda
Unpacking payload ...
Collecting package metadata (current_repodata.json): done
Solving environment: done
 :
 :
Downloading stereo_chemical_props.txt...
Applying OpenMM patch...
patching file simtk/openmm/app/topology.py
Hunk #1 succeeded at 353 (offset -3 lines).
Enable GPU-accelerated relaxation...
patching file alphafold/relax/amber_minimize.py
Downloading runner.py
Installation of Alphafold2_advanced finished.
[illya@s ~]$
[illya@s ~]$ du -hs ./colabfold
13G     ./colabfold
 
[illya@s ~]$

After that, change the amino acid sequence on the "sequence = ..." line of "colabfold/runner.py" to the sequence you want to predict, then run "runner.py".

[illya@s ~]$ cd colabfold/
[illya@s colabfold]$ vi runner.py
(edit the amino acid sequence)
(start the computation)
[illya@s colabfold]$ colabfold-conda/bin/python3.7 runner.py

All of this can be done under an ordinary user account.
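Editing runner.py by hand for every job gets tedious, so the sequence line can also be rewritten by a small script. The following is a minimal sketch, assuming runner.py contains a top-level assignment of the form sequence = '...'; the helper name set_sequence is hypothetical and not part of localcolabfold.

```python
import re

# Matches a top-level assignment such as:  sequence = 'ABC...'
# ASSUMPTION: runner.py defines the sequence this way; adjust the pattern if not.
_SEQUENCE_LINE = re.compile(r"^sequence\s*=\s*(['\"]).*?\1", re.MULTILINE)


def set_sequence(source, new_sequence):
    """Return `source` with the first `sequence = '...'` assignment replaced.

    Anything after the closing quote on that line (e.g. a trailing comment)
    is left untouched.
    """
    replacement = "sequence = '{}'".format(new_sequence)
    # A function replacement avoids any backslash interpretation by re.
    new_source, count = _SEQUENCE_LINE.subn(lambda m: replacement, source, count=1)
    if count == 0:
        raise ValueError("no `sequence = '...'` assignment found")
    return new_source
```

To use it, read runner.py into a string, pass it through set_sequence, and write the result to a per-job copy rather than overwriting the original.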

Installation (method 2)

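Installing localcolabfold separately for each user works, but on a shared machine it may be better to install it once in a common location and share it.
A dedicated user account could own the shared copy, but here I build the runtime environment as root.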

The steps below follow the contents of "install_colabfold_linux.sh" (a 96-line bash script).

(line 17: fetch the alphafold source with git)
[root@s ~]# git clone https://github.com/deepmind/alphafold /apps/colabfold
 
(line 18: check out the pinned commit)
[root@s ~]# cd /apps/colabfold
[root@s colabfold]# git checkout 1e216f93f06aa04aa699562f504db1d02c3b704c --quiet
 
(lines 23-33: fetch the colabfold scripts and patches)
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/colabfold.py
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/pairmsa.py
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/protein.patch
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/config.patch
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/model.patch
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/sokrypton/ColabFold/main/beta/modules.patch
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/YoshitakaMo/localcolabfold/main/gpurelaxation.patch -O gpurelaxation.patch
[root@s colabfold]# wget -qnc https://raw.githubusercontent.com/soedinglab/hh-suite/master/scripts/reformat.pl
 
(lines 35-38: apply the patches)
[root@s colabfold]# patch -u alphafold/common/protein.py -i protein.patch
[root@s colabfold]# patch -u alphafold/model/model.py -i model.patch
[root@s colabfold]# patch -u alphafold/model/modules.py -i modules.patch
[root@s colabfold]# patch -u alphafold/model/config.py -i config.patch
 
(lines 43-44: fetch the "AlphaFold2 trained parameters" used by alphafold)
[root@s colabfold]# cd ..
[root@s apps]# mkdir /apps/colabfold/alphafold/data/params
[root@s apps]# curl -fL https://storage.googleapis.com/alphafold/alphafold_params_2021-07-14.tar | tar x -C /apps/colabfold/alphafold/data/params
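* If AlphaFold is already set up elsewhere, you can symlink its params directory instead of downloading (rmdir only succeeds while the directory is still empty)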
[root@s apps]# rmdir /apps/colabfold/alphafold/data/params
[root@s apps]# (cd /apps/colabfold/alphafold/data; ln -s /apps/AlphafoldData/params . )
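Whether the parameters were downloaded or symlinked, it is worth confirming that the files are reachable before running a prediction. The following is a minimal sketch, assuming the parameter files are named params_<model_name>.npz directly under alphafold/data/params; the helper name missing_param_files and the model names passed to it are illustrative.

```python
import os


def missing_param_files(data_dir, model_names):
    """Return the expected parameter files that are absent under data_dir/params.

    ASSUMPTION: files follow the pattern params_<model_name>.npz.
    os.path.isfile follows symlinks, so a linked params directory also works.
    """
    missing = []
    for name in model_names:
        path = os.path.join(data_dir, "params", "params_{}.npz".format(name))
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

For example, calling missing_param_files("/apps/colabfold/alphafold/data", ["model_1", "model_1_ptm"]) should return an empty list when both files are in place.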

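From line 57 onward, the script uses Miniconda3 to set up the Python runtime environment.
This machine already uses pyenv/anaconda for crYOLO and other tools, so the steps are adjusted slightly to fit that setup.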

export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
eval "$(pyenv init - --no-rehash)"
export PATH=$PYENV_ROOT/versions/anaconda3-5.3.1/bin/:$PATH
 
(if conda reports that it is out of date)
conda update -n base -c defaults conda

Then, using "install_colabfold_linux.sh" as a reference:

[root@s ~]# which conda
/apps/pyenv/versions/anaconda3-5.3.1/bin/conda
 
[root@s ~]# conda create -n colabfold
[root@s ~]# source activate colabfold
(colabfold) [root@s ~]# conda install -c conda-forge python=3.7 cudnn==8.2.1.32 cudatoolkit==11.1.1 openmm==7.5.1 pdbfixer -y
(colabfold) [root@s ~]# 
(colabfold) [root@s ~]# which python3.7
/apps/pyenv/versions/anaconda3-5.3.1/envs/colabfold/bin/python3.7
 
(colabfold) [root@s ~]# python3.7 -m pip install absl-py==0.13.0 biopython==1.79 chex==0.0.7 dm-haiku==0.0.4 dm-tree==0.1.6 \
     immutabledict==2.0.0 jax==0.2.14 ml-collections==0.1.0 numpy==1.19.5 scipy==1.7.0 tensorflow-gpu==2.5.0
(colabfold) [root@s ~]# python3.7 -m pip install jupyter matplotlib py3Dmol tqdm
(colabfold) [root@s ~]# python3.7 -m pip install --upgrade jax jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html
 
(colabfold) [root@s ~]# wget -q https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt
(colabfold) [root@s ~]# mkdir -p /apps/colabfold/alphafold/common
(colabfold) [root@s ~]# mv stereo_chemical_props.txt /apps/colabfold/alphafold/common
 
(colabfold) [root@s ~]# (cd /apps/pyenv/versions/anaconda3-5.3.1/envs/colabfold/lib/python3.7/site-packages/ && patch -p0 < /apps/colabfold/docker/openmm.patch)
 
(colabfold) [root@s ~]# (cd /apps/colabfold && patch -u alphafold/relax/amber_minimize.py -i gpurelaxation.patch)
 
(colabfold) [root@s ~]# (cd /apps/colabfold && wget -q "https://raw.githubusercontent.com/YoshitakaMo/localcolabfold/main/runner.py")

That completes the installation.
The folder is about 3.5 GB (about 13 MB if params is a symlink).

(colabfold) [root@s apps]# du -hs ./colabfold/
3.5G    ./colabfold/
(colabfold) [root@s apps]#

Tweaks (colabfold/runner.py)

--- colabfold/runner.py.orig    2021-09-13 01:12:02.846070962 +0900
+++ colabfold/runner.py 2021-09-13 01:10:15.455531512 +0900
@@ -640,7 +640,7 @@
       cfg.model.recycle_tol = tol
       cfg.data.eval.num_ensemble = num_ensemble
 
-      params = data.get_model_haiku_params(name,'./alphafold/data')
+      params = data.get_model_haiku_params(name, os.environ['colabfold_path'] + '/alphafold/data')
       model_runner = model.RunModel(cfg, params, is_training=is_training)
       COMPILED = compiled
       recompile = False
@@ -659,7 +659,7 @@
     name = model_name+"_ptm" if use_ptm else model_name
 
     # setup model and/or params
-    params = data.get_model_haiku_params(name, './alphafold/data')
+    params = data.get_model_haiku_params(name, os.environ['colabfold_path'] + '/alphafold/data')
     if use_turbo:
       for k in model_runner.params.keys():
         model_runner.params[k] = params[k]
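The patch above indexes os.environ['colabfold_path'] directly, which raises KeyError when the variable is not set (for example, when the module has not been loaded). As a hedged alternative, the lookup could fall back to the current directory, which matches the original './alphafold/data' behaviour. The helper name colabfold_data_dir is hypothetical and does not exist in runner.py.

```python
import os


def colabfold_data_dir(env=None, default_root="."):
    """Return <root>/alphafold/data, where <root> comes from $colabfold_path.

    Falls back to `default_root` when the variable is unset or empty, so the
    script still works when run from inside the colabfold directory.
    """
    env = os.environ if env is None else env
    root = env.get("colabfold_path") or default_root
    return os.path.join(root, "alphafold", "data")
```

With such a helper, both patched calls would become data.get_model_haiku_params(name, colabfold_data_dir()).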

Tweaks (colabfold/alphafold/common/residue_constants.py)

--- colabfold/alphafold/common/residue_constants.py.orig        2021-09-13 00:20:25.197649133 +0900
+++ colabfold/alphafold/common/residue_constants.py     2021-09-13 01:04:35.235654436 +0900
@@ -20,6 +20,7 @@
 
 import numpy as np
 import tree
+import os
 
 # Internal import (35fd).
 
@@ -403,7 +404,7 @@
     residue_bond_angles: dict that maps resname --> list of BondAngle tuples
   """
   stereo_chemical_props_path = (
-      'alphafold/common/stereo_chemical_props.txt')
+      os.environ['colabfold_path'] + '/alphafold/common/stereo_chemical_props.txt')
   with open(stereo_chemical_props_path, 'rt') as f:
     stereo_chemical_props = f.read()
   lines_iter = iter(stereo_chemical_props.splitlines())

Environment Modules

"/etc/modulefiles/colabfold"

#%Module1.0
set    colabfold        /apps/colabfold
set    root             /apps/pyenv/versions/anaconda3-5.3.1/envs/colabfold
setenv colabfold_path $colabfold
 
prepend-path PYTHONPATH $colabfold
prepend-path PATH       $root/bin:$colabfold
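After "module load colabfold", the modulefile above should leave three things in the environment: colabfold_path set, and /apps/colabfold present in both PYTHONPATH and PATH. The following is a minimal sketch of a sanity check for that, taking the environment as a dictionary; the function name check_module_env is hypothetical.

```python
import os


def check_module_env(env, colabfold_root="/apps/colabfold"):
    """Return a list of problems; an empty list means the environment looks right.

    `colabfold_root` defaults to the install location used on this page.
    """
    problems = []
    if env.get("colabfold_path") != colabfold_root:
        problems.append("colabfold_path is not set to " + colabfold_root)
    for var in ("PYTHONPATH", "PATH"):
        entries = env.get(var, "").split(os.pathsep)
        if colabfold_root not in entries:
            problems.append(var + " does not contain " + colabfold_root)
    return problems
```

Calling check_module_env(dict(os.environ)) in a shell where the module is loaded should return an empty list.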

Trying it out

You could edit and run "/apps/colabfold/runner.py" directly, but in this setup that file is owned by root.
So copy it into your own directory and edit the copy.

[illya@s ~]$ module load colabfold
[illya@s ~]$ mkdir colabfold
[illya@s ~]$ cd colabfold/
[illya@s colabfold]$ cp /apps/colabfold/runner.py .
 
[illya@s colabfold]$ vi runner.py

Then, in the "# define sequence" part of "runner.py", enter the amino acid sequence you want to predict:

sequence = 'PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK'
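A typo in the sequence is easy to miss, so a quick check before launching a long run can help. The following is a minimal sketch that reports any character outside the 20 standard one-letter amino acid codes; it is not part of runner.py, and if runner.py accepts extra characters (such as chain separators for multi-chain input), those would be reported here too.

```python
# The 20 standard amino acids, one-letter codes.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(sequence):
    """Return (position, character) pairs that are not standard residues.

    Positions are 1-based, matching the usual residue numbering.
    """
    return [
        (i, ch)
        for i, ch in enumerate(sequence, start=1)
        if ch not in STANDARD_AMINO_ACIDS
    ]
```

For the sequence above, invalid_residues returns an empty list, since every letter is a standard residue code.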

Then run it:

[illya@s colabfold]$ 
[illya@s colabfold]$ which python
/apps/pyenv/versions/anaconda3-5.3.1/envs/colabfold/bin/python
 
[illya@s colabfold]$ python runner.py

Last-modified: 2021-09-30 (Thu) 02:18:27