Upstream: https://github.com/SBQ-1999/CryFold
CryFold is a software that automatically constructs full-atom 3D structural models of proteins based on cryo-EM density maps and sequence information.
It has two main stages: the first step predicts the Cα atom coordinates from the density map, and the second step builds the full-atom model by combining the sequence and density map information.
Finally, the full-atom model will undergo a post-processing program to generate the final protein model. This post-processing program is modified from ModelAngelo.
nvcc appears to be required, so install the CUDA toolkit first.
[root@rockylinux9 ~]# cat /etc/redhat-release
Rocky Linux release 9.5 (Blue Onyx)
[root@rockylinux9 ~]# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 570.153.02 Tue May 13 16:34:43 UTC 2025
GCC version: gcc version 11.5.0 20240719 (Red Hat 11.5.0-2) (GCC)
[root@rockylinux9 ~]# nvidia-smi -L
GPU 0: NVIDIA GeForce GTX 1070 (UUID: GPU-a49de51b-de1e-52f3-1e3f-ce704e159713)
[root@rockylinux9 ~]# dnf localinstall /Public/cuda/cuda-repo-rhel9-12-8-local-12.8.1_570.124.06-1.x86_64.rpm
[root@rockylinux9 ~]# dnf install cuda-toolkit-12-8
[root@rockylinux9 ~]# dnf remove cuda-repo-rhel9-12-8-local-12.8.1_570.124.06-1.x86_64
[root@rockylinux9 ~]# mkdir -p /apps/modulefiles/cuda
[root@rockylinux9 ~]# vi /apps/modulefiles/cuda/12.8
#%Module1.0
set cuda /usr/local/cuda-12.8
prepend-path PATH $cuda/bin
prepend-path LD_LIBRARY_PATH $cuda/lib64
prepend-path MANPATH $cuda/share/man
[root@rockylinux9 ~]#
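To confirm that the toolkit is reachable once the cuda/12.8 module is loaded, a quick check along these lines may help. This is a minimal sketch: `have_cmd` is a hypothetical helper name, not part of CUDA or Environment Modules.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# With the cuda/12.8 module loaded, nvcc should resolve under /usr/local/cuda-12.8/bin.
if have_cmd nvcc; then
    nvcc --version | tail -n 2
else
    echo "nvcc not found on PATH; load the cuda/12.8 module first" >&2
fi
```

If nvcc is not found, check that the modulefile's `prepend-path PATH` line points at the directory that actually contains it.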
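CryFold is a Python application that is meant to run in a conda virtual environment.
So, as usual, start by building a pyenv/anaconda (miniforge) environment.
If you already have a pyenv/anaconda (miniforge) environment, just source it instead.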
git clone https://github.com/pyenv/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
pyenv install miniforge3-25.1.1-2
After that, run:
[root@rockylinux9 ~]# source /apps/pyenv/versions/miniforge3-25.1.1-2/etc/profile.d/conda.sh
[root@rockylinux9 ~]# conda update -n base -c conda-forge conda
This leaves conda ready for building environments.
Next, create the Python virtual environment for CryFold itself.
[root@rockylinux9 ~]# module use /apps/modulefiles
[root@rockylinux9 ~]# module load cuda/12.8
[root@rockylinux9 ~]# cd /apps/
[root@rockylinux9 apps]# git clone https://github.com/SBQ-1999/CryFold
[root@rockylinux9 apps]# cd CryFold/
[root@rockylinux9 CryFold]# source ./install.sh
:
:
Finished processing dependencies for CryFold==1.3.2
done!
(CryFold) [root@rockylinux9 CryFold]#
(CryFold) [root@rockylinux9 CryFold]# which build
/apps/pyenv/versions/miniforge3-25.1.1-2/envs/CryFold/bin/build
(CryFold) [root@rockylinux9 CryFold]# build -h
usage: build [-h] --map-path MAP_PATH [--sequence-path SEQUENCE_PATH] [--output-dir OUTPUT_DIR] [--device DEVICE] [--mask-path MASK_PATH]
[--fasta-path FASTA_PATH] [--refine-backbone-path REFINE_BACKBONE_PATH] [--crop-length CROP_LENGTH] [--keep-intermediate-results]
[--config-path CONFIG_PATH]
optional arguments:
-h, --help show this help message and exit
Main arguments:
If you are not very familiar with this software, you can just fill in this part of the parameters. These are also the main parameters of this software.
--map-path MAP_PATH, -v MAP_PATH, --v MAP_PATH
input cryo-em density map
--sequence-path SEQUENCE_PATH, -s SEQUENCE_PATH, --s SEQUENCE_PATH
input sequence fasta file
--output-dir OUTPUT_DIR, -o OUTPUT_DIR, --o OUTPUT_DIR
output directory
--device DEVICE, -d DEVICE, --d DEVICE
compute device, pick one of {cpu, cuda:number}. Default set to use cuda.
Additional arguments:
Adjusting these additional parameters can help you build protein models more efficiently.
--mask-path MASK_PATH, -m MASK_PATH, --m MASK_PATH
Providing the mask map corresponding to the original density map can mask out the redundant regions in the original density map.
--fasta-path FASTA_PATH, -f FASTA_PATH, --f FASTA_PATH
The FASTA database containing all sequences
--refine-backbone-path REFINE_BACKBONE_PATH, -r REFINE_BACKBONE_PATH, --r REFINE_BACKBONE_PATH
Protein backbone atom file for refinement and identification.
--crop-length CROP_LENGTH, -n CROP_LENGTH, --n CROP_LENGTH
The CryNet takes in 'crop_length' number of residues at a time. It can trade space for time.
--keep-intermediate-results, -k, --k
Keep intermediate results, ie see_alpha_output and CryNet_round_x
--config-path CONFIG_PATH, -c CONFIG_PATH, --c CONFIG_PATH
Provide an additional parameter file path. It is recommended to have a detailed understanding of this software before using this
parameter, otherwise use the default parameter values.
(CryFold) [root@rockylinux9 CryFold]#
Modulefile /apps/modulefiles/CryFold:
#%Module1.0
module load cuda/12.8
set root /apps/pyenv/versions/miniforge3-25.1.1-2/envs/CryFold
prepend-path PATH $root/bin
[illya@rockylinux9 ~]$ wget -P ./example https://ftp.ebi.ac.uk/pub/databases/emdb/structures/EMD-33306/map/emd_33306.map.gz
[illya@rockylinux9 ~]$ wget https://www.rcsb.org/fasta/entry/7xmv -O ./example/rcsb_pdb_7XMV.fasta
[illya@rockylinux9 ~]$ cd example/
[illya@rockylinux9 example]$ gzip -d emd_33306.map.gz
[illya@rockylinux9 example]$ ls -l
total 65544
-rw-r--r--. 1 illya illya 67109888 Jun 27 2022 emd_33306.map
-rw-r--r--. 1 illya illya 440 May 30 08:40 rcsb_pdb_7XMV.fasta
[illya@rockylinux9 example]$
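As a sanity check on the decompressed map, the file size reported by `ls -l` can be compared with what the MRC/CCP4 layout predicts. The sketch below assumes a cubic box, mode 2 data (4-byte floats) and no extended header. Those are assumptions about this entry, not something read from the file, and `mrc_size` is a hypothetical helper name.

```shell
# Expected size of an MRC/CCP4 map:
#   1024-byte header + nx*ny*nz voxels * bytes per voxel.
# Assumptions: cubic box (nx = ny = nz = n), 4-byte floats, no extended header.
mrc_size() {
    n=$1
    echo $((1024 + n * n * n * 4))
}

mrc_size 256   # prints 67109888
```

The result matches the 67109888 bytes listed above, which is consistent with a 256 x 256 x 256 grid.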
[illya@rockylinux9 example]$ module use /apps/modulefiles/
[illya@rockylinux9 example]$ module load CryFold
Loading CryFold
Loading requirement: cuda/12.8
[illya@rockylinux9 example]$
[illya@rockylinux9 example]$ build -s rcsb_pdb_7XMV.fasta -v emd_33306.map -o out
---------------------------- CryFold -----------------------------
By Baoquan Su, Yang lab.
--------------------- CryFold Stage1 (Predict C-alpha atoms by U-Net) ---------------------
0%| | 0/64.0 [00:00<?, ?it/s]Traceback (most recent call last):
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/bin/build", line 33, in <module>
sys.exit(load_entry_point('CryFold==1.3.2', 'console_scripts', 'build')())
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/CryFold-1.3.2-py3.8.egg/CryFold/build.py", line 185, in main
ca_cif_path = UNet_infer(UNet_args)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/CryFold-1.3.2-py3.8.egg/CryFold/Unet/inference.py", line 136, in infer
out_segmentation[grid_batch] = torch.sigmoid(module(segmentation[grid_batch]))
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/CryFold-1.3.2-py3.8.egg/CryFold/Unet/Unet.py", line 229, in forward
c0 = self.main4(self.attn4(c1,ds_0))
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/apps/pyenv/versions/anaconda3-2024.10-1/envs/CryFold/lib/python3.8/site-packages/CryFold-1.3.2-py3.8.egg/CryFold/Unet/Unet.py", line 128, in forward
y = self.conv2(self.activate_class(self.norm1(torch.cat(y_list,dim=1))))
RuntimeError: CUDA out of memory. Tried to allocate 1.31 GiB (GPU 0; 7.92 GiB total capacity; 4.63 GiB already allocated; 1.22 GiB free; 6.05 GiB reserved in total by PyTorch)
If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
0%| | 0/64.0 [00:01<?, ?it/s]
[illya@rockylinux9 example]$
The GPU ran out of memory: the GTX 1070 (7.92 GiB) is not enough for this map.
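Two things that might be worth trying on a card this small, based only on the error message and the `build -h` text above: set `PYTORCH_CUDA_ALLOC_CONF` as the error suggests, and fall back to `--device cpu` if the GPU run still fails. The wrapper below is a sketch, not a tested fix. `run_cryfold` is a hypothetical name and the value 128 for `max_split_size_mb` is an arbitrary starting point. Since reserved memory (6.05 GiB) is only modestly above allocated memory (4.63 GiB) here, the allocator setting alone may not be enough.

```shell
# Hypothetical wrapper around CryFold's `build` command.
# First attempt: GPU 0 with a smaller allocator split size (value is a guess).
# Second attempt: CPU, which is slower but not limited by GPU memory.
run_cryfold() {
    seq=$1
    map=$2
    out=$3
    PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
        build -s "$seq" -v "$map" -o "$out" -d cuda:0 && return 0
    echo "GPU run failed; retrying on CPU" >&2
    build -s "$seq" -v "$map" -o "$out" -d cpu
}
```

Usage would be `run_cryfold rcsb_pdb_7XMV.fasta emd_33306.map out`. The help text also lists `--mask-path`, which masks out redundant regions of the map, and `--crop-length`, which is described as trading space for time. Note that `--crop-length` is documented for CryNet, so it may not affect the Stage 1 U-Net where this failure occurred.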