#author("2022-06-18T19:39:58+00:00","default:sysosa","sysosa")
#author("2022-12-17T11:10:10+00:00","default:sysosa","sysosa")
&size(10){Previous page: [[Alphafold-v2.0.0]]};

Upstream: [[https://github.com/deepmind/alphafold>+https://github.com/deepmind/alphafold]]

AlphaFold is an AI-based program for predicting protein 3D structures.
This page covers the "&color(magenta){alphafold_non_docker};" variant [[https://github.com/kalininalab/alphafold_non_docker>+https://github.com/kalininalab/alphafold_non_docker]], which runs without the docker setup that upstream uses.
For the original docker-based version, see [[Alphafold]].

The machine used here runs CentOS 7.9 with CUDA 11.6 and an RTX A2000.

***Getting the alphafold code [#p605e421]
The repository contains the instructions and scripts for fetching the "Genetic databases" and "model parameters" needed for prediction, so get the code first.
&color(red){*};&size(10){The latest release is v2.2.0, so this is probably unnecessary, but I checked out the v2.2.0 tag anyway};
#code(nonumber){{
[root@centos7 ~]# mkdir -p /apps/src && cd /apps
[root@centos7 apps]# git clone https://github.com/deepmind/alphafold && cd alphafold
[root@centos7 alphafold]# git tag
v2.0.0
v2.0.1
v2.1.0
v2.1.1
v2.1.2
v2.2.0
[root@centos7 alphafold]# git checkout refs/tags/v2.2.0
[root@centos7 alphafold]# git branch --all
* (detached from v2.2.0)
  main
  remotes/origin/HEAD -> origin/main
  remotes/origin/main
[root@centos7 alphafold]#
}}

***Getting the "Genetic databases" and "model parameters" [#ue99cac6]
The scripts that download these ship with the repository.
They need the rsync and aria2 commands, so install those first.
#code(nonumber){{
[root@centos7 ~]# yum install epel-release
[root@centos7 ~]# yum --enablerepo=epel install rsync aria2
}}
Then prepare a directory to hold the data (here "/af") and run the download script.
&color(red){*};&size(10){The data should live on a fast device such as an SSD. With the original full-size BFD you need about 2 TB in total; with the reduced BFD about 1 TB is enough};
#code(nonumber){{
[root@centos7 ~]# /apps/alphafold/scripts/download_all_data.sh /af
(or)
[root@centos7 ~]# /apps/alphafold/scripts/download_all_data.sh /af reduced_dbs
}}
&color(red){*};Passing "reduced_dbs" fetches a reduced BFD (Big Fantastic Database) that is faster to search.
&size(10){The original BFD takes 1.7 TB. With "reduced_dbs" the small BFD is used instead, and the whole data set comes to roughly 600 GB};

&color(red){*};Caution: if you stop "download_all_data.sh" partway and run it again, it starts downloading from the beginning, including the data that was already fetched.
Looking inside "download_all_data.sh", it simply runs the per-dataset scripts one after another, so if it stops partway you can comment out the lines for the scripts that already finished and run it again.

The following files are downloaded.
|BGCOLOR(YELLOW): |BGCOLOR(YELLOW):Script|BGCOLOR(YELLOW):Target|BGCOLOR(YELLOW):Source &size(10){some sources have changed since v2.0.0};|
|1|download_alphafold_params.sh|AlphaFold parameters|_ttps://storage.googleapis.com/alphafold/alphafold_params_2021-10-27.tar&br;(v2.0.0 _ttps://storage.googleapis.com/alphafold/alphafold_params_2021-07-14.tar)|
|2|download_bfd.sh&br;download_small_bfd.sh|BFD|_ttps://storage.googleapis.com/alphafold-databases/casp14_versions/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz&br;_ttps://storage.googleapis.com/alphafold-databases/reduced_dbs/bfd-first_non_consensus_sequences.fasta.gz|
|3|download_mgnify.sh|MGnify|_ttps://storage.googleapis.com/alphafold-databases/casp14_versions/mgy_clusters_2018_12.fa.gz&br;(v2.0.0 _tp://ftp.ebi.ac.uk/pub/databases/metagenomics/peptide_database/2018_12/mgy_clusters.fa.gz)|
|4|download_pdb70.sh|PDB70|_ttp://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/old-releases/pdb70_from_mmcif_200401.tar.gz|
|5|download_pdb_mmcif.sh|PDB mmCIF files|rsync.rcsb.org::ftp_data/structures/divided/mmCIF/|
|6|download_uniclust30.sh|Uniclust30|_ttps://storage.googleapis.com/alphafold-databases/casp14_versions/uniclust30_2018_08_hhsuite.tar.gz&br;(v2.0.0 _ttp://wwwuser.gwdg.de/~compbiol/uniclust/2018_08/uniclust30_2018_08_hhsuite.tar.gz)|
|7|download_uniref90.sh|Uniref90|_tp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz|
|8|download_uniprot.sh|UniProt|_tp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.fasta.gz&br;_tp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz|
|9|download_pdb_seqres.sh|pdb_seqres|_tp://ftp.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt|

Some of the sources have changed from the earlier version (v2.0.0).
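The idea of skipping the datasets that already finished can be sketched as a small wrapper. This is only a sketch under assumptions: the function name resume_downloads and the ".done_<script>" marker files are made up for illustration (download_all_data.sh keeps no such state), the script list is the reduced_dbs set from the table above, and each per-dataset script is assumed to take the download directory as its first argument, the way download_all_data.sh calls it.

```shell
#!/usr/bin/env bash
# Sketch: run the per-dataset download scripts one by one and skip the ones
# that already completed. The ".done_<script>" markers are an assumption of
# this sketch, not something the AlphaFold scripts create themselves.

# Per-dataset scripts for a reduced_dbs download (names from the table above).
AF_DOWNLOAD_SCRIPTS="download_alphafold_params.sh download_small_bfd.sh download_mgnify.sh download_pdb70.sh download_pdb_mmcif.sh download_uniclust30.sh download_uniref90.sh download_uniprot.sh download_pdb_seqres.sh"

# resume_downloads <script_dir> <data_dir>
resume_downloads() {
  local script_dir="$1" data_dir="$2" name marker
  mkdir -p "$data_dir"
  for name in $AF_DOWNLOAD_SCRIPTS; do
    marker="$data_dir/.done_${name%.sh}"
    if [ -e "$marker" ]; then
      echo "skip $name"          # finished in an earlier run
      continue
    fi
    echo "run  $name"
    # Stop at the first failure so the next call resumes from this script.
    bash "$script_dir/$name" "$data_dir" || return 1
    touch "$marker"              # record success only after the script returns 0
  done
}
```

It would be called as "resume_downloads /apps/alphafold/scripts /af". A dataset that was interrupted halfway has no marker yet, so it is fetched again from its own start; only the completed datasets are skipped.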
"UniProt" and "pdb_seqres" appear to have been newly added in v2.1.1.

For mmCIF and UniRef90 it may be better to edit the corresponding scripts and download from a mirror site in Japan.
-download_pdb_mmcif.sh
"&color(crimson){rsync.rcsb.org::};" can apparently be replaced with "&color(magenta){ftp.pdbj.org::};", which seems to be faster than the original.
#code(diff,nonumber){{
--- download_pdb_mmcif.sh.orig  2021-11-22 02:22:45.384886883 +0900
+++ download_pdb_mmcif.sh       2021-11-22 03:22:48.906715254 +0900
@@ -41,8 +41,8 @@
 
 echo "Running rsync to fetch all mmCIF files (note that the rsync progress estimate might be inaccurate)..."
 mkdir --parents "${RAW_DIR}"
-rsync --recursive --links --perms --times --compress --info=progress2 --delete --port=33444 \
-  rsync.rcsb.org::ftp_data/structures/divided/mmCIF/ \
+rsync --recursive --links --perms --times --compress --info=progress2 --delete \
+  ftp.pdbj.org::ftp_data/structures/divided/mmCIF/ \
   "${RAW_DIR}"
 
 echo "Unzipping all mmCIF files..."
}}
-download_uniref90.sh
https://ddbj.nig.ac.jp/public/mirror_database/uniprot/uniref/uniref90/ can probably be used as an alternative source.

The final size of each folder is shown below.
This run used "reduced_dbs", i.e. it was executed as "/apps/alphafold/scripts/download_all_data.sh /af reduced_dbs".
#code(nonumber){{
[root@centos7 ~]# /apps/alphafold/scripts/download_all_data.sh /af reduced_dbs

[root@centos7 ~]# cd /af
[root@centos7 af]# du -hs ./*
64G     ./mgnify
5.3G    ./params
56G     ./pdb70
221G    ./pdb_mmcif
218M    ./pdb_seqres
17G     ./small_bfd
87G     ./uniclust30
104G    ./uniprot
63G     ./uniref90
[root@centos7 af]#
}}
Some files arrive with restrictive permissions such as 0600 (no read access for other users), so fix them as needed. I found such files under pdb70 and uniclust30.
Running "find . ! -perm -o=r -exec chmod o+r {} \;" in /af adds the missing read permission.

&color(red){*};Note
download_pdb_mmcif.sh downloads via rsync, and when run from my site it sometimes fails with the following message.
#code(nonumber){{
rsync: failed to connect to ftp.pdbj.org (133.1.158.161): Connection timed out (110)
rsync error: error in socket IO (code 10) at clientserver.c(125) [Receiver=3.1.2]
}}
A proxy setting may work around this.
Set "export RSYNC_PROXY=<proxy server>:<port>" before running the script.

***alphafold_non_docker runtime environment [#mc0073ff]
Upstream recommends using docker.
As stated at the top, this page builds the alphafold_non_docker variant, which does not use docker. &size(10){For the original docker-based version, see [[Alphafold]]};

[[https://github.com/kalininalab/alphafold_non_docker>+https://github.com/kalininalab/alphafold_non_docker]] uses miniconda.
That would work too, but this site already uses anaconda for [[crYOLO]] and [[topaz]], so I follow the same setup here. &size(10){Using the latest anaconda3-2021.11 rather than anaconda3-5.3.1};
#code(nonumber){{
git clone https://github.com/yyuu/pyenv.git /apps/pyenv
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
pyenv install anaconda3-2021.11
export PATH=$PYENV_ROOT/versions/anaconda3-2021.11/bin:$PATH

# If a pyenv/anaconda environment already exists:
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
eval "$(pyenv init - --no-rehash)"
export PATH=$PYENV_ROOT/versions/anaconda3-2021.11/bin/:$PATH
}}

Create the alphafold_non_docker runtime environment. &size(10){Slightly modified for the RTX A2000};
#code(nonumber){{
[root@centos7 ~]# conda create -n alphafold python==3.8
[root@centos7 ~]# source activate alphafold

(alphafold) [root@centos7 ~]# conda install -y -c conda-forge openmm==7.5.1 cudnn==8.2.1.32 cudatoolkit==11.3.1 pdbfixer==1.7
   * original: "conda install -y -c conda-forge openmm==7.5.1 cudnn==8.2.1.32 cudatoolkit==11.0.3 pdbfixer==1.7"

(alphafold) [root@centos7 ~]# conda install -y -c bioconda hmmer==3.3.2 hhsuite==3.3.0 kalign2==2.04
   * same as the original

(alphafold) [root@centos7 apps]# pip install absl-py==0.13.0 biopython==1.79 chex==0.0.7 dm-haiku==0.0.4 dm-tree==0.1.6 \
   immutabledict==2.0.0 jax==0.2.14 ml-collections==0.1.0 numpy==1.19.5 scipy==1.7.0 tensorflow==2.5.0 pandas==1.3.4 tensorflow-cpu==2.5.0
   * same as the original

(alphafold) [root@centos7 apps]# pip install jax==0.2.25 jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
   * original: "pip install --upgrade jax==0.2.14 jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html"

(alphafold) [root@centos7 apps]# pip install protobuf==3.20.0     <-- only if it does not run otherwise
}}
After that, fetch the stereo_chemical_props.txt file needed for mm.
#code(nonumber){{
(alphafold) [root@centos7 ~]# cd /apps
(alphafold) [root@centos7 apps]# wget -P alphafold/alphafold/common/ \
   https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt --no-check-certificate
}}
Apply the patch.
#code(nonumber){{
(alphafold) [root@centos7 apps]# cd /apps/pyenv/versions/anaconda3-2021.11/envs/alphafold/lib/python3.8/site-packages/
(alphafold) [root@centos7 site-packages]# patch -p0 < /apps/alphafold/docker/openmm.patch
(alphafold) [root@centos7 site-packages]# source deactivate
[root@centos7 site-packages]#
}}
Prepare the launcher script.
#code(nonumber){{
[root@centos7 ~]# cd /apps/src
[root@centos7 src]# git clone https://github.com/kalininalab/alphafold_non_docker
[root@centos7 src]# cp alphafold_non_docker/run_alphafold.sh /apps/alphafold/
}}
"/apps/alphafold/run_alphafold.sh" is modified as follows, so that it looks for run_alphafold.py under $alphafold_path (set by the module file in the next section) instead of the current directory.
#code(diff,nonumber){{
--- /apps/alphafold/run_alphafold.sh.orig       2022-04-04 17:49:07.215185888 +0900
+++ /apps/alphafold/run_alphafold.sh    2022-04-04 17:48:57.735109552 +0900
@@ -131,7 +131,7 @@
 fi
 
 # This bash script looks for the run_alphafold.py script in its current working directory, if it does not exist then exits
-current_working_dir=$(pwd)
+current_working_dir=$alphafold_path
 alphafold_script="$current_working_dir/run_alphafold.py"
 
 if [ ! -f "$alphafold_script" ]; then
}}

***EnvironmentModules [#qc8e1246]
Create "/etc/modulefiles/alphafold" with the following contents.
#code(nonumber){{
#%Module1.0
set alphafold_path /apps/alphafold
set root /apps/pyenv/versions/anaconda3-2021.11/envs/alphafold
setenv alphafold_path $alphafold_path
prepend-path PATH $root/bin:$alphafold_path
}}

***Trying it out [#f4456ca3]
With the EnvironmentModules file in place, load the module first and then run the script.
#code(nonumber){{
[saber@centos7 ~]$ module load alphafold

[saber@centos7 ~]$ run_alphafold.sh

Please make sure all required parameters are given
Usage: /apps/alphafold/run_alphafold.sh <OPTIONS>
Required Parameters:
-d <data_dir>         Path to directory of supporting data
-o <output_dir>       Path to a directory that will store the results.
-f <fasta_path>       Path to a FASTA file containing sequence. If a FASTA file contains multiple sequences, then it will be folded as a multimer
-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets
Optional Parameters:
-g <use_gpu>          Enable NVIDIA runtime to run with GPUs (default: true)
-r <run_relax>        Whether to run the final relaxation step on the predicted models. Turning relax off might result in predictions with distracting (snip)
-e <enable_gpu_relax> Run relax on GPU if GPU is enabled (default: true)
-n <openmm_threads>   OpenMM threads (default: all available cores)
-a <gpu_devices>      Comma separated list of devices to pass to 'CUDA_VISIBLE_DEVICES' (default: 0)
-m <model_preset>     Choose preset model configuration - the monomer model, the monomer model with extra ensembling, monomer model with pTM head, or (snip)
-c <db_preset>        Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (snip)
-p <use_precomputed_msas> Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration (snip)
-l <num_multimer_predictions_per_model> How many predictions (each with a different random seed) will be generated per model. E.g. if this is 2 and there (snip)
-b <benchmark>        Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time (snip)

[saber@centos7 ~]$
}}
Run without arguments, the script prints its usage as above (partly abbreviated).
An example run with the sample query follows. The -d path here is /apps/AlphafoldData; point it at your own data directory, which is /af if you followed the download steps on this page.
#code(nonumber){{
[saber@centos7 ~]$ mkdir alphafold && cd $_
[saber@centos7 alphafold]$ cp /apps/src/alphafold_non_docker/example/query.fasta .
[saber@centos7 alphafold]$ cat query.fasta
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE

[saber@centos7 alphafold]$ run_alphafold.sh -d /apps/AlphafoldData -o . -f query.fasta -t 2020-05-14 -c reduced_dbs -g false -m monomer
}}

***Changing num_recycle and the number of cores used by jackhmmer via options [#nd277c33]
I made the following values configurable from the command line: the number of recycles in alphafold, the number of CPUs jackhmmer uses for sequence search, the number of CPUs for hhsearch (used in monomer prediction), and the number of CPUs for hmmsearch (used in multimer prediction).
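The option handling described here boils down to parsing four extra options, falling back to defaults, and turning them into flags for run_alphafold.py. A minimal standalone sketch, assuming the option letters, defaults and flag names used in the patch on this page; the function name build_extra_args is only for illustration and does not exist in run_alphafold.sh:

```shell
#!/usr/bin/env bash
# Sketch of the extra option handling: -C/-N/-h/-H with defaults, printed as
# the flags handed to run_alphafold.py. build_extra_args is a made-up name.
build_extra_args() {
  local OPTIND=1 opt
  # Defaults: 3 recycles, 8 jackhmmer CPUs, 2 hhsearch CPUs, 8 hmmsearch CPUs.
  local num_recycle=3 n_cpu=8 hhsearch_cpu=2 hmmsearch_cpu=8
  # The leading ":" puts getopts in silent mode so errors are handled here.
  while getopts ":C:N:h:H:" opt; do
    case "$opt" in
      C) num_recycle=$OPTARG ;;
      N) n_cpu=$OPTARG ;;
      h) hhsearch_cpu=$OPTARG ;;
      H) hmmsearch_cpu=$OPTARG ;;
      *) echo "unknown or incomplete option: -$OPTARG" >&2; return 1 ;;
    esac
  done
  echo "--num_recycle=$num_recycle --n_cpu=$n_cpu --hhsearch_cpu=$hhsearch_cpu --hmmsearch_cpu=$hmmsearch_cpu"
}
```

For example, "build_extra_args -C 6 -N 16" keeps the hhsearch and hmmsearch defaults and changes only the recycle count and the jackhmmer CPU count. Note that -h and -H are distinct options because getopts is case sensitive.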
"/apps/alphafold/run_alphafold.sh"
#code(diff,nonumber){{
--- ../src/alphafold_non_docker/run_alphafold.sh.orig   2022-06-09 02:34:27.897005704 +0900
+++ run_alphafold.sh    2022-06-12 14:46:32.518462539 +0900
@@ -23,10 +23,15 @@
         echo "-l <num_multimer_predictions_per_model> How many predictions (each with a different random seed) will be (snip)
         echo "-b <benchmark>        Run multiple JAX model evaluations to obtain a timing that excludes the compilation (snip)
         echo ""
+        echo "-C <num_recycle>      ReCycle number [3]"
+        echo "-N <n_cpu>            jackhmmer: number of parallel CPU workers to use for multithreads [8]"
+        echo "-h <hhsearch_cpu>     hhsearch:  number of CPUs to use (for shared memory SMPs) [2](monomer)"
+        echo "-H <hmmsearch_cpu>    hmmsearch: number of parallel CPU workers to use for multithreads [8](multimer)"
+        echo ""
         exit 1
 }
 
-while getopts ":d:o:f:t:g:r:e:n:a:m:c:p:l:b" i; do
+while getopts ":d:o:f:t:g:r:e:n:a:m:c:p:l:C:N:h:H:b" i; do
         case "${i}" in
         d)
                 data_dir=$OPTARG
@@ -67,6 +72,18 @@
         l)
                 num_multimer_predictions_per_model=$OPTARG
         ;;
+        C)
+                num_recycle=$OPTARG
+        ;;
+        N)
+                n_cpu=$OPTARG
+        ;;
+        h)
+                hhsearch_cpu=$OPTARG
+        ;;
+        H)
+                hmmsearch_cpu=$OPTARG
+        ;;
         b)
                 benchmark=true
         ;;
@@ -78,6 +95,18 @@
     usage
 fi
 
+if [[ "$hmmsearch_cpu" == "" ]] ; then
+    hmmsearch_cpu=8
+fi
+if [[ "$hhsearch_cpu" == "" ]] ; then
+    hhsearch_cpu=2
+fi
+if [[ "$n_cpu" == "" ]] ; then
+    n_cpu=8
+fi
+if [[ "$num_recycle" == "" ]] ; then
+    num_recycle=3
+fi
 if [[ "$benchmark" == "" ]] ; then
     benchmark=false
 fi
@@ -131,7 +160,7 @@
 fi
 
 # This bash script looks for the run_alphafold.py script in its current working directory, if it does not exist then exits
-current_working_dir=$(pwd)
+current_working_dir=$alphafold_path
 alphafold_script="$current_working_dir/run_alphafold.py"
 
 if [ ! -f "$alphafold_script" ]; then
@@ -197,5 +226,6 @@
     database_paths="$database_paths --uniclust30_database_path=$uniclust30_database_path --bfd_database_path=$bfd_database_path"
 fi
 
+extra_args="--num_recycle=$num_recycle --n_cpu=$n_cpu --hhsearch_cpu=$hhsearch_cpu --hmmsearch_cpu=$hmmsearch_cpu"
 # Run AlphaFold with required parameters
-$(python $alphafold_script $binary_paths $database_paths $command_args)
+$(python $alphafold_script $binary_paths $database_paths $command_args $extra_args)
}}

"/apps/alphafold/run_alphafold.py"
#code(diff,nonumber){{
--- run_alphafold.py.orig       2022-06-09 02:35:22.146479521 +0900
+++ run_alphafold.py    2022-06-12 14:43:24.842855059 +0900
@@ -128,6 +128,10 @@
                      'Relax on GPU can be much faster than CPU, so it is '
                      'recommended to enable if possible. GPUs must be available'
                      ' if this setting is enabled.')
+flags.DEFINE_integer('num_recycle', None,'num_recycle')
+flags.DEFINE_integer('n_cpu', 8,'n_cpu')
+flags.DEFINE_integer('hhsearch_cpu', 2,'hhsearch_cpu')
+flags.DEFINE_integer('hmmsearch_cpu', 8,'hmmsearch_cpu')
 
 FLAGS = flags.FLAGS
 
@@ -315,6 +319,7 @@
     template_searcher = hmmsearch.Hmmsearch(
         binary_path=FLAGS.hmmsearch_binary_path,
         hmmbuild_binary_path=FLAGS.hmmbuild_binary_path,
+        hmmsearch_cpu=FLAGS.hmmsearch_cpu,
         database_path=FLAGS.pdb_seqres_database_path)
     template_featurizer = templates.HmmsearchHitFeaturizer(
         mmcif_dir=FLAGS.template_mmcif_dir,
@@ -326,6 +331,7 @@
   else:
     template_searcher = hhsearch.HHSearch(
         binary_path=FLAGS.hhsearch_binary_path,
+        hhsearch_cpu=FLAGS.hhsearch_cpu,
         databases=[FLAGS.pdb70_database_path])
     template_featurizer = templates.HhsearchHitFeaturizer(
         mmcif_dir=FLAGS.template_mmcif_dir,
@@ -337,6 +343,7 @@
 
   monomer_data_pipeline = pipeline.DataPipeline(
       jackhmmer_binary_path=FLAGS.jackhmmer_binary_path,
+      n_cpu=FLAGS.n_cpu,
       hhblits_binary_path=FLAGS.hhblits_binary_path,
       uniref90_database_path=FLAGS.uniref90_database_path,
       mgnify_database_path=FLAGS.mgnify_database_path,
@@ -359,6 +366,10 @@
     num_predictions_per_model = 1
     data_pipeline = monomer_data_pipeline
 
+  num_recycle = FLAGS.num_recycle
+  if num_recycle is None:
+    num_recycle = 3
+
   model_runners = {}
   model_names = config.MODEL_PRESETS[FLAGS.model_preset]
   for model_name in model_names:
@@ -367,6 +378,7 @@
       model_config.model.num_ensemble_eval = num_ensemble
     else:
       model_config.data.eval.num_ensemble = num_ensemble
+      model_config.data.common.num_recycle = num_recycle
     model_params = data.get_model_haiku_params(
         model_name=model_name, data_dir=FLAGS.data_dir)
     model_runner = model.RunModel(model_config, model_params)
@@ -417,6 +429,7 @@
       'max_template_date',
       'obsolete_pdbs_path',
       'use_gpu_relax',
+      'num_recycle',
   ])
 
   app.run(main)
}}

"/apps/alphafold/alphafold/data/pipeline.py"
#code(diff,nonumber){{
--- a/alphafold/data/pipeline.py
+++ b/alphafold/data/pipeline.py
@@ -124,15 +124,18 @@ class DataPipeline:
                use_small_bfd: bool,
                mgnify_max_hits: int = 501,
                uniref_max_hits: int = 10000,
+               n_cpu: int = 8,
                use_precomputed_msas: bool = False):
     """Initializes the data pipeline."""
     self._use_small_bfd = use_small_bfd
     self.jackhmmer_uniref90_runner = jackhmmer.Jackhmmer(
         binary_path=jackhmmer_binary_path,
+        n_cpu=n_cpu,
         database_path=uniref90_database_path)
     if use_small_bfd:
       self.jackhmmer_small_bfd_runner = jackhmmer.Jackhmmer(
           binary_path=jackhmmer_binary_path,
+          n_cpu=n_cpu,
           database_path=small_bfd_database_path)
     else:
       self.hhblits_bfd_uniclust_runner = hhblits.HHBlits(
@@ -140,6 +143,7 @@ class DataPipeline:
           databases=[bfd_database_path, uniclust30_database_path])
     self.jackhmmer_mgnify_runner = jackhmmer.Jackhmmer(
         binary_path=jackhmmer_binary_path,
+        n_cpu=n_cpu,
         database_path=mgnify_database_path)
     self.template_searcher = template_searcher
     self.template_featurizer = template_featurizer
}}

"/apps/alphafold/alphafold/data/tools/hhsearch.py"
#code(diff,nonumber){{
--- a/alphafold/data/tools/hhsearch.py
+++ b/alphafold/data/tools/hhsearch.py
@@ -33,6 +33,7 @@ class HHSearch:
                *,
                binary_path: str,
                databases: Sequence[str],
+               hhsearch_cpu: int = 2,
                maxseq: int = 1_000_000):
     """Initializes the Python HHsearch wrapper.
 
@@ -50,6 +51,7 @@ class HHSearch:
     self.binary_path = binary_path
     self.databases = databases
     self.maxseq = maxseq
+    self.hhsearch_cpu = hhsearch_cpu
 
     for database_path in self.databases:
       if not glob.glob(database_path + '_*'):
@@ -79,6 +81,7 @@ class HHSearch:
       cmd = [self.binary_path,
              '-i', input_path,
              '-o', hhr_path,
+             '-cpu', str(self.hhsearch_cpu),
              '-maxseq', str(self.maxseq)
              ] + db_cmd
 
}}

"/apps/alphafold/alphafold/data/tools/hmmsearch.py"
#code(diff,nonumber){{
--- a/alphafold/data/tools/hmmsearch.py
+++ b/alphafold/data/tools/hmmsearch.py
@@ -33,6 +33,7 @@ class Hmmsearch(object):
                binary_path: str,
                hmmbuild_binary_path: str,
                database_path: str,
+               hmmsearch_cpu: int = 8,
                flags: Optional[Sequence[str]] = None):
     """Initializes the Python hmmsearch wrapper.
 
@@ -49,6 +50,7 @@ class Hmmsearch(object):
     self.binary_path = binary_path
     self.hmmbuild_runner = hmmbuild.Hmmbuild(binary_path=hmmbuild_binary_path)
     self.database_path = database_path
+    self.hmmsearch_cpu = hmmsearch_cpu
     if flags is None:
       # Default hmmsearch run settings.
       flags = ['--F1', '0.1',
@@ -89,7 +91,7 @@ class Hmmsearch(object):
     cmd = [
         self.binary_path,
         '--noali',  # Don't include the alignment in stdout.
-        '--cpu', '8'
+        '--cpu', str(self.hmmsearch_cpu)
     ]
     # If adding flags, we have to do so before the output and input:
     if self.flags:
}}
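With the patches above applied, the new options are passed straight through to run_alphafold.py, so it may be worth checking the values before starting a long run. The following is a sketch only: is_positive_int and af_command are hypothetical helper names that exist neither in run_alphafold.sh nor in AlphaFold, and the /af data directory and the template date are simply the values used earlier on this page. The helper prints the command line instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of run_alphafold.sh): succeed only if every
# argument is a positive integer, e.g. values meant for -C, -N, -h and -H.
is_positive_int() {
  local v
  for v in "$@"; do
    case "$v" in
      ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$v" -gt 0 ] || return 1   # reject zero
  done
}

# af_command <fasta> <num_recycle> <n_cpu> <hhsearch_cpu>
# Prints (does not run) a monomer command line that uses the added options.
af_command() {
  local fasta="$1" recycle="$2" n_cpu="$3" hh_cpu="$4"
  if ! is_positive_int "$recycle" "$n_cpu" "$hh_cpu"; then
    echo "option values must be positive integers" >&2
    return 1
  fi
  echo "run_alphafold.sh -d /af -o . -f $fasta -t 2020-05-14 -c reduced_dbs -m monomer -C $recycle -N $n_cpu -h $hh_cpu"
}
```

Calling "af_command query.fasta 3 16 4" prints a command that keeps the default 3 recycles while giving jackhmmer 16 CPUs and hhsearch 4; for a multimer run, -H would take the place of -h.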