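Official site: https://www.nextflow.io/
The site describes it as "Data-driven computational pipelines".
It is a pipeline tool: process B starts once process A has finished, and the overall job advances through branches and loops as needed.
It has no GUI like Pipeline Pilot or KNIME, but it is a tool used for sequence analysis. Ready-made pipelines for it are provided by nf-core.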
Installation
Java is required.
[illya@c ~]$ which java
/usr/bin/java
[illya@c ~]$ java -version
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)
[illya@c ~]$ curl -s https://get.nextflow.io | bash
CAPSULE: Downloading dependency org.slf4j:log4j-over-slf4j:jar:1.7.25
CAPSULE: Downloading dependency org.multiverse:multiverse-core:jar:0.7.0
(omitted)
CAPSULE: Downloading dependency commons-codec:commons-codec:jar:1.10
N E X T F L O W
version 21.04.3 build 5560
created 21-07-2021 15:09 UTC (22-07-2021 00:09 JDT)
cite doi:10.1038/nbt.3820
http://nextflow.io
Nextflow installation completed. Please note:
- the executable file `nextflow` has been created in the folder: /home/illya
- you may complete the installation by moving it to a directory in your $PATH
[illya@c ~]$
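The Java check at the top of this transcript can be scripted. The following is a minimal sketch, assuming bash; `java_major` is a hypothetical helper name, not part of Nextflow. It reduces both the legacy `1.8.0_252` style and the newer `11.0.2` style to a major version number. Which Java versions a given Nextflow release accepts is not stated here, so compare the result against the Nextflow documentation for your release.

```shell
#!/usr/bin/env bash
# java_major: hypothetical helper that extracts the major Java version
# from a version string such as "1.8.0_252" (-> 8) or "11.0.2" (-> 11).
java_major() {
    local v="$1"
    # Legacy scheme: "1.N.x" means Java N.
    if [[ "$v" == 1.* ]]; then
        v="${v#1.}"
    fi
    # Keep only the leading run of digits.
    printf '%s\n' "${v%%[!0-9]*}"
}

# `java -version` prints to stderr; pull out the quoted version string.
if command -v java >/dev/null 2>&1; then
    raw="$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)"
    echo "Java major version: $(java_major "$raw")"
else
    echo "java not found on PATH" >&2
fi
```
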
So what exactly is the nextflow file that this produced? Checking:
[illya@c ~]$ ls -l nextflow
-rwx--x--x 1 illya em 15204 Sep 7 03:46 nextflow*
[illya@c ~]$ file nextflow
nextflow: Bourne-Again shell script, ASCII text executable
[illya@c ~]$
so it is a bash script. However, a dot folder (.nextflow) also appears to be created:
[illya@c ~]$ ls -l .nextflow/
total 0
drwxr-xr-x 3 illya em 18 Sep 7 03:46 capsule/
drwxr-xr-x 3 illya em 21 Sep 7 03:46 framework/
drwxr-xr-x 4 illya em 33 Sep 7 03:47 tmp/
[illya@c ~]$ du -hs .nextflow/*
26M .nextflow/capsule
1.7M .nextflow/framework
8.0K .nextflow/tmp
[illya@c ~]$
So it does not seem to be something you can use just by copying the nextflow command on its own.
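If the launcher does depend on that directory, the two would have to travel together. The following is an untested sketch of that idea, assuming bash and GNU tar; `bundle_nextflow` is a hypothetical helper name, and whether the copied pair runs unchanged on another machine is an assumption, not something verified on this page. The "-all" distribution described in the Offline section is the route actually used here.

```shell
#!/usr/bin/env bash
# bundle_nextflow: hypothetical helper that packs the launcher script and
# its support directory into one tarball so they travel together.
#   $1 = path to the nextflow launcher
#   $2 = support directory (normally $HOME/.nextflow)
#   $3 = output tarball (use an absolute path)
bundle_nextflow() {
    local launcher="$1" home_dir="$2" out="$3"
    [ -f "$launcher" ] || { echo "launcher not found: $launcher" >&2; return 1; }
    [ -d "$home_dir" ] || { echo "directory not found: $home_dir" >&2; return 1; }
    # -C changes directory before adding each entry, so the archive
    # stores relative names ("nextflow" and ".nextflow/...").
    tar -czf "$out" \
        -C "$(cd "$(dirname "$launcher")" && pwd)" "$(basename "$launcher")" \
        -C "$(cd "$home_dir/.." && pwd)" "$(basename "$home_dir")"
}
```

A call would look like `bundle_nextflow "$HOME/nextflow" "$HOME/.nextflow" "$HOME/nextflow-bundle.tar.gz"`.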
Test
[illya@c ~]$ ./nextflow run hello
N E X T F L O W ~ version 21.04.3
Pulling nextflow-io/hello ...
downloaded from https://github.com/nextflow-io/hello.git
Launching `nextflow-io/hello` [happy_darwin] - revision: ec11eb0ec7 [master]
executor > local (4)
[29/5b5a9c] process > sayHello (1) [100%] 4 of 4 ✔
Ciao world!
Hola world!
Hello world!
Bonjour world!
[illya@c ~]$ du -hs .nextflow/*
120K .nextflow/assets
16K .nextflow/cache
26M .nextflow/capsule
1.7M .nextflow/framework
4.0K .nextflow/history
0 .nextflow/plr
0 .nextflow/plugins
16K .nextflow/tmp
[illya@c ~]$
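The new assets directory is where the pulled project lands. As a minimal sketch, assuming the layout is `assets/<owner>/<project>` (which matches `nextflow-io/hello` above), the hypothetical helper `list_assets` below prints what is stored there without starting Nextflow. The built-in `nextflow list` command is the supported way to get the same information.

```shell
#!/usr/bin/env bash
# list_assets: hypothetical helper that prints "<owner>/<project>" for every
# project stored under <nextflow-home>/assets.
#   $1 = Nextflow home directory (defaults to $NXF_HOME, then $HOME/.nextflow)
list_assets() {
    local home_dir="${1:-${NXF_HOME:-$HOME/.nextflow}}"
    local d
    for d in "$home_dir"/assets/*/*/; do
        [ -d "$d" ] || continue      # the glob matched nothing
        d="${d%/}"                   # drop the trailing slash
        printf '%s/%s\n' "$(basename "$(dirname "$d")")" "$(basename "$d")"
    done
}

list_assets
```
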
Offline
Reference: https://nf-co.re/usage/offline
To use Nextflow on a machine with no external network access, download the release file whose name ends in "-all" from https://github.com/nextflow-io/nextflow/releases
[illya@c ~]$ wget https://github.com/nextflow-io/nextflow/releases/download/v21.04.3/nextflow-21.04.3-all
[illya@c ~]$ file nextflow-21.04.3-all
nextflow-21.04.3-all: Zip archive data
[illya@c ~]$
At first glance it looks like a zip file, but the first lines of the file show
[illya@c ~]$ head -n 3 nextflow-21.04.3-all
#!/usr/bin/env bash
#
# Copyright 2020-2021, Seqera Labs
[illya@c ~]$
that it is a bash script, and actually running it gives
[illya@c ~]$ bash nextflow-21.04.3-all
Usage: nextflow [options] COMMAND [arg...]
Options:
-C
Use the specified configuration file(s) overriding any defaults
-D
Set JVM properties
-bg
Execute nextflow in background
-c, -config
Add the specified file to configuration set
-d, -dockerize
Launch nextflow via Docker (experimental)
-h
Print this help
-log
Set nextflow log file path
-q, -quiet
Do not print information messages
-syslog
Send logs to syslog server (eg. localhost:514)
-v, -version
Print the program version
Commands:
clean Clean up project cache and work directories
clone Clone a project into a folder
config Print a project configuration
console Launch Nextflow interactive console
drop Delete the local copy of a project
help Print the usage help for a command
info Print project and system runtime information
kuberun Execute a workflow in a Kubernetes cluster (experimental)
list List all downloaded projects
log Print executions log and runtime info
pull Download or update a project
run Execute a pipeline project
self-update Update nextflow runtime to the latest available version
view View project script file(s)
[illya@c ~]$
so it is usable.
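A file that `file` reports as a zip archive yet starts with a shebang line is consistent with a shell script that has an archive appended to it. The sketch below, assuming bash and GNU grep, checks for both properties; `describe_launcher` is a hypothetical helper name, and looking for the zip local-file signature (`PK` followed by bytes 0x03 0x04) is only a heuristic, not a full archive validation.

```shell
#!/usr/bin/env bash
# describe_launcher: hypothetical helper that reports whether a file starts
# with "#!" and whether it also contains a zip local-file signature
# ("PK" followed by bytes 0x03 0x04) somewhere in its body.
describe_launcher() {
    local f="$1" shebang="no" zip="no"
    [ "$(head -c 2 "$f")" = '#!' ] && shebang="yes"
    # -a treats binary data as text, -F matches the bytes literally;
    # LC_ALL=C keeps grep from interpreting them as multibyte characters.
    LC_ALL=C grep -qaF "$(printf 'PK\003\004')" "$f" && zip="yes"
    echo "shebang=$shebang zip=$zip"
}
```

Running `describe_launcher nextflow-21.04.3-all` would then show both answers on one line.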
Since it will be used offline, set it up in advance so that it does not try to update itself:
[illya@c ~]$ echo "export NXF_OFFLINE='TRUE'" >> ~/.bashrc
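Running that `echo ... >>` line more than once leaves duplicate entries in `~/.bashrc`. A minimal sketch of an idempotent variant, assuming bash and grep; `append_once` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# append_once: hypothetical helper that appends a line to a file only if
# that exact line is not already present.
#   $1 = file to edit, $2 = line to add
append_once() {
    local file="$1" line="$2"
    touch "$file"
    # -F literal match, -x whole line only, -q no output.
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage for the setting above (commented out so that sourcing this
# sketch does not modify anything):
# append_once ~/.bashrc "export NXF_OFFLINE='TRUE'"
```
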
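nf-core
Official site: https://nf-co.re/
Nodes for pipelines?
Prepare the execution environment. On this site an Anaconda environment has already been set up for crYOLO and other tools,
so only the following needs to be run to switch to it.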
export PYENV_ROOT=/apps/pyenv
export PATH=$PYENV_ROOT/bin:$PATH
eval "$(pyenv init - --no-rehash)"
export PATH=$PYENV_ROOT/versions/anaconda3-5.3.1/bin/:$PATH
(if conda reports that it is out of date)
conda update -n base -c defaults conda
After that:
[root@c ~]# conda create --name nf-core python=3.7 nf-core -c bioconda -c conda-forge
[root@c ~]# conda env export -n nf-core > nf-core.yml
For offline use, the packages listed in nf-core.yml have to be collected and the environment built from them...
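As a first step toward collecting them, the package list can be pulled out of the exported file. This is a sketch only, assuming bash and awk, and assuming the usual `conda env export` layout in which conda packages appear as two-space-indented `  - name=version=build` items under `dependencies:` and pip packages sit in a nested `pip:` sub-list; `list_conda_deps` is a hypothetical helper name. Fetching the packages themselves still has to happen on a connected machine.

```shell
#!/usr/bin/env bash
# list_conda_deps: hypothetical helper that prints the conda package specs
# found under "dependencies:" in an exported environment file,
# skipping the nested "pip:" sub-list and its entries.
#   $1 = path to the exported YAML file (e.g. nf-core.yml)
list_conda_deps() {
    awk '
        /^dependencies:/ { in_deps = 1; next }   # start of the list
        /^[^ ]/          { in_deps = 0 }         # next top-level key ends it
        in_deps && /^  - / && !/^  - pip:/ {
            sub(/^  - /, "")                     # strip the list marker
            print
        }
    ' "$1"
}
```

For example, `list_conda_deps nf-core.yml > packages.txt` writes one spec per line.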
[root@c ~]# source activate nf-core
(nf-core) [root@c ~]# nf-core --help
,--./,-.
___ __ __ __ ___ /,-._.--~\
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/tools version 2.1
Usage: nf-core [OPTIONS] COMMAND [ARGS]...
Options:
--version Show the version and exit.
-v, --verbose Print verbose output to the console.
-l, --log-file <filename> Save a verbose log to a file.
--help Show this message and exit.
Commands:
list List available nf-core pipelines with local info.
launch Launch a pipeline using a web GUI or command line prompts.
download Download a pipeline, nf-core/configs and pipeline...
licences List software licences for a given workflow.
create Create a new pipeline using the nf-core template.
lint Check pipeline code against nf-core guidelines.
modules Tools to manage Nextflow DSL2 modules as hosted on...
schema Suite of tools for developers to manage pipeline schema.
bump-version Update nf-core pipeline version number.
sync Sync a pipeline TEMPLATE branch with the nf-core template.
(nf-core) [root@c ~]#
To work offline, use download to fetch the pipeline you want to use. For rnaseq, for example:
(nf-core) [root@c ~]# nf-core download rnaseq
,--./,-.
___ __ __ __ ___ /,-._.--~\
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/tools version 2.1
? Select release / branch: (Use arrow keys)
≫ 3.3 [release]
3.2 [release]
3.1 [release]
3.0 [release]
2.0 [release]
1.4.2 [release]
1.4.1 [release]
1.4 [release]
1.3 [release]
1.2 [release]
1.1 [release]
1.0 [release]
dev [branch]
docs-fix [branch]
master [branch]
pytest-workflow
(selecting 3.3 as-is and pressing Return gives)
In addition to the pipeline code, this tool can download software containers.
? Download software container images: (Use arrow keys)
≫ none
singularity
(if you are not using Singularity containers, just press Return)
If transferring the downloaded files to another system, it can be convenient to have everything compressed in a single
file.
? Choose compression type: (Use arrow keys)
≫ none
tar.gz
tar.bz2
zip
(it asks for the compression type; since the files will be carried to the offline machine, select tar.gz)
?? Choose compression type: tar.gz
INFO Saving 'nf-core/rnaseq' download.py:160
Pipeline revision: '3.3'
Pull containers: 'none'
Output file: 'nf-core-rnaseq-3.3.tar.gz'
INFO Downloading workflow files from GitHub download.py:163
INFO Downloading centralised configs from GitHub download.py:167
INFO Compressing download.. download.py:187
INFO Command to extract files: tar -xzf nf-core-rnaseq-3.3.tar.gz download.py:754
INFO MD5 checksum for 'nf-core-rnaseq-3.3.tar.gz': b3ea432a2a25663c2072c6212d417db5 download.py:795
(nf-core) [root@c ~]# ls -lh nf-core-rnaseq-3.3.tar.gz
-rw-r--r-- 1 root root 5.2M Sep 9 02:46 nf-core-rnaseq-3.3.tar.gz
(nf-core) [root@c ~]#
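After carrying the archive to the offline machine, the MD5 checksum printed in the log can be used to confirm the copy is intact before extracting. A minimal sketch, assuming bash and the GNU coreutils `md5sum` command; `verify_md5` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# verify_md5: hypothetical helper that compares a file's MD5 digest with an
# expected hex string; returns 0 on match and non-zero on mismatch.
#   $1 = file to check, $2 = expected digest
verify_md5() {
    local file="$1" expected="$2" actual
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Usage with the checksum and extract command from the log above
# (commented out because the archive only exists after the download):
# verify_md5 nf-core-rnaseq-3.3.tar.gz b3ea432a2a25663c2072c6212d417db5 \
#     && tar -xzf nf-core-rnaseq-3.3.tar.gz
```

Chaining with `&&` means the archive is only unpacked when the digest matches.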