Task batching
Problem
You have many small tasks that you would like to process in batches to reduce job submission overhead.
Solution
Use the buffer operator to collect the input channel into batches, then refactor the process to accept a list of inputs instead of a single input. One job is then created per batch rather than per task; in the example below, 1000 indices in batches of 10 yield 100 jobs.
Code
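process foo {
    input:
    val indices    // one batch: a list of indices rather than a single value

    script:
    """
    for INDEX in ${indices.join(' ')}; do
        echo "Hello from task \${INDEX}!"
    done
    """
}

workflow {
    // Group the indices into lists of 10; remainder: true keeps an incomplete final batch
    Channel.of(1..1000)
        | buffer(size: 10, remainder: true)
        | foo
}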
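To check how the channel is batched before handing it to the process, the output of buffer can be inspected with the view operator. The following is a minimal sketch, not part of the original example; the smaller range 1..25 is an assumption chosen only so that the final batch is incomplete.

```nextflow
workflow {
    // Illustrative range (assumption): 25 items do not divide evenly by 10
    Channel.of(1..25)
        | buffer(size: 10, remainder: true)
        | view    // prints each emitted batch as a list
}
```

With remainder: true, this should print three lists: two with 10 items each and a final one with the remaining 5. Without that option, the incomplete last batch would be discarded, so those inputs would never reach the process.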
Run it
Run the example using this command:
nextflow run nextflow-io/patterns/task-batching.nf