Solution
process '5_rnaseq_call_variants' {
    tag "$sampleId" (1)

    input:
    file genome from genome_file (2)
    file index from genome_index_ch (3)
    file dict from genome_dict_ch (4)
    set sampleId, file(bam), file(bai) from final_output_ch.groupTuple() (5)

    output:
    set sampleId, file('final.vcf') into vcf_files (6)

    script:
    """
    echo "${bam.join('\n')}" > bam.list

    # Variant calling
    java -jar $GATK -T HaplotypeCaller \
        -R $genome -I bam.list \
        -dontUseSoftClippedBases \
        -stand_call_conf 20.0 \
        -o output.gatk.vcf.gz

    # Variant filtering
    java -jar $GATK -T VariantFiltration \
        -R $genome -V output.gatk.vcf.gz \
        -window 35 -cluster 3 \
        -filterName FS -filter "FS > 30.0" \
        -filterName QD -filter "QD < 2.0" \
        -o final.vcf (7)
    """
}
| 1 | the tag line, using the sample ID as the tag. |
| 2 | the genome fasta file. |
| 3 | the genome index from the genome_index_ch channel created in the process 1A_prepare_genome_samtools. |
| 4 | the genome dictionary from the genome_dict_ch channel created in the process 1B_prepare_genome_picard. |
| 5 | the sets grouped by sampleID from the final_output_ch channel created in the process 4_rnaseq_gatk_recalibrate. |
| 6 | the set containing the sample ID and final VCF file. |
| 7 | the line specifying the name of the resulting final VCF file. |
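
The first line of the script block writes every BAM file grouped under the sample to bam.list, one path per line, so that a single -I argument can hand all of them to HaplotypeCaller. A minimal shell sketch of just that step is shown below. The file names rep1.bam and rep2.bam are hypothetical stand-ins for whatever Nextflow stages into the task directory, and printf is used here in place of the Groovy join that the process performs before the script runs.

```shell
# Hypothetical BAM names standing in for the files staged for one sample.
set -- rep1.bam rep2.bam

# Write one path per line, mirroring what bam.join('\n') expands to
# inside the process script.
printf '%s\n' "$@" > bam.list

# Show the resulting list file.
cat bam.list
```

If the sample had a single BAM file, bam.list would simply contain one line, and the rest of the script would run unchanged.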