
Nextflow for Genomics

  • Course summary


    A hands-on course applying Nextflow to a real-world genomics use case: variant calling with GATK.

    This course builds on the Hello Nextflow beginner training and shows how to use Nextflow in a genomics context. You will implement a variant calling pipeline with GATK (Genome Analysis Toolkit), a widely used software package for analyzing high-throughput sequencing data.

  • Additional information


    Technical requirements

    You will need a GitHub account OR a local installation of Nextflow. See Environment options for more details.

    Learning objectives
    • Write a linear workflow to apply variant calling to a single sample
    • Handle accessory files such as index files and reference genome resources appropriately
    • Leverage Nextflow's dataflow paradigm to parallelize per-sample variant calling
    • Implement multi-sample joint calling using relevant channel operators
    Audience & prerequisites
    • Audience: This course is designed for researchers in genomics and related fields who want to develop or customize data analysis pipelines.
    • Skills: Some familiarity with the command line, basic scripting concepts, and common genomics file formats is assumed.
    • Prerequisites: Foundational Nextflow concepts and tooling covered in Hello Nextflow.

Course overview

This course is hands-on, with goal-oriented exercises structured to introduce information gradually.

You will start by running the variant calling tools manually in the terminal to understand the methodology, then progressively build up a Nextflow pipeline that automates and scales the analysis.

Lesson plan

The course is divided into three parts, each focusing on a specific aspect of applying Nextflow to a genomics use case.

Course chapter | Summary | Estimated duration
Part 1: Method overview | Understanding the variant calling methodology and running the tools manually | 30 mins
Part 2: Per-sample variant calling | Building a pipeline that indexes BAM files and calls variants, then scaling to multiple samples | 60 mins
Part 3: Joint calling on a cohort | Adding multi-sample joint genotyping using channel operators to aggregate per-sample outputs | 45 mins

By the end of this course, you will be able to apply foundational Nextflow concepts and tooling to a typical genomics use case.
