DEEPVARIANT
- Genaro Pimienta

- Feb 7
Updated: Feb 9
In this blogpost I explain the architecture of DeepVariant (Google Brain team).
Published in 2018, DeepVariant is a neural network-based variant caller that discovers genomic variants with 99.9% accuracy.
Unlike Bayesian-based variant callers (e.g., HaplotypeCaller), DeepVariant discovers SNPs and INDELs regardless of the sequencing technology used:
Illumina short reads
PacBio long reads
Oxford Nanopore ultralong reads
SNP: single nucleotide polymorphism
INDEL: insertion or deletion
Before DeepVariant became available, the gold-standard variant caller used in actionable genomics was HaplotypeCaller.
Embedded in the Genome Analysis Toolkit (GATK), HaplotypeCaller uses Bayesian statistics to identify genomic variants with ~99.9% accuracy.
However, because the Bayesian model in HaplotypeCaller (and in most other variant callers available today) is optimized for Illumina short reads, it underperforms when analyzing long reads generated by PacBio or Oxford Nanopore Technologies (ONT).
DEEPVARIANT
DeepVariant is based on a convolutional neural network (CNN) with an Inception architecture.
Named GoogLeNet (or Inception-v1) when first published in 2014, this CNN architecture won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for image classification.
Thus far, four versions of Inception have been developed. DeepVariant is based on Inception-v3.
The Inception architecture has four components (Figure 1):
Input Tensor
Stem block
Body block
Head block

In the following sections I describe these four components in the context of DeepVariant.
Reading my previous blogposts WHAT IS AN ARTIFICIAL NEURAL NETWORK? and CONVOLUTIONAL NEURAL NETWORKS will help you understand the text below.
INPUT TENSOR
The input Tensor is an image built from the pileup output, which is generated by a read mapping algorithm (Figure 2). The height of the image represents the read mapping coverage per position in the genome, and the width represents the read mapping window along the genome.

The depth in the Tensor image is composed of six channels, each populated with a statistical feature from the read mapping summary (Figure 3).
The six features are:
Read base (sequencing instrument)
Base call quality (sequencing instrument)
Base mapping quality (read mapping algorithm)
Read alignment strand (read mapping algorithm)
Metadata tag: mapped read supports predicted variant (read mapping algorithm)
Metadata tag: mapped base differs from reference (read mapping algorithm)
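To make the six channels concrete, here is a minimal sketch of how a toy pileup could be packed into a height x width x 6 structure. It is not DeepVariant's actual encoder: the function name encode_pileup, the read dictionary layout, the base codes, and the scaling constants (40 for base quality, 60 for mapping quality) are all assumptions chosen for illustration.

```python
# Toy pileup encoder: rows = reads, columns = positions, depth = 6 channels.
# Every name and scaling constant below is an illustrative assumption.

BASE_CODES = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}  # arbitrary encoding
N_CHANNELS = 6


def encode_pileup(reference, reads, alt_base, variant_pos, height):
    """Return a height x len(reference) x 6 nested list of floats in [0, 1].

    Each read is a dict with keys: start, bases, quals, mapq, reverse.
    Rows beyond the number of reads stay zero-filled (no coverage).
    """
    width = len(reference)
    tensor = [[[0.0] * N_CHANNELS for _ in range(width)] for _ in range(height)]
    for row, read in enumerate(reads[:height]):
        # Does this read carry the candidate alternate allele?
        offset = variant_pos - read["start"]
        supports = (0 <= offset < len(read["bases"])
                    and read["bases"][offset] == alt_base)
        for i, base in enumerate(read["bases"]):
            col = read["start"] + i
            if not 0 <= col < width:
                continue
            tensor[row][col] = [
                BASE_CODES.get(base, 0.0),               # 1. read base
                min(read["quals"][i], 40) / 40.0,        # 2. base call quality
                min(read["mapq"], 60) / 60.0,            # 3. mapping quality
                1.0 if read["reverse"] else 0.5,         # 4. alignment strand
                1.0 if supports else 0.5,                # 5. read supports variant
                1.0 if base != reference[col] else 0.5,  # 6. base differs from reference
            ]
    return tensor


reference = "ACGTACGT"
reads = [
    {"start": 0, "bases": "ACGTAC", "quals": [30] * 6, "mapq": 60, "reverse": False},
    {"start": 2, "bases": "GTTCGT", "quals": [20] * 6, "mapq": 30, "reverse": True},
]
pileup = encode_pileup(reference, reads, alt_base="T", variant_pos=4, height=4)
print(len(pileup), len(pileup[0]), len(pileup[0][0]))  # height, width, channels
```

In this toy example the second read carries a T at position 4 where the reference has an A, so channels 5 and 6 are both switched on for that pixel. Rows without a read stay at zero, which is how missing coverage shows up in the image.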

STEM BLOCK
Equipped with three convolutional layers (dimensions 7x7, 1x1, and 3x3) and two pooling layers (dimensions 3x3), the stem block extracts low-level representational features from the input Tensor (Figure 4).
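A quick way to see what the stem block does to the image is to trace the height and width through its five layers with the standard output-size formula, floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, s the stride, and p the padding. The sketch below assumes the strides and paddings (they are not stated in this post; I borrowed a GoogLeNet-style layout), and the 100 x 221 input size is a placeholder. The names output_size, STEM_LAYERS, and trace_stem are hypothetical.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding). Strides and paddings are assumptions.
STEM_LAYERS = [
    ("conv 7x7", 7, 2, 3),
    ("pool 3x3", 3, 2, 1),
    ("conv 1x1", 1, 1, 0),
    ("conv 3x3", 3, 1, 1),
    ("pool 3x3", 3, 2, 1),
]


def trace_stem(height, width):
    """Return the (height, width) pair before the stem and after each layer."""
    shapes = [(height, width)]
    for _name, kernel, stride, padding in STEM_LAYERS:
        height = output_size(height, kernel, stride, padding)
        width = output_size(width, kernel, stride, padding)
        shapes.append((height, width))
    return shapes


shapes = trace_stem(100, 221)  # placeholder input size
for layer, shape in zip(STEM_LAYERS, shapes[1:]):
    print(layer[0], shape)
```

Under these assumptions only the strided layers (the 7x7 convolution and the two pooling layers) shrink the image. The 1x1 and 3x3 convolutions leave height and width untouched.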

In the stem block, the 1x1 convolutional layer, also called a bottleneck layer, is an adaptation of the Network-in-Network concept published in 2014.
Network In Network —2014
In DeepVariant, the bottleneck layer reduces the number of channels of the feature map it receives, while keeping the height and width intact. This lowers the computational cost of the layers that follow, without sacrificing the depth of the features extracted from the input dataset.
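Here is a minimal sketch of a 1x1 convolution in plain Python. A 1x1 filter looks at one position at a time and mixes only the channels at that position, so the height and width pass through unchanged and the number of output channels equals the number of filters. The function name conv1x1 and the toy weights are my own illustration, with bias and activation left out for brevity.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution (no bias, no activation).

    feature_map: H x W x C_in nested list.
    weights:     C_out x C_in nested list, one weight row per output channel.
    Returns an H x W x C_out nested list.
    """
    return [
        [
            [sum(w * x for w, x in zip(filter_weights, pixel))
             for filter_weights in weights]
            for pixel in row
        ]
        for row in feature_map
    ]


# 2 x 3 feature map with 4 channels at every position
feature_map = [[[1.0, 2.0, 3.0, 4.0] for _ in range(3)] for _ in range(2)]

# Two filters: the mean of channels 0-1 and the mean of channels 2-3
weights = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

reduced = conv1x1(feature_map, weights)
print(len(reduced), len(reduced[0]), len(reduced[0][0]))  # 2 3 2
print(reduced[0][0])  # [1.5, 3.5]
```

The 2 x 3 grid survives and the four channels are squeezed into two. A later 3x3 or 5x5 filter then has half as many input channels to multiply through, which is where the computational saving comes from.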
Check this playlist: C4W1L01 Computer Vision
BODY BLOCK
The body block is built of multiple Inception modules, which extract high-level features from the incoming layer (last one in the stem block) in a computationally efficient manner (Figure 6).
An Inception module combines 1x1 convolution filters with asymmetric convolution filters (1x3 and 3x1) (Figure 6). By stacking multiple Inception modules, the CNN can replace computationally taxing 7x7 and 5x5 convolution filters without losing representational depth.
It is the depth (level of detail) of the representational features extracted by the stacked Inception modules that makes DeepVariant so accurate at predicting genomic variants.
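The saving from asymmetric filters is easy to check with arithmetic. A square n x n filter needs n*n weights per input/output channel pair, while a 1 x n filter followed by an n x 1 filter needs only 2n. The sketch below counts weights for both options. The channel count of 64 is a made-up value, and the helper names conv_weights and factorized_weights are hypothetical.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Number of weights in one convolutional layer (bias ignored)."""
    return kernel_h * kernel_w * channels_in * channels_out


def factorized_weights(n, channels):
    """Weights for a 1 x n convolution followed by an n x 1 convolution."""
    return (conv_weights(1, n, channels, channels)
            + conv_weights(n, 1, channels, channels))


channels = 64  # hypothetical channel count, same in and out
for n in (3, 5, 7):
    full = conv_weights(n, n, channels, channels)
    split = factorized_weights(n, channels)
    print(f"{n}x{n}: {full} weights | 1x{n} + {n}x1: {split} weights "
          f"| ratio {split / full:.2f}")
```

The ratio is 2/n, so the larger the square filter being replaced, the bigger the saving. This counts weights only. The number of multiplications per output position scales the same way.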

HEAD BLOCK
The head block comprises one or more multilayer perceptron classifiers. The first classifier in the head block ingests a flattened matrix, which contains the features extracted by the Inception modules.
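From the representational features extracted by the Inception modules, the multilayer perceptron scores the candidate site in the read mapping pileup output. In DeepVariant, the output is a likelihood for each of three genotypes (homozygous reference, heterozygous, or homozygous alternate), which tells us whether a genomic variant is present or absent.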
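The flatten-then-classify step can be sketched with a tiny multilayer perceptron: flatten the feature map, pass it through one hidden layer with a ReLU activation, and turn the output scores into probabilities with a softmax. Everything numeric here is made up for illustration (the 1 x 2 x 2 feature map, the weights, the layer sizes). The three class labels are the genotype states DeepVariant is usually described as reporting. This is a sketch of the idea, not the network's real head.

```python
import math


def flatten(feature_map):
    """Collapse an H x W x C nested list into one flat feature vector."""
    return [value for row in feature_map for pixel in row for value in pixel]


def dense(vector, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def relu(vector):
    """Set negative activations to zero."""
    return [max(0.0, v) for v in vector]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


GENOTYPES = ["homozygous reference", "heterozygous", "homozygous alternate"]

features = [[[0.2, 0.8], [0.5, 0.1]]]  # toy 1 x 2 x 2 feature map

# Hidden layer: 4 inputs -> 2 units (made-up weights)
w_hidden = [[1.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, -1.0]]
b_hidden = [0.0, 0.0]

# Output layer: 2 hidden units -> 3 genotype scores (made-up weights)
w_out = [[0.0, 1.0],
         [1.0, 0.0],
         [-1.0, 0.0]]
b_out = [0.0, 0.0, 0.0]

hidden = relu(dense(flatten(features), w_hidden, b_hidden))
probs = softmax(dense(hidden, w_out, b_out))
call = GENOTYPES[probs.index(max(probs))]
print(call, [round(p, 3) for p in probs])
```

The class with the highest probability becomes the call. In a trained network the weights are learned from labeled pileup images rather than written by hand.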

FURTHER READING AND WATCHING
Publications:
Network In Network —2014
Videos:
Stay tuned!
GPR
Disclosure: At BioTech Writing and Consulting we believe in the use of AI in Data Science, but do not use AI to generate text or images.
