BCFtools

Introduction

BCFtools is a suite of utilities designed for manipulating and analyzing variant call data stored in VCF and BCF formats. It provides essential tools for filtering, converting, and summarizing VCF information, making it an important component of most variant-calling workflows.

Requirements

BCFtools is available as a module on Delta and the Illinois Campus Cluster (ICC). Loading the module also loads all necessary dependencies:

module load bcftools

Usage

Below are examples of commonly used BCFtools functions, followed by a sample SLURM batch script suitable for processing VCF or BCF files.

View VCFs

View and filter VCF files:

# View a VCF
bcftools view input.vcf.gz

# Filter variants based on quality score
bcftools view -i 'QUAL>30' input.vcf.gz -o filtered.vcf
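
As a sketch of how the filtering command might be wrapped for reuse, the function below writes bgzip-compressed output with `-Oz` so that the result can be indexed directly. The function name `filter_by_qual`, its arguments, and the file names are illustrative assumptions, not part of BCFtools.

```shell
# filter_by_qual INPUT OUTPUT [MIN_QUAL]
# Hypothetical helper: keep records whose QUAL exceeds MIN_QUAL (default 30)
# and write the result to OUTPUT as bgzip-compressed VCF (-Oz).
filter_by_qual() {
    local input="$1"
    local output="$2"
    local min_qual="${3:-30}"
    bcftools view -i "QUAL>${min_qual}" -Oz -o "$output" "$input"
}

# Example call (file names are placeholders):
#   filter_by_qual input.vcf.gz filtered.vcf.gz 50
```

Passing the threshold as an argument keeps the filter expression in one place if several cutoffs need to be compared.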

Indexing VCFs

Indexing enables fast access to records by genomic region and is required by many downstream bioinformatics tools. The file must be bgzip-compressed VCF or BCF:

bcftools index input.vcf.gz
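
By default `bcftools index` writes a CSI index (`.csi`); the `-t` option produces a tabix index (`.tbi`) instead. A minimal sketch of a helper that builds an index only when none exists is shown below. The name `ensure_index` is hypothetical, and the check does not compare timestamps, so a stale index would not be rebuilt.

```shell
# ensure_index FILE
# Hypothetical helper: index a bgzip-compressed VCF/BCF only if neither
# a .csi nor a .tbi index already sits next to it.
ensure_index() {
    local vcf="$1"
    if [ -e "${vcf}.csi" ] || [ -e "${vcf}.tbi" ]; then
        return 0
    fi
    bcftools index "$vcf"
}

# Example call (file name is a placeholder):
#   ensure_index filtered.vcf.gz
```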

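Summarizing VCFs

Generate summary statistics (record counts, transition/transversion ratio, and more) for a VCF file:

bcftools stats input.vcf.gz > input.stats.txt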
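
The report written by `bcftools stats` is tab-separated text in which lines beginning with `SN` hold the summary numbers. One possible way to pull out just those lines is sketched below; the function name `stats_summary` is an illustrative assumption.

```shell
# stats_summary STATS_FILE
# Hypothetical helper: print "key<TAB>value" for each summary-number (SN)
# line of a bcftools stats report. Header lines start with "#" and are
# skipped because their first field is not exactly "SN".
stats_summary() {
    awk -F '\t' '$1 == "SN" { print $3 "\t" $4 }' "$1"
}

# Example call (file name is a placeholder):
#   stats_summary input.stats.txt
```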

SLURM batch script example

The following snippet can serve as the body of a SLURM batch script for BCFtools variant processing. It filters, indexes, and summarizes every VCF file in a directory:

# Path to Working Directory

myWorkDir="/path/to/my/working/directory"
cd "$myWorkDir" || exit 1

## -- Place the input VCF files in one folder. For example: $myWorkDir/variants/*.vcf.gz
## -- Use a separate folder for the processed files. For example: $myWorkDir/bcf_processed/

mkdir -p "$myWorkDir/bcf_processed"

for myvcf in "$myWorkDir"/variants/*.vcf.gz
do
    samplename=$(basename "$myvcf" .vcf.gz)

    # Filter variants by quality

    bcftools view -i 'QUAL>30' -Oz -o "$myWorkDir/bcf_processed/${samplename}.filtered.vcf.gz" "$myvcf"

    # Index the filtered VCF

    bcftools index "$myWorkDir/bcf_processed/${samplename}.filtered.vcf.gz"

    # Generate statistics

    bcftools stats "$myWorkDir/bcf_processed/${samplename}.filtered.vcf.gz" > "$myWorkDir/bcf_processed/${samplename}.stats.txt"

done
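
A more defensive variant of the loop above, written as a complete batch script, might look like the following sketch. The `#SBATCH` values, the function name `process_vcfs`, and the convention of passing the input and output directories as script arguments are all assumptions made for illustration. Site-specific options such as `--partition` and `--account` are left out on purpose and should be set to match your allocation.

```shell
#!/bin/bash
# The job name and wall time below are placeholders.
#SBATCH --job-name=bcftools_filter
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# process_vcfs IN_DIR OUT_DIR
# Hypothetical helper: filter, index, and summarize every *.vcf.gz in IN_DIR.
# Stops at the first failing bcftools command.
process_vcfs() {
    local in_dir="$1"
    local out_dir="$2"
    local vcf sample out
    mkdir -p "$out_dir" || return 1
    for vcf in "$in_dir"/*.vcf.gz; do
        # If the glob matched nothing, the literal pattern remains; skip it.
        [ -e "$vcf" ] || continue
        sample=$(basename "$vcf" .vcf.gz)
        out="$out_dir/${sample}.filtered.vcf.gz"
        bcftools view -i 'QUAL>30' -Oz -o "$out" "$vcf" || return 1
        bcftools index "$out" || return 1
        bcftools stats "$out" > "$out_dir/${sample}.stats.txt" || return 1
    done
}

# Run only when both directories are supplied on the command line.
if [ "$#" -eq 2 ]; then
    module load bcftools
    process_vcfs "$1" "$2"
fi
```

Saved under a name of your choice, the script could be submitted as `sbatch script.sh /path/to/variants /path/to/bcf_processed`, where both paths are placeholders.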

References

  1. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of SAMtools and BCFtools. Gigascience. 2021 Feb 16;10(2):giab008. doi: 10.1093/gigascience/giab008. PMID: 33590861; PMCID: PMC7931819.