FastQC

Introduction

FastQC is a quality control tool for massively parallel, high-throughput sequencing data. It reports a series of metrics, such as per-base sequence quality, GC content, and adapter content, that can be used to evaluate overall data quality and to identify problems that should be corrected before any downstream analysis.

Requirements

FastQC is available as a module on Delta and the Illinois Campus Cluster (ICC). Its dependencies are loaded automatically when the module is loaded:

module load fastqc
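
To confirm that the module actually put FastQC on the PATH before submitting a long job, a small check along these lines may help. This is only a sketch: `fastqc_available` is a hypothetical helper name, not something provided by FastQC or the module system.

```shell
# Hypothetical helper (not part of FastQC): succeed only if the
# fastqc command can be found in the current shell environment.
fastqc_available() {
    if command -v fastqc >/dev/null 2>&1; then
        return 0
    fi
    echo "fastqc not found; run 'module load fastqc' first" >&2
    return 1
}
```

Once the check passes, `fastqc --version` prints the version that was loaded.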

Usage

The following SLURM batch script can be used as a template for processing multiple samples with FastQC:

#!/bin/bash
## -- Add the #SBATCH directives required by your cluster (account, partition, time, etc.) here

# Load FastQC

module load fastqc

# Path to working directory

myWorkDir="/path/to/my/working/directory"
cd "$myWorkDir" || exit 1

## -- Reserve a folder for the input FASTQ files (required). For example: $myWorkDir/seqs/*.fastq
## -- Reserve a folder for the reports produced by FastQC. For example: $myWorkDir/fqc_reports

# Run FastQC

mkdir -p "$myWorkDir/fqc_reports"
fastqc -o "$myWorkDir/fqc_reports/" "$myWorkDir"/seqs/*.fastq
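
The template above hands the glob straight to fastqc, so an empty input folder only shows up as a FastQC error. A slightly more defensive variant could look like the sketch below. It assumes the same folder layout as the template. `run_fastqc_batch` is a hypothetical helper name, and the thread count is taken from `SLURM_CPUS_PER_TASK`, which SLURM sets only when the job requests `--cpus-per-task`.

```shell
# Hypothetical helper (not part of FastQC): run FastQC on every *.fastq
# file in one directory and write the reports to another.
run_fastqc_batch() {
    in_dir=$1
    out_dir=$2
    # Use the CPU count granted by SLURM if set, otherwise a single thread.
    threads=${SLURM_CPUS_PER_TASK:-1}

    # Expand the glob into the positional parameters; if nothing matches,
    # the pattern stays literal and the -e test below fails.
    set -- "$in_dir"/*.fastq
    if [ ! -e "$1" ]; then
        echo "No .fastq files found in $in_dir" >&2
        return 1
    fi

    # FastQC expects the output directory to exist already.
    mkdir -p "$out_dir" || return 1

    # -o sets the report directory; -t sets how many files are
    # processed at the same time.
    fastqc -t "$threads" -o "$out_dir" "$@"
}
```

Inside the batch script it would be called as `run_fastqc_batch "$myWorkDir/seqs" "$myWorkDir/fqc_reports"`. FastQC allocates memory per thread, so the job's memory request should grow with the thread count.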

References

  1. Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/