FastQC
Introduction
FastQC is a quality control tool for massively parallel, high-throughput sequencing data. It provides a series of metrics that can be used to evaluate the overall quality of the data and to identify problems that should be corrected before any downstream analysis.
Requirements
FastQC is available as a module on Delta and the Illinois Campus Cluster (ICC). Its dependencies are loaded automatically when the module is loaded:
module load fastqc
Usage
The following commands can be used as a template for the body of a SLURM batch script that processes multiple samples with FastQC:
# Path to working directory
myWorkDir="/path/to/my/working/directory"
# Stop if the working directory cannot be entered
cd "$myWorkDir" || exit 1
## -- Place the input FASTQ files in a dedicated folder (required). For example: $myWorkDir/seqs/*.fastq
## -- Reserve a folder for the reports generated by FastQC. For example: $myWorkDir/fqc_reports
# Run FastQC on all FASTQ files and write the reports to the reports folder
mkdir -p "$myWorkDir/fqc_reports"
fastqc -o "$myWorkDir/fqc_reports/" "$myWorkDir"/seqs/*.fastq
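After the run finishes, it can be useful to confirm that every input file produced a report. The following is a minimal sketch, not part of the template above: the function name `missing_fastqc_reports` is hypothetical, and it assumes FastQC's default output naming, where `sample.fastq` (or `sample.fastq.gz`) yields `sample_fastqc.html` in the output folder. Handling of `.fastq.gz` inputs is also an assumption added for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every input file that has no FastQC HTML report yet.
# Assumes FastQC's default naming: sample.fastq(.gz) -> sample_fastqc.html
missing_fastqc_reports() {
    local seq_dir="$1" report_dir="$2"
    local f base
    for f in "$seq_dir"/*.fastq "$seq_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue      # skip glob patterns that matched nothing
        base="$(basename "$f")"
        base="${base%.gz}"           # drop an optional .gz suffix
        base="${base%.fastq}"        # drop the .fastq suffix
        # Report the input if its HTML report is missing or empty
        if [ ! -s "$report_dir/${base}_fastqc.html" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example usage (paths follow the template above):
#   missing_fastqc_reports "$myWorkDir/seqs" "$myWorkDir/fqc_reports"
```

An empty result suggests that all inputs were processed; any path printed points to a sample whose report should be checked or regenerated.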
References
Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/