# FASTQ — Raw Sequencing Read Format
## Overview
FASTQ is the standard text format for storing raw sequencing reads as delivered by next-generation sequencing platforms. Described by Cock et al. (2010, *Nucleic Acids Research*), it stores each read's base calls together with their Phred-scaled quality scores in four lines: an identifier line beginning with `@`, the base sequence, a separator line beginning with `+`, and a quality string with one ASCII-encoded character per base. FASTQ has no formal governance body, but it is the de facto starting point of most genomics pipelines. In practice, files are almost always gzip-compressed.
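
The four-line record layout and the Phred scaling can be illustrated with a minimal Python sketch. It assumes the common case of exactly four lines per read (no wrapped sequences) and the Sanger-style ASCII offset of 33; older Illumina encodings used a different offset. The names `FastqRecord`, `read_fastq`, `open_fastq`, `phred_scores` and `error_probability` are hypothetical helpers for this note, not part of any library.

```python
import gzip
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str       # identifier line without the leading "@"
    sequence: str   # base calls
    quality: str    # ASCII-encoded quality string, one character per base


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, transparently handling gzip by file extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream (assumed layout)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        header = header.rstrip("\r\n")
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into integer Phred scores (offset 33 assumed)."""
    return [ord(char) - offset for char in quality]


def error_probability(q: int) -> float:
    """Convert a Phred score Q into an error probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Made-up single-read example, held in memory rather than on disk.
example = "@read1\nACGT\n+\nII5!\n"
records = list(read_fastq(io.StringIO(example)))
scores = phred_scores(records[0].quality)  # [40, 40, 20, 0]
```

For a real file, `read_fastq(open_fastq("reads.fastq.gz"))` would stream records without loading the whole file into memory; the file name here is only a placeholder. A Phred score of 20 corresponds to an estimated 1-in-100 chance that the base call is wrong, and a score of 40 to 1 in 10,000.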
## Position in the Genomics Pipeline
FASTQ is the upstream input to alignment, which produces [[SAM-BAM-CRAM]] files. The aligned reads are then processed for variant calling (producing [[VCF]]) or expression quantification (producing count matrices, stored in [[AnnData]] for single-cell data).
## Connections
- Downstream format: [[SAM-BAM-CRAM]] (after alignment)
- Deposited in: [[ENA]] and NCBI SRA (open access), [[EGA]] and [[dbGaP]] (controlled access)
## Resources
- https://doi.org/10.1093/nar/gkp1137 (Cock et al. 2010, Nucleic Acids Research)
- https://www.ebi.ac.uk/ena (ENA)
- https://www.ncbi.nlm.nih.gov/sra (NCBI SRA)