Saturday, August 14, 2010

A normalization technique for next generation sequencing experiments

Next generation sequencing (NGS) is these days one of the key technologies in biology. Its cost effectiveness and its ability to detect even the smallest variations in the genome make it increasingly popular. For studies aiming at genome assembly, differences in read count statistics do not affect the outcome. However, these differences bias the outcome if the goal is to identify structural DNA characteristics such as copy number variations (CNVs). Thus a normalization step must remove such random read count variations so that read counts from different experiments become comparable. In particular, after normalization the commonly used assumption that read counts in windows along the chromosomes follow a Poisson distribution is better justified. Strong deviations of read counts from the Poisson distribution with the estimated mean then indicate CNVs.
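To make the last step concrete, here is a minimal Python sketch of how windows might be flagged once the read counts are normalized. It is not the normalization technique of this post. The simple scaling to a common total, the use of the median as the estimate of the Poisson mean, the two-sided test, and the threshold `alpha` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from statistics import median


def scale_to_total(counts, target_total):
    """Rescale window counts so they sum to target_total.

    Assumption: plain library-size scaling, used here only as a
    stand-in for a real normalization step.
    """
    factor = target_total / sum(counts)
    return [c * factor for c in counts]


def poisson_pmf(k, lam):
    """Poisson probability P(X = k) for mean lam > 0, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_two_sided_p(k, lam):
    """Two-sided tail probability: twice the smaller tail, capped at 1."""
    lower = sum(poisson_pmf(i, lam) for i in range(k + 1))             # P(X <= k)
    upper = max(0.0, 1.0 - sum(poisson_pmf(i, lam) for i in range(k)))  # P(X >= k)
    return min(1.0, 2.0 * min(lower, upper))


def flag_windows(counts, alpha=1e-3):
    """Return indices of windows whose count deviates strongly from Poisson.

    Assumption: the Poisson mean is estimated by the median window count,
    so that a few CNV windows do not distort the estimate.
    """
    lam = median(counts)
    flagged = []
    for idx, count in enumerate(counts):
        if poisson_two_sided_p(int(round(count)), lam) < alpha:
            flagged.append(idx)
    return flagged


# Made-up window counts: window 4 looks like a gain, window 7 like a loss.
window_counts = [100, 98, 103, 101, 200, 99, 97, 50, 102, 100]
print(flag_windows(window_counts))  # [4, 7]
```

The median is used instead of the arithmetic mean because the CNV windows themselves would otherwise pull the estimated mean towards them. With many windows, the threshold would also need a correction for multiple testing, which this sketch leaves out.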

