Saturday, December 15, 2012

Fwd: Fast detection of de novo copy number variants from SNP arrays for case-parent trios

Background:
In studies of case-parent trios, we define copy number variants (CNVs) in the offspring that differ from the parental copy numbers as de novo and of interest for their potential functional role in disease. Among the leading array-based methods for discovery of de novo CNVs in case-parent trios is the joint hidden Markov model (HMM) implemented in the PennCNV software. However, the computational demands of the joint HMM are substantial and the extent to which false positive identifications occur in case-parent trios has not been well described. We evaluate these issues in a study of oral cleft case-parent trios.
Our analysis of the oral cleft trios reveals that genomic waves represent a substantial source of false positive identifications in the joint HMM, despite a wave-correction implementation in PennCNV. In addition, the noise of low-level summaries of relative copy number (log R ratios) is strongly associated with batch and correlated with the frequency of de novo CNV calls. Exploiting the trio design, we propose a univariate statistic for relative copy number referred to as the minimum distance that can reduce technical variation from probe effects and genomic waves. We use circular binary segmentation to segment the minimum distance and maximum a posteriori estimation to infer de novo CNVs from the segmented genome. Compared to PennCNV on simulated data, MinimumDistance identifies fewer false positives on average and is comparable to PennCNV with respect to false negatives. Genomic waves contribute to discordance of PennCNV and MinimumDistance for high coverage de novo calls, while highly concordant calls on chromosome 22 were validated by quantitative PCR. Computationally, MinimumDistance provides a nearly 8-fold increase in speed relative to the joint HMM in a study of oral cleft trios.
Our results indicate that batch effects and genomic waves are important considerations for case-parent studies of de novo CNV, and that the minimum distance is an effective statistic for reducing technical variation contributing to false de novo discoveries. Coupled with segmentation and maximum a posteriori estimation, our algorithm compares favorably to the joint HMM, with MinimumDistance being much faster.
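
The abstract does not spell out how the minimum distance is computed, so the following is only a rough sketch under one plausible reading: for each marker, take the offspring-minus-parent difference in log R ratio for each parent and keep the signed difference with the smaller absolute value. The function name, the sign convention, and the toy values are all assumptions made for illustration and are not taken from the MinimumDistance software.

```python
import numpy as np


def minimum_distance(offspring, father, mother):
    """Per-marker signed offspring-parent log R ratio difference of smaller magnitude.

    Assumed definition (not confirmed by the abstract): compare the offspring
    to each parent and keep whichever difference is closer to zero.
    """
    offspring = np.asarray(offspring, dtype=float)
    d_father = offspring - np.asarray(father, dtype=float)
    d_mother = offspring - np.asarray(mother, dtype=float)
    # Keep the father difference when it is at least as close to zero,
    # otherwise keep the mother difference.
    return np.where(np.abs(d_father) <= np.abs(d_mother), d_father, d_mother)


# Hypothetical log R ratios at four markers (made-up numbers).
offspring = [0.0, -0.6, -0.5, 0.1]
father = [0.05, 0.0, -0.55, 0.0]
mother = [-0.1, 0.1, 0.0, 0.3]

d = minimum_distance(offspring, father, mother)
print(d)
```

In this toy input, the second marker mimics a de novo deletion (offspring low, both parents near zero) and keeps a large negative value, while the third mimics an inherited deletion (father equally low) and collapses toward zero. Signal shared by offspring and a parent, such as a probe effect or a wave, would cancel in the same way, which is the intuition the abstract gives for the statistic. Segmentation and maximum a posteriori calling are not shown here.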

(Original Post: BMC Bioinformatics - Latest articles.)