Saturday, July 10, 2010

CNVineta: A data mining tool for large case-control copy number variation data sets.

Motivation: Copy number variation (CNV), a major contributor to human genetic variation, comprises genomic deletions and insertions of 1 kb or longer. Yet the identification of CNVs from microarray data is still hampered by high false-negative and false-positive prediction rates, owing to the noisy nature of the raw data. Here, we present CNVineta, an R package for rapid data mining and visualization of CNVs in large case-control data sets genotyped with single-nucleotide polymorphism oligonucleotide arrays. CNVineta is compatible with various established CNV prediction algorithms, can be used for genome-wide association analysis of rare and common CNVs, and enables rapid, serial display of log2 ratios of the raw data as well as B-allele frequencies for visual quality inspection. In summary, CNVineta aids in the interpretation of large-scale CNV data sets and in the prioritization of target regions for follow-up experiments.

Availability and Implementation: CNVineta is available as an R package and can be downloaded from; the package contains a tutorial outlining a typical workflow. The CNVineta-compatible HapMap (International HapMap Consortium, 2003) data set can also be downloaded from the link above.

(The content of this post was reproduced from:, via Bioinformatics - Advance Access.)