Tuesday, April 20, 2010

CNAnova: a new approach for finding recurrent copy number abnormalities in cancer SNP microarray data

Motivation: The current generation of SNP arrays allows measurement of copy number aberrations (CNAs) in cancer at more than one million locations in the genome in hundreds of tumour samples. Most research has focused on single-sample CNA discovery, the so-called segmentation problem. The availability of high-density, large sample-size SNP array datasets makes the identification of recurrent copy number changes in cancer an important issue that can be addressed using cross-sample information.

Results: We present a novel approach, called CNAnova, for finding regions of recurrent copy number aberrations from Affymetrix SNP 6.0 array data. The method derives its statistical properties from a control dataset composed of normal samples and, in contrast to previous methods, does not require segmentation and permutation steps. For rigorous testing of the algorithm and comparison to existing methods, we developed a simulation scheme that uses the noise distribution present in Affymetrix arrays. Application of the method to 128 Acute Lymphoblastic Leukemia samples shows that CNAnova achieves a lower error rate than a popular alternative approach. We also describe an extension of the CNAnova framework to identify recurrent CNA regions with intra-tumour heterogeneity, present in either primary or relapsed samples from the same patients.
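To make the idea of calibrating against normal controls more concrete, here is a toy sketch in Python. It is not the CNAnova algorithm, which the abstract does not spell out. It only illustrates how a per-marker test can take its null distribution from normal samples, so that no segmentation or permutation step is needed. The function name `recurrent_markers`, the `z_cut` threshold, the Gaussian noise assumption and the example values are all hypothetical.

```python
from statistics import mean, stdev


def recurrent_markers(tumours, normals, z_cut=3.0):
    """Flag markers whose mean tumour log2 ratio is extreme relative to normals.

    tumours, normals: lists of samples, each a list of per-marker log2 ratios.
    Hypothetical helper: assumes Gaussian noise estimated from the normal samples.
    """
    n_markers = len(normals[0])
    n_tumours = len(tumours)
    flagged = []
    for j in range(n_markers):
        control = [sample[j] for sample in normals]
        mu, sd = mean(control), stdev(control)
        if sd == 0:
            continue  # no noise estimate available for this marker
        tumour_mean = mean(sample[j] for sample in tumours)
        # Standard error of a mean over n_tumours samples under the control noise model
        z = (tumour_mean - mu) / (sd / n_tumours ** 0.5)
        if abs(z) >= z_cut:
            flagged.append((j, z))
    return flagged


# Made-up example: 4 normal samples and 3 tumour samples over 3 markers.
normals = [
    [0.1, -0.1, 0.0],
    [-0.1, 0.1, 0.1],
    [0.0, 0.0, -0.1],
    [0.0, 0.0, 0.0],
]
tumours = [
    [0.0, -1.0, 0.0],
    [0.1, -0.9, 0.0],
    [-0.1, -1.1, 0.0],
]
flagged = recurrent_markers(tumours, normals)
```

In this made-up example only the middle marker, where every tumour sample shows a loss, stands out against the control noise. Real SNP array noise is not Gaussian, which is why the abstract stresses using the noise distribution actually present in Affymetrix arrays.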

Availability: The CNAnova package and synthetic datasets are available at http://www.compbio.group.cam.ac.uk/software.html.

(This post was reproduced from http://bioinformatics.oxfordjournals.org/cgi/content/short/btq145v1?rss=1, via Bioinformatics - Advance Access.)