Saturday, May 22, 2010

CNAnova: a new approach for finding recurrent copy number abnormalities in cancer SNP microarray data

Motivation: The current generation of single nucleotide polymorphism (SNP) arrays allows measurement of copy number aberrations (CNAs) in cancer at more than one million locations in the genome in hundreds of tumour samples. Most research has focused on single-sample CNA discovery, the so-called segmentation problem. The availability of high-density SNP array datasets with large sample sizes makes the identification of recurrent copy number changes in cancer an important issue that can be addressed using cross-sample information.


Results: We present a novel approach, called CNAnova, for finding regions of recurrent copy number aberrations from Affymetrix SNP 6.0 array data. The method derives its statistical properties from a control dataset composed of normal samples and, in contrast to previous methods, does not require segmentation and permutation steps. For rigorous testing of the algorithm and comparison to existing methods, we developed a simulation scheme that uses the noise distribution present in Affymetrix arrays. Application of the method to 128 acute lymphoblastic leukaemia samples shows that CNAnova achieves a lower error rate than a popular alternative approach. We also describe an extension of the CNAnova framework to identify recurrent CNA regions with intra-tumour heterogeneity, present in either primary or relapsed samples from the same patients.
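The abstract does not spell out the statistics, so the following is only a rough illustration of the general idea of calibrating a cross-sample test against normal controls instead of permutations. It is not the CNAnova algorithm. The function names, the per-probe z-score, the threshold and the synthetic data are all assumptions made for this sketch.

```python
import numpy as np


def probe_z_scores(tumour, normal):
    """Per-probe z-score of the mean tumour log2 ratio against a null
    estimated from normal samples (hypothetical helper, not CNAnova).

    tumour: array of shape (n_tumour, n_probes)
    normal: array of shape (n_normal, n_probes)
    """
    tumour = np.asarray(tumour, dtype=float)
    normal = np.asarray(normal, dtype=float)
    null_mean = normal.mean(axis=0)
    null_sd = normal.std(axis=0, ddof=1)
    # Standard error of a mean over n_tumour samples under the null.
    se = null_sd / np.sqrt(tumour.shape[0])
    return (tumour.mean(axis=0) - null_mean) / se


def flag_recurrent_probes(tumour, normal, threshold=6.0):
    """Boolean mask of probes whose |z| exceeds an assumed threshold."""
    return np.abs(probe_z_scores(tumour, normal)) > threshold


# Synthetic example: 100 probes, with a deletion at probes 40-59
# carried by 20 of 30 tumour samples (all values are made up).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 0.2, size=(50, 100))
tumour = rng.normal(0.0, 0.2, size=(30, 100))
tumour[:20, 40:60] -= 0.8

flags = flag_recurrent_probes(tumour, normal)
print("flagged probes:", np.flatnonzero(flags))
```

No segmentation of individual samples is needed here because the test pools information across samples at each probe. A real method would also have to handle correlation between neighbouring probes and multiple testing, which this sketch ignores.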


Availability: The CNAnova package and synthetic datasets are available at http://www.compbio.group.cam.ac.uk/software.html


Contact: sergii.ivakhno@cancer.org.uk


Supplementary information: Supplementary data are available at Bioinformatics online.


(This post's content was reproduced from: http://bioinformatics.oxfordjournals.org/cgi/content/short/26/11/1395?rss=1, via Bioinformatics - current issue.)