Friday, February 7, 2014

Fwd: Integrative Gene Set Analysis of Multi-platform Data with Sample Heterogeneity



Motivation: Gene set analysis is a popular method for large-scale genomic studies. Because genes that have common biological features are analyzed jointly, gene set analysis often achieves better power and generates more biologically informative results. With the advancement of technologies, genomic studies with multi-platform data have become increasingly common. Several strategies have been proposed that integrate genomic data from multiple platforms to perform gene set analysis. To evaluate the performance of existing integrative gene set methods under various scenarios, we conduct a comparative simulation analysis based on the TCGA breast cancer data set.

Results: We find that existing methods for gene set analysis are less effective when sample heterogeneity exists. To address this issue, we develop three methods for multi-platform genomic data with heterogeneity: two non-parametric methods, MPMWS (Multi-Platform Mann-Whitney Statistics) and MPORT (Multi-Platform Outlier Robust T-statistics), and a parametric method, MPLRS (Multi-Platform Likelihood Ratio Statistics). Using simulations, we show that the proposed MPMWS method has higher power for heterogeneous samples and comparable performance for homogeneous samples when compared to existing methods. Our real data applications to two TCGA datasets also suggest that the proposed methods are able to identify novel pathways that are missed by other strategies.
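
To make the idea of a rank-based, multi-platform gene set score more concrete, here is a minimal sketch in Python. It is not the paper's MPMWS method, whose exact definition the abstract does not give. It only computes an ordinary Mann-Whitney U statistic per gene and platform, converts it to a z-score with the usual normal approximation (no tie correction), and sums the squared z-scores over a gene set. The function names, the data layout (one dict per platform, mapping gene to sample to value), and the sum-of-squares combination rule are all assumptions made for illustration.

```python
from math import sqrt


def mann_whitney_u(x, y):
    """Mann-Whitney U for x versus y: number of pairs with x_i > y_j, ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def u_to_z(u, n1, n2):
    """Standardize U with the null mean and variance (normal approximation, no tie correction)."""
    mean = n1 * n2 / 2.0
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean) / sd


def gene_set_score(platforms, cases, controls, gene_set):
    """Hypothetical combination rule: sum of squared z-scores over platforms and genes.

    platforms: list of dicts, each mapping gene -> {sample_id: value}.
    Genes missing from a platform are skipped.
    """
    score = 0.0
    for data in platforms:
        for gene in gene_set:
            if gene not in data:
                continue
            x = [data[gene][s] for s in cases]
            y = [data[gene][s] for s in controls]
            z = u_to_z(mann_whitney_u(x, y), len(x), len(y))
            score += z * z
    return score
```

In practice a score like this would be compared against a permutation null obtained by shuffling the case and control labels. That step is left out here to keep the sketch short.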



(Via Bioinformatics - Advance Access.)