Sunday, May 2, 2010

Differential expression analysis for sequence count data

Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq.
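To make the error model in the abstract a little more concrete, here is a minimal Python sketch of a negative binomial distribution parametrized directly by its mean and variance. It is not the DESeq implementation, which is an R package. The function names (`nb_params_from_mean_var`, `nb_pmf`, `nb_moments`) and the example values are hypothetical, and the variance is simply passed in by hand, whereas the abstract describes estimating it from the mean by local regression.

```python
import math


def nb_params_from_mean_var(mean, var):
    """Convert a (mean, variance) pair into negative binomial (r, p).

    Uses the standard parametrization in which the count has
    mean r * (1 - p) / p and variance r * (1 - p) / p**2.
    """
    if mean <= 0 or var <= mean:
        # The negative binomial is overdispersed: variance must exceed the mean.
        raise ValueError("requires mean > 0 and variance > mean")
    r = mean * mean / (var - mean)
    p = mean / var
    return r, p


def nb_pmf(k, mean, var):
    """Probability of observing count k, computed in log space for stability."""
    r, p = nb_params_from_mean_var(mean, var)
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def nb_moments(mean, var, k_max=5000):
    """Numerically recover total probability, mean and variance up to k_max."""
    probs = [nb_pmf(k, mean, var) for k in range(k_max)]
    total = sum(probs)
    m = sum(k * pk for k, pk in enumerate(probs))
    v = sum((k - m) ** 2 * pk for k, pk in enumerate(probs))
    return total, m, v


if __name__ == "__main__":
    # Hypothetical example: a gene with mean count 100 and variance 600.
    print(nb_params_from_mean_var(100.0, 600.0))
    print(nb_moments(100.0, 600.0))
```

The sketch only shows that a mean and a variance are enough to pin down the distribution; how the variance is estimated for each mean is the part the paper addresses.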

(This post's content was reproduced from http://precedings.nature.com/documents/4282/version/2, via Browsing Bioinformatics: Nature Precedings.)