Sunday, September 25, 2011

coMOTIF: a mixture framework for identifying transcription factor and a coregulator motif in ChIP-seq Data

Motivation: ChIP-seq data are enriched in binding sites for the protein immunoprecipitated. Some sequences may also contain binding sites for a coregulator. Biologists are interested in knowing which coregulatory factor motifs may be present in the sequences bound by the protein ChIP'ed.

Results: We present a finite mixture framework with an expectation–maximization algorithm that considers two motifs jointly and simultaneously determines which sequences contain both motifs, only one of them, or neither. Tested on 10 simulated ChIP-seq datasets, our method performed better than repeated application of MEME in predicting sequences containing both motifs. When applied to a mouse liver Foxa2 ChIP-seq dataset involving ~12 000 400-bp sequences, coMOTIF identified co-occurrence of Foxa2 with Hnf4a, Cebpa, E-box, Ap1/Maf or Sp1 motifs in ~6–33% of these sequences. These motifs either correspond to known liver-specific transcription factors or have an important role in liver function.
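
To make the mixture idea concrete, here is a minimal sketch of one EM iteration over the four sequence classes the abstract describes (both motifs, the ChIP'ed factor's motif only, the coregulator motif only, neither). It is not the coMOTIF implementation: the function names, the component labels, and the toy log-likelihood values are all assumptions made for illustration, and the per-sequence log-likelihoods are taken as given rather than computed from motif models. Only the mixing proportions are updated here; a full method would also re-estimate the motif parameters.

```python
import math

# Hypothetical labels for the four classes a sequence can belong to.
COMPONENTS = ("both", "tf_only", "coreg_only", "neither")


def e_step(log_liks, weights):
    """Return, per sequence, the posterior probability of each component.

    log_liks: list of dicts mapping component -> log P(sequence | component)
    weights:  dict mapping component -> mixing proportion (must be > 0)
    """
    posteriors = []
    for row in log_liks:
        scores = [math.log(weights[c]) + row[c] for c in COMPONENTS]
        top = max(scores)  # subtract the max before exponentiating, for stability
        unnorm = [math.exp(s - top) for s in scores]
        total = sum(unnorm)
        posteriors.append({c: u / total for c, u in zip(COMPONENTS, unnorm)})
    return posteriors


def m_step_weights(posteriors):
    """Update mixing proportions as the average posterior per component."""
    n = len(posteriors)
    return {c: sum(p[c] for p in posteriors) / n for c in COMPONENTS}


# Toy, made-up log-likelihoods for three sequences (illustration only).
toy_log_liks = [
    {"both": -10.0, "tf_only": -14.0, "coreg_only": -15.0, "neither": -20.0},
    {"both": -18.0, "tf_only": -12.0, "coreg_only": -19.0, "neither": -16.0},
    {"both": -25.0, "tf_only": -24.0, "coreg_only": -23.0, "neither": -15.0},
]

weights = {c: 0.25 for c in COMPONENTS}      # start from uniform proportions
posteriors = e_step(toy_log_liks, weights)   # soft class assignment
weights = m_step_weights(posteriors)         # re-estimated proportions
```

After convergence, the posterior for each sequence gives a soft assignment to the four classes, which is what lets a mixture approach of this kind report the fraction of sequences carrying both motifs.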

Availability: Freely available at http://www.niehs.nih.gov/research/resources/software/comotif/.

Contact: li3@niehs.nih.gov

Supplementary Information: Supplementary data are available at Bioinformatics online.


(Via Bioinformatics - current issue.)