A Bayesian extension of the hypergeometric test for functional enrichment analysis

Jing Cao, Song Zhang

Research output: Contribution to journal › Article › peer-review

48 Scopus citations


Functional enrichment analysis is conducted on high-throughput data to provide a functional interpretation for a list of genes or proteins that share a common property, such as being differentially expressed (DE). The hypergeometric P-value has been widely used to investigate whether genes from pre-defined functional terms, for example Gene Ontology (GO) terms, are enriched among the DE genes. The hypergeometric P-value has three limitations: (1) it is computed independently for each term, thus neglecting biological dependence among terms; (2) it is subject to a size constraint that leads to a tendency to select less-specific terms; (3) it uses information repeatedly, because annotations overlap under the true-path rule. We propose a Bayesian approach based on the non-central hypergeometric model. The GO dependence structure is incorporated through a prior on the non-centrality parameters. The likelihood function does not include overlapping information. Inference about enrichment is based on posterior probabilities, which are not subject to a size constraint. This method can detect moderate but consistent enrichment signals and identify sets of closely related, biologically meaningful functional terms rather than isolated terms. We also describe the basic assumptions and implementation of the different methods to provide some theoretical insight, and demonstrate them in a simulation study. An application to real data is presented.
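The standard hypergeometric P-value and its non-central counterpart can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy counts, and the choice of Fisher's variant of the non-central hypergeometric distribution are assumptions made here for illustration, since the abstract does not say which variant or parameterization is used. The sketch shows only the single-term likelihood, not the prior on non-centrality parameters or the posterior inference.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """Upper-tail P(X >= k) under the central hypergeometric model.

    N: genes in the background, K: genes annotated to the term,
    n: DE genes, k: DE genes annotated to the term.
    """
    lo = max(0, n - (N - K))          # smallest feasible overlap
    hi = min(K, n)                    # largest feasible overlap
    start = max(k, lo)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(start, hi + 1))
    return tail / comb(N, n)

def fisher_noncentral_pmf(N, K, n, omega):
    """PMF of Fisher's non-central hypergeometric distribution (assumed variant).

    omega is the odds ratio; omega = 1 recovers the central hypergeometric.
    Uses plain floats, so it is only suitable for small illustrative counts.
    """
    lo = max(0, n - (N - K))
    hi = min(K, n)
    weights = {x: comb(K, x) * comb(N - K, n - x) * omega ** x
               for x in range(lo, hi + 1)}
    z = sum(weights.values())         # normalizing constant
    return {x: w / z for x, w in weights.items()}

# Hypothetical toy counts: 20 genes, 5 in the term, 6 DE, 3 DE in the term.
p_central = hypergeom_pvalue(20, 5, 6, 3)
pmf_null = fisher_noncentral_pmf(20, 5, 6, omega=1.0)
pmf_enriched = fisher_noncentral_pmf(20, 5, 6, omega=3.0)
tail_null = sum(p for x, p in pmf_null.items() if x >= 3)
tail_enriched = sum(p for x, p in pmf_enriched.items() if x >= 3)
```

With `omega = 1` the non-central tail probability equals the usual hypergeometric P-value, and an odds ratio above 1 shifts probability mass toward larger overlaps, which is the sense in which the non-centrality parameter measures enrichment.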

Original language: English (US)
Pages (from-to): 84-94
Number of pages: 11
Issue number: 1
State: Published - Mar 2014


  • Functional enrichment analysis
  • Gene ontology
  • Hypergeometric P-value
  • Modular enrichment analysis
  • Non-central hypergeometric distribution

ASJC Scopus subject areas

  • Statistics and Probability
  • General Biochemistry, Genetics and Molecular Biology
  • General Immunology and Microbiology
  • General Agricultural and Biological Sciences
  • Applied Mathematics

