Incorporating prior information via shrinkage: A combined analysis of genome-wide location data and gene expression data

Yang Xie, Wei Pan, Kyeong S. Jeong, Arkady Khodursky

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Transcriptional control is a critical step in the regulation of gene expression. Understanding such control at the genomic level involves deciphering the mechanisms and structures of regulatory programmes and networks. A difficulty arises from the weak signal and high noise in the various sources of data, while most current approaches are limited to analysing a single source of data. A natural alternative is to improve statistical efficiency and power through a combined analysis of multiple sources of data. Here we propose a shrinkage method that combines genome-wide location data and gene expression data to detect the binding sites or target genes of a transcription factor. Specifically, a prior 'non-target' gene list is generated by analysing the expression data, and this information is then incorporated into the subsequent binding data analysis via a shrinkage method. There is a Bayesian justification for this shrinkage method. Both simulated and real data were used to evaluate the proposed method and to compare it with analysing binding data alone. In simulation studies, the proposed method gives higher sensitivity and a lower false discovery rate (FDR) in detecting the target genes. In a real data example, the proposed method can reduce the estimated FDR and increase the power to detect the previously known target genes of a broad transcription regulator, leucine-responsive regulatory protein (Lrp), in Escherichia coli. The method can also be used to incorporate other information, such as gene ontology (GO), into microarray data analysis to detect differentially expressed genes.
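
The two-step idea in the abstract (build a prior 'non-target' list from expression data, then shrink the binding statistics of the genes on that list) can be illustrated with a minimal sketch. The abstract does not state the form of the shrinkage, so the soft-thresholding rule below is an assumption chosen for illustration, and the function names, the shrinkage amount, the cutoff, and the toy values are all hypothetical rather than taken from the paper.

```python
def soft_shrink(stat, amount):
    """Pull a test statistic toward zero by `amount`, stopping at zero.

    Assumption: soft-thresholding stands in for the paper's shrinkage rule.
    """
    magnitude = max(abs(stat) - amount, 0.0)
    return magnitude if stat >= 0 else -magnitude


def shrink_binding_stats(binding_stats, prior_non_targets, amount):
    """Shrink only genes on the prior non-target list; leave the rest unchanged."""
    return {
        gene: soft_shrink(stat, amount) if gene in prior_non_targets else stat
        for gene, stat in binding_stats.items()
    }


def call_targets(stats, cutoff):
    """Declare a gene a target when its statistic exceeds the cutoff in magnitude."""
    return sorted(gene for gene, stat in stats.items() if abs(stat) > cutoff)


# Hypothetical toy values, not data from the paper.
binding_stats = {"geneA": 3.1, "geneB": 2.2, "geneC": -0.4, "geneD": 2.4}
# Hypothetical prior list, e.g. genes whose expression did not respond.
prior_non_targets = {"geneB", "geneC"}

shrunken = shrink_binding_stats(binding_stats, prior_non_targets, amount=1.0)
targets = call_targets(shrunken, cutoff=2.0)
print(targets)
```

In this toy case geneB passes the cutoff on binding data alone but is no longer called once its statistic is shrunk, which is how a prior non-target list could lower false discoveries. The sketch does not cover how the shrinkage amount is chosen or how the FDR is estimated.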

Original language: English (US)
Pages (from-to): 2258-2275
Number of pages: 18
Journal: Statistics in Medicine
Issue number: 10
State: Published - May 10 2007


Keywords

  • Integrative analysis
  • Microarray
  • Permutation
  • Statistical power

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

