Comparison of different normalization assumptions for analyses of DNA methylation data from the cancer genome

Dong Wang, Yuannv Zhang, Yan Huang, Pengfei Li, Mingyue Wang, Ruihong Wu, Lixin Cheng, Wenjing Zhang, Yujing Zhang, Bin Li, Chenguang Wang, Zheng Guo

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


DNA methylation array data are often normalized to remove technical artifacts introduced by experimental differences in sample preparation, array processing and other factors. However, some researchers analyze DNA methylation arrays without normalization, on the grounds that current normalization methods may distort real differences between normal and cancer samples: cancer genomes may be extensively hypomethylated, and the total amount of CpG methylation may differ substantially among samples. In this study, using eight datasets generated with the Infinium HumanMethylation27 assay, we systematically analyzed the global distribution of DNA methylation changes in cancer compared with normal controls and its effect on data normalization for selecting differentially methylated (DM) genes. We showed that more DM genes could be found in the Quantile/Lowess-normalized data than in the non-normalized data. The DM genes additionally selected in the Quantile/Lowess-normalized data showed significantly consistent methylation states in another independent dataset for the same cancer, indicating that these extra DM genes were effective biological signals related to the disease. These results suggest that normalization can increase the power of detecting DM genes in the context of diagnostic markers, which are usually characterized by relatively large effect sizes. In addition, we evaluated the reproducibility of DM discoveries for a particular cancer type and found that most of the DM genes additionally detected in one dataset showed the same methylation directions in the other dataset for the same cancer type, indicating that these DM genes were effective biological signals in the other dataset. Furthermore, we showed that some DM genes detected in different studies of a particular cancer type were significantly reproducible at the functional level.
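To make the normalization concern concrete, the following is a minimal sketch of generic quantile normalization on a probes-by-samples matrix of beta values. It is not the authors' pipeline: the function name `quantile_normalize`, the synthetic data, and the 0.1 global shift standing in for hypomethylation are all assumptions made for illustration, and Lowess normalization is not shown.

```python
import numpy as np

def quantile_normalize(beta):
    """Force every sample (column) to share one empirical distribution.

    beta: 2-D array, rows = CpG probes, columns = samples.
    Ties are broken by order of appearance (a simplification).
    """
    beta = np.asarray(beta, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(beta, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(beta, order, axis=0)
    # Reference distribution: mean of each rank across all samples.
    reference = sorted_vals.mean(axis=1)
    normalized = np.empty_like(beta)
    # The probe holding rank r in a sample receives the r-th reference value.
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized

# Synthetic data (assumption): 3 "normal" and 3 "cancer" samples, where the
# cancer samples carry a hypothetical global loss of methylation of 0.1.
rng = np.random.default_rng(0)
normal = rng.uniform(0.2, 0.9, size=(1000, 3))
cancer = rng.uniform(0.2, 0.9, size=(1000, 3)) - 0.1
beta = np.hstack([normal, cancer])

norm = quantile_normalize(beta)

# Difference in mean methylation between groups, before and after.
gap_before = beta[:, :3].mean() - beta[:, 3:].mean()
gap_after = norm[:, :3].mean() - norm[:, 3:].mean()
print(f"group mean gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

In this toy setting the global shift between groups disappears after normalization, because every column is mapped onto the same reference distribution. That is the assumption critics of normalization object to; the abstract's finding is that, in the datasets studied, normalization nonetheless increased the power to detect DM genes.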

Original language: English (US)
Pages (from-to): 36-42
Number of pages: 7
Issue number: 1
State: Published - Sep 10 2012
Externally published: Yes


Keywords

  • Cancer genome
  • DNA methylation
  • Data normalization
  • Differential methylation
  • Microarray
  • Reproducibility

ASJC Scopus subject areas

  • Genetics

