A highly efficient and effective motif discovery method for ChIP-seq/ChIP-chip data using positional information

Xiaotu Ma; Ashwinikumar Kulkarni; Zhihua Zhang; Zhenyu Xuan; Robert Serfling; Michael Q. Zhang

doi:10.1093/nar/gkr1135

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding the transcriptional regulatory network. However, most established methods are designed for small sample sizes and are inefficient for ChIP data. Here we propose a new k-mer occurrence model to reflect the fact that functional DNA k-mers often cluster around ChIP peak summits. With this model, we introduced a new measure to discover functional k-mers. Using simulation, we demonstrated that our method is more robust against noise in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). Our method was applied to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.

Original language	English (US)
Pages (from-to)	e50
Journal	Nucleic Acids Research
Volume	40
Issue number	7
DOIs	https://doi.org/10.1093/nar/gkr1135
State	Published - Apr 2012
Externally published	Yes

ASJC Scopus subject areas

Genetics

Cite this

@article{af8eaedfa1f64b09a8cd0ad30dfad416,

title = "A highly efficient and effective motif discovery method for ChIP-seq/ChIP-chip data using positional information",

abstract = "Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding the transcriptional regulatory network. However, most established methods are designed for small sample sizes and are inefficient for ChIP data. Here we propose a new k-mer occurrence model to reflect the fact that functional DNA k-mers often cluster around ChIP peak summits. With this model, we introduced a new measure to discover functional k-mers. Using simulation, we demonstrated that our method is more robust against noises in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). Our method was applied to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.",

author = "Xiaotu Ma and Ashwinikumar Kulkarni and Zhihua Zhang and Zhenyu Xuan and Robert Serfling and Zhang, {Michael Q.}",

note = "Funding Information: National Institute of Health (HG001696 to M.Q.Z.); National Basic Research Program of China (2012CB316503 to M.Q.Z.); National Natural Science Foundation of China (91019016, 31061160497 to M.Q.Z.); National Science Foundation (DMS-1106091 to R.S.) and UTD Startup Fund (to Z.X.). Funding for open access charge: NIH.",

year = "2012",

month = apr,

doi = "10.1093/nar/gkr1135",

language = "English (US)",

volume = "40",

pages = "e50",

journal = "Nucleic Acids Research",

issn = "0305-1048",

publisher = "Oxford University Press",

number = "7",

}

TY - JOUR

T1 - A highly efficient and effective motif discovery method for ChIP-seq/ChIP-chip data using positional information

AU - Ma, Xiaotu

AU - Kulkarni, Ashwinikumar

AU - Zhang, Zhihua

AU - Xuan, Zhenyu

AU - Serfling, Robert

AU - Zhang, Michael Q.

N1 - Funding Information: National Institute of Health (HG001696 to M.Q.Z.); National Basic Research Program of China (2012CB316503 to M.Q.Z.); National Natural Science Foundation of China (91019016, 31061160497 to M.Q.Z.); National Science Foundation (DMS-1106091 to R.S.) and UTD Startup Fund (to Z.X.). Funding for open access charge: NIH.

PY - 2012/4

Y1 - 2012/4

AB - Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding the transcriptional regulatory network. However, most established methods are designed for small sample sizes and are inefficient for ChIP data. Here we propose a new k-mer occurrence model to reflect the fact that functional DNA k-mers often cluster around ChIP peak summits. With this model, we introduced a new measure to discover functional k-mers. Using simulation, we demonstrated that our method is more robust against noises in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). Our method was applied to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.

UR - http://www.scopus.com/inward/record.url?scp=84860115916&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84860115916&partnerID=8YFLogxK

U2 - 10.1093/nar/gkr1135

DO - 10.1093/nar/gkr1135

M3 - Article

C2 - 22228832

AN - SCOPUS:84860115916

SN - 0305-1048

VL - 40

SP - e50

JO - Nucleic Acids Research

JF - Nucleic Acids Research

IS - 7

ER -
