Deep supervised and convolutional generative stochastic network for protein secondary structure prediction

Jian Zhou, Olga G. Troyanskaya

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Scopus citations

Abstract

Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations. GSN is a recently proposed deep learning technique (Bengio & Thibodeau-Laufer, 2013) for globally training deep generative models. We present the supervised extension of GSN, which learns a Markov chain to sample from a conditional distribution, and apply it to protein structure prediction. To scale the model to full-sized, high-dimensional data, such as protein sequences with hundreds of amino acids, we introduce a convolutional architecture, which allows efficient learning across multiple layers of hierarchical representations. Our architecture uniquely focuses on predicting structured low-level labels informed by both the low-level and high-level representations learned by the model. In our application this corresponds to labeling the secondary structure state of each amino-acid residue. We trained and tested the model on separate sets of non-homologous proteins sharing less than 30% sequence identity. Our model achieves 66.4% Q8 accuracy on the CB513 dataset, better than the previously reported best performance of 64.9% (Wang et al., 2011) for this challenging secondary structure prediction problem.
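For readers unfamiliar with the metric quoted above, Q8 accuracy is the fraction of residues whose eight-state secondary structure label is predicted correctly. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name `q8_accuracy`, the single-letter state alphabet (with `L` standing for loop/irregular), the toy sequences, and the choice to pool residues across proteins rather than average per protein are all assumptions made for illustration.

```python
from typing import Sequence

# Eight-state alphabet in the style of DSSP; the letter "L" for
# loop/irregular is an assumed convention for this sketch.
EIGHT_STATES = set("HGIEBTSL")


def q8_accuracy(true_labels: Sequence[str], predicted_labels: Sequence[str]) -> float:
    """Return the fraction of residues with a correctly predicted 8-state label.

    Each element of the inputs is one protein's label string, one character
    per residue. Residues are pooled across all proteins (an assumption).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must contain the same number of proteins")

    correct = 0
    total = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if len(truth) != len(pred):
            raise ValueError("label strings for a protein must have equal length")
        for t, p in zip(truth, pred):
            if t not in EIGHT_STATES or p not in EIGHT_STATES:
                raise ValueError(f"unknown state label: {t!r} or {p!r}")
            correct += int(t == p)
            total += 1

    if total == 0:
        raise ValueError("no residues to score")
    return correct / total


# Toy example with two made-up proteins of eight residues each.
truth = ["HHHHLLEE", "LLEEEELL"]
pred = ["HHHLLLEE", "LLEEELLL"]
print(q8_accuracy(truth, pred))  # 14 of 16 residues match -> 0.875
```

Under this definition, the reported 66.4% means that roughly two out of three residues in CB513 received the correct eight-state label.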

Original language: English (US)
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society (IMLS)
Pages: 1121-1129
Number of pages: 9
ISBN (Electronic): 9781634393973
State: Published - 2014
Externally published: Yes
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: Jun 21 2014 to Jun 26 2014

Publication series

Name: 31st International Conference on Machine Learning, ICML 2014
Volume: 2

Conference

Conference: 31st International Conference on Machine Learning, ICML 2014
Country/Territory: China
City: Beijing
Period: 6/21/14 to 6/26/14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
