TY - JOUR
T1 - Compact Structure Patterns in Proteins
AU - Chitturi, Bhadrachalam
AU - Shi, Shuoyong
AU - Kinch, Lisa N.
AU - Grishin, Nick V.
N1 - Funding Information:
This work was supported in part by the National Institutes of Health (GM094575 to N.V.G.) and the Welch Foundation (I-1505 to N.V.G.). Daniel Parente contributed to the coding of the initial model. Hua Cheng scrutinized some of the motif hits. Raquel Bromberg assisted with the final reading.
Publisher Copyright:
© 2016
PY - 2016/10/23
Y1 - 2016/10/23
N2 - Globular proteins typically fold into tightly packed arrays of regular secondary structures. We developed a model to approximate the compact parallel and antiparallel arrangement of α-helices and β-strands, enumerated all possible topologies formed by up to five secondary structural elements (SSEs), searched for their occurrence in spatial structures of proteins, and documented their frequencies of occurrence in the PDB. The enumeration model grows larger super-secondary structure patterns (SSPs) by combining pairs of smaller patterns, a process that approximates a potential path of protein fold evolution. The most prevalent SSPs are typically present in superfolds such as the Rossmann-like fold, the ferredoxin-like fold, and the Greek key motif, whereas the less frequent SSPs often possess uncommon structure features such as split β-sheets, left-handed connections, and crossing loops. This complete SSP enumeration model, for the first time, allows us to investigate which theoretically possible SSPs are not observed in available protein structures. All SSPs with up to four SSEs occurred in proteins. However, among the SSPs with five SSEs, approximately 20% (218) are absent from existing folds. Of these unobserved SSPs, 80% contain two or more uncommon structure features. To facilitate future efforts in protein structure classification, engineering, and design, we provide the resulting patterns and their frequency of occurrence in proteins at: http://prodata.swmed.edu/ssps/.
KW - fold
KW - helix
KW - secondary structure elements
KW - strand
KW - super-secondary structure pattern
UR - http://www.scopus.com/inward/record.url?scp=84991510814&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991510814&partnerID=8YFLogxK
U2 - 10.1016/j.jmb.2016.07.022
DO - 10.1016/j.jmb.2016.07.022
M3 - Article
C2 - 27498165
AN - SCOPUS:84991510814
SN - 0022-2836
VL - 428
SP - 4392
EP - 4412
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
IS - 21
ER -