TY - JOUR
T1 - An evolution-based model for designing chorismate mutase enzymes
AU - Russ, William P.
AU - Figliuzzi, Matteo
AU - Stocker, Christian
AU - Barrat-Charlaix, Pierre
AU - Socolich, Michael
AU - Kast, Peter
AU - Hilvert, Donald
AU - Monasson, Remi
AU - Cocco, Simona
AU - Weigt, Martin
AU - Ranganathan, Rama
N1 - Funding Information:
This work was supported by NIH grant R01GM12345 (R.R.), Robert A. Welch Foundation grant I-1366 (R.R.), a Data Science Discovery award from the University of Chicago Center for Data and Computing (R.R.), the Green Center for Systems Biology at UT Southwestern Medical Center (R.R.), the EU H2020 Research and Innovation Programme MSCA-RISE-2016, Grant Agreement 734439 (InferNet) (M.W.), grant number CE30-0021-01 RBMpro from the Agence Nationale de la Recherche (R.M. and S.C.), and Swiss National Science Foundation grants 310030M-182648 (P.K.) and 310030B-176405 (D.H.).
Publisher Copyright:
© 2020 American Association for the Advancement of Science. All rights reserved.
PY - 2020/7/24
Y1 - 2020/7/24
N2 - The rational design of enzymes is an important goal for both fundamental and practical reasons. Here, we describe a process to learn the constraints for specifying proteins purely from evolutionary sequence data, design and build libraries of synthetic genes, and test them for activity in vivo using a quantitative complementation assay. For chorismate mutase, a key enzyme in the biosynthesis of aromatic amino acids, we demonstrate the design of natural-like catalytic function with substantial sequence diversity. Further optimization focuses the generative model toward function in a specific genomic context. The data show that sequence-based statistical models suffice to specify proteins and provide access to an enormous space of functional sequences. This result provides a foundation for a general process for evolution-based design of artificial proteins.
UR - http://www.scopus.com/inward/record.url?scp=85088524641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85088524641&partnerID=8YFLogxK
U2 - 10.1126/science.aba3304
DO - 10.1126/science.aba3304
M3 - Article
C2 - 32703877
AN - SCOPUS:85088524641
SN - 0036-8075
VL - 369
SP - 440
EP - 445
JO - Science
JF - Science
IS - 6502
ER -