Prediction of functional specificity determinants from protein sequences using log-likelihood ratios

Jimin Pei; Wei Cai; Lisa N. Kinch; Nick V. Grishin

doi:10.1093/bioinformatics/bti766

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Motivation: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results.

Results: We propose a new approach SPEL (specificity positions by evolutionary likelihood) to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as the only input, and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as a measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein α subunit), and compared our method with two existing methods (evolutionary trace and mutual information based). All three methods were also compared on a set of protein families with known ligand-bound structures.
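
The abstract does not spell out how the per-position P-value is derived from the evolutionary simulations. As a rough illustration only, assuming each alignment position has one observed score (for example an evolutionary log-likelihood) and a list of scores obtained from simulated columns under the random model, an empirical P-value could be estimated as sketched below. The function names, the add-one correction, and the choice of which tail counts as extreme are assumptions made for this sketch and are not taken from the paper.

```python
def empirical_p_value(observed, simulated, smaller_is_extreme=True):
    """Monte Carlo P-value of one observed score against simulated null scores.

    Assumption: `smaller_is_extreme` picks the tail; the abstract does not
    say which direction of the log-likelihood is treated as significant.
    """
    if smaller_is_extreme:
        hits = sum(1 for s in simulated if s <= observed)
    else:
        hits = sum(1 for s in simulated if s >= observed)
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (len(simulated) + 1)


def rank_positions(observed_scores, simulated_scores, smaller_is_extreme=True):
    """Return (position index, P-value) pairs sorted by ascending P-value."""
    p_values = [
        (pos, empirical_p_value(obs, sims, smaller_is_extreme))
        for pos, (obs, sims) in enumerate(zip(observed_scores, simulated_scores))
    ]
    return sorted(p_values, key=lambda item: item[1])
```

Positions at the top of the returned list would be the candidates with the lowest P-values, which the abstract describes as likely to be important for functional specificity.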

Original language	English (US)
Pages (from-to)	164-171
Number of pages	8
Journal	Bioinformatics
Volume	22
Issue number	2
DOIs	https://doi.org/10.1093/bioinformatics/bti766
State	Published - Jan 15 2006

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{bef3bdee0aea4fca80f59a05180500c3,
title = "Prediction of functional specificity determinants from protein sequences using log-likelihood ratios",
abstract = "Motivation: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results. Results: We propose a new approach SPEL (specificity positions by evolutionary likelihood) to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as the only input, and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as a measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein α subunit), and compared our method with two existing methods (evolutionary trace and mutual information based). All three methods were also compared on a set of protein families with known ligand-bound structures.",

author = "Pei, Jimin and Cai, Wei and Kinch, {Lisa N.} and Grishin, {Nick V.}",
note = "Funding Information: We are grateful to Leonid A. Mirny for providing the MSA of LacI/PurR family. This work was supported by the NIH grant GM67165 to N.V.G.",
year = "2006",
month = jan,
day = "15",
doi = "10.1093/bioinformatics/bti766",
language = "English (US)",
volume = "22",
pages = "164--171",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "2",
}

TY - JOUR
T1 - Prediction of functional specificity determinants from protein sequences using log-likelihood ratios
AU - Pei, Jimin
AU - Cai, Wei
AU - Kinch, Lisa N.
AU - Grishin, Nick V.
N1 - Funding Information: We are grateful to Leonid A. Mirny for providing the MSA of LacI/PurR family. This work was supported by the NIH grant GM67165 to N.V.G.
PY - 2006/1/15
Y1 - 2006/1/15
AB - Motivation: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results. Results: We propose a new approach SPEL (specificity positions by evolutionary likelihood) to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as the only input, and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as a measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein α subunit), and compared our method with two existing methods (evolutionary trace and mutual information based). All three methods were also compared on a set of protein families with known ligand-bound structures.

UR - http://www.scopus.com/inward/record.url?scp=32144442848&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=32144442848&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bti766
DO - 10.1093/bioinformatics/bti766
M3 - Article
C2 - 16278237
AN - SCOPUS:32144442848
SN - 1367-4803
VL - 22
SP - 164
EP - 171
JO - Bioinformatics
JF - Bioinformatics
IS - 2
ER -
