Machine learning on ligand-residue interaction profiles to significantly improve binding affinity prediction

Beihong Ji, Xibing He, Jingchen Zhai, Yuzhao Zhang, Viet Hoang Man, Junmei Wang

Research output: Contribution to journal › Article › peer-review

12 Scopus citations


Structure-based virtual screenings (SBVSs) play an important role in drug discovery projects. However, it remains a challenge to accurately predict the binding affinity of an arbitrary molecule bound to a drug target and to prioritize the top ligands from an SBVS. In this study, we developed a novel method that uses ligand-residue interaction profiles (IPs) to construct machine learning (ML)-based prediction models and thereby significantly improves the screening performance of SBVSs. We call this kind of prediction model an IP scoring function (IP-SF). We systematically investigated how to improve the performance of IP-SFs from several perspectives, including the sampling method used before the interaction energy calculation and the choice of ML algorithm. Using six drug targets, each with hundreds of known ligands, we conducted a critical evaluation of the developed IP-SFs. The IP-SF employing a gradient boosting decision tree (GBDT) algorithm in conjunction with the MIN + GB simulation protocol achieved the best overall performance. Its scoring power, ranking power and screening power significantly outperformed those of the Glide SF. First, compared with Glide, the average mean absolute error and root mean square error of GBDT/MIN + GB decreased by about 38% and 36%, respectively. Second, the mean squared correlation coefficient and predictive index increased by about 225% and 73%, respectively. Third, and more encouragingly, the average area under the receiver operating characteristic curve over the six targets was 0.87 for GBDT, significantly better than the 0.71 obtained by Glide. Thus, we expect IP-SFs to have broad and promising applications in SBVSs.
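The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names and the toy numbers are hypothetical, chosen only to show how mean absolute error, root mean square error, the squared (Pearson) correlation coefficient and the area under the ROC curve are typically computed. The predictive index is left out.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between experimental and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def squared_pearson(y_true, y_pred):
    """Squared Pearson correlation coefficient (one common reading of R^2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)

def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (active, inactive) pairs in which the active has the higher score.
    Ties count as half a win. Higher score means 'more likely active'."""
    actives = [s for lab, s in zip(labels, scores) if lab == 1]
    inactives = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))

# Hypothetical toy data (kcal/mol), not taken from the paper.
experimental = [-9.0, -8.0, -7.0, -6.0]
predicted = [-8.5, -8.5, -6.5, -6.5]

# Hypothetical screening example: 1 = known active, 0 = decoy.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

print("MAE :", mae(experimental, predicted))
print("RMSE:", rmse(experimental, predicted))
print("R^2 :", squared_pearson(experimental, predicted))
print("AUC :", roc_auc(labels, scores))
```

When the scoring function outputs a binding energy, a lower value usually means a better binder, so the predictions would need to be negated before being passed to `roc_auc` as written here.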

Original language: English (US)
Article number: bbab054
Journal: Briefings in Bioinformatics
Issue number: 5
State: Published - Sep 1 2021
Externally published: Yes


  • Binding affinity prediction
  • Machine learning (ML)
  • Machine learning-based scoring function (ML-based SF)
  • Scoring function (SF)
  • Scoring power
  • Structure-based virtual screening (SBVS)

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology

