Statistical analysis of sequential motifs at biologically relevant protein-protein interfaces

Research output: Contribution to journal › Article › peer-review


Understanding protein-protein interactions (PPIs) at the molecular level may lead to innovations in medicine and biochemistry. The assumption that certain “hot spots” on protein surfaces mediate interactions with other proteins has led to a search for specific sequences involved in protein-protein contacts. In this work, we analyze sequential amino acid motifs, at both the single-motif and the motif-motif level, across a large and diverse dataset of biologically relevant protein-protein interfaces retrieved from the PDB, comparing their presence at interfaces and surfaces in a statistically rigorous manner. At the single-motif level, our results indicate a statistically significant over-representation of hydrophobic, and in particular aromatic, residues and an under-representation of charged residues at protein-protein interfaces. Certain PPI-mediating motifs reported in the literature (e.g., the tyrosine-based motif YxxΦ and the PDZ-binding motif X-S/T-X-V/I) were confirmed to have a significant presence at interfaces. In addition, of the PPI-mediating motifs reported in the ELM database that are present in our dataset, half were confirmed to have a statistically significant presence at interfaces, whereas the others were not. At the single-residue, motif-motif level, cysteine-cysteine contacts were found to be the most abundant, followed by interactions involving aromatic/hydrophobic residues. Top-ranking, longer motif-motif pairs show a predominance of leucine and aromatic residues. Finally, preliminary energy calculations (using the MM/GBSA procedure) indicate a partial correlation between the probability that a motif pair is part of a protein-protein interface and the strength of the interactions between the motifs. In conclusion, this study points to specific characteristics of motifs that have a higher probability of mediating protein-protein interactions. Prominent motifs identified in this study may be used in the future as possible components in protein engineering.
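
As an illustration of the interface-versus-surface comparison described in the abstract, the following minimal sketch counts occurrences of the YxxΦ motif in sequence segments and asks whether the motif is enriched at interfaces. The abstract does not specify the statistical procedure used in the study, so a one-sided hypergeometric test is swapped in here purely as an example. The residue set chosen for Φ, the C-terminal anchoring of the PDZ-binding pattern, the function names, and the toy segments are all assumptions made for illustration and are not taken from the paper.

```python
import re
from math import comb

# Assumption: Φ (bulky hydrophobic residue) is taken as L, I, M, F or V.
# The lookahead lets overlapping occurrences be counted separately.
YXXPHI = re.compile(r"(?=(Y..[LIMFV]))")

# PDZ-binding motif X-S/T-X-V/I; anchoring it at the C-terminus is an assumption.
PDZ_BINDING = re.compile(r".[ST].[VI]$")


def count_hits(pattern, segments):
    """Count motif occurrences (including overlapping ones) over all segments."""
    return sum(len(pattern.findall(s)) for s in segments)


def count_windows(segments, width):
    """Number of sliding windows of the given width over all segments."""
    return sum(max(len(s) - width + 1, 0) for s in segments)


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric tail P(X >= k).

    N: total windows (interface + surface), K: windows matching the motif,
    n: interface windows, k: interface windows matching the motif.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical toy segments, not data from the study.
interface_segments = ["AYQRLYAAF", "GYKTLE"]
surface_segments = ["KDEGSKDE", "EYDDKGSE", "DKEGSDKE"]

k = count_hits(YXXPHI, interface_segments)        # motif hits at interfaces
K = k + count_hits(YXXPHI, surface_segments)      # motif hits overall
n = count_windows(interface_segments, 4)          # 4-residue windows at interfaces
N = n + count_windows(surface_segments, 4)        # 4-residue windows overall

p_value = enrichment_p_value(k, n, K, N)
print(f"YxxΦ: {k}/{n} interface windows, {K}/{N} overall, p = {p_value:.4f}")
```

A small p-value in this sketch would suggest that the motif occurs at interfaces more often than expected if motif windows were distributed at random between interface and surface. A real analysis over many motifs would also need a multiple-testing correction, which is omitted here.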

Original language: English
Pages (from-to): 1244-1259
Number of pages: 16
Journal: Computational and Structural Biotechnology Journal
State: Published - Dec 2024

Bibliographical note

Publisher Copyright:
© 2024 The Authors


Keywords

  • Energetic quantification
  • Protein sequences
  • Protein-protein interactions
  • Sequential motifs
  • Statistical analysis

