Using big-data to understand the protein interface landscape
José G. Almeida *1, Alexandre M. J. J. Bonvin 2, Irina S. Moreira 1,2
1 CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1º andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal.
2 Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584 CH, the Netherlands.


Protein-protein interactions (PPIs) underlie basic organism functions, and understanding them is key to determining the importance of individual proteins in a wide array of complex networks and processes [1]. PPIs are highly diverse, and some interfacial residues contribute more to interface stabilization than others [2]. Hot-spots (HS) are residues whose mutation to alanine is detrimental to the stability of the PPI, whereas null-spots (NS) constitute the remaining interfacial residues [3]. Although the landscape of protein interfaces is complex, patterns and characteristic features emerge when a large amount of data is considered, because the effect of less prevalent interactions and characteristics is minimized. In this work, the SpotOn pipeline [4] (developed by our group), custom scripts and conservation servers were used to determine structural features of interfacial residues and to classify them as HS or NS in the PPI4DOCK database [5], which comprises over 1400 non-redundant complexes. This study allowed us to further understand the structural differences between HS and NS, and the results will be available through a web server in the near future.


Keywords: big-data, protein interfaces, hot-spots