Numerical Evaluation of Protein Global Vibrations at Terahertz Frequencies by means of Elastic Lattice Models

: Proteins represent one of the most important building blocks for most biological processes. Their biological mechanisms have been found to correlate significantly with their dynamics, which is commonly investigated through Molecular Dynamics (MD) simulations. However, important insights on protein dynamics and biological mechanisms have also been obtained via much simpler and computationally efficient calculations based on Elastic Lattice Models (ELMs). The application of Structural Mechanics approaches, such as modal analysis, to the protein ELMs has allowed to find impressive results in terms of protein dynamics and vibrations. The low-frequency vibrations extracted from the protein ELM are usually found to occur within the terahertz (THz) frequency range and correlate fairly accurately with the observed functional motions. In this contribution, the global vibrations of lysozyme will be investigated by means of a Finite Element (FE) truss model and we will show that there exists complete consistency between the proposed FE approach and one of the more well-known ELMs for protein dynamics, the Anisotropic Network Model (ANM). The proposed truss model can consequently be seen as a simple method, easily accessible to the Structural Mechanics community members, to analyze protein vibrations and their connections with the biological activity.


Introduction
The fundamental paradigm that has characterized most of the research in the field of protein computational biology relies on the sequence-structure-dynamics-function relationship [1].The linear sequence of amino acids arranges itself into the tertiary structure of the protein through the complex chemo-physical phenomenon of protein folding [2,3].The folded protein structure is not a static entity, but it is generally highly dynamic and flexible.The resulting dynamics of the protein obviously depends on the three-dimensional protein structure.Finally, it has been demonstrated that the dynamics and flexibility of the protein acts towards its functionality and biological mechanism [1,4,5].Therefore, the sequence of amino acids generates the protein structure, which in turn drives the biological function through its dynamic and flexibility features.
Several computational approaches have been suggested to reproduce the protein dynamics and vibrations correctly.Molecular Dynamics (MD) and normal mode analysis (NMA) represent two of the most widely used methodologies [6,7].The former looks at the complete trajectories of the atoms, whereas the latter investigates the small-amplitude normal modes of the system, which is characterized by multi-parameter harmonic potentials.Later on, it has been shown that simplified models relying only on a single-parameter harmonic potential, i.e., an ensemble of Hookean springs, are sufficient to describe the slow dynamics of the protein with good details [8].Then, it has also been found that even coarse-grained models, only based on the C α atoms of the protein, are able to properly reproduce the protein slow dynamics as well as the fluctuations observed in the X-ray crystallographic experiments, i.e., the B-factors [9,10].
Here, we report the results that arise from the application of modal analysis to the Elastic Lattice Model (ELM) of hen egg-white lysozyme, a well-known globular protein.The structural model has been developed within a Finite Element (FE) framework and the outcomes have provided useful insights regarding the lysozyme flexibility, the biologically relevant motions as well as the vibrational frequencies.In particular, the latter have been found to lie in the (sub-)THz frequency range.More detailed information about this analysis and the obtained results can be found in [11].

Materials and Methods
The structural model was developed within the FE code LUSAS [12] and relies on modelling the protein structure as an ELM, i.e., a 3D spatial truss.The elastic bars of the truss were generated by taking into account only the C α atoms of the lysozyme (PDB: 4YM8) and by connecting those nodes that lie closer than a selected cutoff value.Five different cutoffs were selected, i.e., 8, 10, 12, 15 and 20 Å , in order to generate various ELMs and investigate the influence of such a geometrical parameter on the outcomes.The elastic bars were assigned the same axial rigidity EA, where E is the Young's Modulus and A the cross-sectional area.The value of rigidity was defined based on the comparison between the experimental B-factors and the computed ones (see below).The masses of all the ELM nodes were set equal among them, and equal to the total mass of the protein divided by the number of nodes [11].
Based on the generated ELMs, modal analysis was carried out, by solving the multi-degree-offreedom (MDOF) free-vibration problem within the FE environment.This implies solving the following eigenvalue-eigenvector equation: being K and M the global stiffness and mass matrix of the ELM, respectively, and ωn and δn the eigenvalue (angular frequency) and eigenvector (mode shape) related to the n th vibrational mode.Note that, for an ELM counting N nodes and which is not externally constrained, we obtain 3N-6 nonzero eigenvalues and 3N-6 non-rigid eigenvectors [11].The stiffness and mass matrix were computed within the FE environment as [13]: where kb* represents the 2 × 2 stiffness matrix of the b th elastic bar in the local reference system, Nb is the 2 × 6 rotation matrix for the b th bar between the local and global reference systems, Cb is the 6 × 3N expansion matrix which is used to expand the stiffness matrix of the b th bar to the global structural dimension, m is the mass of each ELM node and I is the 3N × 3N identity matrix [11].
One of the most used methods for the extraction of the normal modes from the coarse-grained protein crystal structure is the Anisotropic Network Model (ANM) [10].The FE-based ELM described above can be seen as the counterpart of the ANM by following a purely Structural Mechanics approach.The main difference is that the ELM defined above also includes the quantitative information about the mass of the system, which is commonly not included in ANM calculations [11].Nevertheless, it can be easily demonstrated that the stiffness matrix of the ELM K corresponds to the Hessian matrix of the ANM H.The latter can be represented as a N × N matrix, whose elements are 3 × 3 sub-matrices Hij.These contain the second partial derivatives of the harmonic potential of the linear spring connecting nodes i and j (see [10,11,14] for more details).In [11], we have shown that there exists complete consistency between the ELM K and the ANM H.In particular, when the decay parameter of the ANM spring constant is set equal to 1 [14], the two matrices are exactly the same.
Based on the eigenvalues and eigenvectors obtained from Equation ( 1), the B-factors can be numerically computed as [11,15]: being Bi the B-factor of the i th node, kB the Boltzmann's constant, T the absolute temperature, δi,n the absolute displacement of the i th node according to the n th mode and ωn the angular frequency of the n th mode shape.The B-factors obtained from Equation (3) can be directly compared with the experimental ones contained in the PDB file [16].Moreover, by posing that the average value of the experimental B-factors should match the average value of the computed ones, one can fix the axial rigidity EA to be assigned to the elastic bars of the ELM.

Results and Discussion
Table 1 reports the structural features of the five generated ELMs, according to the selected cutoff value.It can be seen that, increasing the cutoff value, the number of elastic connections within the ELM increases, therefore the mean length of the elastic bars is greater.However, in order to have consistency between the average value of the numerical and experimental B-factors, the axial rigidity EA assigned to the elastic bars needs to decrease as the cutoff increases.As a result, the axial stiffness of the mean elastic connection gets lower for higher cutoffs.Table 1 also shows the obtained Pearson correlation coefficients between the experimental and numerical B-factors.As can be seen, the correlation is around 60-70% and it tends to increase for higher cutoffs [11].From the analysis of the eigenvectors δn, i.e., the mode shapes, it was found that the lowfrequency vibrational modes obtained from the ELMs of lysozyme reproduce fairly accurately the biologically relevant motion of the lysozyme [11].Also, the ELM-based mode shapes were found to imply high flexibility in the same regions of the protein as previously obtained by Levitt et al. [7], who made use of NMA in internal coordinates.The cutoff parameter was found to have a certain influence on the obtained mode shapes.As extensively reported in [11], the mode shapes obtained from models with close cutoff values, i.e., 10, 12 and 15 Å , exhibit several common features.Conversely, using much lower and much higher values of the geometrical cutoff, i.e., 8 and 20 Å , leads to some differences in the displacement field of the mode shapes.Nevertheless, all models were efficiently able to detect the regions expected to have the highest flexibility.
From Equation (1) the angular frequencies ωn associated to each mode shape were also obtained, which are linearly related to the vibrational frequencies fn (ωn = 2πfn).Figure 1 shows the frequencies for the lysozyme ELM, depending on the selected cutoff value.Figure 1a displays the entire set of frequencies according to the mode number, whereas Figure 1b reports the statistical distribution of these frequency values.As can be see, the vibrational frequencies lie in the (sub-)THz range.The lowest frequency ranges from 0.046 THz for Model A up to 0.117 THz for Model E [11].It can be inferred that increasing the cutoff value leads to higher values of the frequency.This phenomenon occurs up to about the 50th mode.Afterwards, it can be seen that higher values of the frequency are obtained for the lower cutoff values.Finally, Figure 1b shows that the histograms of frequencies, that resemble Gaussian-like distributions.This seems to be a common feature to all globular proteins, as shown in [17,18].It is also evident that, adopting when higher cutoff values for the protein ELM, the whole range of obtained frequencies becomes remarkably narrower.

Conclusions
In this contribution, we have discussed the results of modal analysis on the lysozyme ELM, which were previously reported in [11].In particular, we observed that the cutoff parameter has a certain influence on the assigned rigidity/stiffness values of the ELM members and on the correlation between the numerical and experimental B-factors.In this case, the correlation was found to increase as the geometrical cutoff increases.The obtained mode shapes were found to reproduce fairly accurately the biologically relevant motion of the enzyme, with some slight differences in the displacement fields due to the specific selected cutoff value.Finally, the frequency values were found to distribute according to Gaussian-like distributions in the (sub-)THz frequency range.The range of the complete set of frequencies was also found to shrink as the cutoff value of the ELM increases.Future studies will investigate more deeply such frequency distribution, which seems to be a common feature to all globular proteins.

Figure 1 .
Figure 1.Frequencies in the (sub-)THz frequency range for the lysozyme: (a) frequency for all obtained normal modes; (b) frequency distributions.

Table 1 .
The five generate ELMs for the lysozyme: structural features and correlation between the experimental and numerical B-factors.