Understanding the impact of single-nucleotide polymorphisms (SNPs) on protein structure is a critical challenge in functional genomics. We developed a scalable computational pipeline to evaluate SNP effects on protein stability, flexibility, and dynamics, using genomic data from 1,467 adult honey bee drones (Apis mellifera) across 25 countries and 18 subspecies, sourced from the MEDIBEES project. High-stringency variant calling identified SNPs for analysis.
Wild-type structures were obtained from AlphaFold2 predictions, SWISS-MODEL homology models, and the AlphaFold Protein Structure Database, and were ranked by confidence metrics (pLDDT, DOPE, TM-score). AlphaFold2 models showed high internal consistency (average TM-score 0.9975 ± 0.0015) and similarity to homology-based models (average TM-score 0.886 ± 0.076) and experimental structures (average TM-score 0.837 ± 0.001). SNP variants were generated via in silico mutagenesis using mutation-aware AlphaFold2, MODELLER (for homology modeling), and SWISS-MODEL. All structures underwent energy minimization to resolve clashes.
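As an illustration of one ranking criterion, the sketch below orders candidate models by mean pLDDT. It relies on the fact that AlphaFold PDB files store per-residue pLDDT in the B-factor column; the function names `mean_plddt` and `rank_models` are hypothetical, and the pipeline's actual ranking also uses DOPE and TM-score, which are not shown here.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average pLDDT over residues, read from the B-factor field of CA atoms.

    Assumes an AlphaFold-style PDB file, where columns 61-66 hold pLDDT.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no CA atoms found")
    return sum(scores) / len(scores)


def rank_models(models: dict[str, str]) -> list[tuple[str, float]]:
    """Sort models (name -> PDB text) from highest to lowest mean pLDDT."""
    scored = [(name, mean_plddt(text)) for name, text in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Calling `rank_models` with a mapping of model names to PDB file contents returns the names paired with their mean pLDDT, best first.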
Molecular dynamics simulations (AMBER25, 100 ns each) were run under physiological conditions, and the trajectories were analyzed for solvent-accessible surface area (SASA), root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), hydrogen bonds (H-bonds), and backbone dihedral angles (φ, ψ). These metrics revealed SNP-induced changes. In AlphaFold2 models, mean RMSD decreased from 1.81 Å (wild-type) to 1.58 ± 0.12 Å (mutants), suggesting improved stability; RMSF increased from 0.82 to 0.85 ± 0.05 Å, indicating higher flexibility; Rg decreased from 38.78 to 38.66 ± 0.09 Å, reflecting greater compactness; the mean number of H-bonds fell from 270.5 to 268.7 ± 3.6; and SASA declined from 54,139 to 52,946 ± 977 Ų, implying reduced solvent exposure. In contrast, homology-based models (e.g., trRosetta) showed smaller perturbations (e.g., RMSD from 2.25 to 2.22 Å).
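For readers unfamiliar with these trajectory metrics, the following is a minimal NumPy sketch of how RMSD, RMSF, and Rg can be computed from atomic coordinates. It is not the analysis code used in this work: the function names are hypothetical, frames are superimposed on the reference with the Kabsch algorithm, and Rg is unweighted (all atoms are assumed to have equal mass).

```python
import numpy as np


def kabsch_align(mobile: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Superimpose mobile (N, 3) onto ref (N, 3) by rigid rotation and translation."""
    mob_c = mobile - mobile.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob_c @ rot.T + ref.mean(axis=0)


def trajectory_metrics(traj: np.ndarray, ref: np.ndarray):
    """Return per-frame RMSD, per-atom RMSF, and per-frame Rg.

    traj has shape (frames, atoms, 3); ref has shape (atoms, 3).
    Output units match the input coordinates (e.g. Å).
    """
    aligned = np.stack([kabsch_align(frame, ref) for frame in traj])
    # RMSD: deviation of each aligned frame from the reference structure.
    rmsd = np.sqrt(((aligned - ref) ** 2).sum(axis=2).mean(axis=1))
    # RMSF: fluctuation of each atom around its time-averaged position.
    mean_pos = aligned.mean(axis=0)
    rmsf = np.sqrt(((aligned - mean_pos) ** 2).sum(axis=2).mean(axis=0))
    # Rg: spread of atoms around the geometric centre of each frame.
    centered = traj - traj.mean(axis=1, keepdims=True)
    rg = np.sqrt((centered ** 2).sum(axis=2).mean(axis=1))
    return rmsd, rmsf, rg
```

Averaging `rmsd` and `rg` over frames, and `rmsf` over atoms, gives single summary values of the kind compared between wild-type and mutant simulations above.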
AlphaFold2 and AlphaFold Protein Structure Database wild-type models showed greater sensitivity to SNP perturbations than homology models, while mutant structures remained highly congruent with mutation-aware predictions (TM-scores >0.99), underscoring the method's robustness for SNP effect modeling.
