In recent years, data-driven techniques have become indispensable for extracting meaningful insights from complex atomic and subatomic simulations of colloidal and interfacial systems. In this work, we employ such techniques to investigate partial molar volumes (PMVs), water-in-oil droplet coalescence, and adsorption energetics.
PMVs provide critical insight into molecular interactions and structural organization in multicomponent systems. However, conventional PMV calculation methods are computationally intensive and procedurally complex. To address this challenge, we developed a novel PMV calculation approach based on linear regression. This method leverages systematic data sampling from standard atomistic trajectories in which system composition remains fixed throughout the simulation. The approach was validated and subsequently applied to an industrially relevant system, enabling direct correlation between atomistic solubility behavior and macroscopic properties.
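As a rough illustration of the regression idea, the sketch below assumes that each sample drawn from a trajectory supplies a volume V_k and the molecule counts N_ik of each species it contains, and that V_k ≈ Σ_i N_ik v_i with roughly constant per-molecule volumes v_i over the sampled range. The function name, the sampling scheme, and the synthetic numbers are hypothetical and are not taken from this work.

```python
import numpy as np


def partial_molar_volumes(counts, volumes):
    """Fit V_k ~ sum_i N_ik * v_i by ordinary least squares (no intercept).

    counts  : (n_samples, n_species) molecule counts per sample
    volumes : (n_samples,) volume of each sample
    Returns the fitted per-molecule volume of each species; multiply by
    Avogadro's number to convert to a molar quantity.
    """
    counts = np.asarray(counts, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    v, _, rank, _ = np.linalg.lstsq(counts, volumes, rcond=None)
    if rank < counts.shape[1]:
        # Identical compositions in every sample cannot separate the species.
        raise ValueError("sampled compositions are not linearly independent")
    return v


# Synthetic check with made-up values (nm^3 per molecule), not simulation data.
rng = np.random.default_rng(0)
true_v = np.array([0.030, 0.250])
counts = rng.integers(50, 200, size=(400, 2))
volumes = counts @ true_v + rng.normal(0.0, 0.05, size=400)
estimated_v = partial_molar_volumes(counts, volumes)
```

The fit only separates the species when the sampled compositions differ from one sample to the next, which is why the sketch checks the rank of the count matrix before returning a result.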
In addition, data-mining analysis revealed the interplay between solute aggregation and water-in-oil nanodroplet coalescence in a ternary system comprising solute, oil, and water. A nonmonotonic dependence of solute stacking on droplet size was observed, arising from competing aggregation and adsorption effects. To further investigate droplet growth mechanisms, we developed an in-house analysis tool that extracts and quantifies water-molecule dynamics from the high-dimensional space of atomistic simulation trajectories. Our results indicate that droplet growth is dominated by the largest droplet, which acts as the primary nucleation site.
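One common way to follow droplet growth, shown here only as a minimal sketch and not as the in-house tool itself, is to group water molecules into droplets with a distance cutoff and track the size of the largest cluster frame by frame. The function name and cutoff are hypothetical, and periodic boundary conditions are ignored for brevity.

```python
import numpy as np


def droplet_sizes(positions, cutoff):
    """Cluster points by distance cutoff; return cluster sizes, largest first.

    positions : (n_molecules, 3) coordinates, e.g. water oxygen atoms
    cutoff    : two molecules share a droplet if they lie within this distance
    Assumption: no periodic boundary conditions; O(n^2) memory.
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    diff = positions[:, None, :] - positions[None, :, :]
    close = (diff ** 2).sum(axis=-1) <= cutoff ** 2
    for i, j in zip(*np.nonzero(np.triu(close, k=1))):
        ri, rj = find(int(i)), find(int(j))
        if ri != rj:
            parent[rj] = ri

    roots = [find(i) for i in range(n)]
    _, counts = np.unique(roots, return_counts=True)
    return sorted(counts.tolist(), reverse=True)


# Toy frame: a chain of three molecules, a pair, and one isolated molecule.
example = [[0.0, 0, 0], [0.3, 0, 0], [0.6, 0, 0],
           [5.0, 0, 0], [5.3, 0, 0], [10.0, 0, 0]]
sizes = droplet_sizes(example, cutoff=0.35)
largest_fraction = sizes[0] / sum(sizes)  # share of water in the largest droplet
```

Applying such a function to each frame and plotting `largest_fraction` against time gives a simple indicator of whether growth is concentrated in a single droplet.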
We also developed an efficient strategy to fine-tune pretrained equivariant graph neural networks (eGNNs) for predicting the adsorption energies of aromatic molecules on solid surfaces. Aromatic systems are of particular interest because of their distinctive electronic properties and their widespread relevance in catalysis and interfacial processes.
Overall, this work demonstrates the effectiveness of data-driven approaches for analyzing complex molecular systems. The methodologies developed here are applicable to a wide range of colloidal and interfacial systems and offer new avenues for understanding atomic-scale interactions and collective behaviors.