The Cosmic Microwave Background (CMB) provides a precise snapshot of the early Universe and encodes detailed information about the fundamental parameters governing cosmological evolution. Conventional approaches to cosmological parameter inference rely on Bayesian sampling techniques combined with numerical Boltzmann solvers, which, while powerful, often obscure the underlying functional relationships between cosmological parameters and observable features of the CMB power spectrum.
In this work, we investigate symbolic regression as an interpretable machine-learning framework for cosmological parameter inference from CMB temperature and polarization power spectra. Using the PySR library, we search for compact, closed-form analytic expressions that relate cosmological parameters to features of the CMB power spectrum, giving transparent and physically interpretable mappings between theory and observation. Unlike black-box neural networks, symbolic regression yields explicit mathematical expressions that can be directly analyzed and compared with theoretical expectations.
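To make the idea of searching for a closed-form expression concrete, the following is a minimal toy sketch in plain Python. It is not PySR and not the pipeline used in this work: the candidate list, the complexity scores, the feature names `x0` and `x1`, and the synthetic target are all hypothetical choices made for illustration. PySR explores a far larger expression space with an evolutionary search, whereas this sketch only enumerates a handful of fixed forms, fits one scale coefficient to each by least squares, and ranks them by error.

```python
import math
import random

# Hypothetical candidate forms f(x0, x1); each is multiplied by one fitted coefficient.
# The last field is a crude, hand-assigned complexity score (illustrative only).
CANDIDATES = [
    ("x0", lambda x0, x1: x0, 1),
    ("x1", lambda x0, x1: x1, 1),
    ("x0 + x1", lambda x0, x1: x0 + x1, 3),
    ("x0 * x1", lambda x0, x1: x0 * x1, 3),
    ("x0 / x1", lambda x0, x1: x0 / x1, 3),
    ("x0 * sqrt(x1)", lambda x0, x1: x0 * math.sqrt(x1), 4),
    ("x0**2 / x1", lambda x0, x1: x0 * x0 / x1, 5),
]


def fit_scale(basis, targets):
    """Return the coefficient a that minimizes sum((a * b - y)**2)."""
    numerator = sum(b * y for b, y in zip(basis, targets))
    denominator = sum(b * b for b in basis)
    return numerator / denominator


def search(features, targets):
    """Fit every candidate and return (mse, complexity, name, coeff), best first."""
    results = []
    for name, func, complexity in CANDIDATES:
        basis = [func(x0, x1) for x0, x1 in features]
        coeff = fit_scale(basis, targets)
        mse = sum((coeff * b - y) ** 2 for b, y in zip(basis, targets)) / len(targets)
        results.append((mse, complexity, name, coeff))
    # Sort by error first; complexity breaks ties in favor of simpler forms.
    results.sort()
    return results


# Synthetic, noise-free data standing in for (spectrum features -> parameter).
rng = random.Random(0)
features = [(rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)) for _ in range(200)]
targets = [3.0 * x0 / x1 for x0, x1 in features]

best_mse, best_complexity, best_name, best_coeff = search(features, targets)[0]
print(f"best form: {best_coeff:.3f} * ({best_name}), mse = {best_mse:.2e}")
```

The returned expression is an explicit formula that can be inspected directly, which is the property that distinguishes this class of methods from a fitted neural network. A realistic search would also trade accuracy against complexity rather than sorting on error alone.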
We evaluate the accuracy, complexity, and stability of the recovered symbolic models across multiple cosmological parameters and benchmark them against expressions obtained using the AI Feynman algorithm. While AI Feynman performs well in low-dimensional settings, we find that its performance degrades as the dimensionality and complexity of the parameter space increase. In contrast, PySR demonstrates greater robustness and flexibility in the higher-dimensional regimes relevant to realistic cosmological inference problems.
Our results show that symbolic regression can recover accurate and compact analytic relationships while providing direct physical insight into the structure of the CMB parameter space. This work highlights symbolic machine learning as a promising complementary approach to traditional inference methods and contributes toward more interpretable and physics-informed analyses of cosmological data.
