Reliable baseline correction is a cornerstone of spectroscopic analysis, underpinning tasks such as peak identification and the performance of machine learning classifiers. It is particularly crucial in Surface-Enhanced Raman Spectroscopy (SERS), where subtle spectral features carry vital chemical signatures. Traditional baseline correction techniques often introduce artifacts and require extensive manual parameter tuning. The adaptive iteratively reweighted penalized least squares (airPLS) algorithm, though widely appreciated for its speed, has notable drawbacks: its piecewise linear baseline fails to capture smooth backgrounds, it overestimates baselines by linking adjacent peak "feet," and it yields significant mean absolute errors (MAE) in high-intensity regions. To address these limitations, we developed a machine learning approach that predicts optimal airPLS parameters tailored to any input spectrum, eliminating the need for prior baseline knowledge. We fixed the smoothness parameter at 2 and systematically adjusted the penalizing and tolerance parameters. Using three peak types with four baseline profiles, we generated 6,000 simulated spectra with known true baselines. An iterative grid search optimization was used to identify the optimal parameter set for each spectrum, reducing average MAE by 96% compared to default airPLS. For practical deployment, we trained a machine learning model that combines principal component analysis with a random forest, predicting parameters directly from spectra while retaining 90% of the MAE reduction. We then expanded the training dataset to 12,000 spectra, incorporating diverse peak characteristics guided by the statistical distributions of the optimal parameters, which enhances adaptability to real-world spectral variability. Importantly, we demonstrated that for both synthetic and experimental noisy spectra, our model successfully predicts parameters and baselines after simple denoising.
Future work will focus on identifying optimal denoising strategies to further enhance results. By automating baseline correction, our approach enhances analytical precision with applications spanning virus detection, environmental monitoring, and beyond, making SERS more reliable and accessible.
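To make the per-spectrum parameter search described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It follows the commonly published airPLS scheme with a dense Whittaker smoother and runs a single-pass grid search (the abstract describes an iterative one) over a penalty `lam` and a convergence tolerance `tol`, with the difference order fixed at 2. The function names, the mapping of the "tolerance parameter" onto the stopping threshold, the simulated spectrum, the grid values, and the reference setting are all assumptions made for illustration.

```python
import numpy as np

def whittaker_smooth(y, w, lam, order=2):
    """Weighted penalized least squares: solve (W + lam * D^T D) z = W y."""
    n = y.size
    D = np.diff(np.eye(n), n=order, axis=0)  # (n - order) x n difference matrix
    A = np.diag(w) + lam * (D.T @ D)
    return np.linalg.solve(A, w * y)

def airpls(y, lam=1e2, tol=1e-3, order=2, max_iter=15):
    """Baseline estimate by adaptive iteratively reweighted penalized least squares."""
    w = np.ones(y.size)
    z = y
    for i in range(1, max_iter + 1):
        z = whittaker_smooth(y, w, lam, order)
        d = y - z
        neg = d < 0                      # points lying below the current baseline
        dssn = np.abs(d[neg].sum())
        # Stop when the negative residual is small relative to the signal.
        if dssn < tol * np.abs(y).sum() or i == max_iter:
            break
        w[~neg] = 0.0                    # points above the baseline are treated as peaks
        w[neg] = np.exp(i * np.abs(d[neg]) / dssn)
        w[0] = np.exp(i * d[neg].max() / dssn)
        w[-1] = w[0]
    return z

def mae(a, b):
    return float(np.mean(np.abs(a - b)))

# Hypothetical simulated spectrum: smooth polynomial background plus Gaussian peaks.
x = np.linspace(0.0, 1.0, 200)
true_baseline = 5.0 + 10.0 * x - 6.0 * x**2
peaks = sum(a * np.exp(-0.5 * ((x - c) / s) ** 2)
            for a, c, s in [(20, 0.25, 0.010), (12, 0.50, 0.020), (30, 0.80, 0.015)])
spectrum = true_baseline + peaks

# Single-pass grid over (lam, tol); the values are illustrative, not from the source.
reference = (1e2, 1e-3)                  # assumed reference setting, included in the grid
grid = [(lam, tol) for lam in (1e1, 1e2, 1e3, 1e4, 1e5) for tol in (1e-1, 1e-2, 1e-3)]
scores = {p: mae(airpls(spectrum, *p), true_baseline) for p in grid}
best = min(scores, key=scores.get)

print("best (lam, tol):", best, "MAE:", scores[best])
print("reference MAE:", scores[reference])
```

The dense matrices keep the sketch short but scale poorly; for full-length spectra a sparse or banded solver would be the more usual choice. The optimal `(lam, tol)` pairs found this way for many simulated spectra are the kind of targets a regression model could then be trained to predict directly from the spectrum.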
Beyond Traditional airPLS: Improved Baseline Removal in SERS with Parameter-Focused Optimization and Prediction
Published: 19 September 2025 by MDPI in The 5th International Online Conference on Nanomaterials, session Nanomedicine and Bionanotechnology
Keywords: Baseline correction; Surface-enhanced Raman spectroscopy (SERS); adaptive iteratively reweighted penalized least squares (airPLS); optimization; machine learning
