This study models total polycyclic aromatic hydrocarbon (PAH) concentrations in smoked fish products as a function of smoking parameters, using machine learning techniques in the WEKA software environment. The input variables were fish fat content, smoking temperature, and wood type, all of which were statistically significant predictors of PAH levels (p < 0.05). A multiple linear regression analysis conducted in SPSS showed a strong correlation between the predictors and PAH concentration (r = 0.801), with 64.1% of the variance explained (R² = 0.641) and a standard error of 3.52. Six machine learning algorithms (Linear Regression, SMOreg, Multilayer Perceptron, M5P, Random Forest, and IBk) were evaluated on five criteria: correlation coefficient, mean absolute error (MAE), root mean squared error (RMSE), relative absolute error (RAE), and root relative squared error (RRSE). All models were validated using 10-fold cross-validation. For classification tasks based on fish species, Logistic Regression outperformed the Random Forest and J48 algorithms. Overall, the findings underscore the practical value of machine learning tools for food safety monitoring: the approach offers high predictive accuracy and provides a scientific basis for optimizing smoking conditions to minimize PAH formation. This integrated framework can help food technologists and manufacturers establish safer processing parameters while maintaining product quality.
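The five regression criteria named in the abstract can be illustrated with a minimal Python sketch. The function name `regression_metrics` and the sample numbers are hypothetical and do not come from the study. The relative errors here are computed against a baseline that always predicts the mean of the supplied actual values, which is an assumption; WEKA's own evaluation may derive that baseline differently.

```python
import math


def regression_metrics(actual, predicted):
    """Return correlation, MAE, RMSE, RAE (%) and RRSE (%) for paired values.

    Assumption: RAE and RRSE compare the model against a baseline that
    always predicts the mean of `actual`.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    # Model errors
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Errors of the mean-only baseline
    base_abs = sum(abs(a - mean_a) for a in actual)
    base_sq = sum((a - mean_a) ** 2 for a in actual)

    # Pearson correlation between actual and predicted values
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if base_sq == 0 or var_p == 0:
        raise ValueError("metrics are undefined when either series is constant")

    return {
        "correlation": cov / math.sqrt(base_sq * var_p),
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae_percent": 100.0 * abs_err / base_abs,
        "rrse_percent": 100.0 * math.sqrt(sq_err / base_sq),
    }


# Hypothetical values for illustration only (not data from the study)
example = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this toy case every prediction is off by exactly one unit, so the correlation is perfect while the RAE is 100%, meaning the model does no better in absolute error than predicting the mean. This is why the relative measures are reported alongside the correlation coefficient rather than in place of it.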
Machine Learning-Based Prediction of Polycyclic Aromatic Hydrocarbon (PAH) Levels in Smoked Fish Using WEKA: Evaluation of Smoking Parameters and Model Performance
Published: 27 October 2025 by MDPI in The 6th International Electronic Conference on Foods, session Sustainable Food Security and Food Systems
Keywords: PAH prediction, smoked fish, machine learning, WEKA, regression analysis
