This study compares three machine learning (ML) models, Deep Learning (DL), Distributed Random Forest (DRF), and Gradient Boosting Machine (GBM), for predicting the consistency of apple purée, a key quality attribute in its production. Consistency affects product quality and acceptability and can be used to regulate process settings in industrial production lines. The main objective was to model the Bostwick flow distance (cm/30 s), a practical measure of purée consistency, from a combination of inline process data and the physicochemical properties of the apples, and to compare the performance of the ML models. The data, collected from an industrial production line, included pressure drop, average flow velocity, inline temperature, °Brix, pH, and color parameters (L*, a*, b*, Chroma, and Hue). Preprocessing was carried out using H2O's default settings, as the platform is fast and user-friendly. Models were trained on 75% of the dataset, with the remaining 25% used for validation. All modeling followed the platform's default settings, except for the number of trees, which was increased from 50 to 100 for both DRF and GBM. Model performance was evaluated using standard regression metrics (R², RMSE, and MAE).
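To make the evaluation step concrete, the following is a minimal sketch of a 75/25 split and of the three metrics named above, written in plain Python rather than with H2O. The function names `split_75_25` and `regression_metrics` and all numeric values are hypothetical and chosen for illustration only; they are not data or code from the study, and H2O's own splitting and metric routines may differ in detail.

```python
import math
import random


def split_75_25(rows, seed=0):
    """Shuffle rows reproducibly and return (train, validation) at a 75/25 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.75 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Made-up Bostwick flow distances (cm/30 s), for illustration only.
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.5, 7.0]
metrics = regression_metrics(observed, predicted)

# Split 20 placeholder row indices into training and validation sets.
train, valid = split_75_25(range(20), seed=42)
```

R² expresses the share of variance explained, while RMSE and MAE are in the units of the target (cm/30 s); RMSE penalizes large errors more strongly than MAE.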
GBM outperformed both DL and DRF in predictive accuracy and generalization, likely owing to its lower sensitivity to multicollinearity and its strong ability to model non-linear interactions. DRF gave acceptable results, but its performance was less stable, possibly because of its limitations with multicollinearity, which affected its validation and learning curves. DL captured complex patterns effectively but required greater computational resources. Variable importance analysis of the GBM model showed that pressure difference was the most influential feature, providing meaningful insight into consistency behavior. This study highlights the importance of combining rheological knowledge with data-driven models to enable objective and adaptive consistency monitoring in food production. It also demonstrates the potential of ML frameworks in industrial process environments.
Comparison of Machine Learning Models for Apple Purée Consistency Prediction
Published: 27 October 2025 by MDPI in The 6th International Electronic Conference on Foods, session Food Technology and Engineering
Keywords: Machine Learning, Deep Learning, Distributed Random Forest, Gradient Boosting Machines, consistency, rheology, apple purée
