Knowing the geographical origin of food products is crucial to guaranteeing their authenticity and quality, as it helps prevent fraudulent practices in the food industry. The isotopic composition of food products varies with the agroclimatic conditions of their place of origin, so it can serve as a marker of provenance.
The milk dataset (142 samples) comprised measurements of four stable isotope ratios in whole milk (δ¹³C, δ¹⁵N, δ¹⁸O, and δ²H). The egg dataset (180 albumen samples) comprised measurements of δ¹³C, δ¹⁵N, and δ³⁴S. Both datasets were obtained from external sources. These isotopic compositions were analysed using machine learning algorithms that are widely used across different fields.
In this research, random forest (RF), support vector machine (SVM) and artificial neural network (ANN) models were evaluated for their effectiveness in predicting geographical origin from data drawn from the literature and digital data sources.
The selected algorithms behaved differently in the testing phase. The ANNs performed best in determining the geographical origin of milk (accuracy above 89%), whereas the RF model performed best in predicting the geographical origin of eggs (accuracy above 90%).
The results of this research demonstrate the capacity of these learning algorithms to help ensure the authenticity of these products; however, further research is necessary to optimize the developed models, for example by using different data distributions, adding more experimental cases, or modifying the variability of the groups.
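The model comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, which the source does not name, and it uses synthetic values standing in for isotope ratios. The three regions, their mean values, the sample counts, the train/test split and all hyperparameters are hypothetical choices for illustration only; they are not the datasets or settings of this study, and the printed accuracies say nothing about its results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (invented values): 3 hypothetical regions,
# 4 features per sample playing the role of d13C, d15N, d18O and d2H.
region_means = np.array([
    [-27.0, 5.0, -3.0, -20.0],
    [-24.0, 6.5, -5.0, -35.0],
    [-21.0, 4.0, -7.0, -50.0],
])
n_per_region = 50
X = np.vstack([
    rng.normal(loc=mean, scale=1.5, size=(n_per_region, 4))
    for mean in region_means
])
y = np.repeat(np.arange(len(region_means)), n_per_region)

# Hold out a stratified test set so every region appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical hyperparameters. SVM and ANN are sensitive to feature
# scale, so both are wrapped in a standardising pipeline.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# Fit each model on the training split and score it on the held-out split.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: test accuracy = {acc:.2f}")
```

With real isotope measurements, the feature matrix and region labels would replace the synthetic arrays, and a repeated or cross-validated split would give a more stable comparison on datasets of this size.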