Developing a model for the automated identification and extraction of agricultural terms from unstructured text
* 1 , 2 , 3
1  Institute of Data Science, Maastricht University, Maastricht, Netherlands
2  Institute of Data Science, Maastricht University, Maastricht, Netherlands & Data Science Group, TNO, Soesterberg, Netherlands
3  Dpt. of Natural Resources and Agricultural Engineering, Agricultural University of Athens
Academic Editor: Bin Gao

Abstract:

Text is the most prevalent medium for conveying research findings and developments within and beyond the domain of agriculture, whether in the form of scholarly publications, reports, articles, or posts on websites and social media channels. Mining information from text is therefore essential for allowing the agricultural (research) community to keep track of the most recent advancements, as well as to update ontologies and other structures used to model and formally represent domain-specific knowledge. However, the pace and volume at which texts are currently being produced make manual extraction of information infeasible. We therefore need to rely on technology-supported, machine learning-based methods capable of mining information from large corpora of unstructured text. Within this context, this paper describes a model for the automated identification and extraction of agricultural terms mentioned in texts, built upon spaCy, a free, open-source library for Natural Language Processing in Python. The model has been trained on a carefully selected corpus of agriculture-related texts, manually annotated for mentions of agricultural terms. Its performance has been evaluated against standard metrics and compared with similar and baseline term recognition approaches. The paper also discusses in detail how the proposed model can be exploited in further research.

Keywords: Agricultural term extraction; machine learning model; natural language processing; Python; spaCy
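
The abstract states that the model is evaluated against standard metrics but does not name them. As an illustration only, the sketch below assumes the usual choice for term recognition: precision, recall, and F1 computed over exactly matching annotated spans. The span representation, the function name span_prf, and the label "AGRI_TERM" are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span format: (start_char, end_char, label).
Span = Tuple[int, int, str]


def span_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact span-and-label matching.

    Assumption: a predicted term counts as correct only if its character
    offsets and label are identical to a manually annotated term.
    """
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data: three annotated terms, four predicted terms, two of them exact matches.
gold_spans = [(0, 5, "AGRI_TERM"), (10, 20, "AGRI_TERM"), (30, 38, "AGRI_TERM")]
predicted_spans = [
    (0, 5, "AGRI_TERM"),
    (10, 20, "AGRI_TERM"),
    (40, 45, "AGRI_TERM"),
    (50, 55, "AGRI_TERM"),
]

precision, recall, f1 = span_prf(gold_spans, predicted_spans)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching is the strictest variant; a partial-overlap criterion would give higher scores on the same predictions, so any comparison with baseline approaches should apply the same matching rule to every system.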