Machine learning model for Hausa part-of-speech tagging
* 1 , 2 , 1 , 1
1  Department of Computer Science, Faculty of Computing, Bayero University Kano
2  Department of Software Engineering, Faculty of Computing, Bayero University Kano
Academic Editor: Eugenio Vocaturo

Abstract:

Part-of-speech (POS) tagging assigns each word in a text its appropriate part of speech. It is regarded as one of the fundamental technologies in Natural Language Processing (NLP) applications and serves as a pre-processing step for many NLP tasks. With the recent development of machine learning-based algorithms, part-of-speech tagging has improved, and a respectable number of taggers are now available for high-resource languages such as English. However, low-resource languages such as Hausa still lack accurate and effective computational approaches for part-of-speech tagging. Despite the recent exponential growth of Hausa online content on websites such as BBC.com/Hausa, Freedomradio.com.ng, Hausa Leadership.ng, Aminiya and dailytrust.com.ng, part-of-speech tagging of such Hausa web content has not been investigated by the research community and therefore remains an open research topic. This work proposes a machine learning-based method for Hausa part-of-speech tagging. We implement three architectures, namely long short-term memory (LSTM), bi-directional long short-term memory (BLSTM) and gated recurrent unit (GRU), to perform part-of-speech tagging on a Hausa data set. The labeled data are transformed into a one-hot vector encoding and then passed through a deep neural network with LSTM, BLSTM and GRU hidden layers. We use precision, recall, accuracy and F1-score as the evaluation metrics for the three architectures. The system achieves an overall result of 99%, which shows that the proposed approach outperforms the previous approach (with a result of 79.14%) in terms of precision, recall, accuracy and F1-score.

Keywords: machine learning, Hausa, model, part of speech, tagging
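
The one-hot label encoding and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `one_hot` and `evaluate`, the tag names, and the choice of macro-averaging over tags are all assumptions made for illustration, since the abstract does not specify the tag set or the averaging scheme.

```python
from collections import Counter


def one_hot(labels):
    """Encode each label as a one-hot vector over the sorted set of distinct labels."""
    vocab = sorted(set(labels))
    index = {label: i for i, label in enumerate(vocab)}
    vectors = [[1 if index[label] == j else 0 for j in range(len(vocab))]
               for label in labels]
    return vocab, vectors


def evaluate(gold, predicted):
    """Token-level accuracy plus macro-averaged precision, recall and F1.

    Macro-averaging is an assumption of this sketch, not stated in the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag p was wrong
            fn[g] += 1  # gold tag g was missed
    tags = sorted(set(gold) | set(predicted))
    precisions, recalls, f1s = [], [], []
    for t in tags:
        prec = tp[t] / (tp[t] + fp[t]) if tp[t] + fp[t] else 0.0
        rec = tp[t] / (tp[t] + fn[t]) if tp[t] + fn[t] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(tags)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical tag sequences, used only to exercise the functions.
gold = ["NOUN", "VERB", "NOUN", "PRON"]
predicted = ["NOUN", "VERB", "VERB", "PRON"]
vocab, vectors = one_hot(gold)
scores = evaluate(gold, predicted)
```

In a full pipeline the one-hot vectors would serve as training targets for the recurrent network, and `evaluate` would be applied to the tags predicted for held-out sentences.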