Abstract
Emotion recognition, particularly through facial expressions, has become vital across diverse domains such as healthcare, entertainment, and education, providing insights into user experiences and guiding decision-making. However, education, and online learning environments in particular, presents distinct challenges: traditional emotion recognition approaches are insufficient to capture the emotional states students express during learning. This research addresses this gap by introducing the concept of learning emotions, such as interest, boredom, and confusion, exhibited by learners during online lectures, and presents a novel approach for recognizing these emotions using a deep learning architecture that combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. The proposed model aims to improve emotion recognition accuracy and enhance the online learning experience. A custom dataset was created by mapping facial action units from the existing emotion classes in the FER2013 dataset to new emotion categories (interested, confused, and bored). Trained and evaluated on this dataset, the model achieved an accuracy of 98.0%, a precision of 97%, a recall of 98%, and an F1-score of 98%, surpassing existing emotion recognition approaches and demonstrating the effectiveness of the CNN-LSTM model for recognizing learners' emotions. This work contributes to the development of affective computing in online learning environments, enabling personalized support and improved learning outcomes, and has potential applications in fields such as education, psychology, and human–computer interaction.
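For illustration, the sketch below shows one common way to realize the kind of CNN-LSTM architecture the abstract describes: a per-frame CNN feature extractor wrapped in TimeDistributed layers, followed by an LSTM that aggregates features over a short frame sequence and a softmax over the three learning-emotion classes. All layer sizes, the sequence length, and the 48x48 grayscale input (FER2013-style crops) are assumptions for this sketch, not the authors' reported configuration.

```python
# Minimal CNN-LSTM sketch for classifying interested / confused / bored.
# Hypothetical hyperparameters; not the paper's exact model.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_lstm(seq_len=10, height=48, width=48, channels=1, num_classes=3):
    # Input: a sequence of seq_len grayscale face crops.
    frame_input = layers.Input(shape=(seq_len, height, width, channels))

    # Per-frame CNN feature extractor, applied identically to every frame.
    x = layers.TimeDistributed(layers.Conv2D(32, (3, 3), activation="relu", padding="same"))(frame_input)
    x = layers.TimeDistributed(layers.MaxPooling2D((2, 2)))(x)
    x = layers.TimeDistributed(layers.Conv2D(64, (3, 3), activation="relu", padding="same"))(x)
    x = layers.TimeDistributed(layers.MaxPooling2D((2, 2)))(x)
    x = layers.TimeDistributed(layers.Flatten())(x)

    # LSTM aggregates per-frame features over time.
    x = layers.LSTM(128)(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(frame_input, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_cnn_lstm()
model.summary()
```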