Enhancing Spam Email Detection with an Optimized Soft Voting Ensemble Classifier
* 1 , 2 , 2 , 1
1  Assistant Professor, Department of CCE, International Islamic University Chittagong (IIUC)
2  B.Sc. Student, Department of CCE, IIUC
Academic Editor: Eugenio Vocaturo

Abstract:

Spam email detection is essential for maintaining cybersecurity, protecting user privacy, and reducing security risks. Because spammers continually adapt, spam filtering methods must keep advancing. This study introduces an automated spam filtering system based on an optimized soft voting ensemble classifier. First, grid search tunes the hyperparameters of four classifiers: Support Vector Machine (SVM), Random Forest (RF), Naive Bayes (NB), and XGBoost. The final classification is then made by a soft voting ensemble that combines the outputs of the optimized classifiers to improve accuracy in detecting and classifying spam emails. The proposed model is evaluated on the Spam_Mails and Enron1 datasets. The experimental results show that the ensemble, which integrates hyperparameter tuning with soft voting, significantly outperforms existing approaches: it achieved accuracies of 99.22% and 99.12% on the Spam_Mails and Enron1 datasets, respectively, and an AUC of 1.00 on both, indicating high effectiveness in distinguishing spam from legitimate email (ham). The ensemble also exhibits better accuracy, generalization, and robustness than the individual classifiers. The findings underscore the importance of hyperparameter tuning and ensemble learning for spam detection systems and set a new benchmark for future research in this domain.

Keywords: Spam email; Cybersecurity; User privacy; Ensemble learning; Soft voting; Machine learning classifier; Grid search
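
The soft voting step named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function name `soft_vote` and the probability values are hypothetical, chosen only to show how averaging class probabilities can differ from a simple majority vote. It assumes each tuned classifier outputs a pair [P(ham), P(spam)] for an email.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Average per-classifier class probabilities and return (label, averages).

    probas  -- one row per classifier, one column per class
    weights -- optional per-classifier weights (equal weights if omitted)
    """
    if not probas:
        raise ValueError("need at least one classifier output")
    n_classes = len(probas[0])
    if weights is None:
        weights = [1.0] * len(probas)
    total = float(sum(weights))
    # Weighted mean of the predicted probability for each class.
    avg = [
        sum(w * row[c] for w, row in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    # The predicted label is the class with the highest averaged probability.
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg


# Hypothetical [P(ham), P(spam)] outputs for a single email (illustrative only).
example = {
    "svm": [0.55, 0.45],
    "rf": [0.55, 0.45],
    "nb": [0.05, 0.95],
    "xgb": [0.52, 0.48],
}
label, avg = soft_vote(list(example.values()))
print("spam" if label == 1 else "ham", avg)
```

In this made-up case three of the four classifiers lean slightly toward ham, so a majority (hard) vote would label the email ham, while the averaged spam probability is about 0.58 because one classifier is very confident. That sensitivity to confidence is the usual motivation for soft voting over hard voting.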