Breast cancer remains a major global health challenge and is one of the leading causes of mortality among women worldwide. Despite significant advances in diagnostic imaging and clinical assessment, traditional detection approaches often suffer from high false-positive rates and considerable diagnostic subjectivity. These limitations can delay early treatment and increase patient anxiety. To address them, this study presents a machine learning-based framework for improving breast cancer classification using structured clinical data.

The proposed system uses an ensemble learning model, the Extreme Gradient Boosting (XGBoost) classifier, which is known for its accuracy and speed in predictive modeling tasks. To handle class imbalance and improve sensitivity to malignant cases, the Synthetic Minority Over-sampling Technique combined with Tomek Links (SMOTE Tomek) was applied. In addition, feature importance analysis with the Random Forest algorithm was used to identify the most relevant clinical variables, improving the model’s interpretability and computational efficiency.

Evaluation of the model yielded promising results, with an accuracy of 94.03%, a precision of 91.89%, and an AUC score of 97.0. These metrics indicate that the model is robust and effective in classifying breast cancer cases. The findings underscore the potential of integrating advanced machine learning techniques into healthcare workflows as a more accurate, consistent, and earlier diagnostic aid for breast cancer. The study supports the growing role of data-driven solutions in enhancing clinical decision-making and improving patient outcomes.
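For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, precision, and AUC can be computed from a classifier's outputs. It is not the study's evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers bear no relation to the study's data.

```python
# Illustrative only: toy data and helper names are hypothetical,
# not taken from the study.

def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the cases predicted positive (malignant), the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = malignant, 0 = benign) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.92, 0.81, 0.40, 0.35, 0.10, 0.55, 0.20, 0.77]

# Assumed decision threshold of 0.5 to turn probabilities into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f} precision={prec:.4f} auc={auc:.4f}")
```

Accuracy and precision depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why the three values are usually reported together.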
An Improved Breast Cancer Classification Using Ensemble Learning and Data Resampling Techniques: A Machine Learning Approach
Published: 03 December 2025 by MDPI in The 6th International Electronic Conference on Applied Sciences, session Computing and Artificial Intelligence
Keywords: Breast Cancer, Machine Learning, XGBoost, SMOTE Tomek, Random Forest, Classification, Diagnosis
