In this study, we used a publicly available dataset of 3256 peripheral blood smear (PBS) images prepared in the bone marrow laboratory of Taleqani Hospital in Tehran, Iran. The dataset comprises blood samples from 89 patients suspected of having acute lymphoblastic leukemia (ALL). The images were captured with a Zeiss camera at 100x magnification and stored as JPG files. The dataset is divided into two primary classes: benign hematogones and malignant lymphoblasts. The malignant lymphoblasts are further divided into three subtypes: Early Pre-B, Pre-B, and Pro-B ALL. The definitive labels for these cell types and subtypes were assigned by a specialist using flow cytometry.
To classify these images into the four categories (benign, Early Pre-B, Pre-B, and Pro-B), we employed a stacked ensemble learning approach. The stack consists of three base models, DenseNet121, VGG16, and VGG19, with a K-Nearest Neighbors (KNN) classifier as the meta-model. The meta-model learns from the outputs of the base models, so the ensemble can draw on the strengths of each network instead of relying on any single one. The approach achieved an accuracy of 94% in distinguishing the cell types and subtypes in the dataset.
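The stacking step can be illustrated with a minimal sketch. It assumes that each base model emits a four-way class-probability vector and that the meta-model receives the concatenation of the three vectors; the study does not state which base-model outputs were passed to the KNN, so this is one plausible arrangement, not the authors' pipeline. The helper names (`stack_features`, `knn_predict`, `peaked`) are hypothetical, and the probability values are invented placeholders standing in for real DenseNet121, VGG16, and VGG19 outputs.

```python
import math
from collections import Counter

# Class names taken from the dataset description.
CLASSES = ["Benign", "Early Pre-B", "Pre-B", "Pro-B"]


def stack_features(per_model_probs):
    """Concatenate the base models' probability vectors into one meta-feature vector."""
    return [p for probs in per_model_probs for p in probs]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted((math.dist(tx, x), ty) for tx, ty in zip(train_X, train_y))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def peaked(class_idx, conf):
    """Placeholder for a base-model output: `conf` on one class, the rest spread evenly."""
    rest = (1.0 - conf) / (len(CLASSES) - 1)
    return [conf if i == class_idx else rest for i in range(len(CLASSES))]


# Invented meta-training set: three stacked vectors per class.
# In practice these would be held-out predictions of the three trained CNNs.
train_X, train_y = [], []
for class_idx, name in enumerate(CLASSES):
    for confs in [(0.9, 0.8, 0.7), (0.7, 0.9, 0.8), (0.8, 0.7, 0.9)]:
        train_X.append(stack_features([peaked(class_idx, c) for c in confs]))
        train_y.append(name)

# A query where two hypothetical base models favor Pre-B and one favors Early Pre-B.
query = stack_features([peaked(2, 0.6), peaked(2, 0.7), peaked(1, 0.5)])
prediction = knn_predict(train_X, train_y, query, k=3)
print(prediction)
```

With these placeholder values the three nearest neighbors all carry the Pre-B label, so the meta-model resolves the disagreement between the base models in favor of Pre-B. The choice of `k=3` and Euclidean distance is an assumption made for the sketch; the study does not report its KNN settings.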
This level of accuracy underscores the potential of deep learning techniques in medical image analysis, particularly for hematological malignancies. Our findings suggest that such methods could improve diagnostic precision and efficiency, which may in turn contribute to better patient outcomes. The study illustrates a promising application of deep learning models to the automated classification of ALL subtypes and provides a basis for further work in the field.