The diagnosis of cardiac disease from medical imaging has long been a major application in medicine, and machine learning methods have been applied to it extensively. However, these methods often require large, high-quality datasets that are difficult to obtain due to ethical, cost, and variability constraints. To address this challenge, this study explores the integration of classical matrix factorization techniques with deep learning for enhanced cardiac disease classification.
We apply principal component analysis, computed via the singular value decomposition (SVD), to identify dominant modes of variation that capture key features across five cardiac conditions: healthy, diabetic cardiomyopathy, myocardial infarction, obesity, and TAC-induced hypertension in mice. Echocardiography videos from long-axis (LAX) and short-axis (SAX) views were converted into image datasets. The images were reshaped into vectors, arranged into one matrix per cardiac condition, mean-subtracted, and decomposed with the SVD to generate a principal component basis for each condition. The original images are then projected onto these bases, and the resulting SVD-derived data are used to train a convolutional neural network (CNN) classifier.
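The preprocessing steps described above can be illustrated with a minimal NumPy sketch. It is not the study's implementation: the function names `pca_basis` and `project`, the number of retained components `k`, the 32x32 frame size, and the random stand-in data are all assumptions made for illustration, as is the choice to map the projection coefficients back to image space before they would be passed to a CNN.

```python
import numpy as np

def pca_basis(images, k):
    """Build a k-component principal basis for one cardiac condition.

    images: array of shape (n_images, height, width).
    Returns the mean image vector and a (k, height*width) basis.
    """
    # Reshape each image into a row vector and stack into one matrix.
    X = images.reshape(images.shape[0], -1).astype(np.float64)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the mean-subtracted data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(images, mean, basis):
    """Represent images in the basis, then map back to image space."""
    X = images.reshape(images.shape[0], -1).astype(np.float64)
    coeffs = (X - mean) @ basis.T   # coordinates in the principal basis
    recon = coeffs @ basis + mean   # rank-k approximation of each image
    return recon.reshape(images.shape)

# Synthetic stand-in for the frames of a single condition (assumption).
rng = np.random.default_rng(0)
frames = rng.random((20, 32, 32))

mean, basis = pca_basis(frames, k=5)
approx = project(frames, mean, basis)
```

In this sketch, `approx` has the same shape as `frames`, so it could replace the original images as classifier input; one such basis would be computed per condition and per view.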
Compared with training the CNN on the original echocardiography images, the SVD-based preprocessing substantially improved performance. Classification accuracy increased across the training, testing, and unseen (prediction) datasets, with an improvement of approximately 50% when using the SVD-derived representations.
These results indicate that combining matrix factorization with deep learning can effectively overcome data scarcity issues and improve generalization in medical image classification. The proposed hybrid methodology provides a promising tool for cardiac disease diagnosis, suggesting broader applicability to other medical imaging problems where data limitations hinder machine learning performance.
