Mamba in Medical Imaging: A Comprehensive Survey of State Space Models
1 , 2 , * 3, 4
1  School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
2  Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore 21218, USA
3  School of Computer Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
4  College of Mathematics and Computer, Xinyu University, Xinyu 338004, China
Academic Editor: Lucia Billeci

Abstract:

In recent years, the Mamba architecture and its selective state space models (SSMs) have emerged as next-generation approaches in computer vision, offering linear computational complexity and the ability to capture long-range dependencies efficiently. These properties have attracted significant interest in medical imaging, where computational efficiency and accuracy are critical. This study provides a comprehensive survey of Mamba applications in medical imaging published between 2023 and 2025. Representative frameworks, such as VM-UNet, Mamba-UNet, and 2DMamba, are examined, focusing on their performance across key tasks including 2D and 3D segmentation, whole-slide pathology image classification, and surgical or endoscopic video analysis. Recent studies indicate that SSMs can achieve performance comparable to, and in many cases surpassing, that of Transformers in multimodal medical imaging tasks, while substantially reducing memory consumption and computational overhead. These strengths are particularly beneficial for high-resolution and long-sequence applications. Nonetheless, challenges persist in achieving stable optimization, ensuring interpretability, and extending modeling capacity to local features and 3D temporal data. Moreover, the absence of large-scale pre-training resources continues to limit the robustness and generalizability of SSM-based approaches. Overall, Mamba provides a new paradigm for medical image analysis that balances performance with efficiency. Future research should prioritize interpretability, hybrid architecture design, cross-modality generalization, and temporal extensions in 3D imaging to enable smoother translation of this emerging architecture from academic research to clinical practice.

Keywords: Mamba; Medical Imaging; State space models; Deep learning