A Cross-Scale Spatial–Channel Attention Inception Network for Efficient Medical Image Segmentation
Abstract
Medical image segmentation plays a crucial role in computer-aided diagnosis, as accurate delineation of anatomical structures directly affects clinical decision-making and treatment planning. However, fine-grained segmentation of anatomically complex regions remains challenging, especially when computational efficiency is a key requirement. To address these challenges, the authors propose CSA-IncepLiteNet, a lightweight medical image segmentation framework designed to achieve high segmentation accuracy without a significant computational burden. The architecture integrates two key innovations: cross-scale feature extraction and unified spatial-channel attention learning. Central to the framework is the newly introduced Cross-Scale InceptionLite module, which efficiently captures multi-scale contextual information. The module is built from depth-wise separable and point-wise convolutions, enabling effective feature extraction while significantly reducing the number of trainable parameters. By learning features across multiple spatial scales, the network can better represent the anatomically complex structures present in medical images. In addition, the authors propose a Cross-Scale Spatial-Channel Attention (CSA) module that jointly models spatial saliency and channel-wise interdependencies within a unified attention-learning paradigm. This dual attention mechanism lets the network focus on the most informative regions and feature channels simultaneously, improving segmentation precision. CSA-IncepLiteNet was evaluated on the BUSI breast ultrasound dataset and on multiple CT datasets. Experimental results show that the proposed framework consistently outperforms existing state-of-the-art methods across all evaluated datasets.
Notably, CSA-IncepLiteNet achieves an accuracy of 92.1% and a Dice coefficient of 82.94% on the BUSI dataset while using over 26 million fewer parameters than a conventional U-Net. These results highlight the model's effectiveness, robustness, and suitability for resource-constrained medical imaging applications.
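The parameter savings attributed to depth-wise separable convolutions can be illustrated with a small calculation. The sketch below is not the authors' implementation: the function names and the layer sizes (64 input channels, 128 output channels, a 3x3 kernel) are assumptions chosen only for illustration, and bias terms are ignored by default.

```python
def standard_conv_params(c_in, c_out, k, bias=False):
    """Weights in a standard k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3

standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2 when bias is ignored

print(f"standard:  {standard} parameters")
print(f"separable: {separable} parameters")
print(f"ratio:     {ratio:.4f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer; the actual saving in any given network depends on its channel widths and kernel sizes.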
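For readers unfamiliar with the two reported metrics, the following minimal sketch shows how pixel accuracy and the Dice coefficient are commonly computed for binary masks. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function names and the tiny 3x3 masks are hypothetical, and returning 1.0 when both masks are empty is one convention among several.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P and T| / (|P| + |T|) for binary masks given as nested lists of 0/1."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    intersection = sum(p * t for p, t in zip(flat_p, flat_t))
    total = sum(flat_p) + sum(flat_t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return 2.0 * intersection / total


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    correct = sum(1 for p, t in zip(flat_p, flat_t) if p == t)
    return correct / len(flat_t)


# Hypothetical 3 x 3 masks: the prediction misses one foreground pixel.
pred_mask = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
true_mask = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]

dice = dice_coefficient(pred_mask, true_mask)
accuracy = pixel_accuracy(pred_mask, true_mask)
print(f"Dice: {dice:.4f}, accuracy: {accuracy:.4f}")
```

Because background pixels dominate most lesion masks, accuracy tends to be higher than Dice for the same prediction, which is consistent with the gap between the two figures reported above.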
Copyright (c) 2026 Krishnakumar B, Nisha P, Sri Laxmi Kuna, Venu K, Evance Leethial R, Rama Krishna Kunchanapalli

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

