A Cross-Scale Spatial–Channel Attention Inception Network for Efficient Medical Image Segmentation

Krishnakumar B; Nisha P; Sri Laxmi Kuna; Venu K; Evance Leethial R; Rama Krishna Kunchanapalli

doi:10.35882/jeeemi.v8i3.1550

Krishnakumar B, School of Computing, SASTRA Deemed University, Tamil Nadu, India. https://orcid.org/0000-0003-2520-5208
Nisha P, Department of Computer Science and Engineering, Dr. N.G.P. Institute of Technology, Coimbatore, Tamil Nadu, India. https://orcid.org/0009-0005-6352-2898
Sri Laxmi Kuna, Department of Computer Science and Engineering (Data Science), CVR College of Engineering, Hyderabad, India. https://orcid.org/0000-0001-6318-0318
Venu K, Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, India. https://orcid.org/0000-0002-6245-7976
Evance Leethial R, Department of Computer Science and Engineering, Nehru Institute of Technology, Coimbatore. https://orcid.org/0009-0002-6636-060X
Rama Krishna Kunchanapalli, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur District - 522302, Andhra Pradesh. https://orcid.org/0000-0001-9393-7713

Keywords: Medical Image Segmentation, Spatial-Channel Attention, Cross-Scale Feature Learning, Lightweight Deep Learning, Encoder-Decoder Networks

Abstract

Medical image segmentation plays a crucial role in modern computer-aided diagnosis, because accurate delineation of anatomical structures directly affects clinical decision-making and treatment planning. However, fine-grained segmentation of anatomically complex regions remains challenging, especially when computational efficiency is a key requirement. To address these challenges, the authors propose CSA-IncepLiteNet, a lightweight medical image segmentation framework designed to achieve high segmentation accuracy without a significant computational burden. The architecture integrates two key innovations: cross-scale feature extraction and unified spatial-channel attention learning. Central to the framework is the newly introduced Cross-Scale InceptionLite module, which efficiently captures multi-scale contextual information. The module is built from depth-wise separable and point-wise convolutions, which enable effective feature extraction while significantly reducing the number of trainable parameters. By learning features across multiple spatial scales, the network can better represent the anatomically complex structures present in medical images. In addition, the authors propose a Cross-Scale Spatial-Channel Attention (CSA) module that jointly models spatial saliency and channel-wise interdependencies within a unified attention-learning paradigm. This dual attention mechanism allows the network to focus on the most informative regions and feature channels simultaneously, which improves segmentation precision. CSA-IncepLiteNet was evaluated on the BUSI breast ultrasound dataset and on multiple CT datasets. Experimental results show that the proposed framework consistently outperforms existing state-of-the-art methods across all evaluated datasets. Notably, CSA-IncepLiteNet achieves an accuracy of 92.1% and a Dice coefficient of 82.94% on the BUSI dataset while using over 26 million fewer parameters than a conventional U-Net. These results highlight the model’s effectiveness, robustness, and suitability for resource-constrained medical imaging applications.
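
The parameter saving attributed to depth-wise separable convolutions can be illustrated with a short calculation. The sketch below is not the authors' implementation. It only counts the weights of one standard k x k convolution and of its depth-wise separable counterpart (a per-channel k x k convolution followed by a 1 x 1 point-wise convolution). The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical and are not taken from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weight count of a depth-wise k x k convolution plus a 1 x 1 point-wise convolution."""
    # Depth-wise step: one k x k filter per input channel.
    depthwise = k * k * c_in + (c_in if bias else 0)
    # Point-wise step: a 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen for illustration only.
    c_in, c_out, k = 64, 128, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard:  {standard}")    # 73728
    print(f"separable: {separable}")   # 8768
    print(f"reduction: {standard / separable:.2f}x")  # 8.41x
```

For this single assumed layer the separable form needs roughly one eighth of the weights. The total saving reported for CSA-IncepLiteNet depends on the full architecture, which the abstract does not detail.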
