Automated ICD Medical Code Generation for Radiology Reports using BioClinicalBERT with Multi-Head Attention Network

Sasikala D.; Sarrvesh N.; Sabarinath J.; Theetchenya S.; Kalavathi S.

doi:10.35882/jeeemi.v7i3.775

Sasikala D. Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Chennai, India https://orcid.org/0000-0002-0511-8169
Sarrvesh N. Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Chennai, India https://orcid.org/0009-0008-3186-7594
Sabarinath J. Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Chennai, India https://orcid.org/0009-0004-8047-7630
Theetchenya S. Department of Computer Science and Engineering, Sona College of Technology, Salem, India https://orcid.org/0000-0001-5205-3049
Kalavathi S. Department of Computer Science and Engineering, Sri Venkateswara College of Engineering, Chennai, India https://orcid.org/0009-0008-1996-6725

Keywords: Automated ICD coding; Radiology reports; MIMIC-IV; Hierarchical Multi-Head Attention Network; BioClinicalBERT; Health Informatics

Abstract

International Classification of Diseases (ICD) coding plays a pivotal role in healthcare systems by providing a standard method for classifying medical diagnoses, treatments, and procedures. However, manually assigning ICD codes to clinical records is both time-consuming and error-prone, particularly given the sheer volume of medical terminology and the periodic revisions to the coding system. This work introduces a Hierarchical Multi-Head Attention Network (HMHAN) that automates ICD coding by combining domain-specific embeddings with an attention mechanism. The proposed method uses BioClinicalBERT to extract features from clinical text, followed by a two-level attention mechanism that learns hierarchical dependencies between labels. BioClinicalBERT is pre-trained on large biomedical and clinical corpora, which enables it to capture complex contextual relationships specific to medical language. The multi-head attention mechanism lets the model attend to different parts of the input text simultaneously, learning intricate associations between medical terms and the corresponding ICD codes at several levels. To address class imbalance, the method applies multi-label resampling based on SMOTE (Synthetic Minority Oversampling Technique). SMOTE generates synthetic examples for underrepresented classes, allowing the model to learn from imbalanced data without overfitting. This work uses de-identified radiology reports and their corresponding ICD codes from the MIMIC-IV dataset. Model performance is assessed with the F1 score, Hamming loss, and ROC-AUC. The model achieves an F1 score of 0.91, a Hamming loss of 0.07, and a ROC-AUC of 0.92, which indicates that automating the ICD coding process is a promising research direction. By automating ICD code generation, the system can improve the effectiveness of healthcare workflows in support of advanced clinical care.
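
To make the three reported metrics concrete, the following is a minimal sketch of how Hamming loss, F1, and ROC-AUC can be computed for multi-label predictions. It is an illustration only, not the authors' evaluation code: the toy labels and scores are invented, the function names are hypothetical, and micro-averaging for F1 is an assumption, since the abstract does not state which averaging scheme was used.

```python
from typing import List

# One row per report, one 0/1 column per ICD code (toy layout, assumed).
Labels = List[List[int]]


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of (report, code) cells where prediction and truth disagree."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        int(t != p)
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 computed from counts pooled over all codes (micro-averaging, assumed)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """ROC-AUC for a single code: probability that a positive report is
    scored above a negative one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 reports, 4 candidate codes (values are made up).
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 1], [1, 1, 0, 1]]

print(round(hamming_loss(y_true, y_pred), 4))  # 2 wrong cells out of 12
print(round(micro_f1(y_true, y_pred), 4))      # 5 TP, 1 FP, 1 FN
print(roc_auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.1]))
```

Hamming loss counts every wrong cell in the label matrix, so lower is better, whereas F1 and ROC-AUC are higher-is-better. In practice a library such as scikit-learn provides equivalent functions; the pure-Python versions here only show what each number measures.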
