Journal of Electronics, Electromedical Engineering, and Medical Informatics

Implementation of Real-Time Face Recognition for Secure Weapon Storage Access Control

Anisa Anisa — 2026-04-19

The security of weapon storage warehouses is a critical concern that requires an access control system with exceptionally high reliability, particularly in minimizing false acceptance, where unauthorized individuals are incorrectly granted access. In high-risk facilities, even a single false acceptance incident can lead to serious security consequences. Conventional systems based on physical keys or access cards present limitations, including risks of loss, duplication, and access forgery. Therefore, a biometric-based solution is necessary to enhance identification accuracy and strengthen overall security. This study aims to design and implement a reliable, high-security facial-recognition-based access control system for weapon storage facilities. The proposed system integrates a Multi-task Cascaded Convolutional Neural Network (MTCNN) for face detection, FaceNet for feature extraction, and a Support Vector Machine (SVM) for identity classification. The system is implemented as a standalone application on an edge computing device (mini PC) integrated with an electronic door lock. All detection and decision-making processes are performed locally without reliance on cloud services. System evaluation was conducted under various testing scenarios, including variations in lighting intensity, camera distance, facial attributes, and unregistered face testing. Experimental results show that the system achieved an accuracy of 96.25%. A precision of 100% indicates that no unauthorized access was granted. The recall reached 92.50%, reflecting a small proportion of rejected authorized users. The F1-score of 96.11% demonstrates balanced performance. The False Acceptance Rate was 0%, confirming complete prevention of illegal access. The False Rejection Rate was 7.50%, which remains acceptable in high-risk security environments. The system consistently rejected all unregistered faces and operated in real time with an average door unlocking response time of approximately 1.3 seconds. In conclusion, the proposed system provides reliable recognition performance with a strong emphasis on preventing false acceptance. These findings indicate its suitability for enhancing security in high-risk weapon storage facilities.

Optimized Recurrent Neural Network Based on Improved Bacterial Colony Optimization for Predicting Osteoporosis Diseases

Sivasakthi B — 2026-02-06

Osteoporosis is a silent disease before significant fragility fractures despite its high prevalence, and its screening rate is low. In predictive healthcare analytics, the Elman recurrent neural network (ERNN) has been widely used as a learning technique. Traditional learning algorithms have some limitations, such as slow convergence rates and local minima that prevent gradient descent from finding the global minimum of the error function. The main goal is to precisely estimate each individual's risk of developing osteoporosis. These forecasts are essential for prompt diagnosis and treatment, which have a significant influence on patient outcomes. Hence, the present research focuses on making a more efficient prediction method based on an optimized Elman recurrent neural network (ERNN) for predicting osteoporosis diseases. An optimized ERNN method, IBCO-ERNN, improved bacterial colony optimization (IBCO) by optimizing the ERNN weights and biases. The IBCO approach uses an iterative local search (ILS) algorithm to enhance convergence rate and avoid the local optima problem of conventional BCO. Subsequently, the IBCO is used to optimize the ERNN's weights and biases, thereby improving convergence speed and detection rate. The effectiveness of IBCO-ERNN is evaluated using four different types of osteoporosis datasets: Femoral neck, Lumbar spine, Femoral and Spine, and BMD datasets. The proposed IBCO-ERNN produced higher accuracy at 95.61%, 96.26%, 97.26%, and 97.54 % for the Femoral neck, Lumbar spine, Femoral, and Spine datasets, respectively. The experimental findings demonstrated that, compared with other predictors, the proposed IBCO-ERNN achieved respectable accuracy and rapid convergence.

A Multimodal Explainable-AI Approach for Deep-Learning-based Epileptic Seizure Detection

Ashwini Patil — 2026-02-04

Epilepsy carries a high risk of sudden death and increased premature mortality, highlighting the importance of automatic seizure detection to support faster diagnosis and treatment. The opacity of existing deep learning models limits their real-world application in diagnosing epileptic seizures, underscoring the need for more transparent and explainable systems. Limited research studies are available on Explainable Artificial Intelligence (XAI)-based epileptic seizure detection, and these studies provide only a visual explanation for the model’s behaviour. Additionally, these studies lack validation of the XAI outputs using quantitative measures. Thus, this research aims to develop an explainable epileptic seizure detection model to address the limitations of existing black-box deep learning approaches. It proposes a novel Hybrid Transformer-DenseNet121-XAI (HTD-MXAI) integrated model for detecting epileptic seizures from EEG data. The proposed model leverages advanced deep learning architectures, namely the Transformer and DenseNet121, for automatic feature extraction, while simultaneously extracting handcrafted features from the time, frequency, and spatial domains. The XAI techniques, such as Attention Weights, Saliency Maps, and SHapley Additive eXplanations (SHAP), are integrated with the proposed model to provide multimodal explainability for the model’s decision-making process. The results demonstrate that the proposed model outperforms state-of-the-art models for seizure detection. It achieves an overall (aggregated across subjects) accuracy of 99.14%, Sensitivity of 98.49%, and Specificity of 99.68% when applied to the CHB-MIT dataset. The Faithfulness score of 40.94% and completeness score of 1.00 indicate that the explanations provided by the XAI method for the model’s prediction are highly reliable. In conclusion, the proposed model offers a promising solution to the constraints, including the interpretability of black box models, limited multimodal explainability, and the validation of XAI techniques in the context of epileptic seizure detection.

Hybrid Separable Conv-ViT–CheXNet with Explainable Localization for Pneumonia Diagnosis

Khushboo Trivedi — 2026-02-21

This research presents a robust, interpretable, and computationally efficient deep learning framework for multiclass pneumonia classification from chest X-ray images, with a strong emphasis on diagnostic accuracy, model transparency, and real-time applicability in clinical settings. We propose SCViT-CheXNet, a novel hybrid architecture that integrates a Separable Convolution Vision Transformer (SCViT) with a simplified CheXNet backbone based on DenseNet121 to achieve efficient spatial feature extraction, hierarchical representation learning, and faster model convergence. The use of separable convolution significantly reduces computational complexity while preserving discriminative feature learning, and the transformer module effectively captures long-range dependencies in radiographic patterns. To address the critical issue of class imbalance inherent in medical imaging datasets, an Auxiliary Classifier Deep Convolutional Generative Adversarial Network (ADCGAN) is employed to generate synthetic samples for underrepresented pneumonia categories, thereby enhancing data diversity and improving model generalization. The proposed framework is extensively evaluated on two benchmark datasets: Dataset-1, consisting of Normal, Viral, Bacterial, and Fungal Pneumonia cases, and Dataset-2, comprising Normal, Viral Pneumonia, COVID-19, and Lung Opacity classes. Model interpretability is ensured through Gradient-weighted Class Activation Mapping (Grad-CAM), which enables visualization of disease-specific regions in chest X-ray images and validates the clinical relevance of the learned representations. Experimental results demonstrate that SCViT-CheXNet consistently outperforms existing convolutional neural network and transformer-based approaches, achieving 99% accuracy, precision, recall, and F1-score across both datasets. The synergistic integration of separable convolution, transformer-based feature modeling, and GAN-driven data augmentation results in a lightweight yet highly accurate and interpretable diagnostic system. Overall, the SCViT-CheXNet framework shows strong potential for deployment in automated pneumonia and COVID-19 screening systems, offering reliable support for real-time clinical decision-making and contributing to improved patient outcomes.

Impact of Optimizer Algorithm on NasNetMobile Model for Eight-class Retinal Disease Classification from OCT Images

Madhumithaa Selvarajan — 2026-03-02

Artificial intelligence (AI) is an emerging technology that plays a vital role in various fields, including the medical field. Ophthalmology is the earliest field to adopt AI for diagnosing several retinal diseases. Many imaging techniques are available, but Optical Coherence Tomography (OCT) is particularly useful for early-stage diagnosis. OCT is a non-invasive imaging method that offers high-resolution visualization of the retinal structure, aiding the ophthalmologist in differentiating between normal and abnormal retina. Automated OCT-based retinal disease classification using deep learning (DL) is important for early disease detection. Most DL models achieved high performance, but the influence of the optimizer on model behaviour, convergence, and explainability remains a challenge. To bridge the gap, this study evaluates the performance and convergence of five optimizers, such as RMSprop, AdamW, Adam, Nadam, and SGD, on the NasNetMobile model. The model was trained on the OCT-8 dataset, which comprises seven diseased retinal classes and one normal class of Optical Coherence Tomography (OCT) images. The seven diseases are Age-related Macular Degeneration (AMD), choroidal neovascularization (CNV), Central Serous retinopathy (CSR), diabetic macular edema (DME), diabetic retinopathy (DR), DRUSEN, and Macular Hole (MH). The study also analyzes convergence behaviour and explainability through early stopping regularization technique and GradCAM XAI, respectively. The model achieved 71%, 93%, 96%, 97%, and 97% of accuracy, respectively. Compared with other optimizers, the SGD optimizer achieved high accuracy in 22 epochs, which indicates better generalization. GradCAM XAI highlights the disease-relevant region across different retinal diseases. This framework emphasizes the significance of selecting an appropriate optimizer for robust retinal disease classification using a DL model trained on OCT images

MK–TripNet: A Deep Learning Framework for Real-Time Multi-Class Lung Sound Classification

Widya Surya Erini — 2026-03-30

Respiratory diseases such as asthma, pneumonia, and Chronic Obstructive Pulmonary Disease (COPD) remain major global health challenges, particularly in resource-limited settings where access to pulmonary specialists and early diagnostic tools is limited. Automatic lung sound classifications have emerged as a promising non-invasive screening approach; however, existing methods often rely on single-scale feature extraction, conventional loss functions, and offline analysis, which limit their discriminative capability and real-time applicability. The aim of this study is to develop and evaluate a deep learning framework for real-time multi-class lung sound classifications that improves discriminative representation and temporal sensitivity. To address limitations, this study proposes MK-TripNet, a novel deep learning architecture designed to integrate multi-scale feature extraction, discriminative embedding learning, and real-time inference within a unified framework. The main contribution of this work is the unified integration of a Multi-Kernel convolutional architecture, Triplet Loss-based embedding learning, and Sliding Window segmentation within a single end-to-end framework, enabling accurate segment-level lung sound classifications in real-time scenarios. Unlike prior approaches, the proposed method simultaneously captures fine-grained temporal patterns and broader spectral characteristics while explicitly maximizing inter-class separability in the embedding space. The proposed model was evaluated using a newly constructed dataset comprising 1,409 lung sound segments obtained from primary digital stethoscope recordings and publicly available respiratory sound databases. Experimental results demonstrate that MK-TripNet consistently outperforms several strong baseline models, including CNN-BiGRU, CNN-BiGRU-UMAP, and VGGish-Triplet, achieving an accuracy of 89.1%, an F1-score of 0.89, and a recall of 0.88. Ablation studies further confirm that the combined use of Multi-Kernel convolution, Triplet Loss, and Sliding Window segmentation yields the most robust and generalizable performances. These findings highlight the clinical potential of MK-TripNet for real-time digital auscultation and point-of-care respiratory screening, particularly in resource-limited and telemedicine settings.

Design and Statistical Evaluation of an AI-Enabled IoT-Based Non-Invasive Biosensing System for Diabetes Risk Screening

Prachi C. Kamble — 2026-03-30

Early identification of diabetes risk remains a significant challenge due to the invasive nature, recurring cost, and limited accessibility of conventional biochemical diagnostic tests. These limitations restrict continuous monitoring and hinder large-scale population screening, particularly in remote and resource-limited settings. The aim of this study is to design and statistically evaluate an AI-enabled IoT-based non-invasive biosensing system for diabetes risk screening, focusing on system-level engineering design, data integration, and performance validation rather than clinical diagnosis. In this study, the term “non-invasive” refers exclusively to externally measurable surface-level physiological and breath-based signals that do not require skin penetration, blood sampling, or subdermal sensor implantation. The main contributions of this work include the development of a wearable IoT-based non-invasive biosensing framework, integration of multi-modal physiological and breath-based biomarkers for risk assessment, implementation of an ensemble machine learning model for diabetes risk classification, and comprehensive statistical validation using agreement, reliability, and calibration metrics. The proposed DiaAssist system acquires physiological parameters such as heart rate, blood pressure, oxygen saturation, body temperature, physical activity indicators, and breath volatile organic compound acetone through a wearable IoT platform with edge-level preprocessing. Fused physiological and demographic features are processed using an ensemble learning framework to generate individualized diabetes risk scores. Performance evaluation was conducted on a single-center observational dataset comprising 625 records using paired statistical tests, agreement analysis, and calibration assessment. The optimized model achieved an accuracy of 99.7%, an area under the receiver operating characteristic curve of 1.000, a Cohen’s Kappa coefficient of 0.993, a Matthews correlation coefficient of 0.993, and a Brier score of 0.045, demonstrating strong classification reliability and probabilistic calibration. The results confirm that combining IoT-based non-invasive biosensing with ensemble machine learning enables accurate and reliable screening for diabetes risk. The proposed system provides a scalable, cost-effective, and engineering-oriented solution suitable for remote monitoring and preventive healthcare applications

Multipoint Wrist Pulse Acquisition and Analysis by Combining HRV with Morphological Timing Features for Quantitative Identification of Ayurvedic Doshas

Devendra Patel — 2026-03-31

Nadi Pariksha, the traditional Ayurvedic method of wrist pulse examination, posits that three adjacent radial artery locations corresponding to Vata, Pitta, and Kapha (V-P-K) reflect distinct physiological states. While recent sensor-based systems have attempted to digitize wrist pulse acquisition, many have emphasized hardware design or classification performance without rigorously validating physiological differences between pulse sites within the same individual. This study presents a quantitative evaluation of the multi-point principle of Nadi Pariksha using synchronized multi-site photoplethysmography (PPG) combined with integrated cardiovascular signal analysis. Pulse waveforms were simultaneously acquired from 39 participants, including 32 healthy individuals and 7 clinically characterized subjects, at the three classical radial artery locations. Morphological timing features and time-domain heart rate variability (HRV) metrics were extracted to characterize vascular dynamics and autonomic regulation. Within-subject statistical analysis demonstrated significant spatial differentiation across the pulse sites. Crest time decreased from 0.204 s at the Kapha site to 0.175 s at the Vata site (14.2% reduction), while systolic width decreased from 0.140 s to 0.109 s (22.1% reduction) (p ≤ 0.004). Non-parametric analysis confirmed significant differences in crest time (H = 9.15, p = 0.010), pulse width (H = 8.43, p = 0.015), systolic amplitude, systolic area, and HRV variability (SDNN: H = 6.33, p = 0.041), with moderate-to-large effect sizes (η² = 0.12–0.20). Clinically characterized cases exhibited deviations from this baseline pattern, including a 62% reduction in crest time gradient and a 72% increase in stiffness index in diabetes, and a 55% reduction in gradient with a 25% decrease in HRV during acute infection. Given the limited clinical sample (n = 7), these findings are interpreted as preliminary. Overall, the results provide quantitative within-subject evidence supporting the physiological distinctiveness of the V-P-K pulse locations and contribute toward the development of standardized, sensor-based Nadi Pariksha

A Hybrid Deep Ensemble Model for Precise Liver and Tumor Segmentation Using U-Net and W-Net Architectures

B. Sravani — 2026-04-02

The identification of the liver with the hepatic tumors on the computed tomography (CT) scans is a major compulsion to the earliest diagnosis, treatment planning, and surgery in the case of hepatocellular carcinoma. However, automated segmentation is not an easy job due to the non-homogeneous appearance of tumors, blurry boundaries, small size of annotated datasets, and high inter-slice variability. Existing single deep learning models are known to suffer from prediction variance and low generalization in complex clinical conditions. The primary goal of the study is to develop an effective, highly accurate segmentation model that improves the accuracy, consistency, and explanability of liver and tumor borders in CT images. In this paper, an original hybrid deep ensemble model is proposed that leverages the advantages of U-Net and W-Net. This is the primary contribution; one can combine the strong spatial localization ability of U-Net and the reconstruction-driven unsupervised learning ability of W-Net in minimizing the variance and maximizing the generalization. In addition, soft probability fusion, uncertainty modelling, and entropy-based confidence estimation are also introduced to improve reliability and clinical interpretation. The preprocessing of CT images is performed mathematically by normalizing and resizing to 256x256. U-Net and W-Net are trained separately using the pixel-wise probability maps, which are soft-averaged and thresholded. Benchmark liver CT datasets are tested with the ensemble using the Dice coefficient, accuracy, precision, recall, F1-score, Intersection over Union (IoU), ROC-AUC, and statistical significance tests. The results of the experiment show that the suggested ensemble performs better with an accuracy of 95.4, a precision of 94.3, a recall of 93.9, an F1-score of 94.1, IoU of 89.8, and an average ROC-AUC of 0.9615 than the models of the U-Net and W-Net, which differ in a huge number. Statistical confirmation that the improvements are relevant (p < 0.01) will be provided. In summary, the proposed deep ensemble segmentation can accurately, reliably, and effectively segment the liver and tumor, showing strong potential for clinical use and subsequent extension to multi-organ and multi-modal medical imaging.

Ensemble Voting Method to Enhance the Performance of a Dental Caries Detection System using Convolutional Neural Network

Putri Rizkiah — 2026-04-02

Individual classification models for caries detection still face significant challenges, including limited accuracy and unstable predictions, which can hinder diagnosis, delay clinical decisions, and increase the risks associated with patient care. To overcome these limitations, this study proposes an ensemble voting method that combines five deep learning models, such as ResNet-152, MobileNetV2, InceptionV3, NASNetMobile, and EfficientNet-B5. This approach aims to enhance the accuracy and stability of caries detection by leveraging the complementary strengths of the individual models while mitigating their weaknesses. Each model was trained and tested on the same dataset of dental images, categorized into caries and regular classes. Their predictions were aggregated using hard and soft voting techniques. The ensemble's performance was evaluated using accuracy, precision, recall, and F1-score. The ensemble voting demonstrates a notable improvement in classification performance over individual models. Hard and soft voting have excellent classification performance and consistently outperform the best individual models. The accuracy increased from EfficientNetB5 0.8485 to 0.8864 and 0.8712, representing increases of 4.46% and 2.68%, respectively. The precision increased from MobileNetV2 0.8182 to 0.8493 and 0.8551, representing increases of 3.81% and 4.52%. For recall, EfficientNetB5 ranked highest among individual models with a score of 0.9242. Hard voting increased 1.64% to 0.9394, and soft voting decreased slightly by 3.28% to 0.8939. The F1 score of EfficientNetB5 is 0.8592. Hard and soft voting increased 3.83% and 1.73% to 0.8921 and 0.8741. The proposed ensemble improves the F1-score by 3.83 percentage points compared to the best individual model. The ensemble voting method effectively leverages the complementary strengths of each deep learning model to improve the stability and accuracy of fast, reliable dental caries early detection prediction.

HST-Net: Hierarchical Spectrum-Tokenization with Progressive Refinement for Cardiac MRI Segmentation

Naga Chandrika Gogulamudi — 2026-04-11

The accurate segmentation of cardiac structures from Magnetic Resonance Imaging (MRI) plays a vital role in quantitative ventricular assessment, functional analysis, and the clinical diagnosis of cardiovascular diseases. Precise delineation of cardiac components, such as the left ventricle, right ventricle, and myocardial wall, is essential for evaluating cardiac morphology and function. In recent years, transformer-based architectures, including TransUNet and Swin-UNet, have demonstrated strong capabilities in modeling long-range dependencies and capturing global contextual information. However, despite these advantages, they often struggle to preserve smooth anatomical geometry and achieve high-precision boundary delineation, particularly in the presence of large shape deformations and significant inter-subject variability commonly observed in cardiac MRI data. To overcome these limitations, a Hierarchical Spectrum-Tokenization Network (HST-Net) is proposed. The core idea of HST-Net is to represent cardiac anatomy at multiple levels of granularity, enabling a more robust structural understanding across varying spatial scales. The proposed architecture incorporates a novel approach called Spectrum Tokenization. This approach divides the latent representations into two parts, one containing low-frequency global tokens that capture context information, and another containing high-frequency boundary-aware tokens that capture the contours. By progressively enhancing boundary details, PSR significantly improves contour accuracy, especially for complex and thin structures. Experimental evaluations conducted on a cardiac MRI dataset demonstrate the effectiveness of the proposed approach. HST-Net achieves an average Dice coefficient of 91.6% and a pixel-wise segmentation accuracy of 94.8%. Compared to nnU-Net and Swin-UNet, it shows consistent performance gains, yielding improvements of 2.1–3.4% in Dice score and 1.9–2.6% in segmentation accuracy across different cardiac structures.

Dynamic Uncertainty-Aware Adaptive Subspace Fusion Network for Robust Multimodal Medical Image Classification

Krishnakumar B — 2026-04-11

Multimodal medical image classification leverages complementary information from multiple imaging modalities to improve diagnostic accuracy and clinical decision-making. However, most existing multimodal fusion approaches rely on deterministic low-rank constraints and assume equal importance across all modalities. Such assumptions significantly limit flexibility, robustness, and interpretability, particularly in real-world clinical scenarios where modality data may be noisy, incomplete, or partially missing. To address these challenges, this work proposes a Dynamic Uncertainty-Aware Adaptive Subspace Fusion Network (DUA-SFNet) for robust multimodal medical image classification. The core of the proposed framework is a rank-learning adaptive-rank tensor decomposition module that dynamically adjusts subspace dimensionality according to the intrinsic complexity of the input data. This adaptive mechanism effectively reduces feature redundancy while preserving the highly discriminative information essential for accurate classification. In addition, DUA-SFNet incorporates a modality uncertainty estimation scheme to explicitly quantify the reliability and trustworthiness of each modality. By assigning uncertainty-aware weights during the fusion process, the framework can suppress unreliable or noisy modalities while emphasizing more informative ones, thereby improving resilience under adverse data conditions. Furthermore, a hierarchical adaptive attention strategy is employed to jointly model intra-subspace feature interactions and inter-modality dependencies. This design enhances feature representation capability while offering improved clinical interpretability by revealing how different modalities and subspaces contribute to the final decision. Extensive experiments conducted on multiple public and self-organized multimodal medical image datasets demonstrate that DUA-SFNet consistently outperforms state-of-the-art methods, achieving classification accuracy improvements of 3.8–6.2% and F1-score gains of 4.1–7.5%. Overall, DUA-SFNet provides an interpretable, uncertainty-aware, and adaptive solution for next-generation multimodal medical image analysis.

Topographic EEG Power Mapping and Machine Learning-Based Seizure Detection Using Real and Synthetic SSIM-MSE Features

Ghansyamkumar Rathod — 2026-04-11

The neural activities of the brain can show abnormalities and misfiring due to seizures. The ionic activity of the brain can be converted into electrical activity, which can be observed on the human scalp using electroencephalography (EEG). The spatial patterns of brain activity can be analyzed using topographic maps generated from EEG signals. In this study, topographic power maps with seizure and normal states of the brain were generated, and the features of the image were named structural similarity index (SSIM) and mean square error (MSE). The data utilized in this study were obtained from a publicly available dataset from the Children's Hospital Boston (CHB) in association with the Massachusetts Institute of Technology (MIT). Topographic images of the bipolar montages showed a clear difference between seizure and non-seizure brain states, along with the affected areas of the brain regions. Synthetic Features were generated to mimic real data for training the ML models. The major tested machine learning models, gradient boosting, decision tree, and k-nearest neighbors, provided the highest accuracy of 99.34% and an F-score of 0.996 when evaluated using real and generated data. The generalizability of the model was confirmed using 5-fold cross-validation. Overall, this study provides an EEG power-based topographic power image generation along with reliable feature extraction to train ML models for detecting epileptic seizures. The proposed methodology not only enhances the interpretability of EEG spatial patterns but also offers potential for integration into biomedical wearable devices for real-time seizure monitoring and intervention, along with the identification of the type of seizure.

Non-Contact Heart Rate Detection Using FMCW Radar Based on 1-D Convolutional Neural Networks

Diyah Widiyasari — 2026-04-11

Non-contact heart rate (HR) estimation using frequency-modulated continuous-wave (FMCW) radar has emerged as a promising solution for unobtrusive, continuous vital-sign monitoring. However, accurately extracting HR from radar signals remains challenging because of low-amplitude cardiac-induced chest vibrations, environmental clutter, motion artifacts, and system noise. Traditional signal processing techniques, such as bandpass filtering combined with fast Fourier transform (FFT) analysis, are commonly employed to estimate HR in the frequency domain. Nevertheless, these approaches are highly sensitive to noise and often struggle to robustly capture weak cardiac components, leading to unstable or inaccurate estimates. To address these limitations, this study proposes a non-contact HR estimation framework based on FMCW radar combined with a one-dimensional convolutional neural network (1D-CNN). A systematic radar signal preprocessing pipeline is developed, including range-bin selection, phase extraction, noise suppression, filtering, and structured data labeling, to construct learning-ready input features. The 1D-CNN model is designed to automatically learn discriminative temporal patterns associated with cardiac activity directly from preprocessed radar signals. The proposed method is evaluated using two datasets: a publicly available dataset and an independently acquired dataset collected under controlled conditions. Performance is benchmarked against conventional bandpass filtering- and FFT-based HR estimation methods. The experimental results demonstrate that the proposed 1D-CNN framework achieves more accurate and stable HR predictions. On the public dataset, MAE decreases from 17.96 to 6.09 BPM, RMSE from 21.28 to 7.34 BPM, and MedAE from 17.66 to 5.43 BPM. The independent dataset yields consistent gains, with MAE decreases from 14.05 to 5.45 BPM, RMSE from 18.05 to 6.84 BPM, and MedAE from 10.74 to 4.57 BPM. These results indicate that the proposed 1D-CNN framework can effectively estimate HR from radar signals and demonstrate its capability to operate across datasets acquired with different radar frequencies

FedBrain-3DMRI: Federated Self-Supervised Learning for 3D Brain Tumor Segmentation using SCAFFOLD Algorithm

Neeshu Chaudhary — 2026-04-19

Brain tumor segmentation is the most important way to separate tumor areas from healthy brain tissue in medical imaging. This is necessary for making an accurate diagnosis and planning treatment. But building strong deep learning models is often hard because there isn't much labeled medical data available, and strict privacy rules stop data from being shared in one place. Federated Learning (FL) helps keep patient data private by keeping it local, but its performance often drops when data from different hospitals have big differences in quality, imaging protocols, and distribution. Our research seeks to create a privacy-preserving federated learning framework that adeptly manages significant data heterogeneity while ensuring high segmentation accuracy across various institutions. We propose a new two-stage FL framework that allows multiple institutions to work together while keeping their privacy and effectively dealing with complicated non-IID data distributions. To start, we use a Federated Masked Autoencoder (MAE) for self-supervised pre-training. This lets the model learn strong anatomical features from unlabeled MRI scans. Second, the model is carefully fine-tuned using an Attention ResUNet3D architecture to get very accurate tumor segmentation. We use the SCAFFOLD optimization algorithm to keep training stable across all clients, even when the scanner varies from site to site, thereby directly addressing client drift. We also use strategic foreground-biased sampling and Test-Time Augmentation (TTA) techniques to greatly improve segmentation accuracy in difficult, uneven tumor sub-regions. We ran extensive experiments on the BraTS 2024 dataset in simulated federated settings with 10, 50, and 100 different clients. The Dice coefficients we got were 0.826, 0.824, and 0.818, which demonstrate strong performance. In the end, these strong results show that the suggested method works well on a larger scale and can be used in a clinical setting.

Brain Tumor Detection from MRI Images Using an Ensemble-Based Machine Learning Framework

Arpit Bhatt — 2026-04-22

The early detection of brain tumors from MRI images is critical for effective treatment planning. Still, manual analysis of these images is time-consuming and prone to inter-observer variability. This paper suggests a machine learning framework for automated brain tumor detection that uses an ensemble of classifiers to make it more accurate and reliable. The suggested framework combines Support Vector Machine (SVM), Random Forest (RF), and k-Nearest Neighbor (k-NN) classifiers. It uses a majority voting method at the decision level to make final predictions. The model uses both handcrafted texture features from the Gray-Level Co-occurrence Matrix (GLCM) and deep features from a pre-trained ResNet50 model to make it more effective at distinguishing between things. The framework was tested using three publicly available MRI datasets: Figshare, SARTAJ, and BR35H. These datasets had a total of 9,826 images. The ensemble model got 95.2% correct, with 94.6%, 94.1%, and 94.3% for precision, recall, and F1-score, respectively. This was better than any of the individual classifiers. The area under the curve (AUC) was also 0.97, which means it was very good at telling the difference between things. The experimental results demonstrate that the ensemble approach not only delivers a robust solution but also ensures computational efficiency, rendering it appropriate for clinical applications. This framework shows that it could be used in computer-aided diagnosis systems to detect brain tumors in real time and perform better across different datasets. The suggested ensemble-based framework is a scalable, efficient, and reliable way to use MRI to find brain tumors. It gets around the problems that single classifiers have in medical imaging.

Robust Brain Tumor MRI Classification Through MobileNetV3 Deep Feature Fusion and Principal Component Analysis Enhanced AdaBoost Learning

Ahmed Aizaldeen Abdullah — 2026-04-23

Among the most serious neurological diseases are brain tumors, which pose a challenge to early detection through MRI due to low contrast, tissue heterogeneity, and high-dimensional deep features that make it difficult for traditional classification models to be effective. This study proposes a robust and computationally efficient multi-class classification framework capable of distinguishing four tumor types: glioma, meningioma, pituitary tumor, and no tumor. The primary contributions are: (1) the development of a hybrid feature-learning pipeline that introduces a hybrid feature-learning framework in which a one-level 2D Discrete Wavelet Transform (2D-DWT) is employed as a multi-resolution preprocessing step to enhance MRI slices prior to deep feature extraction using MobileNetV3; (2) the application of Principal Component Analysis (PCA) to compress a 1,024-dimensional deep-feature vector into only 20 principal components, achieving a 99.96% reduction in dimensionality; (3) the use of an optimized AdaBoost ensemble specifically adapted for low-dimensional inputs; and (4) achieving performance that surpasses several published approaches evaluated on the same benchmark dataset. The proposed workflow includes cropping, normalization, and CLAHE enhancement, followed by 2D-DWT to extract LL, LH, HL, and HH sub-band information. The wavelet-refined MRI slices are processed by MobileNetV3 to implicitly encode spectral–textural information into deep semantic representations, which are subsequently reduced using PCA and classified by AdaBoost. Experiments conducted on a public Kaggle brain MRI dataset comprising 7023 images show that MobileNetV3 combined with 2D-DWT achieves an accuracy of 99.56%. When enhanced with PCA and AdaBoost, the full framework attains 99.94% accuracy, 99.95% precision, 99.96% recall, 99.94% F1-score, and 100% AUC, demonstrating remarkable tumor discrimination performance. In summary, the proposed PCA–AdaBoost hybrid framework offers a highly accurate, lightweight, and clinically promising solution for automated brain tumor MRI classification.

Comparative Evaluation of LSTM and Metaheuristic-Optimized Neural Networks for ECG Prediction under Limited Data Conditions

Giovanni Dimas Prenata — 2026-04-23

This study presents a comparative evaluation of Deep Feedforward Neural Network (DFFNN) models optimized using single-stage metaheuristic approaches, including Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Grey Wolf Optimization (GWO), as well as a multi-stage hybrid optimization strategy (GA+GWO) for ECG-based emotion classification. The experimental dataset consists of ECG recordings collected from three elderly participants using a Sparkfun AD8232 sensor under controlled emotional stimuli, representing a limited-subject and small-data scenario. Feature extraction is conducted using Heart Rate Variability (HRV) parameters derived from both time domain (Mean RR, SDNN, RMSSD, Mean HR, and STD HR) and frequency domain (LF, HF, and LF/HF ratio). Experimental results from six repeated trials demonstrate that the multi-stage DFFNN+GA+GWO model achieves the best optimization performance, yielding the lowest Mean Squared Error (MSE) of 0.01599 and a consistent training accuracy of up to 85.71%. Compared with single-stage optimization methods, the hybrid approach exhibits improved convergence behavior and reduced performance variance, indicating enhanced optimization stability. However, test accuracy remains relatively limited (33.33%–50.00%), reflecting constrained generalization capability due to the small dataset and the absence of subject-wise or external validation. Further statistical analysis using confidence intervals and nonparametric testing confirms that the observed performance improvements are primarily associated with optimization stability rather than statistically significant gains in predictive generalization. Therefore, this study emphasizes the role of metaheuristic optimization in stabilizing neural network training under limited data conditions. The findings should be interpreted as a pilot feasibility study, and future work is required to validate the proposed approach using larger, more diverse datasets and more rigorous validation strategies.

LRSE-LCC: A Lightweight Residual CNN with Squeeze-and-Excitation Attention for Lung Cancer Classification from CT Image

Dhaval J. Rana — 2026-04-19

Lung cancer is still a major cause of cancer deaths globally, and there is a need for accurate and early diagnostic systems. Although deep learning models have shown encouraging results in classifying lung cancer from CT scans, most are computationally complex. This paper proposes the design of a lightweight and accurate deep learning model for multi-class lung cancer classification from CT scans. A new model called Lightweight Residual CNN with Squeeze-and-Excitation Lung Cancer Classification (LRSE-LCC) is proposed. The model combines lightweight residual learning for stable gradient flow and channel attention for improved feature representation. Dual global pooling is used by combining Global Average Pooling and Global Max Pooling to enable complementary feature extraction. In addition, a balanced batch training method is used to handle class imbalance. The proposed model was tested on the IQ-OTH/NCCD lung CT image dataset, which includes normal, benign, and malignant images. Image resizing and normalization were done before training. The proposed LRSE-LCC model achieved a test accuracy of 98.19%. Sensitivity was 100.00%, indicating strong ability to detect malignant images. The model achieved a specificity of 99.04%, reducing false-positive predictions. The macro-averaged AUC was 99.90%. The AUC values for all classes exceeded 99.80%, indicating outstanding classification performance. The macro F1-score was 96.42%. The value of the Cohen’s kappa coefficient was 96.88%, which ensured that the agreement was not by chance. The overall error rate was limited to 1.81%. In conclusion, the proposed LRSE-LCC model has both high classification accuracy and efficiency. The combination of residual learning, channel attention, and dual pooling helps to greatly improve the accuracy of multi-class diagnosis. The proposed lightweight model has great potential for application in real-world computer-aided lung cancer diagnosis systems.