IJCCR_2025v15n5

International Journal of Clinical Case Reports, 2025, Vol.15, No.5, 209-218 http://medscipublisher.com/index.php/ijccr

Nevertheless, machine learning is sensitive to data quality and requires careful feature selection and parameter tuning to avoid overfitting (Yadav et al., 2023). Model interpretability remains a challenge, but interpretive artificial intelligence methods (such as SHAP and LIME) are helping clinicians understand prediction results (Ghosh and Khandoker, 2024). In addition, hybrid and stacked approaches that combine different algorithms have demonstrated stronger performance and generality in chronic disease prediction (Khalid et al., 2023).

4.3 Deep learning models
Deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers, perform outstandingly in chronic disease prediction and are particularly well suited to large-scale, multimodal, and time-series data (Feng et al., 2021; Rashid et al., 2022; Tsai et al., 2025). CNNs excel at extracting spatial features from images and structured data, while RNNs and their variants (such as LSTMs) are better suited to modeling the temporal dependencies in longitudinal health data (Kim et al., 2023; Rehman et al., 2023). Transformer-based models and graph neural networks (GNNs) can further capture the complex relationships among multiple diseases and patient characteristics (Lu and Uddin, 2021). These deep learning methods generally outperform traditional statistical models and conventional machine learning models, especially in multi-task settings or when integrating multiple types of data, and can predict several chronic diseases simultaneously (Kim et al., 2023). However, they require greater computing power, sufficient labeled data, and careful model tuning to ensure good generalization and understandable results.
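To make the temporal modeling idea concrete, the following minimal NumPy sketch shows the forward pass of a single LSTM cell over one patient's sequence of visits. All dimensions, the random weights, and the final risk read-out are illustrative assumptions, not taken from any of the cited studies.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x_seq, W, U, b, hidden_size):
    """Run a single-layer LSTM over a sequence of feature vectors.
    x_seq: (T, input_size); W: (4H, input_size); U: (4H, H); b: (4H,)."""
    H = hidden_size
    h = np.zeros(H)                    # hidden state (short-term context)
    c = np.zeros(H)                    # cell state (long-term memory)
    for x in x_seq:
        z = W @ x + U @ h + b          # pre-activations for all four gates
        i = sigmoid(z[0:H])            # input gate
        f = sigmoid(z[H:2*H])          # forget gate
        o = sigmoid(z[2*H:3*H])        # output gate
        g = np.tanh(z[3*H:4*H])        # candidate cell update
        c = f * c + i * g              # blend old memory with new evidence
        h = o * np.tanh(c)             # hidden state carries temporal context
    return h

rng = np.random.default_rng(0)
T, input_size, hidden = 12, 5, 8       # e.g. 12 visits, 5 measurements each
x_seq = rng.normal(size=(T, input_size))
W = rng.normal(scale=0.1, size=(4 * hidden, input_size))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h_final = lstm_forward(x_seq, W, U, b, hidden)
risk = sigmoid(h_final @ rng.normal(size=hidden))  # toy disease-risk score
```

In a real model the weights would be learned from labeled longitudinal records, and the final hidden state would feed a trained classification layer rather than a random read-out.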
Recent studies have also found that combining attention mechanisms with regularization techniques can enhance model stability and help identify key risk factors (Rico et al., 2024; Rajeashwari and Arunesh, 2024).

5 Model Comparison and Performance Evaluation
5.1 Evaluation metrics: accuracy, recall, F1-score, AUC
The effectiveness of a chronic disease prediction model is usually assessed with metrics such as accuracy, recall (sensitivity), F1-score, and AUC. Accuracy is the proportion of correct predictions made by the model. Recall reflects a model's ability to identify true cases of illness and is particularly important in medical applications because it helps reduce missed diagnoses (Uddin et al., 2022; Khalid et al., 2023). The F1-score, the harmonic mean of precision and recall, is particularly useful when the classes in the data are imbalanced. AUC is an evaluation criterion that does not depend on a classification threshold; the higher the score, the better the model distinguishes diseased samples from healthy ones (Yang et al., 2023; Rico et al., 2024). Recent studies have emphasized the need for a holistic evaluation across multiple metrics. For instance, deep learning and hybrid models often achieve very high accuracy (up to 99.6%) while also performing well on recall and AUC, indicating that they are more stable under multi-angle evaluation (Akter et al., 2021; Chittora et al., 2022; Zhang et al., 2023). The choice of metrics should match clinical needs; in early screening, for example, more weight should be given to recall, and false negatives should be avoided as much as possible.

5.2 Comparative analysis: interpretability, generalization, and computational efficiency
Interpretability has always been important in model selection.
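The four metrics discussed in Section 5.1 can be computed directly with scikit-learn; the ground-truth labels and risk scores below are invented purely for illustration.

```python
from sklearn.metrics import (accuracy_score, recall_score,
                             f1_score, roc_auc_score)

# Toy ground truth and model outputs (illustrative values only).
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                    # 1 = diseased, 0 = healthy
y_score = [0.9, 0.2, 0.7, 0.4, 0.3, 0.1, 0.8, 0.6]    # predicted risk
y_pred  = [1 if s >= 0.5 else 0 for s in y_score]     # thresholded labels

acc = accuracy_score(y_true, y_pred)   # proportion of correct predictions
rec = recall_score(y_true, y_pred)     # sensitivity: fraction of true cases found
f1  = f1_score(y_true, y_pred)         # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_score)   # threshold-free ranking quality
```

Note that AUC is computed from the continuous scores rather than the thresholded labels, which is why it can remain high even when a particular threshold yields misclassifications (here the case with score 0.4 is a false negative at the 0.5 cutoff).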
Statistical models such as logistic regression are relatively easy to explain and allow medical staff to see clearly the role each predictor plays, but they do not perform well on complex nonlinear data (Uddin et al., 2022). In contrast, machine learning models (such as random forest and XGBoost) and deep learning models (such as neural networks and graph neural networks) offer more accurate predictions and stronger adaptability, and are particularly suitable for large-scale, diverse data; however, they are often as difficult to understand as "black boxes" (Rashid et al., 2022). In recent years, interpretability tools such as SHAP and LIME have gradually been applied to illustrate feature importance and the basis for model decisions, thereby mitigating this problem (Rico et al., 2024; Ghosh and Khandoker, 2024).
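SHAP and LIME require their own libraries; as a lightweight, model-agnostic stand-in in the same spirit, permutation importance (built into scikit-learn) ranks features by how much randomly shuffling each one degrades performance. The sketch below uses a synthetic dataset in which only the first two features carry signal; it is an illustration of the general technique, not of any cited study's pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic "patient" data: 5 features, only the first 2 are informative.
X, y = make_classification(n_samples=400, n_features=5, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)

# A typical "black box": hard to read directly, easy to probe externally.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance: how much does shuffling one feature hurt accuracy?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first
```

Like SHAP and LIME, this treats the model purely as a function of its inputs, so the same probe works unchanged for logistic regression, gradient boosting, or a neural network.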
