IJCCR_2025v15n6

International Journal of Clinical Case Reports, 2025, Vol.15, No.6, 293-302
http://medscipublisher.com/index.php/ijccr

2.3 Comparison between artificial intelligence models and traditional risk scoring tools
Traditional diabetes risk scoring tools, such as the Finnish Diabetes Risk Score (FINDRISC) and models based on logistic regression, have long been used for initial screening at the population level. They estimate risk from demographic characteristics, clinical indicators, and lifestyle factors. Their reported discrimination (C-index or AUC) generally falls between 0.74 and 0.94, and they are relatively simple to use in practice. However, such tools rely on a limited set of variables and mostly assume linear relationships. When populations differ substantially, their sensitivity and specificity may be limited (Rodacki et al., 2025).

AI models can handle high-dimensional data and capture complex nonlinear relationships among multiple risk factors, so their predictive performance is often better (Huang et al., 2023). In many studies, deep learning and ensemble models achieve higher accuracy and recall. Some results show that prediction accuracy improves by approximately 1% to 3%, with an AUC close to or even exceeding 0.94 (Khokhar et al., 2025; Okwudili et al., 2025; Zhang et al., 2025). AI models can also incorporate new types of data, such as genetic and imaging data, to keep refining risk assessment methods (Nie et al., 2025). However, they still face challenges in three areas: interpretability, applicability to different populations, and smooth integration into routine clinical work. Current research is addressing these problems and building more complete systems for external validation and clinical adoption (Nomura et al., 2021; Mohsen et al., 2023).
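To make the metrics in this comparison concrete, the following minimal sketch computes the AUC (equivalent to the C-index for a binary outcome) together with sensitivity and specificity at a single cutoff. The labels, both score lists, the 0.5 threshold, and the function names are hypothetical illustrations. They are not taken from FINDRISC or from any of the cited studies.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are flagged as high risk."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = developed diabetes, 0 = did not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical risk estimates from a simple scoring tool and from an AI model.
score_tool = [0.9, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.6]
score_model = [0.9, 0.8, 0.6, 0.7, 0.5, 0.3, 0.2, 0.65]

print(auc(labels, score_tool))   # 0.84375
print(auc(labels, score_model))  # 0.9375
print(sensitivity_specificity(labels, score_tool, 0.5))  # (0.75, 0.5)
```

The pairwise definition is quadratic in the number of cases and is shown only for clarity. On real screening data a rank-based implementation from an established statistics library would normally be preferred.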
3 The Influence of Data Sources and Feature Engineering on Accuracy
3.1 Data source types and sample representativeness
AI predictive models for early screening of diabetes are usually built on large-scale population surveys (such as NHANES), the Pima Indian Diabetes Database (PIDD), and data from hospitals across the country. These datasets differ greatly in sample size, population composition, and coverage of clinical, biochemical, and lifestyle-related variables (Figure 1) (Patro et al., 2023). NHANES data reflect the population of the United States, whereas PIDD covers a specific ethnic group. Using multiple datasets of different types can make a model more widely applicable, but it also complicates data integration and bias control (Talari et al., 2024).

Figure 1 Proposed methodology for diabetes prediction (Adopted from Patro et al., 2023)

If a model is applied to people with different genetic backgrounds, living habits, or medical conditions, insufficient sample representativeness will degrade its performance. Therefore, the datasets used for training and validation should match the target screening population as closely as possible (Dutta et al., 2022; Talari et al., 2024). External validation with independent population data is recognized as a key step in evaluating whether a model can generalize and avoid "learning bias" (Khokhar et al., 2025).

3.2 Feature types and feature engineering strategies
Common features of diabetes prediction models include demographic information (such as age and gender), clinical examination data (such as BMI, blood pressure, and blood glucose), as well as lifestyle data (such as
