Computational Molecular Biology 2025, Vol.15, No.6, 291-298 http://bioscipublisher.com/index.php/cmb 293

their connections. Therefore, graph neural networks (GNNs) have become commonly used tools. They can handle spatial dependencies and capture the strength of interactions between nodes, thereby identifying transmission clusters or potential super-spreading nodes. Deep learning models are inherently well suited to complex structures. When real-time genomic or epidemiological data are added, the transmission process can be resolved in finer detail and hotspots can be identified more quickly. Because transmission patterns change dynamically, such models are more flexible and better suited to continuously updated data (Dubey et al., 2025; Kaur and Butt, 2025).

3.3 Model training, cross-validation, and performance evaluation metrics (e.g., AUC, F1-Score)

When building a predictive system, training and validation are often more crucial than the model itself. Researchers usually adopt methods such as K-fold cross-validation or time-series splitting to check how the model performs on unseen data and thereby guard against overfitting. AUC and F1-score are commonly used evaluation metrics. AUC summarizes how well the model separates positive from negative cases across decision thresholds, while the F1-score balances precision against sensitivity (recall). Both are particularly important for epidemic prediction, where misclassifications carry real risks. In recent years, some studies have tended to use ensemble or hybrid models that combine the strengths of different AI algorithms, making the system more robust across regions and types of outbreak scenarios (Figure 1). This also highlights a fact: when applying artificial intelligence to disease surveillance, the evaluation framework itself is equally worthy of attention (Santosh, 2020; Jin et al., 2022).
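The validation scheme and metrics described in Section 3.3 can be illustrated with a minimal, dependency-free Python sketch. It is not the procedure used in the cited studies. The function names (`time_series_splits`, `f1_score`, `roc_auc`), the equal-block forward-chaining split, the 0.5 decision threshold, and the toy weekly labels and risk scores are all assumptions chosen for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def time_series_splits(n_samples: int, n_folds: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Forward-chaining splits: train on the past, test on the next block.

    Assumption: samples are already sorted by time and blocks are equal-sized
    (the last test block absorbs any remainder).
    """
    block = n_samples // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_end = k * block
        test_end = n_samples if k == n_folds else train_end + block
        yield list(range(train_end)), list(range(train_end, test_end))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical data: 12 weeks of outbreak labels (1 = outbreak) and the
    # risk scores some already-fitted model assigned to each week.
    labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
    scores = [0.1, 0.2, 0.7, 0.3, 0.2, 0.6, 0.9, 0.4, 0.8, 0.5, 0.45, 0.9]
    threshold = 0.5  # assumed decision threshold for turning scores into classes

    for fold, (train_idx, test_idx) in enumerate(time_series_splits(len(labels), 2), 1):
        # A real pipeline would refit the model on train_idx here; this sketch
        # only evaluates the fixed scores on the held-out (later) block.
        y_true = [labels[i] for i in test_idx]
        y_score = [scores[i] for i in test_idx]
        y_pred = [1 if s >= threshold else 0 for s in y_score]
        print(f"fold {fold}: AUC={roc_auc(y_true, y_score):.2f}, "
              f"F1={f1_score(y_true, y_pred):.2f}")
```

Forward-chaining splits are used here rather than shuffled K-fold because each test block lies strictly after its training data, so no future information leaks into training. In practice, an established library implementation would normally be preferred over hand-written metrics.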
Figure 1 Chest X-ray: Bilateral focal consolidation, lobar consolidation, and patchy consolidation are clearly observed (Adapted from Santosh, 2020)

4 Data Integration and Feature Engineering Strategies

4.1 Integration of genotype-phenotype data and host-pathogen interactions

When studying the mechanisms of animal disease outbreaks, many teams first examine genotype and phenotype data together and then add information on host-pathogen interactions. The aim is not to build the model in a single step, but to identify genetic variations related to virulence or host susceptibility earlier. When genomic information, phenotypic characteristics, and interaction networks are combined, details of disease dynamics that no single data source captures become visible, and the explanations produced by artificial intelligence models move closer to biological reality. The main appeal of this holistic integration is that it can help identify the critical pathways or markers that affect transmission and disease severity, which are valuable for subsequent monitoring and intervention (Baker et al., 2019).

4.2 Spatiotemporal, environmental, and host behavior data fusion

When building predictive models, researchers often do not start directly from the algorithm but first examine the background conditions of disease occurrence, including climate, land use, habitat conditions, and the migration and behavioral patterns of animals. Once these spatiotemporal and ecological factors are combined