Computational Molecular Biology 2025, Vol.15, No.5, 254-264
http://bioscipublisher.com/index.php/cmb

3.1 Types of biological networks

Biological networks do not take a single form; they capture relationships at different levels. The gene regulatory network (GRN) describes how transcription factors regulate their target genes. The protein-protein interaction (PPI) network records which proteins physically interact. Metabolic networks and signaling networks instead represent the chemical reactions and information transmission pathways within cells. Each network alone can reveal some cellular functions, but studying an actual disease mechanism often requires combining them, because the molecular changes of AD span multiple levels (Tieri et al., 2019; Liu et al., 2020).

3.2 Data integration and preprocessing

Before reconstructing the network, researchers usually have to reconcile data from different sources, at different scales, and of varying quality. Common inputs include gene expression, epigenetic modifications, proteomics, and genetic variation. To make these data comparable, normalization, noise reduction, and feature selection are usually applied first. Combining multiple datasets gives a more comprehensive picture but also introduces more heterogeneity, so batch correction or cross-study learning methods are often used to reduce technical biases. If handled properly, integrated data often capture complementary biological information more effectively than a single data source and also make network inference more stable (Delgado-Chaves and Gomez-Vela, 2019).

3.3 Inference algorithms and models

Once the data are organized, researchers face another problem: how to infer the network structure from them.
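Before turning to inference, the preprocessing described in Section 3.2 can be made concrete with a minimal sketch. It assumes a small samples-by-genes expression matrix with known batch labels. Per-batch mean-centering is used here only as a crude stand-in for batch correction; it is not the method of any cited study. The function names `center_by_batch` and `zscore_genes` and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def center_by_batch(matrix, batches):
    """Subtract each batch's per-gene mean (a crude batch correction)."""
    n_genes = len(matrix[0])
    corrected = [row[:] for row in matrix]
    for batch in set(batches):
        rows = [i for i, label in enumerate(batches) if label == batch]
        for g in range(n_genes):
            batch_mean = mean(matrix[i][g] for i in rows)
            for i in rows:
                corrected[i][g] = matrix[i][g] - batch_mean
    return corrected


def zscore_genes(matrix):
    """Scale every gene (column) to zero mean and unit variance."""
    n_genes = len(matrix[0])
    scaled = [row[:] for row in matrix]
    for g in range(n_genes):
        column = [row[g] for row in matrix]
        mu, sigma = mean(column), pstdev(column)
        for i, row in enumerate(matrix):
            # Constant genes carry no signal; map them to 0 instead of dividing by 0.
            scaled[i][g] = (row[g] - mu) / sigma if sigma > 0 else 0.0
    return scaled


# Toy data (assumed): 6 samples x 2 genes, batch "B" shifted upward by a technical offset.
expression = [
    [1.0, 5.0],
    [2.0, 7.0],
    [3.0, 9.0],
    [11.0, 25.0],
    [12.0, 27.0],
    [13.0, 29.0],
]
batches = ["A", "A", "A", "B", "B", "B"]

processed = zscore_genes(center_by_batch(expression, batches))
```

After these two steps, corresponding samples in the two batches have matching values, so the technical offset no longer dominates downstream correlations. Note that centering by batch also removes real biological differences whenever batch is confounded with disease status, which is one reason dedicated batch-correction methods are usually preferred in practice.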
Not all methods are complicated; some simply look for correlations between variables. In many cases, however, researchers turn to machine learning or Bayesian methods in the hope of uncovering less obvious connections. Techniques such as L1 regularization, hierarchical Bayes, and multi-task learning all address a long-standing problem: the network should be neither too dense nor so sparse that important relationships are lost. Graph neural networks and active learning are also increasingly used in this type of analysis. They can incorporate existing biological knowledge and experimental intervention information into the model, bringing the inferred network structure closer to the underlying biology. These methods can usually identify key nodes or pathways that may be related to AD more stably, providing more evidence for screening new mechanistic hypotheses or therapeutic targets (Cui et al., 2024).

4 Disease Module Identification in AD

Studies of Alzheimer's disease (AD) have found that a single gene often fails to explain the complexity of the disease. It is therefore often necessary to partition the related genes or proteins into smaller sub-networks, known as disease modules. The aim is to group pathologically interrelated components so that the biological processes they share can be seen more clearly. In recent years, methods combining graph representation learning and unsupervised clustering have been applied to multi-omics AD data to extract strongly disease-related functional modules from large networks. Evaluations such as the DREAM Challenge for disease module identification also show that well-performing algorithms typically retrieve core modules related to multiple traits, and these modules often correspond to key disease pathways or potential therapeutic targets.
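As a rough illustration of how a network can be partitioned into candidate modules, the sketch below implements a bare-bones version of Markov clustering (MCL), one widely used graph clustering method, in plain Python. The toy adjacency matrix, the function names (`markov_cluster`, `normalize_columns`, `matmul`), and the parameter defaults are illustrative assumptions. They are not taken from any cited study or from the reference MCL implementation.

```python
def normalize_columns(matrix):
    """Rescale every column to sum to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Return modules as sorted tuples of node indices."""
    n = len(adjacency)
    # Self-loops are added on the diagonal, as is customary for MCL.
    flow = [[1.0 if i == j else float(adjacency[i][j]) for j in range(n)]
            for i in range(n)]
    flow = normalize_columns(flow)
    for _ in range(max_iter):
        expanded = matmul(flow, flow)                    # expansion: two-step random walk
        inflated = normalize_columns(                    # inflation: sharpen strong flows
            [[value ** inflation for value in row] for row in expanded])
        change = max(abs(inflated[i][j] - flow[i][j])
                     for i in range(n) for j in range(n))
        flow = inflated
        if change < tol:
            break
    # Rows that keep non-zero flow act as attractors; their columns form one module.
    modules = {tuple(j for j in range(n) if flow[i][j] > 1e-6) for i in range(n)}
    modules.discard(())
    return sorted(modules)


# Toy network (assumed): two triangles of nodes joined by a single bridge edge (2-3).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
modules = markov_cluster(adjacency)
```

On this toy graph the two triangles come out as separate modules. Raising `inflation` tends to give smaller, tighter modules, while lowering it tends to merge them, which is one concrete form of the trade-off between compact and loosely structured modules.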
Therefore, when building an AD network, the choice of clustering strategy often matters more than it might seem.

4.1 Network clustering and module detection

When working with AD networks, researchers can choose among many clustering methods. MCL, MCODE, and various community detection algorithms can all be used, but they do not aim for the same result. Some methods favor compact small modules with clear boundaries, which map directly onto a particular disease-related subnetwork. Others produce relatively large, loosely structured modules; these are less tidy but cover a wider range of functions. To tie these modules more closely to AD, researchers often incorporate known AD risk genes and their neighboring genes into the analysis. This approach of adding a "prior" generally makes the
