
Computational Molecular Biology 2024, Vol.14, No.2, 84-94
http://bioscipublisher.com/index.php/cmb

Bayesian network grows exponentially with the number of variables (genes or proteins), making the task of learning the network structure from data computationally challenging, particularly for large biological systems. As a result, traditional Bayesian methods often struggle to scale efficiently to genome-wide datasets, which can contain tens of thousands of genes. Moreover, Bayesian networks require accurate prior knowledge to guide the inference process. In many biological contexts, such prior knowledge is incomplete or uncertain, leading to potential inaccuracies in the inferred network. Another challenge is the inability of Bayesian networks to model feedback loops and cyclic interactions, which are common in biological systems such as signal transduction pathways. This acyclicity assumption also limits the capacity of Bayesian networks to capture dynamic processes that involve recurrent interactions over time.

4.2 Granger causality
Granger causality is a statistical approach used to determine whether one time series helps predict another, which is taken as evidence of a causal (in the predictive sense) relationship. In the context of biological networks, it is particularly useful for analyzing time-series gene expression data to uncover regulatory relationships between genes. The fundamental premise of Granger causality is that if the past values of one variable (e.g., the expression level of one gene) contain information that helps predict the future values of another variable beyond what that variable's own past provides, then the first variable is said to Granger-cause the second. This method has been widely applied in gene regulatory network (GRN) analysis, particularly in dynamic systems such as developmental biology, where gene expression levels change over time in response to various signals and environmental conditions.
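The premise described above can be illustrated with a small sketch. It compares two lag-based linear fits of a target series: one using only the target's own past (restricted) and one that also uses the past of a candidate driver (unrestricted), and summarizes the improvement as an F statistic. Everything here is an assumption made for illustration and does not come from the article: the function names (`solve_linear`, `ols_rss`, `granger_f`), the simulated series, and the coefficients. It uses only the Python standard library to stay self-contained; in practice an established implementation such as `grangercausalitytests` in statsmodels would usually be preferable.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(rows, targets):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' at the given lag order."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus the target's own past values.
    own = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    # Unrestricted model: additionally includes past values of the candidate driver.
    full = [row + [x[t - k] for k in range(1, lag + 1)] for row, t in zip(own, times)]
    rss_restricted = ols_rss(own, targets)
    rss_unrestricted = ols_rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_restricted - rss_unrestricted) / lag) / (rss_unrestricted / dof)


# Simulated data (an illustrative assumption): x drives y with a one-step delay.
random.seed(0)
n_points = 200
x = [random.gauss(0, 1) for _ in range(n_points)]
y = [0.0]
for t in range(1, n_points):
    y.append(0.2 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 0.5))

f_xy = granger_f(x, y)  # does the past of x help predict y?
f_yx = granger_f(y, x)  # does the past of y help predict x?
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

A large statistic for x -> y together with a small one for y -> x is consistent with the simulated direction of influence. To obtain a p-value, the statistic would be compared against an F distribution with `lag` and `dof` degrees of freedom.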
Traditional Granger causality assumes linear relationships between variables, which may limit its applicability to inherently nonlinear biological networks. However, several extensions, such as nonlinear Granger causality and kernel-based approaches, have been developed to overcome this limitation (Furqan & Siyal, 2016). Granger causality has proven useful in a variety of biological applications. For instance, in neuroscience, it has been applied to identify connectivity between different brain regions by analyzing neural time-series data, providing insights into how various areas of the brain communicate during cognitive tasks. In gene regulatory networks, Granger causality can be used to infer which genes are likely to influence others over time, providing a dynamic view of gene regulation that is not captured by static network models.

However, one limitation of Granger causality in biological network analysis is its sensitivity to noise and to high dimensionality, both of which are common features of biological data. The number of potential interactions between genes can quickly exceed the number of available time points, leading to overfitting and false positives. To address this, newer methods such as regularized Granger causality and ensemble-based approaches have been developed. These improve the robustness and scalability of the method by incorporating penalization techniques or by aggregating multiple models to reduce false discoveries (Finkle et al., 2018).

4.3 Structural equation modeling (SEM)
Structural equation modeling (SEM) is a statistical technique used to analyze complex relationships between observed and latent variables. SEM is often employed in biological network analysis to infer direct and indirect effects among multiple interacting entities, such as genes or proteins.
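The distinction between direct and indirect effects can be made concrete with a minimal sketch of a linear structural model. The three variables (mutation, expression, phenotype) and all coefficient values below are hypothetical and chosen only for illustration. The sketch decomposes effects for a given set of structural coefficients; it does not estimate those coefficients from data, which is what a full SEM analysis would do.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def total_effects(direct, max_terms=100, tol=1e-12):
    """Sum direct + direct^2 + ... so that entry [i][j] is the total effect of j on i.

    Each matrix power adds the contribution of paths that are one step longer.
    With feedback loops the series only converges when the loop gains are
    small enough (spectral radius below 1); that is assumed here, not checked.
    """
    n = len(direct)
    total = [row[:] for row in direct]
    power = [row[:] for row in direct]
    for _ in range(max_terms):
        power = mat_mul(power, direct)
        if max(abs(v) for row in power for v in row) < tol:
            break
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical structural equations (coefficients are illustrative assumptions):
#   expression = 0.5 * mutation + noise
#   phenotype  = 0.3 * mutation + 0.4 * expression + noise
MUTATION, EXPRESSION, PHENOTYPE = 0, 1, 2
direct = [
    [0.0, 0.0, 0.0],  # mutation has no modeled causes
    [0.5, 0.0, 0.0],  # expression depends on mutation
    [0.3, 0.4, 0.0],  # phenotype depends on mutation and expression
]

total = total_effects(direct)
indirect = [
    [total[i][j] - direct[i][j] for j in range(3)] for i in range(3)
]

print("direct   mutation -> phenotype:", direct[PHENOTYPE][MUTATION])
print("indirect mutation -> phenotype:", round(indirect[PHENOTYPE][MUTATION], 6))
print("total    mutation -> phenotype:", round(total[PHENOTYPE][MUTATION], 6))
```

In this toy model the indirect effect of the mutation on the phenotype is the product of the coefficients along the path through expression, and the total effect is the sum of the direct and indirect parts.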
SEM combines aspects of factor analysis and multiple regression, allowing for the modeling of complex dependencies and feedback loops that are common in biological systems. One of the key strengths of SEM is its ability to handle both observed (measured) variables and latent variables (unobserved factors inferred from the data). This makes it particularly useful in biological studies where not all relevant factors can be directly measured, such as gene regulation studies. In SEM, relationships are represented as a system of equations that describe how each variable depends on the others, capturing the causal pathways that underlie biological processes (Howey et al., 2021).

In biological research, SEM has been applied to a wide range of problems, from understanding gene-environment interactions to mapping signaling pathways in cancer. For example, SEM can be used to model how genetic mutations influence gene expression, which in turn affects cellular behavior and contributes to disease progression. This method is particularly valuable in integrative genomics, where researchers aim to combine data from
