
Computational Molecular Biology 2024, Vol.14, No.2, 84-94
http://bioscipublisher.com/index.php/cmb

correlation for causality can lead to false hypotheses about biological mechanisms, which could hinder therapeutic developments. Robust causal inference approaches, such as Granger causality or structural equation modeling, are essential to avoid such errors (Lecca, 2021).

3.3 Challenges in causal inference for complex systems

Inferring causality in complex biological systems poses several challenges. First, biological networks are often high-dimensional, with a large number of genes, proteins, or metabolites interacting in a nonlinear and dynamic manner. Traditional statistical methods struggle with such complexity, leading to issues with scalability and computational efficiency.

Another challenge is the presence of latent variables or hidden confounders, which can obscure true causal relationships. For example, in gene regulatory networks, unmeasured external factors might influence the expression of multiple genes, creating spurious causal links (Monneret et al., 2017). Additionally, observational data, which is commonly used in biological studies, often lacks the controlled experimental conditions needed for strong causal inference. Experimental interventions like gene knockouts help address this but are not always feasible. Advances in computational methods, such as machine learning and graph neural networks, have been proposed to overcome these challenges, providing more accurate and scalable causal inference tools for biological research (Zhang et al., 2022).

4. Causal Inference Methods in Biological Network Analysis

4.1 Bayesian networks

4.1.1 Basic principles

Bayesian networks (BNs) are probabilistic graphical models that describe relationships between variables through conditional dependencies.
They consist of nodes, representing biological entities such as genes or proteins, and directed edges that indicate the probabilistic causal relationships between them. The core concept of Bayesian networks is derived from Bayes' Theorem, which updates the probability of a hypothesis as new data becomes available. Bayesian networks are particularly effective for modeling complex systems where uncertainty and variability play significant roles, such as in biological networks where multiple genes interact to regulate cellular processes. A Bayesian network is represented as a directed acyclic graph (DAG), which models the hierarchical nature of gene interactions or signal transduction pathways.

For biological network analysis, Bayesian networks are powerful tools due to their ability to incorporate prior knowledge and to integrate different types of data, such as gene expression and protein interaction data. Bayesian inference methods are also capable of handling noisy and incomplete data, which is a common issue in high-throughput biological experiments. This ability to integrate diverse data sources and account for uncertainty makes Bayesian networks a valuable approach for uncovering hidden patterns in biological systems (Howey et al., 2020).

4.1.2 Applications in gene regulatory networks

In the realm of gene regulatory networks (GRNs), Bayesian networks are widely used to infer causal relationships between genes based on transcriptomic data. For instance, Bayesian approaches have been employed to predict how transcription factors regulate gene expression, revealing important insights into the molecular mechanisms that control cell behavior. Bayesian methods have been particularly useful in diseases such as cancer, where understanding how certain genes control others can lead to the identification of therapeutic targets.
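
As a minimal sketch of how a Bayesian network supports this kind of inference, the Python snippet below encodes a hypothetical three-node DAG in which one transcription factor (TF) regulates two genes. The node names, the network structure, and every probability are illustrative assumptions, not values taken from any cited study. The joint distribution is factorized along the DAG, and Bayes' theorem is applied by exhaustive enumeration to update the belief that the TF is active once gene expression has been observed.

```python
from itertools import product

# Hypothetical DAG: TF -> gene_a, TF -> gene_b.
# All probabilities are made-up values for illustration only.
P_TF_ACTIVE = 0.3                         # prior P(TF active)
P_GENE_A_HIGH = {True: 0.9, False: 0.2}   # P(gene_a high | TF state)
P_GENE_B_HIGH = {True: 0.8, False: 0.1}   # P(gene_b high | TF state)


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(tf, gene_a, gene_b):
    """Joint probability factorized along the DAG: P(TF) P(A | TF) P(B | TF)."""
    return (
        bernoulli(P_TF_ACTIVE, tf)
        * bernoulli(P_GENE_A_HIGH[tf], gene_a)
        * bernoulli(P_GENE_B_HIGH[tf], gene_b)
    )


def posterior_tf_active(evidence):
    """P(TF active | evidence), computed by summing the joint over all states.

    `evidence` maps 'gene_a' and/or 'gene_b' to an observed True/False value.
    """
    numerator = 0.0
    denominator = 0.0
    for tf, gene_a, gene_b in product([True, False], repeat=3):
        state = {"gene_a": gene_a, "gene_b": gene_b}
        if any(state[name] != value for name, value in evidence.items()):
            continue  # this state contradicts the observations
        p = joint(tf, gene_a, gene_b)
        denominator += p
        if tf:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print(f"no evidence:        {posterior_tf_active({}):.3f}")
    print(f"gene_a high:        {posterior_tf_active({'gene_a': True}):.3f}")
    print(f"gene_a, gene_b high: "
          f"{posterior_tf_active({'gene_a': True, 'gene_b': True}):.3f}")
```

With these assumed numbers, observing high expression of gene_a raises the probability that the TF is active from 0.30 to about 0.66, and observing both genes high raises it to about 0.94. Exhaustive enumeration is practical only for very small networks; it is used here to keep the arithmetic visible.
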
One key strength of Bayesian networks is their capacity to integrate different types of high-throughput data (e.g., transcriptomics, proteomics, and metabolomics) to construct a more comprehensive model of biological interactions. This integrated approach helps in identifying key regulatory genes and pathways that might not be evident from individual datasets. For example, studies using multi-omics datasets have applied Bayesian networks to explore cancer progression, identifying potential biomarkers and therapeutic targets by analyzing how genes interact within complex regulatory networks (Wang et al., 2018).

4.1.3 Limitations and challenges

Despite the advantages of Bayesian networks, several limitations hinder their widespread application in biological research. The first major limitation is their computational complexity. The number of possible structures for a
