
Bioscience Methods 2025, Vol.16, No.2, 83-99
http://bioscipublisher.com/index.php/bm

integration can be performed at the single-cell level, we can find relationships such as "enhancer X opened in satellite cells promotes the expression of myogenic gene Y". Such information is crucial for establishing causal relationships. Even when simultaneously sequenced single-cell multi-omics data are not available, a "single-cell clustering plus omics matching" approach can be used: first use single-cell transcriptomes to define cell types, then perform ATAC-seq or ChIP-seq analysis on the corresponding cell populations, and finally match the two. A study of Hu sheep skeletal muscle development used a similar approach, overlaying the ATAC-accessible regions and low-methylation regions of muscle at a specific stage, and identified a transcription factor network related to muscle development (Cao et al., 2023).

5.2 Computational tools for multi-omics analysis
The complexity of multi-omics data requires advanced computational tools and algorithms for processing and integration. Commonly used single-cell multi-omics analysis frameworks include Seurat, Signac, and Scanpy, which support joint dimensionality reduction and clustering of single-cell transcriptome and ATAC-seq data. For example, Seurat can use its anchor-based method to align cells measured in different modalities, thereby identifying corresponding cell types across omics layers. There are also specialized tools for linking regulatory elements to genes, such as the Cicero algorithm, which predicts co-accessible chromatin regions from single-cell ATAC data and infers possible enhancer-promoter connections. For the integration of DNA methylation and the transcriptome, existing algorithms can correlate methylation levels with gene expression and use genomic annotations to determine whether the methylation occurs in promoters, gene bodies, or distal regulatory elements.
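The methylation-expression correlation step described above can be illustrated with a minimal sketch. It assumes matched per-sample values for one region and one gene; the function names, the 2 kb promoter window, the plus-strand gene model, and all numbers are hypothetical illustrations and are not taken from any of the cited studies or tools.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant input has no defined correlation")
    return cov / sqrt(var_x * var_y)


def annotate_region(region_start, region_end, tss, gene_end, promoter_window=2000):
    """Toy annotation of a region relative to a plus-strand gene.

    The 2 kb promoter window is an assumed convention, not a fixed standard.
    """
    midpoint = (region_start + region_end) // 2
    if abs(midpoint - tss) <= promoter_window:
        return "promoter"
    if tss < midpoint <= gene_end:
        return "gene_body"
    return "distal"


# Hypothetical matched samples (e.g. five developmental stages):
# mean methylation fraction of one region and expression of the nearby gene.
methylation = [0.82, 0.74, 0.55, 0.31, 0.20]
expression = [1.1, 1.9, 3.8, 6.5, 8.2]

r = pearson(methylation, expression)
label = annotate_region(9_500, 10_300, tss=10_000, gene_end=25_000)
print(f"{label}: r = {r:.2f}")
```

In this toy case the region falls in the promoter window and its methylation is strongly negatively correlated with expression, the pattern usually read as promoter demethylation accompanying activation. A real analysis would also need multiple-testing correction across regions and strand-aware annotation.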
For example, Ren et al. (2023) used weighted gene co-expression network analysis (WGCNA) and other methods to screen hub genes when analyzing goat methylation and expression data. Such network analysis integrates multi-omics data into a single association network model and identifies the intersecting gene set by calculating the association between gene expression modules and epigenetic feature modules.

Another important aspect is visualization and database support. Public genome browsers such as the WashU Epigenome Browser and the UCSC Genome Browser allow different types of data to be overlaid as tracks, intuitively displaying, for example, "ATAC signals, H3K27ac signals and DNA methylation levels of the promoter region of a gene under different conditions." This is very helpful for interpreting multi-omics results. In addition, machine learning has begun to be applied to multi-omics integration. For example, deep learning models can combine DNA sequence, epigenetic modifications and gene expression for prediction and can reveal new regulatory sequence patterns.

It is worth noting that single-cell multi-omics analysis must deal with data sparsity and noise. Because the number of molecules that can be captured from a single cell is very limited, only a small number of accessible regions are detected per cell in scATAC-seq, and many genes have zero counts in scRNA-seq. This poses a challenge for integration. To address it, some tools use a "pseudocell" aggregation method that first merges data from similar cells to improve the signal-to-noise ratio and then performs the omics association. In addition, statistical models such as LIGER and MOFA can perform multi-omics fusion in the presence of missing data. LIGER extracts shared and dataset-specific factors through non-negative matrix factorization, and MOFA uses a factor analysis model to find latent factors that drive variation across different omics layers.
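The "pseudocell" aggregation idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular tool: cells are grouped simply in their input order within each cluster, whereas real tools typically group nearest neighbours in a reduced-dimensional space. The function names, cluster labels, and toy matrix are hypothetical.

```python
from collections import defaultdict


def make_pseudocells(counts, clusters, size=3):
    """Sum raw counts over groups of `size` cells from the same cluster.

    counts   : list of per-cell count vectors (same length, same feature order)
    clusters : list of cluster labels, one per cell
    Returns a list of (cluster_label, summed_vector) tuples. A trailing group
    smaller than `size` is still emitted so that no cell is dropped.
    """
    if len(counts) != len(clusters):
        raise ValueError("counts and clusters must have one entry per cell")
    by_cluster = defaultdict(list)
    for vector, label in zip(counts, clusters):
        by_cluster[label].append(vector)

    pseudocells = []
    for label, cells in by_cluster.items():
        for start in range(0, len(cells), size):
            group = cells[start:start + size]
            # Column-wise sum across the cells in this group.
            summed = [sum(column) for column in zip(*group)]
            pseudocells.append((label, summed))
    return pseudocells


def zero_fraction(vectors):
    """Fraction of entries equal to zero, a simple measure of sparsity."""
    values = [v for vector in vectors for v in vector]
    return sum(1 for v in values if v == 0) / len(values)


# Toy scATAC-like matrix: 6 cells x 4 peaks, mostly zeros.
cells = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
labels = ["satellite"] * 3 + ["myocyte"] * 3  # hypothetical cluster labels

pseudo = make_pseudocells(cells, labels, size=3)
sparsity_before = zero_fraction(cells)
sparsity_after = zero_fraction([vector for _, vector in pseudo])
print(pseudo)
print(f"zero fraction: {sparsity_before:.2f} -> {sparsity_after:.2f}")
```

Summing within a cluster lowers the fraction of zero entries at the cost of resolution: each pseudocell represents a small population rather than an individual cell, so the group size is a trade-off between signal and granularity.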
These methods are constantly being improved and applied to new biological scenarios.

5.3 Advantages and limitations of multi-omics in muscle research
Multi-omics integration has shown significant advantages in skeletal muscle development research. First, it provides a more comprehensive perspective: multiple levels of gene regulation can be considered at the same time, avoiding the one-sided conclusions that arise from examining a single level in isolation. For example, from the transcriptome alone we may find that a gene is upregulated without knowing why; combined with methylation data, we may find that it is activated by promoter demethylation, which raises the finding to a mechanistic level of understanding (Zhou et al., 2022). Second, multi-omics integration improves the credibility of the signal. Different omics datasets validate one another, which can reduce false positives. For example, if a gene is identified as differentially expressed and its upstream regulatory elements also show significant epigenetic changes, we can be more confident that the change is biologically meaningful. Third, at the single-cell level,
