
Computational Molecular Biology 2025, Vol.15, No.4, 183-192
http://bioscipublisher.com/index.php/cmb

5.3 Spatially dependent differential gene expression analysis and functional annotation
Spatial differential expression analysis asks which genes are abundant in some regions of a tissue and scarce in others. Some genes are highly active at the edge of the tumor but almost silent in the core. Others change gradually along a spatial axis, forming a gradient (Liang et al., 2024). Such differences often point to distinct biological conditions in different regions, for instance a strong immune response in one area and insufficient oxygen in another. Researchers usually submit these gene sets to functional annotation or pathway analysis to see which biological processes they are associated with (Li et al., 2025). In this way, the spatial distribution of genes is no longer just a map; it becomes a clue to the role played by each region of the tumor microenvironment.

6 Challenges and Frontiers in Tumor Microenvironment Modeling
6.1 Solutions to data sparsity and high noise problems
Spatial transcriptomic data are inherently sparse and noisy, with many zero values, and this is an unavoidable problem in modeling. One approach uses smoothing or interpolation so that adjacent spots borrow information from each other, filling in zero values while extracting the main signal (Lu et al., 2024). Another relies on statistical modeling: the noise is first assumed to follow a particular distribution, for example a negative binomial, and technical error is then separated from true expression (Tian et al., 2024). Where conditions permit, integrating data from multiple experiments or different omics layers can also make the signal more stable.
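The neighbor-borrowing idea can be illustrated with a minimal sketch. It is not the method of the cited studies: the function name `knn_smooth`, the parameters `k` and `alpha`, and the toy 3 x 3 grid are all assumptions made for illustration. Each spot's value is blended with the mean of its k nearest spatial neighbors, so an isolated dropout zero is pulled toward the surrounding signal.

```python
import numpy as np

def knn_smooth(coords, counts, k=4, alpha=0.5):
    """Blend each spot's expression with the mean of its k nearest spatial neighbours.

    coords: (n_spots, 2) array of spot positions (hypothetical layout).
    counts: (n_spots, n_genes) array of expression values.
    alpha:  weight kept on the spot's own value; 1 - alpha goes to the neighbour mean.
    """
    coords = np.asarray(coords, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Pairwise Euclidean distances between all spots.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a spot is not its own neighbour
    # Indices of the k closest spots for every spot.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    # counts[nbrs] has shape (n_spots, k, n_genes); average over the k neighbours.
    nbr_mean = counts[nbrs].mean(axis=1)
    return alpha * counts + (1.0 - alpha) * nbr_mean

# Toy example (assumed data): a 3 x 3 grid, one gene, uniform signal of 10
# with a single dropout zero at the centre spot (index 4).
grid = np.array([(i, j) for i in range(3) for j in range(3)])
expr = np.full((9, 1), 10.0)
expr[4, 0] = 0.0
smoothed = knn_smooth(grid, expr, k=4, alpha=0.5)
```

Here the centre spot recovers half of the surrounding signal, while its neighbors are pulled down slightly. That trade-off between filling zeros and blurring real boundaries is the reason `alpha` and `k` would need tuning on real data.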
Although these methods cannot completely eliminate noise, they make the results more reliable and lay a firmer foundation for subsequent analysis.

6.2 Interpretability of the model and difficulties in biological validation
A long-standing difficulty in spatial transcriptomic modeling is that the models are too complex for people to understand. Deep learning models in particular can capture complex spatial patterns, but it is hard to explain how their results arise or which genes play the major role (Chitra et al., 2025). More troublesome still, seemingly interesting findings from a model, such as new cell interactions or expression hotspots, cannot be verified experimentally right away. Staining, imaging, and functional assays are time-consuming and laborious, and many predictions end up shelved. As a result, the credibility of the models suffers and their practical application is delayed. The key to future progress may lie not in more complex algorithms but in making models interpretable while finding faster and more reliable methods of experimental verification (Zhao et al., 2024).

6.3 The latest progress of artificial intelligence and deep learning in spatial modeling
Artificial intelligence is increasingly being applied to spatial transcriptomic modeling, and deep learning in particular has gained momentum in recent years. Models such as graph neural networks are now used to analyze spatial gene expression maps, leveraging the adjacency information between cells to identify more subtle clustering structures (Li et al., 2023). Others have combined convolutional neural networks with histological images, hoping to capture spatial features beyond gene expression.
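To make the use of adjacency information concrete, the following minimal sketch builds a spatial graph from spot coordinates and applies one graph-convolution step with symmetric normalization and self-loops. It is an illustration, not the architecture of the cited work: the function names, the `radius` threshold, and the three-spot toy input are all assumptions, and the weights are fixed rather than learned.

```python
import numpy as np

def spatial_adjacency(coords, radius):
    """Connect spots whose Euclidean distance is at most `radius` (assumed rule)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added later, inside the layer
    return adj

def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # node degrees including the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weights, 0.0)

# Toy example (assumed data): three spots in a row, two genes per spot.
coords = [(0, 0), (1, 0), (2, 0)]
adj = spatial_adjacency(coords, radius=1.0)
features = np.array([[5.0, 0.0],
                     [0.0, 0.0],
                     [0.0, 3.0]])
weights = np.eye(2)  # identity weights, so only the neighbour mixing is visible
hidden = gcn_layer(adj, features, weights)
```

After one step, the middle spot, which had no expression of its own, carries a mixture of both neighbors' profiles. In a trained network the `weights` matrix would be learned, and several such layers would be stacked.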
A bolder approach uses generative models, such as variational autoencoders or GANs, to generate spatial data for imputation, simulation, or hypothesis testing (Hu et al., 2024). The advantage of AI is that it can automatically extract complex relationships from high-dimensional data, which is difficult for traditional algorithms to achieve. However, it also has weaknesses: it is prone to overfitting and has poor interpretability. No matter how powerful a model is, if it is unclear what the model is attending to, its biological significance remains in question.

7 Case Study: Modeling of Breast Cancer Microenvironment Based on Spatial Transcriptome Data
7.1 Data sources and experimental design (such as 10x Genomics Visium platform)
This case used a breast cancer tissue section, with data from the 10x Genomics Visium platform. At the start of the experiment, fresh frozen tissue sections were placed on slides carrying spatial barcodes, and H&E
