
Computational Molecular Biology 2025, Vol.15, No.2, 102-111
http://bioscipublisher.com/index.php/cmb

7.2 Calculation and prediction process of off-target sites
In off-target prediction for EMX1, researchers usually first use a sequence alignment tool (such as Bowtie) to quickly locate candidate sequences that approximately match the sgRNA. The risk score of each candidate site is then calculated with the CFD model to screen out high-priority potential off-target sites. Deep learning models such as DeepCRISPR are subsequently used to refine the predictions for these candidates, thereby narrowing the scope of experimental verification. The prediction process generally includes the following steps:
(1) Input the EMX1 sgRNA sequence and its PAM recognition site;
(2) Perform genome-wide alignment and screen for sequences with high similarity;
(3) Calculate the off-target risk score and generate a list of candidate off-target sites;
(4) Further refine the results using a deep learning framework and output a list of high-confidence predictions (Störtz et al., 2023).

7.3 Prediction Results and Experimental Verification Analysis
Computational prediction usually yields dozens of potential off-target sites, which are then verified experimentally using GUIDE-seq or CIRCLE-seq. The results show that traces of Cas9 cleavage are indeed present at some of the predicted sites. Notably, the predictions of the CFD model and DeepCRISPR are often highly consistent with the experimental data, whereas predictions that rely solely on BLAST or Bowtie produce more false positives by comparison (Lin et al., 2020). This case demonstrates that integrating multiple prediction methods and combining them with experimental verification is an effective way to ensure the reliability of off-target assessment (AlJanahi et al., 2020).
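The prediction steps of Section 7.2 and the comparison with experimental data in Section 7.3 can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the cited studies: a naive forward-strand scan stands in for Bowtie, `toy_score` is a hypothetical position-weighted penalty rather than the published CFD matrix, the deep learning refinement step is omitted, and the guide, genome and "validated" positions are toy data.

```python
from typing import List, NamedTuple, Set, Tuple


class Candidate(NamedTuple):
    position: int      # 0-based start of the protospacer in the genome string
    protospacer: str
    pam: str
    mismatches: int
    score: float       # toy score in (0, 1]; 1.0 means a perfect match


def has_ngg_pam(pam: str) -> bool:
    """Canonical SpCas9 PAM: any base followed by GG."""
    return len(pam) == 3 and pam[1:] == "GG"


def toy_score(guide: str, site: str) -> float:
    """Hypothetical stand-in for CFD (NOT the real matrix): each mismatch
    multiplies the score by a factor that shrinks toward the PAM-proximal
    (3') end, so PAM-proximal mismatches are penalized more."""
    n = len(guide)
    score = 1.0
    for i, (g, s) in enumerate(zip(guide, site)):
        if g != s:
            score *= 1.0 - (i + 1) / (n + 1)
    return score


def find_candidates(guide: str, genome: str,
                    max_mismatches: int = 3) -> List[Candidate]:
    """Steps 1-3: scan the forward strand for NGG-flanked sites within
    max_mismatches of the guide and rank them by the toy score."""
    n = len(guide)
    hits = []
    for start in range(len(genome) - n - 2):
        site = genome[start:start + n]
        pam = genome[start + n:start + n + 3]
        if not has_ngg_pam(pam):
            continue
        mismatches = sum(g != s for g, s in zip(guide, site))
        if mismatches <= max_mismatches:
            hits.append(Candidate(start, site, pam, mismatches,
                                  toy_score(guide, site)))
    return sorted(hits, key=lambda c: c.score, reverse=True)


def precision_recall(predicted: Set[int],
                     validated: Set[int]) -> Tuple[float, float]:
    """Compare predicted site positions with experimentally validated ones."""
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Toy data: one perfect on-target site and one site with two mismatches.
guide = "GAGTCCGAGCAGAAGAAGAA"
off_target = "AAGTCCGAGCAGAAGAAGAT"
genome = "TTTT" + guide + "GGG" + "ACAC" + off_target + "TGG" + "TTTT"

hits = find_candidates(guide, genome)
for hit in hits:
    print(hit)

# Hypothetical validation set, e.g. positions confirmed by GUIDE-seq.
print(precision_recall({c.position for c in hits}, {4}))
```

In practice the scan would be replaced by a genome-wide aligner run on both strands and the toy score by a published scoring model; the structure of the workflow (enumerate, score, rank, compare with validated sites) stays the same.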
8 Future Development Trends

8.1 Multimodal data integration enhances off-target prediction capabilities
An important direction for future off-target prediction is the integration of multimodal data. In addition to the DNA sequence itself, factors such as chromatin accessibility, DNA methylation, histone modification and three-dimensional genome structure should also be taken into account. Combining these data with deep learning models should bring the predictions closer to the real intracellular environment (Lin and Wong, 2018).

8.2 Technological and algorithmic improvements for enhancing CRISPR specificity
At the technical level, high-fidelity Cas9 variants (such as SpCas9-HF1 and eSpCas9) and improved sgRNA design strategies have shown potential for reducing off-target risk (Chen et al., 2017; Wang et al., 2023; Matsumoto et al., 2024). In the future, computational prediction models can be combined with these technologies to provide customized off-target evaluations for different editing needs. Algorithm development also needs to emphasize transparency and interpretability, so that researchers can understand the biological logic behind the predictions.

8.3 New advances in preclinical safety evaluation of gene editing
As CRISPR technology moves towards clinical application, achieving a comprehensive safety assessment at the preclinical stage has become a core issue. Computational prediction will be combined with high-throughput sequencing, single-cell analysis and animal models to establish a multi-level off-target assessment system (Tian et al., 2023). This not only helps to reduce the risks of gene therapy, but also provides a reliable safety basis for agricultural molecular breeding. Establishing internationally unified standards for prediction and verification will be an important safeguard for the healthy development of the field (Cancellieri et al., 2022).
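As a rough illustration of the multimodal integration described in Section 8.1, the sketch below stacks sequence and epigenomic information into one per-position feature matrix of the kind a deep learning model could take as input. The function names, the track names and the signal values are all hypothetical and chosen only for illustration; how real accessibility or methylation signals are obtained, and how a model would consume the matrix, is not shown.

```python
from typing import Dict, List

BASES = "ACGT"


def one_hot(base: str) -> List[float]:
    """Encode one nucleotide as a 4-element indicator vector
    (any other symbol, e.g. N, becomes all zeros)."""
    return [1.0 if base == b else 0.0 for b in BASES]


def encode_site(guide: str, target: str,
                tracks: Dict[str, List[float]]) -> List[List[float]]:
    """Build one feature row per position: guide one-hot, target one-hot,
    then one value per epigenomic track (tracks ordered by name)."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    for name, values in tracks.items():
        if len(values) != len(guide):
            raise ValueError(f"track {name!r} has the wrong length")
    names = sorted(tracks)
    return [
        one_hot(g) + one_hot(t) + [tracks[name][i] for name in names]
        for i, (g, t) in enumerate(zip(guide, target))
    ]


# Toy example: a 4-nt guide/target pair with two invented per-base signals.
features = encode_site(
    "ACGT",
    "ACGA",
    {
        "accessibility": [0.9, 0.8, 0.7, 0.6],  # hypothetical values
        "methylation": [0.1, 0.0, 0.2, 0.3],    # hypothetical values
    },
)
for row in features:
    print(row)
```

Each row has 8 sequence columns plus one column per track, so adding a further modality only widens the matrix and leaves the sequence encoding unchanged.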
9 Summary and Outlook
This paper systematically reviews research on the computational prediction of off-target effects in the CRISPR system. Starting from the CRISPR/Cas system and the mechanisms of off-target effects, it introduces sequence alignment methods, rule-based and machine learning-based methods, deep learning prediction frameworks, and strategies for model evaluation and experimental verification. The case study of the human EMX1 gene demonstrates the specific workflow and verification results of computational prediction in practice. Overall, computational methods have played a significant role in making sgRNA design more rational and in reducing experimental risk.
