Computational Molecular Biology 2025, Vol.15, No.2, 102-111
http://bioscipublisher.com/index.php/cmb

5.2 CRISPR-Net model and method
CRISPR-Net is another deep learning-based off-target prediction tool. It combines recurrent neural networks (RNN) with convolutional neural networks (CNN), so it can capture local sequence features while also identifying long-range dependencies. Trained on a large amount of real off-target detection data, CRISPR-Net achieved higher sensitivity and accuracy. Its notable feature is the introduction of sequence position embedding and attention mechanisms, which allow the model to "understand" how strongly mismatches at different positions influence cleavage. Compared with DeepCRISPR, CRISPR-Net performs better in cross-species prediction and shows high consistency on human, mouse and plant data. This suggests that deep learning has particular advantages in handling complex genomic backgrounds and diverse PAM recognition (Zhang et al., 2023).

5.3 Comparison of other deep learning prediction models (such as R-CRISPR, etc.)
In addition to DeepCRISPR and CRISPR-Net, models such as R-CRISPR and ElevatedCRISPR have also been proposed. Most of these models introduce structural improvements, such as graph convolutional networks (GCN) to capture the spatial features of sequences, or Transformer frameworks to achieve stronger context modeling (Niu et al., 2021). Each deep learning model has its own strengths. For example, R-CRISPR pays more attention to the interaction between mismatched sequences and the genomic background, which makes it suitable for studying complex off-target patterns, while Transformer-based models have demonstrated higher generalization ability on large-scale datasets.
Overall, the introduction of deep learning has significantly improved the accuracy of off-target prediction, but it has also brought the "black box" problem: it is difficult to explain intuitively the biological significance of the prediction results.

6 Model Evaluation Indicators and Experimental Verification Strategies
6.1 Common evaluation indicators of off-target prediction models (ROC curve, etc.)
To measure the performance of different prediction models, researchers usually employ a series of statistical indicators. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) are the most commonly used measures of a model's ability to distinguish off-target from non-off-target sites. In addition, metrics such as precision, recall, and F1 score are widely applied. These metrics reflect the model's performance under different trade-offs. For instance, when the goal is "comprehensive capture of potential off-targets", recall matters most; in application scenarios that pursue "reducing false positives", precision matters most.

6.2 Experimental verification methods for off-target effect detection
Computational prediction must be backed by experimental verification to ensure reliability. The common verification methods currently available include:
GUIDE-seq: introduces small DNA fragments as tags at double-strand breaks to capture and sequence the real cleavage sites;
Digenome-seq: detects traces of in vitro Cas9 cleavage using whole-genome sequencing (Charlier et al., 2021);
CIRCLE-seq: achieves high sensitivity and low background noise by circularizing DNA and detecting cleavage products (Tsai et al., 2017).

6.3 Model performance evaluation and comparison
Different prediction methods perform differently in practical applications.
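
Such comparisons are typically expressed through the indicators of Section 6.1. The following is a minimal sketch of how precision, recall, F1 score and AUC could be computed for a set of candidate sites. The labels and scores are made-up toy values, not results from any of the models discussed here, and the function names are illustrative assumptions. AUC is computed in its rank form, as the fraction of (off-target, non-off-target) pairs in which the off-target site receives the higher score.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when sites scoring >= threshold are called off-target."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1          # true off-target correctly flagged
        elif predicted and y == 0:
            fp += 1          # false positive
        elif not predicted and y == 1:
            fn += 1          # missed off-target
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative site")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = experimentally confirmed off-target, 0 = not cleaved.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]

strict = precision_recall_f1(toy_labels, toy_scores, threshold=0.5)
lenient = precision_recall_f1(toy_labels, toy_scores, threshold=0.35)
auc = roc_auc(toy_labels, toy_scores)
```

On this toy data, the 0.5 threshold gives precision 0.5 and recall 2/3, while lowering it to 0.35 raises recall to 1.0 with precision 0.6. This illustrates the trade-off described in Section 6.1: the choice of threshold depends on whether missing an off-target or reporting a false positive is the more costly error. AUC, by contrast, does not depend on the threshold.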
Sequence alignment methods have clear advantages in speed and initial screening, but their accuracy is insufficient. Rule-based and machine learning-based methods are relatively balanced, but they depend on feature selection. Deep learning methods achieve the highest prediction accuracy, but they place high demands on data volume and computing resources (Kimata and Satou, 2025). In evaluation, researchers usually cross-validate the computational predictions against experimental test results to examine the stability and applicability of a model in different genomic contexts. Overall, deep learning