
Cancer Genetics and Epigenetics 2024, Vol.12, No.4, 210-222
http://medscipublisher.com/index.php/cge

selection method was proposed for joint diagnosis and prognosis of cancers, which utilized task relationship learning to automatically discover relationships between diagnosis and prognosis tasks, thereby improving prediction performance (Xiao et al., 2018). Additionally, reinforcement learning can be used to continuously improve prediction models by learning from new data over time.

5.2 Case studies of successful AI applications

5.2.1 Integrative genomics and imaging

Integrative analysis of histopathological images and genomic data has shown great promise in improving the diagnosis and prognosis of colon cancer. One study proposed a multi-task multi-modal learning approach that integrates these two types of data and outperformed related methods on both diagnosis and prognosis tasks (Shao et al., 2020). Another study used a deep learning-based multi-model ensemble method that combined multiple machine learning models to improve cancer prediction accuracy from RNA-seq data (Xiao et al., 2018).

5.2.2 Clinical data and genomic data fusion

The fusion of clinical data with genomic data has also been explored to enhance prediction models. For instance, one study developed a non-invasive AI model based on preoperative CT data to predict liver metastasis in colon cancer patients. The hybrid model, which combined clinical and radiomics features, improved prediction performance significantly, reaching an accuracy of 85.50% in validation (Li et al., 2019). This approach highlights the potential of combining different data modalities to achieve more accurate and reliable predictions.

5.3 Performance metrics and model evaluation

5.3.1 Accuracy, sensitivity, and specificity

Accuracy, sensitivity, and specificity are fundamental metrics used to evaluate the performance of AI models in colon cancer prediction.
For example, the MFF-CNN model achieved an accuracy of 96%, with a false negative rate of 5.5% and a false positive rate of 2.5% (Liang et al., 2020). Taking sensitivity as 1 - FNR and specificity as 1 - FPR, these rates correspond to a sensitivity of 94.5% and a specificity of 97.5%. Similarly, the auto-AI prediction model for colon cancer recurrence showed an AUC of 0.815, an improvement over conventional models (Mazaki et al., 2021).

5.3.2 ROC curves and AUC

Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) are widely used to assess the diagnostic performance of prediction models. The deep learning-based multi-model ensemble method achieved high AUC values across different cancer types, indicating its robustness and effectiveness in cancer prediction (Xiao et al., 2018). Another study on non-invasive imaging prediction models for liver metastasis in colon cancer reported an AUC of 0.87 for the hybrid model, further supporting its predictive power (Li et al., 2019).

5.3.3 Validation techniques

Validation techniques such as cross-validation and external validation are crucial for ensuring the generalizability of AI models. The multi-task multi-modal learning approach for cancer diagnosis and prognosis was evaluated using datasets from The Cancer Genome Atlas project, demonstrating its effectiveness across multiple datasets (Shao et al., 2020). Additionally, the non-invasive AI model for liver metastasis prediction employed 5-fold cross-validation to ensure robust performance evaluation (Li et al., 2019).

In conclusion, the integration of multi-modal data using AI has shown significant potential in improving the prediction of colon cancer. By leveraging various machine learning models and validation techniques, researchers can develop more accurate and reliable prediction models, ultimately enhancing clinical decision-making and patient outcomes.
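
The evaluation measures discussed in Section 5.3 (accuracy, sensitivity, specificity, AUC, and k-fold cross-validation) can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and only the standard library is used.

```python
from typing import Dict, List, Sequence, Tuple


def basic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from binary labels (1 = positive).

    Assumes both classes are present, otherwise a denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # equals 1 - false negative rate
        "specificity": tn / (tn + fp),  # equals 1 - false positive rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices.

    Samples are not shuffled here; in practice they would usually be shuffled
    (and often stratified by class) before splitting.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # Toy data (hypothetical): 4 positive and 6 negative cases with model scores.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    print(basic_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
    for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
        print("test fold:", test_idx)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the studies above report both kinds of measure.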
6 Challenges and Limitations

6.1 Data quality and preprocessing

One of the primary challenges in multi-modal data fusion for colon cancer prediction is ensuring the quality and proper preprocessing of the data. Multi-modal datasets often include diverse types of data such as genomic
