
Computational Molecular Biology 2025, Vol.15, No.4, 208-217
http://bioscipublisher.com/index.php/cmb

Research Insight    Open Access

Standardizing Bioinformatics Pipelines for Clinical Genomics

Yuhong Huang, Yufen Wang, Guangman Xu
Traditional Chinese Medicine Research Center, Cuixi Academy of Biotechnology, Zhuji, 311800, China
Corresponding author: guangman.xu@cuixi.org
doi: 10.5376/cmb.2025.15.0020
Received: 18 Jun., 2025
Accepted: 29 Jul., 2025
Published: 21 Aug., 2025

Copyright © 2025 Huang et al. This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Preferred citation for this article:
Huang Y.H., Wang Y.F., and Xu G.M., 2025, Standardizing bioinformatics pipelines for clinical genomics, Computational Molecular Biology, 15(4): 208-217 (doi: 10.5376/cmb.2025.15.0020)

Abstract High-throughput sequencing technology has been widely adopted in clinical genomics for diagnosing genetic diseases and guiding personalized tumor treatment. However, differences in bioinformatics analysis pipelines among laboratories can lead to inconsistent variant detection results, which affects clinical interpretation and data sharing. Drawing on research into the standardization of bioinformatics pipelines, this article reviews the common data analysis workflows in clinical genomics and the key steps and tools involved at each stage, and clarifies why standardization is necessary and what challenges it faces. We further explore technical strategies for achieving standardization, including workflow management systems, containerization technologies, unified reference standards, and quality control and verification schemes, and introduce the relevant domestic and international standards, norms, and application practices.
The results show that standardized bioinformatics pipelines help improve the accuracy and repeatability of variant detection, ensure that results from different laboratories are comparable, and meet clinical diagnostic norms and regulatory requirements. This work provides a reference for standardizing the bioinformatics analysis process in clinical genomics and can promote the reliable application of sequencing data in clinical practice.

Keywords Clinical genomics; Bioinformatics analysis process; Standardization; Repeatability; Quality control

1 Introduction
The rise of high-throughput sequencing, also called next-generation sequencing (NGS), has brought the diagnosis and treatment of genetic diseases into a new stage. Whole exome sequencing and whole genome sequencing are now close to routine in clinical testing. They allow potentially pathogenic variants to be identified quickly and give doctors an important basis for decision-making. Mature international guidelines for standardizing variant interpretation have long existed, such as the ACMG variant classification standards. These guidelines do not determine results on their own: they are applied to the list of candidate variants produced by the bioinformatics analysis pipeline, so if that list is inaccurate, the subsequent interpretation loses its basis (Lavrichenko et al., 2025). The problem is that analysis pipelines differ greatly between laboratories. Roy et al. (2018) pointed out that the field currently has no unified standard for them. Even when the same batch of data is processed, concordance between the results of different pipelines is low: approximately 60% for single nucleotide variants and less than 30% for insertions and deletions.
Researchers further found that approximately 16.5% of clinically relevant variants were identified by only one algorithm, which implies that some key pathogenic variants may be missed by the other pipelines. It follows that an analysis pipeline that is not standardized or not fully validated may eventually lead to incorrect diagnostic conclusions and negatively affect patients' treatment decisions (Weißbach et al., 2021). From a broader perspective, the standardization of bioinformatics pipelines is not merely a technical issue; it determines the credibility of clinical genomics results as a whole. Errors made in one laboratory can be magnified in multi-center studies or when variants are shared, which undermines the comparability of data. Promoting the standardization of the analysis pipeline has therefore become a key task. This study focuses on this issue, systematically analyzes the structure and deficiencies of existing clinical sequencing pipelines, discusses the
