
Computational Molecular Biology 2025, Vol.15, No.3, 151-159
http://bioscipublisher.com/index.php/cmb

In retrospect, the value of high-performance computing for NGS mutation detection is clear. Traditional pipelines are constrained by compute and I/O bottlenecks, and their efficiency drops sharply as data volume grows. Parallel computing, hardware acceleration, and intelligent scheduling have changed this picture qualitatively. Strategies such as task decomposition, asynchronous scheduling, and I/O optimization have significantly increased processing speed. Nextflow and SLURM simplify the management and migration of workflows. GPU and FPGA acceleration, containerized deployment, and cloud integration further improve the reproducibility and scalability of the workflow. Standard datasets such as GIAB give performance comparisons across studies a common basis. Our practical experience is consistent with this: analyses that used to take several days can now be finished in a few hours. Looking ahead, AI-driven automatic scheduling and new hardware architectures will make the role of high-performance computing in genomics even more prominent. More efficient and automated NGS analysis systems may soon become the norm.

Acknowledgments
The author extends sincere thanks to two anonymous peer reviewers for their invaluable feedback on the manuscript.

Conflict of Interest Disclosure
The author affirms that this research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.
