Computational Molecular Biology 2024, Vol.14, No.2, 76-83
http://bioscipublisher.com/index.php/cmb

Review and Progress Open Access

The Application and Progress of Deep Learning in Bioinformatics

Haimei Wang
Hainan Institute of Biotechnology, Haikou, 570206, Hainan, China
Corresponding email: haimei wang@hitar.org
Computational Molecular Biology, 2024, Vol.14, No.2 doi: 10.5376/cmb.2024.14.0009
Received: 17 Feb., 2024
Accepted: 29 Mar., 2024
Published: 16 Apr., 2024
Copyright © 2024 Wang. This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:
Wang H.M., 2024, The application and progress of deep learning in bioinformatics, Computational Molecular Biology, 14(2): 76-83 (doi: 10.5376/cmb.2024.14.0009)

Abstract As biological data grow explosively and traditional computational methods struggle to keep pace, deep learning has become a powerful tool for analyzing complex biological data, significantly improving the ability to mine and interpret large-scale biological data, including images, signals, and sequences. This study reviews successful applications of deep learning in key areas such as genomics, proteomics, and drug discovery; the reviewed results show that deep learning models outperform traditional methods in tasks such as gene expression prediction and protein structure modeling. Deep learning offers great potential for analyzing biological data more accurately and efficiently, but many challenges remain. Future research should focus on addressing these challenges and exploring new applications of deep learning in bioinformatics to fully realize its potential.
Keywords Deep learning; Bioinformatics; Neural networks; Data mining; Biomedical data

1 Introduction

Deep learning, a subset of machine learning, has revolutionized various fields by enabling the automatic extraction of high-level features from raw data through multiple layers of processing units (LeCun et al., 2015; Goh et al., 2017). This approach, known as representation learning, allows models to learn intricate patterns and hierarchies within the data without explicit guidance from domain experts (Berrar and Dubitzky, 2021). The advent of deep learning has been particularly transformative in the era of big data, where the availability of vast amounts of data has facilitated the training of complex models, leading to significant advancements in predictive accuracy and efficiency (Talukder et al., 2020). Key architectures in deep learning include deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and more recent innovations such as graph neural networks (GNNs) and generative adversarial networks (GANs) (Liu et al., 2017; Li et al., 2019; Muzio et al., 2020).

Bioinformatics, an interdisciplinary field that combines biology, computer science, and information technology, plays a crucial role in modern biology by enabling the analysis and interpretation of complex biological data (Gauthier et al., 2019). The rise of high-throughput technologies has led to an explosion of data in genomics, proteomics, and other omics fields, necessitating advanced computational tools to manage and analyze this information (Lan et al., 2018; Berrar and Dubitzky, 2021). Bioinformatics facilitates the understanding of biological processes at a molecular level, aiding in the discovery of new biomarkers, the prediction of protein structures, and the elucidation of gene regulatory networks, among other applications (Min et al., 2016; Muzio et al., 2020).
The integration of deep learning into bioinformatics has further enhanced the ability to decode complex biological interactions and predict outcomes with high accuracy, thereby driving forward research in systems biology, biomedical imaging, and drug discovery (Ravì et al., 2017).

This study reviews the applications and progress of deep learning in bioinformatics, covering the main deep learning architectures and their specific applications in areas such as sequence analysis, structure prediction, and the prediction of biomolecular interactions. It also analyzes the challenges and limitations of using deep learning in bioinformatics, such as overfitting and limited interpretability, and proposes potential directions for future research, with the aim of providing a useful reference for researchers applying deep learning techniques in bioinformatics.