Computational Molecular Biology 2025, Vol.15, No.4, 171-182
http://bioscipublisher.com/index.php/cmb

Review Article Open Access

Knowledge Graph Construction for Molecular Interaction Exploration

Wenzhong Huang
Biomass Research Center, Hainan Institute of Tropical Agricultural Resources, Sanya, 572025, Hainan, China
Corresponding author: wenzhong.huang@hitar.org
Computational Molecular Biology, 2025, Vol.15, No.4 doi: 10.5376/cmb.2025.15.0017
Received: 12 May, 2025 Accepted: 23 Jun., 2025 Published: 15 Jul., 2025
Copyright © 2025 Huang, This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article: Huang W.Z., 2025, Knowledge graph construction for molecular interaction exploration, Computational Molecular Biology, 15(4): 171-182 (doi: 10.5376/cmb.2025.15.0017)

Abstract In recent years, knowledge graph technology has gained ground in bioinformatics, offering new ways to study interactions at the molecular level. This review focuses on the construction and analysis of a "Molecular Interaction Knowledge Graph", covering the integration and preprocessing of data sources, methods for constructing the knowledge graph, techniques for representing and analyzing the graph, and a case study and system implementation of a protein-protein interaction knowledge graph. The review first surveys the current applications of knowledge graphs in bioinformatics and clarifies the background, significance, and innovations of constructing molecular interaction knowledge graphs. It then discusses strategies for standardizing multi-source biological data and normalizing entity semantics, and proposes modeling methods for entities and relationships together with an automated construction process.
In terms of graph analysis, key technologies such as knowledge representation learning, network topology analysis, semantic reasoning, and relationship prediction are reviewed. A case study of a protein-protein interaction knowledge graph presents the construction process, the visualization results, and the biological verification, and discusses the biological significance of the conclusions. Finally, the current challenges facing molecular interaction knowledge graphs, such as data heterogeneity, model interpretability, and knowledge uncertainty, are summarized, and future development directions are outlined. This work is expected to provide solid knowledge support for the systematic analysis of complex molecular networks and for biomedical discovery.

Keywords Molecular interaction; Knowledge graph; Bioinformatics; Protein-protein interactions; Knowledge representation learning

1 Introduction

The normal operation of biological systems depends largely on the connections among molecules, such as protein interactions, gene regulation, and metabolic reaction pathways, which together form intricate networks. Understanding these relationships is not only a matter of clarifying principles; it also explains why diseases occur and how drugs work. In the past, however, interactions were verified experimentally one at a time, which was time-consuming and costly and often made it difficult to see the whole picture. With the advent of big data and artificial intelligence, the "knowledge graph" came into use as a way to connect these scattered pieces of information (MacLean, 2021). Its core idea does not depend on sophisticated algorithms and is actually very simple: molecules, genes, and the relationships among them are drawn into a single graph, so that a machine can determine "who is related to whom".
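The idea of recording "who is related to whom" can be illustrated with a minimal sketch in Python, in which a knowledge graph is stored as a list of (subject, relation, object) triples. The entity names, relation names, and helper functions below are hypothetical placeholders chosen for illustration; they are not data or methods from this article.

```python
from collections import defaultdict

# Hypothetical example triples (subject, relation, object). The entity and
# relation names are illustrative placeholders, not data from the article.
TRIPLES = [
    ("ProteinA", "interacts_with", "ProteinB"),
    ("GeneX", "encodes", "ProteinA"),
    ("CompoundC", "inhibits", "ProteinB"),
]


def build_graph(triples):
    """Index triples by subject so outgoing edges can be looked up directly."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def neighbors(graph, entity):
    """Return the (relation, object) pairs whose subject is `entity`."""
    # .get() avoids creating an empty entry for unknown entities.
    return list(graph.get(entity, []))


graph = build_graph(TRIPLES)
print(neighbors(graph, "GeneX"))
```

In this sketch each triple is one directed, labeled edge, so asking what an entity is related to reduces to a dictionary lookup. A real molecular interaction graph would additionally need normalized identifiers and provenance for each edge.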
Nowadays, many studies have found that such graphs can be useful in drug discovery, target prediction, and even side effect analysis, which is equivalent to adding "common sense" to the model (Zhou et al., 2024). Therefore, constructing and studying knowledge graphs of molecular interactions not only gives us a more comprehensive understanding of living systems, but also provides new ideas and support for drug development and for disease diagnosis and treatment.

The concept of the "knowledge graph" did not first emerge in scientific research; it originated from the technology that Google used to improve search results. Within a few years it had become prominent in biomedical research as an important tool for integrating complex data and assisting analysis (Nicholson et al., 2020). Nowadays, almost all research directions related to biological data are attempting to use it to organize seemingly disordered information. Some researchers use it to integrate information such as genes, proteins, and compounds from different databases into a single large graph. For instance, RNA-related