Computational Molecular Biology 2024, Vol.14, No.3, 97-105 http://bioscipublisher.com/index.php/cmb 103

complexity and heterogeneity of the data involved. Single-cell techniques now enable the simultaneous measurement of multiple data modalities, providing new insights into biological processes that cannot be inferred from a single mode of assay. However, integrating these complex datasets into coherent biological models requires sophisticated computational methods and data visualization approaches (Miao et al., 2021). Strategies for integrating matched data (measured on the same cell) include joint latent space inference and biological causal modeling, while unmatched data (measured on different cells) require methods like annotated group matching and aligning spaces (Miao et al., 2021). Despite these advancements, visualization methods for integrated multimodal single-cell data are still underdeveloped, and future challenges include accounting for modality-specific noise and improving computing efficiency (Miao et al., 2021).

7.3 Ethical and regulatory considerations

The use of big data in health research introduces novel ethical and regulatory challenges that must be carefully considered. The aggregation and analysis of large-scale, heterogeneous data sources can lead to significant preventive, diagnostic, and therapeutic benefits. However, the methodological novelty and computational complexity of big data health research raise unique challenges for Ethics Review Committees (ERCs) and institutional review boards (Ienca et al., 2018). These challenges include ensuring data privacy, managing sensitive personal health data, and addressing power dynamics in the doctor-patient relationship (Galetsi et al., 2019). ERCs must adapt their evaluation criteria to assess the methodological and ethical viability of health-related big data studies, ensuring that the benefits of big data analytics are realized without compromising ethical standards (Ienca et al., 2018).
Future research should focus on developing standardized systems for securely extracting and processing private healthcare datasets to mitigate these ethical and regulatory concerns (Galetsi et al., 2019).

Acknowledgments

We sincerely thank the two anonymous reviewers for their valuable opinions and suggestions, and thank Ms. W. Huang from our research team for organizing the materials for this study.

Conflict of Interest Disclosure

The authors affirm that this research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Almasoud A., Al-Khalifa H., Al-Salman A.M., and Lytras M.L., 2020, A framework for enhancing big data integration in biological domain using distributed processing, Applied Sciences, 10(20): 7092. https://doi.org/10.3390/app10207092

Anslan S., Bahram M., Hiiesalu I., and Tedersoo L., 2017, PipeCraft: flexible open‐source toolkit for bioinformatics analysis of custom high‐throughput amplicon sequencing data, Molecular Ecology Resources, 17: e234-e240. https://doi.org/10.1111/1755-0998.12692

Bauch A., Adamczyk I., Buczek P., Elmer F., Enimanev K., Glyzewski P., Kohler M., Pylak T., Quandt A., Ramakrishnan C., Beisel C., Malmström L., Aebersold R., and Rinn B., 2011, openBIS: a flexible framework for managing and analyzing complex data in biology research, BMC Bioinformatics, 12: 468-468. https://doi.org/10.1186/1471-2105-12-468

Bohár B., Fazekas D., Madgwick M., Csabai L., Olbei M., Korcsmáros T., and Szalay-Beko M., 2022, Sherlock: an open-source data platform to store, analyze and integrate big data for computational biologists, F1000Research, 10: 409. https://doi.org/10.12688/f1000research.52791.2

Chen C.J., Chen H.R., Zhang Y., Thomas H., Frank M.H., He Y.H., and Xia R., 2020, TBtools-an integrative toolkit developed for interactive analyses of big biological data, Molecular Plant, 13(8): 1194-1202. https://doi.org/10.1016/j.molp.2020.06.009

Clarke R., Ressom H.W., Wang A., Xuan J.H., Liu M.C., Gehan E.A., and Wang Y., 2008, The properties of high-dimensional data spaces: implications for exploring gene and protein expression data, Nature Reviews Cancer, 8(1): 37-49. https://doi.org/10.1038/nrc2294

Davis-Turak J., Courtney S., Hazard E., Glen W., Silveira W., Wesselman T., Harbin L., Wolf B., Chung D., and Hardiman G., 2017, Genomics pipelines and data integration: challenges and opportunities in the research setting, Expert Review of Molecular Diagnostics, 17: 225-237. https://doi.org/10.1080/14737159.2017.1282822