Computational Molecular Biology 2024, Vol.14, No.4, 163-172
http://bioscipublisher.com/index.php/cmb

Researchers and data scientists working with high-dimensional data should prioritize the use of advanced LASSO methods, such as Hi-LASSO, to enhance feature selection and prediction accuracy while addressing multicollinearity issues. It is also recommended to explore empirical Bayes approaches for causal discovery to manage high dimensionality and extract meaningful biological insights. Employing sparse estimation strategies and integrating machine learning techniques like random forests can further improve model performance in high-dimensional settings. Additionally, stability selection methods should be considered to control false discoveries and ensure the reliability of variable selection. Finally, researchers should stay abreast of developments in single-cell data integration techniques to leverage the full potential of multimodal assays in understanding cellular heterogeneity.

Acknowledgments
I would like to thank Professor Jin for his guidance and support throughout the entire research process. I also thank the anonymous reviewer for their feedback.

Conflict of Interest Disclosure
The author affirms that this research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.