
Genomics and Applied Biology 2024, Vol.15, No.4, 172-181
http://bioscipublisher.com/index.php/gab

2 Current Applications of Biostatistics in Genomic Research

2.1 Biostatistical approaches in genome-wide association studies (GWAS)

Genome-Wide Association Studies (GWAS) have become a cornerstone in genomic research, leveraging biostatistical methods to identify genetic variants associated with complex traits and diseases. The application of GWAS has been instrumental in uncovering genotype-phenotype associations across various species, including plants and animals. For instance, GWAS has been extensively used in maize to link genotypic variations to phenotypic differences, utilizing advanced statistical models to optimize study design and analysis (Xiao et al., 2017). Similarly, GWAS has facilitated significant discoveries in human genetics, identifying risk loci for diseases such as autism spectrum disorder and schizophrenia through large-scale meta-analyses (Anney et al., 2017). The integration of biostatistics in GWAS has also led to the development of new models and population designs, enhancing the detection of marker-trait associations and improving our understanding of genetic architecture (Cortes et al., 2021; Gupta, 2021).

2.2 Role in next-generation sequencing (NGS) data analysis

Next-Generation Sequencing (NGS) technologies have revolutionized genomic research by enabling the rapid and cost-effective sequencing of entire genomes. Biostatistics plays a crucial role in the analysis of NGS data, addressing challenges such as data quality, variant calling, and interpretation of results. NGS has been applied in various domains, including clinical genomics, cancer research, and infectious disease studies, providing detailed insights into genetic variations and gene expression profiles (Satam et al., 2023).
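
To make the data-quality step concrete: base-call quality in NGS reads is conventionally reported as a Phred score Q, related to the estimated error probability p by Q = -10 log10(p). The Python sketch below is illustrative only and is not taken from the studies cited here; the function names, the Phred+33 offset default, and the 1% mean-error threshold are assumptions chosen for the example.

```python
from typing import List


def phred_to_error_prob(q: int) -> float:
    """Convert a Phred quality score Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality_string(qual: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores.

    Assumes Phred+33 (Sanger-style) ASCII encoding by default.
    """
    return [ord(ch) - offset for ch in qual]


def mean_error_prob(qual: str) -> float:
    """Average per-base error probability across one read."""
    scores = decode_quality_string(qual)
    return sum(phred_to_error_prob(q) for q in scores) / len(scores)


def passes_quality(qual: str, max_mean_error: float = 0.01) -> bool:
    """Keep a read only if its mean error probability is at or below the cutoff.

    The 0.01 default is an illustrative assumption, not a recommended value.
    """
    return mean_error_prob(qual) <= max_mean_error
```

Under these assumptions, a read whose quality string is "IIII" (Q40 at every base) passes the filter, whereas "!!!!" (Q0 at every base) does not.
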
In animal breeding, NGS has enhanced our ability to understand the genetic basis of traits, facilitating the identification of genetic loci associated with economically important traits and improving breeding programs (Khanzadeh et al., 2020). The continuous advancements in NGS technology, coupled with robust biostatistical methods, are expected to further drive innovations in genomics research (Müller et al., 2018).

2.3 Statistical methods for gene expression analysis

Gene expression analysis is another critical area where biostatistics is extensively applied. Statistical methods are used to analyze data from RNA sequencing (RNA-seq) and microarray experiments, identifying differentially expressed genes and understanding gene regulatory networks. These analyses provide insights into the functional roles of genes and their involvement in various biological processes and diseases. For example, biostatistical approaches have been employed to analyze gene expression data in crops, revealing the genetic architecture of complex traits and guiding future research in functional genomics (Liu and Yan, 2018). The integration of biostatistics in gene expression analysis ensures the accurate interpretation of high-dimensional data, facilitating the discovery of novel biomarkers and therapeutic targets.

2.4 Integrating biostatistics in epigenomic studies

Epigenomic studies investigate modifications to the genome that do not involve changes in the DNA sequence but can influence gene expression and phenotype. Biostatistics is essential in analyzing epigenomic data, such as DNA methylation and histone modification patterns, to understand their impact on gene regulation and disease. The application of biostatistical methods in epigenomic studies has provided valuable insights into the mechanisms underlying complex traits and diseases, contributing to the development of precision medicine approaches.
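
Expression and epigenomic analyses typically test thousands of genes or CpG sites simultaneously, so raw p-values are usually adjusted for multiple testing before features are reported. The review does not name a specific correction; the sketch below assumes the Benjamini-Hochberg false discovery rate procedure, a common choice, and the function name `benjamini_hochberg` is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value with ascending rank k out of m tests, the adjusted value
    is min over ranks j >= k of p_(j) * m / j, capped at 1.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene or site would then be reported as significant when its adjusted value falls below a chosen false discovery rate, for example 0.05 (an illustrative cutoff, not one stated in the review).
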
For instance, advancements in NGS have enabled the comprehensive analysis of epigenetic modifications, with biostatistics playing a pivotal role in data interpretation and the identification of epigenetic markers (Satam et al., 2023) (Figure 1). The integration of biostatistics in epigenomic research continues to enhance our understanding of gene-environment interactions and their implications for health and disease.

3 Emerging Trends and Techniques

3.1 Machine learning and artificial intelligence in genomic biostatistics

Machine learning (ML) and artificial intelligence (AI) have become indispensable tools in genomic biostatistics, driven by the need to handle and interpret vast amounts of high-throughput sequencing data. These technologies facilitate the integration and analysis of multi-omics data, enabling the discovery of new biomarkers and the development of predictive models. For instance, ML methods such as autoencoders, random forests, and support
